Taipei sharding meetup reflections

Greener pastures

I was cleaning out my Google doc folder the other day to move on to greener pastures, and found this little brain dump that never made it anywhere - it's fun mainly as a historical document. This was pre-shasper, pre-beacon chain, and for me, one of the first things I did joining Status and the world of Ethereum... pre a lot of stuff in other words! Take it for what it is, link rot included - the most up-to-date Ethereum 2.0 plans can be found here.

Reflections

I just had the privilege of meeting a large group of Ethereum developers for a 3-day workshop in Taipei. This was an excellent opportunity to understand what’s going on in the Ethereum sharding research space - even though all information is available online, hearing it live and having the ability to interrupt and ask questions always gives things a special quality.

Hsiao-Wei Wang, one of the organizers of the event, has already written up an excellent post on what was talked about, so I won’t repeat that here - check out https://medium.com/@icebearhww/ethereum-sharding-workshop-in-taipei-a44c0db8b8d9

There’s also a great deal more information on the subject - the entry points being:

http://notes.ethereum.org/s/BJc_eGVFM# - this is the compendium of research done up to the event, and was the recommended reading list before arrival
https://github.com/ethereum/wiki/wiki/Sharding-FAQ - this is more general, about sharding

Instead, I’ll focus on two things - the talk between the talks and the things that appeared most relevant to Status right now.

On sharding clients, and their timelines

There’s a diverse group of developers working on sharding implementations that are all at different stages of development - ranging from “working on it” to “will start in a few weeks or months, maybe”. The full list is here: https://github.com/ethereum/wiki/wiki/Sharding-introduction-and-implementations

As far as timelines go, Vitalik & Co are planning to work in phases, where each phase adds more features - each phase representing a useful step. The rough estimate is that each of the 5-6 phases will take about a year to complete, with Phase 1 being underway already.

Between the talks, we informally agreed on using Gitter and https://github.com/ethereum/sharding to discuss sharding matters in general - mainly because most people already seemed to have accounts on gitter - there’s a good group of people there already so let’s see how it pans out.

Test suites were also discussed - the idea here was to build up a suite of test cases that all clients can share and use.

There was a general desire to move on to libp2p - devp2p was created as an interim solution while libp2p was being developed (by the same developers, apparently), and speaking libp2p would bring many advantages for future-proofing the protocol - parity are already moving in this direction.

As a final note on parity, when they started in 2015 they were in a situation similar to ours - programming language (rust) incomplete and in flux, libraries nonexistent or underdeveloped. They highlighted that although this caused some extra work early on, they were quite happy with the end result - notably, there is a big upside to hiring developers that are not afraid of experimenting with a new language.

On sharding itself

Overall, the general feeling was that there’s still a lot to do. Sharding messes with the base layer, so it’s going to be done safe and slow - that said, it is also being envisioned as a vehicle to deliver features that might be controversial or difficult to add to the main chain.

Two things stood out prominently as such: eWASM and rent. eWASM is a new execution engine, and instead of simply deploying it on the main chain, the idea is to decouple execution from block proposal and collation, such that the lower layers are unaware of what kind of data they’re processing. This would allow shards to execute code using the current EVM, eWASM or some other, yet to be discovered engine (perhaps selected based on some properties in the data).
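
To make the decoupling idea a bit more concrete, here is a minimal sketch of what "lower layers unaware of the data" could look like. Everything in it is an assumption made for illustration - the Collation fields, the engine_tag selector and the toy engines are hypothetical and not taken from any sharding spec or client.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# Hypothetical engine signature: takes the previous state and an opaque
# collation body, returns the new state. Not taken from any sharding spec.
Engine = Callable[[dict, bytes], dict]


@dataclass(frozen=True)
class Collation:
    shard_id: int
    engine_tag: str  # assumed field selecting the engine, e.g. "evm" or "ewasm"
    body: bytes      # opaque to the proposal / collation layer


def data_layer_accept(collation: Collation, max_body_size: int = 1 << 20) -> bool:
    """Lower layer: checks only the body size, never interprets the body."""
    return 0 < len(collation.body) <= max_body_size


def execute(collation: Collation, state: dict, engines: Dict[str, Engine]) -> dict:
    """Upper layer: dispatches the opaque body to whichever engine is tagged."""
    engine = engines[collation.engine_tag]
    return engine(state, collation.body)


# Two stand-in engines that only count bytes, to show the dispatch.
def toy_evm(state: dict, body: bytes) -> dict:
    return {**state, "evm_bytes": state.get("evm_bytes", 0) + len(body)}


def toy_ewasm(state: dict, body: bytes) -> dict:
    return {**state, "ewasm_bytes": state.get("ewasm_bytes", 0) + len(body)}


ENGINES: Dict[str, Engine] = {"evm": toy_evm, "ewasm": toy_ewasm}
```

The point of the split is that data_layer_accept never needs to change when a new engine shows up - only the ENGINES table does.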

Rent is similar - because it’s such a controversial topic, the sharding release was seen as one possible point where it could be added as “part of the deal”, rather than introducing it separately.

Of course, the most interesting part for us right now is Phase 1, and that one is actually pretty limited - the majority of it seems to be a Validator Manager Contract on the main chain, coupled with two initial node types: Proposers and Collators. Notably, execution is not part of it (neither EVM nor eWASM)! It’s very bare bones, and it’s not decided yet if it will make it past testnet.

What actually is part of it is an initial Proof-of-Stake system, where a stake on the main net provides security for the side chains.
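
As a rough illustration of the "stake on the main net" idea, here is a toy stand-in for a validator manager. It is not the actual Validator Manager Contract: the class, the method names, the minimum deposit and the hash-based selection are all assumptions of mine, meant only to show deposits being locked in one place and then used to pick who collates for a given shard and period.

```python
import hashlib
from typing import Dict, List, Optional


class ToyValidatorManager:
    """Toy stand-in for a Validator Manager Contract; all names are hypothetical."""

    def __init__(self, min_deposit: int) -> None:
        self.min_deposit = min_deposit
        self.deposits: Dict[str, int] = {}
        self.pool: List[str] = []  # registration order, for deterministic sampling

    def register(self, address: str, deposit: int) -> bool:
        """Lock a main-chain deposit; reject small or duplicate registrations."""
        if deposit < self.min_deposit or address in self.deposits:
            return False
        self.deposits[address] = deposit
        self.pool.append(address)
        return True

    def eligible_collator(self, shard_id: int, period: int, seed: bytes) -> Optional[str]:
        """Pick one registered collator for (shard, period) from a shared seed."""
        if not self.pool:
            return None
        # Assumed selection rule: hash the seed with the shard and period,
        # then index into the pool. Every node computing this gets the same answer.
        material = seed + shard_id.to_bytes(8, "big") + period.to_bytes(8, "big")
        index = int.from_bytes(hashlib.sha256(material).digest(), "big") % len(self.pool)
        return self.pool[index]
```

In this sketch the seed would come from something all nodes agree on, such as a recent main-chain block hash, which is again an assumption rather than something decided at the workshop.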

On security

Peter and Felix, who work on geth, gave a great presentation on pitfalls that ethereum client developers can run into when deploying their client into the wild. This looked to me like a gold mine of information that would be extremely useful for our own client development effort, and talking more to Peter revealed that the stuff presented was just the tip of the iceberg.

Both parity and go-ethereum run bug bounty programs and it is often the case that the same bug will get reported to both projects. After the bug is fixed, however, the information seems to be mostly lost in commit logs, making it hard to find. For a team like ours that’s implementing a client from scratch, this is obviously not a good situation - likewise for users that have a hard time knowing exactly what kind of security issues there have been and are out there in the wild.

Another security issue / potential DoS avenue was the proliferation of sync modes - full, fast, light and warp - they all have different issues and perhaps they could be unified in a sync mode to rule them all? This is part of a bigger issue where clients have implemented lots of features on their own without them ever being brought “back” to the other clients, leading to feature fragmentation in the network.

There were three interesting ideas floated in the in-between talks to combat this, all of which we could meaningfully contribute to:

Do an ethereum client implementer meetup that would focus on client implementation issues and cooperation - this would be an opportunity to cross-pollinate the community with ideas - for us, it would be an excellent opportunity to learn from the devs that have written battle-tested clients for real
Work on a shared exploit database that would be used to keep track of past mistakes - this is a fantastic benefit to all players involved - client implementers can check their own implementations, users have clarity on what’s going on and all is done in a transparent and trust-building manner
Keep working on the shared, JSON-based test suite. At the same time, investigate if there are other test environments where clients could potentially be tested for compatibility / interoperability - there apparently exist Docker images with all clients in them suitable for running such tests
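
For the shared test suite in the last item, the core mechanic is simple enough to sketch: one set of JSON fixtures, run unchanged against every client. The fixture schema below (name, hex input, expected value) and the two toy "clients" are made up for the example - the real shared fixtures define their own format.

```python
import json
from typing import Callable, Dict, List

# Hypothetical fixture format: a name, a hex-encoded input and the expected output.
FIXTURES = """
[
  {"name": "empty_body", "input": "", "expected": 0},
  {"name": "two_bytes", "input": "c0ff", "expected": 2}
]
"""

Client = Callable[[bytes], int]


def run_suite(fixtures_json: str, client: Client) -> Dict[str, bool]:
    """Run every fixture against one client and report pass/fail by name."""
    results: Dict[str, bool] = {}
    for case in json.loads(fixtures_json):
        actual = client(bytes.fromhex(case["input"]))
        results[case["name"]] = actual == case["expected"]
    return results


def failing_cases(fixtures_json: str, clients: Dict[str, Client]) -> Dict[str, List[str]]:
    """Map each client name to the fixtures it fails."""
    return {
        name: [case for case, ok in run_suite(fixtures_json, fn).items() if not ok]
        for name, fn in clients.items()
    }


# Two toy "clients": one correct, one with an off-by-one bug.
report = failing_cases(FIXTURES, {"good": len, "buggy": lambda body: len(body) + 1})
```

Because the fixtures are plain data, a new client only has to provide the function under test to get the whole suite for free.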

Another way to look at this section from the perspective of a fresh client developer is this:

A year from now, we’re going to have an ethereum client in our hands that’s used by millions of users.

What steps do we need to take in order to ensure that it becomes a highly reliable, safe and trusted client? How do we gain the trust of our users? How do we prepare ourselves for the crises that will inevitably come, as a result of us making mistakes / bugs?