Nimbus: Medalla update - August 21 (take two)

The first thing to say is that our validators are doing much better than last week. We are currently tracking the head of the chain on Medalla and attesting! 🎉

Keep your client updated!

If there's one takeaway from this update, it's that there have been significant improvements in both stability and memory usage over the past few days, so please make sure you keep your client updated.

To update to the latest version on a Mac or Linux device, run:

git pull && make update

If you're on Windows,

git pull && mingw32-make update

should do the trick.

Handling sync problems and resource leaks

As things stand, we recommend restarting your node every 6 hours to make up for resource leaks (an issue we are currently investigating). If you have a local Grafana setup, you can try monitoring the severity of your leaks and playing around with the restart interval.
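The periodic restart can be automated with a small wrapper. What follows is only a minimal sketch, not something shipped with Nimbus: the function name restart_every and the RESTART_PAUSE variable are made up for illustration, and it assumes bash plus the GNU coreutils timeout command (not installed by default on macOS).

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Nimbus): run a command, stop it after
# a fixed interval, then start it again.
# Assumes GNU coreutils `timeout` is available on the PATH.
#
# usage: restart_every INTERVAL MAX_RUNS COMMAND [ARGS...]
#   INTERVAL  any duration `timeout` accepts, e.g. 6h or 30m
#   MAX_RUNS  number of runs before giving up; 0 means loop forever
restart_every() {
  local interval="$1" max_runs="$2"
  shift 2
  local runs=0 status
  while [ "$max_runs" -eq 0 ] || [ "$runs" -lt "$max_runs" ]; do
    # timeout exits with 124 when it had to stop the command itself
    timeout "$interval" "$@"
    status=$?
    runs=$((runs + 1))
    echo "run $runs ended with status $status"
    # short pause so the old process can release its ports and database
    sleep "${RESTART_PAUSE:-5}"
  done
}
```

For example, restart_every 6h 0 make medalla would restart the node every 6 hours, assuming make medalla is the command you use to start it. If your Grafana graphs show the leaks growing faster or slower, adjust the first argument accordingly.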

If you’re experiencing sync problems, or have been running an old version of Medalla, we recommend running make clean-medalla to restart your sync (make sure you’ve updated to the latest devel branch first, though).

Note: the clean-medalla command has been updated to only reset the SQLite database (your validator keys won't be touched).
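Put together, the update-then-resync sequence might look like the sketch below. It only chains commands already mentioned in this post; the function name resync_medalla and the MAKE_CMD variable are illustrative, and it assumes you run it from your Nimbus checkout with the devel branch already checked out.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: update first, then reset the Medalla database.
# Set MAKE_CMD=mingw32-make on Windows; it defaults to make elsewhere.
resync_medalla() {
  local make_cmd="${MAKE_CMD:-make}"
  # each step runs only if the previous one succeeded
  git pull &&
    "$make_cmd" update &&
    "$make_cmd" clean-medalla
}
```

Because the steps are joined with &&, a failed pull or update stops the sequence before the database is reset.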

Main highlights

Highlights over the last few days (and hours!) include: a significant performance fix for attestation processing, which helps us retain peers (we're now only losing peers during periods of increased CPU load); a fix that drastically speeds up processing during long periods of non-finality and keeps memory usage low; and a fix for password issues that arose when importing validator keys on Windows.

Side note: We’re happy to see more Windows users interested in Nimbus. Thanks for giving us a try! We realise our getting started docs are less than adequate, and we're working on making sure they are up-to-date and as intuitive as possible. Until then, please feel free to ask us any questions you may have on our Discord.

Syncing performance

Currently, we have one main outstanding issue: syncing performance isn’t great.

As it stands, we subscribe to eth2’s gossip protocol as soon as we start Nimbus. Subscribing too early pollutes our quarantine system – which we use to store blocks from the network that can’t be attached to the chain yet – with blocks thousands of epochs in the future. This creates extra processing during sync.

We have a PR in progress (the first part has been merged) that stops us from subscribing to topics too early. We hope this will significantly improve both the stability and the syncing speed of Nimbus.

Besides this, we’ve revived the old multinet scripts (the ones that were used in the original interop event). We're using them to create small testnets – just Nimbus and Lighthouse for now – and to debug specific gossip issues.

To those of you running Nimbus, thank you for the invaluable feedback. It's thanks to you that we're able to keep improving – please keep it coming! If you haven’t tried Nimbus yet, please join our Discord and give us a spin. The importance of client diversity has never been clearer! 💛

Edit: Friday 21 August 16.00 UTC – removed the other two main issues – high memory usage and peer loss – since we seem to have fixed them.

Edit: Friday 21 August 17.30 UTC - added client update commands