Kusama Validator Offline Offence - Post Mortem

One of our validator nodes was offline from the end of era 3281 until the start of era 3282 because the blockchain database it was using became corrupted. As a result, the node was unable to produce blocks for 2 hours 45 minutes.

What happened

The database that we were using to run our validator node crashed. We immediately took action and started deploying a new node from a recent snapshot of our disk. Once the new node was fully synced, we generated new session keys and signed them off before turning the validation service back on. Because rotating the session keys effectively created a new validator, we were forced to chill for an additional epoch.

Customer Impact

Delegators who nominated this validator and had their stake allocated to it will receive lower rewards for 2 epochs than they could otherwise have earned.

What went wrong?

Our tools were inadequate for this particular event. We did not deploy the database from our (pruned) backups because our automation has, on occasion, taken too long to run. Additionally, we did not recover the existing session keys, since doing so is always a risky option and the situation at hand did not warrant taking such risks.

What went well?

We were notified immediately that the node had stopped producing blocks, and the root cause was identified almost instantly. This allowed us to respond swiftly, and with the help of our fully automated key rotation, verification and signing cycle, we were able to bring up a fully operational node efficiently.
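
The kind of alert described above can be sketched as follows. This is a minimal illustration only, not the monitoring that was actually in place during this incident; the function name, the StallCheck type and the 10-minute threshold are all assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class StallCheck:
    """Result of a single block-production check (hypothetical type)."""
    stalled: bool
    seconds_since_last_block: float


def check_block_production(
    last_block_time: float,
    now: float,
    max_gap_seconds: float = 600.0,  # assumed threshold, not a Kusama constant
) -> StallCheck:
    """Flag the validator as stalled if it has authored no block for too long.

    Both timestamps are Unix times in seconds. A negative gap (clock skew)
    is treated as zero rather than raising an error.
    """
    gap = max(0.0, now - last_block_time)
    return StallCheck(stalled=gap > max_gap_seconds, seconds_since_last_block=gap)
```

In practice the timestamp of the last authored block would come from the node's own metrics or logs, and a positive result would page the on-call engineer.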

Lessons learnt and action plan

We will improve our incident handling procedures and find a reliable and rapid fix for these kinds of events. We have already achieved faster spin-up of fully synced nodes and will implement a solution for safe session key management.

P2P takes full responsibility for the event that led to the weak performance, and we are sorry for the inconvenience. Please be assured that P2P is taking action to eliminate even a small probability of such an event occurring in the future.

If you have any questions, feel free to join our Telegram chat. We are always open to communication.

