Running Firedancer
The previous article discussed why PoS networks strive for multiple validation clients and how Firedancer helps secure Solana. Below, we share our experience of running Firedancer on Testnet. We walk through building Firedancer from the official documentation, look at monitoring options and compatibility with our current infrastructure, and check the client's stability.
What is Frankendancer?
During the Solana Breakpoint 2023 event, it was announced that “Frankendancer” (Firedancer with runtime and consensus modules borrowed from the Solana Labs client) was successfully acting as a validator node on the Solana Testnet. We, along with the rest of the community, were excited about this announcement.
The Firedancer team provides good build and run documentation, which can be found here, so we don’t think it is necessary to repeat it. You must update your Linux kernel to a version newer than 5.7 to build Firedancer successfully (this should not be a problem even on Ubuntu 20, since its mainline repo contains this kernel). We successfully built Firedancer on our test server running Ubuntu 20.04.
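Before building, it can be handy to check the kernel requirement mentioned above from a script. The following is a minimal sketch, not part of the Firedancer tooling: the function name kernel_newer_than is our own invention, and the 5.7 threshold is simply the one quoted in this article.

```shell
# Hypothetical helper (not part of Firedancer): succeed if the kernel release
# in $1 (e.g. "5.15.0-91-generic") is newer than major $2, minor $3.
kernel_newer_than() {
    release="$1"; want_major="$2"; want_minor="$3"
    major="${release%%.*}"        # text before the first dot
    rest="${release#*.}"          # text after the first dot
    minor="${rest%%[!0-9]*}"      # leading digits of the remainder
    [ "$major" -gt "$want_major" ] ||
        { [ "$major" -eq "$want_major" ] && [ "$minor" -gt "$want_minor" ]; }
}

# Usage:
#   kernel_newer_than "$(uname -r)" 5 7 || echo "kernel too old for Firedancer" >&2
```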
Configuration
In contrast to the Solana Labs client, Firedancer uses a TOML configuration file. Some configuration directives are passed through to the Solana Labs components, and you can use this to your advantage. For example, you can pass only the public key of your vote account in the vote_account_path directive and keep the vote keypair in a safe place, off the validator server. The key configuration section may look like this:
[consensus]
identity_path = "~/identity-keypair.json"
vote_account_path = "DGjZLMwYmQU7vCPqyTxdnyKwaf1uu8eMdLeSmzqv2SZj"
If you have experience running the Solana Labs client, the basic config for Firedancer will not raise many questions for you, except for the layout section. Let’s review the “must-have” minimum options of that section:
affinity - the number or range of logical CPUs available to Firedancer. According to the documentation, it is advisable to allocate some cores to the Solana Labs client and the rest to Firedancer. We had 32 cores on the server, enough to provide up to 9 cores to make Firedancer work (affinity = "0-27"). We also experimented with the affinity directive and found that Firedancer needs at least 12 cores to start successfully.
net_tile_count - the number of network queues on your network card. Set this directive to exactly the number of network queues the network driver exposes to the OS. If net_tile_count exceeds the number of network queues, you will lose some cores for nothing. If net_tile_count is lower than the number of network queues, Firedancer starts losing traffic (higher skip rate, lower vote success, etc.).
verify_tile_count - according to the official documentation, this directive controls how many tiles are dedicated to transaction signature verification. In our case, the command /opt/fdctl configure init all --config /home/firedancer/config.toml crashed whenever verify_tile_count was not equal to net_tile_count, and as a consequence Firedancer could not be started.
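To make the three directives above concrete, here is a small sketch that prints a layout section for a given queue count and CPU range. The helper name print_layout_section is hypothetical, and setting verify_tile_count equal to net_tile_count reflects only what worked in our setup, not a documented requirement.

```shell
# Hypothetical helper: print a [layout] section for config.toml.
#   $1 - number of network queues exposed by the NIC
#   $2 - logical CPU range for the affinity directive
print_layout_section() {
    queues="$1"
    affinity="$2"
    cat <<EOF
[layout]
    affinity = "${affinity}"
    net_tile_count = ${queues}
    verify_tile_count = ${queues}
EOF
}

# Usage (values from our test server; adjust for your hardware):
#   print_layout_section 2 0-27
```

The output is meant to be reviewed and pasted into the config by hand rather than appended blindly, since your config may already contain a layout section.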
To figure out the number of network queues, first get the name of your network interface with ip address or ifconfig. After that, you can use the ethtool utility to retrieve the number of network queues:
# ethtool -l {network_interface_name} | grep -A4 "Current hardware settings"
Current hardware settings:
RX: 0
TX: 0
Other: 0
Combined: 2
If you didn’t apply special tuning to your network stack, you will most likely find the number of network queues in the Combined row.
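If you want to pick up that number in a script, the Combined value under "Current hardware settings" can be extracted with awk. This is a sketch under the assumption that your ethtool prints the same layout as shown above; the function name current_combined_queues and the interface name in the usage comment are placeholders.

```shell
# Read `ethtool -l` output on stdin and print the current "Combined" count.
# The "Pre-set maximums" block also has a Combined row, so we only start
# matching after the "Current hardware settings" header.
current_combined_queues() {
    awk '
        /^Current hardware settings/ { current = 1; next }
        current && $1 == "Combined:"  { print $2; exit }
    '
}

# Usage (eth0 is a placeholder interface name):
#   ethtool -l eth0 | current_combined_queues
```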
Startup unit
If you plan to run Firedancer for an extended period, you will probably need configuration for the process manager. You are likely using Systemd, so we are happy to share our Systemd unit for Firedancer, which we used for running our instance of Firedancer in Testnet:
[Unit]
Description=Firedancer Validator Service
After=network.target
[Service]
ExecStartPre=/opt/fdctl configure init all --config /home/firedancer/config.toml
ExecStart=/opt/fdctl run --config /home/firedancer/config.toml
ExecStop=/opt/fdctl stop
Restart=on-failure
LimitNOFILE=100000
LimitNPROC=100000
LimitCORE=infinity
[Install]
WantedBy=multi-user.target
This is a very basic Systemd unit without hardening, so it is not production-ready and is suitable for testing only. If you plan to use it, adjust the paths to the config and to fdctl for your environment. It is worth mentioning that Firedancer requires root rights to start, so it makes no sense to run this unit under a non-root user. After starting the application and configuring the necessary system parameters, Firedancer drops its privileges to the user specified in the configuration file. Note that the configure command must be executed before each Firedancer startup and after each system restart.
Monitoring
Next, we would like to discuss monitoring. Solana Labs provides CLI utilities for monitoring validators, but there is no built-in Prometheus target (Prometheus is the de facto standard nowadays). We developed an in-house Prometheus exporter to monitor our Solana nodes (both validators and RPC), and we were thrilled to see that it could fetch all metrics from the Firedancer RPC with no issues and no changes required (you can find our Grafana dashboard with metrics fetched from Firedancer in Picture 1). Presumably, this works because the RPC interface is implemented in one of the Solana Labs modules integrated into Firedancer.
Firedancer is not yet equipped with tools for monitoring the node's performance in the cluster, although it was announced that some monitoring tools will be added in the future. Until then, you can use the Solana CLI tools to monitor Firedancer (solana-validator --ledger={ledger_path} monitor or solana catchup --our-localhost --follow); we checked that both work smoothly.
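For a quick manual check without any exporter, the node's RPC can also be queried directly with curl. The sketch below is not our exporter; the helper names are hypothetical, and the default URL assumes the RPC listens on the usual local Solana port 8899, which may differ in your config. getHealth and getSlot are standard Solana JSON-RPC methods.

```shell
# Build a parameterless JSON-RPC 2.0 request body for the given method name.
rpc_payload() {
    printf '{"jsonrpc":"2.0","id":1,"method":"%s"}' "$1"
}

# POST the request to an RPC endpoint.
#   $1 - method name, $2 - endpoint URL (assumed default: local port 8899)
rpc_call() {
    method="$1"
    url="${2:-http://127.0.0.1:8899}"
    curl -sS -X POST -H 'Content-Type: application/json' \
        -d "$(rpc_payload "$method")" "$url"
}

# Usage:
#   rpc_call getHealth
#   rpc_call getSlot
```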
Results
We ran the Firedancer client (version 1.17.1004) on our Testnet validator for four days and observed no unexpected crashes. It would be interesting to run some performance tests (Solana Labs client vs Firedancer), but that doesn't make much sense yet, because Firedancer's performance will be bottlenecked at the runtime/consensus stage, which is still borrowed from the Solana Labs client.
The Solana Labs validator can change its identity key on the fly, a capability we often use in our work. We asked the Firedancer developers about this feature, and they said it is not in their plans right now. Instead, their long-term idea is to make restarts as fast as possible, so that you can restart with the new identity key rather than rely on dynamic configuration.
We are delighted to confirm that Frankendancer can act as a Solana validator at this stage! It is clear that Jump Trading has invested a lot of work and passion in creating Firedancer, and we at P2P appreciate it very much. P2P validator has been a validator on Solana almost from the start of mainnet, and the upcoming Firedancer release on mainnet feels like one of the most significant events in Solana's ecosystem. It is a big step forward for the whole Solana community, and we look forward to testing genuine Firedancer (with no Solana Labs client as a dependency) later next year.
Authors:
Anton Yakovlev Lead SRE @ P2P validator Solana team
Ilya Shatalov DevOps Engineer @ P2P validator Solana team