Ethereum censorability monitor

Introduction

This article is our submission to Lido’s Ethereum censorability monitor grant.

Since the Ethereum Merge, MEV-Boost has become a significant part of the ecosystem. At the same time, the US government, via the Office of Foreign Assets Control (OFAC), has imposed sanctions on certain digital addresses. MEV relays are now divided between those that are OFAC-compliant and those that are not.

The main goal of this article is to show how this censorship contributes to blockchain degradation and to propose a way to monitor censorship on the Ethereum blockchain.

In this article, "delay" means the time difference between when a transaction enters the mempool and when it is included in a block.
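As a minimal sketch of this definition, the delay can be computed from the two timestamps stored per transaction. The function name and the sample timestamps are illustrative assumptions, not values from our dataset.

```python
from datetime import datetime, timezone


def delay_seconds(mempool_timestamp: datetime, block_timestamp: datetime) -> int:
    """Seconds between first sighting in the mempool and inclusion in a block."""
    return int((block_timestamp - mempool_timestamp).total_seconds())


# Made-up timestamps, for illustration only.
seen = datetime(2023, 2, 10, 12, 0, 3, tzinfo=timezone.utc)
included = datetime(2023, 2, 10, 12, 0, 11, tzinfo=timezone.utc)
example_delay = delay_seconds(seen, included)
```

Because the mempool timestamp is the moment a single node observed the transaction, the measured delay depends on that node's clock and network position.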

Dataset and dashboard

Our sources of data:

  1. Mempool public data.
    This data was collected with the web3 Python package from a single node located in Europe and stored in our data warehouse (DWH). The data was streamed 24/7, and we parsed approximately 1-1.2 million potential transactions per day. Our mempool sample covers about 95% of all transactions in the public Ethereum dataset.
  2. Public Ethereum Dataset
  3. Level of censorship applied by relays
  4. Block information obtained directly from relays (e.g. Flashbots)
  5. Government-sanctioned list of digital addresses
  6. Lido validator pubkeys. We used the validator dataset from Lido.
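The mempool collection in item 1 can be sketched roughly as follows. This is not our production collector: it only assumes an object with a `get_new_entries()` method, which is what web3.py returns from `w3.eth.filter("pending")`. The function name `record_first_seen` and the `FakeFilter` stand-in are hypothetical and exist only so the sketch runs without a node.

```python
import time
from typing import Callable, Dict, Iterable, List


def record_first_seen(pending_filter, first_seen: Dict[str, float],
                      clock: Callable[[], float] = time.time) -> int:
    """Store the arrival time of every hash not seen before; return how many were new."""
    now = clock()
    added = 0
    for tx_hash in pending_filter.get_new_entries():
        # Hashes may arrive as bytes-like objects or as strings.
        key = tx_hash.hex() if isinstance(tx_hash, (bytes, bytearray)) else str(tx_hash)
        if key not in first_seen:
            first_seen[key] = now
            added += 1
    return added


class FakeFilter:
    """Stand-in for a pending-transaction filter, used only for this example."""

    def __init__(self, batches: Iterable[List[str]]):
        self._batches = list(batches)

    def get_new_entries(self) -> List[str]:
        return self._batches.pop(0) if self._batches else []
```

With a live node, the same function would be called in a polling loop with the filter object from web3.py, and the collected timestamps would become `mempool_timestamp`.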

After the data was processed, we created a large dataset. The table below contains a description of the main variables:

| Column name | Data type (units) | Description |
|---|---|---|
| block_hash | STRING | Unique block identifier from the public Ethereum dataset |
| transaction_hash | STRING | Unique transaction identifier from the public Ethereum dataset |
| to_address | STRING | Transaction receiver |
| from_address | STRING | Transaction sender |
| block_timestamp | TIMESTAMP | Timestamp at which the block was created |
| mempool_timestamp | TIMESTAMP | Timestamp of when we parsed the mempool transaction |
| time_diff | BIGINT (seconds) | Time difference between when a transaction enters the mempool and is included in a block |
| block_diff | INT | Number of blocks produced between when a transaction enters the mempool and is finalized |
| gas | BIGINT | Gas allocated to the transaction |
| gas_price | BIGINT | Gas price |
| gas_fact | BIGINT | Gas spent |
| max_fee_per_gas | BIGINT | base_fee + max_priority_fee |
| max_priority_fee_per_gas | BIGINT | Additional fee to speed up transaction |
| relay | STRING | Name of relay |
| num_transaction | INT | Number of transactions within a block |
| height | BIGINT | Serial number of the block |
| builder_pubkey | STRING | Unique address of MEV-builder |
| lido_validator | STRING | Company of Lido validator (null if it is not a Lido validator) |
| transaction_censured_from | BOOLEAN | True if the sending address is under the sanctioned list |
| transaction_censured_to | BOOLEAN | True if the receiving address is under the sanctioned list |
| error_dummy | BOOLEAN | True if the transaction failed |
| censured_relay | BOOLEAN | True if the relay is censuring transactions |
| lido_validator_dummy | BOOLEAN | True if the Lido validator produces block |
| mev_dummy | BOOLEAN | True if the block is produced by MEV-builders |

This dataset is available on Google BigQuery in the table `p2p-data-warehouse.p2p_public.eth_mev_censored`.

Based on the data described above, we’ve created a dashboard with the main characteristics of Ethereum transactions.

Dashboard description

Our dashboard has the following parts:

  1. General data info. Data about transactions and blocks within the Ethereum blockchain, delay and transaction cost.
  2. Censorship between relays. Here we showcase the share of blocks for each MEV relay and the average delay for censoring and non-censoring MEV relays.
  3. Censorship addresses. Here we showcase all the available information about addresses that are on the OFAC sanctions list.
  4. Lido vs other validators. Here we divide transactions between those validated by Lido and those validated by other validators for comparison.
  5. Censorship between MEV-builders. Here we showcase a few metrics for every MEV-builder in the Ethereum ecosystem.

Building hypothesis

Full sample dataset

Our main goal is to estimate the level of blockchain degradation, i.e. longer time to verify transactions and higher transaction costs, that may be caused by censorship. We hypothesize that longer delays and higher transaction costs could be a sign of censorship.

We hypothesize that our main metrics (delay and transaction costs) could be statistically different across several subgroups. We also want to check the level of censorship employed by Lido validators, so we are going to test the following hypotheses:

  1. MEV transactions may be under censorship and it can lead to a slowdown in operations compared to non-MEV.
  2. Delay and cost of transactions could be different between Lido validators and other validators.
  3. Relays that censor transactions can have a longer delay than other relays.
  4. OFAC-compliant relays could take longer to process transactions compared to other relays.
  5. The probability of a transaction being included in the Nth block could differ between OFAC and non-OFAC cases.
  6. The probability of being included in an OFAC block could be different for Lido validators compared to non-Lido validators.
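One way to test hypotheses of this kind (for example, whether delays differ between two groups of relays) is a permutation test on the difference of medians. The sketch below is an illustration under assumptions: it is not necessarily the test used in our analysis, the function name is hypothetical, and the sample delays are made up.

```python
import random
from statistics import median
from typing import Sequence, Tuple


def permutation_test_median(group_a: Sequence[float], group_b: Sequence[float],
                            n_permutations: int = 2000, seed: int = 0) -> Tuple[float, float]:
    """Two-sided permutation test for a difference in medians.

    Returns (observed difference, p-value).
    """
    rng = random.Random(seed)
    observed = median(group_a) - median(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        # Reassign group labels at random and recompute the statistic.
        rng.shuffle(pooled)
        diff = median(pooled[:n_a]) - median(pooled[n_a:])
        if abs(diff) >= abs(observed):
            extreme += 1
    # Add-one correction keeps the p-value strictly positive.
    p_value = (extreme + 1) / (n_permutations + 1)
    return observed, p_value


# Made-up delays in seconds, for illustration only.
censoring_relay_delays = [12, 14, 15, 18, 20, 22, 25, 30]
other_relay_delays = [2, 3, 4, 5, 6, 7, 8, 9]
observed_diff, p_value = permutation_test_median(censoring_relay_delays, other_relay_delays)
```

A permutation test makes no assumption about the shape of the delay distribution, which matters here because delays have a very long right tail.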

Truncated dataset for potentially censored transactions

We want to isolate the transactions whose high delay cannot be explained by normal network conditions. Such transactions are suspected of being subject to censorship. To do this, we must take into account the following transaction properties.

High delay

We will start by choosing all the transactions over a certain threshold for the time delay in seconds.

Successful transactions

Next, we will only consider successful transactions since failure could be a reason for the delay.

Low transaction fees

Another reason for a high delay could be low fees. To account for this, we check the transaction fee and exclude transactions whose fee is too low.

Previous transaction pending

Sometimes transactions could be delayed simply because a previous transaction from the same sender had not yet finished. We use the nonce to exclude these transactions from our analysis.
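The four filters above can be sketched as a single pass over the transactions. This is a simplified illustration, not our exact pipeline: the `nonce` field, the two threshold parameters and the rule "the sender's previous nonce was also slow" are assumptions chosen for the example.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Tx:
    transaction_hash: str
    from_address: str
    nonce: int                      # assumed to be available per transaction
    time_diff: int                  # delay in seconds
    error_dummy: bool               # True if the transaction failed
    max_priority_fee_per_gas: int   # in wei


def suspicious_transactions(txs: Iterable[Tx], delay_threshold: int,
                            min_priority_fee: int) -> List[Tx]:
    """Keep slow, successful, adequately paid transactions not blocked by an earlier nonce."""
    txs = list(txs)
    # (sender, nonce) pairs of every slow transaction, used for the nonce check.
    slow = {(t.from_address, t.nonce) for t in txs if t.time_diff > delay_threshold}
    kept = []
    for t in txs:
        if t.time_diff <= delay_threshold:
            continue  # not a high delay
        if t.error_dummy:
            continue  # failed transaction
        if t.max_priority_fee_per_gas < min_priority_fee:
            continue  # low fee explains the delay
        if (t.from_address, t.nonce - 1) in slow:
            continue  # previous transaction from the same sender was still pending
        kept.append(t)
    return kept
```

Both thresholds are left as parameters because the appropriate values depend on network conditions during the sample period.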

After forming the truncated dataset, we will try to identify the reason for the high delay in censored transactions: government-driven or ethical censorship. We will check the receiver and sender addresses against the sanctioned list and look at the share of OFAC-compliant and ethically censoring MEV relays. Results across the full daily dataset can be found in our dashboard.
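Checking addresses against the sanctioned list amounts to a set lookup. The sketch below assumes addresses are compared case-insensitively (hex addresses may appear in checksum or lowercase form); the function names and the placeholder addresses are hypothetical.

```python
from typing import Iterable, Optional, Set, Tuple


def build_sanctioned_set(addresses: Iterable[str]) -> Set[str]:
    """Lowercase every address once so later lookups are case-insensitive."""
    return {address.lower() for address in addresses}


def censorship_flags(from_address: Optional[str], to_address: Optional[str],
                     sanctioned: Set[str]) -> Tuple[bool, bool]:
    """Return (sender is sanctioned, receiver is sanctioned)."""
    sender = from_address.lower() if from_address else None
    # to_address can be missing, e.g. for contract creation.
    receiver = to_address.lower() if to_address else None
    return sender in sanctioned, receiver in sanctioned


# Placeholder addresses, not real sanctioned entries.
sanctioned = build_sanctioned_set(["0x" + "AB" * 20])
flags = censorship_flags("0x" + "ab" * 20, "0x" + "cd" * 20, sanctioned)
```

The two booleans correspond to the `transaction_censured_from` and `transaction_censured_to` columns.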

Censorship analysis (10.02.2023 - 14.02.2023)

The code required to reproduce our results is available here

Exploratory Data Analysis

Our sample dataset has 4 798 993 transactions and of those, 153 847 (~3%) are failed transactions.

The delay for most transactions does not exceed 27 seconds (95% quantile) and almost every transaction has been delayed for only one block (block_diff = 1).

Most transactions have fees with a skewness of zero. The difference between the 99.5% quantile and the 95% quantile is greater than 4 times the fee. The table below shows the transaction time delay and fees.

Main quantiles for delay and fees

| Variable | 5% | 10% | 25% | 50% | 75% | 90% | 95% | 97.5% | 99% | 99.5% |
|---|---|---|---|---|---|---|---|---|---|---|
| Delay, secs | 1 | 2 | 4 | 8 | 10 | 15 | 27 | 122 | 972 | 11984 |
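Quantiles of this kind can be computed in several ways. The sketch below uses the nearest-rank definition as an assumption; the numbers in the table may have been produced with a different method (for example, interpolation), and the sample delays here are made up.

```python
import math
from typing import Sequence


def quantile_nearest_rank(values: Sequence[float], q: float) -> float:
    """Nearest-rank quantile: the smallest value with at least a share q of the data at or below it."""
    if not 0 < q <= 1:
        raise ValueError("q must be in (0, 1]")
    ordered = sorted(values)
    if not ordered:
        raise ValueError("values must not be empty")
    rank = math.ceil(q * len(ordered))
    return ordered[rank - 1]


# Made-up delays in seconds, for illustration only.
sample_delays = [1, 2, 4, 8, 8, 9, 10, 12, 15, 27]
median_delay = quantile_nearest_rank(sample_delays, 0.5)
p95_delay = quantile_nearest_rank(sample_delays, 0.95)
```

On a dataset of several million rows, the upper quantiles (99% and above) are driven by a small number of very slow transactions, which is why they are reported separately.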
