Data Availability Layer - One of the critical layers to Blockchain Scalability & Security

Jan 29, 2024

Cross-post from ЯG Bytes

The term "data availability" (DA) rose to prominence in the Ethereum community with Ethereum Improvement Proposal (EIP) 4844, which was introduced at ETHDenver 2022, although the underlying concept predates that proposal. It is a key concept that significantly influences both the scalability and the security of the Ethereum blockchain. Data availability refers to the accessibility of the transaction data for each block in the blockchain, which network participants such as full node operators need in order to validate blocks. However, requiring every participant to download the complete transaction data for each block imposes a substantial hardware and network burden, which hinders the blockchain's ability to scale.

Data availability can be broken down into two fundamental aspects: first, the guarantee that the data for the transactions in each block is accessible; and second, the ability to verify blocks without downloading the entire transaction dataset. Full nodes download all the transaction data for a block and execute the individual transactions to check that the block is correct, but doing the same is not feasible for light nodes, stateless nodes or L2 rollups. The data availability solutions being developed by different projects are therefore crucial for the success of such participants and for the scalability of the blockchain.

In this post, I focus on establishing a framework for assessing the current landscape of data availability solutions, particularly in the context of their use by Layer 2 (L2) solutions. To illustrate this framework, I will examine three distinct projects: Avail, Celestia and EigenDA. (Note: An affiliate of Protagonist Management LLC is an investor in Layr Labs, Inc., the company that is developing EigenDA.)

Current State of Data Availability Solutions

The landscape of data availability solutions within the modular blockchain ecosystem is evolving rapidly and is one of the most powerful narratives entering 2024. The projects within the scope of this post are at various stages of development.

Avail, started by a team at Polygon, is a project aimed at developing modular blockchains that enable developers to create customizable and scalable applications. Positioning itself as a robust base layer with a sharp focus on data availability, Avail has gained significant traction on its testnet, with a mainnet launch planned for sometime in 2024.

Launched in 2023, Celestia stands out as a modular data availability network. Its mainnet is currently in beta. The network is considered one of the leading projects in the data availability space.

Originating as a proof of concept within the EigenLayer ecosystem, EigenDA has evolved into a more ambitious project. Initially intended to be the first AVS (Actively Validated Service) on EigenLayer, the team realized the potential for a high-performance data availability layer and how it could be a critical component in the broader modular blockchain narrative. Currently in the testnet phase, EigenDA plans to launch its mainnet by the end of Q1 2024, marking a significant milestone in its development.

Architecture Highlights

A detailed architecture deep dive is beyond the scope of this article, but here are some architectural highlights for each of the solutions and what distinguishes them from each other.

* Erasure Coding - Erasure coding (EC) is a method of data protection that involves breaking data into fragments, expanding and encoding them with redundant data pieces, and then storing them across different nodes. This technique increases data redundancy and allows for the reconstruction of data in case of node failure or data corruption.

** Namespaced Merkle Trees - Celestia partitions the block data into multiple namespaces, one for every application (e.g., rollup) using the DA layer. As a result, every application needs to download only its own data and can ignore the data of other applications.
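The namespace idea in the second note can be illustrated with a small sketch. This is not Celestia's implementation: the names `Node`, `build_tree` and `collect`, the 8-byte namespace IDs and the hashing layout are all assumptions made for illustration. Each node records the range of namespace IDs beneath it, so a reader looking for one namespace can skip every subtree whose range excludes it.

```python
import hashlib
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Node:
    min_ns: int                      # smallest namespace ID under this node
    max_ns: int                      # largest namespace ID under this node
    digest: bytes
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    data: Optional[bytes] = None     # set only on leaves


def _ns_bytes(ns: int) -> bytes:
    return ns.to_bytes(8, "big")     # hypothetical fixed-width namespace ID


def make_leaf(ns: int, data: bytes) -> Node:
    digest = hashlib.sha256(b"\x00" + _ns_bytes(ns) + data).digest()
    return Node(ns, ns, digest, data=data)


def make_parent(left: Node, right: Node) -> Node:
    # The parent hash commits to each child's namespace range and digest.
    payload = b"\x01"
    for child in (left, right):
        payload += _ns_bytes(child.min_ns) + _ns_bytes(child.max_ns) + child.digest
    return Node(min(left.min_ns, right.min_ns), max(left.max_ns, right.max_ns),
                hashlib.sha256(payload).digest(), left, right)


def build_tree(blobs: List[Tuple[int, bytes]]) -> Node:
    """Build a tree over a non-empty list of (namespace, data) blobs."""
    leaves = [make_leaf(ns, data) for ns, data in sorted(blobs, key=lambda b: b[0])]

    def build(nodes: List[Node]) -> Node:
        if len(nodes) == 1:
            return nodes[0]
        mid = len(nodes) // 2
        return make_parent(build(nodes[:mid]), build(nodes[mid:]))

    return build(leaves)


def collect(node: Node, ns: int) -> List[bytes]:
    """Return the data for one namespace, skipping subtrees outside its range."""
    if ns < node.min_ns or ns > node.max_ns:
        return []
    if node.data is not None:
        return [node.data]
    return collect(node.left, ns) + collect(node.right, ns)


root = build_tree([(2, b"rollup-B tx"), (1, b"rollup-A tx1"),
                   (3, b"rollup-C tx"), (1, b"rollup-A tx2")])
print(collect(root, 1))  # [b'rollup-A tx1', b'rollup-A tx2']
```

In this toy layout, a rollup using namespace 1 never touches the subtree that holds namespaces 2 and 3.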

Evaluating Data Availability Solutions

In this section I establish criteria that teams building L2 solutions should consider when evaluating the various solutions.

Cost

Understanding the cost structure is crucial for L2 networks, particularly since high data availability costs on the mainnet are a significant driver for moving to alternative solutions. The costs for any data availability solution typically consist of transaction fees, data storage fees, and bandwidth utilization fees, along with optional priority fees for faster processing. Data availability solutions might also offer long-term reservation of storage and bandwidth to keep costs predictable and low.
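As a rough way to compare fee schedules, the cost components listed above can be put into a small model. This is a minimal sketch: the `FeeSchedule` fields, the per-byte pricing and all the numbers are hypothetical and do not reflect the actual pricing of Avail, Celestia or EigenDA.

```python
from dataclasses import dataclass


@dataclass
class FeeSchedule:
    # All values are in an arbitrary "fee unit"; the numbers used below are made up.
    tx_fee: int                   # flat fee per blob submission
    storage_fee_per_byte: int     # charged per byte stored
    bandwidth_fee_per_byte: int   # charged per byte transferred
    priority_fee: int = 0         # optional, for faster processing


def blob_cost(fees: FeeSchedule, blob_bytes: int, priority: bool = False) -> int:
    """Cost of posting one blob under the assumed fee model."""
    cost = fees.tx_fee
    cost += blob_bytes * (fees.storage_fee_per_byte + fees.bandwidth_fee_per_byte)
    if priority:
        cost += fees.priority_fee
    return cost


def period_cost(fees: FeeSchedule, blob_bytes: int, blobs_per_day: int, days: int = 30) -> int:
    """Total cost of posting same-sized blobs at a steady rate, without priority."""
    return blob_cost(fees, blob_bytes) * blobs_per_day * days


fees = FeeSchedule(tx_fee=100, storage_fee_per_byte=2,
                   bandwidth_fee_per_byte=1, priority_fee=50)
print(blob_cost(fees, 1000))                 # 3100
print(blob_cost(fees, 1000, priority=True))  # 3150
print(period_cost(fees, 1000, 24))           # 2232000
```

Running the same workload through each candidate's real fee schedule gives a like-for-like comparison, and a reservation discount could be modeled as a lower per-byte rate.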

A key strategic consideration is whether L2 networks can pay for these costs in their native tokens, potentially easing the economic strain during initial scaling and user onboarding phases. As data availability layers are still evolving, there's an opportunity for L2 projects to collaborate with these platforms to develop a cost model that benefits all parties involved, ensuring economic feasibility and sustainability.

Performance

Comparing the block size, latency, and throughput of different networks is essential to determine which data availability network best suits your needs.

Block size matters because it determines how much L2 transaction data can be included in a single block of the data availability layer. The larger the block, the more bundled L2 transactions fit into it, which helps to eliminate data availability as a bottleneck in transaction processing. While most data availability solutions aim for a 1GB block size, the block sizes they initially support are much smaller.

Latency is defined as the time it takes from when a Layer 2 solution requests to write a data blob to the data availability layer, to the point when the availability of this blob can be verified on the chain. Lower latency is crucial because it means the data availability layer becomes less of a bottleneck in the transaction workflow.

Throughput is measured by the number of bytes that can be written to the DA layer within a specific timeframe. Higher throughput is crucial for use cases that need near-real-time data, such as gaming and video or audio streaming. It’s also important to consider whether the solution can be tuned for higher throughput or lower latency depending on the use case.
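The relationship between block size, block time and throughput can be made concrete with some back-of-the-envelope arithmetic. This is a sketch under simplifying assumptions: the L2 is assumed to be the only user of the block space, and the block size and block time below are hypothetical figures, not the parameters of any of the three networks.

```python
def throughput_bytes_per_sec(block_size_bytes: int, block_time_sec: float) -> float:
    """Upper bound on the bytes written per second if every block is full."""
    return block_size_bytes / block_time_sec


def blocks_needed(batch_bytes: int, block_size_bytes: int) -> int:
    """Number of DA blocks a batch occupies (ceiling division)."""
    return -(-batch_bytes // block_size_bytes)


def min_posting_time_sec(batch_bytes: int, block_size_bytes: int, block_time_sec: float) -> float:
    """Lower bound on the time to post a batch, ignoring confirmation delays."""
    return blocks_needed(batch_bytes, block_size_bytes) * block_time_sec


# Hypothetical network: 2 MB blocks produced every 10 seconds.
print(throughput_bytes_per_sec(2_000_000, 10))         # 200000.0
print(blocks_needed(5_000_000, 2_000_000))             # 3
print(min_posting_time_sec(5_000_000, 2_000_000, 10))  # 30
```

Real latency also includes the time for the network to attest that the blob is available, so these figures are best read as lower bounds.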

Data Availability Guarantees

Redundancy of data is crucial to ensure data availability in case a percentage of nodes is compromised or corrupted. All three data availability solutions use erasure coding for data protection and redundancy. Further, while AvailDA and EigenDA use validity proofs in the form of KZG polynomial commitments to ensure data availability and validity, Celestia DA relies on fraud proofs for the same purpose.
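To show why erasure coding provides redundancy, here is a toy Reed-Solomon-style code built on polynomial interpolation over a prime field. It is a minimal sketch and not the encoding used by any of the three networks: the modulus `P`, the function names and the quadratic-time interpolation are assumptions chosen for readability.

```python
P = 2**31 - 1  # prime modulus (assumption); every data symbol must be below P


def _interpolate(points, x):
    """Evaluate at x the unique lowest-degree polynomial through points (mod P)."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        # pow(den, -1, P) is the modular inverse (Python 3.8+).
        total = (total + yi * num * pow(den, -1, P)) % P
    return total


def encode(data, n):
    """Extend k data symbols to n coded symbols; the first k equal the data."""
    points = list(enumerate(data))
    return [_interpolate(points, x) for x in range(n)]


def decode(shares, k):
    """Recover the k data symbols from any k distinct (index, value) shares."""
    points = shares[:k]
    return [_interpolate(points, x) for x in range(k)]


data = [10, 20, 30, 40]
coded = encode(data, 8)                            # 2x redundancy
survivors = [(i, coded[i]) for i in (1, 4, 6, 7)]  # half of the symbols are lost
print(decode(survivors, 4))                        # [10, 20, 30, 40]
```

Any four of the eight coded symbols are enough to rebuild the data, so in this toy example the data survives the loss of half of the symbols.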

In addition to the above, EigenDA uses proof of custody, in which each operator must routinely compute and commit to the value of a function that can only be computed if the operator has stored all the blob chunks allocated to it over a designated storage period. EigenDA also has a feature called Dual Quorum, where two separate quorums can be required to attest to the availability of data, making it more secure.
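The dual quorum idea can be sketched as a stake-weighted check that must pass in two independent operator sets. This is an illustration only, not EigenDA's contract logic: the quorum names, the stake figures and the 67% threshold are hypothetical.

```python
def quorum_attested(stakes, signers, threshold_pct):
    """True if operators holding at least threshold_pct of the stake signed."""
    total = sum(stakes.values())
    signed = sum(stake for op, stake in stakes.items() if op in signers)
    return total > 0 and signed * 100 >= threshold_pct * total


def blob_confirmed(quorums, signers, threshold_pct):
    """A blob counts as available only if every required quorum attests to it."""
    return all(quorum_attested(stakes, signers, threshold_pct)
               for stakes in quorums.values())


# Hypothetical stake tables: one quorum per staked asset.
quorums = {
    "restaked_eth": {"op1": 50, "op2": 30, "op3": 20},
    "native_token": {"op1": 10, "op2": 10, "op3": 80},
}

print(blob_confirmed(quorums, {"op1", "op2"}, 67))         # False
print(blob_confirmed(quorums, {"op1", "op2", "op3"}, 67))  # True
```

In the first call the signers hold 80% of the first quorum but only 20% of the second, so the blob is not confirmed. An attacker would have to control enough stake in both quorums at once.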

Security

It’s important to monitor the security of the data availability solutions on an ongoing basis, not just when evaluating a solution for your L2. Apart from ensuring that the network code and any smart contract code are open source and audited by a reputable auditing firm, there are a few other considerations to keep in mind when evaluating data availability solutions. Firstly, consider what is at stake for the node operators to ensure they operate honestly. While AvailDA and CelestiaDA require operators to stake their native tokens (AVL and TIA respectively), EigenDA requires operators to restake ETH. While the choice of token isn’t a direct reflection of the quality of the stake, it is important to monitor the value of the stake to ensure that it’s significant enough to provide cryptoeconomic security.

Secondly, it is also important to track the incentives for node operators and how many nodes are participating in securing the network. As these networks mature, they will compete for market share, and operators will be attracted to networks that offer good economic incentives. Good incentives also lead to more node operators participating in securing the network, making the network more decentralized.

Finally, understanding the level of fault tolerance of these networks is very important. Essentially, this means determining the percentage of nodes that need to collude or fail for the network to be compromised. This number, coupled with the economic incentives, will help you gauge the security of the network.
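One simple way to combine fault tolerance with stake value is to ask how many of the largest operators would have to collude, and how much stake they would put at risk. The sketch below assumes that the network is compromised once colluding operators control at least a given fraction of the total stake. The function names, the stake distribution and the one-third and two-thirds thresholds are hypothetical.

```python
def min_colluding_operators(stakes, num, den):
    """Fewest operators whose combined stake reaches num/den of the total."""
    total = sum(stakes)
    acc = 0
    for count, stake in enumerate(sorted(stakes, reverse=True), start=1):
        acc += stake
        if acc * den >= num * total:   # integer comparison, no float rounding
            return count
    return len(stakes)


def stake_at_risk(stakes, num, den):
    """Combined stake of that smallest colluding set."""
    count = min_colluding_operators(stakes, num, den)
    return sum(sorted(stakes, reverse=True)[:count])


stakes = [40, 30, 20, 10]  # hypothetical stake per operator, in USD millions

print(min_colluding_operators(stakes, 1, 3))  # 1
print(min_colluding_operators(stakes, 2, 3))  # 2
print(stake_at_risk(stakes, 2, 3))            # 70
```

A small operator count at the threshold points to concentrated stake, even when the total value staked looks large.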

Conclusion

Data availability is set to play a crucial role, especially in the narrative of modular blockchains. The L2 solutions being developed will increasingly depend on these data availability solutions to scale effectively. However, this isn't likely to be a 'winner-takes-all' market. Diverse solutions like AvailDA, CelestiaDA and EigenDA each have the potential to capture a share of the market. These solutions might even carve out niche markets for themselves, catering to specific sectors such as gaming and real-world assets (RWA).

An intriguing possibility that isn’t prevalent yet but could emerge is L2 solutions utilizing multiple data availability solutions simultaneously. This approach could be for redundancy or to meet varying Service Level Agreements (SLAs) for different use cases. Looking ahead, we might see data availability solutions evolve beyond just handling transaction data. They could start storing any arbitrary data blobs required by Decentralized Applications (DApps) to deliver their services. Keeping an eye on how these solutions evolve will definitely be interesting.
