Why database sharding is massive for Ethereum developers


Are you curious about consensus algorithms and why they’re key to blockchain technology? Great! You’ve come to the right place. To learn blockchain development and get certified, I recommend visiting Ivan on Tech Academy.

Blockchain is currently the #1 ranked skill on LinkedIn. Because of that, you should definitely learn more about Ethereum if you want to land a full-time position in crypto during 2020.

In my first and second pieces, I discussed Ethereum 2.0 and the best tools for developers. In my third and fourth articles, I discussed quadratic voting and open governance models. Then, in my fifth piece, I looked into Swarm’s infrastructure.

In my sixth and seventh pieces, I took a deep dive into consensus algorithms. Finally, in my last release, I looked into the blockchain trilemma. Thankfully, that previous issue links perfectly to what we’ll discuss today.

Now to the topic at hand: what is database sharding and how does it work? Is database sharding a key feature for blockchain technology? Should Ethereum developers pay attention to how sharding works, how it can be deployed, and what its advantages for scalability are?

What is database sharding?

Database sharding is seen as one of the most prominent ways to scale blockchains and is currently being developed by projects such as Ethereum, Zilliqa or Polkadot.

A key argument from computer science is that you can’t scale a system without sacrificing security, decentralisation, or both. This trilemma shows how complicated it is to properly distribute a system among independent peers who can join and leave the network as they please. Communication between nodes can become incredibly slow, since nodes can disconnect at any point in time without consequences, which adds to network latency. Network bandwidth also plays a key role in the overall system’s performance, which can be a significant problem for many cryptocurrencies, as most people in the world can’t afford to transfer large amounts of data.

Is there a way to address both the network latency and bandwidth problem?

Enter sharding

Developers have proposed many solutions to address the issue of throughput on the protocol level. These solutions can mostly be separated into those that delegate all the computation to a small set of powerful nodes, and those that have each node in the network do only a subset of the total amount of work. Database sharding, nowadays most commonly known simply as sharding, belongs to the second group: it is a technique that allows each node to process only a small part of the blockchain’s transactions. At the same time, it makes sure the state of the whole chain is correctly ordered and validated. Sharding creates multiple smaller databases that only store local copies of transactions. Each database validates and stores a small part, making the system lighter as a whole.
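To make the idea of "each node only doing a subset of the work" concrete, here is a minimal sketch in Python. It assumes transactions are assigned to shards by hashing the sender's address; the shard count, the `shard_for` and `partition` helpers and the sample transactions are all hypothetical and are not taken from Ethereum, Zilliqa or Polkadot.

```python
import hashlib

NUM_SHARDS = 4  # hypothetical shard count, chosen only for illustration


def shard_for(address: str, num_shards: int = NUM_SHARDS) -> int:
    """Map an account address to a shard id using a stable hash."""
    digest = hashlib.sha256(address.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an integer, then reduce it
    # to a shard id in the range [0, num_shards).
    return int.from_bytes(digest[:8], "big") % num_shards


def partition(transactions, num_shards: int = NUM_SHARDS):
    """Group transactions by the shard that owns the sender's account."""
    shards = {shard_id: [] for shard_id in range(num_shards)}
    for tx in transactions:
        shards[shard_for(tx["sender"], num_shards)].append(tx)
    return shards


# Made-up transactions, for illustration only.
transactions = [
    {"sender": "alice", "receiver": "bob", "amount": 5},
    {"sender": "bob", "receiver": "carol", "amount": 2},
    {"sender": "carol", "receiver": "alice", "amount": 1},
    {"sender": "alice", "receiver": "carol", "amount": 3},
]

shards = partition(transactions)
for shard_id, txs in shards.items():
    print(f"shard {shard_id} processes {len(txs)} transaction(s)")
```

Because the hash is deterministic, every node agrees on which shard owns which account, and a node assigned to one shard can ignore the transactions in all the others.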

Sharding features and properties

Database sharding has a simple goal: to allow blockchains to scale. To fully understand its different properties and features, I must first explain what the role of nodes is.

Essentially, nodes should perform the following tasks:

  • Process and verify transactions,
  • Relay validated transactions and completed blocks to other nodes,
  • Store the state and the history of the entire network ledger.

Each of these three tasks imposes a growing requirement on the nodes operating the network. Essentially, compute, relay and storage problems are prone to arise in public and open blockchains:

  1. Processing transactions requires more compute power as the number of transactions being processed increases.
  2. Relaying transactions and blocks requires more network bandwidth as the number of transactions being relayed increases.
  3. Storing data requires more storage as the state grows. Importantly, unlike processing power and network bandwidth, the storage requirement grows even if the transaction rate (number of transactions processed per second) remains constant.
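The third point is easy to check with a little arithmetic. The sketch below is a deliberately simplified model: it assumes every transaction has the same size and that nodes keep every transaction forever. The rate of 15 transactions per second and the size of 250 bytes are illustrative assumptions, not measurements of any real network.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def node_requirements(tx_per_second: int, tx_size_bytes: int, seconds_elapsed: int):
    """Return (bandwidth in bytes per second, total bytes stored so far)."""
    # Bandwidth depends only on the current transaction rate.
    bandwidth = tx_per_second * tx_size_bytes
    # Storage accumulates: everything relayed so far must be kept.
    storage = bandwidth * seconds_elapsed
    return bandwidth, storage


# Assumed figures, for illustration only.
TX_PER_SECOND = 15
TX_SIZE_BYTES = 250

for days in (1, 30, 365):
    bandwidth, storage = node_requirements(
        TX_PER_SECOND, TX_SIZE_BYTES, days * SECONDS_PER_DAY
    )
    print(f"after {days:>3} days: {bandwidth} B/s relayed, {storage / 1e9:.2f} GB stored")
```

In this model the bandwidth figure is identical on every line, while the stored total keeps climbing, which is why storage becomes a problem even for a chain whose usage is flat.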

Under state sharding, the nodes in each shard build their own blockchain, which contains only the transactions that affect the part of the state assigned to that shard. The validators only need to process, store and relay transactions that affect their part of the state. This partition reduces the compute power, storage and network bandwidth each node needs, roughly in proportion to the number of shards. However, it introduces new problems such as data availability and cross-shard transactions.
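A toy model helps show both the saving and the new problem. In the sketch below, each shard keeps only the balances of its own accounts and its own chain. The `Shard` class, the `ACCOUNT_TO_SHARD` mapping and the account names are hypothetical, and real protocols assign accounts to shards and validate blocks in far more involved ways.

```python
from dataclasses import dataclass, field


@dataclass
class Shard:
    """One shard: a local slice of the state plus its own chain."""
    shard_id: int
    balances: dict = field(default_factory=dict)
    chain: list = field(default_factory=list)

    def apply_local(self, sender: str, receiver: str, amount: int) -> None:
        """Apply a transfer whose sender and receiver both live on this shard."""
        if self.balances.get(sender, 0) < amount:
            raise ValueError("insufficient funds")
        self.balances[sender] -= amount
        self.balances[receiver] = self.balances.get(receiver, 0) + amount
        self.chain.append((sender, receiver, amount))


# Hypothetical assignment of accounts to shards.
ACCOUNT_TO_SHARD = {"alice": 0, "bob": 0, "carol": 1}

shards = {
    0: Shard(0, {"alice": 10, "bob": 0}),
    1: Shard(1, {"carol": 4}),
}


def route(sender: str, receiver: str, amount: int) -> str:
    """Apply the transfer if it is local to one shard, otherwise flag it."""
    src = ACCOUNT_TO_SHARD[sender]
    dst = ACCOUNT_TO_SHARD[receiver]
    if src == dst:
        shards[src].apply_local(sender, receiver, amount)
        return "local"
    # The two accounts live in different slices of the state, so neither
    # shard can complete the transfer on its own.
    return "cross-shard"


print(route("alice", "bob", 3))    # both accounts are on shard 0
print(route("alice", "carol", 1))  # sender on shard 0, receiver on shard 1
```

Shard 1 never sees the first transfer, which is where the saving comes from. The second transfer is the hard case: it touches two slices of the state, so it needs some extra protocol on top of the shards.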

An overview of different Sharding versions

To fully understand sharding, I must explain the different types of sharding being developed. While one is focused on scalability, the other is concentrated on cross-shard communication. These are the two major versions of sharding being used:

  1. Partitioned sharding, where shards run independently and don’t communicate with each other through a central relay.
  2. State sharding, where shards communicate with each other through a shared state, or central relay.
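The difference between the two versions can be sketched in a few lines of Python. The `Relay` and `Shard` classes below are hypothetical and heavily simplified: a shard built without a relay behaves like a partitioned shard, while shards that share a relay can pass value to each other by debiting on one side and crediting on the other. This illustrates the general idea only, not how any particular network implements it.

```python
from collections import deque


class Relay:
    """Hypothetical central relay that queues messages per destination shard."""

    def __init__(self):
        self.queues = {}

    def send(self, dst_shard: int, message: tuple) -> None:
        self.queues.setdefault(dst_shard, deque()).append(message)

    def receive(self, shard_id: int):
        queue = self.queues.get(shard_id)
        return queue.popleft() if queue else None


class Shard:
    def __init__(self, shard_id: int, balances: dict, relay: Relay = None):
        self.shard_id = shard_id
        self.balances = dict(balances)
        self.relay = relay  # None models a partitioned (isolated) shard

    def send_cross_shard(self, sender, dst_shard, receiver, amount):
        """Debit locally, then ask the relay to deliver the credit."""
        if self.relay is None:
            raise RuntimeError("partitioned shard: no relay to other shards")
        if self.balances.get(sender, 0) < amount:
            raise ValueError("insufficient funds")
        self.balances[sender] -= amount
        self.relay.send(dst_shard, (receiver, amount))

    def process_inbox(self) -> int:
        """Credit every pending message addressed to this shard."""
        processed = 0
        if self.relay is None:
            return processed
        while (message := self.relay.receive(self.shard_id)) is not None:
            receiver, amount = message
            self.balances[receiver] = self.balances.get(receiver, 0) + amount
            processed += 1
        return processed


relay = Relay()
shard_a = Shard(0, {"alice": 10}, relay)
shard_b = Shard(1, {"carol": 4}, relay)

shard_a.send_cross_shard("alice", 1, "carol", 3)
print(shard_b.balances)  # carol is not credited until shard_b processes its inbox
shard_b.process_inbox()
print(shard_a.balances, shard_b.balances)
```

Note that the credit only lands once the destination shard comes online and drains its queue, which hints at why data availability becomes a concern when shards depend on each other.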

Each type of database sharding has its own benefits and drawbacks, described in the table below.

| | Partitioned | State |
|---|---|---|
| Properties | Independent shards, own validators, no coordinator required | Quadratic sharding capabilities, cross-shard communication, coordinator required |
| Benefits | Smaller chains mean faster communication and sync times within nodes; blockchain size decreases exponentially | Smaller, linked chains mean faster communication and sync times between shards; chain size decreases; increased interoperability |
| Risks | As each shard has its own validators, each shard is less secure; routing issues between nodes | Data availability is reduced as each shard needs to be online to relay information; less security than a one-chain solution |
| Example | BeansTalk | Beacon Chain |

Conclusion

Database sharding is a technique that separates a single blockchain into multiple smaller blockchains (or shards). Each shard runs independently and processes its own transactions. Improvements to how sharding works are being tested, such as cross-shard communication, which allows shards to exchange information with one another. Some of the key benefits of sharding are the reduction of the overall blockchain size and the possibility of a quadratic improvement in network performance.

Resources

This article is not financial advice. Changes may happen that the author is unaware of. Always check the resources provided!