Blockchain Primer

A deeper look at the concepts that underpin blockchains. This page is a reference companion to Blockchain Essentials; come here when you want the "why" and "how" behind the ideas introduced there.

Cryptography Basics

Blockchains depend on three cryptographic primitives: hashing, public-key cryptography, and digital signatures. None of them are new (they date back decades), but blockchains combine them in a way that makes trustless cooperation possible.

Hashing

A hash function takes an input of any size and produces a fixed-size output (the "hash" or "digest"). It acts as a fingerprint for data: the same input always produces the same fingerprint, but even a tiny change to the input produces a completely different result.

Critically, hash functions are one-way: given a fingerprint, there is no practical way to reconstruct the original data. This property is what makes them useful for verifying integrity without revealing the underlying content.

Bitcoin uses SHA-256. Substrate-based chains (including Subtensor) primarily use Blake2b configured for a 256-bit output, which is generally faster in software while providing comparable security. In both cases, the output is 256 bits (32 bytes), a space so large that accidental collisions (two different inputs producing the same hash) are effectively impossible.
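
The fingerprint behaviour described above can be tried with Python's standard `hashlib` module. This is a minimal sketch: the two message strings are made-up examples, and `digest_size=32` is what selects the 256-bit Blake2b output.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def blake2b_256_hex(data: bytes) -> str:
    """Return the 256-bit BLAKE2b digest of data as a hex string."""
    # digest_size=32 requests a 32-byte (256-bit) output instead of the 64-byte default.
    return hashlib.blake2b(data, digest_size=32).hexdigest()


# Hypothetical messages that differ by a single character.
original = b"transfer 10 TAO to Alice"
tampered = b"transfer 11 TAO to Alice"

for name, hash_fn in (("SHA-256", sha256_hex), ("BLAKE2b-256", blake2b_256_hex)):
    print(name, "original:", hash_fn(original))
    print(name, "tampered:", hash_fn(tampered))
```

Both functions always return 64 hex characters (32 bytes) regardless of input length, and the two messages produce unrelated digests even though they differ by one character.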

Blockchains use hashing everywhere: each block's header contains a hash of the previous block (creating the "chain"), and the state of the entire database is summarized by a single root hash in a structure called a Merkle tree. If anyone changes any stored value, the root hash changes, and every node on the network can detect the tampering.
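
To see why a single root hash is enough to detect tampering, here is a sketch of a simple binary Merkle tree. It is illustrative only: Substrate's actual state structure is a Patricia-style trie rather than this binary layout, and the leaf values and the rule of duplicating the last node on odd levels are assumptions of the sketch.

```python
import hashlib
from typing import List


def h(data: bytes) -> bytes:
    """256-bit BLAKE2b digest, used for both leaves and inner nodes."""
    return hashlib.blake2b(data, digest_size=32).digest()


def merkle_root(leaves: List[bytes]) -> bytes:
    """Fold a list of leaf values into a single 32-byte root hash."""
    if not leaves:
        return h(b"")
    level = [h(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2 == 1:
            # Assumption of this sketch: pad odd levels by repeating the last node.
            level.append(level[-1])
        # Each parent is the hash of its two children concatenated.
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


# Hypothetical stored values.
state = [b"alice=100", b"bob=50", b"carol=7"]
tampered_state = [b"alice=100", b"bob=5000", b"carol=7"]

print(merkle_root(state).hex())
print(merkle_root(tampered_state).hex())
```

Changing any one leaf changes its hash, which changes its parent, and so on up to the root, so comparing two 32-byte roots is enough to tell whether two nodes hold the same data.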

Public and Private Keys

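Asymmetric cryptography (also called public-key cryptography) uses a pair of mathematically linked keys: a private key that you keep secret and a public key that you share openly. In encryption schemes, data encrypted with the public key can only be decrypted with the matching private key. In signature schemes, which are what blockchains rely on most, the private key produces signatures and the public key verifies them.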

On a blockchain, your public key (or a derivative of it) serves as your address, the identity other participants use to send you funds or verify your actions. Your private key proves you own that address. Losing your private key means losing access to your account permanently; there is no "forgot password" mechanism because there is no central authority to appeal to.

Substrate chains use the Sr25519 scheme (Schnorr signatures over the Ristretto group on Curve25519) by default. In the Bittensor ecosystem, accounts are represented as SS58 addresses, a human-friendly encoding of the public key with a network-specific prefix (Bittensor uses prefix 42, the generic Substrate prefix).
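
The SS58 encoding can be sketched in a few lines of standard-library Python. The sketch assumes a 32-byte public key and a single-byte prefix (0 to 63): it prepends the prefix, appends the first two bytes of a BLAKE2b-512 hash of `SS58PRE` plus the payload as a checksum, and Base58-encodes the result. The helper names and the dummy key are hypothetical, and production code should use a maintained library.

```python
import hashlib

BASE58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def base58_encode(data: bytes) -> str:
    """Encode bytes with the Bitcoin-style Base58 alphabet."""
    number = int.from_bytes(data, "big")
    encoded = ""
    while number > 0:
        number, remainder = divmod(number, 58)
        encoded = BASE58_ALPHABET[remainder] + encoded
    # Each leading zero byte is represented by a leading '1'.
    leading_zeros = len(data) - len(data.lstrip(b"\x00"))
    return "1" * leading_zeros + encoded


def ss58_encode(public_key: bytes, prefix: int = 42) -> str:
    """Encode a 32-byte public key as an SS58 address (single-byte prefixes only)."""
    if len(public_key) != 32:
        raise ValueError("expected a 32-byte public key")
    if not 0 <= prefix < 64:
        raise ValueError("this sketch only handles prefixes 0-63")
    payload = bytes([prefix]) + public_key
    # Checksum: first 2 bytes of BLAKE2b-512 over the constant "SS58PRE" + payload.
    checksum = hashlib.blake2b(b"SS58PRE" + payload, digest_size=64).digest()[:2]
    return base58_encode(payload + checksum)


# Dummy key for illustration only; it does not correspond to a real account.
dummy_public_key = bytes(range(32))
print(ss58_encode(dummy_public_key))
```

The checksum lets wallets reject most mistyped addresses before a transaction is ever signed, and the prefix byte is why addresses for prefix 42 begin with the character `5`.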

Digital Signatures

A digital signature is proof that a specific private key authorized a specific message, without revealing the private key itself. The process works in two steps: the signer uses their private key and the message data to produce a signature, and then anyone can use the signer's public key to verify that signature is valid.

Every transaction on a blockchain is digitally signed. When you submit an extrinsic (such as a TAO transfer or a staking operation), your wallet signs it with your private key. Nodes on the network verify the signature before including the transaction in a block. If the signature is invalid (meaning you don't control the account you're claiming to act from), the transaction is rejected.

This is the fundamental mechanism that makes a blockchain permissionless: anyone can participate, but nobody can impersonate anyone else, because forging a valid signature without the private key is computationally infeasible.
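
The sign-then-verify flow can be illustrated with a toy Schnorr-style signature written in pure Python. This is emphatically not Sr25519 and not secure: it works in the multiplicative group modulo the prime 2^255 - 19 with base 2, parameters chosen here only so the arithmetic fits in a few lines. All function names are hypothetical.

```python
import hashlib
import secrets
from typing import Tuple

P = 2**255 - 19  # a well-known prime, used here only as a toy modulus
G = 2            # toy base element (an assumption of this sketch)
N = P - 1        # exponents may be reduced mod P - 1 (Fermat's little theorem)


def _challenge(r: int, message: bytes) -> int:
    """Hash the commitment and the message into an integer challenge."""
    data = r.to_bytes(32, "big") + message
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % N


def keygen() -> Tuple[int, int]:
    """Return (private_key, public_key) with public_key = G^private_key mod P."""
    private_key = secrets.randbelow(N - 1) + 1
    return private_key, pow(G, private_key, P)


def sign(private_key: int, message: bytes) -> Tuple[int, int]:
    """Step 1: the signer combines the private key and the message."""
    k = secrets.randbelow(N - 1) + 1          # fresh secret nonce per signature
    r = pow(G, k, P)                          # commitment
    e = _challenge(r, message)
    s = (k + private_key * e) % N
    return r, s


def verify(public_key: int, message: bytes, signature: Tuple[int, int]) -> bool:
    """Step 2: anyone checks G^s == r * public_key^e using only public data."""
    r, s = signature
    e = _challenge(r, message)
    return pow(G, s, P) == (r * pow(public_key, e, P)) % P


private_key, public_key = keygen()
signature = sign(private_key, b"transfer 1 TAO")
print(verify(public_key, b"transfer 1 TAO", signature))
print(verify(public_key, b"transfer 100 TAO", signature))
```

Verification never touches the private key, and changing the message changes the challenge `e`, so the signature for one message does not validate for another. A node rejecting a transaction with an invalid signature is performing the same kind of check with a real scheme.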

P2P Networks and Gossip Protocols

Traditional web services follow a client-server model: your browser connects to a company's server, and the server controls access and data. Blockchains take a fundamentally different approach.

No Central Server

In a peer-to-peer (P2P) network, every participant (called a node) connects directly to other nodes. There is no central server that everyone talks to. Each node is both a client and a server: it requests data from peers and serves data to peers.

This design makes the network resilient. If any individual node goes offline, the rest of the network continues operating. There is no single point of failure and no single entity that can censor transactions or shut the network down.

Node Discovery and Connections

When a new node starts up, it needs to find other nodes to connect to. Most networks provide a list of bootstrap nodes, well-known nodes that are almost always online. The new node connects to a few bootstrap nodes, which introduce it to their own peers, and over time the new node builds up a diverse set of connections.

Substrate-based chains (including Subtensor) use litep2p, a Rust-native networking library that replaced the older libp2p stack. litep2p handles peer discovery, connection management, multiplexing multiple protocols over a single connection, and encryption of all traffic between nodes.

Gossip Protocol

Once nodes are connected, how does information spread? Through a gossip protocol. The idea is simple and mirrors how rumors spread in a social group: when a node learns something new (a new transaction, a new block), it tells a handful of its neighbors. Each of those neighbors tells a handful of their neighbors, and so on. Within seconds, the information reaches every node in the network.

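Gossip protocols are efficient because each node only needs to communicate with a small number of peers, yet the information reaches the entire network in a number of rounds that grows logarithmically with the network size. Even in the conservative case where the number of informed nodes merely doubles each round, a network of 10,000 nodes needs only about 14 "hops" (log2 of 10,000 is roughly 13.3) for a message to reach everyone; a larger fan-out needs fewer.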
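
The logarithmic growth can be checked with a small idealized model. It assumes every informed node reaches `fanout` previously uninformed nodes in each round, with no duplicate deliveries or message loss, so real networks will need somewhat more rounds than this lower bound suggests. The function name is hypothetical.

```python
def rounds_to_reach(network_size: int, fanout: int = 1) -> int:
    """Count gossip rounds until every node is informed, in an idealized model."""
    informed = 1  # the node that originated the message
    rounds = 0
    while informed < network_size:
        # Each informed node tells `fanout` new nodes, capped at the network size.
        informed = min(network_size, informed + informed * fanout)
        rounds += 1
    return rounds


print(rounds_to_reach(10_000, fanout=1))  # 14: informed nodes double each round
print(rounds_to_reach(10_000, fanout=3))  # 7: informed nodes quadruple each round
```

With a fan-out of 1 the informed set doubles each round, which gives the figure of roughly 14 hops for 10,000 nodes; telling three new peers per round halves that.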

How Blocks Propagate

When a validator produces a new block, it announces the block to its connected peers via gossip. Peers that receive the block validate it (check that all transactions are correctly signed, that state transitions follow the rules) and then relay it to their own peers. Invalid blocks are rejected and not propagated further, which prevents malicious data from spreading.

Transactions follow the same path. When you submit a transaction, your node gossips it into a shared waiting area called the transaction pool (or "mempool"). The next block producer picks transactions from this pool, orders them, and includes them in the next block.
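
A transaction pool can be pictured as a deduplicated set of pending transactions from which the block producer selects. The sketch below is a toy model: the `Transaction` fields and the policy of ordering by tip are assumptions for illustration and do not reflect the actual Substrate pool implementation.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Transaction:
    tx_hash: str   # unique identifier of the signed transaction
    sender: str
    nonce: int     # per-sender sequence number
    tip: int       # optional extra fee offered to the block producer


class TransactionPool:
    def __init__(self) -> None:
        self._pending: Dict[str, Transaction] = {}

    def submit(self, tx: Transaction) -> bool:
        """Add a gossiped transaction; return False if it was already known."""
        if tx.tx_hash in self._pending:
            return False
        self._pending[tx.tx_hash] = tx
        return True

    def build_block(self, max_transactions: int) -> List[Transaction]:
        """Pick transactions for the next block and remove them from the pool."""
        # Hypothetical ordering policy: highest tip first, then sender and nonce.
        ordered = sorted(
            self._pending.values(),
            key=lambda tx: (-tx.tip, tx.sender, tx.nonce),
        )
        selected = ordered[:max_transactions]
        for tx in selected:
            del self._pending[tx.tx_hash]
        return selected


pool = TransactionPool()
pool.submit(Transaction("0xaa", "alice", 0, tip=1))
pool.submit(Transaction("0xbb", "bob", 0, tip=5))
pool.submit(Transaction("0xaa", "alice", 0, tip=1))  # duplicate from another peer
print([tx.tx_hash for tx in pool.build_block(max_transactions=1)])
```

Deduplicating by hash matters because gossip delivers the same transaction from several peers; the pool keeps one copy until a block includes it.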

Consensus Mechanisms Compared

The core problem every blockchain must solve: how do thousands of independent nodes agree on the current state of the database, when some of those nodes might be faulty or actively malicious? This is the consensus problem, and different blockchains solve it in different ways.

Proof of Work (PoW)

Proof of Work, used by Bitcoin, is the original consensus mechanism. To produce a block, a node (called a miner) must solve a computational puzzle: find a number (a "nonce") that, when combined with the block data and hashed, produces a result below a certain threshold. This requires enormous amounts of trial-and-error computation.

Solving the puzzle is expensive (requiring real energy and hardware), but verifying a solution is cheap (a single hash check). That asymmetry means an attacker would need to outspend the entire rest of the network in hardware and electricity to produce fraudulent blocks. The economic cost makes attacks prohibitively expensive.
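
The asymmetry between solving and verifying can be seen in a simplified nonce search. This sketch hashes `block_data + nonce` once with SHA-256 and uses a small `difficulty_bits` value so it finishes quickly; Bitcoin's real header format, double hashing, and difficulty encoding are deliberately left out, and the function names are hypothetical.

```python
import hashlib
from typing import Tuple


def mine(block_data: bytes, difficulty_bits: int) -> Tuple[int, str]:
    """Search for a nonce whose hash is below the target (expensive)."""
    target = 1 << (256 - difficulty_bits)  # more difficulty bits = smaller target
    nonce = 0
    while True:
        digest = hashlib.sha256(block_data + nonce.to_bytes(8, "big")).digest()
        if int.from_bytes(digest, "big") < target:
            return nonce, digest.hex()
        nonce += 1


def verify_pow(block_data: bytes, nonce: int, difficulty_bits: int) -> bool:
    """Check a claimed solution with a single hash (cheap)."""
    target = 1 << (256 - difficulty_bits)
    digest = hashlib.sha256(block_data + nonce.to_bytes(8, "big")).digest()
    return int.from_bytes(digest, "big") < target


nonce, digest_hex = mine(b"example block", difficulty_bits=16)
print(nonce, digest_hex)
print(verify_pow(b"example block", nonce, difficulty_bits=16))
```

On average the search needs about 2^16 attempts at this setting, while verification is always one hash. Each extra difficulty bit doubles the expected work for the miner and costs the verifier nothing.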

The downside is energy consumption. Bitcoin's mining network consumes as much electricity as some countries. PoW also limits throughput: Bitcoin produces a block roughly every 10 minutes and processes about 7 transactions per second.

Proof of Stake (PoS)

Proof of Stake replaces computational puzzles with economic commitment. Instead of spending energy on mining, participants lock up (or "stake") their cryptocurrency as collateral. The protocol selects a validator to produce each block, with the probability of selection proportional to the amount staked.

If a validator misbehaves (for instance, by trying to produce conflicting blocks), the protocol can slash their stake, destroying a portion of their locked funds. This creates an economic incentive to behave honestly: you risk losing real money if you try to cheat.
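
Stake-weighted selection and slashing can be sketched as follows. The validator names, stake amounts, and slash fraction are made up, and real protocols derive randomness from on-chain sources rather than a local `random.Random`.

```python
import random
from collections import Counter
from typing import Dict


def select_validator(stakes: Dict[str, int], rng: random.Random) -> str:
    """Pick one validator with probability proportional to its stake."""
    validators = list(stakes)
    weights = [stakes[v] for v in validators]
    return rng.choices(validators, weights=weights, k=1)[0]


def slash(stakes: Dict[str, int], validator: str, fraction: float) -> int:
    """Destroy a fraction of a misbehaving validator's stake; return the penalty."""
    penalty = int(stakes[validator] * fraction)
    stakes[validator] -= penalty
    return penalty


# Hypothetical stake distribution.
stakes = {"alice": 600, "bob": 300, "carol": 100}
rng = random.Random(2024)  # seeded only so the demo is repeatable

picks = Counter(select_validator(stakes, rng) for _ in range(10_000))
print(picks)                       # counts are roughly in a 6:3:1 ratio
print(slash(stakes, "bob", 0.10))  # bob loses 10% of his stake
print(stakes)
```

After a slash the validator's weight in future selections shrinks as well, so misbehaviour costs both the destroyed funds and future block-production opportunities.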

PoS is dramatically more energy-efficient than PoW because it doesn't require mining hardware running at full capacity. It also enables faster block times and higher throughput. Ethereum transitioned from PoW to PoS in 2022, reducing its energy consumption by over 99%.

Nominated Proof of Stake (NPoS)

Nominated Proof of Stake is a variant used by Polkadot and other Substrate-based chains. It adds a delegation layer: token holders who don't want to run a validator node themselves can nominate (delegate stake to) validators they trust. This increases participation in the security of the network, because even small holders can contribute their stake to the consensus process.

The protocol runs an election algorithm to select the active validator set, optimizing for even distribution of stake across validators. This prevents any single validator from accumulating too much power. If a nominated validator misbehaves, both the validator and its nominators can lose stake, which gives nominators a strong incentive to choose reliable validators.

Byzantine Fault Tolerance

The term Byzantine Fault Tolerance (BFT) comes from a classic computer science problem called the Byzantine Generals' Problem: how can a group of generals coordinate an attack on a city when some of them might be traitors sending false messages?

In blockchain terms, a BFT system can continue operating correctly as long as fewer than one-third of the participants are faulty or malicious. The honest supermajority (more than two-thirds) can still reach agreement even when bad actors send conflicting messages or refuse to participate.
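
The one-third bound is usually written as n >= 3f + 1, where n is the number of participants and f the number of faulty ones. A small sketch of the resulting thresholds (function names are hypothetical):

```python
def max_faulty(total_nodes: int) -> int:
    """Largest f that satisfies total_nodes >= 3f + 1."""
    return (total_nodes - 1) // 3


def quorum_size(total_nodes: int) -> int:
    """Smallest number of votes strictly greater than two-thirds of the nodes."""
    return (2 * total_nodes) // 3 + 1


for n in (4, 10, 100):
    print(n, max_faulty(n), quorum_size(n))
```

For 100 participants this gives at most 33 faulty nodes and a quorum of 67. Any two quorums of 67 overlap in at least 34 nodes, so they always share at least one honest participant, which is what prevents two conflicting decisions from both being accepted.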

Most modern blockchain consensus mechanisms are designed to tolerate Byzantine faults. The difference is in how they achieve it: PoW makes attacks prohibitively expensive, PoS makes them economically punishable, and traditional BFT algorithms use multiple rounds of voting among a known set of participants.

Finality

Finality is the guarantee that a transaction, once included in the blockchain, cannot be reversed or altered. It sounds straightforward, but different blockchains offer different kinds of finality, and understanding the distinction matters when building applications.

Probabilistic Finality

Bitcoin provides probabilistic finality. After a transaction is included in a block, additional blocks continue to be built on top of it. Each new block makes reverting the transaction exponentially harder, because an attacker would need to redo the proof of work for the target block and every subsequent block faster than the rest of the network produces new ones.

This is why Bitcoin exchanges typically wait for 6 confirmations (about 60 minutes) before considering a deposit final. At that depth, the probability of a successful reversal is negligible for all practical purposes, but it's never mathematically zero. The finality is "probabilistic" because it grows asymptotically toward certainty without ever reaching it.
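
The shrinking reversal probability can be computed with the catch-up calculation from the Bitcoin whitepaper, where `q` is the attacker's share of hash power and `z` the number of confirmations. The function name is hypothetical, and the model assumes `q` is below 0.5.

```python
from math import exp, factorial


def attacker_success_probability(q: float, z: int) -> float:
    """Probability that an attacker with hash-power share q overtakes z confirmations."""
    p = 1.0 - q                 # honest share of hash power
    lam = z * (q / p)           # expected attacker progress while z blocks are mined
    total = 1.0
    for k in range(z + 1):
        poisson = exp(-lam) * lam**k / factorial(k)
        total -= poisson * (1 - (q / p) ** (z - k))
    return total


for confirmations in range(0, 7):
    print(confirmations, attacker_success_probability(0.10, confirmations))
```

With a 10% attacker the probability is 1.0 at zero confirmations, about 0.20 after one, and well under 0.1% after six. It keeps falling with each block but never reaches exactly zero.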

Deterministic Finality

Substrate-based chains offer deterministic finality through a protocol called GRANDPA (GHOST-based Recursive Ancestor Deriving Prefix Agreement). GRANDPA runs alongside the block production mechanism and has validators explicitly vote on which blocks they consider final.

Once more than two-thirds of validators vote that a block is final, it is final: permanently and irreversibly. No amount of future computation can undo it. This is a stronger guarantee than probabilistic finality and it happens much faster, typically within a few seconds after block production.

GRANDPA is also efficient because it can finalize entire chains of blocks at once. If validators are slightly behind, they don't need to vote on each block individually; they can vote on the latest block they've verified, and all ancestor blocks are finalized simultaneously.
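
The idea that a vote for a block also counts for all of its ancestors can be sketched with a toy vote counter. This is a simplification and not the GRANDPA protocol itself, which runs multiple voting phases per round; the block names, the `parent_of` map, and the function names are hypothetical.

```python
from typing import Dict, List, Optional


def ancestors(block: Optional[str], parent_of: Dict[str, Optional[str]]) -> List[str]:
    """Return the block followed by all of its ancestors back to genesis."""
    chain = []
    while block is not None:
        chain.append(block)
        block = parent_of.get(block)
    return chain


def highest_finalized(
    parent_of: Dict[str, Optional[str]],
    votes: Dict[str, str],
    total_validators: int,
) -> Optional[str]:
    """Deepest block supported by more than two-thirds of validators."""
    support: Dict[str, int] = {}
    for voted_block in votes.values():
        # A vote for a block implicitly supports every ancestor of that block.
        for block in ancestors(voted_block, parent_of):
            support[block] = support.get(block, 0) + 1
    finalized = [b for b, count in support.items() if 3 * count > 2 * total_validators]
    return max(finalized, key=lambda b: len(ancestors(b, parent_of)), default=None)


# Chain A <- B <- C <- D, with four validators at different heights.
parent_of = {"A": None, "B": "A", "C": "B", "D": "C"}
votes = {"v1": "D", "v2": "D", "v3": "C", "v4": "B"}
print(highest_finalized(parent_of, votes, total_validators=4))  # C
```

Only two of four validators have seen D, but three support C (directly or through D), so C and its ancestors A and B are finalized together in one step.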

Why Finality Matters for Applications

For application developers, finality determines when you can safely act on a transaction. With probabilistic finality, you must decide how many confirmations to wait for, balancing speed against risk. With deterministic finality, the answer is clear: once the block is finalized, it's done.

This is especially relevant for Subtensor, where actions like staking operations and emission distributions have real economic consequences. Deterministic finality means that once a staking transaction is finalized, there is no scenario in which it gets reversed. Applications can update their state as soon as they observe that the block is finalized, with full confidence.

Smart Contracts vs Runtime Pallets

There are two main approaches to adding custom logic to a blockchain. Understanding the difference explains a key architectural decision in Subtensor.

Smart Contracts

A smart contract is a program deployed onto an existing blockchain, much like uploading a script to a shared computer. Ethereum is the most well-known platform for smart contracts, where they are written in Solidity and run on the Ethereum Virtual Machine (EVM).

Smart contracts run in a sandboxed environment: they can only interact with the blockchain through a defined set of operations, and they cannot directly access the host machine's resources. They are subject to gas metering: every operation has a computational cost, and the caller pays for the gas consumed. This prevents infinite loops and incentivizes efficient code.
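
Gas metering can be illustrated with a tiny stack-machine interpreter that charges for every instruction and aborts when the limit is exceeded. The instruction set and the costs in `GAS_COST` are invented for this sketch and are not the EVM's actual opcodes or prices.

```python
from typing import List, Optional, Tuple


class OutOfGas(Exception):
    """Raised when a program consumes more gas than the caller paid for."""


# Hypothetical per-instruction costs.
GAS_COST = {"push": 1, "add": 3, "jump": 8}

Instruction = Tuple[str, Optional[int]]


def execute(program: List[Instruction], gas_limit: int) -> Tuple[List[int], int]:
    """Run a program, returning (final stack, gas used) or raising OutOfGas."""
    stack: List[int] = []
    pc = 0          # program counter
    gas_used = 0
    while pc < len(program):
        op, arg = program[pc]
        gas_used += GAS_COST[op]
        if gas_used > gas_limit:
            raise OutOfGas(f"needed more than {gas_limit} gas")
        if op == "push":
            stack.append(arg)
            pc += 1
        elif op == "add":
            right, left = stack.pop(), stack.pop()
            stack.append(left + right)
            pc += 1
        elif op == "jump":
            pc = arg  # unconditional jump to instruction index `arg`
    return stack, gas_used


print(execute([("push", 2), ("push", 3), ("add", None)], gas_limit=100))

try:
    execute([("jump", 0)], gas_limit=100)  # an infinite loop
except OutOfGas as error:
    print("aborted:", error)
```

The infinite loop cannot run forever because every jump costs gas; once the budget is exhausted the interpreter stops, which is the mechanism that protects nodes from runaway contracts.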

The advantages of smart contracts are accessibility and composability. Anyone can deploy one without permission, and contracts can call other contracts, building complex systems from simple building blocks. The trade-offs are performance overhead (due to sandboxing and gas metering) and limited access to the underlying chain's internals.

Runtime Pallets

A runtime pallet is a module compiled directly into the blockchain's runtime, the core logic that every node executes. Rather than being a guest program running inside a sandbox, a pallet is part of the blockchain itself.

Pallets have direct access to the chain's storage, can define custom data structures, and execute without the overhead of gas metering or sandbox interpretation. They can implement complex algorithms (like Yuma Consensus) that would be prohibitively expensive as smart contracts due to gas costs.

The trade-off is that pallets require a runtime upgrade to deploy or modify. Unlike smart contracts, which anyone can deploy at any time, adding or changing a pallet requires governance approval and a coordinated upgrade across all nodes. This makes pallets better suited for core protocol logic that changes infrequently, rather than application-level code that evolves rapidly.

Why Subtensor Uses Pallets

Subtensor's core responsibilities (staking, registration, weight-setting, epoch processing, and emission distribution) are computationally intensive and deeply integrated with the chain's state. The Yuma Consensus algorithm, for example, involves matrix operations across all neurons in a subnet. Running this as a smart contract would be impractical: the gas costs would be enormous, and the sandbox overhead would slow execution unacceptably.

By implementing these as pallets within the Substrate runtime, Subtensor achieves native performance, direct storage access, and the ability to perform complex computations on every block (during epoch processing) without gas limitations. The SubtensorModule pallet alone contains hundreds of storage items, calls, and events that work together as a cohesive system.

Substrate Framework Overview

Substrate is an open-source blockchain development framework created by Parity Technologies (the team behind Polkadot). Rather than building a blockchain from scratch (implementing networking, consensus, database storage, transaction processing, and more), Substrate provides all of these components as reusable, well-tested building blocks.

Modular Architecture: FRAME and Pallets

Substrate organizes blockchain logic using FRAME (Framework for Runtime Aggregation of Modularized Entities). FRAME provides a set of conventions and macros for building pallets, the modular components described above.

A Substrate runtime is composed by selecting and configuring pallets. Some pallets are provided by Substrate itself (System, Balances, Timestamp, Session, Grandpa, and many others), while custom pallets implement chain-specific logic. Subtensor's runtime includes approximately 28 pallets: standard Substrate pallets for basics like account balances and block production, plus custom pallets like SubtensorModule for the Bittensor-specific staking, consensus, and emission logic.

Each pallet declares its own storage items (on-chain state), callable extrinsics (transactions users can submit), events (notifications emitted during execution), errors (failure conditions), and constants (fixed configuration values). This structure is what makes the runtime self-describing through its metadata; tools and applications can discover the full interface programmatically.

Runtime Upgradability: Forkless Upgrades

One of Substrate's most distinctive features is forkless runtime upgrades. The runtime is compiled to WebAssembly (WASM) and stored on-chain as part of the blockchain's state. When the runtime needs to be updated, the new WASM blob is submitted as a transaction and stored on-chain. All nodes automatically switch to executing the new runtime at a specified block.
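
The key idea, that the logic nodes execute is itself a value in chain state, can be modelled with a toy example. Substrate stores the runtime under the well-known storage key `:code`; everything else here (the `RUNTIMES` table standing in for compiled WASM blobs, the fee rule in version 2, and the function names) is a hypothetical simplification.

```python
from typing import Callable, Dict

# Stand-ins for compiled WASM blobs: each "blob" maps to the logic it contains.
RUNTIMES: Dict[bytes, Callable[[int, int], int]] = {
    b"runtime-v1": lambda balance, amount: balance + amount,
    b"runtime-v2": lambda balance, amount: balance + amount - 1,  # v2 adds a 1-unit fee
}


def apply_deposit(state: dict, amount: int) -> None:
    """Execute a deposit using whatever runtime is currently stored in state."""
    runtime = RUNTIMES[state[":code"]]
    state["balance"] = runtime(state["balance"], amount)


def set_code(state: dict, new_code: bytes) -> None:
    """The upgrade transaction: replace the runtime stored on-chain."""
    if new_code not in RUNTIMES:
        raise ValueError("unknown runtime blob")
    state[":code"] = new_code


state = {":code": b"runtime-v1", "balance": 0}
apply_deposit(state, 10)          # executed by v1: balance becomes 10
set_code(state, b"runtime-v2")    # the upgrade is just another state change
apply_deposit(state, 10)          # executed by v2: balance becomes 19
print(state["balance"])
```

Because every node derives the same state from the same blocks, every node sees the new value under `:code` at the same block and switches logic together, with no separate software rollout.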

This is a major advantage over traditional blockchains, where protocol upgrades require a "hard fork": all node operators must independently download and install new software, and if some don't, the network splits. With Substrate, the upgrade is coordinated through the chain itself. Once the upgrade transaction is finalized, every node runs the new logic. No coordination outside the chain is needed.

Subtensor has undergone 133 runtime version changes (from version 101 to version 411 at the time of writing), each adding new features, modifying existing logic, or fixing issues, all without a single hard fork. The version history of every pallet, event, call, and storage item is tracked in this site's reference section.

Why Bittensor Chose Substrate

Bittensor needed a blockchain that could handle complex on-chain computation (the Yuma Consensus algorithm involves matrix math across hundreds of neurons), support frequent protocol upgrades as the network evolves, and provide a clean separation between generic blockchain infrastructure and application-specific logic.

Substrate met all three requirements. The pallet system allowed the team to implement Bittensor's unique staking, registration, and consensus logic as native code without reinventing networking, storage, or block production. Forkless upgrades meant the protocol could evolve rapidly, critical for a young network where the economic model is still being refined. And Substrate's Rust foundation provided the performance needed for computation-heavy epoch processing.

For developers exploring Subtensor's internals, understanding Substrate is essential context. The metadata structure, the pallet-based organization, and the extrinsic/event/storage model all come from Substrate. This site's reference section is organized around this structure, making it straightforward to find any pallet, call, event, or storage item in the runtime.
