7

BTCD源码分析之database存储

 2 years ago
source link: https://imnisen.github.io/btcd-code-database.html
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.
neoserver,ios ssh client

1 database

database模块提供了bitcoin存储相关功能,通过多层缓存,逻辑上分成metadata和flat block files两部分存储。

1.1 database/*

1.1.1 interface.go

interface.go 定义了database包对外提供的类型和方法 主要包括4块: DB, TX, Bucket, Cursor 四个部分。

这4部分的关系是:

  1. DB提供了整体数据库的功能,存储的数据大体上分成元数据和区块数据。
  2. Tx是操作DB获取数据和状态变化的途径(可以提供原子性等),
  3. Bucket是DB存储的元数据的容易,以一些列键值对方式存储。
  4. Cursor是遍历Bucket里键值对的手段。
  1. DB
    // DB通过接口和具体实现分隔开
    
    type DB interface {
        // 返回databse driver类型
        Type() string
    
        // 开始一个数据库事务
        Begin(writable bool) (Tx, error)
    
        // 在一个只读的事务里执行传入的方法
        View(fn func(tx Tx) error) error
    
        // 在一个读写的事务里执行传入的方法
        Update(fn func(tx Tx) error) error
    
        // 关闭数据库
        Close() error
    }
    
    
  2. Tx
    // 数据库事务接口
    // database在存储数据时,分为metadata数据存储和通常的block文件存储
    
    type Tx interface {
        // 返回元数据
        Metadata() Bucket
    
        // 存储区块
        StoreBlock(block *btcutil.Block) error
    
        // 检查区块是否存在
        HasBlock(hash *chainhash.Hash) (bool, error)
        HasBlocks(hashes []chainhash.Hash) ([]bool, error)
    
        // 获取区块头数据
        FetchBlockHeader(hash *chainhash.Hash) ([]byte, error)
        FetchBlockHeaders(hashes []chainhash.Hash) ([][]byte, error)
    
        // 获取区块数据
        FetchBlock(hash *chainhash.Hash) ([]byte, error)
        FetchBlocks(hashes []chainhash.Hash) ([][]byte, error)
    
        // 获取某部分区块数据
        FetchBlockRegion(region *BlockRegion) ([]byte, error)
        FetchBlockRegions(regions []BlockRegion) ([][]byte, error)
    
        // 提交(元数据和区块存储)事务
        Commit() error
    
        // 回滚(元数据和区块存储)事务
        Rollback() error
    }
    
    
    
  3. Bucket
    // Bucket 表示一系列的键值对存储
    
    type Bucket interface {
        // 返回一个子bucket
        Bucket(key []byte) Bucket
    
        // 创建子bucket
        CreateBucket(key []byte) (Bucket, error)
        CreateBucketIfNotExists(key []byte) (Bucket, error)
    
        // 删除bucket
        DeleteBucket(key []byte) error
    
        // 遍历键值对
        ForEach(func(k, v []byte) error) error
    
        // 遍历子bucket
        ForEachBucket(func(k []byte) error) error
    
        // 返回键值对之间遍历的游标
        Cursor() Cursor
    
        // bucket是否可写
        Writable() bool
    
        // 保存、获取、删除键值对
        Put(key, value []byte) error
        Get(key []byte) []byte
        Delete(key []byte) error
    }
    
    
  4. Cursor
    type Cursor interface {
        // 返回关联的bucket
        Bucket() Bucket
    
        // 如其名
        Delete() error
        First() bool
        Last() bool
        Next() bool
        Prev() bool
        Seek(seek []byte) bool
        Key() []byte
        Value() []byte
    }
    
    
    

1.1.2 driver.go

driver.go定义了用来创建DB的drivers的结构

type Driver struct {
    // 返回driver类型
    DbType string
    // 创建一个DB
    Create func(args ...interface{}) (DB, error)
    // 打开DB
    Open func(args ...interface{}) (DB, error)
    // 日志相关
    UseLogger func(logger btclog.Logger)
}

1.2 database/ffldb

ffldb包是对上面定义的接口的一种实现。

1.2.1 db.go

对 database/interface.go 里 DB,Tx,Bucket,Cursor 接口的实现.

  1. db

    db结构主要是分成两个部分: store 和 cache store部分用文件来保存区块信息, cache部分是用leveldb来保存一些区块和整体的一些元数据。

    type db struct {
        writeLock sync.Mutex   // Limit to one write transaction at a time.
        closeLock sync.RWMutex // Make database close block while txns active.
        closed    bool         // Is the database closed?
        store     *blockStore  // Handles read/writing blocks to flat files.
        cache     *dbCache     // Cache layer which wraps underlying leveldb DB.
    }
    
    
    
  2. transaction

    transation表示了每个db操作的事务的概念,采用 pendingBlocks, pendingBlockData 来缓存事务里的block数据,用 pendingKeys, pendingRemove 来缓存事务里的metadata数据

    // transaction represents a database transaction.  It can either be read-only or
    // read-write and implements the database.Bucket interface.  The transaction
    // provides a root bucket against which all read and writes occur.
    type transaction struct {
            managed        bool             // Is the transaction managed?
            closed         bool             // Is the transaction closed?
            writable       bool             // Is the transaction writable?
            db             *db              // DB instance the tx was created from.
            snapshot       *dbCacheSnapshot // Underlying snapshot for txns.
            metaBucket     *bucket          // The root metadata bucket.
            blockIdxBucket *bucket          // The block index bucket.
    
            // Blocks that need to be stored on commit.  The pendingBlocks map is
            // kept to allow quick lookups of pending data by block hash.
            pendingBlocks    Map[chainhash.Hash]int
            pendingBlockData []pendingBlock
    
            // Keys that need to be stored or deleted on commit.
            pendingKeys   *treap.Mutable
            pendingRemove *treap.Mutable
    
            // Active iterators that need to be notified when the pending keys have
            // been updated so the cursors can properly handle updates to the
            // transaction state.
            activeIterLock sync.RWMutex
            activeIters    []*treap.Iterator
    }
    
    
  3. bucket

    bucket通过操作关联的transaction上的 pending_keys, pending_remove 来实现其 Put,Get,Delete,ForEach 等接口方法。

    type bucket struct {
        tx *transaction
        id [4]byte
    }
    

    bucket通过 Cursor() 方法来生成Cursor对象

    func (b *bucket) Cursor() database.Cursor {
            // Ensure transaction state is valid.
            if err := b.tx.checkClosed(); err != nil {
                    return &cursor{bucket: b}
            }
    
            // Create the cursor and setup a runtime finalizer to ensure the
            // iterators are released when the cursor is garbage collected.
            c := newCursor(b, b.id[:], ctFull)
            runtime.SetFinalizer(c, cursorFinalizer)
            return c
    }
    
    // newCursor 方法本质是创建键值对的iterators
    
    
  4. cursor

    cursor是上面bucket创建的各种iterators的组合,然后再其基础上实现 Cursor 接口所需的 First,Last,Next 等遍历方法。

    // cursor is an internal type used to represent a cursor over key/value pairs
    // and nested buckets of a bucket and implements the database.Cursor interface.
    type cursor struct {
        bucket      *bucket
        dbIter      iterator.Iterator
        pendingIter iterator.Iterator
        currentIter iterator.Iterator
    }
    
    

1.2.2 blockio.go

DB采用文件存储block相关数据的实现。

// blockStore houses information used to handle reading and writing blocks (and
// part of blocks) into flat files with support for multiple concurrent readers.
type blockStore struct {
        // network is the specific network to use in the flat files for each
        // block.
        network wire.BitcoinNet

        // 文件存放目录
        // basePath is the base path used for the flat block files and metadata.
        basePath string

        // 每个文件最大存储blocks数量
        // maxBlockFileSize is the maximum size for each file used to store
        // blocks.  It is defined on the store so the whitebox tests can
        // override the value.
        maxBlockFileSize uint32

        // 一些缓存记录和并发控制
        obfMutex         sync.RWMutex
        lruMutex         sync.Mutex
        openBlocksLRU    *list.List // Contains uint32 block file numbers.
        fileNumToLRUElem map[uint32]*list.Element
        openBlockFiles   map[uint32]*lockableFile

        // 读写文件的油表
        // writeCursor houses the state for the current file and location that
        // new blocks are written to.
        writeCursor *writeCursor

        // 操作底层存储文件的方法
        // These functions are set to openFile, openWriteFile, and deleteFile by
        // default, but are exposed here to allow the whitebox tests to replace
        // them when working with mock files.
        openFileFunc      func(fileNum uint32) (*lockableFile, error)
        openWriteFileFunc func(fileNum uint32) (filer, error)
        deleteFileFunc    func(fileNum uint32) error
}



1.2.3 dbcache.go

dbcache.go 里主要的结构有 dbCachedbCacheSnapshot

dbCache是db关联的存储metadata的结构

type dbCache struct {
    // 底层存储的leveldb
    // ldb is the underlying leveldb DB for metadata.
    ldb *leveldb.DB

    // store is used to sync blocks to flat files.
    store *blockStore

    // 缓存大小和时间相关
    maxSize       uint64
    flushInterval time.Duration
    lastFlush     time.Time

    // 缓存
    cacheLock    sync.RWMutex
    cachedKeys   *treap.Immutable
    cachedRemove *treap.Immutable
}


dbCacheSnapshot是每个Transaction创建时生成的对于dbCache的一个快照:

// dbCacheSnapshot defines a snapshot of the database cache and underlying
// database at a particular point in time.
type dbCacheSnapshot struct {
    dbSnapshot    *leveldb.Snapshot
    pendingKeys   *treap.Immutable
    pendingRemove *treap.Immutable
}

1.2.4 其它

  • driver.go: 对database/driver.go的里的Driver类型的实现。
  • ldbtreapiter.go: 相关iterators的包装层
  • reconcile.go: 保证metadata和flat block files一致性的一些操作

1.3 internal/treap

主要实现了可变的和不可变的treap数据类型,用来存储metadata。

Treap(Tree + Heap)是一种平衡二叉树, 不过Treap会记录一个优先级(一般来说是随机生成),即Treap在以关键码构成二叉搜索树的同时,还会按照优先级的高低满足堆的性质。

对于每个结点,该结点的优先级不大于其所有孩子的优先级。Treap引入优先级的原因就是防止BST(二叉搜索树)退化成一条链,从而影响查询效率。

所以对于结点上的关键字来说,它是一颗BST,而对于结点上的优先级来讲,它是一个小顶堆。其平均查找长度为 O(logn) 。

Treap有插入、删除、旋转和查询等基本操作,进而可以实现查询第 k 大和查询关键字 x 排名等功能。

1.3.1 common.go

包含了实现treap的基本数据结构: treapNodeparentStack :

// treapNode represents a node in the treap.
type treapNode struct {
    key      []byte
    value    []byte
    priority int
    left     *treapNode
    right    *treapNode
}

type parentStack struct {
    index    int
    items    [staticDepth]*treapNode
    overflow []*treapNode
}

1.3.2 mutable.go, imutable.go

分别实现了可变的treap和不可变的treap。 不可变的treap每次插入或者删除均返回新的treap。 两者均实现了: Len, Size, Has, Get, Put, Delete, ForEach 等方法用来实现对treap的基本操作

1.3.3 treapiter.go

实现了对treap的遍历方法,包括: First,Last, Next, Prev,Seek,Key, Value, Valid, ForceReseek

1.4 cmd/dbtool

提供了命令行操作DB的一些方法,暂时略去


About Joyk


Aggregate valuable and interesting links.
Joyk means Joy of geeK