This article was first published on 泊浮目's Jianshu page: http://www.itdecent.cn/u/204b8aaab8ba

Version	Date	Notes
1.0	2020.3.12	First published
1.0.1	2020.3.16	Fixed some capitalization issues and wording
1.0.2	2020.3.21	Reworded passages that could be misunderstood
1.0.3	2020.3.29	Changed the title
1.0.4	2020.4.18	Improved the summary section
1.0.5	2020.6.26	Updated some explanations and improved the comment style
1.0.6	2020.7.6	Added some more detailed explanations
1.1	2021.6.22	Title changed from `Zookeeper in Depth (2): Storage Technology` to `Zookeeper Source Code in Depth (2): Storage Technology`

Preface

In the previous article we briefly covered several core aspects of Zookeeper. In this one we explore its storage technology. Before starting, readers may want to think about the following questions:

How is Zookeeper's data storage implemented?
What happens when Zookeeper performs a write?
When a new Zookeeper server joins an existing cluster, how does it sync the cluster's current data?

With these questions in mind, let's move on.

Zookeeper's Local Storage Model

As is well known, Zookeeper is not good at reading and writing large amounts of data, because:

It is essentially a dictionary held in memory.
Writes to persistent nodes go through the WAL and force a flush to disk, and oversized data causes extra seeks.
Likewise, when zk starts, all data is read back from the WAL. If there is too much of it, startup takes a long time.

The in-memory data is managed by ZkDatabase (Zk's in-memory database), which is responsible for Zk's sessions, DataTree storage, and transaction logs. It periodically dumps snapshots to disk, and when Zk starts it restores the in-memory data from the transaction logs and snapshots.

Since Zk keeps its data in memory, how does it solve persistence? The previous paragraph already gave the answer: through the transaction log, a WAL. Before each write request is applied, the request is first recorded in the log under the current zxid.

Next, let's look at how the WAL is optimized.

WAL Optimizations

WAL Optimization 1: Group Commit

In a typical WAL, every write calls the expensive sync API right after its END record, which hurts system performance. To address this, we can commit several writes at once and call the sync API only after the END record of the last write, like this:

without group commit: BEGIN Data1 END Sync BEGIN Data2 END Sync BEGIN Data3 END Sync
with group commit: BEGIN Data1 END BEGIN Data2 END BEGIN Data3 END Sync

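Everything has a price: this can cause data-consistency problems. Entries that were written but not yet synced can be lost in a crash, so they should not be treated as durable until the sync returns.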
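The difference between the two schedules above can be sketched in a few lines. This is only an illustration under our own assumptions: `GroupCommitSketch`, `writeBatch`, and `tempLog` are hypothetical names, and the record format is the BEGIN/END text from the example rather than ZooKeeper's real log format. The method returns how many times `FileChannel.force` was called.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.List;

/** Hypothetical sketch: per-record sync versus one sync per batch. */
class GroupCommitSketch {

    /** Creates an empty temporary file that stands in for the log. */
    static Path tempLog() {
        try {
            Path p = Files.createTempFile("wal-sketch", ".log");
            p.toFile().deleteOnExit();
            return p;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Appends all records and returns the number of sync calls issued. */
    static int writeBatch(Path logFile, List<String> records, boolean groupCommit) {
        int syncs = 0;
        try (FileChannel ch = FileChannel.open(logFile,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE, StandardOpenOption.APPEND)) {
            for (String r : records) {
                byte[] line = ("BEGIN " + r + " END\n").getBytes(StandardCharsets.UTF_8);
                ByteBuffer buf = ByteBuffer.wrap(line);
                while (buf.hasRemaining()) {
                    ch.write(buf);
                }
                if (!groupCommit) {
                    ch.force(false); // without group commit: sync after every END
                    syncs++;
                }
            }
            if (groupCommit && !records.isEmpty()) {
                ch.force(false);     // with group commit: one sync after the last END
                syncs++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return syncs;
    }
}
```

With three records, the first mode issues three syncs and the second issues one, which is where the saving comes from.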

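WAL Optimization 2: File Padding

When appending to the WAL, if the file's current blocks cannot hold the new entry, new blocks must be allocated to the file, which means updating the file's inode (for example, its size). On an HDD, this requires a seek to the inode's location and then a seek back to the newly added block to append the entry. All of this happens while writing the transaction log, and it noticeably slows the system down.

To reduce these seeks, we can preallocate blocks for the WAL. For example, when ZooKeeper detects that fewer than 4KB remain in the current transaction log file, it grows the file by 64MB, filling the new space with zeros (the zeros are only padding). A newly created log file is preallocated to 64MB in the same way.

This is another reason Zookeeper is not good at reading and writing large data: it triggers a lot of block allocation.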
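The preallocation rule can be written down as a small piece of arithmetic. The sketch below is a simplified illustration, not ZooKeeper's actual padding code: `FilePaddingSketch` and `paddedSize` are hypothetical names, and the 4KB and 64MB constants are simply the values quoted above.

```java
/** Hypothetical sketch of the "grow by 64MB when fewer than 4KB remain" rule. */
class FilePaddingSketch {
    static final long PRE_ALLOC_SIZE = 64L * 1024 * 1024; // 64MB per extension
    static final long LOW_WATER_MARK = 4L * 1024;         // extend when fewer than 4KB remain

    /**
     * Returns the size the file should have after the padding check.
     *
     * @param position    current write position in the file
     * @param currentSize size the file has been padded to so far
     */
    static long paddedSize(long position, long currentSize) {
        if (position + LOW_WATER_MARK >= currentSize) {
            return currentSize + PRE_ALLOC_SIZE; // not enough room left: extend
        }
        return currentSize;                      // plenty of room: leave as is
    }
}
```

For a fresh file holding only a 16-byte header, `paddedSize(16, 16)` gives 67108880 bytes, that is, the header plus 64MB of zeros.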

WAL Optimization 3: Snapshot

If we use an in-memory data structure plus a WAL as the storage scheme, the WAL grows without bound. When the storage system starts, it then has to read a large amount of WAL data to rebuild the in-memory data. Snapshots solve this problem.

Besides shortening startup, snapshots also reduce storage usage. Several WAL entries may modify the same piece of data, and a snapshot keeps only the latest change (a merge).

Zk does use this approach. It brings one more benefit: when a node joins, the latest snapshot can be sent to it to sync data.
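To make the merge and the faster restart concrete, here is a minimal sketch. Everything in it is an assumption made for illustration: `SnapshotSketch`, `Change`, `snapshot`, and `restore` are hypothetical names, and a plain map of path to string stands in for the real data tree.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: a snapshot folds many log entries into one value per path. */
class SnapshotSketch {

    /** One log entry: set `path` to `data` in transaction `zxid`. */
    static final class Change {
        final long zxid;
        final String path;
        final String data;

        Change(long zxid, String path, String data) {
            this.zxid = zxid;
            this.path = path;
            this.data = data;
        }
    }

    /** Folds every entry with zxid <= upTo into a single value per path (the merge). */
    static Map<String, String> snapshot(List<Change> log, long upTo) {
        Map<String, String> state = new LinkedHashMap<>();
        for (Change c : log) {
            if (c.zxid <= upTo) {
                state.put(c.path, c.data); // later entries overwrite earlier ones
            }
        }
        return state;
    }

    /** Restores from a snapshot and replays only the entries newer than it. */
    static Map<String, String> restore(Map<String, String> snap, long snapZxid, List<Change> log) {
        Map<String, String> state = new LinkedHashMap<>(snap);
        for (Change c : log) {
            if (c.zxid > snapZxid) {
                state.put(c.path, c.data);
            }
        }
        return state;
    }
}
```

If three of four log entries touch `/a`, a snapshot taken at zxid 3 holds a single value for `/a`, and a restart only needs to replay the fourth entry.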

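Source Code Analysis

Everything in this section is based on version 3.5.7.

Core Interfaces and Classes

TxnLog: an interface that provides the API for reading and writing transaction logs.
FileTxnLog: the file-based implementation of TxnLog.
SnapShot: the snapshot interface, providing the API for serializing, deserializing, and accessing snapshots.
FileSnap: the file-based implementation of SnapShot.
FileTxnSnapLog: a wrapper around TxnLog and SnapShot.
DataTree: Zookeeper's in-memory data structure, a tree made up of ZNodes.
DataNode: represents a single ZNode.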

TxnLog

TxnLog is the transaction log mentioned earlier. Let's look at its source.

Start with the class comment:

package org.apache.zookeeper.server.persistence;

import ...

/**
 * This class implements the TxnLog interface. It provides api's
 * to access the txnlogs and add entries to it.
 * <p>
 * The format of a Transactional log is as follows:
 * <blockquote><pre>
 * LogFile:
 *     FileHeader TxnList ZeroPad
 *
 * FileHeader: {
 *     magic 4bytes (ZKLG)
 *     version 4bytes
 *     dbid 8bytes
 *   }
 *
 * TxnList:
 *     Txn || Txn TxnList
 *
 * Txn:
 *     checksum Txnlen TxnHeader Record 0x42
 *
 * checksum: 8bytes Adler32 is currently used
 *   calculated across payload -- Txnlen, TxnHeader, Record and 0x42
 *
 * Txnlen:
 *     len 4bytes
 *
 * TxnHeader: {
 *     sessionid 8bytes
 *     cxid 4bytes
 *     zxid 8bytes
 *     time 8bytes
 *     type 4bytes
 *   }
 *
 * Record:
 *     See Jute definition file for details on the various record types
 *
 * ZeroPad:
 *     0 padded to EOF (filled during preallocation stage)
 * </pre></blockquote>
 */
public class FileTxnLog implements TxnLog, Closeable {

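From the comment we can see that a log file (LogFile) consists of three parts:

FileHeader
TxnList
ZeroPad

FileHeader can be thought of as an identifier for the file, TxnList is the main content, and ZeroPad is the zero-filled space left by preallocation, which runs to the end of the file.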
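The 16-byte FileHeader layout from the comment (magic, version, dbid) can be reproduced with the JDK's own data streams. This is a sketch under our own assumptions: `FileHeaderSketch`, `encode`, and `looksLikeTxnLog` are hypothetical helpers, not ZooKeeper APIs, and we assume big-endian encoding as written by `DataOutputStream`.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Hypothetical sketch of the FileHeader layout: magic(4) version(4) dbid(8). */
class FileHeaderSketch {
    // The ASCII bytes "ZKLG" read as one big-endian int.
    static final int MAGIC =
            ByteBuffer.wrap("ZKLG".getBytes(StandardCharsets.US_ASCII)).getInt();

    /** Encodes a header into exactly 16 bytes. */
    static byte[] encode(int version, long dbId) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(MAGIC);   // 4 bytes
            out.writeInt(version); // 4 bytes
            out.writeLong(dbId);   // 8 bytes
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Checks only the magic number, which is enough to recognise the file type. */
    static boolean looksLikeTxnLog(byte[] header) {
        return header.length >= 16 && ByteBuffer.wrap(header).getInt() == MAGIC;
    }
}
```

A file whose first four bytes are not `ZKLG` (for example a block of zeros) is rejected by `looksLikeTxnLog`, which is the sense in which the header acts as an identifier.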

TxnLog.append

Let's look at the most typical method, append, which can be seen as the core of the WAL process:

    /**
     * append an entry to the transaction log
     * @param hdr the header of the transaction
     * @param txn the transaction part of the entry
     * returns true iff something appended, otw false
     */
    public synchronized boolean append(TxnHeader hdr, Record txn)
        throws IOException
    {
        if (hdr == null) { // a null header means this is a read request, so return immediately
            return false;
        }
        if (hdr.getZxid() <= lastZxidSeen) {
            LOG.warn("Current zxid " + hdr.getZxid()
                    + " is <= " + lastZxidSeen + " for "
                    + hdr.getType());
        } else {
            lastZxidSeen = hdr.getZxid();
        }
        if (logStream==null) { // no open stream yet, so create a new log file and stream
           if(LOG.isInfoEnabled()){
                LOG.info("Creating new log file: " + Util.makeLogName(hdr.getZxid()));
           }

           logFileWrite = new File(logDir, Util.makeLogName(hdr.getZxid()));
           fos = new FileOutputStream(logFileWrite);
           logStream=new BufferedOutputStream(fos);
           oa = BinaryOutputArchive.getArchive(logStream);
           FileHeader fhdr = new FileHeader(TXNLOG_MAGIC,VERSION, dbId);
           fhdr.serialize(oa, "fileheader");   // write the file header
           // Make sure that the magic number is written before padding.
           logStream.flush();      // push the buffered header to the file before padding is computed
           filePadding.setCurrentSize(fos.getChannel().position());
           streamsToFlush.add(fos); // add to the queue of streams to flush
        }
        filePadding.padFile(fos.getChannel());   // preallocate if needed, 64MB at a time
        byte[] buf = Util.marshallTxnEntry(hdr, txn);  // serialize the header and the transaction
        if (buf == null || buf.length == 0) {
            throw new IOException("Faulty serialization for header " +
                    "and txn");
        }
        Checksum crc = makeChecksumAlgorithm();   // compute the checksum of the byte array
        crc.update(buf, 0, buf.length);
        oa.writeLong(crc.getValue(), "txnEntryCRC");// write the checksum to the log
        Util.writeTxnBytes(oa, buf);
        return true;
    }

A word on zxid (ZooKeeper Transaction Id), which is somewhat like MySQL's GTID. Every change to Zookeeper's state produces a zxid. Zxids are globally ordered: if zxid1 is less than zxid2, then zxid1 happened before zxid2.
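As far as we know, a zxid is a 64-bit number whose high 32 bits hold the leader epoch and whose low 32 bits hold a counter within that epoch, and the log file created in append is named `log.` followed by the zxid in hexadecimal. The sketch below only illustrates that layout. `ZxidSketch` and its methods are hypothetical helpers written for this article, not ZooKeeper's own utilities.

```java
/** Hypothetical helpers illustrating the epoch/counter layout of a zxid. */
class ZxidSketch {

    /** Packs an epoch and a counter into one 64-bit zxid. */
    static long makeZxid(long epoch, long counter) {
        return (epoch << 32) | (counter & 0xffffffffL);
    }

    /** High 32 bits: the epoch. */
    static long epochOf(long zxid) {
        return zxid >> 32;
    }

    /** Low 32 bits: the counter within the epoch. */
    static long counterOf(long zxid) {
        return zxid & 0xffffffffL;
    }

    /** File name for a log whose first transaction has this zxid. */
    static String logName(long zxid) {
        return "log." + Long.toHexString(zxid);
    }
}
```

Because the epoch sits in the high bits, any zxid from a later epoch compares greater than every zxid from an earlier one, which fits the ordering described above.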

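A brief walk through the write path:

Pick the transaction log to write to: right after Zk starts, or when the current log file is full, Zk is detached from any log file (logStream is null). A new log file is then created and named after the zxid.
Decide whether to preallocate: if fewer than 4KB remain in the current log file, the file is extended by preallocation.
Serialize the transaction.
Generate a checksum for each transaction, to verify the integrity and consistency of the data.
Write to the file. At this point the data only sits in a buffer and has not reached disk.
Flush to disk. Whether a forced sync happens depends on user configuration. This last step is carried out by commit rather than by append itself.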
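Steps three to five can be sketched with plain JDK classes. Following the append code above, the checksum here covers only the serialized bytes, and each entry is laid out as checksum (8 bytes), length (4 bytes), payload, then the byte 0x42. `TxnEntrySketch`, `frame`, and `unframe` are hypothetical names, and an arbitrary byte array stands in for the serialized header and record.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.zip.Adler32;

/** Hypothetical sketch of one log entry: checksum, length, payload, 0x42. */
class TxnEntrySketch {
    static final byte END_OF_RECORD = 0x42;

    static long checksum(byte[] payload) {
        Adler32 crc = new Adler32();
        crc.update(payload, 0, payload.length);
        return crc.getValue();
    }

    /** Wraps already-serialized bytes into a checked entry. */
    static byte[] frame(byte[] payload) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeLong(checksum(payload)); // checksum first
            out.writeInt(payload.length);     // then the length
            out.write(payload);               // then the payload itself
            out.writeByte(END_OF_RECORD);     // and the end-of-record marker
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Reads an entry back and rejects it if the marker or checksum is wrong. */
    static byte[] unframe(byte[] entry) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(entry))) {
            long stored = in.readLong();
            byte[] payload = new byte[in.readInt()];
            in.readFully(payload);
            if (in.readByte() != END_OF_RECORD) {
                throw new IllegalStateException("missing end-of-record byte");
            }
            if (stored != checksum(payload)) {
                throw new IllegalStateException("checksum mismatch");
            }
            return payload;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Flipping a single bit of the payload makes `unframe` fail, which is how a reader can detect a torn or corrupted entry when replaying the log.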

TxnLog.commit

This method is called roughly in these situations:

When the server is relatively idle.
When the number of pending requests exceeds 1000. The Group Commit mentioned earlier actually happens at this point.
When zk's shutdown hook is invoked.

    /**
     * commit the logs. make sure that everything hits the
     * disk
     */
    public synchronized void commit() throws IOException {
        if (logStream != null) {
            logStream.flush();
        }
        for (FileOutputStream log : streamsToFlush) {
            log.flush();
            if (forceSync) {
                long startSyncNS = System.nanoTime();

                FileChannel channel = log.getChannel();
                channel.force(false);// corresponds to fdatasync

                syncElapsedMS = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - startSyncNS);
                if (syncElapsedMS > fsyncWarningThresholdMS) {
                    if(serverStats != null) {
                        serverStats.incrementFsyncThresholdExceedCount();
                    }
                    LOG.warn("fsync-ing the write ahead log in "
                            + Thread.currentThread().getName()
                            + " took " + syncElapsedMS
                            + "ms which will adversely effect operation latency. "
                            + "File size is " + channel.size() + " bytes. "
                            + "See the ZooKeeper troubleshooting guide");
                }
            }
        }
        while (streamsToFlush.size() > 1) {
            streamsToFlush.removeFirst().close();
        }
    }

The code is very simple. If logStream still holds buffered data, it is flushed first. Then the method walks the queue of streams to flush (a linked list, which preserves operation order). It also times each sync and logs a warning if it takes too long.
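The first two triggers listed earlier (an idle server, or more than 1000 pending requests) amount to a small batching policy. The sketch below models only that policy and is not the server's actual request-processing code: `CommitPolicySketch`, `onRequest`, and `onIdle` are hypothetical names, and `commit()` just counts calls instead of touching a file.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of when a batched commit would be triggered. */
class CommitPolicySketch {
    static final int MAX_BATCH = 1000;

    private final List<String> toFlush = new ArrayList<>();
    private int commits = 0;

    /** Called for each request taken from the queue. */
    void onRequest(String request) {
        toFlush.add(request);             // appended to the log, not yet synced
        if (toFlush.size() > MAX_BATCH) { // too many pending: commit the whole group
            commit();
        }
    }

    /** Called when no further request is waiting. */
    void onIdle() {
        if (!toFlush.isEmpty()) {
            commit();
        }
    }

    private void commit() {
        commits++;       // stands in for the real flush and sync
        toFlush.clear(); // every pending request is now durable
    }

    int commits() {
        return commits;
    }

    int pending() {
        return toFlush.size();
    }
}
```

Under load the batch fills up and one sync covers more than a thousand requests. When traffic is light, the idle path keeps latency low by committing whatever is pending.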

DataTree and DataNode

DataTree is Zk's in-memory data structure, the MTable we mentioned before. It organizes DataNodes as a tree.

That may sound a bit abstract, so let's look directly at the DataNode code.

public class DataNode implements Record {
    /** the data for this datanode */
    byte data[];

    /**
     * the acl map long for this datanode. the datatree has the map
     */
    Long acl;

    /**
     * the stat for this node that is persisted to disk.
     */
    public StatPersisted stat;

    /**
     * the list of children for this node. note that the list of children string
     * does not contain the parent path -- just the last part of the path. This
     * should be synchronized on except deserializing (for speed up issues).
     */
    private Set<String> children = null;
.....
}

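Readers who have used ZkClient will find this familiar: these are the attributes returned when we fetch data by path. In other words, this is the class that describes a stored piece of data. Note that a DataNode also maintains its children.

Now that we have a basic picture of DataNode, let's look at DataTree. To avoid distraction, only the key member variables are shown: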

public class DataTree {
    private static final Logger LOG = LoggerFactory.getLogger(DataTree.class);

    /**
     * This hashtable provides a fast lookup to the datanodes. The tree is the
     * source of truth and is where all the locking occurs
     */
    private final ConcurrentHashMap<String, DataNode> nodes =
        new ConcurrentHashMap<String, DataNode>();

    private final WatchManager dataWatches = new WatchManager();

    private final WatchManager childWatches = new WatchManager();

    /**
     * This hashtable lists the paths of the ephemeral nodes of a session.
     */
    private final Map<Long, HashSet<String>> ephemerals =
        new ConcurrentHashMap<Long, HashSet<String>>();
    .......
}

As we can see, DataTree essentially stores DataNodes in a ConcurrentHashMap (ephemeral nodes included). The map goes from a DataNode's path to the DataNode itself.

Why keep two views of the same data, the flat map and the tree formed by each node's children? It depends on where each is used:

Ordinary CRUD requests on a ZNode go through the ConcurrentHashMap.
Serializing the DataTree starts at the root and traverses every node.

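When every node has to be visited, walking the tree from the root is the natural choice over pulling entries out of the ConcurrentHashMap one by one, since it yields each parent before its children.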
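The two views can be sketched side by side: a flat map for direct lookups by path, and child names stored on each node for the root-first walk. This is a simplified illustration with hypothetical names (`TreeIndexSketch`, `Node`, `preOrderPaths`). It stores plain strings rather than real node data and, as an assumption of the sketch, keeps the root under the empty path.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical sketch: a path-to-node map plus per-node child names. */
class TreeIndexSketch {

    static final class Node {
        final String data;
        final Set<String> children = new TreeSet<>(); // last path segment only

        Node(String data) {
            this.data = data;
        }
    }

    private final Map<String, Node> nodes = new HashMap<>();

    TreeIndexSketch() {
        nodes.put("", new Node("")); // the root lives under the empty path
    }

    /** Adds a node and registers its name with the parent. */
    void create(String path, String data) {
        int lastSlash = path.lastIndexOf('/');
        Node parent = nodes.get(path.substring(0, lastSlash));
        if (parent == null) {
            throw new IllegalArgumentException("no parent for " + path);
        }
        parent.children.add(path.substring(lastSlash + 1));
        nodes.put(path, new Node(data));
    }

    /** Point lookup: one hash access, no tree walk. */
    String get(String path) {
        Node n = nodes.get(path);
        return n == null ? null : n.data;
    }

    /** Full traversal: visits every node, parents before children. */
    List<String> preOrderPaths() {
        List<String> out = new ArrayList<>();
        walk(new StringBuilder(), out);
        return out;
    }

    private void walk(StringBuilder path, List<String> out) {
        String pathString = path.toString();
        Node node = nodes.get(pathString);
        if (node == null) {
            return;
        }
        out.add(pathString);
        path.append('/');
        int off = path.length();
        for (String child : node.children) {
            path.setLength(off); // reuse the builder: drop the previous child
            path.append(child);
            walk(path, out);
        }
    }
}
```

After creating `/a`, `/b`, and `/a/x`, the walk produces the root, `/a`, `/a/x`, `/b` in that order, while `get("/a/x")` never touches the tree at all.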

Next, let's look at the serialization code:

Serializing DataNodes

    /**
     * this method uses a stringbuilder to create a new path for children. This
     * is faster than string appends ( str1 + str2).
     *
     * @param oa
     *            OutputArchive to write to.
     * @param path
     *            a string builder.
     * @throws IOException
     * @throws InterruptedException
     */
    void serializeNode(OutputArchive oa, StringBuilder path) throws IOException {
        String pathString = path.toString();
        DataNode node = getNode(pathString);
        if (node == null) {
            return;
        }
        String children[] = null;
        DataNode nodeCopy;
        synchronized (node) {
            StatPersisted statCopy = new StatPersisted();
            copyStatPersisted(node.stat, statCopy);
            //we do not need to make a copy of node.data because the contents
            //are never changed
            nodeCopy = new DataNode(node.data, node.acl, statCopy);
            Set<String> childs = node.getChildren();
            children = childs.toArray(new String[childs.size()]);
        }
        serializeNodeData(oa, pathString, nodeCopy);
        path.append('/');
        int off = path.length();
        for (String child : children) {
            // since this is single buffer being resused
            // we need
            // to truncate the previous bytes of string.
            path.delete(off, Integer.MAX_VALUE);
            path.append(child);
            serializeNode(oa, path);
        }
    }

As we can see, all nodes are indeed traversed through each DataNode's children.

Deserializing DataNodes

Now for the deserialization code:

    public void deserialize(InputArchive ia, String tag) throws IOException {
        aclCache.deserialize(ia);
        nodes.clear();
        pTrie.clear();
        String path = ia.readString("path");
        while (!"/".equals(path)) {
            DataNode node = new DataNode();
            ia.readRecord(node, "node");
            nodes.put(path, node);
            synchronized (node) {
                aclCache.addUsage(node.acl);
            }
            int lastSlash = path.lastIndexOf('/');
            if (lastSlash == -1) {
                root = node;
            } else {
                String parentPath = path.substring(0, lastSlash);
                DataNode parent = nodes.get(parentPath);
                if (parent == null) {
                    throw new IOException("Invalid Datatree, unable to find " +
                            "parent " + parentPath + " of path " + path);
                }
                parent.addChild(path.substring(lastSlash + 1));
                long eowner = node.stat.getEphemeralOwner();
                EphemeralType ephemeralType = EphemeralType.get(eowner);
                if (ephemeralType == EphemeralType.CONTAINER) {
                    containers.add(path);
                } else if (ephemeralType == EphemeralType.TTL) {
                    ttls.add(path);
                } else if (eowner != 0) {
                    HashSet<String> list = ephemerals.get(eowner);
                    if (list == null) {
                        list = new HashSet<String>();
                        ephemerals.put(eowner, list);
                    }
                    list.add(path);
                }
            }
            path = ia.readString("path");
        }
        nodes.put("/", root);
        // we are done with deserializing the
        // the datatree
        // update the quotas - create path trie
        // and also update the stat nodes
        setupQuota();

        aclCache.purgeUnused();
    }

Because serialization is a pre-order traversal, deserialization always restores a parent node before its children.
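Why the order matters can be seen in a stripped-down version of the rebuild loop: each path looks up its parent by cutting at the last slash, so the parent must already be present. `RebuildSketch` and `rebuild` are hypothetical names, and the sketch only restores the parent-to-children links, not the node data, ACLs, or ephemeral bookkeeping handled by the real method.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical sketch: rebuilding child links from paths listed parents-first. */
class RebuildSketch {

    /** Returns a map from each path to the names of its children. */
    static Map<String, Set<String>> rebuild(List<String> pathsInPreOrder) {
        Map<String, Set<String>> children = new LinkedHashMap<>();
        for (String path : pathsInPreOrder) {
            children.put(path, new TreeSet<String>());
            int lastSlash = path.lastIndexOf('/');
            if (lastSlash == -1) {
                continue; // the root (empty path) has no parent
            }
            String parentPath = path.substring(0, lastSlash);
            Set<String> siblings = children.get(parentPath);
            if (siblings == null) {
                // A child arrived before its parent: the input is not in pre-order.
                throw new IllegalStateException(
                        "unable to find parent " + parentPath + " of path " + path);
            }
            siblings.add(path.substring(lastSlash + 1));
        }
        return children;
    }
}
```

Feeding the paths in pre-order succeeds in a single pass, whereas listing `/a/x` before `/a` fails immediately, much like the "Invalid Datatree" error in the code above.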

Snapshot

So when is the DataTree serialized? This is where snapshots come in.

As mentioned earlier, with an in-memory data structure plus a WAL, the WAL grows without bound, and on startup the storage system has to read a large amount of WAL data to rebuild the in-memory data. Snapshots solve this problem.

Besides cutting down the WAL, snapshots are also used during Zk's full sync: when a brand-new ZkServer (usually called a Learner) joins the cluster, the Leader sends its full data set to the newcomer.

Serialization

Here is the entry point:

    /**
     * serialize the datatree and session into the file snapshot
     * @param dt the datatree to be serialized
     * @param sessions the sessions to be serialized
     * @param snapShot the file to store snapshot into
     */
    public synchronized void serialize(DataTree dt, Map<Long, Integer> sessions, File snapShot)
            throws IOException {
        if (!close) {
            try (OutputStream sessOS = new BufferedOutputStream(new FileOutputStream(snapShot));
                 CheckedOutputStream crcOut = new CheckedOutputStream(sessOS, new Adler32())) {
                //CheckedOutputStream cout = new CheckedOutputStream()
                OutputArchive oa = BinaryOutputArchive.getArchive(crcOut);
                FileHeader header = new FileHeader(SNAP_MAGIC, VERSION, dbId);
                serialize(dt, sessions, oa, header);
                long val = crcOut.getChecksum().getValue();
                oa.writeLong(val, "val");
                oa.writeString("/", "path");
                sessOS.flush();
            }
        } else {
            throw new IOException("FileSnap has already been closed");
        }
    }

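Java IO basics are not covered here. Interested readers can look them up or read the author's article "Starting from a Piece of Code: A Brief Look at the Java IO Interfaces".

In essence, this creates a file and calls DataTree's serialization method. Serializing the DataTree just means walking the DataNodes and serializing each one, and the serialized content is then written to the file.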
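The checksum handling in serialize, and its counterpart on the read side, follow a pattern that can be shown in isolation: wrap the stream in a checked stream, write the body, then append the checksum of everything written so far. The sketch below uses hypothetical names (`SnapshotChecksumSketch`, `write`, `read`), an in-memory byte array instead of a file, and a single string instead of a data tree.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.zip.Adler32;
import java.util.zip.CheckedInputStream;
import java.util.zip.CheckedOutputStream;

/** Hypothetical sketch: body followed by an Adler32 checksum of the body. */
class SnapshotChecksumSketch {

    static byte[] write(String body) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (CheckedOutputStream crcOut = new CheckedOutputStream(bytes, new Adler32());
             DataOutputStream out = new DataOutputStream(crcOut)) {
            out.writeUTF(body);
            long val = crcOut.getChecksum().getValue(); // covers everything written so far
            out.writeLong(val);                         // stored right after the body
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    static String read(byte[] snapshot) {
        try (CheckedInputStream crcIn =
                     new CheckedInputStream(new ByteArrayInputStream(snapshot), new Adler32());
             DataInputStream in = new DataInputStream(crcIn)) {
            String body = in.readUTF();
            long checkSum = crcIn.getChecksum().getValue(); // taken before reading the stored value
            long val = in.readLong();
            if (val != checkSum) {
                throw new IllegalStateException("CRC corruption in snapshot");
            }
            return body;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The key detail is the ordering on both sides: the checksum is sampled after the body and before the stored value, so writer and reader compute it over exactly the same bytes.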

Deserialization

    /**
     * deserialize a data tree from the most recent snapshot
     * @return the zxid of the snapshot
     */
    public long deserialize(DataTree dt, Map<Long, Integer> sessions)
            throws IOException {
        // we run through 100 snapshots (not all of them)
        // if we cannot get it running within 100 snapshots
        // we should  give up
        List<File> snapList = findNValidSnapshots(100);
        if (snapList.size() == 0) {
            return -1L;
        }
        File snap = null;
        boolean foundValid = false;
        for (int i = 0, snapListSize = snapList.size(); i < snapListSize; i++) {
            snap = snapList.get(i);
            LOG.info("Reading snapshot " + snap);
            try (InputStream snapIS = new BufferedInputStream(new FileInputStream(snap));
                 CheckedInputStream crcIn = new CheckedInputStream(snapIS, new Adler32())) {
                InputArchive ia = BinaryInputArchive.getArchive(crcIn);
                deserialize(dt, sessions, ia);
                long checkSum = crcIn.getChecksum().getValue();
                long val = ia.readLong("val");
                if (val != checkSum) {
                    throw new IOException("CRC corruption in snapshot :  " + snap);
                }
                foundValid = true;
                break;
            } catch (IOException e) {
                LOG.warn("problem reading snap file " + snap, e);
            }
        }
        if (!foundValid) {
            throw new IOException("Not able to find valid snapshots in " + snapDir);
        }
        dt.lastProcessedZxid = Util.getZxidFromName(snap.getName(), SNAPSHOT_FILE_PREFIX);
        return dt.lastProcessedZxid;
    }

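Put simply, it finds the most recent snapshot files (up to 100), tries them from newest to oldest, and deserializes the first one whose checksum matches into the DataTree.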
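The fallback behaviour can be sketched without any file I/O. We assume here that snapshot files are named `snapshot.` followed by a hexadecimal zxid, which matches how the code above derives the zxid from the file name. `SnapshotPickerSketch`, `zxidFromName`, and `pick` are hypothetical names, and the validity check is passed in as a predicate instead of actually reading a file.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical sketch: pick the newest snapshot that passes validation. */
class SnapshotPickerSketch {

    /** Parses the hex zxid out of a name like "snapshot.1f", or returns -1. */
    static long zxidFromName(String name) {
        String[] parts = name.split("\\.");
        if (parts.length == 2 && parts[0].equals("snapshot")) {
            try {
                return Long.parseLong(parts[1], 16);
            } catch (NumberFormatException e) {
                return -1L;
            }
        }
        return -1L;
    }

    /**
     * Returns the zxid of the newest valid snapshot, or -1 if there are none at all.
     * Throws if snapshots exist but every one of them fails the check.
     */
    static long pick(List<String> names, Predicate<String> passesChecksum) {
        if (names.isEmpty()) {
            return -1L;
        }
        List<String> newestFirst = new ArrayList<>(names);
        newestFirst.sort((a, b) -> Long.compare(zxidFromName(b), zxidFromName(a)));
        for (String name : newestFirst) {
            if (passesChecksum.test(name)) {
                return zxidFromName(name); // the first valid one wins
            }
            // otherwise fall back to the next older snapshot
        }
        throw new IllegalStateException("Not able to find valid snapshots");
    }
}
```

If the newest snapshot is corrupted, the server loses only the transactions between it and the next older one, and those can be replayed from the transaction log.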

Summary

In this article we studied Zk's underlying storage technology together. A brief recap:

zk keeps its data mainly in memory. Before a write is applied to memory it goes to the WAL, and snapshots are periodically persisted to disk.
There are three common ways to optimize a WAL: Group Commit, File Padding, and Snapshot.

Also, Zk's serialization uses Apache Jute, which essentially calls Java's DataOutput and DataInput and is fairly simple, so it is not covered in this article.
