HASH表設(shè)計(jì)與源碼分析

  • 擴(kuò)容操作
  • 擴(kuò)容大小
  • 漸進(jìn)式rehash
  • 何時(shí)觸發(fā)漸進(jìn)式rehash

Redis提供了傳統(tǒng)的hash表實(shí)現(xiàn),但是對(duì)其中的內(nèi)存管理提供了擴(kuò)充,提供了擴(kuò)容/縮容處理,為了盡可能的高效,在擴(kuò)容縮容的過(guò)程中如果hash數(shù)據(jù)量非常多,將是一個(gè)堵塞的操作,redis設(shè)計(jì)了漸進(jìn)式rehash來(lái)將rehash操作分散到每次對(duì)hash的操作中。

擴(kuò)容

static int _dictExpandIfNeeded(dict *d)
{
    /* Incremental rehashing already in progress. Return. */
    if (dictIsRehashing(d)) return DICT_OK;

    /* If the hash table is empty expand it to the initial size. */
    if (d->ht[0].size == 0) return dictExpand(d, DICT_HT_INITIAL_SIZE);/*當(dāng)hash表的大小為0時(shí),觸發(fā)擴(kuò)容*/

    /* If we reached the 1:1 ratio, and we are allowed to resize the hash
     * table (global setting) or we should avoid it but the ratio between
     * elements/buckets is over the "safe" threshold, we resize doubling
     * the number of buckets. */
    if (d->ht[0].used >= d->ht[0].size &&
        (dict_can_resize ||
         d->ht[0].used/d->ht[0].size > dict_force_resize_ratio) &&
        dictTypeExpandAllowed(d))/*當(dāng)hash表的數(shù)據(jù)量大于hash表的大小并且允許進(jìn)行resize(redis當(dāng)前不在進(jìn)行RDB備份或者AOF重寫(xiě))時(shí),或者當(dāng)hash表的數(shù)據(jù)量大于hash表大小的5倍時(shí)*/
    {
        return dictExpand(d, d->ht[0].used + 1);
    }
    return DICT_OK;
}
int _dictExpand(dict *d, unsigned long size, int* malloc_failed)
{
    if (malloc_failed) *malloc_failed = 0;

    /* the size is invalid if it is smaller than the number of
     * elements already inside the hash table */
    if (dictIsRehashing(d) || d->ht[0].used > size)
        return DICT_ERR;

    dictht n; /* the new hash table */
    unsigned long realsize = _dictNextPower(size);/*計(jì)算新的數(shù)組大小,數(shù)組大小是大于hash元素的2的n次方*/

    /* Rehashing to the same table size is not useful. */
    if (realsize == d->ht[0].size) return DICT_ERR;

    /* Allocate the new hash table and initialize all pointers to NULL */
    n.size = realsize;
    n.sizemask = realsize-1;
    if (malloc_failed) {
        n.table = ztrycalloc(realsize*sizeof(dictEntry*));
        *malloc_failed = n.table == NULL;
        if (*malloc_failed)
            return DICT_ERR;
    } else
        n.table = zcalloc(realsize*sizeof(dictEntry*));

    n.used = 0;

    /* Is this the first initialization? If so it's not really a rehashing
     * we just set the first hash table so that it can accept keys. */
    if (d->ht[0].table == NULL) {
        d->ht[0] = n;
        return DICT_OK;
    }

    /* Prepare a second hash table for incremental rehashing */
    d->ht[1] = n;
    d->rehashidx = 0;
    return DICT_OK;
}

static unsigned long _dictNextPower(unsigned long size)
{
    unsigned long i = DICT_HT_INITIAL_SIZE;/*初識(shí)容量4*/

    if (size >= LONG_MAX) return LONG_MAX + 1LU;
    while(1) {
        if (i >= size)
            return i;
        i *= 2;
    }
}

擴(kuò)容條件

1、當(dāng)前hash表的元素大于hash表的大小,并且此時(shí)時(shí)允許resize的(當(dāng)前沒(méi)有進(jìn)行著RDB的備份和AOF的重寫(xiě))。
2、或者當(dāng)前hash表的元素大于hash表數(shù)量的5倍。

擴(kuò)容大小

大于hash表元素個(gè)數(shù)兩倍的2的n次冪。

漸進(jìn)式rehash

static void _dictRehashStep(dict *d) {
    if (d->pauserehash == 0) dictRehash(d,1);
}

nt dictRehash(dict *d, int n) {/*n 表示本次遷移的bucket的數(shù)量*/
    int empty_visits = n*10; /*表示本次遷移過(guò)程允許的最大的訪問(wèn)到空的bucket的次數(shù) */
    if (!dictIsRehashing(d)) return 0;

    while(n-- && d->ht[0].used != 0) {
        dictEntry *de, *nextde;
        /*rehashidx表示遷移的bucket下標(biāo)*/
        /* Note that rehashidx can't overflow as we are sure there are more
         * elements because ht[0].used != 0 */
        assert(d->ht[0].size > (unsigned long)d->rehashidx);
        while(d->ht[0].table[d->rehashidx] == NULL) {
            d->rehashidx++;
            if (--empty_visits == 0) return 1;
        }
        de = d->ht[0].table[d->rehashidx];
        /* Move all the keys in this bucket from the old to the new hash HT */
        while(de) {
            uint64_t h;

            nextde = de->next;
            /* Get the index in the new hash table */
            h = dictHashKey(d, de->key) & d->ht[1].sizemask;
            de->next = d->ht[1].table[h];
            d->ht[1].table[h] = de;
            d->ht[0].used--;
            d->ht[1].used++;
            de = nextde;
        }
        d->ht[0].table[d->rehashidx] = NULL;
        d->rehashidx++;
    }

    /* Check if we already rehashed the whole table... */
    if (d->ht[0].used == 0) {/*如果歷史hash表的元素全部遷移完成,則釋放歷史hash表的內(nèi)存*/
        zfree(d->ht[0].table);
        d->ht[0] = d->ht[1];
        _dictReset(&d->ht[1]);
        d->rehashidx = -1;
        return 0;
    }

    /* More to rehash... */
    return 1;
}

何時(shí)觸發(fā)漸進(jìn)式rehash

  • dictAddRaw(dict *d, void *key, dictEntry **existing)
  • dictGenericDelete(dict *d, const void *key, int nofree)
  • *dictFind(dict *d, const void *key)
  • *dictGetRandomKey(dict *d)
  • dictGetSomeKeys(dict *d, dictEntry **des, unsigned int count)
    幾乎是在hash的所有數(shù)據(jù)操作過(guò)程調(diào)用中,都會(huì)在進(jìn)行操作前先判斷當(dāng)前hash是否在進(jìn)行rehash操作再進(jìn)行rehash操作。

總結(jié)

redis對(duì)于hash表的設(shè)計(jì)很常規(guī),但是擴(kuò)展實(shí)現(xiàn)了擴(kuò)縮容操作。為了減少擴(kuò)縮容的時(shí)間消耗,redis采用的分治法,將hash擴(kuò)容之后的數(shù)據(jù)遷移工作分散到對(duì)hash的操作中,從而防止了單次數(shù)據(jù)遷移耗時(shí)高的問(wèn)題,但是帶來(lái)的弊端是會(huì)存在比單次完成數(shù)據(jù)遷移更多的內(nèi)存消耗,因?yàn)閿U(kuò)容時(shí)會(huì)存量?jī)蓚€(gè)數(shù)組內(nèi)存的消耗。

最后編輯于
?著作權(quán)歸作者所有,轉(zhuǎn)載或內(nèi)容合作請(qǐng)聯(lián)系作者
【社區(qū)內(nèi)容提示】社區(qū)部分內(nèi)容疑似由AI輔助生成,瀏覽時(shí)請(qǐng)結(jié)合常識(shí)與多方信息審慎甄別。
平臺(tái)聲明:文章內(nèi)容(如有圖片或視頻亦包括在內(nèi))由作者上傳并發(fā)布,文章內(nèi)容僅代表作者本人觀點(diǎn),簡(jiǎn)書(shū)系信息發(fā)布平臺(tái),僅提供信息存儲(chǔ)服務(wù)。

相關(guān)閱讀更多精彩內(nèi)容

友情鏈接更多精彩內(nèi)容