Innodb-AHI

作用

  • AHI是針對葉子節(jié)點(diǎn)的,作用是減少B樹尋址的代價。

如何做的

通過key(index_id+fileds+bytes):value(記錄的物理地址)來直接定位。
這里面有幾個細(xì)節(jié)

  • fileds+bytes如何選擇
  • key(index_id+fileds+bytes)能唯一確定一條記錄,如果存在重復(fù)的情況則選擇最左或最右的記錄。
    如下圖,假設(shè)fileds為2,bytes為0,則需要創(chuàng)建(a,a),(a,b),(a,c),(a,d)的key(不考慮跨頁的情況)。
    存在重復(fù)的情況,當(dāng)left_side為True時,key(a,b)的value選擇(a,b,1)這條記錄,否則key(a,b)的value選擇(a,b,3)這條記錄.


    image.png

如何使用

整體流程

在btr_cur_search_to_nth_level函數(shù)里

  • 如果滿足一定條件,就進(jìn)入btr_search_guess_on_hash來定位,再通過btr_search_check_guess來判斷記錄的有效性。
  • 如果無法通過AHI定位或者定位不成功,search_loop逐層查找。
  • 當(dāng)完成了搜索之后,如果最終定位的層是葉子節(jié)點(diǎn),會調(diào)用btr_search_info_update更新AHI相關(guān)的信息。
    代碼
btr_cur_search_to_nth_level(
    dict_index_t*  index,  /*!< in: index */
    ulint      level,  /*!< in: the tree level of search */
    ulint      mode,   /*!< in: PAGE_CUR_L, ...;
                Inserts should always be made using
                PAGE_CUR_LE to search the position! */
    ...)
{
    /* Use of AHI is disabled for intrinsic table as these tables re-use
    the index-id and AHI validation is based on index-id. */
    if (rw_lock_get_writer(btr_get_search_latch(index))
        == RW_LOCK_NOT_LOCKED
        && latch_mode <= BTR_MODIFY_LEAF
        && info->last_hash_succ
        && !index->disable_ahi
        && !estimate
# ifdef PAGE_CUR_LE_OR_EXTENDS
        && mode != PAGE_CUR_LE_OR_EXTENDS
# endif /* PAGE_CUR_LE_OR_EXTENDS */
        && !dict_index_is_spatial(index)
        /* If !has_search_latch, we do a dirty read of
        btr_search_enabled below, and btr_search_guess_on_hash()
        will have to check it again. */
        && UNIV_LIKELY(btr_search_enabled)
        && !modify_external
        && btr_search_guess_on_hash(index, info, tuple, mode,
                    latch_mode, cursor,
                    has_search_latch, mtr)) {

        /* Search using the hash index succeeded */

        ut_ad(cursor->up_match != ULINT_UNDEFINED
              || mode != PAGE_CUR_GE);
        ut_ad(cursor->up_match != ULINT_UNDEFINED
              || mode != PAGE_CUR_LE);
        ut_ad(cursor->low_match != ULINT_UNDEFINED
              || mode != PAGE_CUR_LE);
        btr_cur_n_sea++;

        DBUG_VOID_RETURN;
    }
  // 初始的,獲得索引的根節(jié)點(diǎn)(space_id,page_no)
  space = dict_index_get_space(index);
  page_no = dict_index_get_page(index);

search_loop:
  // 循環(huán)、逐層的查找,直至達(dá)到傳入的層數(shù)「level」,一般是0(即葉子節(jié)點(diǎn))
  // 此處的分析忽略Change Buffer的部分
  // 從Buffer Pool或磁盤中得到索引頁    
  block = buf_page_get_gen(space, zip_size, page_no, rw_latch, guess, buf_mode,
        file, line, mtr);
    
  // 在索引頁中中查找對于指定的Tuple,滿足某種條件(依賴于傳入的 mode,例如 PAGE_CUR_L)
  // 的 record 將查找結(jié)果保存在page_cursor中,page_cursor結(jié)構(gòu)也很簡單:
  //   struct page_cur_t{
  //     byte*       rec;    /*!< pointer to a record on page */
  //     buf_block_t*    block;  /*!< pointer to the block containing rec */
  //   };
  page_cur_search_with_match(block, index, tuple, page_mode, &up_match, &up_bytes,
        &low_match, &low_bytes, page_cursor);

  if (level != height) {
    // 如果沒到達(dá)指定層數(shù),獲得page_cursor(中間節(jié)點(diǎn))內(nèi)保存的下層節(jié)點(diǎn)的索引頁page_no
    //注意:中間節(jié)點(diǎn)的Value是一個Pointer(page_no),指向子節(jié)點(diǎn)(中間節(jié)點(diǎn)或葉子節(jié)點(diǎn))
    node_ptr = page_cur_get_rec(page_cursor);
    /* Go to the child node */
    page_no = btr_node_ptr_get_child_page_no(node_ptr, offsets);
        
    // 在下一層繼續(xù)查找
    goto search_loop;
  }

  // 達(dá)到指定層數(shù),函數(shù)退出
        if (btr_search_enabled && !index->disable_ahi) {
            btr_search_info_update(index, cursor);
        }
}

定位與判斷有效性

定位-btr_search_guess_on_hash
  • 首先用戶提供的前綴索引查詢條件必須大于等于構(gòu)建AHI時的前綴索引列數(shù),這里存在一種可能性:索引上的search_info的n_fields 和block上構(gòu)建AHI時的cur_n_fields值已經(jīng)不相同了,但是我們并不知道本次查詢到底落在哪個block上,這里一致以search_info上的n_fields為準(zhǔn)來計算fold,去查詢AHI;
  • 在檢索AHI時需要加&btr_search_latch的S鎖;
  • 如果本次無法命中AHI,就會將btr_search_info::last_hash_succ設(shè)置為false,這意味著隨后的查詢都不會去使用AHI了,只能等待下一路查詢信息分析后才可能再次啟動(btr_search_failure);
  • 對于從ahi中獲得的記錄指針,還需要根據(jù)當(dāng)前的查詢模式檢查是否是正確的記錄位置(btr_search_check_guess)。
判斷記錄有效性btr_search_check_guess

判斷記錄的有效性跟查詢模式很相關(guān),細(xì)節(jié)看注釋。

btr_search_check_guess(
    btr_cur_t*  cursor,
    ibool       can_only_compare_to_cursor_rec,
    const dtuple_t* tuple,
    ulint       mode,
    mtr_t*      mtr)
{
    rec_t*      rec;
    ulint       n_unique;
    ulint       match;
    int     cmp;
    mem_heap_t* heap        = NULL;
    ulint       offsets_[REC_OFFS_NORMAL_SIZE];
    ulint*      offsets     = offsets_;
    ibool       success     = FALSE;
    rec_offs_init(offsets_);
    n_unique = dict_index_get_n_unique_in_tree(cursor->index);
    rec = btr_cur_get_rec(cursor);
    ut_ad(page_rec_is_user_rec(rec));
    match = 0;
    offsets = rec_get_offsets(rec, cursor->index, offsets,
                  n_unique, &heap);
    cmp = cmp_dtuple_rec_with_match(tuple, rec, offsets, &match);
    if (mode == PAGE_CUR_GE) {
        //cmp>0,說明tuple大于rec,rec可能是重復(fù)情況下的最左記錄,比如AHI的key為(a,b),tuple為(a,b,c),rec為(a,b,a),那么這種情況就不行。
        if (cmp > 0) {
            goto exit_func;
        }
        cursor->up_match = match;
        if (match >= n_unique) {
            success = TRUE;
            goto exit_func;
        }
    } else if (mode == PAGE_CUR_LE) {
        if (cmp < 0) {
            goto exit_func;
        }
        cursor->low_match = match;
    } else if (mode == PAGE_CUR_G) {
        if (cmp >= 0) {
            goto exit_func;
        }
    } else if (mode == PAGE_CUR_L) {
        if (cmp <= 0) {
            goto exit_func;
        }
    }
    if (can_only_compare_to_cursor_rec) {
        /* Since we could not determine if our guess is right just by
        looking at the record under the cursor, return FALSE */
        goto exit_func;
    }
    match = 0;
    //還需要進(jìn)一步判斷記錄是否有效,當(dāng)mode為PAGE_CUR_G和PAGE_CUR_GE時,判斷記錄是否為滿足AHI查詢key的最左記錄。否則判斷記錄是否為滿足AHI查詢key的最右記錄。
    if ((mode == PAGE_CUR_G) || (mode == PAGE_CUR_GE)) {
        rec_t*  prev_rec;
        ut_ad(!page_rec_is_infimum(rec));
        prev_rec = page_rec_get_prev(rec);
        if (page_rec_is_infimum(prev_rec)) {
            success = btr_page_get_prev(page_align(prev_rec), mtr)
                == FIL_NULL;
            goto exit_func;
        }
        offsets = rec_get_offsets(prev_rec, cursor->index, offsets,
                      n_unique, &heap);
        cmp = cmp_dtuple_rec_with_match(
            tuple, prev_rec, offsets, &match);
        if (mode == PAGE_CUR_GE) {
            success = cmp > 0;
        } else {
            success = cmp >= 0;
        }
        goto exit_func;
    } else {
        rec_t*  next_rec;
        ut_ad(!page_rec_is_supremum(rec));
        next_rec = page_rec_get_next(rec);
        if (page_rec_is_supremum(next_rec)) {
            if (btr_page_get_next(page_align(next_rec), mtr)
                == FIL_NULL) {
                cursor->up_match = 0;
                success = TRUE;
            }
            goto exit_func;
        }
        offsets = rec_get_offsets(next_rec, cursor->index, offsets,
                      n_unique, &heap);
        cmp = cmp_dtuple_rec_with_match(
            tuple, next_rec, offsets, &match);
        if (mode == PAGE_CUR_LE) {
            success = cmp < 0;
            cursor->up_match = match;
        } else {
            success = cmp <= 0;
        }
    }
exit_func:
    if (UNIV_LIKELY_NULL(heap)) {
        mem_heap_free(heap);
    }
    return(success);
}

如何構(gòu)建AHI

fileds+bytes

上文說了key是由index_id+fileds+bytes構(gòu)成,那么如何確定fileds和bytes。

fileds和bytes是什么概念

參看http://www.itdecent.cn/p/0cdd573a8232

用于確定fileds和bytes的結(jié)構(gòu)體字段

總共有3個結(jié)構(gòu)體在確定fileds和bytes發(fā)揮作用,分別是btr_cur_t(樹查詢時的游標(biāo))、btr_search_t(為每個索引維護(hù)的查詢信息)、buf_block_t(block控制結(jié)構(gòu)體)。btr_cur_t中的信息在B樹定位中更新,在B樹定位后,btr_search_t根據(jù)btr_cur_t的信息更新,用于記錄B樹查詢相關(guān)的信息,然后buf_block_t根據(jù)btr_search_t的信息更新,用于記錄本Block相關(guān)的查詢信息。

為每個索引對象維護(hù)的index->search_info,類型為btr_search_t。

/** The search info struct in an index */
struct btr_search_t{

    ...

    ulint   n_fields;   /*!< recommended prefix length for hash search:
                number of full fields */
    ulint   n_fields;   /*!< recommended prefix: number of bytes in
                an incomplete field
                @see BTR_PAGE_MAX_REC_SIZE */
    ibool   left_side;  /*!< TRUE or FALSE, depending on whether
                the leftmost record of several records with
                the same prefix should be indexed in the
                hash index */

    ...

};

block控制結(jié)構(gòu)體上相關(guān)變量(buf_block_t)

struct buf_block_t{
    
    ...

    volatile ulint  n_bytes;    /*!< recommended prefix length for hash
                    search: number of bytes in
                    an incomplete last field */
    volatile ulint  n_fields;   /*!< recommended prefix length for hash
                    search: number of full fields */
    volatile bool   left_side;  /*!< true or false, depending on
                    whether the leftmost record of several
                    records with the same prefix should be
                    indexed in the hash index */
    ...
}

The tree cursor

struct btr_cur_t {

    ...

    ulint       up_match;   /*!< If the search mode was PAGE_CUR_LE,
                    the number of matched fields to the
                    the first user record to the right of
                    the cursor record after
                    btr_cur_search_to_nth_level;
                    for the mode PAGE_CUR_GE, the matched
                    fields to the first user record AT THE
                    CURSOR or to the right of it;
                    NOTE that the up_match and low_match
                    values may exceed the correct values
                    for comparison to the adjacent user
                    record if that record is on a
                    different leaf page! (See the note in
                    row_ins_duplicate_error_in_clust.) */
    ulint       up_bytes;   /*!< number of matched bytes to the
                    right at the time cursor positioned;
                    only used internally in searches: not
                    defined after the search */
    ulint       low_match;  /*!< if search mode was PAGE_CUR_LE,
                    the number of matched fields to the
                    first user record AT THE CURSOR or
                    to the left of it after
                    btr_cur_search_to_nth_level;
                    NOT defined for PAGE_CUR_GE or any
                    other search modes; see also the NOTE
                    in up_match! */
    ulint       low_bytes;  /*!< number of matched bytes to the
                    left at the time cursor positioned;
                    only used internally in searches: not
                    defined after the search */
    ulint       n_fields;   /*!< prefix length used in a hash
                    search if hash_node != NULL */
    ulint       n_bytes;    /*!< hash prefix bytes if hash_node !=
                    NULL */

    ...
    
};
確定fileds與bytes的時機(jī)

參考整體流程,當(dāng)完成了搜索之后,如果最終定位的層是葉子節(jié)點(diǎn),會調(diào)用btr_search_info_update更新AHI相關(guān)的信息。
這個時候cursor->{up_match, up_bytes, low_match, low_bytes}都已經(jīng)確定。
首先需要根據(jù)cursor->{up_match, up_bytes, low_match, low_bytes}來更新index的search info。
路徑為btr_search_info_update->btr_search_info_update_slow->btr_search_info_update_hash。
有兩種情況需要更新btr_search_t->{n_fields,n_bytes,left_side}。

  • btr_search_t->n_hash_potential為0:search info首次初始化或者上次查詢根據(jù)查詢條件無法唯一確定一條記錄。
  • 如代碼所示,如果cmp<=0,說明cursor->low_match, cursor->low_bytes所在的記錄是在info->n_fields, info->n_bytes這個范圍內(nèi)與查詢條件相等的最右邊的記錄,如果info的建議是按照相同前綴最左記錄構(gòu)建AHI,說明已不符合當(dāng)次查詢要求,需要重新生成建議。(補(bǔ)個圖吧)
    cmp = ut_pair_cmp(info->n_fields, info->n_bytes,
              cursor->low_match, cursor->low_bytes);
    if (info->left_side ? cmp <= 0 : cmp > 0) {
        goto set_new_recomm;
    }

生成info->{n_fields,n_bytes,left_side}新值是如下算法,由以下算法可以看出,選擇{info->n_fields, info->n_bytes, info->left_side}的依據(jù)則是在不超過 unique index 列數(shù)的前提下,使其計算代價最小,而 index->info->left_side 的值則會決定存儲同一數(shù)據(jù)頁上相同前綴索引的最左記錄還是最右記錄。
細(xì)節(jié)說明看注釋

set_new_recomm:
    /* We have to set a new recommendation; skip the hash analysis
    for a while to avoid unnecessary CPU time usage when there is no
    chance for success */
    info->hash_analysis = 0;
    cmp = ut_pair_cmp(cursor->up_match, cursor->up_bytes,
              cursor->low_match, cursor->low_bytes);
    if (cmp == 0) {
        //cmp==0說明根據(jù)查詢條件無法唯一確定一條記錄,比如根據(jù)=b查詢,然后定位low定位到a,up定位到c。
        info->n_hash_potential = 0;
        /* For extra safety, we set some sensible values here */
        info->n_fields = 1;
        info->n_bytes = 0;
        info->left_side = TRUE;
    } else if (cmp > 0) {
        //cm
        info->n_hash_potential = 1;
        if (cursor->up_match >= n_unique) {
            //n_unique個fileds已經(jīng)能唯一確定一條記錄了
            info->n_fields = n_unique;
            info->n_bytes = 0;

        } else if (cursor->low_match < cursor->up_match) {
            //+1怎么理解,比如low_match=1,up_match=3,這個時候把n_fields設(shè)置為2已經(jīng)足以定位到up_match上了,比如查詢條件是(a,b,c),low_match為(a), up_match為(a,b,c)這個時候使用(a,b已經(jīng)足以定位到up_match)。另外low_match=0時,n_fields設(shè)置為1,也足以滿足情況了。
            info->n_fields = cursor->low_match + 1;
            info->n_bytes = 0;
        } else {
            info->n_fields = cursor->low_match;
            info->n_bytes = cursor->low_bytes + 1;
        }

        info->left_side = TRUE;
    } else {
        info->n_hash_potential = 1;

        if (cursor->low_match >= n_unique) {

            info->n_fields = n_unique;
            info->n_bytes = 0;
        } else if (cursor->low_match > cursor->up_match) {

            info->n_fields = cursor->up_match + 1;
            info->n_bytes = 0;
        } else {
            info->n_fields = cursor->up_match;
            info->n_bytes = cursor->up_bytes + 1;
        }

        info->left_side = FALSE;
    }
static
void
btr_search_info_update_hash(
    btr_search_t*   info,
    btr_cur_t*  cursor)
{
    dict_index_t*   index = cursor->index;
    ulint       n_unique;
    int     cmp;

    ut_ad(!rw_lock_own(btr_get_search_latch(index), RW_LOCK_S));
    ut_ad(!rw_lock_own(btr_get_search_latch(index), RW_LOCK_X));

    if (dict_index_is_ibuf(index)) {
        /* So many deletes are performed on an insert buffer tree
        that we do not consider a hash index useful on it: */

        return;
    }

    n_unique = dict_index_get_n_unique_in_tree(index);

    if (info->n_hash_potential == 0) {

        goto set_new_recomm;
    }

    /* Test if the search would have succeeded using the recommended
    hash prefix */

    if (info->n_fields >= n_unique && cursor->up_match >= n_unique) {
increment_potential:
        info->n_hash_potential++;

        return;
    }

    cmp = ut_pair_cmp(info->n_fields, info->n_bytes,
              cursor->low_match, cursor->low_bytes);

    if (info->left_side ? cmp <= 0 : cmp > 0) {

        goto set_new_recomm;
    }

    cmp = ut_pair_cmp(info->n_fields, info->n_bytes,
              cursor->up_match, cursor->up_bytes);

    if (info->left_side ? cmp <= 0 : cmp > 0) {

        goto increment_potential;
    }

set_new_recomm:
    /* We have to set a new recommendation; skip the hash analysis
    for a while to avoid unnecessary CPU time usage when there is no
    chance for success */

    info->hash_analysis = 0;

    cmp = ut_pair_cmp(cursor->up_match, cursor->up_bytes,
              cursor->low_match, cursor->low_bytes);
    if (cmp == 0) {
        //cmp==0說明根據(jù)查詢條件無法唯一確定一條記錄,比如根據(jù)=b查詢,然后定位low定位到a,up定位到c。
        info->n_hash_potential = 0;

        /* For extra safety, we set some sensible values here */

        info->n_fields = 1;
        info->n_bytes = 0;

        info->left_side = TRUE;

    } else if (cmp > 0) {
        info->n_hash_potential = 1;

        if (cursor->up_match >= n_unique) {

            info->n_fields = n_unique;
            info->n_bytes = 0;

        } else if (cursor->low_match < cursor->up_match) {

            info->n_fields = cursor->low_match + 1;
            info->n_bytes = 0;
        } else {
            info->n_fields = cursor->low_match;
            info->n_bytes = cursor->low_bytes + 1;
        }

        info->left_side = TRUE;
    } else {
        info->n_hash_potential = 1;

        if (cursor->low_match >= n_unique) {

            info->n_fields = n_unique;
            info->n_bytes = 0;
        } else if (cursor->low_match > cursor->up_match) {

            info->n_fields = cursor->up_match + 1;
            info->n_bytes = 0;
        } else {
            info->n_fields = cursor->up_match;
            info->n_bytes = cursor->up_bytes + 1;
        }

        info->left_side = FALSE;
    }
}

完成Index層面的n_fileds和n_bytes建議后,如何落實到block層面。
代碼路徑為btr_search_info_update->btr_search_info_update_slow->btr_search_update_block_hash_info。
因為AHI雖然是針對Index產(chǎn)生建議,但是最終是在block上建立key:value的映射關(guān)系,block層面的記錄的是對block的查詢信息,如果滿足一定條件,就建立AHI。
有個疑問?在一個block構(gòu)建完成后,如果index建議的fields和bytes發(fā)生變化,innodb是什么行為。

如何避免頻繁構(gòu)建AHI

  • Index層面btr_search_t,如何避免頻繁生成新的建議
  • block層面buf_block_t,如何判斷該block是否值得構(gòu)建
    先說index層面的btr_search_t,btr_search_t有個變量hash_analysis,當(dāng)生成新的建議后hash_analysis被重置為0,重置后對該索引BTR_SEARCH_HASH_ANALYSIS次查詢內(nèi),都不會嘗試生成新的建議了。
    再說block層面的,參考https://juejin.cn/post/6844903536765976590

AHI并發(fā)控制

https://juejin.cn/post/6844903536765976590
https://developer.aliyun.com/article/41046
http://mysql.taobao.org/monthly/2015/09/01/
http://www.itdecent.cn/p/0cdd573a8232

?著作權(quán)歸作者所有,轉(zhuǎn)載或內(nèi)容合作請聯(lián)系作者
【社區(qū)內(nèi)容提示】社區(qū)部分內(nèi)容疑似由AI輔助生成,瀏覽時請結(jié)合常識與多方信息審慎甄別。
平臺聲明:文章內(nèi)容(如有圖片或視頻亦包括在內(nèi))由作者上傳并發(fā)布,文章內(nèi)容僅代表作者本人觀點(diǎn),簡書系信息發(fā)布平臺,僅提供信息存儲服務(wù)。

相關(guān)閱讀更多精彩內(nèi)容

友情鏈接更多精彩內(nèi)容