go的hashtable是用map實(shí)現(xiàn)。
今天簡單的整理了map的結(jié)構(gòu)。
1.map的結(jié)構(gòu)
hmap
在源碼中, map的結(jié)構(gòu)體是hmap, 它是hashmap的“縮寫”。
先看看看源碼是怎么解釋的。
// A header for a Go map.
type hmap struct {
// Note: the format of the hmap is also encoded in cmd/compile/internal/reflectdata/reflect.go.
// Make sure this stays in sync with the compiler's definition.
count int // # live cells == size of map. Must be first (used by len() builtin)
flags uint8
B uint8 // log_2 of # of buckets (can hold up to loadFactor * 2^B items)
noverflow uint16 // approximate number of overflow buckets; see incrnoverflow for details
hash0 uint32 // hash seed
buckets unsafe.Pointer // array of 2^B Buckets. may be nil if count==0.
oldbuckets unsafe.Pointer // previous bucket array of half the size, non-nil only when growing
nevacuate uintptr // progress counter for evacuation (buckets less than this have been evacuated)
extra *mapextra // optional fields
}
下面是每個(gè)字段的內(nèi)容和值。
bmap
其中可以看出保存map數(shù)據(jù)的是在【buckets unsafe.Pointer】
源碼中的buckets pointer 是指向 bmap。
// A bucket for a Go map.
type bmap struct {
// tophash generally contains the top byte of the hash value
// for each key in this bucket. If tophash[0] < minTopHash,
// tophash[0] is a bucket evacuation state instead.
tophash [bucketCnt]uint8
// Followed by bucketCnt keys and then bucketCnt elems.
// NOTE: packing all the keys together and then all the elems together makes the
// code a bit more complicated than alternating key/elem/key/elem/... but it allows
// us to eliminate padding which would be needed for, e.g., map[int64]int8.
// Followed by an overflow pointer.
}
這里只有一個(gè)tophash字段,但是實(shí)際運(yùn)行中bmap的結(jié)構(gòu)會(huì)多幾個(gè)字段。
看看這些源碼的內(nèi)容。
// https://github.com/golang/go/blob/go1.13.8/src/cmd/compile/internal/gc/reflect.go
func bmap(t *types.Type) *types.Type {
// 略...
field := make([]*types.Field, 0, 5)
field = append(field, makefield("topbits", arr))
// 略...
keys := makefield("keys", arr)
field = append(field, keys)
// 略...
elems := makefield("elems", arr)
field = append(field, elems)
// 略...
if int(elemtype.Align) > Widthptr || int(keytype.Align) > Widthptr {
field = append(field, makefield("pad", types.Types[TUINTPTR]))
}
// 略...
overflow := makefield("overflow", otyp)
field = append(field, overflow)
// 略...
}
從這里我們可以推斷出,在執(zhí)行的過程中bmap的結(jié)構(gòu)是如下的。
type bmap struct {
topbits [8]uint8
keys [8]keytype
values [8]valuetype
pad uintptr
overflow uintptr
}
這是每個(gè)字段含有的內(nèi)容和值。
這里需要解釋一下。每個(gè)bucket(bmap)會(huì)存8調(diào)數(shù)據(jù)。
結(jié)構(gòu)是頭部會(huì)存一個(gè)uint8的tophash, 而且不是 key-value 的格式存純數(shù)據(jù)。
是存連續(xù)的8個(gè)的key和連續(xù)的8個(gè)value。這樣會(huì)節(jié)省空間。
mapextra
hamap里除了buckets(bmap)以外還有個(gè)extra *mapextra字段,
從字段名就能看出是是用于擴(kuò)展的。
// mapextra holds fields that are not present on all maps.
type mapextra struct {
// If both key and elem do not contain pointers and are inline, then we mark bucket
// type as containing no pointers. This avoids scanning such maps.
// However, bmap.overflow is a pointer. In order to keep overflow buckets
// alive, we store pointers to all overflow buckets in hmap.extra.overflow and hmap.extra.oldoverflow.
// overflow and oldoverflow are only used if key and elem do not contain pointers.
// overflow contains overflow buckets for hmap.buckets.
// oldoverflow contains overflow buckets for hmap.oldbuckets.
// The indirection allows to store a pointer to the slice in hiter.
overflow *[]*bmap
oldoverflow *[]*bmap
// nextOverflow holds a pointer to a free overflow bucket.
nextOverflow *bmap
}
2.map的key的定位過程
下面繼續(xù)看看Map是如何在這樣的結(jié)構(gòu)中保存數(shù)據(jù)和查詢的。
結(jié)合上圖看看map查詢key的定位過程的步驟:
- 先通過hash函數(shù)獲取目標(biāo)key的哈希,通過這個(gè)哈希值確定這個(gè)數(shù)據(jù)在哪個(gè)bucket(bmap)。
- 然后遍歷bmap里的鍵(先對(duì)比前8位tophash,如果一致再去對(duì)比key)獲取key的索引(找不到則返回空值)。
- 根據(jù)key的索引計(jì)算偏移量,獲取到對(duì)應(yīng)value。
- 如果bucket 中沒找到,并且 overflow 不為空,還要繼續(xù)去 overflow bucket 中尋找。
如上步驟是一個(gè)簡化的過程,實(shí)際細(xì)節(jié)比這個(gè)復(fù)雜的多了。
如果想了解更詳細(xì)的內(nèi)容那只能是分析源代碼了。
而且現(xiàn)在有很多大牛分享了這一個(gè)過程。
可以先看一下人家分析的內(nèi)容在去看源代碼,會(huì)節(jié)省的很多時(shí)間。
歡迎大家的意見和交流
email: li_mingxie@163.com