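The string and []byte types are among the data structures we use most often when programming in Go. This article examines the ways of converting between them and analyzes how the two types are related internally, to clear up the confusion.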

Two ways to convert

Standard conversion

When converting between string and []byte in Go, every gopher will immediately think of the following approach, which we will call the standard conversion.

    // string to []byte
    s1 := "hello"
    b := []byte(s1)
    
    // []byte to string
    s2 := string(b)

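Forced conversion

Using the unsafe and reflect packages, another kind of conversion is possible, which we will call the forced conversion (it is also often referred to as black magic).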

func String2Bytes(s string) []byte {
    sh := (*reflect.StringHeader)(unsafe.Pointer(&s))
    bh := reflect.SliceHeader{
        Data: sh.Data,
        Len:  sh.Len,
        Cap:  sh.Len,
    }
    return *(*[]byte)(unsafe.Pointer(&bh))
}

func Bytes2String(b []byte) string {
    return *(*string)(unsafe.Pointer(&b))
}
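
The unsafe package gained unsafe.String, unsafe.StringData, unsafe.Slice and unsafe.SliceData in Go 1.20, and the reflect header types have since been deprecated in favor of them. The following is a minimal sketch of the same two helpers written that way. The names StringToBytes and BytesToString are illustrative and not part of the original code, and the caveats about shared memory discussed later still apply.

```go
package main

import (
	"fmt"
	"unsafe"
)

// StringToBytes returns a []byte that shares memory with s (no copy).
// The returned slice must never be written to.
// (Illustrative helper name; requires Go 1.20 or later.)
func StringToBytes(s string) []byte {
	if s == "" {
		return nil
	}
	return unsafe.Slice(unsafe.StringData(s), len(s))
}

// BytesToString returns a string that shares memory with b (no copy).
// b must not be modified while the returned string is in use.
func BytesToString(b []byte) string {
	if len(b) == 0 {
		return ""
	}
	return unsafe.String(unsafe.SliceData(b), len(b))
}

func main() {
	b := StringToBytes("hello")
	fmt.Println(len(b), cap(b), BytesToString(b))
}
```

Compared with building a reflect.SliceHeader by hand, this form keeps the data pointer typed as a pointer throughout, so there is no intermediate uintptr value.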

Performance comparison

Since there are two ways to convert, it is worth comparing their performance.

// Test that the forced conversion from []byte to string is correct
func TestBytes2String(t *testing.T) {
    x := []byte("Hello Gopher!")
    y := Bytes2String(x)
    z := string(x)

    if y != z {
        t.Fail()
    }
}

// Test that the forced conversion from string to []byte is correct
func TestString2Bytes(t *testing.T) {
    x := "Hello Gopher!"
    y := String2Bytes(x)
    z := []byte(x)

    if !bytes.Equal(y, z) {
        t.Fail()
    }
}

// Benchmark the standard conversion string()
func Benchmark_NormalBytes2String(b *testing.B) {
    x := []byte("Hello Gopher! Hello Gopher! Hello Gopher!")
    for i := 0; i < b.N; i++ {
        _ = string(x)
    }
}

// Benchmark the forced conversion from []byte to string
func Benchmark_Byte2String(b *testing.B) {
    x := []byte("Hello Gopher! Hello Gopher! Hello Gopher!")
    for i := 0; i < b.N; i++ {
        _ = Bytes2String(x)
    }
}

// Benchmark the standard conversion []byte()
func Benchmark_NormalString2Bytes(b *testing.B) {
    x := "Hello Gopher! Hello Gopher! Hello Gopher!"
    for i := 0; i < b.N; i++ {
        _ = []byte(x)
    }
}

// Benchmark the forced conversion from string to []byte
func Benchmark_String2Bytes(b *testing.B) {
    x := "Hello Gopher! Hello Gopher! Hello Gopher!"
    for i := 0; i < b.N; i++ {
        _ = String2Bytes(x)
    }
}

The results are as follows:

$ go test -bench="." -benchmem
goos: darwin
goarch: amd64
pkg: workspace/example/stringBytes
Benchmark_NormalBytes2String-8          38363413                27.9 ns/op            48 B/op          1 allocs/op
Benchmark_Byte2String-8                 1000000000               0.265 ns/op           0 B/op          0 allocs/op
Benchmark_NormalString2Bytes-8          32577080                34.8 ns/op            48 B/op          1 allocs/op
Benchmark_String2Bytes-8                1000000000               0.532 ns/op           0 B/op          0 allocs/op
PASS
ok      workspace/example/stringBytes   3.170s

Note that -benchmem reports the number of memory allocations per operation and the number of bytes allocated per operation.

When x is "Hello Gopher!" in every benchmark, the results are as follows:

$ go test -bench="." -benchmem
goos: darwin
goarch: amd64
pkg: workspace/example/stringBytes
Benchmark_NormalBytes2String-8          245907674                4.86 ns/op            0 B/op          0 allocs/op
Benchmark_Byte2String-8                 1000000000               0.266 ns/op           0 B/op          0 allocs/op
Benchmark_NormalString2Bytes-8          202329386                5.92 ns/op            0 B/op          0 allocs/op
Benchmark_String2Bytes-8                1000000000               0.532 ns/op           0 B/op          0 allocs/op
PASS
ok      workspace/example/stringBytes   4.383s

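In both runs the forced conversion is clearly faster than the standard conversion: roughly 0.3 to 0.5 ns/op, against about 5 to 35 ns/op.

Readers may want to think about the following questions:

1. Why does the forced conversion perform better than the standard conversion?

2. Why, in the tests above, does the standard conversion perform one memory allocation when x is larger, making it even slower, while the forced conversion is unaffected?

3. If the forced conversion performs so well, why is the standard conversion the one the Go language offers us?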

How it works

To answer these three questions, we first need to understand what string and []byte actually are in Go.

[]byte

In Go, byte is an alias for uint8. The builtin package of the standard library documents it as follows:

// byte is an alias for uint8 and is equivalent to uint8 in all ways. It is
// used, by convention, to distinguish byte values from 8-bit unsigned
// integer values.
type byte = uint8

In the Go source, src/runtime/slice.go defines slice as follows:

type slice struct {
    array unsafe.Pointer
    len   int
    cap   int
}

array is a pointer to the underlying array, len is the length, and cap is the capacity. For a []byte, array points to an array of bytes.

[Figure: 1.png]
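
To make the three fields concrete, here is a small sketch showing that a sub-slice gets its own len and cap while still pointing into the same underlying array. The function name sharedHeader is made up for this illustration.

```go
package main

import "fmt"

// sharedHeader builds a 13-byte slice, takes a sub-slice of its first five
// bytes and writes through the sub-slice. It returns the sub-slice's len and
// cap plus the contents of the original slice after the write.
func sharedHeader() (headLen, headCap int, after string) {
	backing := make([]byte, 13) // len 13, cap 13
	copy(backing, "Hello Gopher!")

	head := backing[:5] // new header: same array pointer, len 5, cap 13
	head[0] = 'J'       // writes into the shared underlying array

	return len(head), cap(head), string(backing)
}

func main() {
	l, c, after := sharedHeader()
	fmt.Println(l, c, after) // 5 13 Jello Gopher!
}
```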

string

As for the string type, the builtin package of the standard library documents it as follows:

// string is the set of all strings of 8-bit bytes, conventionally but not
// necessarily representing UTF-8-encoded text. A string may be empty, but
// not nil. Values of string type are immutable.
type string string

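In other words: a string is a collection of 8-bit bytes that usually, but not necessarily, represents UTF-8-encoded text. A string can be empty but cannot be nil. The value of a string cannot be changed.

In the Go source, src/runtime/string.go defines string as follows: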

type stringStruct struct {
    str unsafe.Pointer
    len int
}

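stringStruct represents a string object: the str pointer points to the first element of some array, and len is the length of that array. So what is this array? The answer can be found where a stringStruct is instantiated.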

//go:nosplit
func gostringnocopy(str *byte) string {
    ss := stringStruct{str: unsafe.Pointer(str), len: findnull(str)}
    s := *(*string)(unsafe.Pointer(&ss))
    return s
}

As you can see, the parameter str is a pointer to byte, so we can conclude that the underlying data structure of a string is a byte array.

[Figure: 2.png]
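
A quick way to see that a string is a sequence of bytes rather than characters is to compare its byte length with its rune count. The sketch below does that; byteView is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// byteView returns the byte length of s, its number of UTF-8 runes,
// and its first byte. s must not be empty.
func byteView(s string) (byteLen, runeCount int, first byte) {
	return len(s), utf8.RuneCountInString(s), s[0]
}

func main() {
	// "\u00e9" (e with acute accent) occupies two bytes in UTF-8.
	n, r, first := byteView("h\u00e9llo")
	fmt.Println(n, r, string(first)) // 6 5 h

	// A string may hold bytes that are not valid UTF-8 at all.
	fmt.Println(utf8.ValidString("\xff")) // false
}
```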

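In summary, string and []byte are very similar in their underlying structure (the representation of the latter merely has an extra cap field, so their memory layouts line up). This is why the built-in function copy has the special case copy(dst []byte, src string) int.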

// The copy built-in function copies elements from a source slice into a
// destination slice. (As a special case, it also will copy bytes from a
// string to a slice of bytes.) The source and destination may overlap. Copy
// returns the number of elements copied, which will be the minimum of
// len(src) and len(dst).
func copy(dst, src []Type) int
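
As a small illustration of this special case, the sketch below copies from a string directly into a byte slice and returns the count reported by copy. copyPrefix is an illustrative name, not something from the standard library.

```go
package main

import "fmt"

// copyPrefix copies up to n bytes of s into a fresh []byte of length n and
// returns the slice together with the number of bytes actually copied.
func copyPrefix(s string, n int) ([]byte, int) {
	dst := make([]byte, n)
	copied := copy(dst, s) // string source, []byte destination
	return dst, copied
}

func main() {
	dst, n := copyPrefix("Hello Gopher!", 5)
	fmt.Printf("%q %d\n", dst, n) // "Hello" 5

	// When the source is shorter than the destination, copy stops early.
	_, n = copyPrefix("Hi", 5)
	fmt.Println(n) // 2
}
```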

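Differences

The biggest difference between []byte and string is that the value of a string cannot be changed. What does that mean? Two examples illustrate it.

For a []byte, the following operation is allowed: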

    b := []byte("Hello Gopher!")
    b[1] = 'T'

For a string, modification is forbidden (the following does not compile):

    s := "Hello Gopher!"
    s[1] = 'T'

A string does, however, support this:

    s := "Hello Gopher!"
    s = "Tello Gopher!"

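The value of a string cannot be modified, but it can be replaced. Under the hood every string is a struct stringStruct{str: str_point, len: str_len}. For a string literal, the str pointer points to the address of a string constant whose contents cannot be changed because they are read-only, but the pointer itself can be made to point to a different address.

So the following operations mean different things: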

s := "S1" // memory holding "S1" is set aside; the str pointer in s points to it
s = "S2"  // memory holding "S2" is set aside; the str pointer in s now points to it instead

b := []byte{1} // allocates an array holding 1; the array pointer in b points to it
b[0] = 2       // changes the contents of that array to 2 in place

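This is illustrated in the figure below:

[Figure: 3.png]

Because the content a string points to cannot be changed, every modification of a string requires a new memory allocation, and the previously allocated space has to be reclaimed by the GC. This is the fundamental reason why string operations are less efficient than []byte operations.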
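
The cost described here shows up most clearly when a string is built piece by piece. As a rough sketch, the two functions below produce the same result, but the first creates a new string on every step while the second appends into a growable byte buffer via strings.Builder. Both function names are made up for this example.

```go
package main

import (
	"fmt"
	"strings"
)

// joinNaive concatenates with +=, which creates a new string (and usually a
// new allocation) on every iteration because strings are immutable.
func joinNaive(parts []string) string {
	s := ""
	for _, p := range parts {
		s += p
	}
	return s
}

// joinBuilder appends into a strings.Builder, which grows an internal []byte
// and only turns it into a string once at the end.
func joinBuilder(parts []string) string {
	var sb strings.Builder
	for _, p := range parts {
		sb.WriteString(p)
	}
	return sb.String()
}

func main() {
	parts := []string{"Hello", " ", "Gopher", "!"}
	fmt.Println(joinNaive(parts) == joinBuilder(parts)) // true
}
```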

Implementation details of the standard conversion

The implementation of []byte(string) (the source is in src/runtime/string.go)

// The constant is known to the compiler.
// There is no fundamental theory behind this number.
const tmpStringBufSize = 32

type tmpBuf [tmpStringBufSize]byte

func stringtoslicebyte(buf *tmpBuf, s string) []byte {
    var b []byte
    if buf != nil && len(s) <= len(buf) {
        *buf = tmpBuf{}
        b = buf[:len(s)]
    } else {
        b = rawbyteslice(len(s))
    }
    copy(b, s)
    return b
}

// rawbyteslice allocates a new byte slice. The byte slice is not zeroed.
func rawbyteslice(size int) (b []byte) {
    cap := roundupsize(uintptr(size))
    p := mallocgc(cap, nil, false)
    if cap != uintptr(size) {
        memclrNoHeapPointers(add(p, uintptr(size)), cap-uintptr(size))
    }

    *(*slice)(unsafe.Pointer(&b)) = slice{p, size, int(cap)}
    return
}

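There are two cases here: either a non-nil buf is supplied and s fits within its 32 bytes, or not. In the second case Go has to call mallocgc to allocate a new block of memory (sized according to s). This answers question 2 above: when x is larger, the standard conversion performs one memory allocation.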
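
The allocation can also be observed outside a benchmark. The sketch below is only an approximation under stated assumptions: it uses testing.AllocsPerRun from the standard library and stores the result in a package-level variable so that the converted slice escapes and cannot live in a stack buffer. The names sink and AllocsForEscapingConversion are invented for the example, and the exact numbers may differ between Go versions.

```go
package main

import (
	"fmt"
	"testing"
)

// sink is a package-level variable. Storing into it makes the converted
// slice escape, so the conversion has to obtain heap memory.
var sink []byte

// AllocsForEscapingConversion reports the average number of heap allocations
// made by one []byte(s) conversion whose result escapes.
func AllocsForEscapingConversion(s string) float64 {
	return testing.AllocsPerRun(100, func() {
		sink = []byte(s)
	})
}

func main() {
	// 41 bytes, longer than the 32-byte temporary buffer.
	long := "Hello Gopher! Hello Gopher! Hello Gopher!"
	fmt.Println(AllocsForEscapingConversion(long))
}
```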

Finally, the copy from string to []byte is done with copy, which is implemented by slicestringcopy in src/runtime/slice.go.

func slicestringcopy(to []byte, fm string) int {
    if len(fm) == 0 || len(to) == 0 {
        return 0
    }

    // the number of bytes copied is the smaller of the string length and the []byte length
    n := len(fm)
    if len(to) < n {
        n = len(to)
    }

    // if the race detector is enabled (-race)
    if raceenabled {
        callerpc := getcallerpc()
        pc := funcPC(slicestringcopy)
        racewriterangepc(unsafe.Pointer(&to[0]), uintptr(n), callerpc, pc)
    }
    // if the memory sanitizer is enabled (-msan)
    if msanenabled {
        msanwrite(unsafe.Pointer(&to[0]), uintptr(n))
    }

    // copy the first n bytes of the string's underlying array into the []byte's underlying array
    // (memmove is the core of copy and is implemented in assembly, in memmove_*.s)
    memmove(unsafe.Pointer(&to[0]), stringStructOf(&fm).str, uintptr(n))
    return n
}

The copy process is illustrated in the figure below:

[Figure: 4.png]

The implementation of string([]byte) (the source is also in src/runtime/string.go)

// Buf is a fixed-size buffer for the result,
// it is not nil if the result does not escape.
func slicebytetostring(buf *tmpBuf, b []byte) (str string) {
    l := len(b)
    if l == 0 {
        // Turns out to be a relatively common case.
        // Consider that you want to parse out data between parens in "foo()bar",
        // you find the indices and convert the subslice to string.
        return ""
    }
    // if the race detector is enabled (-race)
    if raceenabled {
        racereadrangepc(unsafe.Pointer(&b[0]),
            uintptr(l),
            getcallerpc(),
            funcPC(slicebytetostring))
    }
    // if the memory sanitizer is enabled (-msan)
    if msanenabled {
        msanread(unsafe.Pointer(&b[0]), uintptr(l))
    }
    if l == 1 {
        stringStructOf(&str).str = unsafe.Pointer(&staticbytes[b[0]])
        stringStructOf(&str).len = 1
        return
    }

    var p unsafe.Pointer
    if buf != nil && len(b) <= len(buf) {
        p = unsafe.Pointer(buf)
    } else {
        p = mallocgc(uintptr(len(b)), nil, false)
    }
    stringStructOf(&str).str = p
    stringStructOf(&str).len = len(b)
    // copy the byte slice's data into the string's memory
    memmove(p, (*(*slice)(unsafe.Pointer(&b))).array, uintptr(len(b)))
    return
}

// stringStructOf reinterprets a *string as a *stringStruct
func stringStructOf(sp *string) *stringStruct {
    return (*stringStruct)(unsafe.Pointer(sp))
}

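As you can see, when the slice is longer than 32 bytes, mallocgc likewise has to be called to allocate new memory. Finally, memmove performs the copy.

Implementation details of the forced conversion

The all-purpose unsafe.Pointer

In Go, a pointer *T of any type can be converted to unsafe.Pointer, which can hold the address of any variable. An unsafe.Pointer can also be converted back to an ordinary pointer, and the target type does not have to be the same as the original *T. In addition, unsafe.Pointer can be converted to uintptr, which holds the numeric value of the address and so lets us do arithmetic on addresses. These rules are what the forced conversion is built on.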
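
The conversion rule from *T1 through unsafe.Pointer to *T2 can be tried on something simpler than strings. The sketch below reinterprets the bits of a float64 as a uint64, which is the same trick math.Float64bits performs. The function name float64Bits is chosen for this example.

```go
package main

import (
	"fmt"
	"math"
	"unsafe"
)

// float64Bits reinterprets the memory of f as a uint64 by converting
// *float64 -> unsafe.Pointer -> *uint64 and dereferencing the result.
// Both types are 8 bytes wide, so the layouts are compatible.
func float64Bits(f float64) uint64 {
	return *(*uint64)(unsafe.Pointer(&f))
}

func main() {
	fmt.Printf("%#x\n", float64Bits(1.0))                 // 0x3ff0000000000000
	fmt.Println(float64Bits(1.0) == math.Float64bits(1.0)) // true
}
```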

In the reflect package, string and slice correspond to the structs reflect.StringHeader and reflect.SliceHeader, which are the runtime representations of string and slice.

type StringHeader struct {
    Data uintptr
    Len  int
}

type SliceHeader struct {
    Data uintptr
    Len  int
    Cap  int
}

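Memory layout

The runtime representations of string and slice show that, apart from SliceHeader having an extra Cap field of type int, the Data and Len fields are identical. Their memory layouts therefore line up, which means we can convert between them directly through unsafe.Pointer.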
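
The layout claim can be checked with unsafe.Sizeof. Assuming the standard gc toolchain, a string value occupies two machine words (pointer and length) and a slice value occupies three (pointer, length and capacity). headerWords is an illustrative name.

```go
package main

import (
	"fmt"
	"unsafe"
)

// headerWords returns the size of a string value and of a []byte value,
// measured in machine words (the size of a uintptr).
func headerWords() (stringWords, sliceWords uintptr) {
	word := unsafe.Sizeof(uintptr(0))
	var s string
	var b []byte
	return unsafe.Sizeof(s) / word, unsafe.Sizeof(b) / word
}

func main() {
	sw, bw := headerWords()
	fmt.Println(sw, bw) // 2 3
}
```

Because the first two words of a slice have the same meaning as the two words of a string, reading a []byte through a *string simply ignores the trailing capacity word.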

[Figure: 5.png, []byte to string]

[Figure: 6.png, string to []byte]

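Q1. Why does the forced conversion perform better than the standard conversion?

The standard conversion, whether from []byte to string or from string to []byte, copies the underlying array. The forced conversion simply repoints the pointer so that the string and the []byte share the same underlying array. Naturally, the latter is faster.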
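
The difference between copying and sharing can be made visible by modifying the byte slice after both conversions. The sketch below reuses the Bytes2String helper from earlier in the article; convertThenMutate is a name invented for this demonstration. Writing to the slice while the aliased string is alive is exactly what real code must avoid, and it is done here only to show the sharing.

```go
package main

import (
	"fmt"
	"unsafe"
)

// Bytes2String is the forced conversion from the article: the returned
// string points at the same bytes as b.
func Bytes2String(b []byte) string {
	return *(*string)(unsafe.Pointer(&b))
}

// convertThenMutate converts b with both methods, then overwrites b[0].
// The copied string keeps the old contents; the aliased one sees the write.
func convertThenMutate(b []byte) (copied, aliased string) {
	copied = string(b)        // standard conversion: private copy
	aliased = Bytes2String(b) // forced conversion: shared memory
	b[0] = 'H'
	return copied, aliased
}

func main() {
	copied, aliased := convertThenMutate([]byte("hello"))
	fmt.Println(copied, aliased) // hello Hello
}
```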

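Q2. Why, in the tests above, does the standard conversion perform one memory allocation when x is larger, making it even slower, while the forced conversion is unaffected?

With the standard conversion, when the data is longer than 32 bytes, new memory has to be requested through mallocgc before the data is copied. The forced conversion only changes where the pointer points. So the larger the data being converted, the wider the performance gap between the two becomes.

Q3. If the forced conversion performs so well, why is the standard conversion the one the Go language offers us?

First, we need to remember that Go is a type-safe language, and the price of safety is a compromise on performance. But performance comparisons are relative, and on today's machines this compromise is negligible. Moreover, the forced conversion introduces serious safety hazards into our programs.

Consider the following example: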

a := "hello"
b := String2Bytes(a)
b[0] = 'H'

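a is a string, and as discussed above its value cannot be modified. The forced conversion hands a's underlying array to b, and b is a []byte, whose value can be modified. Modifying the underlying array at this point causes a fatal error (one that cannot be caught even with defer+recover).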

unexpected fault address 0x10b6139
fatal error: fault
[signal SIGBUS: bus error code=0x2 addr=0x10b6139 pc=0x1088f2c]

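Q4. Why are strings designed to be immutable?

I think this question is worth thinking about. A string being immutable means it is read-only. The benefit is that in concurrent scenarios the same string can be used many times without any locking, so it can be shared efficiently without safety concerns.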
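
As a small sketch of this point, the function below lets several goroutines read the same string at once with no mutex around it. Each goroutine writes only to its own slot of the results slice, so the only shared data is the read-only string. countConcurrently is a hypothetical name.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// countConcurrently starts the given number of goroutines, each of which
// counts the occurrences of "o" in s, and returns the sum of their results.
// s is shared by all goroutines without any lock.
func countConcurrently(s string, workers int) int {
	var wg sync.WaitGroup
	counts := make([]int, workers)

	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			counts[i] = strings.Count(s, "o") // read-only access to s
		}(i)
	}
	wg.Wait()

	total := 0
	for _, c := range counts {
		total += c
	}
	return total
}

func main() {
	fmt.Println(countConcurrently("Hello Gopher!", 4)) // 8
}
```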

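Trade-offs

When you are unsure about the safety implications, prefer the standard conversion.
When the program has strict performance requirements, the data is only ever read, and conversions are frequent (for example when forwarding messages), the forced conversion can be used.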
