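I have been working with Java for a while now and have a rough grasp of its basic principles and usage. I wanted to improve further by reading source code, and I had heard that Doug Lea's java.util.concurrent package is well worth reading, hence this article.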
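The atomic package is another package under java.util.concurrent designed specifically for thread safety; it provides a set of atomic variable classes. Their basic property is that, when several threads call methods on the same instance at the same time, each method behaves exclusively: once a thread enters the method and starts executing its instructions, it is not interrupted by other threads, and the other threads wait, as if on a spin lock, until the method completes, at which point the JVM picks another thread from the wait queue to enter. This is only a logical way of picturing it. In reality the classes rely on hardware instructions and do not block threads (or, put differently, the blocking happens only at the hardware level). They can operate on primitive values, on elements of arrays, and on fields of objects.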
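As a small illustration of that property, here is a minimal sketch in which several threads increment one shared AtomicInteger without any lock. The class name AtomicCounterDemo, the helper countWithThreads, and the thread and iteration counts are made up for this example.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class; names and counts are illustrative only.
class AtomicCounterDemo {
    // Starts `threads` workers that each call incrementAndGet() `perThread`
    // times on one shared AtomicInteger, then returns the final value.
    static int countWithThreads(int threads, int perThread) {
        AtomicInteger counter = new AtomicInteger(0);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    counter.incrementAndGet(); // atomic read-modify-write, no lock
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join(); // wait for every worker to finish
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(countWithThreads(4, 10_000)); // prints 40000
    }
}
```

No increment is lost, so the result is always threads * perThread.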
The figure below shows the class structure of the java.util.concurrent.atomic package: 12 classes in total, which can be divided into 4 groups:
- Basic types: AtomicBoolean, AtomicInteger, AtomicLong, AtomicReference
- Array types: AtomicIntegerArray, AtomicLongArray, AtomicReferenceArray
- Updater types: AtomicLongFieldUpdater, AtomicIntegerFieldUpdater, AtomicReferenceFieldUpdater
- Compound variable types: AtomicMarkableReference, AtomicStampedReference
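Before looking at how each class is implemented, let's first get to know the base class they all depend on: Unsafe. This class wraps a large number of calls into native C code, including direct memory allocation and atomic operations. It is marked as unsafe to tell you that many of its methods carry safety risks and must be used with care; misuse can have serious consequences.
Below we walk through the source of AtomicInteger as an example and explain the logic and purpose of each piece of code.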
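The first line obtains an instance of the Unsafe class. Unsafe is the foundation of the atomic operations; in other words, all of them are implemented on top of unsafe. valueOffset is the offset of the value field within an AtomicInteger instance's memory layout (an offset relative to the object, not an absolute address).
The lines above obtain the location of the value field within an AtomicInteger instance, using Unsafe's objectFieldOffset method. In the source read here this is a native method, and it returns the offset of a given non-static (instance) field; static fields are handled by the separate staticFieldOffset method.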
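The layout described above can be sketched as follows. This is not the JDK source: UnsafeCounterSketch is a hypothetical class, and it obtains Unsafe through reflection on the private theUnsafe field because Unsafe.getUnsafe() rejects ordinary application code. It assumes a JDK where sun.misc.Unsafe and these methods are still available; newer JDKs print deprecation warnings for them.

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

// Hypothetical class mimicking the described layout; not the JDK source.
class UnsafeCounterSketch {
    private static final Unsafe unsafe;
    private static final long valueOffset;

    static {
        try {
            // Assumption: read the private static "theUnsafe" field reflectively.
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            unsafe = (Unsafe) f.get(null);
            // Offset of the instance field "value" inside each object.
            valueOffset = unsafe.objectFieldOffset(
                    UnsafeCounterSketch.class.getDeclaredField("value"));
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    private volatile int value;

    int get() {
        return value;
    }

    // CAS on the int stored at (this object + valueOffset).
    boolean compareAndSet(int expect, int update) {
        return unsafe.compareAndSwapInt(this, valueOffset, expect, update);
    }

    public static void main(String[] args) {
        UnsafeCounterSketch c = new UnsafeCounterSketch();
        System.out.println(c.compareAndSet(0, 5)); // true
        System.out.println(c.compareAndSet(0, 7)); // false, value is already 5
        System.out.println(c.get());               // 5
    }
}
```

Application code should normally use AtomicInteger itself rather than touching Unsafe; the sketch only shows what the offset is used for.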
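This one is very simple: every AtomicInteger instance holds a value, represented by the variable value. The attentive reader will have noticed the volatile keyword; embarrassingly, after a year of writing Java I had never had occasion to use it. According to the (older) memory model description in the Java Language Specification, the JVM has a main memory (Main Memory, or Java Heap Memory) in which all variables are stored and which is shared by all threads. Each thread also has its own working memory (Working Memory) holding copies of some of the variables in main memory. A thread operates on variables in its working memory, threads cannot access each other's working memory directly, and values pass between threads only through main memory. As a result, the value of the same variable in working memory and in main memory may differ. Loosely speaking, volatile says: do not keep a private working copy of me, read and write me in main memory. More precisely, a write to a volatile variable happens-before every subsequent read of that variable, so the reader always sees the latest write.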
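A minimal sketch of the visibility guarantee, assuming a hypothetical class VolatileFlagDemo: a worker spins on a volatile flag and terminates once another thread sets it. Without volatile, the worker would be allowed to keep using a stale copy and might never stop.

```java
// Hypothetical demo; class, field, and method names are illustrative.
class VolatileFlagDemo {
    // volatile: a write by one thread is visible to later reads by others.
    private static volatile boolean stop = false;

    // Returns true if the worker observed the flag and terminated in time.
    static boolean runAndStop() {
        stop = false;
        Thread worker = new Thread(() -> {
            while (!stop) {
                // busy-wait until the flag change becomes visible
            }
        });
        worker.start();
        stop = true; // volatile write
        try {
            worker.join(5_000); // wait at most five seconds
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return !worker.isAlive();
    }

    public static void main(String[] args) {
        System.out.println(runAndStop()); // prints true
    }
}
```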
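There are two constructors; the one that takes an argument assigns it to the value field of the AtomicInteger instance.
These are the get and set methods for value. Because value is declared volatile, reading and writing it requires no locking.
The getAndSet method calls compareAndSet internally (in the version of the source read here), so let's first look at how compareAndSet works. The foundation of the Atomic classes is CAS. So what is CAS? Below is the explanation from Wikipedia.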
In computer science, the compare-and-swap CPU instruction ("CAS") (or the Compare & Exchange - CMPXCHG instruction in the x86 and Itanium architectures) is a special instruction that atomically (regarding intel x86, lock prefix should be there to make it really atomic) compares the contents of a memory location to a given value and, only if they are the same, modifies the contents of that memory location to a given new value. This guarantees that the new value is calculated based on up-to-date information; if the value had been updated by another thread in the meantime, the write would fail. The result of the operation must indicate whether it performed the substitution; this can be done either with a simple Boolean response (this variant is often called compare-and-set), or by returning the value read from the memory location (not the value written to it). Compare-and-Swap (and Compare-and-Swap-Double) has been an integral part of the IBM 370(and all successor) architectures since 1970. The operating systems which run on these architectures make extensive use of Compare-and-Swap (and Compare-and-Swap-Double) to facilitate process (i.e., system and user tasks) and processor (i.e., central processors) parallelism while eliminating, to the greatest degree possible, the "disabled spin locks" which were employed in earlier IBM operating systems. In these operating systems, new units of work may be instantiated "globally", into the Global Service Priority List, or "locally", into the Local Service Priority List, by the execution of a single Compare-and-Swap instruction. This dramatically improved the responsiveness of these operating systems.
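CAS is a primitive provided by the CPU hardware. It works like this: I believe location V should contain the value A; if it does, put B there; otherwise leave the location unchanged and just tell me its current value. The compareAndSet method shown above is implemented by calling this CAS primitive (compareAndSet itself reports only success or failure, as a boolean).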
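To make the semantics concrete, here is a minimal sketch using the public AtomicInteger API. The helper getAndSetViaCas is hypothetical; it only illustrates how a getAndSet can be built from a compareAndSet retry loop, as the text describes, and is not copied from the JDK.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class for the CAS semantics.
class CasDemo {
    // Hypothetical helper: getAndSet built from a compareAndSet retry loop.
    static int getAndSetViaCas(AtomicInteger target, int newValue) {
        for (;;) {
            int current = target.get();               // "I think V contains A"
            if (target.compareAndSet(current, newValue)) {
                return current;                       // swap happened, return old value
            }
            // Another thread changed the value in between: read again and retry.
        }
    }

    public static void main(String[] args) {
        AtomicInteger v = new AtomicInteger(1);
        System.out.println(v.compareAndSet(1, 2));  // true: expected 1, found 1
        System.out.println(v.compareAndSet(1, 3));  // false: expected 1, found 2
        System.out.println(getAndSetViaCas(v, 10)); // 2
        System.out.println(v.get());                // 10
    }
}
```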
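These two methods increment and decrement value. Since value is volatile and its get and set need no lock, why do increment and decrement have to go through CAS? Look closely at incrementAndGet() and you will see that the increment is actually carried out in two steps: reading the current value, then writing back current + 1.
Because volatile only guarantees that each individual read or write sees the latest value, the following can happen:
- Thread A calls get() and obtains current (say 1)
- Thread B calls get() and obtains current (also 1)
- Thread B computes next = current + 1, so next = 2
- Thread A computes next = current + 1, so next = 2
Both threads then write 2, so two increments raise the value by only one. That is clearly not what we want, so the increment must be done with CAS.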
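The interleaving above can be replayed deterministically on a single thread to see how CAS catches the conflict. This is only a sketch: LostUpdateDemo and replayWithCas are hypothetical names, and the variables suffixed A and B stand in for the two threads.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo that replays the A/B interleaving on one thread.
class LostUpdateDemo {
    static int replayWithCas() {
        AtomicInteger value = new AtomicInteger(1);
        int currentA = value.get();  // A reads 1
        int currentB = value.get();  // B reads 1
        int nextB = currentB + 1;    // B computes 2
        int nextA = currentA + 1;    // A computes 2

        // B writes first: value is still 1, so the CAS succeeds (1 -> 2).
        value.compareAndSet(currentB, nextB);
        // A's CAS fails: it expected 1 but the value is now 2.
        boolean aSucceeded = value.compareAndSet(currentA, nextA);
        if (!aSucceeded) {
            // A retries from a fresh read, which is what a CAS loop does.
            currentA = value.get();                      // 2
            value.compareAndSet(currentA, currentA + 1); // 2 -> 3
        }
        return value.get();
    }

    public static void main(String[] args) {
        System.out.println(replayWithCas()); // prints 3
    }
}
```

With plain volatile reads and writes both threads would store 2; with CAS the stale write is rejected and retried, so both increments take effect.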
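While reading the source I also came across some methods that are harder to understand, such as the following one.
Since there is already a set, why is there also a lazySet? Puzzled, I googled it right away and was lucky enough to find an explanation from the original author.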
"As probably the last little JSR166 follow-up for Mustang, we added a "lazySet" method to the Atomic classes (AtomicInteger, AtomicReference, etc). This is a niche method that is sometimes useful when fine-tuning code using non-blocking data structures. The semantics are that the write is guaranteed not to be re-ordered with any previous write, but may be reordered with subsequent operations (or equivalently, might not be visible to other threads) until some other volatile write or synchronizing action occurs).
The main use case is for nulling out fields of nodes in non-blocking data structures solely for the sake of avoiding long-term garbage retention; it applies when it is harmless if other threads see non-null values for a while, but you'd like to ensure that structures are eventually GCable. In such cases, you can get better performance by avoiding the costs of the null volatile-write. There are a few other use cases along these lines for non-reference-based atomics as well, so the method is supported across all of the AtomicX classes.
For people who like to think of these operations in terms of machine-level barriers on common multiprocessors, lazySet provides a preceeding store-store barrier (which is either a no-op or very cheap on current platforms), but no store-load barrier (which is usually the expensive part of a volatile-write)."
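The main use case from the quote, clearing a node field only so that its payload can be garbage collected, might look like the sketch below. LazySetDemo, Node, and takeItem are hypothetical names invented for this example.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical demo of lazySet for clearing a field.
class LazySetDemo {
    // Hypothetical node of some non-blocking structure.
    static final class Node {
        final AtomicReference<Object> item;

        Node(Object item) {
            this.item = new AtomicReference<>(item);
        }

        // Takes the payload and clears the slot so it can be collected.
        Object takeItem() {
            Object taken = item.get();
            // Cheaper than set(null): other threads may still see the old
            // value for a while, which is harmless in this use case.
            item.lazySet(null);
            return taken;
        }
    }

    public static void main(String[] args) {
        Node n = new Node("payload");
        System.out.println(n.takeItem());  // payload
        System.out.println(n.item.get());  // null (same thread sees its own write)
    }
}
```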
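weakCompareAndSet() is similar to compareAndSet(); both are conditional modifier methods. Each takes two arguments, the expected value (expected) and the new value (new): if the atomic's current value equals the expected value, the new value is stored and the method returns true to indicate success; otherwise nothing is stored and it returns false. The specification says that weakCompareAndSet atomically reads and conditionally writes a variable but does not create any happens-before orderings, so it provides no guarantees with respect to previous or subsequent reads and writes of any variables other than the target of the weakCompareAndSet. In other words, a call to weakCompareAndSet does not establish a happens-before relationship with the surrounding operations (so they may be reordered around it), and the call may also fail spuriously. Judging from the Java source read here, however, the method does not take advantage of this weaker contract: it ends up equivalent to compareAndSet, and both call unsafe.compareAndSwapInt().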
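Because the weak variant is allowed to fail spuriously, it only makes sense inside a retry loop. The sketch below shows that shape; WeakCasDemo and doubleValue are hypothetical names. Note that on newer JDKs AtomicInteger.weakCompareAndSet is deprecated in favor of weakCompareAndSetPlain, so compiling this may print a deprecation note.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo of weakCompareAndSet inside a retry loop.
class WeakCasDemo {
    // Hypothetical helper: atomically doubles the value and returns the result.
    static int doubleValue(AtomicInteger target) {
        for (;;) {
            int current = target.get();
            int next = current * 2;
            // May return false even if the value still equals `current`
            // (spurious failure), so the loop simply tries again.
            if (target.weakCompareAndSet(current, next)) {
                return next;
            }
        }
    }

    public static void main(String[] args) {
        AtomicInteger v = new AtomicInteger(21);
        System.out.println(doubleValue(v)); // prints 42
    }
}
```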
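The other classes work in much the same way. Below are some differences I noticed while reading the source:
Array types: AtomicIntegerArray, AtomicLongArray, AtomicReferenceArray
- There is no array class for Boolean; the integer array can be used instead. The underlying implementation is the same, since AtomicBoolean itself is backed by an int.
- Declaring an array variable volatile does not make its elements volatile, so element get/set have to go through Unsafe. The methods mirror those described above, with an extra index parameter that selects which element of the array to operate on.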
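A short sketch of the index-based methods, using the public AtomicIntegerArray API; the class name AtomicArrayDemo and the values are made up for illustration.

```java
import java.util.Arrays;
import java.util.concurrent.atomic.AtomicIntegerArray;

// Hypothetical demo of per-element atomic operations.
class AtomicArrayDemo {
    static int[] demo() {
        AtomicIntegerArray arr = new AtomicIntegerArray(3); // elements start at 0
        arr.set(0, 5);                // element 0: volatile-style write
        arr.incrementAndGet(1);       // element 1: 0 -> 1
        arr.compareAndSet(2, 0, 9);   // element 2: expected 0, becomes 9
        arr.compareAndSet(0, 1, 7);   // fails: element 0 is 5, not 1
        return new int[] { arr.get(0), arr.get(1), arr.get(2) };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(demo())); // prints [5, 1, 9]
    }
}
```

Every method takes the element index as its first argument; the remaining arguments match the corresponding AtomicInteger method.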
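Updater types: AtomicLongFieldUpdater, AtomicIntegerFieldUpdater, AtomicReferenceFieldUpdater
- These use reflection to update a field of a class atomically. The field's type must match what the updater requires: with AtomicIntegerFieldUpdater, for example, the field must be of type int (the primitive, not the Integer wrapper) and must be declared volatile. An updater offers the same methods as the numeric atomic classes, each with an extra parameter for an object of the target class, and it updates that field of the object passed in.
- The updater itself is an abstract class with a private implementation; in the manner of the facade pattern, the abstract class exposes a static method (newUpdater) that creates the implementation.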
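A minimal sketch of an updater in use. FieldUpdaterDemo, Account, balance, and deposit are hypothetical names; the only real API used is AtomicIntegerFieldUpdater.newUpdater and the updater's instance methods.

```java
import java.util.concurrent.atomic.AtomicIntegerFieldUpdater;

// Hypothetical demo of atomically updating a plain volatile int field.
class FieldUpdaterDemo {
    // Hypothetical target class. The field must be a volatile int
    // and must be accessible from the code that creates the updater.
    static final class Account {
        volatile int balance;
    }

    // Created once through the static factory on the abstract class.
    private static final AtomicIntegerFieldUpdater<Account> BALANCE =
            AtomicIntegerFieldUpdater.newUpdater(Account.class, "balance");

    // Same operation as AtomicInteger.addAndGet, plus the target object.
    static int deposit(Account account, int amount) {
        return BALANCE.addAndGet(account, amount);
    }

    public static void main(String[] args) {
        Account a = new Account();
        System.out.println(deposit(a, 100));                  // 100
        System.out.println(BALANCE.compareAndSet(a, 100, 50)); // true
        System.out.println(a.balance);                        // 50
    }
}
```

One reason to prefer this over an AtomicInteger field is that no extra object is allocated per instance; the target class keeps an ordinary int.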
Compound variable types: AtomicMarkableReference, AtomicStampedReference
- The former is an AtomicReference of type ReferenceBooleanPair, where ReferenceBooleanPair represents a pair of an object and a boolean mark
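From the caller's side, the reference and its boolean mark are read and updated together as one unit. The sketch below uses only the public AtomicMarkableReference API; MarkableDemo and the string values are made up for illustration.

```java
import java.util.concurrent.atomic.AtomicMarkableReference;

// Hypothetical demo of updating a reference and its mark atomically.
class MarkableDemo {
    static String demo() {
        String first = "v1";
        String second = "v2";
        AtomicMarkableReference<String> ref =
                new AtomicMarkableReference<>(first, false);

        // Succeeds only if both the reference and the mark match.
        boolean swapped = ref.compareAndSet(first, second, false, true);
        // Fails: the reference matches but the mark is now true, not false.
        boolean stale = ref.compareAndSet(second, first, false, false);

        boolean[] markHolder = new boolean[1];
        String current = ref.get(markHolder); // reads reference and mark together
        return swapped + " " + stale + " " + current + " " + markHolder[0];
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints true false v2 true
    }
}
```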