Chapter 1 Computer Abstractions and Technology
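Anyone who has written a paper will recognize the word abstraction; sure enough, this chapter is an overview. Technically minded readers will probably find it dry, but for an industry manager every sentence is a gem.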
1.1 Introduction
The section first surveys how computer science has advanced across a wide range of fields, mainly the following:
- automobiles
- cell phones
- Human Genome Project
- World Wide Web
- search engines
It then summarizes the traditional classes of computer applications and their characteristics, in three categories:
- PCs
- Servers
- Embedded computers
Finally, it welcomes readers to the post-PC era and describes its characteristics:
- The defining feature is that the PMD replaces the PC
- PMD: Personal Mobile Device
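So what can readers learn from this book? The answers are:
- How programs written in a high-level language (C, Java) are translated into the language of the hardware, and how the hardware executes them;
- What the interface between software and hardware is, and how software instructs the hardware to perform a particular function;
- What determines the performance of a program, and how to improve it;
- What techniques hardware designers can use to improve performance;
- What techniques hardware designers can use to improve energy efficiency;
- The reasons for parallelism and how it is expected to evolve;
- The great ideas of modern computer architecture;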
1.2 Eight Great Ideas in Computer Architecture
Over the past 60 years, eight great ideas have emerged in computer architecture:
- Design for Moore's Law
- Use Abstraction to Simplify Design
- Make the Common Case Fast
- Performance via Parallelism
- Performance via Pipelining
- Performance via Prediction
- Hierarchy of Memories
- Dependability via Redundancy
1.3 Below Your Program
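Application software sits on top of system software, which in turn sits on top of the hardware;
Two kinds of system software are central:
- operating system
- compiler: translates programs written in a high-level language into instructions the machine can execute;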
From a High-Level Language to the Language of Hardware
The language the machine understands is made up of instructions. An instruction is usually in binary form, e.g. 1001010100101110, which tells the computer to add two numbers;
1.4 Under the Covers
Using the iPad 2 as an example, this section gives a brief tour of the hardware, from the LCD to the CPU.
1.5 Technologies for Building Processors and Memory
A brief introduction to semiconductor manufacturing.
The cost of an integrated circuit can be expressed in 3 simple equations:
Cost per die = Cost per wafer / (Dies per wafer × Yield)
Dies per wafer ≈ Wafer area / Die area
Yield = 1 / (1 + (Defects per area × Die area / 2))^2
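As a rough illustration of how the three equations chain together, here is a minimal Python sketch. The function names and every number in it (wafer cost, wafer area, die areas, defect density) are made up for the example and do not come from the book.

```python
def dies_per_wafer(wafer_area, die_area):
    """Approximate die count; ignores partial dies lost at the wafer edge."""
    return wafer_area / die_area


def die_yield(defects_per_area, die_area):
    """Fraction of good dies: 1 / (1 + defects_per_area * die_area / 2)^2."""
    return 1.0 / (1.0 + defects_per_area * die_area / 2.0) ** 2


def cost_per_die(cost_per_wafer, wafer_area, die_area, defects_per_area):
    """Spread the wafer cost over the good dies only."""
    good_dies = dies_per_wafer(wafer_area, die_area) * die_yield(defects_per_area, die_area)
    return cost_per_wafer / good_dies


# Hypothetical inputs: areas in mm^2, defect density in defects per mm^2.
small = cost_per_die(cost_per_wafer=7000, wafer_area=70000, die_area=100, defects_per_area=0.01)
large = cost_per_die(cost_per_wafer=7000, wafer_area=70000, die_area=200, defects_per_area=0.01)
print(f"100 mm^2 die: {small:.2f} per die, 200 mm^2 die: {large:.2f} per die")
```

With these assumed inputs, doubling the die area more than doubles the cost per die, because fewer dies fit on the wafer and a smaller fraction of them work.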
1.6 Performance
- Defining Performance
- Measuring Performance
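  - Time is the measure of computer performance, and it can be defined in several ways:
    - wall clock time, response time, and elapsed time: three names for the total time to complete a task, including I/O and operating system overhead
    - CPU execution time (CPU time): the time the CPU spends computing for this task
      - user CPU time: time spent executing the program itself
      - system CPU time: time spent in the operating system on the program's behalf;
      - the two are hard to separate precisely;
  - We usually like to judge a program's performance by its elapsed time, but elapsed time and CPU time are not the same thing
  - We will use the term system performance to refer to elapsed time on an unloaded system and CPU performance to refer to user CPU time;
  - For example, some servers depend heavily on I/O, so their performance has to be judged on hardware and software together, while other applications may care only about throughput, response time, or a combination of the two. To improve performance you must therefore know which performance metric matters, which makes it easier to find the bottleneck.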
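To make the difference between elapsed time and CPU time concrete, here is a small Python sketch using only the standard library. The helper names `busy_work` and `measure` are invented for this example; `time.perf_counter()` stands in for wall clock time and `time.process_time()` for user plus system CPU time of the current process.

```python
import os
import time


def busy_work(n):
    """CPU-bound loop: sum of squares below n."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def measure(func, *args):
    """Return (result, elapsed_seconds, cpu_seconds) for one call."""
    wall_start = time.perf_counter()   # wall clock / elapsed time
    cpu_start = time.process_time()    # user + system CPU time of this process
    result = func(*args)
    cpu = time.process_time() - cpu_start
    elapsed = time.perf_counter() - wall_start
    return result, elapsed, cpu


result, elapsed, cpu = measure(busy_work, 200_000)
print(f"busy loop: elapsed {elapsed:.4f} s, CPU {cpu:.4f} s")

# Sleeping adds elapsed time but almost no CPU time.
_, slept_elapsed, slept_cpu = measure(time.sleep, 0.2)
print(f"sleep:     elapsed {slept_elapsed:.4f} s, CPU {slept_cpu:.4f} s")

# os.times() reports user and system CPU time separately.
t = os.times()
print(f"process so far: user {t.user:.4f} s, system {t.system:.4f} s")
```

For the busy loop the two numbers should be close on an unloaded machine; for the sleep call almost all of the elapsed time is spent waiting rather than computing.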
- CPU Performance and its Factors
  - Performance can be improved by reducing the number of clock cycles a program needs or by raising the clock rate
- Instruction Performance
  - CPU clock cycles = Instruction count × Average clock cycles per instruction
  - The average number of clock cycles per instruction is abbreviated CPI
  - Different instructions take different numbers of clock cycles; CPI is the average over all the instructions executed;
  - CPI provides one way of comparing two different implementations of the identical instruction set architecture, since the number of instructions executed for a program will, of course, be the same.
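Since CPI is an average weighted by how often each kind of instruction runs, a short Python sketch may help. The instruction classes A, B, C, their per-class CPIs, and the two instruction mixes below are illustrative assumptions, not measurements.

```python
def total_clock_cycles(counts, cpi_by_class):
    """Sum over instruction classes of (instruction count x CPI of that class)."""
    return sum(counts[c] * cpi_by_class[c] for c in counts)


def average_cpi(counts, cpi_by_class):
    """Total clock cycles divided by total instruction count."""
    return total_clock_cycles(counts, cpi_by_class) / sum(counts.values())


# Hypothetical per-class CPIs and two candidate code sequences.
cpi_by_class = {"A": 1, "B": 2, "C": 3}
sequence_1 = {"A": 2, "B": 1, "C": 2}
sequence_2 = {"A": 4, "B": 1, "C": 1}

for name, counts in (("sequence 1", sequence_1), ("sequence 2", sequence_2)):
    cycles = total_clock_cycles(counts, cpi_by_class)
    print(f"{name}: {sum(counts.values())} instructions, "
          f"{cycles} cycles, CPI {average_cpi(counts, cpi_by_class):.2f}")
```

Here sequence 2 executes more instructions (6 versus 5) yet needs fewer clock cycles (9 versus 10), which is why instruction count alone is not a reliable measure of performance.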
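- The Classic CPU Performance Equation
  - CPU time = Instruction count × CPI × Clock cycle time (equivalently, Instruction count × CPI / Clock rate)
  - Note that IPC = 1/CPI, where IPC is instructions per clock
  - Note that many CPUs today vary their clock rate; for example, the Intel i7 can raise its clock rate by about 10% until the chip gets too hot and then drop back, a technique known as Turbo mode;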
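The CPU time equation can be tried out with a few lines of Python. This is only a sketch: the function name `cpu_time` and the two machines below (same instruction set, same instruction count, different CPI and clock rate) are hypothetical.

```python
def cpu_time(instruction_count, cpi, clock_rate_hz):
    """CPU time in seconds = instruction count x CPI / clock rate."""
    return instruction_count * cpi / clock_rate_hz


# Two hypothetical implementations of the same instruction set, so the
# instruction count for a given program is identical on both.
instruction_count = 1_000_000_000
time_a = cpu_time(instruction_count, cpi=2.0, clock_rate_hz=4e9)  # 250 ps cycle
time_b = cpu_time(instruction_count, cpi=1.2, clock_rate_hz=2e9)  # 500 ps cycle

ipc_a = 1 / 2.0  # IPC is the reciprocal of CPI

print(f"A: {time_a:.3f} s, B: {time_b:.3f} s, "
      f"A is {time_b / time_a:.2f}x faster than B, IPC of A: {ipc_a}")
```

Machine B has the lower CPI, but A's faster clock more than compensates, so neither CPI nor clock rate can be judged in isolation.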
1.7 The Power Wall
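The section opens with a chart of clock rate and power for eight generations of Intel CPUs; note that with the Pentium 4 (Prescott, 2004) the clock rate reached 3.6 GHz and power reached 103 W, after which both clock rate and power came down;
Next it gives the dynamic energy and power of a single transistor (in joules and in watts respectively):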
Energy ∝ 1/2 × Capacitive load × Voltage^2
Power ∝ 1/2 × Capacitive load × Voltage^2 × Frequency switched
Frequency switched is a function of the clock rate;
With regard to the figure, how could clock rates grow by a factor of 1000 while power increased by only a factor of 30?
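Energy and thus power can be reduced by lowering the voltage, and each CPU generation did so, with the voltage dropping by about 15% per generation.
In 20 years, voltages have gone from 5V to 1V, which is why the increase in power is only 30 times. (Power scales with Voltage^2, so the drop from 5 V to 1 V cuts it by (1/5)^2 = 0.04; 0.04 × 1000 = 40, the same order as the observed factor of 30.)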
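The scaling argument can be checked numerically. The sketch below is only a back-of-the-envelope model: it assumes the capacitive load stays unchanged, which is a simplification, and the function name `relative_dynamic_power` is invented for the example. The constant 1/2 cancels because only ratios are compared.

```python
def relative_dynamic_power(capacitance_ratio, voltage_ratio, frequency_ratio):
    """Ratio of new to old dynamic power, from Power ∝ C x V^2 x f."""
    return capacitance_ratio * voltage_ratio ** 2 * frequency_ratio


# Voltage drops from 5 V to 1 V while the clock rate grows 1000x.
# Assumption: capacitive load unchanged (ratio 1.0).
growth = relative_dynamic_power(1.0, 1.0 / 5.0, 1000.0)
print(f"power grows by about {growth:.0f}x")

# A single generation with a 15% lower voltage and everything else fixed.
one_generation = relative_dynamic_power(1.0, 0.85, 1.0)
print(f"a 15% voltage cut alone leaves {one_generation:.4f} of the power")
```

Under these assumptions the model gives a growth of roughly 40 times, the same order of magnitude as the factor of 30 read off the figure.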
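The problem today is that lowering the voltage further makes the transistors too leaky (in server chips, about 40% of the power consumption is due to leakage).
To try to address the power problem, designers have already attached large devices to increase cooling, and they turn off parts of the chip that are not used in a given clock cycle.
More expensive cooling could raise chip power to about 300 W, but such techniques are generally too expensive for PCs and even servers, let alone PMDs;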
Power is a challenge for integrated circuits for 2 reasons:
- Power must be brought in and distributed around the chip; modern chips may need hundreds of pins just for power and ground
- Power is dissipated as heat and must be removed (which calls for more expensive cooling)
1.8 The Sea Change: The Switch from Uniprocessors to Multiprocessors
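Multiprocessors were introduced to get around the power problem, which means programmers today must consider rewriting existing programs to take advantage of multiple processors;
So far few programmers have made the transition, and change is still awaited;
For parallel programming, the challenges include scheduling, load balancing, time for synchronization, and overhead for communication between the parties.
The section then lists what each later chapter contributes to the parallel revolution;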
1.9 Real Stuff: Benchmarking the Intel Core i7
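Each chapter ends with a real example that reviews its content; Chapter 1 closes by describing how benchmark programs are used to evaluate different CPUs;
Most benchmarks today come from SPEC: System Performance Evaluation Cooperative
Have a look if you are interested; the details are not covered here;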
1.10 Fallacies and Pitfalls
Each chapter also ends with a Fallacies and Pitfalls section;
1.11 Concluding Remarks
Summarizes everything covered in the chapter.
1.12 Historical Perspective and Further Reading
online reading