




Computer Architecture
Undergraduate Course
Weimin Wu (吳為民)
School of Computer and Information Technology, Beijing Jiaotong University
Spring 2014

Contents
1. Fundamentals of Computer Architecture
2. Instruction Set
3. Pipelining
4. Memory Hierarchy
5. Input-Output Subsystem
6. Interconnection Networks
7. Parallel Computers

General Course Information
1. The course has 48 class hours (24 sessions): 32 hours of lectures (16 sessions) and 16 hours of lab work (8 sessions).
2. Regular coursework covers attendance, in-class assignments, and lab assignments.
3. There is a final exam. It is open-book, and the exam paper is in English.
4. Grading: regular coursework 40%, final exam 60%.
5. Try to read and understand the original English text. Where it is hard to follow, consult the translated edition of the textbook or Zhang Chenxi's computer architecture textbook. You may also email me: wmwu@
Important: be sure to write your course section number (01 or 03), student ID, and name on every assignment and lab report.

1. Fundamentals of Computer Architecture
1.1 Layers of Computer Systems
1.2 Computer Architecture and Implementation
1.3 The Task of a Computer Designer
1.4 Measuring and Reporting Performance
1.5 Quantitative Principles of Computer Design
1.6 Classification of Computer Architecture

1.1 Layers of Computer Systems
M5  Application Language Machine
M4  High-Level Language Machine
M3  Assembly Language Machine
M2  Operating System Machine
M1  Conventional Machine
M0  Microprogram Machine
* Each layer performs a related subset of functions.
* Each layer relies on the next lower layer to perform more primitive functions. This decomposes the problem into more manageable subproblems.
* Layers M2 through M5 are virtual machines.
* Instructions on the conventional machine (arithmetic, logic, and so on) are implemented by a program at the microprogram level. That program acts as an interpreter that understands a simple set of operations, called the microinstruction set.

1.2 Computer Architecture and Implementation
Computer Architecture
Refers to those attributes of a system that are visible to a programmer, or those attributes that have a direct impact on the logical execution of a program.
Implementation
Two components: organization and hardware.
* Organization: includes the high-level aspects of a computer's design, such as the memory system, the bus structure, and the internal CPU design.
* Hardware: refers to the specifics of a machine, including the detailed logic design and the packaging technology.
Architectural Attributes
* instruction set
* I/O mechanisms
* techniques for addressing memory
* number of bits used to represent various data types (numbers, characters)
Hardware Attributes
* packaging technology
* power
* cooling
Organizational Attributes
Hardware details that are transparent to the programmer, such as:
* control signals
* computer/peripheral interfaces
* memory technology
Architectural Design Issue
Whether a computer will have a multiply instruction.
Organizational Issue
Whether the instruction will be implemented by a special multiply unit or by repeated use of the add unit. The decision may be based on the anticipated frequency of use of the multiply instruction, the relative speed of the two approaches, and the cost and physical size of a special multiply unit.

1.3 The Task of a Computer Designer
The task is a complex one:
* Determine what attributes are important for a new machine.
* Design a machine to maximize performance while staying within cost and power constraints, including:
  * instruction set design
  * functional organization
  * logic design
  * implementation: IC design, packaging, cooling
[Table: functional requirements and the typical features required or supported (table body not preserved)]

Supplementary Material: Milestones in the Development of the Integrated Circuit (IC) Industry
1947: Bardeen, Brattain, and Shockley at Bell Labs invent the transistor, for which they share the 1956 Nobel Prize in Physics. The transistor is the cornerstone of the IC industry.
1952: SONY develops the first transistor-based radio.
1958: Kilby at TI invents the first integrated circuit (IC), for which he receives the 2000 Nobel Prize in Physics. Noyce refines it and makes it practical.
1965: Moore makes his prediction about IC development: Moore's Law.
[Image: Gordon Moore, Intel Co-Founder and Chairman Emeritus. Image source: Intel Corporation]
History has proven Moore's Law correct so far. But will it continue to hold? It faces physical limits and economic limits.
* Transistor density doubles every 18-24 months.
* Performance doubles every 18-24 months.
Example: the lithography process (figure not preserved). As a result, lithographic distortion arises and must be corrected (OPC).
1968: Noyce and Moore found Intel.
1970: Intel develops the 1K DRAM.
1971: Intel develops the 4-bit 4004 microprocessor (2250 transistors).
1976/81: APPLE II / IBM PC.
1984: Xilinx invents the FPGA.
1985: Intel begins to concentrate on microprocessor products.
1987: TSMC is founded. It is the world's largest dedicated chip manufacturing service company.
1991: ARM develops its first embeddable RISC IP core (chipless IC design).
1996: Samsung develops the 1G DRAM.
1998: IBM develops an experimental 1 GHz microprocessor.
1999 or earlier: System-on-Chip (SoC) applications.
2002 or earlier: System-in-Package (SiP) technology.

1.4 Measuring and Reporting Performance
What does "faster" mean?
* The user may say a computer is faster when a program runs in less time.
* The computer center manager may say a computer is faster when it completes more jobs in an hour.
* The computer user is interested in reducing response time (the time between the start and the completion of an event), also referred to as execution time.
* The manager of a data processing center may be interested in increasing throughput (the total amount of work done in a given time).
Comparing design alternatives:
* "X is faster than Y" means that the response time is lower on X than on Y.
* "X is n times faster than Y" means:
$$\frac{\text{Execution time}_Y}{\text{Execution time}_X} = n$$
* Since execution time is the reciprocal of performance:
$$n = \frac{\text{Execution time}_Y}{\text{Execution time}_X} = \frac{1/\text{Performance}_Y}{1/\text{Performance}_X} = \frac{\text{Performance}_X}{\text{Performance}_Y}$$
Measuring Performance
Even execution time can be defined in different ways:
* Wall-clock time, response time, or elapsed time is the latency to complete a task, including disk accesses, memory accesses, input/output activities, and operating system overhead. This is the most straightforward definition.
* With multiprogramming, the CPU works on another program while waiting for I/O and may not necessarily minimize the elapsed time of one program. Hence we need a term that takes this activity into account.
* CPU time means the time the CPU is computing, not including the time spent waiting for I/O or running other programs.
* CPU time can be further divided into:
  * user CPU time: the CPU time spent in the program;
  * system CPU time: the CPU time spent in the operating system performing tasks requested by the program.
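The distinction between elapsed time, user CPU time, and system CPU time can be observed directly. The following is a minimal Python sketch, not part of the original slides. The helper names `measure`, `compute_bound`, and `waiting` are hypothetical, and `time.sleep` is used here only as a stand-in for waiting on I/O.

```python
import os
import time


def measure(task):
    """Run task() once; return elapsed, user CPU, and system CPU time in seconds."""
    cpu_before = os.times()            # process-wide user/system CPU time so far
    wall_before = time.perf_counter()  # high-resolution wall clock
    task()
    wall_after = time.perf_counter()
    cpu_after = os.times()
    return {
        "elapsed": wall_after - wall_before,
        "user_cpu": cpu_after.user - cpu_before.user,
        "system_cpu": cpu_after.system - cpu_before.system,
    }


def compute_bound():
    # Pure computation: the time is spent in the program itself (user CPU time).
    return sum(i * i for i in range(200_000))


def waiting():
    # Sleeping stands in for waiting on I/O: elapsed time grows, CPU time barely moves.
    time.sleep(0.2)


if __name__ == "__main__":
    for task in (compute_bound, waiting):
        t = measure(task)
        print(f"{task.__name__:>13}: elapsed={t['elapsed']:.3f}s "
              f"user={t['user_cpu']:.3f}s system={t['system_cpu']:.3f}s")
```

For the waiting task, the elapsed time should be roughly the sleep duration while both CPU times stay near zero, which is the gap between wall-clock time and CPU time described above. The exact numbers depend on the machine and the operating system's timer resolution.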
Choosing Programs to Evaluate Performance
Four levels of programs are listed below in decreasing order of accuracy of prediction.
1. Real applications
* Examples are compilers for C, text-processing software like Word, and other applications like Photoshop.
* Real applications have input, output, and options that a user can select when running the program.
2. Kernels
* Kernels are small, key pieces extracted from real programs and used to evaluate performance.
* Unlike real programs, no user would run kernel programs, for they exist solely to evaluate performance.
* Kernels are best used to isolate the performance of individual features of a machine, in order to explain the reasons for differences in the performance of real programs.
3. Toy benchmarks
* Toy benchmarks are typically between 10 and 100 lines of code and produce a result the user already knows.
* Programs like Puzzle and Quicksort are popular because they are small, easy to type, and run on almost any computer.
4. Synthetic benchmarks
* Similar in philosophy to kernels, synthetic benchmarks try to match the average frequency of operations and operands of a large set of programs.
* No user runs synthetic benchmarks, because they don't compute anything a user could want.
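To make the idea of a toy benchmark concrete, here is a minimal Python sketch in the spirit of the Quicksort example mentioned above. It is not taken from the slides. The function names, the input size, and the seed are illustrative assumptions. It is short, runs almost anywhere, and produces a result that is known in advance (a sorted list).

```python
import random
import time


def quicksort(values):
    """Return a new sorted list using a simple (not in-place) quicksort."""
    if len(values) <= 1:
        return list(values)
    pivot = values[len(values) // 2]
    smaller = [v for v in values if v < pivot]
    equal = [v for v in values if v == pivot]
    larger = [v for v in values if v > pivot]
    return quicksort(smaller) + equal + quicksort(larger)


def run_toy_benchmark(n=20_000, seed=1):
    """Time quicksort on n pseudo-random integers and check the known result."""
    rng = random.Random(seed)          # fixed seed so every run sorts the same input
    data = [rng.randrange(1_000_000) for _ in range(n)]
    start = time.perf_counter()
    result = quicksort(data)
    elapsed = time.perf_counter() - start
    # The result is known in advance: it must match Python's built-in sort.
    assert result == sorted(data)
    return elapsed


if __name__ == "__main__":
    print(f"quicksort of 20000 integers: {run_toy_benchmark():.4f} s")
```

As the list above suggests, a number obtained this way says little about how a machine will behave on real applications. It mainly shows how small and self-checking such programs are.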
Benchmark Suites
* Benchmark suites put together collections of benchmarks to measure the performance of processors with a variety of applications.
* A key advantage of such suites is that the weakness of any one benchmark is lessened by the presence of the other benchmarks.
* Benchmark suites are made of collections of programs, some of which may be kernels, but many of which are typically real programs.
Reporting Performance Results
* The guiding principle of reporting performance measurements should be reproducibility.
* A report requires a fairly complete description of the machine and the compiler flags, as well as the publication of both the baseline and the optimized results.
* A report contains the actual performance times, shown both in tabular form and as a graph.
Comparing and Summarizing Performance
* Battles are fought over what is the fair way to summarize the relative performance of a collection of programs.
* For example, two articles on summarizing performance in the same journal took opposing points of view.
* Figure 1.5, taken from one of the articles, is an example of the confusion that can arise.
[Figure 1.5: execution times of programs P1 and P2 on machines A, B, and C (table not preserved)]
The following statements hold:
* A is 10 times faster than B for program P1.
* B is 10 times faster than A for program P2.
* A is 20 times faster than C for program P1.
* C is 50 times faster than A for program P2.
* B is 2 times faster than C for program P1.
* C is 5 times faster than B for program P2.
Taken together, these statements leave the relative performance of A, B, and C unclear.
Total Execution Time: A Consistent Summary Measure
The simplest approach is to use the total execution time of P1 and P2:
* B is 9.1 times faster than A.
* C is 25 times faster than A.
* C is 2.75 times faster than B.
This summary tracks execution time, our final measure of performance. If the workload consisted of running programs P1 and P2 an equal number of times, the statements above would predict the relative execution times.
An average of the execution times is the arithmetic mean:
$$\frac{1}{n}\sum_{i=1}^{n}\text{Time}_i$$
where $\text{Time}_i$ is the execution time for the ith program of a total of n in the workload.
Are programs P1 and P2 in fact run equally in the workload? If not, then one approach is to assign a weighting factor $w_i$ to each program to indicate the relative frequency of the program in the workload.
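The total-execution-time comparison above, and the effect of weighting factors, can be checked with a few lines of Python. This is a sketch under stated assumptions. The execution times below are not given in the text. They are chosen to be consistent with every ratio stated above (for example, A 10 times faster than B on P1, and B 9.1 times faster than A in total). The skewed weights are purely hypothetical.

```python
# Execution times in seconds, per machine, for programs P1 and P2.
# ASSUMPTION: illustrative values chosen to match the ratios stated in the text;
# the original figure is not available here.
TIMES = {
    "A": {"P1": 1.0, "P2": 1000.0},
    "B": {"P1": 10.0, "P2": 100.0},
    "C": {"P1": 20.0, "P2": 20.0},
}


def total_time(machine):
    """Total execution time of all programs on one machine."""
    return sum(TIMES[machine].values())


def arithmetic_mean(machine):
    """Unweighted average execution time per program."""
    times = TIMES[machine]
    return sum(times.values()) / len(times)


def weighted_mean(machine, weights):
    """Sum of w_i * Time_i, where the weights w_i sum to 1."""
    return sum(weights[program] * t for program, t in TIMES[machine].items())


def times_faster(x, y):
    """How many times faster machine x is than machine y, by total execution time."""
    return total_time(y) / total_time(x)


if __name__ == "__main__":
    print(f"B is {times_faster('B', 'A'):.1f} times faster than A")
    print(f"C is {times_faster('C', 'A'):.1f} times faster than A")
    print(f"C is {times_faster('C', 'B'):.2f} times faster than B")
    skewed = {"P1": 0.9, "P2": 0.1}    # hypothetical workload dominated by P1
    for machine in TIMES:
        print(machine, weighted_mean(machine, skewed))
```

With equal weights the weighted mean reduces to the arithmetic mean, and C comes out fastest. With the hypothetical P1-heavy weights, B has the lowest weighted time. This shows how strongly the summary depends on the assumed mix.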
Weighted Execution Time
Weighting each program's execution time by its relative frequency in the workload gives the weighted arithmetic mean:
$$\sum_{i=1}^{n}\text{Weight}_i \times \text{Time}_i$$
where $\text{Weight}_i$ is the frequency of the ith program in the workload and $\text{Time}_i$ is the execution time of that program.
Figure 1.6 shows the data from Figure 1.5 with three different weightings, each proportional to the execution time of a workload with a given mix.
[Figure 1.6: weighted arithmetic mean execution times of machines A, B, and C under three weightings (table not preserved)]
Normalized Execution Time and the Pros and Cons of Geometric Means
A second approach to an unequal mixture of programs is to normalize execution times to a reference machine and then take the average of the normalized execution times. The performance of a new program can then be predicted by simply multiplying this number by the program's performance on the reference machine.
Average normalized execution time can be expressed as either an arithmetic or a geometric mean. The formula for the geometric mean is
$$\sqrt[n]{\prod_{i=1}^{n}\text{Execution time ratio}_i}$$
where $\text{Execution time ratio}_i$ is the execution time, normalized to the reference machine, for the ith program of a total of n in the workload.
Geometric means have a nice property for two samples $X_i$ and $Y_i$: the ratio of the geometric means equals the geometric mean of the ratios.
$$\frac{\text{Geometric mean}(X_i)}{\text{Geometric mean}(Y_i)} = \text{Geometric mean}\left(\frac{X_i}{Y_i}\right)$$
In contrast to arithmetic means, geometric means of normalized execution times are consistent no matter which machine is the reference. Hence, the arithmetic mean should not be used to average normalized execution times.
Figure 1.7 shows some variations using both arithmetic and geometric means.
[Figure 1.7: execution times from Figure 1.5 normalized to each machine (table not preserved)]
The arithmetic mean performance varies depending on which machine is the reference:
* In column 2, B's execution time is five times longer than A's, although the reverse is true in column 4.
* In column 3, C is slowest, but in column 9, C is fastest.
The geometric means are independent of normalization:
* A and B have the same performance, and the execution time of C is 0.63 of that of A or B (1/1.58 is 0.63).
* Unfortunately, the total execution time of A is about 10 times longer than that of B, and B in turn is about 3 times longer than C.
* As a point of interest, the relationship between the means of the same set of numbers is always: geometric mean ≤ arithmetic mean.
Advantage: the geometric mean is independent of the running times of the individual programs, and it doesn't matter which machine is used to normalize.
Drawback: geometric means violate our fundamental principle of performance measurement, because they do not predict execution time.
Make the Common Case Fast
Perhaps the most important and pervasive principle of computer design is to make the common case fast. In making a design tradeoff, favor the frequent case over the infrequent case. This principle also applies when determining how to spend resources.
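Returning briefly to normalized execution times from Section 1.4: the claim that geometric means are consistent across reference machines, while arithmetic means are not, can be checked numerically. The sketch below is not from the slides. It assumes illustrative execution times for P1 and P2 that match the ratios stated in Section 1.4, since the original figures are not available here.

```python
import math

# Execution times in seconds for programs (P1, P2) on machines A, B, and C.
# ASSUMPTION: illustrative values consistent with the ratios stated in Section 1.4.
TIMES = {
    "A": (1.0, 1000.0),
    "B": (10.0, 100.0),
    "C": (20.0, 20.0),
}


def normalized(machine, reference):
    """Execution time ratios of `machine`, program by program, relative to `reference`."""
    return [t / r for t, r in zip(TIMES[machine], TIMES[reference])]


def arithmetic_mean(values):
    return sum(values) / len(values)


def geometric_mean(values):
    # n-th root of the product of the n values
    return math.prod(values) ** (1.0 / len(values))


if __name__ == "__main__":
    for reference in TIMES:
        print(f"normalized to {reference}:")
        for machine in TIMES:
            ratios = normalized(machine, reference)
            print(f"  {machine}: arithmetic={arithmetic_mean(ratios):7.2f} "
                  f"geometric={geometric_mean(ratios):5.2f}")
```

Under these assumed times, the arithmetic mean makes B look about five times slower than A when A is the reference, and A about five times slower than B when B is the reference. The geometric means always put A and B level, with C at about 0.63 of either, whichever machine is the reference.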
1.5 Quantitative Principles of Computer Design
Amdahl's Law
* The performance gain obtained by improving some portion of a computer can be calculated using Amdahl's Law.
* Amdahl's Law states that the performance improvement to be gained from using some faster mode of execution is limited by the fraction of the time the faster mode can be used.
* Equivalently, Amdahl's Law defines the speedup that can be gained by using a particular feature. Speedup is the ratio
$$\text{Speedup} = \frac{\text{Performance for entire task using the enhancement when possible}}{\text{Performance for entire task without using the enhancement}} = \frac{\text{Execution time for entire task without using the enhancement}}{\text{Execution time for entire task using the enhancement when possible}}$$
Amdahl's Law gives us a quick way to find the speedup from some enhancement, $\text{Speedup}_{overall}$, which depends on two factors:
1. The fraction of the computation time in the original machine that can be converted to take advantage of the enhancement. This value, $\text{Fraction}_{enhanced}$, is always less than or equal to 1.
2. The improvement gained by the enhanced execution mode. This value, $\text{Speedup}_{enhanced}$, is always greater than 1.
The new execution time is
$$\text{Execution time}_{new} = \text{Execution time}_{old} \times \left((1 - \text{Fraction}_{enhanced}) + \frac{\text{Fraction}_{enhanced}}{\text{Speedup}_{enhanced}}\right)$$
The overall speedup is the ratio of the execution times:
$$\text{Speedup}_{overall} = \frac{\text{Execution time}_{old}}{\text{Execution time}_{new}} = \frac{1}{(1 - \text{Fraction}_{enhanced}) + \frac{\text{Fraction}_{enhanced}}{\text{Speedup}_{enhanced}}}$$
EXAMPLE: Suppose that we are considering an enhancement that runs 10 times faster than the original machine, but is only usable 40% of the time. What is the overall speedup gained by incorporating the enhancement?
ANSWER: Here $\text{Fraction}_{enhanced} = 0.4$ and $\text{Speedup}_{enhanced} = 10$, so
$$\text{Speedup}_{overall} = \frac{1}{0.6 + \frac{0.4}{10}} = \frac{1}{0.64} \approx 1.56$$
Amdahl's Law expresses the law of diminishing returns: the incremental improvement in speedup gained by an additional improvement in just a portion of the computation diminishes as improvements are added.
An important corollary of Amdahl's Law is that if an enhancement is only usable for a fraction of a task, we can't speed up the task by more than the reciprocal of 1 minus that fraction.
EXAMPLE: Implementations of floating-point square root (FPSQR) vary significantly in performance. Suppose FPSQR is responsible for 20% of the execution time of a critical benchmark. One proposal is to add FPSQR hardware that will speed up this operation by a factor of 10. The other alternative is just to try to make all FP instructions run faster; FP instructions are responsible for a total of 50% of the execution time. The design team believes that they can make all FP instructions run two times faster with the same effort as required for the fast square root. Compare these two design alternatives.
ANSWER: Comparing the speedups:
$$\text{Speedup}_{FPSQR} = \frac{1}{(1 - 0.2) + \frac{0.2}{10}} = \frac{1}{0.82} \approx 1.22$$
$$\text{Speedup}_{FP} = \frac{1}{(1 - 0.5) + \frac{0.5}{2.0}} = \frac{1}{0.75} \approx 1.33$$
Improving the performance of the FP operations overall is slightly better because of the higher frequency.
The CPU Performance Equation
* Essentially all computers are constructed using a clock running at a constant rate. These discrete time events are called ticks, clock ticks, clock periods, clocks, cycles, or clock cycles.
* Computer designers refer to the time of a clock period by its duration (e.g., 1 ns) or by its rate (e.g., 1 GHz).
CPU time for a program can then be expressed in two ways:
$$\text{CPU time} = \text{CPU clock cycles for a program} \times \text{Clock cycle time}$$
or
$$\text{CPU time} = \frac{\text{CPU clock cycles for a program}}{\text{Clock rate}}$$
We can also count the number of instructions executed, called the instruction path length or instruction count (IC).
If we know the number of clock cycles and the instruction count, we can calculate the average number of clock cycles per instruction (CPI):
$$\text{CPI} = \frac{\text{CPU clock cycles for a program}}{\text{IC}}$$
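This allows us to use CPI in the execution time formula:
$$\text{CPU time} = \text{IC} \times \text{CPI} \times \text{Clock cycle time}$$
Expanding the first formula into its units of measurement:
$$\frac{\text{Instructions}}{\text{Program}} \times \frac{\text{Clock cycles}}{\text{Instruction}} \times \frac{\text{Seconds}}{\text{Clock cycle}} = \frac{\text{Seconds}}{\text{Program}} = \text{CPU time}$$
or, in terms of the clock rate:
$$\text{CPU time} = \frac{\text{IC} \times \text{CPI}}{\text{Clock rate}}$$
So, CPU performance is dependent upon clock cycle (or rate), CPI, and IC. But it is difficult to change one parameter in isolation from the others, because the basic technologies involved are interdependent:
* Clock cycle time: hardware technology and organization
* CPI: organization and ISA
* Instruction count: ISA and compiler technology
Luckily, many improvement techniques primarily improve one component with small or predictable impacts on the other two.
Sometimes it is useful in designing the CPU to calculate the total number of CPU clock cycles as:
$$\text{CPU clock cycles} = \sum_{i=1}^{n}\text{IC}_i \times \text{CPI}_i$$
where $\text{IC}_i$ represents the number of times instruction i is executed in a program and $\text{CPI}_i$ represents the average number of clock cycles for instruction i. This form can be used to express CPU time as:
$$\text{CPU time} = \left(\sum_{i=1}^{n}\text{IC}_i \times \text{CPI}_i\right) \times \text{Clock cycle time}$$
and CPI as:
$$\text{CPI} = \frac{\sum_{i=1}^{n}\text{IC}_i \times \text{CPI}_i}{\text{IC}} = \sum_{i=1}^{n}\frac{\text{IC}_i}{\text{IC}} \times \text{CPI}_i$$
EXAMPLE: Suppose we have the following measurements:
* Frequency of FP operations = 25%
* Average CPI of FP operations = 4.0
* Average CPI of other instructions = 1.33
* Frequency of FPSQR = 2%
* CPI of FPSQR = 20
Assume that the two design alternatives are to reduce the CPI of FPSQR to 2 or to reduce the average CPI of all FP operations to 2. Compare these two design alternatives using the CPU performance equation.
ANSWER: First, observe that only the CPI changes; the clock rate and instruction count remain identical. The original CPI with neither enhancement is
$$\text{CPI}_{original} = (4 \times 25\%) + (1.33 \times 75\%) \approx 2.0$$
We can compute the CPI for the enhanced FPSQR by subtracting the cycles saved from the original CPI:
$$\text{CPI}_{\text{new FPSQR}} = \text{CPI}_{original} - 2\% \times (20 - 2) = 2.0 - 0.36 = 1.64$$
We compute the CPI for the enhancement of all FP instructions in the same way as the original CPI:
$$\text{CPI}_{\text{new FP}} = (75\% \times 1.33) + (25\% \times 2.0) \approx 1.5$$
Since the CPI of the overall FP enhancement is lower, its performance will be better. Specifically, the speedup for the overall FP enhancement is:
$$\text{Speedup}_{\text{new FP}} = \frac{\text{IC} \times \text{Clock cycle time} \times \text{CPI}_{original}}{\text{IC} \times \text{Clock cycle time} \times \text{CPI}_{\text{new FP}}} = \frac{2.0}{1.5} \approx 1.33$$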
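The FP example can be reproduced in a few lines, and cross-checked against Amdahl's Law from earlier in this section. The following Python sketch is illustrative only. The function names `amdahl_speedup` and `average_cpi` are hypothetical, and the inputs are the measurements quoted in the two examples.

```python
def amdahl_speedup(fraction_enhanced, speedup_enhanced):
    """Overall speedup when `fraction_enhanced` of the time runs `speedup_enhanced` times faster."""
    return 1.0 / ((1.0 - fraction_enhanced) + fraction_enhanced / speedup_enhanced)


def average_cpi(mix):
    """Average CPI from (frequency, cpi) pairs whose frequencies sum to 1."""
    return sum(frequency * cpi for frequency, cpi in mix)


# Measurements from the example: 25% FP operations at CPI 4.0, 75% others at CPI 1.33.
CPI_ORIGINAL = average_cpi([(0.25, 4.0), (0.75, 1.33)])
# Alternative 1: FPSQR (2% of instructions) drops from CPI 20 to CPI 2.
CPI_NEW_FPSQR = CPI_ORIGINAL - 0.02 * (20 - 2)
# Alternative 2: all FP operations drop to an average CPI of 2.0.
CPI_NEW_FP = average_cpi([(0.25, 2.0), (0.75, 1.33)])

if __name__ == "__main__":
    # IC and clock cycle time are unchanged, so the speedup is just the CPI ratio.
    print(f"CPI original     : {CPI_ORIGINAL:.2f}")
    print(f"CPI new FPSQR    : {CPI_NEW_FPSQR:.2f}")
    print(f"CPI new FP       : {CPI_NEW_FP:.2f}")
    print(f"speedup new FPSQR: {CPI_ORIGINAL / CPI_NEW_FPSQR:.2f}")
    print(f"speedup new FP   : {CPI_ORIGINAL / CPI_NEW_FP:.2f}")
    # Amdahl's Law with the execution-time fractions from the earlier example.
    print(f"Amdahl, FPSQR 20% at 10x: {amdahl_speedup(0.2, 10):.2f}")
    print(f"Amdahl, FP 50% at 2x    : {amdahl_speedup(0.5, 2):.2f}")
```

The two routes agree to two decimal places because the measurements describe the same situation. FP operations at 25% frequency and CPI 4.0 account for about half of the original cycles, and FPSQR at 2% frequency and CPI 20 accounts for about 20% of them.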
Measuring the Components of CPU Performance
To use the CPU performance equation, we need measurements of its individual components.
Determining the clock cycle time:
* For an existing CPU, this is easy.
* For a completed design, low-level tools called timing estimators or timing verifiers are used.
* For a design that is not yet completed, the clock cycle time is estimated by examining critical paths.
Measuring the instruction count:
* This requires a compiler together with tools that measure instruction set behavior.
* For a compiled version of a program, there are two major methods to obtain IC.
  * First way: use an instruction set simulator that interprets the instructions. This is slow, but it can measure almost any aspect of instruction set behavior accurately.
  * Second way: use execution-based monitoring. The binary program is modified to include instrumentation code. This is very fast, since the program is executed rather than interpreted.
Measuring the CPI: difficult
* For simple processors, the CPI can be read from a table.
* Modern processors use techniques such as pipelining and memory hierarchies:
  * Designers often use average CPI values, but these average CPIs are computed by measuring the effects of the pipeline and the cache structure.
  * It is often useful to separate the component arising from the memory system from the component determined by the pipeline.
Thus, we can compute the CPI for instruction i as:
$$\text{CPI}_i = \text{Pipeline CPI}_i + \text{Memory system CPI}_i$$
Using the CPU Performance Equations: More Examples
EXAMPLE: We are considering two alternatives for our conditional branch instructions:
* CPU A: A condition code is set by a compare instruction and followed by a branch that tests the condition code.
* CPU B: A compare is included in the branch.
On both CPUs, the conditional branch instruction takes 2 cycles, and all other instructions take 1 clock cycle. On CPU A, 20% of all instructions executed are conditional branches. Since every branch needs a compare, another 20% of the instructions are compares. Because CPU A does not have the compare included in the branch, assume that its clock cycle time is 1.25 times faster than that of CPU B.
Which CPU is faster? What if CPU A's clock cycle time were only 1.1 times faster?
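ANSWER: We can use the CPU performance formula. For CPU A:
$$\text{CPI}_A = 0.20 \times 2 + 0.80 \times 1 = 1.2$$
$$\text{CPU time}_A = \text{IC}_A \times 1.2 \times \text{Clock cycle time}_A$$
Because CPU A's clock is 1.25 times faster:
$$\text{Clock cycle time}_B = 1.25 \times \text{Clock cycle time}_A$$
Compares are not executed in CPU B, so 20%/80% = 25% of its instructions are branches:
$$\text{CPI}_B = 0.25 \times 2 + 0.75 \times 1 = 1.25$$
Because CPU B does not execute compares, $\text{IC}_B = 0.8 \times \text{IC}_A$, so:
$$\text{CPU time}_B = \text{IC}_B \times 1.25 \times \text{Clock cycle time}_B = 0.8 \times \text{IC}_A \times 1.25 \times (1.25 \times \text{Clock cycle time}_A) = 1.25 \times \text{IC}_A \times \text{Clock cycle time}_A > \text{CPU time}_A$$
So in this case CPU A is faster.
If CPU A were only 1.1 times faster, then $\text{Clock cycle time}_B = 1.10 \times \text{Clock cycle time}_A$, and the performance of CPU B is:
$$\text{CPU time}_B = \text{IC}_B \times \text{CPI}_B \times \text{Clock cycle time}_B = 0.8 \times \text{IC}_A \times 1.25 \times (1.10 \times \text{Clock cycle time}_A) = 1.10 \times \text{IC}_A \times \text{Clock cycle time}_A < \text{CPU time}_A$$
So in this case CPU B is faster. In essence, this is a trade-off between clock cycle time and instruction count.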
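The comparison of CPU A and CPU B can be restated as a short calculation. The sketch below is a Python illustration and not part of the original slides. The function names are hypothetical, and CPU times are expressed in units of IC_A × Clock cycle time_A, so only the ratios matter.

```python
def cpu_time(instruction_count, cpi, clock_cycle_time):
    """CPU time = IC x CPI x clock cycle time."""
    return instruction_count * cpi * clock_cycle_time


def compare(clock_advantage_a):
    """Return (CPU time A, CPU time B) when B's cycle is `clock_advantage_a` times A's.

    Times are in units of IC_A x Clock cycle time_A.
    """
    ic_a, cycle_a = 1.0, 1.0
    cpi_a = 0.20 * 2 + 0.80 * 1        # 20% branches at 2 cycles, the rest at 1 cycle
    ic_b = 0.8 * ic_a                  # CPU B executes no separate compare instructions
    cpi_b = 0.25 * 2 + 0.75 * 1        # branches are now 20%/80% = 25% of instructions
    cycle_b = clock_advantage_a * cycle_a
    return cpu_time(ic_a, cpi_a, cycle_a), cpu_time(ic_b, cpi_b, cycle_b)


if __name__ == "__main__":
    for advantage in (1.25, 1.10):
        time_a, time_b = compare(advantage)
        winner = "A" if time_a < time_b else "B"
        print(f"clock advantage {advantage:.2f}: "
              f"A={time_a:.2f}, B={time_b:.2f}, faster: CPU {winner}")
```

Under the assumptions of this example, the two designs break even when CPU A's clock cycle is 1.2 times faster, since 0.8 × 1.25 × 1.2 = 1.2.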