File name: microarchitecture.pdf
  Category: Hardware development
  Development tools:
  File size: 1 MB
  Downloads: 0
  Upload date: 2019-05-23
  Provided by: drji*****
 Detailed description: Contents
   1 Introduction
     1.1 About this manual
     1.2 Microprocessor versions covered by this manual
   2 Out-of-order execution (all processors except P1, PMMX)
     2.1 Instructions are split into uops
     2.2 Register renaming
   3 Branch prediction (all processors)
     3.1 Prediction methods for conditional jumps
     3.2 Branch prediction in P1
     3.3 Branch prediction in PMMX, PPro, P2, and P3
     3.4 Branch prediction in P4 and P4E
     3.5 Branch prediction in PM and Core2
     3.6 Branch prediction in AMD64
     3.7 Indirect jumps (all processors except PM and Core2)
     3.8 Returns (all processors except P1)
     3.9 Static prediction
     3.10 Close jumps
   4 Pentium 1 and Pentium MMX pipeline
     4.1 Pairing integer instructions
     4.2 Address generation interlock
     4.3 Splitting complex instructions into simpler ones
     4.4 Prefixes
     4.5 Scheduling floating point code
   5 Pentium Pro, II and III pipeline
     5.1 The pipeline in PPro, P2 and P3
     5.2 Instruction fetch
     5.3 Instruction decoding
     5.4 Register renaming
     5.5 ROB read
     5.6 Out of order execution
     5.7 Retirement
     5.8 Partial register stalls
     5.9 Partial memory stalls
     5.10 Bottlenecks in PPro, P2, P3
   6 Pentium M pipeline
     6.1 The pipeline in PM
     6.2 The pipeline in Core Solo and Duo
     6.3 Instruction fetch
     6.4 Instruction decoding
     6.5 Loop buffer
     6.6 Micro-op fusion
     6.7 Stack engine
     6.8 Register renaming
     6.9 Register read stalls
     6.10 Execution units
     6.11 Execution units that are connected to both port 0 and 1
     6.12 Retirement
     6.13 Partial register access
     6.14 Partial memory stalls
     6.15 Bottlenecks in PM
   7 Core 2 pipeline
     7.1 Pipeline
     7.2 Instruction fetch and predecoding
     7.3 Instruction decoding
     7.4 Micro-op fusion
     7.5 Macro-op fusion
     7.6 Stack engine
     7.7 Register renaming
     7.8 Register read stalls
     7.9 Execution units
     7.10 Retirement
     7.11 Partial register access
     7.12 Partial memory stalls
     7.13 Cache and memory access
     7.14 Breaking dependence chains
     7.15 Bottlenecks in Core2
   8 Pentium 4 (NetBurst) pipeline
     8.1 Data cache
     8.2 Trace cache
     8.3 Instruction decoding
     8.4 Execution units
     8.5 Do the floating point and MMX units run at half speed?
     8.6 Transfer of data between execution units
     8.7 Retirement
     8.8 Partial registers and partial flags
     8.9 Partial memory access
     8.10 Memory intermediates in dependence chains
     8.11 Breaking dependence chains
     8.12 Choosing the optimal instructions
     8.13 Bottlenecks in P4 and P4E
   9 AMD64 pipeline
     9.1 The pipeline in AMD64
     9.2 Instruction fetch
     9.3 Predecoding and instruction length decoding
     9.4 Single, double and vector path instructions
     9.5 Integer execution pipes
     9.6 Floating point execution pipes
     9.7 Mixing instructions with different latency
     9.8 64 bit versus 128 bit instructions
     9.9 Data delay between differently typed instructions
     9.10 Partial register access
     9.11 Partial flag access
     9.12 Partial memory stalls
     9.13 Loops
     9.14 Cache
     9.15 Bottlenecks in AMD64
   10 Comparison of microarchitectures
     10.1 The AMD kernel
     10.2 The Pentium 4 kernel
     10.3 The Pentium M kernel
     10.4 Intel Core 2 microarchitecture
     10.5 Conclusion
     10.6 Future trends
   11 Literature
(Generated automatically by the system; you can review this description of the contents before downloading.)
