Publications by Type


Conference Papers

  1. "Buddy Compression: Enabling Larger Memory for Deep Learning and HPC Workloads on GPUs." E. Choukse, M. Sullivan, M. O'Connor, M. Erez, J. Pool, D. Nellans, and S.W. Keckler. International Symposium on Computer Architecture (ISCA), May 2020.

  2. "Speculative Reconvergence for Improved SIMT Efficiency." S. Damani, D. Johnson, M. Stephenson, S.W. Keckler, E. Yan, M. McKeown, and O. Giroux. International Symposium on Code Generation and Optimization (CGO), February 2020.

  3. "NVBit: A Dynamic Binary Instrumentation Framework for NVIDIA GPUs." O. Villa, M. Stephenson, D. Nellans, and S.W. Keckler. 52nd International Symposium on Microarchitecture (MICRO), October 2019.

  4. "Simba: Scaling Deep-Learning Inference with Multi-Chip-Module-Based Architecture." Y. Shao, J. Clemons, R. Venkatesan, B. Zimmer, M. Fojtik, N. Jiang, B. Keller, A. Klinefelter, N. Pinckney, P. Raina, S. Tell, Y. Zhang, W. Dally, J. Emer, C. Gray, B. Khailany, and S.W. Keckler. 52nd International Symposium on Microarchitecture (MICRO), October 2019.

  5. "A 0.11 pJ/Op, 0.32-128 TOPS, Scalable Multi-Chip-Module-based Deep Neural Network Accelerator Designed with a High-Productivity VLSI Methodology." R. Venkatesan, S. Shao, B. Zimmer, J. Clemons, M. Fojtik, N. Jiang, B. Keller, A. Klinefelter, N. Pinckney, P. Raina, S. Tell, Y. Zhang, W. Dally, J. Emer, T. Gray, S.W. Keckler, and B. Khailany. HotChips 31, August 2019.

  6. "MAGNet: A Modular Accelerator Generator for Neural Networks." R. Venkatesan, Y.S. Shao, M. Wang, J. Clemons, S. Dai, M. Fojtik, B. Keller, A. Klinefelter, N. Pinckney, Y. Zhang, B. Zimmer, W.J. Dally, J. Emer, S.W. Keckler, and B. Khailany. International Conference on Computer Aided Design (ICCAD), November 2019.

  7. "GPU Snapshot: Checkpoint Offloading for GPU-Dense Systems." K. Lee, M. Sullivan, S.K.S. Hari, T. Tsai, S.W. Keckler, and M. Erez. International Conference on Supercomputing (ICS), June 2019.

  8. "A 0.11 pJ/Op, 0.32-128 TOPS, Scalable Multi-Chip-Module-based Deep Neural Network Accelerator with Ground-Reference Signaling in 16nm." B. Zimmer, R. Venkatesan, S. Shao, J. Clemons, M. Fojtik, N. Jiang, B. Keller, A. Klinefelter, N. Pinckney, P. Raina, S. Tell, Y. Zhang, W. Dally, J. Emer, T. Gray, S. Keckler, and B. Khailany, Symposia on VLSI Technology and Circuits (VLSI), June 2019.

  9. "SNAP: A 1.67 - 21.55 TOPS/W Sparse Neural Acceleration Processor for Unstructured Sparse Deep Neural Network Inference." J. Zhang (Michigan), C. Lee (Michigan), C. Liu (Michigan), Y.S. Shao, S.W. Keckler, and Z. Zhang (Michigan), Symposia on VLSI Technology and Circuits (VLSI), June 2019.

  10. "ML-based Fault Injection for Autonomous Vehicles: A Case for Bayesian Fault Injection." S. Jha, S.S. Banerjee, T. Tsai, S.K.S. Hari, M. Sullivan, Z. Kalbarczyk, S.W. Keckler, and R.K. Iyer, International Conference on Dependable Systems and Networks (DSN), June 2019.

  11. "Timeloop: A Systematic Approach to DNN Accelerator Evaluation," A. Parashar, P. Raina, Y.S. Shao, Y. Chen, V.A. Ying, A. Mukkara, R. Venkatesan, B. Khailany, S.W. Keckler, and J. Emer, International Symposium on Performance Analysis of Systems and Software (ISPASS), March 2019.

  12. "Buffets: An Efficient and Composable Storage Idiom for Explicit Decoupled Data Orchestration," M. Pellauer, Y.S. Shao, J. Clemons, N. Crago, K. Hegde, R. Venkatesan, S.W. Keckler, C.W. Fletcher, and J. Emer, 24th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), April 2019.

  13. "Optimizing Software-Directed Instruction Replication for GPU Error Detection," A. Mahmoud, S.K.S. Hari, M.B. Sullivan, T. Tsai, and S.W. Keckler, International Conference for High Performance Computing and Communications (SC), November 2018.

  14. "SwapCodes: Error Codes for Hardware-Software Cooperative GPU Pipeline Error Detection," M.B. Sullivan, S.K.S. Hari, B. Zimmer, T. Tsai, and S.W. Keckler, International Symposium on Microarchitecture (MICRO), October 2018.

  15. "Stitch-X: An Accelerator Architecture for Exploiting Unstructured Sparsity in Deep Neural Networks," C. Lee, Y.S. Shao, J. Zhang, A. Parashar, J. Emer, S.W. Keckler, and Z. Zhang. SysML, February 2018.

  16. "Compressing DMA Engine: Leveraging Activation Sparsity for Training Deep Neural Networks," M. Rhu, M. O'Connor, N. Chatterjee, J. Pool, Y. Kwon, and S.W. Keckler. International Symposium On High Performance Computer Architecture (HPCA), February 2018. PDF

  17. "Understanding Error Propagation in Deep Learning Neural Network (DNN) Accelerators and Applications," G. Li, S.K.S. Hari, M. Sullivan, T. Tsai, K. Pattabiraman, J. Emer, and S.W. Keckler. International Conference for High Performance Computing and Communications (SC), November 2017. PDF

  18. "Fine-Grained DRAM: Energy Efficient DRAM for Extreme Bandwidth Systems," M. O'Connor, N. Chatterjee, D. Lee, J. Wilson, A. Agrawal, S.W. Keckler, and W.J. Dally, International Symposium on Microarchitecture (MICRO), October 2017. PDF

  19. "SCNN: An Accelerator for Compressed-sparse Convolutional Neural Networks," A. Parashar, M. Rhu, A. Mukkara, A. Puglielli, R. Venkatesan, B. Khailany, J. Emer, S.W. Keckler, and W.J. Dally. International Symposium on Computer Architecture (ISCA), June 2017. PDF

  20. "SASSIFI: An Architecture-level Fault Injection Tool for GPU Application Resilience Evaluation," S.K.S. Hari, T. Tsai, M. Stephenson, S.W. Keckler, and J. Emer. International Symposium on Performance Analysis of Systems and Software (ISPASS), April 2017. PDF

  21. "Architecting an Energy-Efficient DRAM System for GPUs," N. Chatterjee, M. O'Connor, D. Lee, D. Johnson, S.W. Keckler, M. Rhu, and W.J. Dally. International Symposium On High Performance Computer Architecture (HPCA) - industry session, February 2017. PDF

  22. "Virtualizing Deep Neural Networks for Scalable, Memory-Efficient Neural Network Design," M. Rhu, N. Gimelshein, J. Clemons, A. Zulfiqar, and S.W. Keckler. International Symposium on Microarchitecture (MICRO), October 2016. PDF

  23. "A Patch Memory System For Image Processing and Computer Vision," J. Clemons, C. Cheng, I. Frosio, D. Johnson, and S.W. Keckler, International Symposium on Microarchitecture (MICRO), October 2016. PDF

  24. "CLARA: Circular Linked-List Auto and Self Refresh Architecture," A. Agrawal, M. O'Connor, N. Chatterjee, E. Bolotin, J. Emer, and S.W. Keckler. International Symposium on Memory Systems (MEMSYS), October 2016. PDF

  25. "Transparent Offloading and Mapping (TOM): Enabling Programmer-Transparent Near-Data Processing in GPU Systems," K. Hsieh, E. Ebrahimi, G. Kim, N. Chatterjee, M. O'Connor, N. Vijaykumar, O. Mutlu, and S.W. Keckler. International Symposium on Computer Architecture (ISCA), June 2016. PDF

  26. "A Real-time Energy-efficient Superpixel Hardware Accelerator for Mobile Computer Vision Applications," I. Hong, J. Clemons, I. Frosio, R. Venkatesan, B. Khailany, and S.W. Keckler. Design Automation Conference (DAC), June 2016. PDF

  27. "Towards High Performance Paged Memory for GPUs," T. Zheng, D. Nellans, A. Zulfiqar, M. Stephenson, and S.W. Keckler. 22nd International Symposium On High Performance Computer Architecture (HPCA), March 2016. PDF

  28. "A Case for Toggle-Aware Compression for GPU Systems," G. Pekhimenko, E. Bolotin, N. Vijaykumar, O. Mutlu, T.C. Mowry, and S.W. Keckler. 22nd International Symposium On High Performance Computer Architecture (HPCA), March 2016. PDF

  29. "Selective GPU Caches to Eliminate CPU-GPU HW Cache Coherence," N. Agarwal, D. Nellans, E. Ebrahimi, T.F. Wenisch, J. Danskin, and S.W. Keckler. 22nd International Symposium On High Performance Computer Architecture (HPCA) - industry session, March 2016. PDF

  30. "GPU Computing Pipeline Inefficiencies and Optimization Opportunities in Heterogeneous CPU-GPU Processors," J. Hestness, S.W. Keckler, and D.A. Wood. International Symposium on Workload Characterization (IISWC), October 2015. PDF

  31. "Anatomy of GPU Memory System for Multi-Application Execution," A. Jog, O. Kayiran, T. Kesten, A. Pattnaik, E. Bolotin, N. Chatterjee, S.W. Keckler, M.T. Kandemir, and C.R. Das. International Symposium on Memory Systems (MEMSYS), pp. 223-234, October 2015. PDF

  32. "A Variable Warp Size Architecture," T.G. Rogers, D.R. Johnson, M. O'Connor, and S.W. Keckler. 42nd International Symposium on Computer Architecture (ISCA), pp. 489-501, June 2015. PDF

  33. "Flexible Software Profiling of GPU Architectures," M. Stephenson, S.K.S. Hari, M. O'Connor, Y. Lee, D. Nellans, E. Ebrahimi, D.R. Johnson, and S.W. Keckler. 42nd International Symposium on Computer Architecture (ISCA), pp. 185-197, June 2015. PDF

  34. "Page Placement Strategies for GPUs within Heterogeneous Memory Systems," N. Agarwal, D. Nellans, M. Stephenson, M. O'Connor, and S.W. Keckler. 20th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), pp. 608-617, March, 2015. PDF

  35. "Unlocking Bandwidth for GPUs in CC-NUMA Systems," N. Agarwal, D. Nellans, M. O'Connor, S.W. Keckler and T.F. Wenisch. 21st International Symposium on High Performance Computer Architecture (HPCA) - industry session, pp. 354-365, February 2015. PDF

  36. "Priority-Based Cache Allocation for Throughput Processors," D. Li, M. Rhu, D.R. Johnson, M. O'Connor, M. Erez, D. Burger, D.S. Fussell, and S.W. Keckler. 21st International Symposium on High Performance Computer Architecture (HPCA), pp. 89-100, February 2015. PDF

  37. "Arbitrary Modulus Indexing," J. Diamond, D. Fussell, and S.W. Keckler. 47th International Symposium on Microarchitecture (MICRO), December 2014. PDF

  38. "Exploring the Design Space of SPMD Divergence Management on Data-Parallel Architectures," Y. Lee, M. Stephenson, V. Grover, R. Krashinsky, S.W. Keckler, and K. Asanovic. 47th International Symposium on Microarchitecture (MICRO), December 2014. PDF

  39. "Scaling the Power Wall: A Path to Exascale," O. Villa, D.R. Johnson, M. O'Connor, E. Bolotin, D. Nellans, J. Luitjens, N. Sakharnykh, P. Wang, P. Micikevicius, A. Scudiero, S.W. Keckler, and W.J. Dally. International Conference for High Performance Computing and Communications (SC), November 2014. PDF

  40. "A Comparative Analysis of Microarchitecture Effects on CPU and GPU Memory System Behavior," J. Hestness, S.W. Keckler, and D. Wood, IEEE International Symposium on Workload Characterization (IISWC), October 2014. PDF

  41. "Author Retrospective for A NUCA Substrate for Flexible CMP Cache Sharing," J. Huh, C. Kim, H. Shafi, L. Zhang, D. Burger, and S.W. Keckler. 28th International Conference on Supercomputing (ICS), June 2014. Invited paper. PDF

  42. "21st Century Digital Design Tools," W.J. Dally, C. Malachowsky, and S.W. Keckler, Design Automation Conference (DAC), June 2013. Invited paper. PDF

  43. "How to Implement Effective Prediction and Forwarding for Fusable Dynamic Multicore Architectures," B. Robatmili, D. Li, H. Esmaeilzadeh, M. Govindan, A. Smith, A. Putnam, D. Burger, and S.W. Keckler, 19th International Symposium on High-Performance Computer Architecture (HPCA), February 2013. PDF

  44. "Convergence and Scalarization for Data-Parallel Architectures," Y. Lee, R. Krashinsky, V. Grover, S.W. Keckler, and K. Asanovic, International Symposium on Code Generation and Optimization (CGO), February 2013. PDF

  45. "Unifying Primary Cache, Scratch, and Register File Memories in a Throughput Processor," M. Gebhart, S.W. Keckler, B. Khailany, R. Krashinsky, and W.J. Dally. 45th International Symposium on Microarchitecture (MICRO), pp. 96-106, December 2012. PDF

  46. "A Compile-Time Managed Multi-Level Register File Hierarchy," M. Gebhart, S.W. Keckler, and W.J. Dally, 44th International Symposium on Microarchitecture (MICRO), pp. 465-476, December 2011. PDF

  47. "Energy-efficient Mechanisms for Managing Thread Context in Throughput Processors," M. Gebhart, D.R. Johnson, D. Tarjan, S.W. Keckler, W.J. Dally, E. Lindholm, and K. Skadron, 38th International Symposium on Computer Architecture (ISCA), pp. 235-246, June 2011. PDF

  48. "Kilo-NOC: A Network-on-Chip Architecture for Scalability and Service Guarantees in Highly-Integrated CMPs," B. Grot, J. Hestness, S.W. Keckler, and O. Mutlu. 38th International Symposium on Computer Architecture (ISCA), pp. 401-412, June 2011. PDF

  49. "Evaluation and Optimization of Multicore Performance Bottlenecks in Supercomputing Applications," J. Diamond, M. Burtscher, J.D. McCalpin, B.D. Kim, S.W. Keckler, and J.C. Browne. IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), pp. 32-43, April 2011. Best Paper Award PDF

  50. "Exploiting Criticality to Reduce Bottlenecks in Distributed Uniprocessors," B. Robatmili, M.S. Govindan, D. Burger and S.W. Keckler, 17th International Symposium on High-Performance Computer Architecture (HPCA), February 2011. PDF

  51. "Preemptive Virtual Clock: A Flexible, Efficient and Cost-effective QOS Scheme for Networks-on-a-Chip," B. Grot, S.W. Keckler, and O. Mutlu, 42nd International Symposium on Microarchitecture (MICRO), pp. 268-279, December 2009. PDF

  52. "End-to-End Validation of Architectural Power Models," M.S. Govindan, S.W. Keckler, and D. Burger, International Symposium on Low Power Electronics and Design (ISLPED), pp. 383-388, August 2009. PDF

  53. "Analysis of the TRIPS Prototype Block Predictor," N. Ranganathan, D. Burger, and S.W. Keckler, International Symposium on Performance Analysis of Systems and Software (ISPASS), pp. 195-206, April 2009. PDF

  54. "An Evaluation of the TRIPS Computer System," M. Gebhart, B.A. Maher, K.E. Coons, J. Diamond, P. Gratz, M. Marino, N. Ranganathan, B. Robatmili, A. Smith, J. Burrill, S.W. Keckler, D. Burger, and K.S. McKinley, 14th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), pp. 1-12, March 2009. Best Paper Award PDF

  55. "Express Cube Topologies for On-Chip Interconnects," B. Grot, J. Hestness, S.W. Keckler, and O. Mutlu, 15th International Symposium on High Performance Computer Architecture (HPCA), pp. 163-174, February 2009. PDF

  56. "Multitasking Workload Scheduling on Flexible-Core Chip Multiprocessors," D.P. Gulati, C. Kim, S. Sethumadhavan, S.W. Keckler, and D. Burger, 17th International Conference on Parallel Architectures and Compilation Techniques (PACT), pp. 187-196, October 2008. PDF

  57. "Counting Dependence Predictors," F. Roesner, D. Burger, and S.W. Keckler, International Symposium on Computer Architecture (ISCA), pp. 215-226, June, 2008. PDF

  58. "Regional Congestion Awareness for Load Balance in Networks-on-Chip," P. Gratz, B. Grot, and S.W. Keckler, International Symposium on High Performance Computer Architecture (HPCA), pp. 203-214, February, 2008. PDF

  59. "High Performance Linear Algebra on a Spatially Distributed Processor," J. Diamond, B. Robatmili, S.W. Keckler, R. van de Geijn, K. Goto, and D. Burger, ACM SIGPLAN Symposium on Principles and Practices of Parallel Programming (PPoPP), pp. 63-72, February, 2008. PDF

  60. "Composable Lightweight Processors," C. Kim, S. Sethumadhavan, M.S. Govindan, D. Gulati, H. Liu, N. Ranganathan, D. Burger, and S.W. Keckler, 40th International Symposium on Microarchitecture (MICRO), pp. 381-393, December, 2007. PDF

  61. "TRIPS: A Distributed Explicit Data Graph Execution (EDGE) Microprocessor," M.S. Govindan, K. Sankaralingam, R. Nagarajan, R. McDonald, R. Desikan, S. Drolia, P. Gratz, D. Gulati, H. Hanson, C.K. Kim, H. Liu, N. Ranganathan, S. Sethumadhavan, S. Sharif, P. Shivakumar, S.W. Keckler, and D. Burger, HotChips 19, August, 2007.

  62. "Thermal Response to DVFS: Analysis with an Intel Pentium M," H. Hanson, S.W. Keckler, K. Rajamani, J. Rubio, S. Ghiasi, and F. Rawson, International Symposium on Low Power Electronics and Design (ISLPED), pp. 219-224, August, 2007. PDF

  63. "Reconciling Performance and Programmability in Networking Systems," J. Mudigonda, H. Vin, and S.W. Keckler, ACM SIGCOMM, pp. 73-84, August, 2007. PDF

  64. "Late-Binding: Enabling Unordered Load-Store Queues," S. Sethumadhavan, F. Roesner, D. Burger, S.W. Keckler, and J. Emer, 34th International Symposium on Computer Architecture (ISCA), pp. 347-357, June 2007. PDF

  65. "Implementation and Evaluation of a Dynamically Routed Processor Operand Network," P. Gratz, K. Sankaralingam, H. Hanson, P. Shivakumar, R. McDonald, S.W. Keckler, and D. Burger, 1st International Symposium on Networks-on-Chips (NOC), pp. 7-17, May, 2007. PDF

  66. "Distributed Microarchitectural Protocols in the TRIPS Prototype Processor," K. Sankaralingam, R. Nagarajan, R. McDonald, R. Desikan, S. Drolia, S. Govindan, P. Gratz, D. Gulati, H. Hanson, C.K. Kim, H. Liu, N. Ranganathan, S. Sethumadhavan, S. Sharif, P. Shivakumar, S.W. Keckler, and D. Burger, International Symposium on Microarchitecture (MICRO), pp. 480-491, December, 2006. PDF

  67. "Dataflow Predication," A. Smith, R. Nagarajan, R. McDonald, D. Burger, S.W. Keckler, and K. McKinley, International Symposium on Microarchitecture (MICRO), pp. 89-102, December, 2006. PDF

  68. "Implementation and Evaluation of On-Chip Network Architectures," P. Gratz, C.K. Kim, R. McDonald, S.W. Keckler, and D. Burger, International Conference on Computer Design (ICCD), pp. 477-484, October, 2006. PDF

  69. "Design and Implementation of the TRIPS Primary Memory System," S. Sethumadhavan, R. McDonald, R. Desikan, D. Burger, and S.W. Keckler, International Conference on Computer Design (ICCD), pp. 470-476, October, 2006. PDF

  70. "Decomposing Memory Performance: Data Structures and Phases," K. Agaram, S.W. Keckler, C. Lin, and K.S. McKinley, International Symposium on Memory Management (ISMM), pp. 95-103, June, 2006. PDF

  71. "Critical Path Analysis of the TRIPS Microprocessor," R. Nagarajan, X. Chen, R. McDonald, S.W. Keckler, and D.C. Burger, International Symposium on Performance Analysis of Systems and Software (ISPASS), pp. 37-47, April, 2006. PDF

  72. "The Design and Implementation of the TRIPS Prototype Chip," R. McDonald, D. Burger, and S.W. Keckler, Hot-Chips 17, August, 2005.

  73. "A NUCA Substrate for Flexible CMP Cache Sharing," J. Huh, C. Kim, H. Shafi, L. Zhang, D. Burger, and S.W. Keckler, International Conference on Supercomputing (ICS), pp. 31-40, June, 2005. PDF

  74. "Breaking the GOP/Watt Barrier with EDGE Architectures," D. Burger and S.W. Keckler, Government Microcircuit Applications and Critical Technology Conference (GOMACtech), April, 2005. PDF

  75. "Scalable Selective Re-execution for EDGE Architectures," R. Desikan, L. Sethumadhavan, D. Burger, and S.W. Keckler, International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), October, 2004. PDF

  76. "Static Placement, Dynamic Issue (SPDI) Scheduling for EDGE Architectures," R. Nagarajan, S.K. Kushwaha, D. Burger, K. McKinley, C. Lin, and S.W. Keckler, International Conference on Parallel Architectures and Compilation Techniques (PACT), September, 2004. PDF

  77. "Universal Mechanisms for Data-Parallel Architectures," K. Sankaralingam, S.W. Keckler, W. Mark, and D.C. Burger, 2003 International Symposium on Microarchitecture (MICRO), pp. 303-314, December, 2003. PDF

  78. "Scalable Hardware Memory Disambiguation for High-ILP Processors," L. Sethumadhavan, R. Desikan, D.C. Burger, C.R. Moore, and S.W. Keckler, International Symposium on Microarchitecture (MICRO), pp. 399-410, December, 2003. PDF

  79. "Routed Inter-ALU Networks for ILP Scalability and Performance," K. Sankaralingam, V.A. Singh, S.W. Keckler, and D. Burger, International Conference on Computer Design (ICCD), pp. 170-177, October, 2003. PDF

  80. "Exploiting Microarchitectural Redundancy For Defect Tolerance," P. Shivakumar, S.W. Keckler, C.R. Moore, and D. Burger, International Conference on Computer Design (ICCD), pp. 481-488, October, 2003. PDF

  81. "Microprocessor Pipeline Energy Analysis," R. Natarajan, H. Hanson, S.W. Keckler, C.R. Moore, and D. Burger, IEEE International Symposium on Low Power Electronics and Design (ISLPED), pp. 282-287, August, 2003. PDF

  82. "Exploiting ILP, DLP, and TLP Using Polymorphism in the TRIPS Architecture," K. Sankaralingam, R. Nagarajan, H. Liu, J. Huh, C.K. Kim, D. Burger, S.W. Keckler, and C.R. Moore, 30th Annual International Symposium on Computer Architecture (ISCA), pp. 422-433, June 2003. PDF

  83. "A Wire-Delay Scalable Microprocessor Architecture for High Performance Systems," S.W. Keckler, D. Burger, C.R. Moore, R. Nagarajan, K. Sankaralingam, V. Agarwal, M.S. Hrishikesh, N. Ranganathan, and P. Shivakumar. International Solid-State Circuits Conference (ISSCC), pp. 1068-1069, February, 2003. PDF

  84. "An Adaptive, Non-Uniform Cache Structure for Wire-Delay Dominated On-Chip Caches," C. Kim, D. Burger, and S.W. Keckler. 10th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), pp. 211-222, October, 2002. PDF

  85. "Modeling the Effect of Technology Trends on Soft Error Rate of Combinational Logic," P. Shivakumar, M. Kistler, S.W. Keckler, D. Burger, and L. Alvisi. 2002 International Conference on Dependable Systems and Networks (DSN), pp. 389-398, June, 2002. PDF

  86. "The Optimal Logic Depth Per Pipeline Stage is 6 to 8 FO4 Inverter Delays," M.S. Hrishikesh, N.P. Jouppi, K.I. Farkas, D. Burger, S.W. Keckler, and P. Shivakumar. 29th International Symposium on Computer Architecture (ISCA), pp. 14-24, May, 2002. PDF

  87. "A Design Space Evaluation of Grid Processor Architectures," R. Nagarajan, K. Sankaralingam, D. Burger, and S.W. Keckler. 34th Annual International Symposium on Microarchitecture (MICRO), pp. 40-51, December, 2001. PDF

  88. "Static Energy Reduction Techniques for Microprocessor Caches," H. Hanson, M.S. Hrishikesh, V. Agarwal, S.W. Keckler, and D. Burger. International Conference on Computer Design (ICCD), pp. 276-283, September, 2001. PDF

  89. "Exploring the Design Space of Future CMPs," J. Huh, D. Burger, and S.W. Keckler. International Symposium on Parallel Architectures and Compilation Techniques (PACT), pp. 199-210, September, 2001. PDF

  90. "Measuring Experimental Error in Microprocessor Simulation," R. Desikan, D. Burger, and S.W. Keckler. 28th International Symposium on Computer Architecture (ISCA), pp. 266-277, June 2001. PDF

  91. "The Impact of Delay on the Design of Branch Predictors," D.A. Jiménez, S.W. Keckler, and C. Lin. 33rd International Symposium on Microarchitecture (MICRO), pp. 67-76, December 2000. PDF

  92. "Processor Mechanisms for Software Shared Memory," N.P. Carter, W.J. Dally, W.S. Lee, S.W. Keckler, and A. Chang. International Symposium on High Performance Computing, October 2000. PDF

  93. "Clock Rate Versus IPC: The End of the Road for Conventional Microarchitectures," V. Agarwal, M.S. Hrishikesh, S.W. Keckler, and D. Burger. 27th International Symposium on Computer Architecture (ISCA), pp. 248-259, June, 2000. PDF

  94. "The Effects of Explicitly Parallel Mechanisms on the Multi-ALU Processor Cluster Pipeline," A. Chang, W.J. Dally, S.W. Keckler, N.P. Carter, and W.S. Lee. International Conference on Computer Design (ICCD), pp. 474-481, October 1998. PDF

  95. "Exploiting Fine-Grain Thread Level Parallelism on the MIT Multi-ALU Processor," S.W. Keckler, W.J. Dally, D. Maskit, N.P. Carter, A. Chang, and W.S. Lee. 25th Annual International Symposium on Computer Architecture (ISCA), pp. 306-317, July 1998. PDF

  96. "The MIT Multi-ALU Processor," S.W. Keckler, W.J. Dally, A. Chang, N.P. Carter, and W.S. Lee. Proceedings of Hot Chips IX, pp. 1-8, August 1997. PDF

  97. "The M-Machine Multicomputer," M. Fillo, S.W. Keckler, W.J. Dally, N.P. Carter, A. Chang, Y. Gurevich, and W.S. Lee. Proceedings of the 28th Annual International Symposium on Microarchitecture (MICRO), pp. 146-156, December 1995. PDF

  98. "Hardware Support for Fast Capability-based Addressing," N.P. Carter, S.W. Keckler, and W.J. Dally. International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), pp. 319-327, October 1994. PDF

  99. "Processor Coupling: Integrating Compile Time and Runtime Scheduling for Parallelism," S.W. Keckler, and W.J. Dally. International Symposium on Computer Architecture (ISCA), pp. 202-213, May 1992. PDF

Books

  1. Multicore Processors and Systems. S.W. Keckler, K. Olukotun, and H.P. Hofstee eds. Springer, New York, NY 2009. LINK

  2. The New Global Ecosystem in Advanced Computing: Implications for U.S. Competitiveness and National Security, Committee on Global Approaches to Advanced Computing; Board on Global Science and Technology; Policy and Global Affairs; National Research Council, D. A. Reed, C. Cao, T.M. Cheung, J. Crawford, D. Ernst, M.D. Hill, S.W. Keckler, D. Liddle, and K.S. McKinley, The National Academies Press, Washington, DC 2012. LINK

Journal Articles and Book Chapters

  1. "A 0.32-128 TOPS, Scalable Multi-Chip-Module-based Deep Neural Network Inference Accelerator with Ground-Referenced Signaling in 16nm." B. Zimmer, R. Venkatesan, Y.S. Shao, J. Clemons, M. Fojtik, N. Jiang, B. Keller, A. Klinefelter, N. Pinckney, P. Raina, S.G. Tell, Y. Zhang, W.J. Dally, J. Emer, C.T. Gray, S.W. Keckler, and B. Khailany. IEEE Journal of Solid-State Circuits (JSSC), 55(4), April 2020.

  2. "Exposing Memory Access Patterns to Improve Instruction and Memory Efficiency in GPUs," N.C. Crago, M. Stephenson, and S.W. Keckler, ACM Transactions on Architecture and Code Optimization (TACO), 15(4), December, 2018.

  3. "Software-Directed Techniques for Improved GPU Register File Utilization," D. Voitsechov, A. Zulfiqar, M. Stephenson, M. Gebhart, and S.W. Keckler, ACM Transactions on Architecture and Code Optimization (TACO), 14(3), October, 2018.

  4. "Designing Efficient Heterogeneous Memory Architectures," E. Bolotin, D. Nellans, O. Villa, M. O'Connor, A. Ramirez, and S.W. Keckler. IEEE Micro, 35(4), pp. 60-68, July 2015.

  5. "Toggle-aware Bandwidth Compression for GPUs," G. Pekhimenko, E. Bolotin, M. O'Connor, O. Mutlu, T.C. Mowry, and S.W. Keckler. IEEE Computer Architecture Letters, 2015. PDF

  6. "Scaling Power and Performance via Processor Composability," M.S. Govindan, B. Robatmili, D. Li, B. Maher, A. Smith, S.W. Keckler, and D. Burger. IEEE Transactions on Computers, 63(8), pp. 2025-2038, August 2014.

  7. "A Hierarchical Thread Scheduler and Register File for Energy-efficient Throughput Processors," M. Gebhart, D.R. Johnson, D. Tarjan, S.W. Keckler, W.J. Dally, E. Lindholm, and K. Skadron, ACM Transactions on Computer Systems, 30(2), pp. 8:1-8:38, April 2012.

  8. "Kilo-NOC: A Network-on-Chip Architecture for Scalability and Service Guarantees in Highly-Integrated CMPs," B. Grot, J. Hestness, S.W. Keckler, and O. Mutlu, IEEE Micro, "Annual Top Picks from Microarchitecture Conference issue," 32(3), pp. 17-25, May/June 2012.

  9. "GPUs and the Future of Parallel Computing," S.W. Keckler, W.J. Dally, B. Khailany, M. Garland, and D. Glasco, IEEE Micro, September/October, 2011.

  10. "The TRIPS OPN: A Processor Integrated NoC for Operand Bypass," P. Gratz and S.W. Keckler, Designing Network On-Chip Architectures in the Nanoscale Era, J. Flich and D. Bertozzi eds. Taylor & Francis, 2010.

  11. "On-Chip Networks for Multicore Systems," L.S. Peh, S.W. Keckler, and S. Vangal, in Multicore Processors and Systems, S.W. Keckler, K. Olukotun, and H.P. Hofstee eds. Springer, New York, NY 2009.

  12. "Composable Multicore Chips," D. Burger, S.W. Keckler, and S. Sethumadhavan, in Multicore Processors and Systems, S.W. Keckler, K. Olukotun, and H.P. Hofstee eds. Springer, New York, NY 2009.

  13. "On-Chip Interconnection Networks of the TRIPS Processor," P. Gratz, C. Kim, K. Sankaralingam, H. Hanson, P. Shivakumar, S.W. Keckler, and D. Burger, IEEE Micro, 27(5), pp. 41-50, September/October, 2007.

  14. "Research Challenges for On-Chip Interconnection Networks," J.D. Owens, W.J. Dally, R. Ho, D.N. Jayasimha, S.W. Keckler, and L. Peh, IEEE Micro, 27(5), pp. 96-108, September/October, 2007.

  15. "A NUCA Substrate for Flexible CMP Cache Sharing," J. Huh, C. Kim, H. Shafi, L. Zhang, D. Burger, and S.W. Keckler, IEEE Transactions on Parallel and Distributed Systems, 18(8), pp. 1028-1040, August, 2007.

  16. "Architecture and Implementation of the TRIPS Processor," S.W. Keckler, D. Burger, K. Sankaralingam, R. Nagarajan, R. McDonald, R. Desikan, S. Drolia, M.S. Govindan, P. Gratz, D. Gulati, H. Hanson, C. Kim, H. Liu, N. Ranganathan, S. Sethumadhavan, S. Sharif, and P. Shivakumar, in Unique Chips and Systems, edited by E. John and J. Rubio, CRC Press, Boca Raton, FL, 2007.

  17. "Scalable Hardware Memory Disambiguation for High-ILP Processors," L. Sethumadhavan, R. Desikan, D.C. Burger, C.R. Moore, and S.W. Keckler, IEEE Micro, "Annual Top Picks from Microarchitecture Conference issue," 24(6), pp. 118-127, November/December, 2004.

  18. "Scaling to the End of Silicon with EDGE Architectures," D. Burger, S.W. Keckler, K.S. McKinley, M. Dahlin, L.K. John, C. Lin, C.R. Moore, J. Burrill, R.G. McDonald, and W. Yoder, IEEE Computer, 37:7, pp. 44-55, July, 2004.

  19. "TRIPS: A Polymorphous Architecture for Exploiting ILP, TLP, and DLP," K. Sankaralingam, R. Nagarajan, H. Liu, C. Kim, J. Huh, N. Ranganathan, D. Burger, S.W. Keckler, R.G. McDonald, and C.R. Moore, ACM Transactions on Architecture and Code Optimization, 1:1, pp. 62-93, March, 2004.

  20. "Exploiting ILP, TLP, and DLP with the Polymorphous TRIPS Architecture," K. Sankaralingam, R. Nagarajan, H. Liu, C. Kim, J. Huh, D. Burger, S.W. Keckler, and C.R. Moore, IEEE Micro, 23:6, pp. 46-51, November/December, 2003.

  21. "Nonuniform Cache Architectures for Wire-Delay Dominated On-Chip Caches," C. Kim, D. Burger, and S.W. Keckler, IEEE Micro, 23:6, pp. 99-107, November/December, 2003.

  22. "Static Energy Reduction Techniques for Microprocessor Caches," H. Hanson, M.S. Hrishikesh, V. Agarwal, S.W. Keckler, and D. Burger. IEEE Transactions on VLSI Systems, 11(3):303-313, June, 2003.

  23. "A Technology-Scalable Architecture for Fast Clocks and High ILP," K. Sankaralingam, R. Nagarajan, D. Burger, and S.W. Keckler, in Interaction Between Compilers and Computer Architectures, edited by G. Lee and P. Yew, pp. 117-139, Kluwer Academic Publishers, 2001. PDF

  24. "Concurrent Event Handling Through Multithreading," S.W. Keckler, A. Chang, W.S. Lee, S. Chatterjee, and W.J. Dally. IEEE Transactions on Computers, 48:9, September, 1999, pp. 903-916. PDF

  25. "Efficient Protected Message Interface in the MIT M-Machine," W.S. Lee, W.J. Dally, S.W. Keckler, N.P. Carter, and A. Chang. IEEE Computer, pp. 69-75, November 1998. PDF

  26. "The M-Machine Multicomputer," M. Fillo, S.W. Keckler, W.J. Dally, N.P. Carter, A. Chang, Y. Gurevich, and W.S. Lee. International Journal of Parallel Programming, 25:3, pp. 183-212, June 1997.

  27. "International Symposium on Computer Architecture 1992," S.W. Keckler and W.J. Dally. Scientific Information Bulletin, Office of Naval Research Asian Office, 17:4, October-December, 1992.

Workshop Papers

  1. "Feature Map Vulnerability Evaluation in CNNs." A. Mahmoud, S.K.S. Hari, C.W. Fletcher, S. Adve, C. Sakr, N. Shanbhag, P. Molchanov, M.B. Sullivan, T. Tsai, and S.W. Keckler. Workshop on Secure and Resilient Autonomy (SARA), March 2020.

  2. "Kayotee: A Fault Injection-based System to Assess the Safety and Reliability of Autonomous Vehicles to Faults and Errors," S. Jha, T. Tsai, S.K.S. Hari, M. Sullivan, Z. Kalbarczyk, S.W. Keckler, and R. Iyer, IEEE International Workshop on Automotive Reliability & Test (ART), November 2018.

  3. "An Analytical Model for Hardened Latch Selection and Exploitation," M. Sullivan, B. Zimmer, S.K.S. Hari, T. Tsai, and S.W. Keckler. 12th Workshop on Silicon Errors in Logic - System Effects (SELSE), March, 2016.

  4. "SASSIFI: Evaluating Resilience of GPU Applications," S.K.S. Hari, T. Tsai, M. Stephenson, S.W. Keckler, and J. Emer. 11th Workshop on Silicon Errors in Logic - System Effects (SELSE), April, 2015. PDF

  5. "Application-aware Memory System for Fair and Efficient Execution of Concurrent GPGPU Applications," A. Jog, E. Bolotin, Z. Guz, M. Parker, S.W. Keckler, M.T. Kandemir, and C.R. Das. Seventh Workshop on General Purpose Processing Using GPUs (GPGPU), March, 2014. PDF

  6. "Measuring the Radiation Reliability of SRAM Structures in GPUs Designed for HPC." P. Rech, L. Carro, N. Wang, T. Tsai, S.K.S. Hari, and S.W. Keckler. 10th Workshop on Silicon Errors in Logic - System Effects (SELSE-10), April, 2014. PDF

  7. "Netrace: Dependency-Driven Trace-Based Network-on-Chip Simulation," J. Hestness, B. Grot, and S.W. Keckler, Third International Workshop on Network on Chip Architectures (NoCArc), December, 2010. PDF

  8. "Topology-aware Quality-of-Service Support in Highly Integrated Chip Multiprocessors," B. Grot, S.W. Keckler, and O. Mutlu, Workshop on the Interaction Between Operating Systems and Computer Architecture (WIOSCA), June 2010. PDF

  9. "ReFLEX: Block Atomic Execution on Conventional ISA Cores," M. Gebhart and S.W. Keckler, Workshop on Parallel Execution of Sequential Programs on Multi-core Architectures (PESPMA), June 2010. PDF

  10. "Segment Gating for Static Energy Reduction in Networks-On-Chip," K.C. Hale, B. Grot, and S.W. Keckler, Second International Workshop on Network on Chip Architectures (NoCArc), December 2009. PDF

  11. "Multicore Optimization for Ranger," J. Diamond, B.D. Kim, M. Burtscher, S.W. Keckler, K. Pingali, and J.C. Browne, TeraGrid 09, June 2009. PDF

  12. "Hybrid Operand Communication for Dataflow Processors," D. Li, B. Robatmili, M.S. Govindan, D. Burger and S.W. Keckler, Workshop on Parallel Execution of Sequential Programs on Multi-core Architectures (PESPMA), June 2009. PDF

  13. "Scalable On-Chip Interconnect Topologies," B. Grot and S.W. Keckler, Workshop on Chip Multiprocessor Memory Systems and Interconnects (CMP-MSI), June 2008. PDF

  14. "Multitasking Workload Scheduling on Flexible Core Chip Multiprocessors," D.P. Gulati, C. Kim, S. Sethumadhavan, S.W. Keckler, and D. Burger, Workshop on Design, Architecture and Simulation of Chip Multi-Processors (dasCMP), December, 2007. PDF

  15. "Software Infrastructure and Tools for the TRIPS Prototype," B. Yoder, J. Burrill, R. McDonald, K. Bush, K. Coons, M. Gebhart, M.S. Govindan, B. Maher, R. Nagarajan, B. Robatmili, K. Sankaralingam, S. Sharif, A. Smith, D. Burger, S.W. Keckler, and K.S. McKinley, Workshop on Modeling, Benchmarking and Simulation (MoBS), June, 2007. PDF

  16. "Exploiting Slack for Low Overhead Soft Error Reliability," P. Shivakumar and S.W. Keckler, Workshop on Silicon Errors in Logic - System Effects (SELSE), April, 2007. PDF

  17. "Power, Performance, and Thermal Management for High-Performance Systems," H. Hanson, S.W. Keckler, K. Rajamani, S. Ghiasi, F. Rawson, and J. Rubio, Third Workshop on High-Performance, Power-Aware Computing, March, 2007. PDF

  18. "Power and Thermal Characteristics of a Pentium M," H. Hanson and S.W. Keckler, IBM Austin Center for Advanced Studies Conference, February 2007. PDF

  19. "Evaluation and Optimization of Signal Processing Kernels on the TRIPS Architecture," K. Bush, M. Gebhart, E. Wei, N. Yudin, B. Maher, N. Nethercote, D. Burger, and S.W. Keckler, Workshop on Optimizations for DSP and Embedded Systems (ODES-4), pp. 39-48, March, 2006. PDF

  20. "Power and Performance Optimization: A Case Study with the Pentium M Processor," H. Hanson and S.W. Keckler, IBM Austin Center for Advanced Studies Conference, February 2006. PDF

  21. "The Memory Behavior of Data Structures in C - SPEC CPU2000 Benchmarks," K.K. Agaram, S.W. Keckler, C. Lin, K.S. McKinley, 2006 SPEC Benchmark Workshop, January, 2006. PDF

  22. "Coordinated Power, Energy, and Temperature Manager," H. Hanson and S.W. Keckler, Poster Session, Austin Conference on Energy-Efficient Design (ACEED), March 2005. PDF

  23. "Fault Aware Instruction Placement for Static Architectures," P. Shivakumar, D.P. Gulati, C. Lin, and S.W. Keckler, 1st Workshop on High Performance Computing Reliability Issues (HPCRI), at HPCA-2005, February, 2005. PDF

  24. "Coordinated Management: Power, Performance, Energy, and Temperature," H. Hanson and S.W. Keckler, IBM Austin Center for Advanced Studies Conference, February 2005. PDF

  25. "Coordinated Power, Energy, and Temperature Management for High-Performance Microprocessors," H. Hanson, S.W. Keckler, and D. Burger, IBM Austin Center for Advanced Studies Conference, February, 2004. PDF

  26. "Lightweight Distributed Selective Re-Execution and its Implications for Value Speculation," R. Desikan, L. Sethumadhavan, R. Nagarajan, D.C. Burger, and S.W. Keckler. 1st Value Prediction Workshop, at ISCA-30, June, 2003. PDF

  27. "Exploiting Microarchitectural Redundancy For Defect Tolerance," P. Shivakumar, S.W. Keckler, C.R. Moore, and D.C. Burger, IBM Austin Center for Advanced Studies Conference, February, 2003. PDF

  28. "On-chip MRAM as a High-Bandwidth, Low-Latency Replacement for DRAM Physical Memories," R. Desikan, C.R. Lefurgy, S.W. Keckler, D.C. Burger, IBM Austin Center for Advanced Studies Conference, February, 2003.

  29. "Modeling the Effect of Technology Trends on Soft Error Rate of Combinational Logic," P. Shivakumar, M. Kistler, S.W. Keckler, D. Burger, and L. Alvisi, IBM Austin Center for Advanced Studies Workshop, February, 2002. PDF

  30. "An Adaptive Cache Structure for Future High-Performance Systems," C. Kim, D. Burger, and S.W. Keckler. IBM Austin Center for Advanced Studies Workshop, February, 2002.

  31. "A Characterization of Speech Recognition on Modern Computer Systems," K. Agaram, S.W. Keckler, and D. Burger. 4th Annual International Workshop in Workload Characterization, pp. 45-53, December, 2001. PDF

  32. "A Technology-Scalable Architecture for Fast Clocks and High ILP," K. Sankaralingam, R. Nagarajan, D.C. Burger, S.W. Keckler, IBM Austin Center for Advanced Studies Conference, February, 2001.

  33. "A Technology-Scalable Architecture for Fast Clocks and High ILP," K. Sankaralingam, R. Nagarajan, D.C. Burger, S.W. Keckler. 5th Workshop on the Interaction of Compilers and Computer Architecture, at HPCA-7, January, 2001.

  34. "Maximizing Performance/Area Implementations for Future Single-Chip Servers," J. Huh, D.C. Burger, and S.W. Keckler. IBM Austin Center for Advanced Studies Workshop, January, 2001.

  35. "Characterizing the SPHINX Speech Recognition System," K. Agaram, S.W. Keckler, and D.C. Burger. IBM Austin Center for Advanced Studies Workshop, January, 2001.

Tech Reports and Theses

  1. "Making Sense of Performance Counter Measurements on Supercomputing Applications," J. Diamond, J.D. McCalpin, M. Burtscher, B.D. Kim, S.W. Keckler, J.C. Browne, TR-10-25, Department of Computer Sciences, The University of Texas at Austin, July, 2010. PDF

  2. "Scaling Power and Performance via Processor Composability," M.S. Govindan, B. Robatmili, H. Esmaeilzadeh, B. Maher, D. Li, A. Smith, D. Burger, and S.W. Keckler, TR-10-14, Department of Computer Sciences, The University of Texas at Austin, April, 2010. PDF

  3. "Netrace: Dependency-Tracking Traces for Efficient Network-on-Chip Experimentation," J. Hestness and S.W. Keckler, TR-10-11, Department of Computer Sciences, The University of Texas at Austin, April, 2010. PDF

  4. "Compiler-assisted Hybrid Operand Communication," D. Li, B. Robatmili, M.S. Govindan, A. Smith, S.W. Keckler, and D. Burger. TR-09-33, Department of Computer Sciences, The University of Texas at Austin, November 2009. PDF

  5. "Running PARSEC 2.1 on M5," M. Gebhart, J. Hestness, E. Fatehi, P. Gratz, and S.W. Keckler. TR-09-32, Department of Computer Sciences, The University of Texas at Austin, October, 2009. PDF

  6. "An Evaluation of the TRIPS Computer System (Extended Technical Report)," M. Gebhart, B.A. Maher, K.E. Coons, J. Diamond, P. Gratz, M. Marino, N. Ranganathan, B. Robatmili, A. Smith, J. Burrill, S.W. Keckler, D. Burger, and K.S. McKinley, TR-08-31, Department of Computer Sciences, The University of Texas at Austin, July 2008. PDF

  7. "End-to-End Validation of Architectural Power Models," M.S. Govindan, S.W. Keckler, D. Burger. TR-08-37, Department of Computer Sciences, The University of Texas at Austin, September 2008. PDF

  8. "A Temperature-Aware Power Estimation Methodology," M.S. Govindan, S.W. Keckler, S. Nassif, and E. Acar. TR-07-43, Department of Computer Sciences, The University of Texas at Austin, August 2007. PDF

  9. "A Characterization of High Performance DSP Kernels on the TRIPS Architecture," K. Bush, M. Gebhart, D. Burger, and S.W. Keckler, TR-06-62, Department of Computer Sciences, The University of Texas at Austin, November, 2006. PDF

  10. "Implementation of the Control Unit in the TRIPS Prototype Processor," R. Nagarajan, R.G. McDonald, D. Burger, and S.W. Keckler, TR-06-34, Department of Computer Sciences, The University of Texas at Austin, June, 2006. PDF

  11. "Partition the Banks, not the Functionality, of Large-Window Load-Store Queues," S. Sethumadhavan, D. Burger, and S.W. Keckler, TR-06-39, Department of Computer Sciences, The University of Texas at Austin, June, 2006. PDF

  12. "Elastic Threads on Composable Processors," C. Kim, S. Sethumadhavan, N. Ranganathan, H. Liu, R.G. McDonald, D. Burger, and S.W. Keckler, TR-06-09, Department of Computer Sciences, The University of Texas at Austin, March, 2006. PDF

  13. "TRIPS Assembly Language (TASL) Manual," B. Yoder, J. Burrill, R. McDonald, D. Burger, S.W. Keckler, and K.S. McKinley, TR-05-21, Department of Computer Sciences, The University of Texas at Austin, May, 2005. PDF

  14. "TRIPS Application Binary Interface (ABI) Manual," A. Smith, J. Burrill, R. McDonald, N. Nethercote, B. Yoder, D. Burger, S.W. Keckler, and K.S. McKinley, TR-05-22, Department of Computer Sciences, The University of Texas at Austin, May, 2005. PDF

  15. "TRIPS Intermediate Language (TIL) Manual," A. Smith, J. Gibson, J. Burrill, R. McDonald, D. Burger, S.W. Keckler, and K.S. McKinley, TR-05-20, Department of Computer Sciences, The University of Texas at Austin, May, 2005. PDF

  16. "TRIPS Processor Reference Manual: Version 1.2," R. McDonald, D. Burger, S.W. Keckler, K. Sankaralingam, and R. Nagarajan, TR-05-19, Department of Computer Sciences, The University of Texas at Austin, March, 2005. PDF

  17. "Analysis of Polymorphous Computing Architecture (PCA) Radar-processing Benchmark," J. Rahe and S.W. Keckler, TR-03-41, Department of Computer Sciences, The University of Texas at Austin, December, 2003.

  18. "Microprocessor Pipeline Energy Analysis: Speculation and Over-Provisioning," K. Natarajan, H. Hanson, S.W. Keckler, C.R. Moore, and D. Burger, TR-03-20, Department of Computer Sciences, The University of Texas at Austin, June, 2003. PDF

  19. "Design and Analysis of Routed Inter-ALU Networks for ILP Scalability and Performance," V.A. Singh, K. Sankaralingam, S.W. Keckler, and D.C. Burger, TR-03-17, Department of Computer Sciences, The University of Texas at Austin, May, 2003. PDF

  20. "A Routing Network for the Grid Processor Architecture," V.A. Singh, S.W. Keckler, and D.C. Burger. TR-03-10, Department of Computer Sciences, The University of Texas at Austin, April, 2003. PDF

  21. "Sharing Speculation: A Mechanism for Low-Latency Access to Falsely Shared Data," R. Desikan, J. Huh, D.C. Burger, and S.W. Keckler. TR-03-05, Department of Computer Sciences, The University of Texas at Austin, February, 2003. PDF

  22. "Phase Analysis of Program Memory Behavior," K.K. Agaram, S.W. Keckler, C. Lin, and K. McKinley. TR-02-67, Department of Computer Sciences, The University of Texas at Austin, 2002. PDF

  23. "On-chip MRAM as a High-Bandwidth, Low-Latency Replacement for DRAM Physical Memories," R. Desikan, S.W. Keckler, and D.C. Burger. TR-02-47, Department of Computer Sciences, The University of Texas at Austin, September, 2002. PDF

  24. "Combining Hyperblocks and Exit Prediction to Increase Front-End Bandwidth and Performance," N. Ranganathan, D. Jimenez, R. Nagarajan, D.C. Burger, S.W. Keckler, and C. Lin. TR-02-41, Department of Computer Sciences, The University of Texas at Austin, September, 2002. PDF

  25. "Modeling the Impact of Device and Pipeline Scaling on the Soft Error Rate of Processor Elements," P. Shivakumar, M. Kistler, S.W. Keckler, D.C. Burger, and L. Alvisi. TR-02-19, Department of Computer Sciences, The University of Texas at Austin, April, 2002. PDF

  26. "An Adaptive Cache Structure for Future High-Performance Systems," C. Kim, D.C. Burger, and S.W. Keckler. TR-02-10, Department of Computer Sciences, The University of Texas at Austin, February, 2002. PDF

  27. "Assessment of MRAM Technology Characteristics and Architectures," R. Desikan, S.W. Keckler, and D.C. Burger. TR-01-36, Department of Computer Sciences, The University of Texas at Austin, October, 2001. PDF

  28. "Sim-alpha: a Validated, Execution-Driven Alpha 21264 Simulator," R. Desikan, D.C. Burger, S.W. Keckler, and T.M. Austin. TR-01-23, Department of Computer Sciences, The University of Texas at Austin, October, 2001. PDF

  29. "Static Energy Reduction Techniques in Microprocessor Caches," H. Hanson, S.W. Keckler, and D.C. Burger. TR-01-18, Department of Computer Sciences, The University of Texas at Austin, June, 2001. PDF

  30. "A Technology-Scalable Architecture for Fast Clocks and High ILP," K. Sankaralingam, R. Nagarajan, D. Burger, and S.W. Keckler. TR-01-02, Department of Computer Sciences, The University of Texas at Austin, January, 2001. PDF

  31. "Characterizing the SPHINX Speech Recognition System," K. Agaram, S.W. Keckler, D.C. Burger. TR-00-33, Department of Computer Sciences, The University of Texas at Austin, December, 2000. PDF

  32. "Impact of Technology Scaling on Instruction Execution Throughput," M.S. Hrishikesh, D.C. Burger, and S.W. Keckler. TR-00-06, Department of Computer Sciences, The University of Texas at Austin, June, 2001. PDF

  33. "Technology Independent Area and Delay Estimates for Microprocessor Building Blocks," S. Gupta, S.W. Keckler, D.C. Burger. TR-00-05, Department of Computer Sciences, The University of Texas at Austin, May, 2000. PDF

  34. "SimpleScalar Simulation of the PowerPC Instruction Set Architecture," K. Sankaralingam, R. Nagarajan, S.W. Keckler, and D.C. Burger. TR-00-04, Department of Computer Sciences, The University of Texas at Austin, May, 2001. PDF

  35. "The Effect of Technology Scaling on Microarchitectural Structures," V. Agarwal, S.W. Keckler, and D.C. Burger. TR-00-02, Department of Computer Sciences, The University of Texas at Austin, May, 2001. PDF

  36. "Fast Thread Communication and Synchronization Mechanisms for a Scalable Single Chip Multiprocessor," S.W. Keckler. PhD Thesis, Massachusetts Institute of Technology, May 1998. PDF

  37. "A Coupled Multi-ALU Processing Node for a Highly Parallel Computer," S.W. Keckler. MS Thesis, Artificial Intelligence Laboratory Technical Report 1355, Massachusetts Institute of Technology, May 1992. PDF