Wen-Mei W Hwu

Research Output

Chapter

Floating-point considerations

Kirk, D. B. & Hwu, W-M. W., Jan 1 2012, Programming Massively Parallel Processors: A Hands-on Approach, Second Edition. Elsevier Science, p. 151-171 21 p.

Research output: Chapter in Book/Report/Conference proceeding › Chapter

History of GPU computing

Kirk, D. B. & Hwu, W-M. W., Jan 1 2012, Programming Massively Parallel Processors: A Hands-on Approach, Second Edition. Elsevier Science, p. 23-39 17 p.

Research output: Chapter in Book/Report/Conference proceeding › Chapter

Introduction

Hwu, W. M. W., Jan 1 2016, Heterogeneous System Architecture: A New Compute Platform Infrastructure. Elsevier Inc., p. 1-5 5 p.

Research output: Chapter in Book/Report/Conference proceeding › Chapter

Introduction

Kirk, D. B. & Hwu, W-M. W., Jan 1 2012, Programming Massively Parallel Processors: A Hands-on Approach, Second Edition. Elsevier Science, p. 1-21 21 p.

Research output: Chapter in Book/Report/Conference proceeding › Chapter

Introduction to data parallelism and CUDA C

Kirk, D. B. & Hwu, W-M. W., Jan 1 2012, Programming Massively Parallel Processors: A Hands-on Approach, Second Edition. Elsevier Science, p. 41-62 22 p.

Research output: Chapter in Book/Report/Conference proceeding › Chapter

Mapping high-level programming languages to OpenCL 2.0: A compiler writer's perspective

Sung, I. J., Chung, W. H., Lee, Y. W. & Hwu, W. M., May 18 2015, Heterogeneous Computing with OpenCL 2.0: Third Edition. Elsevier Inc., p. 249-272 24 p.

Research output: Chapter in Book/Report/Conference proceeding › Chapter

Parallel patterns: Prefix sum: An introduction to work efficiency in parallel algorithms

Kirk, D. B. & Hwu, W-M. W., Jan 1 2012, Programming Massively Parallel Processors: A Hands-on Approach, Second Edition. Elsevier Science, p. 197-216 20 p.

Research output: Chapter in Book/Report/Conference proceeding › Chapter

Parallel patterns: Sparse matrix-vector multiplication: An introduction to compaction and regularization in parallel algorithms

Kirk, D. B. & Hwu, W-M. W., Jan 1 2012, Programming Massively Parallel Processors: A Hands-on Approach, Second Edition. Elsevier Science, p. 217-234 18 p.

Research output: Chapter in Book/Report/Conference proceeding › Chapter

Parallel patterns: Convolution: With an introduction to constant memory and caches

Kirk, D. B. & Hwu, W-M. W., Jan 1 2012, Programming Massively Parallel Processors: A Hands-on Approach, Second Edition. Elsevier Science, p. 173-196 24 p.

Research output: Chapter in Book/Report/Conference proceeding › Chapter

Parallel programming and computational thinking

Kirk, D. B. & Hwu, W-M. W., Jan 1 2012, Programming Massively Parallel Processors: A Hands-on Approach, Second Edition. Elsevier Science, p. 281-295 15 p.

Research output: Chapter in Book/Report/Conference proceeding › Chapter

Performance analysis and tuning for general purpose graphics processing units (GPGPU)

Kim, H., Vuduc, R., Baghsorkhi, S., Hwu, W. M. & Jee Choi, C., Nov 21 2012, Performance Analysis and Tuning for General Purpose Graphics Processing Units (GPGPU). p. 1-94 94 p. (Synthesis Lectures on Computer Architecture; vol. 20).

Research output: Chapter in Book/Report/Conference proceeding › Chapter

Performance considerations

Kirk, D. B. & Hwu, W-M. W., Jan 1 2012, Programming Massively Parallel Processors: A Hands-on Approach, Second Edition. Elsevier Science, p. 123-149 27 p.

Research output: Chapter in Book/Report/Conference proceeding › Chapter

Using GPUs to accelerate advanced MRI reconstruction with field inhomogeneity compensation

Zhuo, Y., Wu, X. L., Haldar, J. P., Marin, T., Hwu, W. M. W., Liang, Z. P. & Sutton, B. P., Dec 1 2011, GPU Computing Gems Emerald Edition. Elsevier Inc., p. 709-722 14 p.

Research output: Chapter in Book/Report/Conference proceeding › Chapter

Conference article

Accurate and efficient predicate analysis with binary decision diagrams

Sias, J. W., Hwu, W. M. W. & August, D. I., Jan 1 2000, In : Proceedings of the Annual International Symposium on Microarchitecture. p. 112-123 12 p.

Research output: Contribution to journal › Conference article

Achieving high instruction cache performance with an optimizing compiler.

Hwu, W-M. W. & Chang, P. P., May 1 1989, In : Conference Proceedings - Annual Symposium on Computer Architecture. 16, p. 242-251 10 p.

Research output: Contribution to journal › Conference article

Benchmark characterization

Conte, T. M. & Hwu, W. M. W., Jan 1 1991, In : Proceedings of the Annual Hawaii International Conference on System Sciences. 1, p. 364-372 9 p., 183907.

Research output: Contribution to journal › Conference article

C compiler for HPS I, a highly parallel execution engine.

Shebanow, M. C., Patt, Y. N., Hwu, W. M. & Melvin, S., Dec 1 1986, In : Proceedings of the Hawaii International Conference on System Science. 2 a, p. 520-528 9 p.

Research output: Contribution to journal › Conference article

Code reordering and speculation support for dynamic optimization systems

Nystrom, E. M., Barnes, R. D., Merten, M. C. & Hwu, W-M. W., Jan 1 2001, In : Parallel Architectures and Compilation Techniques - Conference Proceedings, PACT. p. 163-174 12 p.

Research output: Contribution to journal › Conference article

Comparing software and hardware schemes for reducing the cost of branches.

Hwu, W. M. W., Conte, T. M. & Chang, P. P., May 1 1989, In : Conference Proceedings - Annual Symposium on Computer Architecture. 16, p. 224-233 10 p.

Research output: Contribution to journal › Conference article

Compiler-directed dynamic computation reuse: Rationale and initial results

Connors, D. A. & Hwu, W-M. W., Dec 1 1999, In : Proceedings of the Annual International Symposium on Microarchitecture. p. 158-169 12 p.

Research output: Contribution to journal › Conference article

Compiler-directed early load-address generation

Cheng, B. C., Connors, D. A. & Hwu, W-M. W., Dec 1 1998, In : Proceedings of the Annual International Symposium on Microarchitecture. p. 138-147 10 p.

Research output: Contribution to journal › Conference article

Design choices for the HPSm microprocessor chip.

Hwu, W. M. & Patt, Y. N., Jan 1 1987, In : Proceedings of the Hawaii International Conference on System Science. 1, p. 330-336 7 p.

Research output: Contribution to journal › Conference article

Field-testing IMPACT EPIC research results in Itanium 2

Sias, J. W., Ueng, S. Z., Kent, G. A., Steiner, I. M. & Hwu, W. M. W., Oct 8 2004, In : Conference Proceedings - Annual International Symposium on Computer Architecture, ISCA. 31, p. 26-37 12 p.

Research output: Contribution to journal › Conference article

Hardware-driven profiling scheme for identifying program hot spots to support runtime optimization

Merten, M. C., Trick, A. R., George, C. N., Gyllenhaal, J. C. & Hwu, W. M. W., Jan 1 1999, In : Conference Proceedings - Annual International Symposium on Computer Architecture, ISCA. p. 136-147 12 p.

Research output: Contribution to journal › Conference article

Hardware mechanism for dynamic extraction and relayout of program hot spots

Merten, M. C., Trick, A. R., Nystrom, E. M., Barnes, R. D. & Hwu, W. M. W., Jan 1 2000, In : Conference Proceedings - Annual International Symposium on Computer Architecture, ISCA. p. 59-70 12 p.

Research output: Contribution to journal › Conference article

Integrated predicated and speculative execution in the IMPACT EPIC architecture

August, D. I., Connors, D. A., Mahlke, S. A., Sias, J. W., Crozier, K. M., Cheng, B. C., Eaton, P. R., Olaniran, Q. B. & Hwu, W-M. W., Jan 1 1998, In : Conference Proceedings - Annual International Symposium on Computer Architecture, ISCA. p. 227-237 11 p.

Research output: Contribution to journal › Conference article

Interpretable and globally optimal prediction for textual grounding using image concepts

Yeh, R. A., Xiong, J., Hwu, W. M. W., Do, M. N. & Schwing, A. G., Jan 1 2017, In : Advances in Neural Information Processing Systems. 2017-December, p. 1913-1923 11 p.

Research output: Contribution to journal › Conference article

Modulo scheduling of loops in control-intensive non-numeric programs

Lavery, D. M. & Hwu, W-M. W., Dec 1 1996, In : Proceedings of the Annual International Symposium on Microarchitecture. p. 126-137 12 p.

Research output: Contribution to journal › Conference article

Optimization of machine descriptions for efficient use

Gyllenhaal, J. C., Hwu, W. M. W. & Rau, B. R., Dec 1 1996, In : Proceedings of the Annual International Symposium on Microarchitecture. p. 349-358 10 p.

Research output: Contribution to journal › Conference article

Program decision logic approach to predicated execution

August, D. I., Sias, J. W., Puiatti, J. M., Mahlke, S. A., Connors, D. A., Crozier, K. M. & Hwu, W. M. W., Jan 1 1999, In : Conference Proceedings - Annual International Symposium on Computer Architecture, ISCA. p. 208-219 12 p.

Research output: Contribution to journal › Conference article

Region-based compilation: an introduction and motivation

Hank, R. E., Hwu, W. M. W. & Rau, B. R., Jan 1 1995, In : Proceedings of the Annual International Symposium on Microarchitecture. p. 158-168 11 p.

Research output: Contribution to journal › Conference article

Run-time adaptive cache hierarchy management via reference analysis

Johnson, T. L. & Hwu, W. M. W., Jan 1 1997, In : Conference Proceedings - Annual International Symposium on Computer Architecture, ISCA. p. 315-326 12 p.

Research output: Contribution to journal › Conference article

Run-time adaptive cache management

Johnson, T. L., Connors, D. A. & Hwu, W. M. W., Jan 1 1998, In : Proceedings of the Hawaii International Conference on System Sciences. 7, p. 774-775 2 p.

Research output: Contribution to journal › Conference article

Simulation study of simultaneous vector prefetch performance in multiprocessor memory subsystems

Hwu, W. M. W. & Conte, T. M., May 1 1989, In : Performance Evaluation Review. 17, 1, 1 p.

Research output: Contribution to journal › Conference article

Speculative hedge: Regulating compile-time speculation against profile variations

Deitrich, B. L. & Hwu, W-M. W., Dec 1 1996, In : Proceedings of the Annual International Symposium on Microarchitecture. p. 70-79 10 p.

Research output: Contribution to journal › Conference article

Trace selection for compiling large C application programs to microcode

Chang, P. P. & Hwu, W-M. W., Dec 1 1988, In : MICRO: Annual Microprogramming Workshop. p. 21-29 9 p.

Research output: Contribution to journal › Conference article

Transmission power control for multiple access wireless packet networks

Monks, J. P., Bharghavan, V. & Hwu, W. M. W., Dec 1 2000, In : Conference on Local Computer Networks. p. 12-21 10 p.

Research output: Contribution to journal › Conference article

Trimaran: An infrastructure for research in instruction-level parallelism

Chakrapani, L. N., Gyllenhaal, J., Hwu, W. M. W., Mahlke, S. A., Palem, K. V. & Rabbah, R. M., Oct 19 2005, In : Lecture Notes in Computer Science. 3602, p. 32-41 10 p.

Research output: Contribution to journal › Conference article

Unrolling-based optimizations for modulo scheduling

Lavery, D. M. & Hwu, W. M. W., Jan 1 1995, In : Proceedings of the Annual International Symposium on Microarchitecture. p. 327-337 11 p.

Research output: Contribution to journal › Conference article

Conference contribution

AccDNN: An IP-Based DNN Generator for FPGAs

Zhang, X., Wang, J., Zhu, C., Lin, Y., Xiong, J., Hwu, W. M. & Chen, D., Sep 7 2018, Proceedings - 26th IEEE International Symposium on Field-Programmable Custom Computing Machines, FCCM 2018. Institute of Electrical and Electronics Engineers Inc., 1 p. 8457659. (Proceedings - 26th IEEE International Symposium on Field-Programmable Custom Computing Machines, FCCM 2018).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Accelerating advanced MRI reconstructions on GPUs

Stone, S. S., Haldar, J. P., Tsao, S. C., Hwu, W. M. W., Liang, Z. P. & Sutton, B. P., Jan 1 2008, Conference on Computing Frontiers - Proceedings of the 2008 Conference on Computing Frontiers, CF'08. Association for Computing Machinery, p. 261-272 12 p. (Conference on Computing Frontiers - Proceedings of the 2008 Conference on Computing Frontiers, CF'08).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Accelerating iterative field-compensated MR image reconstruction on GPUs

Zhuo, Y., Wu, X. L., Haldar, J. P., Hwu, W. M., Liang, Z. P. & Sutton, B. P., Aug 9 2010, 2010 7th IEEE International Symposium on Biomedical Imaging: From Nano to Macro, ISBI 2010 - Proceedings. p. 820-823 4 p. 5490112. (2010 7th IEEE International Symposium on Biomedical Imaging: From Nano to Macro, ISBI 2010 - Proceedings).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Accelerating MR image reconstruction on GPUs

Hwu, W. M. W., Nandakumar, D., Haldar, J., Atkinson, I. C., Sutton, B., Liang, Z. P. & Thulborn, K. R., Nov 17 2009, Proceedings - 2009 IEEE International Symposium on Biomedical Imaging: From Nano to Macro, ISBI 2009. p. 1283-1286 4 p. 5193297. (Proceedings - 2009 IEEE International Symposium on Biomedical Imaging: From Nano to Macro, ISBI 2009).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Accelerating reduction and scan using tensor core units

Dakkak, A., Li, C., Xiong, J., Gelado, I. & Hwu, W. M., Jun 26 2019, ICS 2019 - International Conference on Supercomputing. Association for Computing Machinery, p. 46-57 12 p. (Proceedings of the International Conference on Supercomputing).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Accelerating sparse deep neural networks on FPGAs

Huang, S., Pearson, C., Nagi, R., Xiong, J., Chen, D. & Hwu, W. M., Sep 2019, 2019 IEEE High Performance Extreme Computing Conference, HPEC 2019. Institute of Electrical and Electronics Engineers Inc., 8916419. (2019 IEEE High Performance Extreme Computing Conference, HPEC 2019).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Acceleration of the Pair-HMM Algorithm for DNA Variant Calling

Manikandan, G. J., Huang, S., Rupnow, K., Hwu, W. M. W. & Chen, D., Aug 16 2016, Proceedings - 24th IEEE International Symposium on Field-Programmable Custom Computing Machines, FCCM 2016. Institute of Electrical and Electronics Engineers Inc., 1 p. 7544765. (Proceedings - 24th IEEE International Symposium on Field-Programmable Custom Computing Machines, FCCM 2016).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Adaptive cache bypass and insertion for many-core accelerators

Chen, X., Wu, S., Chang, L. W., Huang, W. S., Pearson, C., Wang, Z. & Hwu, W. M. W., Jan 1 2014, 2nd ACM International Workshop on Many-Core Embedded Systems, MES 2014 - In Conjunction with the 41st International Symposium on Computer Architecture, ISCA 2014. Association for Computing Machinery, p. 1-8 8 p. (ACM International Conference Proceeding Series).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Adaptive Cache Management for Energy-Efficient GPU Computing

Chen, X., Chang, L. W., Rodrigues, C. I., Lv, J., Wang, Z. & Hwu, W. M., Jan 15 2015, Proceedings - 47th Annual IEEE/ACM International Symposium on Microarchitecture, MICRO 2014. January ed. IEEE Computer Society, p. 343-355 13 p. 7011400. (Proceedings of the Annual International Symposium on Microarchitecture, MICRO; vol. 2015-January, no. January).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Advanced MRI reconstruction toolbox with accelerating on GPU

Wu, X. L., Zhuo, Y., Gai, J., Lam, F., Fu, M., Haldar, J. P., Hwu, W. M., Liang, Z. P. & Sutton, B. P., Feb 11 2011, Proceedings of SPIE-IS and T Electronic Imaging - Parallel Processing for Imaging Applications. 78720Q. (Proceedings of SPIE - The International Society for Optical Engineering; vol. 7872).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

A fast and massively-parallel inverse solver for multiple-scattering tomographic image reconstruction

Hidayetoglu, M., Pearson, C., El Hajj, I., Gurel, L., Chew, W. C. & Hwu, W. M., Aug 3 2018, Proceedings - 2018 IEEE 32nd International Parallel and Distributed Processing Symposium, IPDPS 2018. Institute of Electrical and Electronics Engineers Inc., p. 64-74 11 p. 8425161. (Proceedings - 2018 IEEE 32nd International Parallel and Distributed Processing Symposium, IPDPS 2018).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution