Wen-Mei W Hwu

Research output per year, 1984 … 2019

Research Output

2012

Floating-point considerations

Kirk, D. B. & Hwu, W-M. W., Jan 1 2012, Programming Massively Parallel Processors: A Hands-on Approach, Second Edition. Elsevier Science, p. 151-171 21 p.

Research output: Chapter in Book/Report/Conference proceeding › Chapter

GPU Computing Gems Jade Edition

Hwu, W-M. W., Jan 1 2012, Elsevier Inc.

Research output: Book/Report › Book

High-speed interferometric synthetic aperture microscopy on a graphics processing unit

Ahmad, A., Shemonski, N., Adie, S. G., Kim, H., Hwu, W. M. W., Carney, P. S. & Boppart, S. A., 2012, Frontiers in Optics, FIO 2012.

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

History of GPU computing

Kirk, D. B. & Hwu, W-M. W., Jan 1 2012, Programming Massively Parallel Processors: A Hands-on Approach, Second Edition. Elsevier Science, p. 23-39 17 p.

Research output: Chapter in Book/Report/Conference proceeding › Chapter

Implementing a GPU programming model on a non-GPU accelerator architecture

Kofsky, S. M., Johnson, D. R., Stratton, J. A., Hwu, W. M. W., Patel, S. J. & Lumetta, S. S., Mar 8 2012, Computer Architecture - ISCA 2010 International Workshops, A4MMC, AMAS-BT, EAMA, WEED, WIOSCA, Revised Selected Papers. p. 40-51 12 p. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); vol. 6161 LNCS).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Interferometric synthetic aperture microscopy with computational adaptive optics for high-resolution tomography of scattering tissue

Adie, S. G., Ahmad, A., Shemonski, N., Graf, B. W., Kim, H., Hwu, W. M. W., Carney, P. S. & Boppart, S. A., 2012, Biomedical Optics, BIOMED 2012.

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Introduction

Kirk, D. B. & Hwu, W-M. W., Jan 1 2012, Programming Massively Parallel Processors: A Hands-on Approach, Second Edition. Elsevier Science, p. 1-21 21 p.

Research output: Chapter in Book/Report/Conference proceeding › Chapter

Introduction

Hwu, W. M. W., Dec 1 2012, GPU Computing Gems Jade Edition. Elsevier Inc., p. xv-xvi

Research output: Chapter in Book/Report/Conference proceeding › Foreword/postscript

Introduction to data parallelism and CUDA C

Kirk, D. B. & Hwu, W-M. W., Jan 1 2012, Programming Massively Parallel Processors: A Hands-on Approach, Second Edition. Elsevier Science, p. 41-62 22 p.

Research output: Chapter in Book/Report/Conference proceeding › Chapter

Optimization and architecture effects on GPU computing workload performance

Stratton, J. A., Anssari, N., Rodrigues, C., Sung, I. J., Obeid, N., Chang, L., Liu, G. D. & Hwu, W-M. W., Dec 12 2012, 2012 Innovative Parallel Computing, InPar 2012. 6339605. (2012 Innovative Parallel Computing, InPar 2012).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Parallel patterns: Sparse matrix-vector multiplication: An introduction to compaction and regularization in parallel algorithms

Kirk, D. B. & Hwu, W-M. W., Jan 1 2012, Programming Massively Parallel Processors: A Hands-on Approach, Second Edition. Elsevier Science, p. 217-234 18 p.

Research output: Chapter in Book/Report/Conference proceeding › Chapter

Parallel patterns: Prefix sum: An introduction to work efficiency in parallel algorithms

Kirk, D. B. & Hwu, W-M. W., Jan 1 2012, Programming Massively Parallel Processors: A Hands-on Approach, Second Edition. Elsevier Science, p. 197-216 20 p.

Research output: Chapter in Book/Report/Conference proceeding › Chapter

Parallel patterns: Convolution: With an introduction to constant memory and caches

Kirk, D. B. & Hwu, W-M. W., Jan 1 2012, Programming Massively Parallel Processors: A Hands-on Approach, Second Edition. Elsevier Science, p. 173-196 24 p.

Research output: Chapter in Book/Report/Conference proceeding › Chapter

Parallel programming and computational thinking

Kirk, D. B. & Hwu, W-M. W., Jan 1 2012, Programming Massively Parallel Processors: A Hands-on Approach, Second Edition. Elsevier Science, p. 281-295 15 p.

Research output: Chapter in Book/Report/Conference proceeding › Chapter

Performance analysis and tuning for general purpose graphics processing units (GPGPU)

Kim, H., Vuduc, R., Baghsorkhi, S., Hwu, W. M. & Jee Choi, C., Nov 21 2012, Performance Analysis and Tuning for General Purpose Graphics Processing Units (GPGPU). p. 1-94 94 p. (Synthesis Lectures on Computer Architecture; vol. 20).

Research output: Chapter in Book/Report/Conference proceeding › Chapter

Performance considerations

Kirk, D. B. & Hwu, W-M. W., Jan 1 2012, Programming Massively Parallel Processors: A Hands-on Approach, Second Edition. Elsevier Science, p. 123-149 27 p.

Research output: Chapter in Book/Report/Conference proceeding › Chapter

Preface

Kirk, D. B. & Hwu, W-M. W., Jan 1 2012, Programming Massively Parallel Processors: A Hands-on Approach, Second Edition. Elsevier Science, p. xiii-xviii

Research output: Chapter in Book/Report/Conference proceeding › Foreword/postscript

TIGER: tiled iterative genome assembler.

Wu, X. L., Heo, Y., El Hajj, I., Hwu, W. M., Chen, D. & Ma, J., 2012, In : Unknown Journal. 13 Suppl 19

Research output: Contribution to journal › Article

2013

ClMPI: An OpenCL extension for interoperation with the message passing interface

Takizawa, H., Sugawara, M., Hirasawa, S., Gelado, I., Kobayashi, H. & Hwu, W. M. W., Jan 1 2013, Proceedings - IEEE 27th International Parallel and Distributed Processing Symposium Workshops and PhD Forum, IPDPSW 2013. IEEE Computer Society, p. 1138-1148 11 p. 6651000. (Proceedings - IEEE 27th International Parallel and Distributed Processing Symposium Workshops and PhD Forum, IPDPSW 2013).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Comparison based sorting for systems with multiple GPUs

Tanasic, I., Vilanova, L., Jordà, M., Cabezas, J., Gelado, I., Navarro, N. & Hwu, W. M., Apr 15 2013, Proceedings of the 6th Workshop on General Purpose Processor Using Graphics Processing Units, GPGPU 2013. p. 1-11 11 p. (ACM International Conference Proceeding Series).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Efficient compilation of CUDA kernels for high-performance computing on FPGAs

Papakonstantinou, A., Gururaj, K., Stratton, J. A., Chen, D., Cong, J. & Hwu, W-M. W., Oct 21 2013, In : Transactions on Embedded Computing Systems. 13, 2, 25.

Research output: Contribution to journal › Article

More IMPATIENT: A gridding-accelerated Toeplitz-based strategy for non-Cartesian high-resolution 3D MRI on GPUs

Gai, J., Obeid, N., Holtrop, J. L., Wu, X. L., Lam, F., Fu, M., Haldar, J. P., Hwu, W. M. W., Liang, Z. P. & Sutton, B. P., May 2013, In : Journal of Parallel and Distributed Computing. 73, 5, p. 686-697 12 p.

Research output: Contribution to journal › Article

Programming massively parallel processors: A hands-on approach, second edition

Kirk, D. B. & Hwu, W-M. W., Jan 1 2013, Elsevier Science. 496 p.

Research output: Book/Report › Book

Rapid computation of sodium bioscales using GPU-accelerated image reconstruction

Atkinson, I. C., Liu, G., Obeid, N., Thulborn, K. R. & Hwu, W. M., Mar 1 2013, In : International Journal of Imaging Systems and Technology. 23, 1, p. 29-35 7 p.

Research output: Contribution to journal › Article

Real-time in vivo computed optical interferometric tomography

Ahmad, A., Shemonski, N. D., Adie, S. G., Kim, H. S., Hwu, W. M. W., Carney, P. S. & Boppart, S. A., Jun 1 2013, In : Nature Photonics. 7, 6, p. 444-448 5 p.

Research output: Contribution to journal › Article

Scalable SIMD-parallel memory allocation for many-core machines

Huang, X., Rodrigues, C. I., Jones, S., Buck, I. & Hwu, W-M. W., Jun 1 2013, In : Journal of Supercomputing. 64, 3, p. 1008-1020 13 p.

Research output: Contribution to journal › Article

Throughput-oriented kernel porting onto FPGAs

Papakonstantinou, A., Chen, D., Hwu, W. M., Cong, J. & Yun, L., Jul 12 2013, Proceedings of the 50th Annual Design Automation Conference, DAC 2013. 11. (Proceedings - Design Automation Conference).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

2014

Adaptive cache bypass and insertion for many-core accelerators

Chen, X., Wu, S., Chang, L. W., Huang, W. S., Pearson, C., Wang, Z. & Hwu, W. M. W., Jan 1 2014, 2nd ACM International Workshop on Many-Core Embedded Systems, MES 2014 - In Conjunction with the 41st International Symposium on Computer Architecture, ISCA 2014. Association for Computing Machinery, p. 1-8 8 p. (ACM International Conference Proceeding Series).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

A guide for implementing tridiagonal solvers on GPUs

Chang, L. W. & Hwu, W. M. W., Jan 1 2014, Numerical Computations with GPUs. Springer International Publishing, p. 29-44 16 p.

Research output: Chapter in Book/Report/Conference proceeding › Chapter

Automatic execution of single-GPU computations across multiple GPUs

Cabezas, J., Vilanova, L., Gelado, I., Jablin, T. B., Navarro, N. & Hwu, W-M. W., Jan 1 2014, PACT 2014 - Proceedings of the 23rd International Conference on Parallel Architectures and Compilation Techniques. Institute of Electrical and Electronics Engineers Inc., p. 467-468 2 p. (Parallel Architectures and Compilation Techniques - Conference Proceedings, PACT).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

BLESS: Bloom filter-based error correction solution for high-throughput sequencing reads

Heo, Y., Wu, X. L., Chen, D., Ma, J. & Hwu, W. M., May 15 2014, In : Bioinformatics. 30, 10, p. 1354-1362 9 p.

Research output: Contribution to journal › Article

In-place transposition of rectangular matrices on accelerators

Sung, I. J., Gómez-Luna, J., González-Linares, J. M., Guil, N. & Hwu, W-M. W., Aug 2014, In : ACM SIGPLAN Notices. 49, 8, p. 207-218 12 p.

Research output: Contribution to journal › Article

In-place transposition of rectangular matrices on accelerators

Sung, I. J., Gómez-Luna, J., González-Linares, J. M., Guil, N. & Hwu, W-M. W., Mar 10 2014, PPoPP 2014 - Proceedings of the 2014 ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming. p. 207-218 12 p. (Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPOPP).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Triolet: A programming system that unifies algorithmic skeleton interfaces for high-performance cluster computing

Rodrigues, C., Jablin, T., Dakkak, A. & Hwu, W. M., Aug 2014, In : ACM SIGPLAN Notices. 49, 8, p. 247-258 12 p.

Research output: Contribution to journal › Article

Triolet: A programming system that unifies algorithmic skeleton interfaces for high-performance cluster computing

Rodrigues, C., Jablin, T., Dakkak, A. & Hwu, W. M., Mar 10 2014, PPoPP 2014 - Proceedings of the 2014 ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming. p. 247-258 12 p. (Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPOPP).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

What is ahead for parallel computing

Hwu, W-M. W., Jul 2014, In : Journal of Parallel and Distributed Computing. 74, 7, p. 2574-2581 8 p.

Research output: Contribution to journal › Article

2015

Adaptive Cache Management for Energy-Efficient GPU Computing

Chen, X., Chang, L. W., Rodrigues, C. I., Lv, J., Wang, Z. & Hwu, W. M., Jan 15 2015, Proceedings - 47th Annual IEEE/ACM International Symposium on Microarchitecture, MICRO 2014. January ed. IEEE Computer Society, p. 343-355 13 p. 7011400. (Proceedings of the Annual International Symposium on Microarchitecture, MICRO; vol. 2015-January, no. January).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Automatic parallelization of kernels in shared-memory multi-GPU nodes

Cabezas, J., Vilanova, L., Gelado, I., Jablin, T. B., Navarro, N. & Hwu, W. M. W., Jun 8 2015, ICS 2015 - Proceedings of the 29th ACM International Conference on Supercomputing. Association for Computing Machinery, p. 3-13 11 p. (Proceedings of the International Conference on Supercomputing; vol. 2015-June).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Compiler Technology

Chung, W. H. J., Lyu, Y. H., Sung, I. J. R., Lee, Y. W. & Hwu, W-M. W., Dec 4 2015, Heterogeneous System Architecture: A New Compute Platform Infrastructure. Elsevier Inc., p. 97-129 33 p.

Research output: Chapter in Book/Report/Conference proceeding › Chapter

Enhancing the Usability and Utilization of Accelerated Architectures via Docker

Haydel, N., Gesing, S., Taylor, I., Madey, G., Dakkak, A., De Gonzalo, S. G. & Hwu, W. M. W., Jan 1 2015, Proceedings - 2015 IEEE/ACM 8th International Conference on Utility and Cloud Computing, UCC 2015. Rana, O., Buyya, R. & Raicu, I. (eds.). Institute of Electrical and Electronics Engineers Inc., p. 361-367 7 p. 7431432. (Proceedings - 2015 IEEE/ACM 8th International Conference on Utility and Cloud Computing, UCC 2015).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

FPGA accelerated DNA error correction

Ramachandran, A., Heo, Y., Hwu, W-M. W., Ma, J. & Chen, D., Apr 22 2015, Proceedings of the 2015 Design, Automation and Test in Europe Conference and Exhibition, DATE 2015. Institute of Electrical and Electronics Engineers Inc., Vol. 2015-April. p. 1371-1376 6 p. 7092605

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

GPU-SM: Shared memory multi-GPU programming

Cabezas, J., Jordà, M., Gelado, I., Navarro, N. & Hwu, W-M. W., Feb 7 2015, ACM International Conference Proceeding Series. Gong, X. (ed.). Association for Computing Machinery, p. 13-24 12 p. (ACM International Conference Proceeding Series; vol. 2015-February).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

In-place data sliding algorithms for many-core architectures

Luna, J. G., Chang, L. W., Sung, I. J., Hwu, W-M. W. & Guil, N., Dec 8 2015, Proceedings - 2015 44th International Annual Conference on Parallel Processing, ICPP 2015. Institute of Electrical and Electronics Engineers Inc., p. 210-219 10 p. 7349576. (Proceedings of the International Conference on Parallel Processing; vol. 2015-December).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Introduction

Hwu, W-M. W., Dec 4 2015, Heterogeneous System Architecture: A New Compute Platform Infrastructure. Elsevier Inc., p. 1-5 5 p.

Research output: Chapter in Book/Report/Conference proceeding › Chapter

Locality-centric thread scheduling for bulk-synchronous programming models on CPU architectures

Kim, H. S., Hajj, I. E., Stratton, J., Lumetta, S. S. & Hwu, W-M. W., Mar 3 2015, Proceedings of the 2015 IEEE/ACM International Symposium on Code Generation and Optimization, CGO 2015. Institute of Electrical and Electronics Engineers Inc., p. 257-268 12 p. 7054205. (Proceedings of the 2015 IEEE/ACM International Symposium on Code Generation and Optimization, CGO 2015).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Mapping high-level programming languages to OpenCL 2.0: A compiler writer's perspective

Sung, I. J., Chung, W. H., Lee, Y. W. & Hwu, W. M., May 18 2015, Heterogeneous Computing with OpenCL 2.0: Third Edition. Elsevier Inc., p. 249-272 24 p.

Research output: Chapter in Book/Report/Conference proceeding › Chapter

Optimized Data Transfers Based on the OpenCL Event Management Mechanism

Takizawa, H., Hirasawa, S., Sugawara, M., Gelado, I., Kobayashi, H. & Hwu, W. M. W., Jan 1 2015, In : Scientific Programming. 2015, 576498.

Research output: Contribution to journal › Article

Runtime and Architecture Support for Efficient Data Exchange in Multi-Accelerator Applications

Cabezas, J., Gelado, I., Stone, J. E., Navarro, N., Kirk, D. B. & Hwu, W. M., May 1 2015, In : IEEE Transactions on Parallel and Distributed Systems. 26, 5, p. 1405-1418 14 p., 6803940.

Research output: Contribution to journal › Article

SPEC ACCEL: A standard application suite for measuring hardware accelerator performance

Juckeland, G., Brantley, W., Chandrasekaran, S., Chapman, B., Che, S., Colgrove, M., Feng, H., Grund, A., Henschel, R., Hwu, W. M. W., Li, H., Müller, M. S., Nagel, W. E., Perminov, M., Shelepugin, P., Skadron, K., Stratton, J., Titov, A., Wang, K., Van Waveren, M., Whitney, B., Wienke, S., Xu, R. & Kumaran, K., Jan 1 2015, High Performance Computing Systems: Performance Modeling, Benchmarking, and Simulation - 5th International Workshop, PMBS 2014, Revised Selected Papers. Hammond, S. D., Jarvis, S. A. & Wright, S. A. (eds.). Springer-Verlag, p. 46-67 22 p. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); vol. 8966).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution