Wen-Mei W Hwu

1984 …2019

Research Output

2016

Architecture

Connors, D. A. & Hwu, W-M. W., Apr 19 2016, The VLSI Handbook: Second Edition. CRC Press, p. 66.1-66.23

Research output: Chapter in Book/Report/Conference proceeding › Chapter

BLESS 2: Accurate, memory-efficient and fast error correction method

Heo, Y., Ramachandran, A., Hwu, W. M., Ma, J. & Chen, D., Aug 1 2016, In : Bioinformatics. 32, 15, p. 2369-2371 3 p.

Research output: Contribution to journal › Article

CUDA application development

Hwu, W. M., May 20 2016, 2008 IEEE Hot Chips 20 Symposium, HCS 2008. Institute of Electrical and Electronics Engineers Inc., 7476522. (2008 IEEE Hot Chips 20 Symposium, HCS 2008).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Design of a power-efficient ARM processor with a timing-error detection and correction mechanism

Chen, S. J., Liu, G., Yang, H. P., Luo, C. H. & Hwu, W-M. W., Jul 2 2016, Proceedings - 29th IEEE International System on Chip Conference, SOCC 2016. Bhatia, K., Alioto, M., Zhao, D., Marshall, A. & Sridhar, R. (eds.). IEEE Computer Society, p. 217-222 6 p. 7905471. (International System on Chip Conference; vol. 0).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

DySel: Lightweight dynamic selection for kernel-based data-parallel programming model

Chang, L. W., Kim, H. S. & Hwu, W. M., Mar 25 2016, ASPLOS 2016 - 21st International Conference on Architectural Support for Programming Languages and Operating Systems. Association for Computing Machinery, p. 667-680 14 p. (International Conference on Architectural Support for Programming Languages and Operating Systems - ASPLOS; vol. 02-06-April-2016).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Efficient and scalable workflows for genomic analyses

Banerjee, S. S., Athreya, A. P., Mainzer, L. S., Jongeneel, C., Hwu, W-M. W., Kalbarczyk, Z. T. & Iyer, R. K., Jun 1 2016, DIDC 2016 - Proceedings of the ACM International Workshop on Data-Intensive Distributed Computing. Association for Computing Machinery, Inc, p. 27-36 10 p. (DIDC 2016 - Proceedings of the ACM International Workshop on Data-Intensive Distributed Computing).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

FCUDA-HB: Hierarchical and Scalable Bus Architecture Generation on FPGAs With the FCUDA Flow

Chen, Y., Nguyen, T., Chen, Y., Gurumani, S. T., Liang, Y., Rupnow, K., Cong, J., Hwu, W. M. & Chen, D., Dec 2016, In : IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems. 35, 12, p. 2032-2045 14 p., 7450674.

Research output: Contribution to journal › Article

HPS papers: A retrospective

Patt, Y. N., Hwu, W. M. W., Melvin, S. W. & Shebanow, M. C., Jan 1 2016, In : IEEE Micro. 36, 4, p. 76-79 4 p., 7542473.

Research output: Contribution to journal › Article

In-Place Matrix Transposition on GPUs

Gomez-Luna, J., Sung, I. J., Chang, L. W., Gonzalez-Linares, J. M., Guil, N. & Hwu, W. M. W., Mar 1 2016, In : IEEE Transactions on Parallel and Distributed Systems. 27, 3, p. 776-788 13 p., 7059219.

Research output: Contribution to journal › Article

KLAP: Kernel launch aggregation and promotion for optimizing dynamic parallelism

Hajj, I. E., Gomez-Luna, J., Li, C., Chang, L. W., Milojicic, D. & Hwu, W. M., Dec 14 2016, MICRO 2016 - 49th Annual IEEE/ACM International Symposium on Microarchitecture. IEEE Computer Society, 7783716. (Proceedings of the Annual International Symposium on Microarchitecture, MICRO; vol. 2016-December).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Parallel solutions of inverse multiple scattering problems with Born-type fast solvers

Hidayetoğlu, M., Yang, C., Wang, L., Podkowa, A., Oelze, M., Hwu, W. M. & Chew, W. C., Nov 3 2016, 2016 Progress In Electromagnetics Research Symposium, PIERS 2016 - Proceedings. Institute of Electrical and Electronics Engineers Inc., p. 916-920 5 p. 7734520. (2016 Progress In Electromagnetics Research Symposium, PIERS 2016 - Proceedings).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Performance insights on executing non-graphics applications on CUDA on the NVIDIA GeForce 8800 GTX

Hwu, W. M., Kirk, D., Ryoo, S., Rodrigues, C., Stratton, J. & Huang, K., May 31 2016, 2007 IEEE Hot Chips 19 Symposium, HCS 2007. Institute of Electrical and Electronics Engineers Inc., 7482492. (2007 IEEE Hot Chips 19 Symposium, HCS 2007).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Preface

Kirk, D. B. & Hwu, W-M. W., Dec 7 2016, Programming Massively Parallel Processors: A Hands-on Approach: Third Edition. Elsevier Inc., p. xv-xx

Research output: Chapter in Book/Report/Conference proceeding › Foreword/postscript

Programming Massively Parallel Processors: A Hands-on Approach: Third Edition

Kirk, D. B. & Hwu, W-M. W., Dec 7 2016, Elsevier Inc. 550 p.

Research output: Book/Report › Book

SpaceJMP: Programming with multiple virtual address spaces

El Hajj, I., Merritt, A., Zellweger, G., Milojicic, D., Achermann, R., Faraboschi, P., Hwu, W. M., Roscoe, T. & Schwan, K., Mar 25 2016, ASPLOS 2016 - 21st International Conference on Architectural Support for Programming Languages and Operating Systems. Association for Computing Machinery, p. 353-368 16 p. (International Conference on Architectural Support for Programming Languages and Operating Systems - ASPLOS; vol. 02-06-April-2016).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

WebGPU: A scalable online development platform for GPU programming courses

Dakkak, A., Pearson, C. & Hwu, W. M., Jul 18 2016, Proceedings - 2016 IEEE 30th International Parallel and Distributed Processing Symposium, IPDPS 2016. Institute of Electrical and Electronics Engineers Inc., p. 942-949 8 p. 7529962. (Proceedings - 2016 IEEE 30th International Parallel and Distributed Processing Symposium, IPDPS 2016).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

2015

Automatic parallelization of kernels in shared-memory multi-GPU nodes

Cabezas, J., Vilanova, L., Gelado, I., Jablin, T. B., Navarro, N. & Hwu, W. M. W., Jun 8 2015, ICS 2015 - Proceedings of the 29th ACM International Conference on Supercomputing. Association for Computing Machinery, p. 3-13 11 p. (Proceedings of the International Conference on Supercomputing; vol. 2015-June).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Compiler Technology

Chung, W. H. J., Lyu, Y. H., Sung, I. J. R., Lee, Y. W. & Hwu, W-M. W., Dec 4 2015, Heterogeneous System Architecture: A New Compute Platform Infrastructure. Elsevier Inc., p. 97-129 33 p.

Research output: Chapter in Book/Report/Conference proceeding › Chapter

Enhancing the Usability and Utilization of Accelerated Architectures via Docker

Haydel, N., Gesing, S., Taylor, I., Madey, G., Dakkak, A., De Gonzalo, S. G. & Hwu, W. M. W., Jan 1 2015, Proceedings - 2015 IEEE/ACM 8th International Conference on Utility and Cloud Computing, UCC 2015. Rana, O., Buyya, R. & Raicu, I. (eds.). Institute of Electrical and Electronics Engineers Inc., p. 361-367 7 p. 7431432. (Proceedings - 2015 IEEE/ACM 8th International Conference on Utility and Cloud Computing, UCC 2015).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

FPGA accelerated DNA error correction

Ramachandran, A., Heo, Y., Hwu, W-M. W., Ma, J. & Chen, D., Apr 22 2015, Proceedings of the 2015 Design, Automation and Test in Europe Conference and Exhibition, DATE 2015. Institute of Electrical and Electronics Engineers Inc., Vol. 2015-April. p. 1371-1376 6 p. 7092605

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

GPU-SM: Shared memory multi-GPU programming

Cabezas, J., Jordà, M., Gelado, I., Navarro, N. & Hwu, W-M. W., Feb 7 2015, ACM International Conference Proceeding Series. Gong, X. (ed.). Association for Computing Machinery, p. 13-24 12 p. (ACM International Conference Proceeding Series; vol. 2015-February).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

In-place data sliding algorithms for many-core architectures

Luna, J. G., Chang, L. W., Sung, I. J., Hwu, W-M. W. & Guil, N., Dec 8 2015, Proceedings - 2015 44th International Annual Conference on Parallel Processing, ICPP 2015. Institute of Electrical and Electronics Engineers Inc., p. 210-219 10 p. 7349576. (Proceedings of the International Conference on Parallel Processing; vol. 2015-December).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Locality-centric thread scheduling for bulk-synchronous programming models on CPU architectures

Kim, H. S., Hajj, I. E., Stratton, J., Lumetta, S. S. & Hwu, W-M. W., Mar 3 2015, Proceedings of the 2015 IEEE/ACM International Symposium on Code Generation and Optimization, CGO 2015. Institute of Electrical and Electronics Engineers Inc., p. 257-268 12 p. 7054205. (Proceedings of the 2015 IEEE/ACM International Symposium on Code Generation and Optimization, CGO 2015).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Mapping high-level programming languages to OpenCL 2.0: A compiler writer's perspective

Sung, I. J., Chung, W. H., Lee, Y. W. & Hwu, W. M., May 18 2015, Heterogeneous Computing with OpenCL 2.0: Third Edition. Elsevier Inc., p. 249-272 24 p.

Research output: Chapter in Book/Report/Conference proceeding › Chapter

Optimized Data Transfers Based on the OpenCL Event Management Mechanism

Takizawa, H., Hirasawa, S., Sugawara, M., Gelado, I., Kobayashi, H. & Hwu, W. M. W., Jan 1 2015, In : Scientific Programming. 2015, 576498.

Research output: Contribution to journal › Article

Runtime and Architecture Support for Efficient Data Exchange in Multi-Accelerator Applications

Cabezas, J., Gelado, I., Stone, J. E., Navarro, N., Kirk, D. B. & Hwu, W. M., May 1 2015, In : IEEE Transactions on Parallel and Distributed Systems. 26, 5, p. 1405-1418 14 p., 6803940.

Research output: Contribution to journal › Article

SPEC ACCEL: A standard application suite for measuring hardware accelerator performance

Juckeland, G., Brantley, W., Chandrasekaran, S., Chapman, B., Che, S., Colgrove, M., Feng, H., Grund, A., Henschel, R., Hwu, W. M. W., Li, H., Müller, M. S., Nagel, W. E., Perminov, M., Shelepugin, P., Skadron, K., Stratton, J., Titov, A., Wang, K., Van Waveren, M., Whitney, B., Wienke, S., Xu, R. & Kumaran, K., Jan 1 2015, High Performance Computing Systems: Performance Modeling, Benchmarking, and Simulation - 5th International Workshop, PMBS 2014, Revised Selected Papers. Hammond, S. D., Jarvis, S. A. & Wright, S. A. (eds.). Springer-Verlag, p. 46-67 22 p. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); vol. 8966).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Transitioning HPC software to exascale heterogeneous computing

Hwu, W-M. W., Chang, L. W., Kim, H. S., Dakkak, A. & El Hajj, I., Sep 2 2015, 2015 Computational Electromagnetics International Workshop, CEM 2015. Institute of Electrical and Electronics Engineers Inc., p. 4-5 2 p. 7237412. (2015 Computational Electromagnetics International Workshop, CEM 2015).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

2014

Adaptive cache bypass and insertion for many-core accelerators

Chen, X., Wu, S., Chang, L. W., Huang, W. S., Pearson, C., Wang, Z. & Hwu, W. M. W., Jan 1 2014, 2nd ACM International Workshop on Many-Core Embedded Systems, MES 2014 - In Conjunction with the 41st International Symposium on Computer Architecture, ISCA 2014. Association for Computing Machinery, p. 1-8 8 p. (ACM International Conference Proceeding Series).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

BLESS: Bloom filter-based error correction solution for high-throughput sequencing reads

Heo, Y., Wu, X. L., Chen, D., Ma, J. & Hwu, W. M., May 15 2014, In : Bioinformatics. 30, 10, p. 1354-1362 9 p.

Research output: Contribution to journal › Article

In-place transposition of rectangular matrices on accelerators

Sung, I. J., Gómez-Luna, J., González-Linares, J. M., Guil, N. & Hwu, W-M. W., Mar 10 2014, PPoPP 2014 - Proceedings of the 2014 ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming. p. 207-218 12 p. (Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPOPP).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

2013

ClMPI: An OpenCL extension for interoperation with the Message Passing Interface

Takizawa, H., Sugawara, M., Hirasawa, S., Gelado, I., Kobayashi, H. & Hwu, W. M. W., Jan 1 2013, Proceedings - IEEE 27th International Parallel and Distributed Processing Symposium Workshops and PhD Forum, IPDPSW 2013. IEEE Computer Society, p. 1138-1148 11 p. 6651000. (Proceedings - IEEE 27th International Parallel and Distributed Processing Symposium Workshops and PhD Forum, IPDPSW 2013).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Comparison based sorting for systems with multiple GPUs

Tanasic, I., Vilanova, L., Jordà, M., Cabezas, J., Gelado, I., Navarro, N. & Hwu, W. M., Apr 15 2013, Proceedings of the 6th Workshop on General Purpose Processor Using Graphics Processing Units, GPGPU 2013. p. 1-11 11 p. (ACM International Conference Proceeding Series).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Efficient compilation of CUDA kernels for high-performance computing on FPGAs

Papakonstantinou, A., Gururaj, K., Stratton, J. A., Chen, D., Cong, J. & Hwu, W-M. W., Oct 21 2013, In : Transactions on Embedded Computing Systems. 13, 2, 25.

Research output: Contribution to journal › Article

More IMPATIENT: A gridding-accelerated Toeplitz-based strategy for non-Cartesian high-resolution 3D MRI on GPUs

Gai, J., Obeid, N., Holtrop, J. L., Wu, X. L., Lam, F., Fu, M., Haldar, J. P., Hwu, W. M. W., Liang, Z. P. & Sutton, B. P., May 2013, In : Journal of Parallel and Distributed Computing. 73, 5, p. 686-697 12 p.

Research output: Contribution to journal › Article

Programming massively parallel processors: A hands-on approach, second edition

Kirk, D. B. & Hwu, W-M. W., Jan 1 2013, Elsevier Science. 496 p.

Research output: Book/Report › Book

Real-time in vivo computed optical interferometric tomography

Ahmad, A., Shemonski, N. D., Adie, S. G., Kim, H. S., Hwu, W. M. W., Carney, P. S. & Boppart, S. A., Jun 1 2013, In : Nature Photonics. 7, 6, p. 444-448 5 p.

Research output: Contribution to journal › Article

Scalable SIMD-parallel memory allocation for many-core machines

Huang, X., Rodrigues, C. I., Jones, S., Buck, I. & Hwu, W-M. W., Jun 1 2013, In : Journal of Supercomputing. 64, 3, p. 1008-1020 13 p.

Research output: Contribution to journal › Article