Wen-Mei W Hwu

Research Output

2017

RAI: A scalable project submission system for parallel programming courses

Dakkak, A., Pearson, C., Li, C. & Hwu, W. M., Jun 30 2017, Proceedings - 2017 IEEE 31st International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2017. Institute of Electrical and Electronics Engineers Inc., p. 315-322 8 p. 7965062. (Proceedings - 2017 IEEE 31st International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2017).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Rebooting the data access hierarchy of computing systems

Hwu, W-M. W., Hajj, I. E., De Gonzalo, S. G., Pearson, C., Kim, N. S., Chen, D., Xiong, J. & Sura, Z., Nov 28 2017, 2017 IEEE International Conference on Rebooting Computing, ICRC 2017 - Proceedings. Institute of Electrical and Electronics Engineers Inc., p. 1-4 4 p. (2017 IEEE International Conference on Rebooting Computing, ICRC 2017 - Proceedings; vol. 2017-January).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Scalable parallel DBIM solutions of inverse-scattering problems

Hidayetoğlu, M., Pearson, C., Gurel, L., Hwu, W. M. & Chew, W. C., Jul 25 2017, CEM 2017 - 2017 Computing and Electromagnetics International Workshop. Gurel, L. (ed.). Institute of Electrical and Electronics Engineers Inc., p. 65-66 2 p. 7991889. (CEM 2017 - 2017 Computing and Electromagnetics International Workshop).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Thoughts on massively-parallel heterogeneous computing for solving large problems

Hwu, W. M., Hidayetoğlu, M., Chew, W. C., Pearson, C., Garcia, S., Huang, S. & Dakkak, A., Jul 25 2017, CEM 2017 - 2017 Computing and Electromagnetics International Workshop. Gurel, L. (ed.). Institute of Electrical and Electronics Engineers Inc., p. 67-68 2 p. 7991890. (CEM 2017 - 2017 Computing and Electromagnetics International Workshop).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

2016

Acceleration of the Pair-HMM Algorithm for DNA Variant Calling

Manikandan, G. J., Huang, S., Rupnow, K., Hwu, W. M. W. & Chen, D., Aug 16 2016, Proceedings - 24th IEEE International Symposium on Field-Programmable Custom Computing Machines, FCCM 2016. Institute of Electrical and Electronics Engineers Inc., 1 p. 7544765. (Proceedings - 24th IEEE International Symposium on Field-Programmable Custom Computing Machines, FCCM 2016).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

A programming system for future proofing performance critical libraries

Chang, L. W., El Hajj, I., Kim, H. S., Gómez-Luna, J., Dakkak, A. & Hwu, W-M. W., Feb 27 2016, 21st ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP 2016 - Proceedings. Association for Computing Machinery, 32. (Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPOPP; vol. 12-16-March-2016).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Architecture

Connors, D. A. & Hwu, W. M. W., Apr 19 2016, The VLSI Handbook: Second Edition. CRC Press, p. 66.1-66.23

Research output: Chapter in Book/Report/Conference proceeding › Chapter

BLESS 2: Accurate, memory-efficient and fast error correction method

Heo, Y., Ramachandran, A., Hwu, W. M., Ma, J. & Chen, D., Aug 1 2016, In: Bioinformatics. 32, 15, p. 2369-2371 3 p.

Research output: Contribution to journal › Article

Compiler Technology

Chung, W. H. J., Lyu, Y. H., Sung, I. J. R., Lee, Y. W. & Hwu, W. M. W., 2016, Heterogeneous System Architecture: A New Compute Platform Infrastructure. Elsevier Inc., p. 97-129 33 p.

Research output: Chapter in Book/Report/Conference proceeding › Chapter

CUDA application development

Hwu, W. M., May 20 2016, 2008 IEEE Hot Chips 20 Symposium, HCS 2008. Institute of Electrical and Electronics Engineers Inc., 7476522. (2008 IEEE Hot Chips 20 Symposium, HCS 2008).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Design of a power-efficient ARM processor with a timing-error detection and correction mechanism

Chen, S. J., Liu, G., Yang, H. P., Luo, C. H. & Hwu, W. M., Jul 2 2016, Proceedings - 29th IEEE International System on Chip Conference, SOCC 2016. Bhatia, K., Alioto, M., Zhao, D., Marshall, A. & Sridhar, R. (eds.). IEEE Computer Society, p. 217-222 6 p. 7905471. (International System on Chip Conference; vol. 0).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

DySel: Lightweight dynamic selection for kernel-based data-parallel programming model

Chang, L. W., Kim, H. S. & Hwu, W. M., Mar 25 2016, ASPLOS 2016 - 21st International Conference on Architectural Support for Programming Languages and Operating Systems. Association for Computing Machinery, p. 667-680 14 p. (International Conference on Architectural Support for Programming Languages and Operating Systems - ASPLOS; vol. 02-06-April-2016).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Efficient and scalable workflows for genomic analyses

Banerjee, S. S., Athreya, A. P., Mainzer, L. S., Jongeneel, C. V., Hwu, W. M., Kalbarczyk, Z. T. & Iyer, R. K., Jun 1 2016, DIDC 2016 - Proceedings of the ACM International Workshop on Data-Intensive Distributed Computing. Association for Computing Machinery, Inc, p. 27-36 10 p. (DIDC 2016 - Proceedings of the ACM International Workshop on Data-Intensive Distributed Computing).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Efficient kernel synthesis for performance portable programming

Chang, L. W., Hajj, I. E., Rodrigues, C., Gomez-Luna, J. & Hwu, W. M., Dec 14 2016, MICRO 2016 - 49th Annual IEEE/ACM International Symposium on Microarchitecture. IEEE Computer Society, 7783715. (Proceedings of the Annual International Symposium on Microarchitecture, MICRO; vol. 2016-December).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

FCUDA-HB: Hierarchical and Scalable Bus Architecture Generation on FPGAs With the FCUDA Flow

Chen, Y., Nguyen, T., Chen, Y., Gurumani, S. T., Liang, Y., Rupnow, K., Cong, J., Hwu, W. M. & Chen, D., Dec 2016, In: IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems. 35, 12, p. 2032-2045 14 p., 7450674.

Research output: Contribution to journal › Article

HPS papers: A retrospective

Patt, Y. N., Hwu, W. M. W., Melvin, S. W. & Shebanow, M. C., Jan 1 2016, In: IEEE Micro. 36, 4, p. 76-79 4 p., 7542473.

Research output: Contribution to journal › Article

In-Place Matrix Transposition on GPUs

Gomez-Luna, J., Sung, I. J., Chang, L. W., Gonzalez-Linares, J. M., Guil, N. & Hwu, W. M. W., Mar 1 2016, In: IEEE Transactions on Parallel and Distributed Systems. 27, 3, p. 776-788 13 p., 7059219.

Research output: Contribution to journal › Article

Introduction

Hwu, W. M. W., Jan 1 2016, Heterogeneous System Architecture: A New Compute Platform Infrastructure. Elsevier Inc., p. 1-5 5 p.

Research output: Chapter in Book/Report/Conference proceeding › Chapter

KLAP: Kernel launch aggregation and promotion for optimizing dynamic parallelism

Hajj, I. E., Gomez-Luna, J., Li, C., Chang, L. W., Milojicic, D. & Hwu, W. M., Dec 14 2016, MICRO 2016 - 49th Annual IEEE/ACM International Symposium on Microarchitecture. IEEE Computer Society, 7783716. (Proceedings of the Annual International Symposium on Microarchitecture, MICRO; vol. 2016-December).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Parallel solutions of inverse multiple scattering problems with Born-type fast solvers

Hidayetoğlu, M., Yang, C., Wang, L., Podkowa, A., Oelze, M., Hwu, W. M. & Chew, W. C., Nov 3 2016, 2016 Progress In Electromagnetics Research Symposium, PIERS 2016 - Proceedings. Institute of Electrical and Electronics Engineers Inc., p. 916-920 5 p. 7734520. (2016 Progress In Electromagnetics Research Symposium, PIERS 2016 - Proceedings).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Performance insights on executing non-graphics applications on CUDA on the NVIDIA GeForce 8800 GTX

Hwu, W. M., Kirk, D., Ryoo, S., Rodrigues, C., Stratton, J. & Huang, K., May 31 2016, 2007 IEEE Hot Chips 19 Symposium, HCS 2007. Institute of Electrical and Electronics Engineers Inc., 7482492. (2007 IEEE Hot Chips 19 Symposium, HCS 2007).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Preface

Kirk, D. B. & Hwu, W-M. W., Dec 7 2016, Programming Massively Parallel Processors: A Hands-on Approach: Third Edition. Elsevier Inc., p. xv-xx

Research output: Chapter in Book/Report/Conference proceeding › Foreword/postscript

Programming Massively Parallel Processors: A Hands-on Approach: Third Edition

Kirk, D. B. & Hwu, W-M. W., Dec 7 2016, Elsevier Inc. 550 p.

Research output: Book/Report/Conference proceeding › Book

SpaceJMP: Programming with multiple virtual address spaces

El Hajj, I., Merritt, A., Zellweger, G., Milojicic, D., Achermann, R., Faraboschi, P., Hwu, W. M., Roscoe, T. & Schwan, K., Mar 25 2016, ASPLOS 2016 - 21st International Conference on Architectural Support for Programming Languages and Operating Systems. Association for Computing Machinery, p. 353-368 16 p. (International Conference on Architectural Support for Programming Languages and Operating Systems - ASPLOS; vol. 02-06-April-2016).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

WebGPU: A scalable online development platform for GPU programming courses

Dakkak, A., Pearson, C. & Hwu, W. M., Jul 18 2016, Proceedings - 2016 IEEE 30th International Parallel and Distributed Processing Symposium, IPDPS 2016. Institute of Electrical and Electronics Engineers Inc., p. 942-949 8 p. 7529962. (Proceedings - 2016 IEEE 30th International Parallel and Distributed Processing Symposium, IPDPS 2016).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

2015

Adaptive Cache Management for Energy-Efficient GPU Computing

Chen, X., Chang, L. W., Rodrigues, C. I., Lv, J., Wang, Z. & Hwu, W. M., Jan 15 2015, Proceedings - 47th Annual IEEE/ACM International Symposium on Microarchitecture, MICRO 2014. January ed. IEEE Computer Society, p. 343-355 13 p. 7011400. (Proceedings of the Annual International Symposium on Microarchitecture, MICRO; vol. 2015-January, no. January).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Automatic parallelization of kernels in shared-memory multi-GPU nodes

Cabezas, J., Vilanova, L., Gelado, I., Jablin, T. B., Navarro, N. & Hwu, W. M. W., Jun 8 2015, ICS 2015 - Proceedings of the 29th ACM International Conference on Supercomputing. Association for Computing Machinery, p. 3-13 11 p. (Proceedings of the International Conference on Supercomputing; vol. 2015-June).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Enhancing the Usability and Utilization of Accelerated Architectures via Docker

Haydel, N., Gesing, S., Taylor, I., Madey, G., Dakkak, A., De Gonzalo, S. G. & Hwu, W. M. W., Jan 1 2015, Proceedings - 2015 IEEE/ACM 8th International Conference on Utility and Cloud Computing, UCC 2015. Rana, O., Buyya, R. & Raicu, I. (eds.). Institute of Electrical and Electronics Engineers Inc., p. 361-367 7 p. 7431432. (Proceedings - 2015 IEEE/ACM 8th International Conference on Utility and Cloud Computing, UCC 2015).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

FPGA accelerated DNA error correction

Ramachandran, A., Heo, Y., Hwu, W-M. W., Ma, J. & Chen, D., Apr 22 2015, Proceedings of the 2015 Design, Automation and Test in Europe Conference and Exhibition, DATE 2015. Institute of Electrical and Electronics Engineers Inc., Vol. 2015-April. p. 1371-1376 6 p. 7092605

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

GPU-SM: Shared memory multi-GPU programming

Cabezas, J., Jordà, M., Gelado, I., Navarro, N. & Hwu, W-M. W., Feb 7 2015, ACM International Conference Proceeding Series. Gong, X. (ed.). Association for Computing Machinery, p. 13-24 12 p. (ACM International Conference Proceeding Series; vol. 2015-February).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Heterogeneous System Architecture: A New Compute Platform Infrastructure

Hwu, W-M. W., Dec 4 2015, Elsevier Inc. 189 p.

Research output: Book/Report/Conference proceeding › Book

In-place data sliding algorithms for many-core architectures

Luna, J. G., Chang, L. W., Sung, I. J., Hwu, W-M. W. & Guil, N., Dec 8 2015, Proceedings - 2015 44th International Annual Conference on Parallel Processing, ICPP 2015. Institute of Electrical and Electronics Engineers Inc., p. 210-219 10 p. 7349576. (Proceedings of the International Conference on Parallel Processing; vol. 2015-December).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Locality-centric thread scheduling for bulk-synchronous programming models on CPU architectures

Kim, H. S., Hajj, I. E., Stratton, J., Lumetta, S. & Hwu, W. M., Mar 3 2015, Proceedings of the 2015 IEEE/ACM International Symposium on Code Generation and Optimization, CGO 2015. Institute of Electrical and Electronics Engineers Inc., p. 257-268 12 p. 7054205. (Proceedings of the 2015 IEEE/ACM International Symposium on Code Generation and Optimization, CGO 2015).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Mapping high-level programming languages to OpenCL 2.0: A compiler writer's perspective

Sung, I. J., Chung, W. H., Lee, Y. W. & Hwu, W. M., May 18 2015, Heterogeneous Computing with OpenCL 2.0: Third Edition. Elsevier Inc., p. 249-272 24 p.

Research output: Chapter in Book/Report/Conference proceeding › Chapter

Optimized Data Transfers Based on the OpenCL Event Management Mechanism

Takizawa, H., Hirasawa, S., Sugawara, M., Gelado, I., Kobayashi, H. & Hwu, W. M. W., Jan 1 2015, In: Scientific Programming. 2015, 576498.

Research output: Contribution to journal › Article

Runtime and Architecture Support for Efficient Data Exchange in Multi-Accelerator Applications

Cabezas, J., Gelado, I., Stone, J. E., Navarro, N., Kirk, D. B. & Hwu, W. M., May 1 2015, In: IEEE Transactions on Parallel and Distributed Systems. 26, 5, p. 1405-1418 14 p., 6803940.

Research output: Contribution to journal › Article

SPEC ACCEL: A standard application suite for measuring hardware accelerator performance

Juckeland, G., Brantley, W., Chandrasekaran, S., Chapman, B., Che, S., Colgrove, M., Feng, H., Grund, A., Henschel, R., Hwu, W. M. W., Li, H., Müller, M. S., Nagel, W. E., Perminov, M., Shelepugin, P., Skadron, K., Stratton, J., Titov, A., Wang, K., Van Waveren, M., Whitney, B., Wienke, S., Xu, R. & Kumaran, K., Jan 1 2015, High Performance Computing Systems: Performance Modeling, Benchmarking, and Simulation - 5th International Workshop, PMBS 2014, Revised Selected Papers. Hammond, S. D., Jarvis, S. A. & Wright, S. A. (eds.). Springer-Verlag, p. 46-67 22 p. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); vol. 8966).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Transitioning HPC software to exascale heterogeneous computing

Hwu, W-M. W., Chang, L. W., Kim, H. S., Dakkak, A. & El Hajj, I., Sep 2 2015, 2015 Computational Electromagnetics International Workshop, CEM 2015. Institute of Electrical and Electronics Engineers Inc., p. 4-5 2 p. 7237412. (2015 Computational Electromagnetics International Workshop, CEM 2015).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

2014

Adaptive cache bypass and insertion for many-core accelerators

Chen, X., Wu, S., Chang, L. W., Huang, W. S., Pearson, C., Wang, Z. & Hwu, W. M. W., Jan 1 2014, 2nd ACM International Workshop on Many-Core Embedded Systems, MES 2014 - In Conjunction with the 41st International Symposium on Computer Architecture, ISCA 2014. Association for Computing Machinery, p. 1-8 8 p. (ACM International Conference Proceeding Series).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

A guide for implementing tridiagonal solvers on GPUs

Chang, L. W. & Hwu, W. M. W., Jan 1 2014, Numerical Computations with GPUs. Springer International Publishing, p. 29-44 16 p.

Research output: Chapter in Book/Report/Conference proceeding › Chapter

Automatic execution of single-GPU computations across multiple GPUs

Cabezas, J., Vilanova, L., Gelado, I., Jablin, T. B., Navarro, N. & Hwu, W-M. W., Jan 1 2014, PACT 2014 - Proceedings of the 23rd International Conference on Parallel Architectures and Compilation Techniques. Institute of Electrical and Electronics Engineers Inc., p. 467-468 2 p. (Parallel Architectures and Compilation Techniques - Conference Proceedings, PACT).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

BLESS: Bloom filter-based error correction solution for high-throughput sequencing reads

Heo, Y., Wu, X. L., Chen, D., Ma, J. & Hwu, W. M., May 15 2014, In: Bioinformatics. 30, 10, p. 1354-1362 9 p.

Research output: Contribution to journal › Article

In-place transposition of rectangular matrices on accelerators

Sung, I. J., Gómez-Luna, J., González-Linares, J. M., Guil, N. & Hwu, W-M. W., Mar 10 2014, PPoPP 2014 - Proceedings of the 2014 ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming. p. 207-218 12 p. (Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPOPP).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

In-place transposition of rectangular matrices on accelerators

Sung, I. J., Gómez-Luna, J., González-Linares, J. M., Guil, N. & Hwu, W-M. W., Aug 2014, In: ACM SIGPLAN Notices. 49, 8, p. 207-218 12 p.

Research output: Contribution to journal › Article

Triolet: A programming system that unifies algorithmic skeleton interfaces for high-performance cluster computing

Rodrigues, C., Jablin, T., Dakkak, A. & Hwu, W. M., Mar 10 2014, PPoPP 2014 - Proceedings of the 2014 ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming. p. 247-258 12 p. (Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPOPP).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Triolet: A programming system that unifies algorithmic skeleton interfaces for high-performance cluster computing

Rodrigues, C., Jablin, T., Dakkak, A. & Hwu, W. M., Aug 2014, In: ACM SIGPLAN Notices. 49, 8, p. 247-258 12 p.

Research output: Contribution to journal › Article

What is ahead for parallel computing

Hwu, W-M. W., Jul 2014, In: Journal of Parallel and Distributed Computing. 74, 7, p. 2574-2581 8 p.

Research output: Contribution to journal › Article

2013

ClMPI: An OpenCL extension for interoperation with the message passing interface

Takizawa, H., Sugawara, M., Hirasawa, S., Gelado, I., Kobayashi, H. & Hwu, W. M. W., Jan 1 2013, Proceedings - IEEE 27th International Parallel and Distributed Processing Symposium Workshops and PhD Forum, IPDPSW 2013. IEEE Computer Society, p. 1138-1148 11 p. 6651000. (Proceedings - IEEE 27th International Parallel and Distributed Processing Symposium Workshops and PhD Forum, IPDPSW 2013).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Comparison based sorting for systems with multiple GPUs

Tanasic, I., Vilanova, L., Jordà, M., Cabezas, J., Gelado, I., Navarro, N. & Hwu, W. M., Apr 15 2013, Proceedings of the 6th Workshop on General Purpose Processor Using Graphics Processing Units, GPGPU 2013. p. 1-11 11 p. (ACM International Conference Proceeding Series).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution