Wen-Mei W Hwu

1984 …2019

Research Output

A scalable tridiagonal solver for GPUs

Kim, H. S., Wu, S., Chang, L. W. & Hwu, W-M. W., Nov 7 2011, Proceedings - 2011 International Conference on Parallel Processing, ICPP 2011. p. 444-453 10 p. 6047212. (Proceedings of the International Conference on Parallel Processing).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

A software based approach to achieving optimal performance for signature control flow checking

Warter, N. J. & Hwu, W. M. W., Dec 1 1990, Digest of Papers - FTCS (Fault-Tolerant Computing Symposium). Publ by IEEE, p. 442-449 8 p. (Digest of Papers - FTCS (Fault-Tolerant Computing Symposium)).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

A study of code reuse and sharing characteristics of Java applications

Conte, M. T., Trick, A. R., Gyllenhaal, J. C. & Hwu, W-M. W., Jan 1 1998, Workload Characterization: Methodology and Case Studies - Based on the 1st Workshop on Workload Characterization. Maynard, A. M. G. & John, L. K. (eds.). Institute of Electrical and Electronics Engineers Inc., p. 27-35 9 p. 809356. (Workload Characterization: Methodology and Case Studies - Based on the 1st Workshop on Workload Characterization; vol. 1998-November).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

A study of the energy saving and capacity improvement potential of power control in multi-hop wireless networks

Monks, J. P., Ebert, J. P., Wolisz, A. A. & Hwu, W-M. W., Jan 1 2001, In : Conference on Local Computer Networks. p. 550-559 10 p., 72.

Research output: Contribution to journal › Article

A tiling-scheme Viterbi decoder in Software Defined Radio for GPUs

Lin, C. S., Liu, W. L., Yeh, W. T., Chang, L. W., Hwu, W-M. W., Chen, S. J. & Hsiung, P. A., Oct 31 2011, 7th International Conference on Wireless Communications, Networking and Mobile Computing, WiCOM 2011. 6036680. (7th International Conference on Wireless Communications, Networking and Mobile Computing, WiCOM 2011).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Automatic discovery of coarse-grained parallelism in media applications

Ryoo, S., Ueng, S. Z., Rodrigues, C. I., Kidd, R. E., Frank, M. I. & Hwu, W. M. W., Dec 1 2007, Transactions on High-Performance Embedded Architectures and Compilers I. p. 194-213 20 p. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); vol. 4050 LNCS).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Automatic Generation of Warp-Level Primitives and Atomic Instructions for Fast and Portable Parallel Reduction on GPUs

Gonzalo, S. G. D., Huang, S., Gomez-Luna, J., Hammond, S., Mutlu, O. & Hwu, W. M., Mar 5 2019, CGO 2019 - Proceedings of the 2019 IEEE/ACM International Symposium on Code Generation and Optimization. Moseley, T., Jimborean, A. & Kandemir, M. T. (eds.). Institute of Electrical and Electronics Engineers Inc., p. 73-84 12 p. 8661187. (CGO 2019 - Proceedings of the 2019 IEEE/ACM International Symposium on Code Generation and Optimization).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Automatic parallelization of kernels in shared-memory multi-GPU nodes

Cabezas, J., Vilanova, L., Gelado, I., Jablin, T. B., Navarro, N. & Hwu, W. M. W., Jun 8 2015, ICS 2015 - Proceedings of the 29th ACM International Conference on Supercomputing. Association for Computing Machinery, p. 3-13 11 p. (Proceedings of the International Conference on Supercomputing; vol. 2015-June).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Beating in-order stalls with "flea-flicker" two-pass pipelining

Barnes, R. D., Patel, S. J., Nystrom, E. M., Navarro, N., Sias, J. W. & Hwu, W-M. W., Jan 1 2003, Proceedings - 36th International Symposium on Microarchitecture, MICRO 2003. IEEE Computer Society, p. 387-398 12 p. 1253243. (Proceedings of the Annual International Symposium on Microarchitecture, MICRO; vol. 2003-January).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Beating in-order stalls with "Flea-Flicker" two-pass pipelining

Barnes, R. D., Sias, J. W., Nystrom, E. M., Patel, S. J., Navarro, J. & Hwu, W. M. W., Jan 1 2006, In : IEEE Transactions on Computers. 55, 1, p. 18-33 16 p.

Research output: Contribution to journal › Article

Benchmark characterization

Conte, T. M. & Hwu, W-M. W., Jan 1 1991, In : Proceedings of the Annual Hawaii International Conference on System Sciences. 1, p. 364-372 9 p., 183907.

Research output: Contribution to journal › Conference article

Benchmark characterization for experimental system evaluation

Conte, T. M. & Hwu, W. M. W., Jan 1 1990, Proceedings of the Hawaii International Conference on System Science. Hoevel, L. W., Shriver, B. D., Nunamaker, J. F. J., Sprague, R. H. J. & Milutinovic, V. (eds.). Publ by Western Periodicals Co, p. 6-18 13 p. (Proceedings of the Hawaii International Conference on System Science; vol. 1).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

BLESS: Bloom filter-based error correction solution for high-throughput sequencing reads

Heo, Y., Wu, X. L., Chen, D., Ma, J. & Hwu, W. M., May 15 2014, In : Bioinformatics. 30, 10, p. 1354-1362 9 p.

Research output: Contribution to journal › Article

BLESS 2: Accurate, memory-efficient and fast error correction method

Heo, Y., Ramachandran, A., Hwu, W. M., Ma, J. & Chen, D., Aug 1 2016, In : Bioinformatics. 32, 15, p. 2369-2371 3 p.

Research output: Contribution to journal › Article

Branch recovery with compiler-assisted multiple instruction retry

Alewine, N. J., Chen, S. K., Li, C. C., Fuchs, W. K. & Hwu, W. M., Jan 1 1992, FTCS 1992 - 22nd Annual International Symposium on Fault-Tolerant Computing. Institute of Electrical and Electronics Engineers Inc., p. 66-73 8 p. 243614. (FTCS 1992 - 22nd Annual International Symposium on Fault-Tolerant Computing).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

C compiler for HPS I, a highly parallel execution engine

Shebanow, M. C., Patt, Y. N., Hwu, W. M. & Melvin, S., Dec 1 1986, In : Proceedings of the Hawaii International Conference on System Science. 2 a, p. 520-528 9 p.

Research output: Contribution to journal › Conference article

Characterizing the impact of predicated execution on branch prediction

Mahlke, S. A., Hank, R. E., Bringmann, R. A., Gyllenhaal, J. C., Gallagher, D. M. & Hwu, W-M. W., Dec 7 1994, Professional Engineering, 7, 21, p. 217-227 11 p.

Research output: Contribution to specialist publication › Article

Characterizing the impact of predicated execution on branch prediction

Mahlke, S. A., Hank, R. E., Bringmann, R. A., Gyllenhaal, J. C., Gallagher, D. M. & Hwu, W-M. W., Nov 30 1994, Proceedings of the 27th Annual International Symposium on Microarchitecture, MICRO 1994. IEEE Computer Society, p. 217-227 11 p. (Proceedings of the Annual International Symposium on Microarchitecture, MICRO; vol. Part F129425).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Checkpoint Repair for High-Performance Out-of-Order Execution Machines

Hwu, W. M. W. & Patt, Y. N., Dec 1987, In : IEEE Transactions on Computers. C-36, 12, p. 1496-1514 19 p.

Research output: Contribution to journal › Article

Checkpoint repair for out-of-order execution machines

Hwu, W-M. W. & Patt, Y. N., 1987, Conference Proceedings - Annual Symposium on Computer Architecture. IEEE, p. 18-26 9 p.

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

ClMPI: An OpenCL extension for interoperation with the Message Passing Interface

Takizawa, H., Sugawara, M., Hirasawa, S., Gelado, I., Kobayashi, H. & Hwu, W. M. W., Jan 1 2013, Proceedings - IEEE 27th International Parallel and Distributed Processing Symposium Workshops and PhD Forum, IPDPSW 2013. IEEE Computer Society, p. 1138-1148 11 p. 6651000. (Proceedings - IEEE 27th International Parallel and Distributed Processing Symposium Workshops and PhD Forum, IPDPSW 2013).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Code coverage and input variability: Effects on architecture and compiler research

Hunter, H. C. & Hwu, W-M. W., Dec 1 2002, Proceedings of the 2002 International Conference on Compilers, Architecture, and Synthesis for Embedded Systems, CASES '02. p. 79-87 9 p. (Proceedings of the 2002 International Conference on Compilers, Architecture, and Synthesis for Embedded Systems, CASES '02).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Code reordering and speculation support for dynamic optimization systems

Nystrom, E. M., Barnes, R. D., Merten, M. C. & Hwu, W-M. W., Jan 1 2001, In : Parallel Architectures and Compilation Techniques - Conference Proceedings, PACT. p. 163-174 12 p.

Research output: Contribution to journal › Conference article

Collaborative (CPU + GPU) Algorithms for Triangle Counting and Truss Decomposition

Mailthody, V. S., Date, K., Qureshi, Z., Pearson, C., Nagi, R., Xiong, J. & Hwu, W. M., Nov 26 2018, 2018 IEEE High Performance Extreme Computing Conference, HPEC 2018. Institute of Electrical and Electronics Engineers Inc., 8547517. (2018 IEEE High Performance Extreme Computing Conference, HPEC 2018).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Collaborative computing for heterogeneous integrated systems

Chang, L. W., Gómez-Luna, J., El Hajj, I., Huang, S., Chen, D. & Hwu, W-M. W., Apr 17 2017, ICPE 2017 - Proceedings of the 2017 ACM/SPEC International Conference on Performance Engineering. Association for Computing Machinery, Inc, p. 385-388 4 p. (ICPE 2017 - Proceedings of the 2017 ACM/SPEC International Conference on Performance Engineering).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Combining trace sampling with single pass methods for efficient cache simulation

Conte, T. M., Hirsch, M. A. & Hwu, W. M. W., Dec 1 1998, In : IEEE Transactions on Computers. 47, 6, p. 714-719 6 p.

Research output: Contribution to journal › Article

Comparing software and hardware schemes for reducing the cost of branches

Hwu, W. M. W., Conte, T. M. & Chang, P. P., May 1 1989, In : Conference Proceedings - Annual Symposium on Computer Architecture. 16, p. 224-233 10 p.

Research output: Contribution to journal › Conference article

Comparing static and dynamic code scheduling for multiple-instruction-issue processors

Chang, P. P., Chen, W. Y., Mahlke, S. A. & Hwu, W-M. W., Sep 1 1991, MICRO 1991 - Proceedings of the 24th Annual International Symposium on Microarchitecture. IEEE Computer Society, p. 25-33 9 p. (Proceedings of the Annual International Symposium on Microarchitecture, MICRO).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Comparison based sorting for systems with multiple GPUs

Tanasic, I., Vilanova, L., Jordà, M., Cabezas, J., Gelado, I., Navarro, N. & Hwu, W. M., Apr 15 2013, Proceedings of the 6th Workshop on General Purpose Processor Using Graphics Processing Units, GPGPU 2013. p. 1-11 11 p. (ACM International Conference Proceeding Series).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Comparison of several evolving (university) supercomputer architectures

Patt, Y. N., Sheldon, R. G., Shebanow, M., Ponder, C. & Hwu, W-M. W., 1984, Unknown Host Publication Title. IEEE, p. 15-26 12 p.

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Compiler-Assisted Multiple Instruction Rollback Recovery Using a Read Buffer

Alewine, N. J., Chen, S. K., Fuchs, W. K. & Hwu, W. M. W., Sep 1995, In : IEEE Transactions on Computers. 44, 9, p. 1096-1107 12 p.

Research output: Contribution to journal › Article

Compiler-Based Multiple Instruction Retry

Li, C. C. J., Chen, S. K., Fuchs, W. K. & Hwu, W. M. W., Jan 1995, In : IEEE Transactions on Computers. 44, 1, p. 35-46 12 p.

Research output: Contribution to journal › Article

Compiler code transformations for superscalar-based high-performance systems

Mahlke, S. A., Chen, W. Y., Gyllenhaal, J. C., Hwu, W. M. W., Chang, P. P. & Kiyohara, T., Dec 1 1992, Proceedings of the 1992 ACM/IEEE conference on Supercomputing, Supercomputing 1992. Werner, R. (ed.). Association for Computing Machinery, p. 808-817 10 p. (Proceedings of the International Conference on Supercomputing; vol. Part F129723).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Compiler-directed dynamic computation reuse: Rationale and initial results

Connors, D. A. & Hwu, W-M. W., Dec 1 1999, In : Proceedings of the Annual International Symposium on Microarchitecture. p. 158-169 12 p.

Research output: Contribution to journal › Conference article

Compiler-directed early load-address generation

Cheng, B. C., Connors, D. A. & Hwu, W-M. W., Dec 1 1998, In : Proceedings of the Annual International Symposium on Microarchitecture. p. 138-147 10 p.

Research output: Contribution to journal › Conference article

Compiler Technology

Chung, W. H. J., Lyu, Y. H., Sung, I. J. R., Lee, Y. W. & Hwu, W-M. W., Dec 4 2015, Heterogeneous System Architecture: A New Compute Platform Infrastructure. Elsevier Inc., p. 97-129 33 p.

Research output: Chapter in Book/Report/Conference proceeding › Chapter

Compute unified device architecture application suitability

Hwu, W. M., Rodrigues, C., Ryoo, S. & Stratton, J., May 1 2009, In : Computing in Science and Engineering. 11, 3, p. 16-26 11 p., 4814979.

Research output: Contribution to journal › Article