Wen-Mei W Hwu

Research Output

Dynamic Memory Disambiguation Using the Memory Conflict Buffer

Gallagher, D. M., Chen, W. Y., Mahlke, S. A., Gyllenhaal, J. C. & Hwu, W. M. W., Jan 11 1994, In : ACM SIGPLAN Notices. 29, 11, p. 183-193 11 p.

Research output: Contribution to journal › Article

Dynamic memory disambiguation using the memory conflict buffer

Gallagher, D. M., Chen, W. Y., Mahlke, S. A., Gyllenhaal, J. C. & Hwu, W. M. W., Nov 1 1994, Proceedings of the 6th International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS 1994. Association for Computing Machinery, p. 183-193 11 p. (International Conference on Architectural Support for Programming Languages and Operating Systems - ASPLOS; vol. Part F129531).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

DNNBuilder: An automated tool for building high-performance DNN hardware accelerators for FPGAs

Zhang, X., Wang, J., Zhu, C., Lin, Y., Xiong, J., Hwu, W. M. & Chen, D., Nov 5 2018, 2018 IEEE/ACM International Conference on Computer-Aided Design, ICCAD 2018 - Digest of Technical Papers. Institute of Electrical and Electronics Engineers Inc., a56. (IEEE/ACM International Conference on Computer-Aided Design, Digest of Technical Papers, ICCAD).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

DL: A data layout transformation system for heterogeneous computing

Sung, I. J., Liu, G. D. & Hwu, W. M. W., Dec 12 2012, 2012 Innovative Parallel Computing, InPar 2012. 6339606. (2012 Innovative Parallel Computing, InPar 2012).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Direct numerical simulation of turbulent flow in a square duct using a Graphics Processing Unit (GPU)

Shinn, A. F., Vanka, S. P. & Hwu, W. W., Dec 2 2010, 40th AIAA Fluid Dynamics Conference. 2010-5029. (40th AIAA Fluid Dynamics Conference).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Design of a power-efficient ARM processor with a timing-error detection and correction mechanism

Chen, S. J., Liu, G., Yang, H. P., Luo, C. H. & Hwu, W. M., Jul 2 2016, Proceedings - 29th IEEE International System on Chip Conference, SOCC 2016. Bhatia, K., Alioto, M., Zhao, D., Marshall, A. & Sridhar, R. (eds.). IEEE Computer Society, p. 217-222 6 p. 7905471. (International System on Chip Conference; vol. 0).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Design evaluation of OpenCL compiler framework for coarse-grained reconfigurable arrays

Kim, H. S., Ahn, M., Stratton, J. A. & Hwu, W. M. W., Dec 1 2012, FPT 2012 - 2012 International Conference on Field-Programmable Technology. p. 313-320 8 p. 6412155. (FPT 2012 - 2012 International Conference on Field-Programmable Technology).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

DESIGN CHOICES FOR THE HPSm MICROPROCESSOR CHIP.

Hwu, W. M. & Patt, Y. N., Jan 1 1987, In : Proceedings of the Hawaii International Conference on System Science. 1, p. 330-336 7 p.

Research output: Contribution to journal › Conference article

DeepStore: In-storage acceleration for intelligent queries

Mailthody, V. S., Qureshi, Z., Liang, W., Feng, Z., Gonzalo, S. G. D., Li, Y., Franke, H., Xiong, J., Huang, J. & Hwu, W. M., Oct 12 2019, MICRO 2019 - 52nd Annual IEEE/ACM International Symposium on Microarchitecture, Proceedings. IEEE Computer Society, p. 224-238 15 p. (Proceedings of the Annual International Symposium on Microarchitecture, MICRO).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Data relocation and prefetching for programs with large data sets

Yamada, Y., Gyllenhaal, J., Haab, G. & Hwu, W-M. W., Nov 30 1994, Proceedings of the 27th Annual International Symposium on Microarchitecture, MICRO 1994. IEEE Computer Society, p. 118-127 10 p. (Proceedings of the Annual International Symposium on Microarchitecture, MICRO; vol. Part F129425).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Data relocation and prefetching for programs with large data sets

Yamada, Y., Gyllenhaal, J., Haab, G. & Hwu, W. M., Dec 7 1994, Professional Engineering, 7, 21, p. 118-127 10 p.

Research output: Contribution to specialist publication › Article

Data-parallel execution model

Kirk, D. B. & Hwu, W-M. W., Jan 1 2012, Programming Massively Parallel Processors: A Hands-on Approach, Second Edition. Elsevier Science, p. 63-94 32 p.

Research output: Chapter in Book/Report/Conference proceeding › Chapter

Data layout transformation exploiting memory-level parallelism in structured grid many-core applications

Sung, I. J., Stratton, J. A. & Hwu, W. M. W., Jan 1 2010, PACT'10 - Proceedings of the 19th International Conference on Parallel Architectures and Compilation Techniques. Institute of Electrical and Electronics Engineers Inc., p. 513-522 10 p. (Parallel Architectures and Compilation Techniques - Conference Proceedings, PACT; vol. 2010).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Data layout transformation exploiting memory-level parallelism in structured grid many-core applications

Sung, I. J., Anssari, N., Stratton, J. A. & Hwu, W. M. W., Feb 1 2012, In : International Journal of Parallel Programming. 40, 1, p. 4-24 21 p.

Research output: Contribution to journal › Article

Data access microarchitectures for superscalar processors with compiler-assisted data prefetching

Chen, W. Y., Mahlke, S. A., Chang, P. P. & Hwu, W. M. W., Sep 1 1991, MICRO 1991 - Proceedings of the 24th Annual International Symposium on Microarchitecture. IEEE Computer Society, p. 69-73 5 p. (Proceedings of the Annual International Symposium on Microarchitecture, MICRO).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

CUDA memories

Kirk, D. B. & Hwu, W-M. W., Jan 1 2012, Programming Massively Parallel Processors: A Hands-on Approach, Second Edition. Elsevier Science, p. 95-121 27 p.

Research output: Chapter in Book/Report/Conference proceeding › Chapter

CUDA-Lite: Reducing GPU programming complexity

Ueng, S. Z., Lathara, M., Baghsorkhi, S. S. & Hwu, W. M. W., Dec 1 2008, Languages and Compilers for Parallel Computing - 21st International Workshop, LCPC 2008, Revised Selected Papers. p. 1-15 15 p. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); vol. 5335 LNCS).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

CUDA dynamic parallelism

Kirk, D. B. & Hwu, W-M. W., Jan 1 2012, Programming Massively Parallel Processors: A Hands-on Approach, Second Edition. Elsevier Science, p. 435-457 23 p.

Research output: Chapter in Book/Report/Conference proceeding › Chapter

CUDA application development

Hwu, W. M., May 20 2016, 2008 IEEE Hot Chips 20 Symposium, HCS 2008. Institute of Electrical and Electronics Engineers Inc., 7476522. (2008 IEEE Hot Chips 20 Symposium, HCS 2008).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

CUBA: An architecture for efficient CPU/Co-processor data communication

Gelado, I., Kelm, J. H., Ryoo, S., Lumetta, S. S., Navarro, N. & Hwu, W. M. W., Dec 15 2008, ICS'08 - Proceedings of the 2008 ACM International Conference on Supercomputing. p. 299-308 10 p. (Proceedings of the International Conference on Supercomputing).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

CRITICAL ISSUES REGARDING HPS, A HIGH PERFORMANCE MICROARCHITECTURE.

Patt, Y. N., Melvin, S. W., Hwu, W. M. & Shebanow, M. C., Dec 1 1985, MICRO: Annual Microprogramming Workshop. ACM, p. 109-116 8 p. (MICRO: Annual Microprogramming Workshop).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Corezilla: Build and tame the multicore beast?

Sarno, L., Hwu, W. M. W., Lund, C., Levy, M., Larus, J. R., Reinders, J., Cameron, G., Lennard, C. & Corporation, T., Aug 2 2007, 2007 44th ACM/IEEE Design Automation Conference, DAC'07. p. 632-633 2 p. 4261259. (Proceedings - Design Automation Conference).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Control flow optimization for supercomputer scalar processing

Chang, P. P. & Hwu, W. M. W., Jun 1 1989, Proceedings of the 3rd International Conference on Supercomputing, ICS 1989. Association for Computing Machinery, p. 145-153 9 p. (Proceedings of the International Conference on Supercomputing; vol. Part F130180).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Conclusion and future outlook

Kirk, D. B. & Hwu, W-M. W., Jan 1 2012, Programming Massively Parallel Processors: A Hands-on Approach, Second Edition. Elsevier Science, p. 459-469 11 p.

Research output: Chapter in Book/Report/Conference proceeding › Chapter

Compute unified device architecture application suitability

Hwu, W. M., Rodrigues, C., Ryoo, S. & Stratton, J., May 1 2009, In : Computing in Science and Engineering. 11, 3, p. 16-26 11 p., 4814979.

Research output: Contribution to journal › Article

Compiler Technology for Future Microprocessors

Hwu, W. M. W., Hank, R. E., Lavery, D. M., Haab, G. E., Gyllenhaal, J. C., August, D. I., Gallagher, D. M. & Mahlke, S. A., Dec 1995, In : Proceedings of the IEEE. 83, 12, p. 1625-1640 16 p.

Research output: Contribution to journal › Article

Compiler Technology

Chung, W. H. J., Lyu, Y. H., Sung, I. J. R., Lee, Y. W. & Hwu, W. M. W., Jan 1 2016, Heterogeneous System Architecture: A New Compute Platform Infrastructure. Elsevier Inc., p. 97-129 33 p.

Research output: Chapter in Book/Report/Conference proceeding › Chapter

Compiler-directed early load-address generation

Cheng, B. C., Connors, D. A. & Hwu, W-M. W., Dec 1 1998, In : Proceedings of the Annual International Symposium on Microarchitecture. p. 138-147 10 p.

Research output: Contribution to journal › Conference article

Compiler-directed dynamic computation reuse: Rationale and initial results

Connors, D. A. & Hwu, W-M. W., Dec 1 1999, In : Proceedings of the Annual International Symposium on Microarchitecture. p. 158-169 12 p.

Research output: Contribution to journal › Conference article

Compiler code transformations for superscalar-based high-performance systems

Mahlke, S. A., Chen, W. Y., Gyllenhaal, J. C., Hwu, W. M. W., Chang, P. P. & Kiyohara, T., Dec 1 1992, Proceedings of the 1992 ACM/IEEE conference on Supercomputing, Supercomputing 1992. Werner, R. (ed.). Association for Computing Machinery, p. 808-817 10 p. (Proceedings of the International Conference on Supercomputing; vol. Part F129723).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Compiler-Based Multiple Instruction Retry

Li, C. C. J., Chen, S. K., Fuchs, W. K. & Hwu, W. M. W., Jan 1995, In : IEEE Transactions on Computers. 44, 1, p. 35-46 12 p.

Research output: Contribution to journal › Article

Compiler-Assisted Multiple Instruction Rollback Recovery Using a Read Buffer

Alewine, N. J., Chen, S. K., Fuchs, W. K. & Hwu, W. M. W., Sep 1995, In : IEEE Transactions on Computers. 44, 9, p. 1096-1107 12 p.

Research output: Contribution to journal › Article

COMPARISON OF SEVERAL EVOLVING (UNIVERSITY) SUPERCOMPUTER ARCHITECTURES.

Patt, Y. N., Sheldon, R. G., Shebanow, M., Ponder, C. & Hwu, W-M. W., 1984, Unknown Host Publication Title. IEEE, p. 15-26 12 p.

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Comparison of full and partial predicated execution support for ILP processors

Mahlke, S. A., Hank, R. E., McCormick, J. E., August, D. I. & Hwu, W. M. W., Jan 1 1995, Conference Proceedings - Annual International Symposium on Computer Architecture, ISCA. p. 138-149 12 p. (Conference Proceedings - Annual International Symposium on Computer Architecture, ISCA).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Comparison of full and partial predicated execution support for ILP processors

Mahlke, S. A., Hank, R. E., McCormick, J. E., August, D. I. & Hwu, W-M. W., 1995, ACM SIGARCH (Association for Computing Machinery Special Interest Group on Computer Architecture) - Conference Proceedings. ACM, p. 138-149 12 p.

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Comparison based sorting for systems with multiple GPUs

Tanasic, I., Vilanova, L., Jordà, M., Cabezas, J., Gelado, I., Navarro, N. & Hwu, W. M., Apr 15 2013, Proceedings of the 6th Workshop on General Purpose Processor Using Graphics Processing Units, GPGPU 2013. p. 1-11 11 p. (ACM International Conference Proceeding Series).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Comparing static and dynamic code scheduling for multiple-instruction-issue processors

Chang, P. P., Chen, W. Y., Mahlke, S. A. & Hwu, W. M. W., Sep 1 1991, MICRO 1991 - Proceedings of the 24th Annual International Symposium on Microarchitecture. IEEE Computer Society, p. 25-33 9 p. (Proceedings of the Annual International Symposium on Microarchitecture, MICRO).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Comparing software and hardware schemes for reducing the cost of branches.

Hwu, W. M. W., Conte, T. M. & Chang, P. P., May 1 1989, In : Conference Proceedings - Annual Symposium on Computer Architecture. 16, p. 224-233 10 p.

Research output: Contribution to journal › Conference article

Comparative performance evaluation of multi-GPU MLFMA implementation for 2-D VIE problems

Pearson, C., Hidayetoglu, M., Ren, W., Chew, W. C. & Hwu, W. M., Jul 25 2017, CEM 2017 - 2017 Computing and Electromagnetics International Workshop. Gurel, L. (ed.). Institute of Electrical and Electronics Engineers Inc., p. 63-64 2 p. 7991888. (CEM 2017 - 2017 Computing and Electromagnetics International Workshop).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Combining trace sampling with single pass methods for efficient cache simulation

Conte, T. M., Hirsch, M. A. & Hwu, W. M. W., Dec 1 1998, In : IEEE Transactions on Computers. 47, 6, p. 714-719 6 p.

Research output: Contribution to journal › Article

Collaborative computing for heterogeneous integrated systems

Chang, L. W., Gómez-Luna, J., El Hajj, I., Huang, S., Chen, D. & Hwu, W. M., Apr 17 2017, ICPE 2017 - Proceedings of the 2017 ACM/SPEC International Conference on Performance Engineering. Association for Computing Machinery, Inc, p. 385-388 4 p. (ICPE 2017 - Proceedings of the 2017 ACM/SPEC International Conference on Performance Engineering).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Collaborative (CPU + GPU) algorithms for triangle counting and truss decomposition on the Minsky architecture: Static graph challenge: Subgraph isomorphism

Date, K., Feng, K., Nagi, R., Xiong, J., Kim, N. S. & Hwu, W-M. W., Oct 30 2017, 2017 IEEE High Performance Extreme Computing Conference, HPEC 2017. Institute of Electrical and Electronics Engineers Inc., 8091042. (2017 IEEE High Performance Extreme Computing Conference, HPEC 2017).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Collaborative (CPU + GPU) Algorithms for Triangle Counting and Truss Decomposition

Mailthody, V. S., Date, K., Qureshi, Z., Pearson, C., Nagi, R., Xiong, J. & Hwu, W. M., Nov 26 2018, 2018 IEEE High Performance Extreme Computing Conference, HPEC 2018. Institute of Electrical and Electronics Engineers Inc., 8547517. (2018 IEEE High Performance Extreme Computing Conference, HPEC 2018).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Code reordering and speculation support for dynamic optimization systems

Nystrom, E. M., Barnes, R. D., Merten, M. C. & Hwu, W-M. W., Jan 1 2001, In : Parallel Architectures and Compilation Techniques - Conference Proceedings, PACT. p. 163-174 12 p.

Research output: Contribution to journal › Conference article

Code coverage and input variability: Effects on architecture and compiler research

Hunter, H. C. & Hwu, W. M. W., Dec 1 2002, Proceedings of the 2002 International Conference on Compilers, Architecture, and Synthesis for Embedded Systems, CASES '02. p. 79-87 9 p. (Proceedings of the 2002 International Conference on Compilers, Architecture, and Synthesis for Embedded Systems, CASES '02).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

ClMPI: An OpenCL extension for interoperation with the message passing interface

Takizawa, H., Sugawara, M., Hirasawa, S., Gelado, I., Kobayashi, H. & Hwu, W. M. W., Jan 1 2013, Proceedings - IEEE 27th International Parallel and Distributed Processing Symposium Workshops and PhD Forum, IPDPSW 2013. IEEE Computer Society, p. 1138-1148 11 p. 6651000. (Proceedings - IEEE 27th International Parallel and Distributed Processing Symposium Workshops and PhD Forum, IPDPSW 2013).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

CIGAR: Application partitioning for a CPU/coprocessor architecture

Kelm, J. H., Gelado, I., Murphy, M. J., Navarro, N., Lumetta, S. S. & Hwu, W-M. W., Dec 1 2007, 16th International Conference on Parallel Architecture and Compilation Techniques, PACT 2007. p. 317-326 10 p. 4336222. (Parallel Architectures and Compilation Techniques - Conference Proceedings, PACT).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

CHECKPOINT REPAIR FOR OUT-OF-ORDER EXECUTION MACHINES.

Hwu, W-M. W. & Patt, Y. N., 1987, Conference Proceedings - Annual Symposium on Computer Architecture. IEEE, p. 18-26 9 p.

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Checkpoint Repair for High-Performance Out-of-Order Execution Machines

Hwu, W. M. W. & Patt, Y. N., Dec 1987, In : IEEE Transactions on Computers. C-36, 12, p. 1496-1514 19 p.

Research output: Contribution to journal › Article

Characterizing the impact of predicated execution on branch prediction

Mahlke, S. A., Hank, R. E., Bringmann, R. A., Gyllenhaal, J. C., Gallagher, D. M. & Hwu, W-M. W., Dec 7 1994, Professional Engineering, 7, 21, p. 217-227 11 p.

Research output: Contribution to specialist publication › Article