Wen-Mei W Hwu

1984 …2019

Research Output 1984-2019

Article

In-place transposition of rectangular matrices on accelerators

Sung, I. J., Gómez-Luna, J., González-Linares, J. M., Guil, N. & Hwu, W-M. W., Aug 2014, In : ACM SIGPLAN Notices. 49, 8, p. 207-218 12 p.

Research output: Contribution to journal › Article

Particle accelerators
Program processors
Throughput
Data storage equipment
Data transfer

Java bytecode to native code translation: The Caffeine prototype and preliminary results

Hsieh, C. H. A., Gyllenhaal, J. C. & Hwu, W-M. W., 1996, In : Proceedings of the Annual International Symposium on Microarchitecture. p. 90-97 8 p.

Research output: Contribution to journal › Article

Caffeine
Internet
Hardware
Data storage equipment

Modulo Schedule Buffers

Merten, M. C. & Hwu, W-M. W., 2001, In : Proceedings of the Annual International Symposium on Microarchitecture. p. 138-149 12 p.

Research output: Contribution to journal › Article

Hardware
Signal processing
Scheduling

More IMPATIENT: A gridding-accelerated Toeplitz-based strategy for non-Cartesian high-resolution 3D MRI on GPUs

Gai, J., Obeid, N., Holtrop, J. L., Wu, X. L., Lam, F., Fu, M., Haldar, J. P., Hwu, W. M. W., Liang, Z. P. & Sutton, B. P., May 2013, In : Journal of Parallel and Distributed Computing. 73, 5, p. 686-697 12 p.

Research output: Contribution to journal › Article

Otto Toeplitz
Image Reconstruction
Magnetic resonance imaging
High Resolution

Optimization of machine descriptions for efficient use

Gyllenhaal, J. C., Hwu, W-M. W. & Rau, B. R., 1996, In : Proceedings of the Annual International Symposium on Microarchitecture. p. 349-358 10 p.

Research output: Contribution to journal › Article

High level languages
Sun
Scheduling

Optimization of Machine Descriptions for Efficient Use

Gyllenhaal, J. C., Hwu, W-M. W. & Rau, B. R., Jan 1 1998, In : International Journal of Parallel Programming. 26, 4, p. 417-447 31 p.

Research output: Contribution to journal › Article

High level languages
Compiler
Optimizing Compilers
Instruction Level Parallelism
Optimization

Optimized Data Transfers Based on the OpenCL Event Management Mechanism

Takizawa, H., Hirasawa, S., Sugawara, M., Gelado, I., Kobayashi, H. & Hwu, W-M. W., Jan 1 2015, In : Scientific Programming. 2015, 576498.

Research output: Contribution to journal › Article

Data transfer
Communication

Optimizing NET compilers for improved Java performance

Hsieh, C. H. A., Conte, M. T., Johnson, T. L., Gyllenhaal, J. C. & Hwu, W-M. W., Jun 1 1997, Computer, 30, 6, p. 67-75 9 p.

Research output: Contribution to specialist publication › Article

Partial reverse if-conversion framework for balancing control flow and predication

August, D. I., Hwu, W-M. W. & Mahlke, S. A., Jan 1 1999, In : International Journal of Parallel Programming. 27, 5, p. 381-423 43 p.

Research output: Contribution to journal › Article

Flow Control
Balancing
Reverse
Partial

Performance implications of synchronization support for parallel fortran programs

Anik, S. & Hwu, W-M. W., Aug 1994, In : Journal of Parallel and Distributed Computing. 22, 2, p. 202-215 14 p.

Research output: Contribution to journal › Article

Synchronization
Shared Memory
Data storage equipment
Scheduling
Data Dependence

Profile-assisted instruction scheduling

Chen, W. Y., Mahlke, S. A., Warter, N. J., Anik, S. & Hwu, W-M. W., Apr 1 1994, In : International Journal of Parallel Programming. 22, 2, p. 151-181 31 p.

Research output: Contribution to journal › Article

Instruction Scheduling
Scheduling
Compiler
Superscalar
Instruction Level Parallelism

Profile‐guided automatic inline expansion for C programs

Chang, P. P., Mahlke, S. A., Chen, W. Y. & Hwu, W-M. W., May 1992, In : Software: Practice and Experience. 22, 5, p. 349-369 21 p.

Research output: Contribution to journal › Article

Plant expansion
Program compilers
Information use
Hazards

Program Decision Logic Optimization Using Predication and Control Speculation

Hwu, W-M. W., August, D. I. & Sias, J. W., Nov 2001, In : Proceedings of the IEEE. 89, 11, p. 1660-1675 16 p.

Research output: Contribution to journal › Article

Binary decision diagrams
Global optimization
Flow control

Program optimization carving for GPU computing

Ryoo, S., Rodrigues, C. I., Stone, S. S., Stratton, J. A., Ueng, S. Z., Baghsorkhi, S. S. & Hwu, W-M. W., Oct 1 2008, In : Journal of Parallel and Distributed Computing. 68, 10, p. 1389-1401 13 p.

Research output: Contribution to journal › Article

Configuration
Optimization
Computing
Many-core
Random Sampling

Rapid computation of sodium bioscales using GPU-accelerated image reconstruction

Atkinson, I. C., Liu, G., Obeid, N., Thulborn, K. R. & Hwu, W-M. W., Mar 1 2013, In : International Journal of Imaging Systems and Technology. 23, 1, p. 29-35 7 p.

Research output: Contribution to journal › Article

Image reconstruction
Sodium
Tissue
Imaging techniques
Program processors

Real-time in vivo computed optical interferometric tomography

Ahmad, A., Shemonski, N. D., Adie, S. G., Kim, H. S., Hwu, W. M. W., Carney, P. S. & Boppart, S. A., Jun 1 2013, In : Nature Photonics. 7, 6, p. 444-448 5 p.

Research output: Contribution to journal › Article

Optical tomography
Tomography
high resolution
Tissue

Region-based compilation: an introduction and motivation

Hank, R. E., Hwu, W-M. W. & Rau, B. R., 1995, In : Proceedings of the Annual International Symposium on Microarchitecture. p. 158-168 11 p.

Research output: Contribution to journal › Article

Scheduling

Region-based compilation: Introduction, motivation, and initial experience

Hank, R. E., Hwu, W-M. W. & Rau, B. R., Jan 1 1997, In : International Journal of Parallel Programming. 25, 2, p. 113-146 34 p.

Research output: Contribution to journal › Article

Compilation
Compiler
Scheduling
Unit
Instruction Level Parallelism

Reverse If-Conversion

Warter, N. J., Mahlke, S. A., Hwu, W-M. W. & Rau, B. R., Jun 1993, In : ACM SIGPLAN Notices. 28, 6, p. 290-299 10 p.

Research output: Contribution to journal › Article

Flow graphs
Scheduling
Data storage equipment
Dynamic analysis
Clocks

Run-time adaptive cache management

Johnson, T. L., Connors, D. A. & Hwu, W-M. W., 1998, In : Proceedings of the Hawaii International Conference on System Sciences. 7, p. 774-775 2 p.

Research output: Contribution to journal › Article

Hardware
Data storage equipment
Costs

Runtime and Architecture Support for Efficient Data Exchange in Multi-Accelerator Applications

Cabezas, J., Gelado, I., Stone, J. E., Navarro, N., Kirk, D. B. & Hwu, W-M. W., May 1 2015, In : IEEE Transactions on Parallel and Distributed Systems. 26, 5, p. 1405-1418 14 p., 6803940.

Research output: Contribution to journal › Article

Electronic data interchange
Particle accelerators
Hardware
Parallel programming
Maintainability

Run-time cache bypassing

Johnson, T. L., Connors, D. A., Merten, M. C. & Hwu, W-M. W., Dec 1 1999, In : IEEE Transactions on Computers. 48, 12, p. 1338-1354 17 p.

Research output: Contribution to journal › Article

Cache
Data storage equipment
Intelligent control
Computer programming languages
Hardware

Run-time spatial locality detection and optimization

Johnson, T. L., Merten, M. C. & Hwu, W-M. W., 1997, In : Proceedings of the Annual International Symposium on Microarchitecture. p. 57-64 8 p.

Research output: Contribution to journal › Article

Data storage equipment

Scalable SIMD-parallel memory allocation for many-core machines

Huang, X., Rodrigues, C. I., Jones, S., Buck, I. & Hwu, W-M. W., Jun 1 2013, In : Journal of Supercomputing. 64, 3, p. 1008-1020 13 p.

Research output: Contribution to journal › Article

Storage allocation (computer)
Many-core
Throughput
Data storage equipment
Computer systems programming

Semi-Coherent DMA: An Alternative I/O Coherency Management for Embedded Systems

Min, S. W., Alian, M., Hwu, W-M. W. & Kim, N. S., Aug 22 2018, (Accepted/In press) In : IEEE Computer Architecture Letters.

Research output: Contribution to journal › Article

Dynamic mechanical analysis
Embedded systems
Program processors
Data storage equipment
Bandwidth

Sentinel Scheduling: A Model for Compiler-Controlled Speculative Execution

Mahlke, S. A., Chen, W. Y., Bringmann, R. A., Hank, R. E., Hwu, W-M. W., Rau, B. R. & Schlansker, M. S., Nov 1993, In : ACM Transactions on Computer Systems (TOCS). 11, 4, p. 376-408 33 p.

Research output: Contribution to journal › Article

Scheduling
Recovery

Sentinel Scheduling for VLIW and Superscalar Processors

Mahlke, S. A., Chen, W. Y., Hwu, W-M. W., Rau, B. R. & Schlansker, M. S., Sep 1992, In : ACM SIGPLAN Notices. 27, 9, p. 238-247 10 p.

Research output: Contribution to journal › Article

Scheduling

Simulation study of simultaneous vector prefetch performance in multiprocessor memory subsystems

Hwu, W-M. W. & Conte, T. M., May 1989, In : Performance Evaluation Review. 17, 1, 1 p.

Research output: Contribution to journal › Article

Data storage equipment
Supercomputers
Bandwidth

The concurrency challenge

Hwu, W-M. W., Keutzer, K. & Mattson, T. G., Aug 21 2008, In : IEEE Design and Test of Computers. 25, 4, p. 312-320 9 p.

Research output: Contribution to journal › Article

Microprocessor chips
Semiconductor materials
Hardware
Industry

The Effect of Code Expanding Optimizations on Instruction Cache Design

Chen, W. Y., Chung, P. P. & Hwu, W-M. W., Sep 1993, In : IEEE Transactions on Computers. 42, 9, p. 1045-1057 13 p.

Research output: Contribution to journal › Article

Cache
Optimization
Superscalar
Placement
Cancel

The Importance of Prepass Code Scheduling for Superscalar and Superpipelined Processors

Chang, P. P., Lavery, D. M., Mahlke, S. A., Chen, W. Y. & Hwu, W-M. W., Mar 1995, In : IEEE Transactions on Computers. 44, 3, p. 353-370 18 p.

Research output: Contribution to journal › Article

Superscalar
Scheduling
Compiler
Hardware
Schedule

The superblock: An effective technique for VLIW and superscalar compilation

Hwu, W-M. W., Mahlke, S. A., Chen, W. Y., Chang, P. P., Warter, N. J., Bringmann, R. A., Ouellette, R. G., Hank, R. E., Kiyohara, T., Haab, G. E., Holm, J. G. & Lavery, D. M., May 1 1993, In : The Journal of Supercomputing. 7, 1-2, p. 229-248 20 p.

Research output: Contribution to journal › Article

Superscalar
Instruction Level Parallelism
Compilation
Scheduling
Compiler

The Susceptibility of Programs to Context Switching

Hwu, W-M. W., Sep 1994, In : IEEE Transactions on Computers. 43, 9, p. 994-1003 10 p.

Research output: Contribution to journal › Article

Susceptibility
Switches
Switch
Multiprogramming
Computer systems

Three Architectural Models for Compiler-Controlled Speculative Execution

Chang, P. P., Warter, N. J., Mahlke, S. A., Chen, W. Y. & Hwu, W-M. W., Apr 1995, In : IEEE Transactions on Computers. 44, 4, p. 481-494 14 p.

Research output: Contribution to journal › Article

Speculative Execution
Hazard
Compiler
Hazards
Branch

TIGER: tiled iterative genome assembler.

Wu, X. L., Heo, Y., El Hajj, I., Hwu, W-M. W., Chen, D. & Ma, J., 2012, In : BMC Bioinformatics. 13, Suppl 19.

Research output: Contribution to journal › Article

Tigers
Genome
Genes
Data storage equipment
Sequencing

Tolerating cache-miss latency with multipass pipelines

Barnes, R. D., Ryoo, S. & Hwu, W-M. W., Jan 1 2006, In : IEEE Micro. 26, 1, p. 40-47 8 p.

Research output: Contribution to journal › Article

Pipelines
Scheduling
Data storage equipment

Toward application-aware security and reliability

Iyer, R. K., Kalbarczyk, Z. T., Pattabiraman, K., Healey, W., Hwu, W-M. W., Klemperer, P. & Farivar, R., Jan 1 2007, In : IEEE Security and Privacy. 5, 1, p. 57-62 6 p.

Research output: Contribution to journal › Article

Hardware
corruption
Computer systems
Values

Transmission power control for multiple access wireless packet networks

Monks, J. P., Bharghavan, V. & Hwu, W-M. W., 2000, In : Conference on Local Computer Networks. p. 12-21 10 p.

Research output: Contribution to journal › Article

Packet networks
Power control
Network protocols
Wireless ad hoc networks
Collision avoidance

Triolet: A programming system that unifies algorithmic skeleton interfaces for high-performance cluster computing

Rodrigues, C., Jablin, T., Dakkak, A. & Hwu, W-M. W., Aug 2014, In : ACM SIGPLAN Notices. 49, 8, p. 247-258 12 p.

Research output: Contribution to journal › Article

Cluster computing
Computer systems programming
Data storage equipment
Parallel programming
Electric fuses

Using profile information to assist classic code optimizations

Chang, P. P., Mahlke, S. A. & Hwu, W-M. W., Dec 1991, In : Software: Practice and Experience. 21, 12, p. 1301-1321 21 p.

Research output: Contribution to journal › Article

Global optimization

What is ahead for parallel computing

Hwu, W-M. W., Jul 2014, In : Journal of Parallel and Distributed Computing. 74, 7, p. 2574-2581 8 p.

Research output: Contribution to journal › Article

Parallel processing systems
Parallel Computing
Parallel algorithms
Many-core

Book

GPU Computing Gems Emerald Edition

Hwu, W-M. W., Jan 1 2011, Elsevier Inc.

Research output: Book/Report › Book

Gems
Computer vision
Parallel programming
Medical imaging
Graphics processing unit

GPU Computing Gems Jade Edition

Hwu, W-M. W., Jan 1 2012, Elsevier Inc.

Research output: Book/Report › Book

Gems
Finance
Graphics processing unit
Environmental engineering
Computer systems programming
Specifications
Program processors
Sanders
Hardware
Data storage equipment

Programming massively parallel processors: A hands-on approach, second edition

Kirk, D. B. & Hwu, W-M. W., Jan 1 2013, Elsevier Science. 496 p.

Research output: Book/Report › Book

Parallel programming
Program processors
Parallel processing systems
Magnetic resonance imaging
Sales

Programming Massively Parallel Processors: A Hands-on Approach: Third Edition

Kirk, D. B. & Hwu, W-M. W., Dec 7 2016, Elsevier Inc. 550 p.

Research output: Book/Report › Book

Parallel programming
Program processors
Parallel processing systems
Software engineering
Students

Chapter

A guide for implementing tridiagonal solvers on GPUs

Chang, L. W. & Hwu, W-M. W., Jan 1 2014, Numerical Computations with GPUs. Springer International Publishing, p. 29-44 16 p.

Research output: Chapter in Book/Report/Conference proceeding › Chapter

Graphics processing unit

An introduction to OpenCL™

Kirk, D. B. & Hwu, W-M. W., Jan 1 2012, Programming Massively Parallel Processors: A Hands-on Approach, Second Edition. Elsevier Science, p. 297-313 17 p.

Research output: Chapter in Book/Report/Conference proceeding › Chapter

Application case study: Advanced MRI reconstruction

Kirk, D. B. & Hwu, W-M. W., Jan 1 2012, Programming Massively Parallel Processors: A Hands-on Approach, Second Edition. Elsevier Science, p. 235-264 30 p.

Research output: Chapter in Book/Report/Conference proceeding › Chapter

Magnetic resonance imaging