Wen-Mei W Hwu


Research Output 1984-2019


Accelerating advanced MRI reconstructions on GPUs

Stone, S. S., Haldar, J. P., Tsao, S. C., Hwu, W. M. W., Sutton, B. P. & Liang, Z. P., Oct 1 2008, In : Journal of Parallel and Distributed Computing. 68, 10, p. 1307-1318 12 p.

Research output: Contribution to journal › Article

Magnetic Resonance Imaging
Graphics Processing Unit
Magnetic resonance
Imaging techniques
Percent

Accelerator architectures: A ten-year retrospective

Hwu, W. M. & Patel, S., Nov 1 2018, In : IEEE Micro. 38, 6, p. 56-62 7 p., 8585394.

Research output: Contribution to journal › Article

Particle accelerators
Learning systems
Field programmable gate arrays (FPGA)
Education
Throughput

Advances in Benchmarking Techniques: New Standards and Quantitative Metrics

Conte, T. M. & Hwu, W-M. W., Jan 1 1995, In : Advances in Computers. 41, C, p. 231-253 23 p.

Research output: Contribution to journal › Article

Benchmarking
Systems analysis
Computer workstations
Computer systems
Specifications

Algorithm and data optimization techniques for scaling to massively threaded systems

Stratton, J. A., Rodrigues, C., Sung, I. J. R., Chang, L. W., Anssari, N., Liu, G. D., Hwu, W-M. W. & Obeid, N., Aug 29 2012, Computer, 45, 8, p. 26-32 7 p.

Research output: Contribution to specialist publication › Article

Scalability
Graphics processing unit

An adaptive performance modeling tool for GPU architectures

Baghsorkhi, S. S., Delahaye, M., Patel, S. J., Gropp, W. D. & Hwu, W-M. W., May 1 2010, In : ACM SIGPLAN Notices. 45, 5, p. 105-114 10 p.

Research output: Contribution to journal › Article

Flow graphs
Data storage equipment
Graphics processing unit
Flow control
Analytical models

An architectural framework for runtime optimization

Merten, M. C., Trick, A. R., Barnes, R. D., Nystrom, E. M., George, C. N., Gyllenhaal, J. C. & Hwu, W-M. W., Jun 1 2001, In : IEEE Transactions on Computers. 50, 6, p. 567-589 23 p.

Research output: Contribution to journal › Article

Optimization
Hardware
Straightening
Branch Prediction
Reoptimization

An asymmetric distributed shared memory model for heterogeneous parallel systems

Gelado, I., Cabezas, J., Navarro, N., Stone, J. E., Patel, S. & Hwu, W. M. W., Mar 1 2010, In : ACM SIGPLAN Notices. 45, 3, p. 347-358 12 p.

Research output: Contribution to journal › Article

Computer systems
Data storage equipment
Particle accelerators
Program processors
Data transfer

A new framework for debugging globally optimized code

Wu, L. C., Mirani, R., Patil, H., Olsen, B. & Hwu, W-M. W., May 1999, In : SIGPLAN Notices (ACM Special Interest Group on Programming Languages). 34, 5, p. 181-191 11 p.

Research output: Contribution to journal › Article

Recovery
Experiments

An execution profiler for window‐oriented applications

Gupta, A. & Hwu, W-M. W., May 1993, In : Software: Practice and Experience. 23, 5, p. 487-510 24 p.

Research output: Contribution to journal › Article

Servers
Display devices
Computer program listings
Computer systems

An Experimental Single-Chip Data Flow CPU

Uvieghara, G. A., Hwu, W-M. W., Nakagome, Y., Jeong, D. K., Hodges, D. A., Patt, Y. N. & Lee, D. D., Jan 1992, In : IEEE Journal of Solid-State Circuits. 27, 1, p. 17-28 12 p.

Research output: Contribution to journal › Article

Program processors
Reduced instruction set computing
Data storage equipment
Interfaces (computer)
Clocks

A power controlled multiple access protocol for wireless packet networks

Monks, J. P., Bharghavan, V. & Hwu, W-M. W., 2001, In : Proceedings - IEEE INFOCOM. 1, p. 219-228 10 p.

Research output: Contribution to journal › Article

Packet networks
Network protocols
Collision avoidance
Ad hoc networks
Power transmission

A study of the energy saving and capacity improvement potential of power control in multi-hop wireless networks

Monks, J. P., Ebert, J. P., Wolisz, A. A. & Hwu, W-M. W., Jan 1 2001, In : Conference on Local Computer Networks. p. 550-559 10 p., 72.

Research output: Contribution to journal › Article

Power control
Wireless networks
Energy conservation
Topology
Packet networks

Beating in-order stalls with "Flea-Flicker" two-pass pipelining

Barnes, R. D., Sias, J. W., Nystrom, E. M., Patel, S. J., Navarro, J. & Hwu, W. M. W., Jan 1 2006, In : IEEE Transactions on Computers. 55, 1, p. 18-33 16 p.

Research output: Contribution to journal › Article

Pipelining
Pipelines
Latency
Compiler
Percent

Benchmark Characterization

Conte, T. M. & Hwu, W-M. W., Jan 1991, Computer, 24, 1, p. 48-56 9 p.

Research output: Contribution to specialist publication › Article

Digital storage
Computer operating systems

BLESS: Bloom filter-based error correction solution for high-throughput sequencing reads

Heo, Y., Wu, X. L., Chen, D., Ma, J. & Hwu, W. M., May 15 2014, In : Bioinformatics. 30, 10, p. 1354-1362 9 p.

Research output: Contribution to journal › Article

Bloom Filter
Error correction
Sequencing
High Throughput

BLESS 2: Accurate, memory-efficient and fast error correction method

Heo, Y., Ramachandran, A., Hwu, W. M., Ma, J. & Chen, D., Aug 1 2016, In : Bioinformatics. 32, 15, p. 2369-2371 3 p.

Research output: Contribution to journal › Article

Error correction
Data storage equipment
Genome
Efficiency
Bottom-up
Scalability
Context
Subgraph
Benchmark

Characterizing the impact of predicated execution on branch prediction

Mahlke, S. A., Hank, R. E., Bringmann, R. A., Gyllenhaal, J. C., Gallagher, D. M. & Hwu, W-M. W., Dec 7 1994, Professional Engineering, 7, 21, p. 217-227 11 p.

Research output: Contribution to specialist publication › Article

Checkpoint Repair for High-Performance Out-of-Order Execution Machines

Hwu, W-M. W. & Patt, Y. N., Dec 1987, In : IEEE Transactions on Computers. C-36, 12, p. 1496-1514 19 p.

Research output: Contribution to journal › Article

Checkpoint
Repair
High Performance
Branch Prediction
Cache memory

Combining trace sampling with single pass methods for efficient cache simulation

Conte, T. M., Hirsch, M. A. & Hwu, W. M. W., Dec 1 1998, In : IEEE Transactions on Computers. 47, 6, p. 714-719 6 p.

Research output: Contribution to journal › Article

Memory Hierarchy
Cache
Trace
Sampling
Data storage equipment

Comparing software and hardware schemes for reducing the cost of branches.

Hwu, W-M. W., Conte, T. M. & Chang, P. P., May 1989, In : Conference Proceedings - Annual Symposium on Computer Architecture. 16, p. 224-233 10 p.

Research output: Contribution to journal › Article

Hardware
Pipelines
Costs
Throughput

Compiler-Assisted Multiple Instruction Rollback Recovery Using a Read Buffer

Alewine, N. J., Chen, S. K., Fuchs, W. K. & Hwu, W-M. W., Sep 1995, In : IEEE Transactions on Computers. 44, 9, p. 1096-1107 12 p.

Research output: Contribution to journal › Article

Rollback Recovery
Compiler
Buffer
Hazards
Hardware

Compiler-Based Multiple Instruction Retry

Li, C. C. J., Chen, S. K., Fuchs, W. K. & Hwu, W-M. W., Jan 1995, In : IEEE Transactions on Computers. 44, 1, p. 35-46 12 p.

Research output: Contribution to journal › Article

Compiler
Experiments
Compilation
Termination
Interference

Compiler Technology for Future Microprocessors

Hwu, W. M. W., Hank, R. E., Lavery, D. M., Haab, G. E., Gyllenhaal, J. C., August, D. I., Gallagher, D. M. & Mahlke, S. A., Dec 1995, In : Proceedings of the IEEE. 83, 12, p. 1625-1640 16 p.

Research output: Contribution to journal › Article

Microprocessor chips
Hardware
Processing

Compute unified device architecture application suitability

Hwu, W. M., Rodrigues, C., Ryoo, S. & Stratton, J., May 1 2009, In : Computing in Science and Engineering. 11, 3, p. 16-26 11 p., 4814979.

Research output: Contribution to journal › Article

Graphics processing unit

Data layout transformation exploiting memory-level parallelism in structured grid many-core applications

Sung, I. J., Anssari, N., Stratton, J. A. & Hwu, W-M. W., Feb 1 2012, In : International Journal of Parallel Programming. 40, 1, p. 4-24 21 p.

Research output: Contribution to journal › Article

Many-core
Parallelism
Layout
Grid
Data storage equipment

Data relocation and prefetching for programs with large data sets

Yamada, Y., Gyllenhaal, J., Haab, G. & Hwu, W. M., Dec 7 1994, Professional Engineering, 7, 21, p. 118-127 10 p.

Research output: Contribution to specialist publication › Article

Copying
Relocation
Hardware
Data storage equipment

Dynamic Memory Disambiguation Using the Memory Conflict Buffer

Gallagher, D. M., Chen, W. Y., Mahlke, S. A., Gyllenhaal, J. C. & Hwu, W-M. W., Jan 11 1994, In : ACM SIGPLAN Notices. 29, 11, p. 183-193 11 p.

Research output: Contribution to journal › Article

Data storage equipment
Scheduling
Computer hardware
Repair

EcoG: A power-efficient GPU cluster architecture for scientific computing

Showerman, M., Enos, J. J., Steffen, C. P., Treichler, S., Gropp, W. D. & Hwu, W-M. W., Mar 1 2011, In : Computing in Science and Engineering. 13, 2, p. 83-87 5 p., 5725240.

Research output: Contribution to journal › Article

Natural sciences computing
Graphics processing unit

Efficient compilation of CUDA kernels for high-performance computing on FPGAs

Papakonstantinou, A., Gururaj, K., Stratton, J. A., Chen, D., Cong, J. & Hwu, W-M. W., Oct 21 2013, In : Transactions on Embedded Computing Systems. 13, 2, 25.

Research output: Contribution to journal › Article

Field programmable gate arrays (FPGA)
Particle accelerators
Imaging techniques
Processing

Efficient Instruction Sequencing with Inline Target Insertion

Hwu, W-M. W. & Chang, P. P., Dec 1992, In : IEEE Transactions on Computers. 41, 12, p. 1537-1551 15 p.

Research output: Contribution to journal › Article

Sequencing
Insertion
Branch
Pipelines
Hardware

Efficient performance evaluation of memory hierarchy for highly multithreaded graphics processors

Baghsorkhi, S. S., Gelado, I., Delahaye, M. & Hwu, W. M. W., Aug 1 2012, In : ACM SIGPLAN Notices. 47, 8, p. 23-33 11 p.

Research output: Contribution to journal › Article

Data storage equipment
Sampling
Monitoring
Hardware
Graphics processing unit

Energy saving and capacity improvement potential of power control in multi-hop wireless networks

Monks, J., Ebert, J. P., Hwu, W-M. W. & Wolisz, A., Feb 21 2003, In : Computer Networks. 41, 3, p. 313-330 18 p.

Research output: Contribution to journal › Article

Power control
Wireless networks
Energy conservation
Topology
Packet networks
Flow control
Telecommunication
Hardware
Processing

FCUDA-HB: Hierarchical and Scalable Bus Architecture Generation on FPGAs With the FCUDA Flow

Chen, Y., Nguyen, T., Chen, Y., Gurumani, S. T., Liang, Y., Rupnow, K., Cong, J., Hwu, W-M. W. & Chen, D., Dec 2016, In : IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems. 35, 12, p. 2032-2045 14 p., 7450674.

Research output: Contribution to journal › Article

Field programmable gate arrays (FPGA)
Bandwidth
High level languages
Core levels
High level synthesis

Hardware-compiler co-design for adjustable data power savings

Hunter, H. C., Nystrom, E. M., Connors, D. A. & Hwu, W-M. W., Jun 1 2009, In : Microprocessors and Microsystems. 33, 4, p. 244-253 10 p.

Research output: Contribution to journal › Article

Static random access storage
Hardware
Processing
Information management
Telecommunication

Hardware-driven profiling scheme for identifying program hot spots to support runtime optimization

Merten, M. C., Trick, A. R., George, C. N., Gyllenhaal, J. C. & Hwu, W-M. W., 1999, In : Conference Proceedings - Annual International Symposium on Computer Architecture, ISCA. p. 136-147 12 p.

Research output: Contribution to journal › Article

Hardware
Pipelines
Monitoring
Experiments

Hardware support for dynamic activation of compiler-directed computation reuse

Connors, D. A., Hunter, H. C., Cheng, B. C. & Hwu, W-M. W., Nov 2000, In : SIGPLAN Notices (ACM Special Interest Group on Programming Languages). 35, 11, p. 222-233 12 p.

Research output: Contribution to journal › Article

Chemical activation
Hardware
Redundancy

Hardware support for dynamic activation of compiler-directed computation reuse

Connors, D. A., Hunter, H. C., Cheng, B. C. & Hwu, W-M. W., Dec 2000, In : Operating Systems Review (ACM). 34, 5, p. 222-233 12 p.

Research output: Contribution to journal › Article

Chemical activation
Hardware
Redundancy

Hardware support for dynamic activation of Compiler-directed Computation Reuse

Connors, D. A., Hunter, H. C., Cheng, B. C. & Hwu, W-M. W., Jan 1 2000, In : International Conference on Architectural Support for Programming Languages and Operating Systems - ASPLOS. p. 222-233 12 p.

Research output: Contribution to journal › Article

Chemical activation
Hardware
Redundancy

Heterogeneous Computing Meets Near-Memory Acceleration and High-Level Synthesis in the Post-Moore Era

Kim, N. S., Chen, D., Xiong, J. & Hwu, W-M. W., Jan 1 2017, In : IEEE Micro. 37, 4, p. 10-18 9 p., 8013455.

Research output: Contribution to journal › Article

Data storage equipment
Particle accelerators
Energy efficiency
Bandwidth
High level synthesis

High-throughput Ant Colony Optimization on graphics processing units

Cecilia, J. M., Llanes, A., Abellán, J. L., Gómez-Luna, J., Chang, L. W. & Hwu, W-M. W., Mar 2018, In : Journal of Parallel and Distributed Computing. 113, p. 261-274 14 p.

Research output: Contribution to journal › Article

Vectorization
Ant colony optimization
Graphics Processing Unit
High Throughput
Optimization Algorithm

HPS Implementation of VAX; Initial Design and Analysis.

Hwu, W-M. W., Melvin, S., Shebanow, M., Chen, C., Wei, J. J. & Patt, Y., 1986, In : Proceedings of the Hawaii International Conference on System Science. 1, p. 282-291 10 p.

Research output: Contribution to journal › Article

Substrates
Engines
Experiments

HPS papers: A retrospective

Patt, Y. N., Hwu, W-M. W., Melvin, S. W. & Shebanow, M. C., 2016, In : IEEE Micro. 36, 4, p. 76-79 4 p., 7542473.

Research output: Contribution to journal › Article

Computer architecture
Substrates
Reduced instruction set computing
Flow control

Incremental compiler transformations for multiple instruction retry

Chen, S. K., Alewine, N. J., Fuchs, W. K. & Hwu, W-M. W., Dec 1994, In : Software: Practice and Experience. 24, 12, p. 1179-1198 20 p.

Research output: Contribution to journal › Article

Hazardous materials spills
Application programs
Hazards

Inline Function Expansion for Compiling C Programs

Chang, P. P. & Hwu, W. W., Jun 21 1989, In : ACM SIGPLAN Notices. 24, 7, p. 246-257 12 p.

Research output: Contribution to journal › Article

UNIX
Information use
Costs
Experiments

In-Place Matrix Transposition on GPUs

Gomez-Luna, J., Sung, I. J., Chang, L. W., Gonzalez-Linares, J. M., Guil, N. & Hwu, W. M. W., Mar 1 2016, In : IEEE Transactions on Parallel and Distributed Systems. 27, 3, p. 776-788 13 p., 7059219.

Research output: Contribution to journal › Article

Tile
Throughput
Fast Fourier transforms
Algebra
Program processors

In-place transposition of rectangular matrices on accelerators

Sung, I. J., Gómez-Luna, J., González-Linares, J. M., Guil, N. & Hwu, W-M. W., Aug 2014, In : ACM SIGPLAN Notices. 49, 8, p. 207-218 12 p.

Research output: Contribution to journal › Article

Particle accelerators
Program processors
Throughput
Data storage equipment
Data transfer

Java bytecode to native code translation: The Caffeine prototype and preliminary results

Hsieh, C. H. A., Gyllenhaal, J. C. & Hwu, W-M. W., 1996, In : Proceedings of the Annual International Symposium on Microarchitecture. p. 90-97 8 p.

Research output: Contribution to journal › Article

Caffeine
Internet
Hardware
Data storage equipment

Modulo Schedule Buffers

Merten, M. C. & Hwu, W-M. W., 2001, In : Proceedings of the Annual International Symposium on Microarchitecture. p. 138-149 12 p.

Research output: Contribution to journal › Article

Hardware
Signal processing
Scheduling