Wen-Mei W Hwu

1984 …2019

Research Output 1984–2019

Article
2018

Accelerator architectures: A ten-year retrospective

Hwu, W-M. W. & Patel, S. J., Nov 1 2018, In : IEEE Micro. 38, 6, p. 56-62 7 p., 8585394.

Research output: Contribution to journal › Article

Particle accelerators
Learning systems
Field programmable gate arrays (FPGA)
Education
Throughput

High-throughput Ant Colony Optimization on graphics processing units

Cecilia, J. M., Llanes, A., Abellán, J. L., Gómez-Luna, J., Chang, L. W. & Hwu, W-M. W., Mar 2018, In : Journal of Parallel and Distributed Computing. 113, p. 261-274 14 p.

Research output: Contribution to journal › Article

Vectorization
Ant colony optimization
Graphics Processing Unit
High Throughput
Optimization Algorithm

Semi-Coherent DMA: An Alternative I/O Coherency Management for Embedded Systems

Min, S. W., Alian, M., Hwu, W-M. W. & Kim, N. S., Aug 22 2018, (Accepted/In press) In : IEEE Computer Architecture Letters.

Research output: Contribution to journal › Article

Dynamic mechanical analysis
Embedded systems
Program processors
Data storage equipment
Bandwidth
2017

Heterogeneous Computing Meets Near-Memory Acceleration and High-Level Synthesis in the Post-Moore Era

Kim, N. S., Chen, D., Xiong, J. & Hwu, W-M. W., Jan 1 2017, In : IEEE Micro. 37, 4, p. 10-18 9 p., 8013455.

Research output: Contribution to journal › Article

Data storage equipment
Particle accelerators
Energy efficiency
Bandwidth
High level synthesis
2016

BLESS 2: Accurate, memory-efficient and fast error correction method

Heo, Y., Ramachandran, A., Hwu, W-M. W., Ma, J. & Chen, D., Aug 1 2016, In : Bioinformatics. 32, 15, p. 2369-2371 3 p.

Research output: Contribution to journal › Article

Error correction
Data storage equipment
Genome
Efficiency

FCUDA-HB: Hierarchical and Scalable Bus Architecture Generation on FPGAs With the FCUDA Flow

Chen, Y., Nguyen, T., Chen, Y., Gurumani, S. T., Liang, Y., Rupnow, K., Cong, J., Hwu, W-M. W. & Chen, D., Dec 2016, In : IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems. 35, 12, p. 2032-2045 14 p., 7450674.

Research output: Contribution to journal › Article

Field programmable gate arrays (FPGA)
Bandwidth
High level languages
Core levels
High level synthesis

HPS papers: A retrospective

Patt, Y. N., Hwu, W-M. W., Melvin, S. W. & Shebanow, M. C., 2016, In : IEEE Micro. 36, 4, p. 76-79 4 p., 7542473.

Research output: Contribution to journal › Article

Computer architecture
Substrates
Reduced instruction set computing
Flow control

In-Place Matrix Transposition on GPUs

Gómez-Luna, J., Sung, I. J., Chang, L. W., González-Linares, J. M., Guil, N. & Hwu, W-M. W., Mar 1 2016, In : IEEE Transactions on Parallel and Distributed Systems. 27, 3, p. 776-788 13 p., 7059219.

Research output: Contribution to journal › Article

Tile
Throughput
Fast Fourier transforms
Algebra
Program processors
2015

Optimized Data Transfers Based on the OpenCL Event Management Mechanism

Takizawa, H., Hirasawa, S., Sugawara, M., Gelado, I., Kobayashi, H. & Hwu, W-M. W., Jan 1 2015, In : Scientific Programming. 2015, 576498.

Research output: Contribution to journal › Article

Data transfer
Communication

Runtime and Architecture Support for Efficient Data Exchange in Multi-Accelerator Applications

Cabezas, J., Gelado, I., Stone, J. E., Navarro, N., Kirk, D. B. & Hwu, W-M. W., May 1 2015, In : IEEE Transactions on Parallel and Distributed Systems. 26, 5, p. 1405-1418 14 p., 6803940.

Research output: Contribution to journal › Article

Electronic data interchange
Particle accelerators
Hardware
Parallel programming
Maintainability
2014

BLESS: Bloom filter-based error correction solution for high-throughput sequencing reads

Heo, Y., Wu, X. L., Chen, D., Ma, J. & Hwu, W-M. W., May 15 2014, In : Bioinformatics. 30, 10, p. 1354-1362 9 p.

Research output: Contribution to journal › Article

Bloom Filter
Error correction
Sequencing
High Throughput

In-place transposition of rectangular matrices on accelerators

Sung, I. J., Gómez-Luna, J., González-Linares, J. M., Guil, N. & Hwu, W-M. W., Aug 2014, In : ACM SIGPLAN Notices. 49, 8, p. 207-218 12 p.

Research output: Contribution to journal › Article

Particle accelerators
Program processors
Throughput
Data storage equipment
Data transfer

Triolet: A programming system that unifies algorithmic skeleton interfaces for high-performance cluster computing

Rodrigues, C., Jablin, T., Dakkak, A. & Hwu, W-M. W., Aug 2014, In : ACM SIGPLAN Notices. 49, 8, p. 247-258 12 p.

Research output: Contribution to journal › Article

Cluster computing
Computer systems programming
Data storage equipment
Parallel programming
Electric fuses

What is ahead for parallel computing

Hwu, W-M. W., Jul 2014, In : Journal of Parallel and Distributed Computing. 74, 7, p. 2574-2581 8 p.

Research output: Contribution to journal › Article

Parallel processing systems
Parallel Computing
Parallel algorithms
Many-core
2013

Efficient compilation of CUDA kernels for high-performance computing on FPGAs

Papakonstantinou, A., Gururaj, K., Stratton, J. A., Chen, D., Cong, J. & Hwu, W-M. W., Oct 21 2013, In : Transactions on Embedded Computing Systems. 13, 2, 25.

Research output: Contribution to journal › Article

Field programmable gate arrays (FPGA)
Particle accelerators
Imaging techniques
Processing

More IMPATIENT: A gridding-accelerated Toeplitz-based strategy for non-Cartesian high-resolution 3D MRI on GPUs

Gai, J., Obeid, N., Holtrop, J. L., Wu, X. L., Lam, F., Fu, M., Haldar, J. P., Hwu, W-M. W., Liang, Z. P. & Sutton, B. P., May 2013, In : Journal of Parallel and Distributed Computing. 73, 5, p. 686-697 12 p.

Research output: Contribution to journal › Article

Otto Toeplitz
Image Reconstruction
Magnetic resonance imaging
High Resolution
Image reconstruction

Rapid computation of sodium bioscales using GPU-accelerated image reconstruction

Atkinson, I. C., Liu, G., Obeid, N., Thulborn, K. R. & Hwu, W-M. W., Mar 1 2013, In : International Journal of Imaging Systems and Technology. 23, 1, p. 29-35 7 p.

Research output: Contribution to journal › Article

Image reconstruction
Sodium
Tissue
Imaging techniques
Program processors

Real-time in vivo computed optical interferometric tomography

Ahmad, A., Shemonski, N. D., Adie, S. G., Kim, H. S., Hwu, W-M. W., Carney, P. S. & Boppart, S. A., Jun 1 2013, In : Nature Photonics. 7, 6, p. 444-448 5 p.

Research output: Contribution to journal › Article

Optical tomography
Tomography
high resolution
Tissue

Scalable SIMD-parallel memory allocation for many-core machines

Huang, X., Rodrigues, C. I., Jones, S., Buck, I. & Hwu, W-M. W., Jun 1 2013, In : Journal of Supercomputing. 64, 3, p. 1008-1020 13 p.

Research output: Contribution to journal › Article

Storage allocation (computer)
Many-core
Throughput
Data storage equipment
Computer systems programming
2012

Algorithm and data optimization techniques for scaling to massively threaded systems

Stratton, J. A., Rodrigues, C., Sung, I. J. R., Chang, L. W., Anssari, N., Liu, G. D., Hwu, W-M. W. & Obeid, N., Aug 29 2012, Computer, 45, 8, p. 26-32 7 p.

Research output: Contribution to specialist publication › Article

Scalability
Graphics processing unit

Data layout transformation exploiting memory-level parallelism in structured grid many-core applications

Sung, I. J., Anssari, N., Stratton, J. A. & Hwu, W-M. W., Feb 1 2012, In : International Journal of Parallel Programming. 40, 1, p. 4-24 21 p.

Research output: Contribution to journal › Article

Many-core
Parallelism
Layout
Grid
Data storage equipment

Efficient performance evaluation of memory hierarchy for highly multithreaded graphics processors

Baghsorkhi, S. S., Gelado, I., Delahaye, M. & Hwu, W-M. W., Aug 1 2012, In : ACM SIGPLAN Notices. 47, 8, p. 23-33 11 p.

Research output: Contribution to journal › Article

Data storage equipment
Sampling
Monitoring
Hardware
Graphics processing unit

TIGER: Tiled iterative genome assembler

Wu, X. L., Heo, Y., El Hajj, I., Hwu, W-M. W., Chen, D. & Ma, J., 2012, In : Unknown Journal. 13 Suppl 19

Research output: Contribution to journal › Article

Tigers
Genome
Genes
Data storage equipment
Sequencing
2011

EcoG: A power-efficient GPU cluster architecture for scientific computing

Showerman, M., Enos, J. J., Steffen, C. P., Treichler, S., Gropp, W. D. & Hwu, W-M. W., Mar 1 2011, In : Computing in Science and Engineering. 13, 2, p. 83-87 5 p., 5725240.

Research output: Contribution to journal › Article

Natural sciences computing
Graphics processing unit
2010

An adaptive performance modeling tool for GPU architectures

Baghsorkhi, S. S., Delahaye, M., Patel, S. J., Gropp, W. D. & Hwu, W-M. W., May 1 2010, In : ACM SIGPLAN Notices. 45, 5, p. 105-114 10 p.

Research output: Contribution to journal › Article

Flow graphs
Data storage equipment
Graphics processing unit
Flow control
Analytical models

An asymmetric distributed shared memory model for heterogeneous parallel systems

Gelado, I., Cabezas, J., Navarro, N., Stone, J. E., Patel, S. J. & Hwu, W-M. W., Mar 1 2010, In : ACM SIGPLAN Notices. 45, 3, p. 347-358 12 p.

Research output: Contribution to journal › Article

Computer systems
Data storage equipment
Particle accelerators
Program processors
Data transfer
2009

Compute unified device architecture application suitability

Hwu, W-M. W., Rodrigues, C., Ryoo, S. & Stratton, J., May 1 2009, In : Computing in Science and Engineering. 11, 3, p. 16-26 11 p., 4814979.

Research output: Contribution to journal › Article

Graphics processing unit

Hardware-compiler co-design for adjustable data power savings

Hunter, H. C., Nystrom, E. M., Connors, D. A. & Hwu, W-M. W., Jun 1 2009, In : Microprocessors and Microsystems. 33, 4, p. 244-253 10 p.

Research output: Contribution to journal › Article

Static random access storage
Hardware
Processing
Information management
Telecommunication
2008

Accelerating advanced MRI reconstructions on GPUs

Stone, S. S., Haldar, J. P., Tsao, S. C., Hwu, W-M. W., Sutton, B. P. & Liang, Z. P., Oct 1 2008, In : Journal of Parallel and Distributed Computing. 68, 10, p. 1307-1318 12 p.

Research output: Contribution to journal › Article

Magnetic Resonance Imaging
Graphics Processing Unit
Magnetic resonance
Imaging techniques
Percent

Program optimization carving for GPU computing

Ryoo, S., Rodrigues, C. I., Stone, S. S., Stratton, J. A., Ueng, S. Z., Baghsorkhi, S. S. & Hwu, W-M. W., Oct 1 2008, In : Journal of Parallel and Distributed Computing. 68, 10, p. 1389-1401 13 p.

Research output: Contribution to journal › Article

Configuration
Optimization
Computing
Many-core
Random Sampling

The concurrency challenge

Hwu, W-M. W., Keutzer, K. & Mattson, T. G., Aug 21 2008, In : IEEE Design and Test of Computers. 25, 4, p. 312-320 9 p.

Research output: Contribution to journal › Article

Microprocessor chips
Semiconductor materials
Hardware
Industry
2007

Toward application-aware security and reliability

Iyer, R. K., Kalbarczyk, Z. T., Pattabiraman, K., Healey, W., Hwu, W-M. W., Klemperer, P. & Farivar, R., Jan 1 2007, In : IEEE Security and Privacy. 5, 1, p. 57-62 6 p.

Research output: Contribution to journal › Article

Hardware
corruption
Computer systems
Values
2006

Beating in-order stalls with "Flea-Flicker" two-pass pipelining

Barnes, R. D., Sias, J. W., Nystrom, E. M., Patel, S. J., Navarro, J. & Hwu, W-M. W., Jan 1 2006, In : IEEE Transactions on Computers. 55, 1, p. 18-33 16 p.

Research output: Contribution to journal › Article

Pipelining
Pipelines
Latency
Compiler
Percent

Tolerating cache-miss latency with multipass pipelines

Barnes, R. D., Ryoo, S. & Hwu, W-M. W., Jan 1 2006, In : IEEE Micro. 26, 1, p. 40-47 8 p.

Research output: Contribution to journal › Article

Pipelines
Scheduling
Data storage equipment
2004
Bottom-up
Scalability
Context
Subgraph
Benchmark
2003

Energy saving and capacity improvement potential of power control in multi-hop wireless networks

Monks, J., Ebert, J. P., Hwu, W-M. W. & Wolisz, A., Feb 21 2003, In : Computer Networks. 41, 3, p. 313-330 18 p.

Research output: Contribution to journal › Article

Power control
Wireless networks
Energy conservation
Topology
Packet networks
2001

An architectural framework for runtime optimization

Merten, M. C., Trick, A. R., Barnes, R. D., Nystrom, E. M., George, C. N., Gyllenhaal, J. C. & Hwu, W-M. W., Jun 1 2001, In : IEEE Transactions on Computers. 50, 6, p. 567-589 23 p.

Research output: Contribution to journal › Article

Optimization
Hardware
Straightening
Branch Prediction
Reoptimization

A power controlled multiple access protocol for wireless packet networks

Monks, J. P., Bharghavan, V. & Hwu, W-M. W., 2001, In : Proceedings - IEEE INFOCOM. 1, p. 219-228 10 p.

Research output: Contribution to journal › Article

Packet networks
Network protocols
Collision avoidance
Ad hoc networks
Power transmission

A study of the energy saving and capacity improvement potential of power control in multi-hop wireless networks

Monks, J. P., Ebert, J. P., Wolisz, A. A. & Hwu, W-M. W., Jan 1 2001, In : Conference on Local Computer Networks. p. 550-559 10 p., 72.

Research output: Contribution to journal › Article

Power control
Wireless networks
Energy conservation
Topology
Packet networks
Flow control
Telecommunication
Hardware
Processing

Modulo Schedule Buffers

Merten, M. C. & Hwu, W-M. W., 2001, In : Proceedings of the Annual International Symposium on Microarchitecture. p. 138-149 12 p.

Research output: Contribution to journal › Article

Hardware
Signal processing
Scheduling

Program Decision Logic Optimization Using Predication and Control Speculation

Hwu, W-M. W., August, D. I. & Sias, J. W., Nov 2001, In : Proceedings of the IEEE. 89, 11, p. 1660-1675 16 p.

Research output: Contribution to journal › Article

Binary decision diagrams
Global optimization
Flow control
2000

Hardware mechanism for dynamic extraction and relayout of program hot spots

Merten, M. C., Trick, A. R., Nystrom, E. M., Barnes, R. D. & Hwu, W-M. W., 2000, In : Conference Proceedings - Annual International Symposium on Computer Architecture, ISCA. p. 59-70 12 p.

Research output: Contribution to journal › Article

Hardware
Switches
Data storage equipment

Hardware support for dynamic activation of compiler-directed computation reuse

Connors, D. A., Hunter, H. C., Cheng, B. C. & Hwu, W-M. W., Dec 2000, In : Operating Systems Review (ACM). 34, 5, p. 222-233 12 p.

Research output: Contribution to journal › Article

Chemical activation
Hardware
Redundancy

Hardware support for dynamic activation of compiler-directed computation reuse

Connors, D. A., Hunter, H. C., Cheng, B. C. & Hwu, W-M. W., Nov 2000, In : SIGPLAN Notices (ACM Special Interest Group on Programming Languages). 35, 11, p. 222-233 12 p.

Research output: Contribution to journal › Article

Chemical activation
Hardware
Redundancy

Hardware support for dynamic activation of compiler-directed computation reuse

Connors, D. A., Hunter, H. C., Cheng, B. C. & Hwu, W-M. W., Jan 1 2000, In : International Conference on Architectural Support for Programming Languages and Operating Systems - ASPLOS. p. 222-233 12 p.

Research output: Contribution to journal › Article

Chemical activation
Hardware
Redundancy

Transmission power control for multiple access wireless packet networks

Monks, J. P., Bharghavan, V. & Hwu, W-M. W., 2000, In : Conference on Local Computer Networks. p. 12-21 10 p.

Research output: Contribution to journal › Article

Packet networks
Power control
Network protocols
Wireless ad hoc networks
Collision avoidance
1999

A new framework for debugging globally optimized code

Wu, L. C., Mirani, R., Patil, H., Olsen, B. & Hwu, W-M. W., May 1999, In : SIGPLAN Notices (ACM Special Interest Group on Programming Languages). 34, 5, p. 181-191 11 p.

Research output: Contribution to journal › Article

Recovery
Experiments

Hardware-driven profiling scheme for identifying program hot spots to support runtime optimization

Merten, M. C., Trick, A. R., George, C. N., Gyllenhaal, J. C. & Hwu, W-M. W., 1999, In : Conference Proceedings - Annual International Symposium on Computer Architecture, ISCA. p. 136-147 12 p.

Research output: Contribution to journal › Article

Hardware
Pipelines
Monitoring
Experiments

Partial reverse if-conversion framework for balancing control flow and predication

August, D. I., Hwu, W-M. W. & Mahlke, S. A., Jan 1 1999, In : International Journal of Parallel Programming. 27, 5, p. 381-423 43 p.

Research output: Contribution to journal › Article

Flow control
Balancing
Reverse
Partial