Wen-Mei W Hwu

Research Output

2018

Accelerator architectures: A ten-year retrospective

Hwu, W. M. & Patel, S., Nov 1 2018, In : IEEE Micro. 38, 6, p. 56-62 7 p., 8585394.

Research output: Contribution to journal › Article

High-throughput Ant Colony Optimization on graphics processing units

Cecilia, J. M., Llanes, A., Abellán, J. L., Gómez-Luna, J., Chang, L. W. & Hwu, W. M. W., Mar 2018, In : Journal of Parallel and Distributed Computing. 113, p. 261-274 14 p.

Research output: Contribution to journal › Article

Semi-Coherent DMA: An Alternative I/O Coherency Management for Embedded Systems

Min, S., Alian, M., Hwu, W. M. & Kim, N. S., Jul 1 2018, In : IEEE Computer Architecture Letters. 17, 2, p. 221-224 4 p., 8444757.

Research output: Contribution to journal › Article

2017

Heterogeneous Computing Meets Near-Memory Acceleration and High-Level Synthesis in the Post-Moore Era

Kim, N. S., Chen, D., Xiong, J. & Hwu, W. M. W., Jan 1 2017, In : IEEE Micro. 37, 4, p. 10-18 9 p., 8013455.

Research output: Contribution to journal › Article

2016

BLESS 2: Accurate, memory-efficient and fast error correction method

Heo, Y., Ramachandran, A., Hwu, W. M., Ma, J. & Chen, D., Aug 1 2016, In : Bioinformatics. 32, 15, p. 2369-2371 3 p.

Research output: Contribution to journal › Article

FCUDA-HB: Hierarchical and Scalable Bus Architecture Generation on FPGAs With the FCUDA Flow

Chen, Y., Nguyen, T., Chen, Y., Gurumani, S. T., Liang, Y., Rupnow, K., Cong, J., Hwu, W. M. & Chen, D., Dec 2016, In : IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems. 35, 12, p. 2032-2045 14 p., 7450674.

Research output: Contribution to journal › Article

HPS papers: A retrospective

Patt, Y. N., Hwu, W. M. W., Melvin, S. W. & Shebanow, M. C., Jan 1 2016, In : IEEE Micro. 36, 4, p. 76-79 4 p., 7542473.

Research output: Contribution to journal › Article

In-Place Matrix Transposition on GPUs

Gomez-Luna, J., Sung, I. J., Chang, L. W., Gonzalez-Linares, J. M., Guil, N. & Hwu, W. M. W., Mar 1 2016, In : IEEE Transactions on Parallel and Distributed Systems. 27, 3, p. 776-788 13 p., 7059219.

Research output: Contribution to journal › Article

2015

Optimized Data Transfers Based on the OpenCL Event Management Mechanism

Takizawa, H., Hirasawa, S., Sugawara, M., Gelado, I., Kobayashi, H. & Hwu, W. M. W., Jan 1 2015, In : Scientific Programming. 2015, 576498.

Research output: Contribution to journal › Article

Runtime and Architecture Support for Efficient Data Exchange in Multi-Accelerator Applications

Cabezas, J., Gelado, I., Stone, J. E., Navarro, N., Kirk, D. B. & Hwu, W. M., May 1 2015, In : IEEE Transactions on Parallel and Distributed Systems. 26, 5, p. 1405-1418 14 p., 6803940.

Research output: Contribution to journal › Article

2014

BLESS: Bloom filter-based error correction solution for high-throughput sequencing reads

Heo, Y., Wu, X. L., Chen, D., Ma, J. & Hwu, W. M., May 15 2014, In : Bioinformatics. 30, 10, p. 1354-1362 9 p.

Research output: Contribution to journal › Article

In-place transposition of rectangular matrices on accelerators

Sung, I. J., Gómez-Luna, J., González-Linares, J. M., Guil, N. & Hwu, W-M. W., Aug 2014, In : ACM SIGPLAN Notices. 49, 8, p. 207-218 12 p.

Research output: Contribution to journal › Article

Triolet: A programming system that unifies algorithmic skeleton interfaces for high-performance cluster computing

Rodrigues, C., Jablin, T., Dakkak, A. & Hwu, W. M., Aug 2014, In : ACM SIGPLAN Notices. 49, 8, p. 247-258 12 p.

Research output: Contribution to journal › Article

What is ahead for parallel computing

Hwu, W-M. W., Jul 2014, In : Journal of Parallel and Distributed Computing. 74, 7, p. 2574-2581 8 p.

Research output: Contribution to journal › Article

2013

Efficient compilation of CUDA kernels for high-performance computing on FPGAs

Papakonstantinou, A., Gururaj, K., Stratton, J. A., Chen, D., Cong, J. & Hwu, W. M. W., Oct 21 2013, In : Transactions on Embedded Computing Systems. 13, 2, 25.

Research output: Contribution to journal › Article

More IMPATIENT: A gridding-accelerated Toeplitz-based strategy for non-Cartesian high-resolution 3D MRI on GPUs

Gai, J., Obeid, N., Holtrop, J. L., Wu, X. L., Lam, F., Fu, M., Haldar, J. P., Hwu, W. M. W., Liang, Z. P. & Sutton, B. P., May 2013, In : Journal of Parallel and Distributed Computing. 73, 5, p. 686-697 12 p.

Research output: Contribution to journal › Article

Rapid computation of sodium bioscales using GPU-accelerated image reconstruction

Atkinson, I. C., Liu, G., Obeid, N., Thulborn, K. R. & Hwu, W. M., Mar 1 2013, In : International Journal of Imaging Systems and Technology. 23, 1, p. 29-35 7 p.

Research output: Contribution to journal › Article

Real-time in vivo computed optical interferometric tomography

Ahmad, A., Shemonski, N. D., Adie, S. G., Kim, H. S., Hwu, W. M. W., Carney, P. S. & Boppart, S. A., Jun 1 2013, In : Nature Photonics. 7, 6, p. 444-448 5 p.

Research output: Contribution to journal › Article

Scalable SIMD-parallel memory allocation for many-core machines

Huang, X., Rodrigues, C. I., Jones, S., Buck, I. & Hwu, W. M., Jun 1 2013, In : Journal of Supercomputing. 64, 3, p. 1008-1020 13 p.

Research output: Contribution to journal › Article

2012

Algorithm and data optimization techniques for scaling to massively threaded systems

Stratton, J. A., Rodrigues, C., Sung, I. J. R., Chang, L. W., Anssari, N., Liu, G. D., Hwu, W-M. W. & Obeid, N., Aug 29 2012, Computer, 45, 8, p. 26-32 7 p.

Research output: Contribution to specialist publication › Article

Data layout transformation exploiting memory-level parallelism in structured grid many-core applications

Sung, I. J., Anssari, N., Stratton, J. A. & Hwu, W. M. W., Feb 1 2012, In : International Journal of Parallel Programming. 40, 1, p. 4-24 21 p.

Research output: Contribution to journal › Article

Efficient performance evaluation of memory hierarchy for highly multithreaded graphics processors

Baghsorkhi, S. S., Gelado, I., Delahaye, M. & Hwu, W. M. W., Aug 1 2012, In : ACM SIGPLAN Notices. 47, 8, p. 23-33 11 p.

Research output: Contribution to journal › Article

TIGER: tiled iterative genome assembler

Wu, X. L., Heo, Y., El Hajj, I., Hwu, W. M., Chen, D. & Ma, J., 2012, In : Unknown Journal. 13 Suppl 19, p. S18

Research output: Contribution to journal › Article

2011

EcoG: A power-efficient GPU cluster architecture for scientific computing

Showerman, M., Enos, J. J., Steffen, C. P., Treichler, S., Gropp, W. D. & Hwu, W-M. W., Mar 1 2011, In : Computing in Science and Engineering. 13, 2, p. 83-87 5 p., 5725240.

Research output: Contribution to journal › Article

2010

An adaptive performance modeling tool for GPU architectures

Baghsorkhi, S. S., Delahaye, M., Patel, S. J., Gropp, W. D. & Hwu, W. M. W., May 1 2010, In : ACM SIGPLAN Notices. 45, 5, p. 105-114 10 p.

Research output: Contribution to journal › Article

An asymmetric distributed shared memory model for heterogeneous parallel systems

Gelado, I., Cabezas, J., Navarro, N., Stone, J. E., Patel, S. & Hwu, W. M. W., Mar 1 2010, In : ACM SIGPLAN Notices. 45, 3, p. 347-358 12 p.

Research output: Contribution to journal › Article

2009

Compute unified device architecture application suitability

Hwu, W. M., Rodrigues, C., Ryoo, S. & Stratton, J., May 1 2009, In : Computing in Science and Engineering. 11, 3, p. 16-26 11 p., 4814979.

Research output: Contribution to journal › Article

Hardware-compiler co-design for adjustable data power savings

Hunter, H. C., Nystrom, E. M., Connors, D. A. & Hwu, W. M. W., Jun 1 2009, In : Microprocessors and Microsystems. 33, 4, p. 244-253 10 p.

Research output: Contribution to journal › Article

2008

Accelerating advanced MRI reconstructions on GPUs

Stone, S. S., Haldar, J. P., Tsao, S. C., Hwu, W. M. W., Sutton, B. P. & Liang, Z. P., Oct 1 2008, In : Journal of Parallel and Distributed Computing. 68, 10, p. 1307-1318 12 p.

Research output: Contribution to journal › Article

Program optimization carving for GPU computing

Ryoo, S., Rodrigues, C. I., Stone, S. S., Stratton, J. A., Ueng, S. Z., Baghsorkhi, S. S. & Hwu, W. M. W., Oct 1 2008, In : Journal of Parallel and Distributed Computing. 68, 10, p. 1389-1401 13 p.

Research output: Contribution to journal › Article

The concurrency challenge

Hwu, W. M., Keutzer, K. & Mattson, T. G., Aug 21 2008, In : IEEE Design and Test of Computers. 25, 4, p. 312-320 9 p.

Research output: Contribution to journal › Article

2007

Toward application-aware security and reliability

Iyer, R. K., Kalbarczyk, Z., Pattabiraman, K., Healey, W., Hwu, W. W., Klemperer, P. & Farivar, R., Jan 1 2007, In : IEEE Security and Privacy. 5, 1, p. 57-62 6 p.

Research output: Contribution to journal › Article

2006

Beating in-order stalls with "Flea-Flicker" two-pass pipelining

Barnes, R. D., Sias, J. W., Nystrom, E. M., Patel, S. J., Navarro, J. & Hwu, W. M. W., Jan 1 2006, In : IEEE Transactions on Computers. 55, 1, p. 18-33 16 p.

Research output: Contribution to journal › Article

Tolerating cache-miss latency with multipass pipelines

Barnes, R. D., Ryoo, S. & Hwu, W. M. W., Jan 1 2006, In : IEEE Micro. 26, 1, p. 40-47 8 p.

Research output: Contribution to journal › Article

2003

Energy saving and capacity improvement potential of power control in multi-hop wireless networks

Monks, J., Ebert, J. P., Hwu, W-M. W. & Wolisz, A., Feb 21 2003, In : Computer Networks. 41, 3, p. 313-330 18 p.

Research output: Contribution to journal › Article

2001

An architectural framework for runtime optimization

Merten, M. C., Trick, A. R., Barnes, R. D., Nystrom, E. M., George, C. N., Gyllenhaal, J. C. & Hwu, W. M. W., Jun 1 2001, In : IEEE Transactions on Computers. 50, 6, p. 567-589 23 p.

Research output: Contribution to journal › Article

A power controlled multiple access protocol for wireless packet networks

Monks, J. P., Bharghavan, V. & Hwu, W-M. W., 2001, In : Proceedings - IEEE INFOCOM. 1, p. 219-228 10 p.

Research output: Contribution to journal › Article

A study of the energy saving and capacity improvement potential of power control in multi-hop wireless networks

Monks, J. P., Ebert, J. P., Wolisz, A. & Hwu, W-M. W., Jan 1 2001, In : Conference on Local Computer Networks. p. 550-559 10 p., 72.

Research output: Contribution to journal › Article

Modulo Schedule Buffers

Merten, M. C. & Hwu, W-M. W., 2001, In : Proceedings of the Annual International Symposium on Microarchitecture. p. 138-149 12 p.

Research output: Contribution to journal › Article

Program Decision Logic Optimization Using Predication and Control Speculation

Hwu, W. M. W., August, D. I. & Sias, J. W., Nov 2001, In : Proceedings of the IEEE. 89, 11, p. 1660-1675 16 p.

Research output: Contribution to journal › Article

2000

Hardware support for dynamic activation of compiler-directed computation reuse

Connors, D. A., Hunter, H. C., Cheng, B. C. & Hwu, W. M. W., Nov 2000, In : SIGPLAN Notices (ACM Special Interest Group on Programming Languages). 35, 11, p. 222-233 12 p.

Research output: Contribution to journal › Article

Hardware support for dynamic activation of compiler-directed computation reuse

Connors, D. A., Hunter, H. C., Cheng, B. C. & Hwu, W. M. W., Dec 2000, In : Operating Systems Review (ACM). 34, 5, p. 222-233 12 p.

Research output: Contribution to journal › Article

Hardware support for dynamic activation of compiler-directed computation reuse

Connors, D. A., Hunter, H. C., Cheng, B. C. & Hwu, W. M. W., Jan 1 2000, In : International Conference on Architectural Support for Programming Languages and Operating Systems - ASPLOS. p. 222-233 12 p.

Research output: Contribution to journal › Article

1999

A new framework for debugging globally optimized code

Wu, L. C., Mirani, R., Patil, H., Olsen, B. & Hwu, W-M. W., May 1999, In : SIGPLAN Notices (ACM Special Interest Group on Programming Languages). 34, 5, p. 181-191 11 p.

Research output: Contribution to journal › Article

Partial reverse if-conversion framework for balancing control flow and predication

August, D. I., Hwu, W. M. W. & Mahlke, S. A., Jan 1 1999, In : International Journal of Parallel Programming. 27, 5, p. 381-423 43 p.

Research output: Contribution to journal › Article

Run-time cache bypassing

Johnson, T. L., Connors, D. A., Merten, M. C. & Hwu, W. M. W., Dec 1 1999, In : IEEE Transactions on Computers. 48, 12, p. 1338-1354 17 p.

Research output: Contribution to journal › Article

1998

Combining trace sampling with single pass methods for efficient cache simulation

Conte, T. M., Hirsch, M. A. & Hwu, W. M. W., Dec 1 1998, In : IEEE Transactions on Computers. 47, 6, p. 714-719 6 p.

Research output: Contribution to journal › Article

Optimization of Machine Descriptions for Efficient Use

Gyllenhaal, J. C., Hwu, W. M. W. & Rau, B. R., Jan 1 1998, In : International Journal of Parallel Programming. 26, 4, p. 417-447 31 p.

Research output: Contribution to journal › Article