Wen-Mei W Hwu

1984 …2019

Research Output 1984 2019

2009

High performance computation and display of molecular orbitals on GPUs and multi-core CPUs

Stone, J. E., Saam, J., Hardy, D. J., Vandivort, K. L., Hwu, W-M. W. & Schulten, K. J., Jul 23 2009, Proceedings of 2nd Workshop on General Purpose Processing on Graphics Processing Units, GPGPU-2. 1 p. (Proceedings of 2nd Workshop on General Purpose Processing on Graphics Processing Units, GPGPU-2).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Molecular orbitals
Display devices
Program processors
Quantum chemistry
Orbital calculations

High-performance CUDA kernel execution on FPGAs

Papakonstantinou, A., Gururaj, K., Stratton, J. A., Chen, D., Cong, J. & Hwu, W-M. W., Nov 24 2009, ICS'09 - Proceedings of the 23rd International Conference on Supercomputing. p. 515-516 2 p. 1542357. (Proceedings of the International Conference on Supercomputing).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Field programmable gate arrays (FPGA)
Particle accelerators

Long time-scale simulations of in vivo diffusion using GPU hardware

Roberts, E., Stone, J. E., Sepúlveda, L., Hwu, W-M. W. & Luthey-Schulten, Z. A., Nov 25 2009, IPDPS 2009 - Proceedings of the 2009 IEEE International Parallel and Distributed Processing Symposium. 5160930. (IPDPS 2009 - Proceedings of the 2009 IEEE International Parallel and Distributed Processing Symposium).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Hardware
Graphics processing unit

Optimization of tele-immersion codes

Sidelnik, A., Sung, I. J., Wu, W., Garzarán, M. J., Hwu, W. M., Nahrstedt, K., Padua, D. & Patel, S. J., Jul 23 2009, Proceedings of 2nd Workshop on General Purpose Processing on Graphics Processing Units, GPGPU-2. 1 p. (Proceedings of 2nd Workshop on General Purpose Processing on Graphics Processing Units, GPGPU-2).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Computer vision
Parallel programming
Tuning
Productivity
Graphics processing unit

The parallelization of video processing: From programming models to applications

Lin, D., Huang, X., Nguyen, Q., Blackburn, J., Rodrigues, C., Huang, T. S., Do, M. N., Patel, S. J. & Hwu, W-M. W., Jan 1 2009, In: IEEE Signal Processing Magazine. 26, 6, p. 103-112 10 p.

Research output: Contribution to journal › Review article

Video Processing
Parallelization
Programming Model
Digital Video
Processing

2008

Accelerating advanced MRI reconstructions on GPUs

Stone, S. S., Haldar, J. P., Tsao, S. C., Hwu, W. M. W., Sutton, B. P. & Liang, Z. P., Oct 1 2008, In: Journal of Parallel and Distributed Computing. 68, 10, p. 1307-1318 12 p.

Research output: Contribution to journal › Article

Magnetic Resonance Imaging
Graphics Processing Unit
Magnetic resonance
Imaging techniques
Percent

Accelerating advanced MRI reconstructions on GPUs

Stone, S. S., Haldar, J. P., Tsao, S. C., Hwu, W. M. W., Liang, Z. P. & Sutton, B. P., Dec 1 2008, Conference on Computing Frontiers - Proceedings of the 2008 Conference on Computing Frontiers, CF'08. p. 261-272 12 p. (Conference on Computing Frontiers - Proceedings of the 2008 Conference on Computing Frontiers, CF'08).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Magnetic resonance
Imaging techniques
Image quality
Data storage equipment
Image reconstruction

Accelerator architectures

Patel, S. J. & Hwu, W-M. W., Oct 17 2008, In: IEEE Micro. 28, 4, p. 4-12 9 p.

Research output: Contribution to journal › Editorial

Particle accelerators
Sampling

Application acceleration with the explicitly parallel operations system - The EPOS processor

Papakonstantinou, A., Chen, D. & Hwu, W-M. W., Sep 29 2008, p. 20-25. 6 p.

Research output: Contribution to conference › Paper

Data flow graphs
High level languages
Wire
Engines
Hardware

CUBA: An architecture for efficient CPU/Co-processor data communication

Gelado, I., Kelm, J. H., Ryoo, S., Lumetta, S. S., Navarro, N. & Hwu, W-M. W., 2008, ICS'08 - Proceedings of the 2008 ACM International Conference on Supercomputing. p. 299-308 10 p.

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Program processors
Communication
Data structures
Data storage equipment
Coprocessor

CUDA-Lite: Reducing GPU programming complexity

Ueng, S. Z., Lathara, M., Baghsorkhi, S. S. & Hwu, W-M. W., Dec 1 2008, Languages and Compilers for Parallel Computing - 21st International Workshop, LCPC 2008, Revised Selected Papers. p. 1-15 15 p. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); vol. 5335 LNCS).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Computer programming
Programming
Many-core
Data storage equipment
Coding

GPU acceleration of cutoff pair potentials for molecular modeling applications

Rodrigues, C. I., Hardy, D. J., Stone, J. E., Schulten, K. J. & Hwu, W-M. W., Dec 1 2008, Conference on Computing Frontiers - Proceedings of the 2008 Conference on Computing Frontiers, CF'08. p. 273-282 10 p. (Conference on Computing Frontiers - Proceedings of the 2008 Conference on Computing Frontiers, CF'08).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Molecular modeling
Atoms
Decomposition
Data storage equipment
Computer programming

Iteration disambiguation for parallelism identification in time-sliced applications

Ryoo, S., Rodrigues, C. I. & Hwu, W-M. W., Oct 27 2008, Languages and Compilers for Parallel Computing - 20th International Workshop, LCPC 2007, Revised Selected Papers. p. 110-124 15 p. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); vol. 5234 LNCS).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Parallelism
Iteration
Slice
Heap
Microprocessor chips

MCUDA: An efficient implementation of CUDA kernels for multi-core CPUs

Stratton, J. A., Stone, S. S. & Hwu, W-M. W., Dec 1 2008, Languages and Compilers for Parallel Computing - 21st International Workshop, LCPC 2008, Revised Selected Papers. p. 16-30 15 p. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); vol. 5335 LNCS).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Efficient Implementation
Program processors
Parallel programming
kernel

Message from the program chair

Hwu, W-M. W., Oct 1 2008, In: Proceedings - International Symposium on Computer Architecture. 4556707.

Research output: Contribution to journal › Editorial

Optimization principles and application performance evaluation of a multithreaded GPU using CUDA

Ryoo, S., Rodrigues, C. I., Baghsorkhi, S. S., Stone, S. S., Kirk, D. B. & Hwu, W-M. W., 2008, PPoPP'08 - Proceedings of the 2008 ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming. p. 73-82 10 p.

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Data storage equipment
Bandwidth
Graphics processing unit
Hardware

Program optimization carving for GPU computing

Ryoo, S., Rodrigues, C. I., Stone, S. S., Stratton, J. A., Ueng, S. Z., Baghsorkhi, S. S. & Hwu, W-M. W., Oct 1 2008, In: Journal of Parallel and Distributed Computing. 68, 10, p. 1389-1401 13 p.

Research output: Contribution to journal › Article

Configuration
Optimization
Computing
Many-core
Random Sampling

Program optimization space pruning for a multithreaded GPU

Ryoo, S., Rodrigues, C. I., Stone, S. S., Baghsorkhi, S. S., Ueng, S. Z., Stratton, J. A. & Hwu, W-M. W., May 19 2008, Proceedings of the 2008 CGO - Sixth International Symposium on Code Generation and Optimization. p. 195-204 10 p. (Proceedings of the 2008 CGO - Sixth International Symposium on Code Generation and Optimization).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Graphics processing unit
Tuning
Inspection

The concurrency challenge

Hwu, W-M. W., Keutzer, K. & Mattson, T. G., Aug 21 2008, In: IEEE Design and Test of Computers. 25, 4, p. 312-320 9 p.

Research output: Contribution to journal › Article

Microprocessor chips
Semiconductor materials
Hardware
Industry

Visualization and analysis of GPU summer school applicants and participants

Wah, E., Johnson, E., Auvil, L., Thakkar, U., Hwu, W-M. W., Kirk, D., Dunning, T. H. & Glotzer, S. C., 2008, Proceedings - 4th IEEE International Conference on eScience, eScience 2008. p. 362-363 2 p. 4736797

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Visualization
Association rules
Parallel processing systems
Particle accelerators
Data mining

2007

Automatic discovery of coarse-grained parallelism in media applications

Ryoo, S., Ueng, S. Z., Rodrigues, C. I., Kidd, R. E., Frank, M. I. & Hwu, W-M. W., Dec 1 2007, Transactions on High-Performance Embedded Architectures and Compilers I. p. 194-213 20 p. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); vol. 4050 LNCS).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Parallelism
Hardware
Processing
Computer programming languages
Particle accelerators

CIGAR: Application partitioning for a CPU/coprocessor architecture

Kelm, J. H., Gelado, I., Murphy, M. J., Navarro, N., Lumetta, S. S. & Hwu, W-M. W., Dec 1 2007, 16th International Conference on Parallel Architecture and Compilation Techniques, PACT 2007. p. 317-326 10 p. 4336222. (Parallel Architectures and Compilation Techniques - Conference Proceedings, PACT).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Program processors
Partitioning
Prototyping
Embedded Processor
Methodology

Corezilla: Build and tame the multicore beast?

Sarno, L., Hwu, W-M. W., Lund, C., Levy, M., Larus, J. R., Reinders, J., Cameron, G., Lennard, C. & Corporation, T., Aug 2 2007, 2007 44th ACM/IEEE Design Automation Conference, DAC'07. p. 632-633 2 p. 4261259. (Proceedings - Design Automation Conference).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Software engineering
Systems analysis
Hardware

Implicitly parallel programming models for thousand-core microprocessors

Hwu, W-M. W., Ryoo, S., Ueng, S. Z., Kelm, J. H., Gelado, I., Stone, S. S., Kidd, R. E., Baghsorkhi, S. S., Mahesri, A. A., Tsao, S. C., Navarro, N., Lumetta, S. S., Frank, M. I. & Patel, S. J., Aug 2 2007, 2007 44th ACM/IEEE Design Automation Conference, DAC'07. p. 754-759 6 p. 4261284. (Proceedings - Design Automation Conference).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Parallel programming
Microprocessor chips
Hardware
Parallel algorithms
Computer programming languages

Toward application-aware security and reliability

Iyer, R. K., Kalbarczyk, Z. T., Pattabiraman, K., Healey, W., Hwu, W-M. W., Klemperer, P. & Farivar, R., Jan 1 2007, In: IEEE Security and Privacy. 5, 1, p. 57-62 6 p.

Research output: Contribution to journal › Article

Hardware
corruption
Computer systems
Values

2006

Beating in-order stalls with "Flea-Flicker" two-pass pipelining

Barnes, R. D., Sias, J. W., Nystrom, E. M., Patel, S. J., Navarro, J. & Hwu, W-M. W., Jan 1 2006, In: IEEE Transactions on Computers. 55, 1, p. 18-33 16 p.

Research output: Contribution to journal › Article

Pipelining
Pipelines
Latency
Compiler
Percent

Improved Superblock optimization in GCC

Kidd, R. & Hwu, W-M. W., 2006, Proceedings of the GCC Developers' Summit 2006. p. 85-96 12 p.

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Flow control
Scheduling

Tolerating cache-miss latency with multipass pipelines

Barnes, R. D., Ryoo, S. & Hwu, W-M. W., Jan 1 2006, In: IEEE Micro. 26, 1, p. 40-47 8 p.

Research output: Contribution to journal › Article

Pipelines
Scheduling
Data storage equipment

2005

"Flea-flicker" Multipass pipelining: An alternative to the high-power out-of-order offense

Barnes, R. D., Ryoo, S. & Hwu, W-M. W., Dec 1 2005, MICRO-38: Proceedings of the 38th Annual IEEE/ACM International Symposium on Microarchitecture. p. 319-330 12 p. 1540970. (Proceedings of the Annual International Symposium on Microarchitecture, MICRO).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Scheduling
Pipelines
Energy efficiency
Microprocessor chips
Hardware

The future of computer architecture research: An industrial perspective

Hwu, W-M. W. & Patel, S. J., Dec 12 2005, Proceedings - 11th International Symposium on High-Performance Computer Architecture, HPCA-11 2005. 1 p. (Proceedings - International Symposium on High-Performance Computer Architecture).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Industrial research
Computer architecture
Industry
Hardware

Trimaran: An infrastructure for research in instruction-level parallelism

Chakrapani, L. N., Gyllenhaal, J., Hwu, W-M. W., Mahlke, S. A., Palem, K. V. & Rabbah, R. M., Oct 19 2005, In: Lecture Notes in Computer Science. 3602, p. 32-41 10 p.

Research output: Contribution to journal › Conference article

Instruction Level Parallelism
Performance Monitoring
Infrastructure
Compiler Optimization
Module

2004
Bottom-up
Scalability
Context
Subgraph
Benchmark

Field-testing IMPACT EPIC research results in Itanium 2

Sias, J. W., Ueng, S. Z., Kent, G. A., Steiner, I. M. & Hwu, W-M. W., Oct 8 2004, In: Conference Proceedings - Annual International Symposium on Computer Architecture, ISCA. 31, p. 26-37 12 p.

Research output: Contribution to journal › Conference article

Testing
Microprocessor chips

Importance of heap specialization in pointer analysis

Nystrom, E. M., Kim, H. S. & Hwu, W-M. W., Sep 29 2004, p. 43-48. 6 p.

Research output: Contribution to conference › Paper

Visibility
Data storage equipment

2003

Architecture

Connors, D. A. & Hwu, W-M. W., Jan 1 2003, Memory, Microprocessor, and ASIC. CRC Press, p. 11-1-11-22

Research output: Chapter in Book/Report/Conference proceeding › Chapter

Microprocessor chips
Computer hardware
Computer architecture
Industry
Computer systems

Beating in-order stalls with "flea-flicker" two-pass pipelining

Barnes, R. D., Patel, S. J., Nystrom, E. M., Navarro, N., Sias, J. W. & Hwu, W-M. W., Jan 1 2003, Proceedings - 36th International Symposium on Microarchitecture, MICRO 2003. IEEE Computer Society, p. 387-398 12 p. 1253243. (Proceedings of the Annual International Symposium on Microarchitecture, MICRO; vol. 2003-January).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Pipelines
Transistors

Energy saving and capacity improvement potential of power control in multi-hop wireless networks

Monks, J., Ebert, J. P., Hwu, W-M. W. & Wolisz, A., Feb 21 2003, In: Computer Networks. 41, 3, p. 313-330 18 p.

Research output: Contribution to journal › Article

Power control
Wireless networks
Energy conservation
Topology
Packet networks

2002

Architecture

Connors, D. A. & Hwu, W-M. W., Jan 1 2002, The Mechatronics Handbook. CRC Press, p. 42-1-42-21

Research output: Chapter in Book/Report/Conference proceeding › Chapter

Microprocessor chips
Computer hardware
Computer architecture
Industry
Computer systems

Code coverage and input variability: Effects on architecture and compiler research

Hunter, H. C. & Hwu, W-M. W., Dec 1 2002, Proceedings of the 2002 International Conference on Compilers, Architecture, and Synthesis for Embedded Systems, CASES '02. p. 79-87 9 p. (Proceedings of the 2002 International Conference on Compilers, Architecture, and Synthesis for Embedded Systems, CASES '02).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Telecommunication
Benchmarking
Experiments
Compliance

Vacuum packing: Extracting hardware-detected program phases for post-link optimization

Barnes, R. D., Nystrom, E. M., Merten, M. C. & Hwu, W-M. W., Jan 1 2002, Proceedings - 35th Annual IEEE/ACM International Symposium on Microarchitecture, MICRO 2002. IEEE Computer Society, p. 233-244 12 p. 1176253. (Proceedings of the Annual International Symposium on Microarchitecture, MICRO; vol. 2002-January).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Vacuum
Hardware
Phase transitions

2001

An architectural framework for runtime optimization

Merten, M. C., Trick, A. R., Barnes, R. D., Nystrom, E. M., George, C. N., Gyllenhaal, J. C. & Hwu, W-M. W., Jun 1 2001, In: IEEE Transactions on Computers. 50, 6, p. 567-589 23 p.

Research output: Contribution to journal › Article

Optimization
Hardware
Straightening
Branch Prediction
Reoptimization

A power controlled multiple access protocol for wireless packet networks

Monks, J. P., Bharghavan, V. & Hwu, W-M. W., 2001, In: Proceedings - IEEE INFOCOM. 1, p. 219-228 10 p.

Research output: Contribution to journal › Article

Packet networks
Network protocols
Collision avoidance
Ad hoc networks
Power transmission

A study of the energy saving and capacity improvement potential of power control in multi-hop wireless networks

Monks, J. P., Ebert, J. P., Wolisz, A. A. & Hwu, W-M. W., Jan 1 2001, In: Conference on Local Computer Networks. p. 550-559 10 p., 72.

Research output: Contribution to journal › Article

Power control
Wireless networks
Energy conservation
Topology
Packet networks

Code reordering and speculation support for dynamic optimization systems

Nystrom, E. M., Barnes, R. D., Merten, M. C. & Hwu, W-M. W., Jan 1 2001, In: Parallel Architectures and Compilation Techniques - Conference Proceedings, PACT. p. 163-174 12 p.

Research output: Contribution to journal › Conference article

Speculation
Dynamic Optimization
Reordering
Exception
Optimization
Flow control
Telecommunication
Hardware
Processing

Modulo Schedule Buffers

Merten, M. C. & Hwu, W-M. W., 2001, In: Proceedings of the Annual International Symposium on Microarchitecture. p. 138-149 12 p.

Research output: Contribution to journal › Article

Hardware
Signal processing
Scheduling

Program Decision Logic Optimization Using Predication and Control Speculation

Hwu, W-M. W., August, D. I. & Sias, J. W., Jan 1 2001, In: Proceedings of the IEEE. 89, 11, p. 1660-1675 16 p.

Research output: Contribution to journal › Article

Binary decision diagrams
Global optimization
Flow control

2000

Accurate and efficient predicate analysis with binary decision diagrams

Sias, J. W., Hwu, W-M. W. & August, D. I., Dec 1 2000, In: Proceedings of the Annual International Symposium on Microarchitecture. p. 112-123 12 p.

Research output: Contribution to journal › Conference article

Binary decision diagrams
Flow control
Substrates

An empirical study of function pointers using SPEC benchmarks

Cheng, B. C. & Hwu, W-M. W., Jan 1 2000, Languages and Compilers for Parallel Computing - 12th International Workshop, LCPC 1999, Proceedings. Carter, L. & Ferrante, J. (eds.). Springer-Verlag, p. 490-493 4 p. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); vol. 1863).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Empirical Study
Benchmark
Extractor
Graph in graph theory
Compiler

Hardware mechanism for dynamic extraction and relayout of program hot spots

Merten, M. C., Trick, A. R., Nystrom, E. M., Barnes, R. D. & Hwu, W-M. W., 2000, In: Conference Proceedings - Annual International Symposium on Computer Architecture, ISCA. p. 59-70 12 p.

Research output: Contribution to journal › Article

Hardware
Switches
Data storage equipment