Wen-Mei W Hwu

1984 …2019

Research Output 1984-2019

2012

CUDA dynamic parallelism

Kirk, D. B. & Hwu, W-M. W., Jan 1 2012, Programming Massively Parallel Processors: A Hands-on Approach, Second Edition. Elsevier Science, p. 435-457 23 p.

Research output: Chapter in Book/Report/Conference proceeding › Chapter

CUDA memories

Kirk, D. B. & Hwu, W-M. W., Jan 1 2012, Programming Massively Parallel Processors: A Hands-on Approach, Second Edition. Elsevier Science, p. 95-121 27 p.

Research output: Chapter in Book/Report/Conference proceeding › Chapter

Data storage equipment

Data layout transformation exploiting memory-level parallelism in structured grid many-core applications

Sung, I. J., Anssari, N., Stratton, J. A. & Hwu, W-M. W., Feb 1 2012, In : International Journal of Parallel Programming. 40, 1, p. 4-24 21 p.

Research output: Contribution to journal › Article

Many-core
Parallelism
Layout
Grid
Data storage equipment

Data-parallel execution model

Kirk, D. B. & Hwu, W-M. W., Jan 1 2012, Programming Massively Parallel Processors: A Hands-on Approach, Second Edition. Elsevier Science, p. 63-94 32 p.

Research output: Chapter in Book/Report/Conference proceeding › Chapter

Design evaluation of OpenCL compiler framework for coarse-grained reconfigurable arrays

Kim, H. S., Ahn, M., Stratton, J. A. & Hwu, W-M. W., Dec 1 2012, FPT 2012 - 2012 International Conference on Field-Programmable Technology. p. 313-320 8 p. 6412155. (FPT 2012 - 2012 International Conference on Field-Programmable Technology).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Compiler
Evaluation
Parallel Programming
kernel
Programming Model

DL: A data layout transformation system for heterogeneous computing

Sung, I. J., Liu, G. D. & Hwu, W-M. W., Dec 12 2012, 2012 Innovative Parallel Computing, InPar 2012. 6339606. (2012 Innovative Parallel Computing, InPar 2012).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Data storage equipment
Program processors
Bandwidth
Dynamic random access storage
Graphics processing unit

Efficient pattern-based time series classification on GPU

Chang, K. W., Deka, B., Hwu, W-M. W. & Roth, D., 2012, Proceedings - 12th IEEE International Conference on Data Mining, ICDM 2012. p. 131-140 10 p. 6413748

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Chemical reactions
Time series
Dynamic programming

Efficient performance evaluation of memory hierarchy for highly multithreaded graphics processors

Baghsorkhi, S. S., Gelado, I., Delahaye, M. & Hwu, W-M. W., Mar 22 2012, PPoPP'12 - Proceedings of the 2012 ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming. p. 23-33 11 p. (Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPOPP).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Data storage equipment
Sampling
Monitoring
Hardware
Graphics processing unit

Efficient performance evaluation of memory hierarchy for highly multithreaded graphics processors

Baghsorkhi, S. S., Gelado, I., Delahaye, M. & Hwu, W. M. W., Aug 1 2012, In : ACM SIGPLAN Notices. 47, 8, p. 23-33 11 p.

Research output: Contribution to journal › Article

Data storage equipment
Sampling
Monitoring
Hardware
Graphics processing unit

Floating-point considerations

Kirk, D. B. & Hwu, W-M. W., Jan 1 2012, Programming Massively Parallel Processors: A Hands-on Approach, Second Edition. Elsevier Science, p. 151-171 21 p.

Research output: Chapter in Book/Report/Conference proceeding › Chapter

GPU Computing Gems Jade Edition

Hwu, W-M. W., Jan 1 2012, Elsevier Inc.

Research output: Book/Report › Book

Gems
Finance
Graphics processing unit
Environmental engineering
Computer systems programming

High-speed interferometric synthetic aperture microscopy on a graphics processing unit

Ahmad, A., Shemonski, N., Adie, S. G., Kim, H., Hwu, W. M. W., Carney, P. S. & Boppart, S. A., 2012, Frontiers in Optics, FIO 2012.

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

synthetic apertures
high speed
microscopy
tomography
imaging techniques

History of GPU computing

Kirk, D. B. & Hwu, W-M. W., Jan 1 2012, Programming Massively Parallel Processors: A Hands-on Approach, Second Edition. Elsevier Science, p. 23-39 17 p.

Research output: Chapter in Book/Report/Conference proceeding › Chapter

Graphics processing unit

Implementing a GPU programming model on a non-GPU accelerator architecture

Kofsky, S. M., Johnson, D. R., Stratton, J. A., Hwu, W. M. W., Patel, S. J. & Lumetta, S. S., Mar 8 2012, Computer Architecture - ISCA 2010 International Workshops, A4MMC, AMAS-BT, EAMA, WEED, WIOSCA, Revised Selected Papers. p. 40-51 12 p. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); vol. 6161 LNCS).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Accelerator
Programming Model
Particle accelerators
Parallel architectures
Degradation

Interferometric synthetic aperture microscopy with computational adaptive optics for high-resolution tomography of scattering tissue

Adie, S. G., Ahmad, A., Shemonski, N., Graf, B. W., Kim, H., Hwu, W. M. W., Carney, P. S. & Boppart, S. A., 2012, Biomedical Optics, BIOMED 2012.

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

synthetic apertures
adaptive optics
Microscopy
tomography

Introduction

Hwu, W. M. W., Dec 1 2012, GPU Computing Gems Jade Edition. Elsevier Inc., p. xv-xvi

Research output: Chapter in Book/Report/Conference proceeding › Foreword/postscript

Introduction

Kirk, D. B. & Hwu, W-M. W., Jan 1 2012, Programming Massively Parallel Processors: A Hands-on Approach, Second Edition. Elsevier Science, p. 1-21 21 p.

Research output: Chapter in Book/Report/Conference proceeding › Chapter

Introduction to data parallelism and CUDA C

Kirk, D. B. & Hwu, W-M. W., Jan 1 2012, Programming Massively Parallel Processors: A Hands-on Approach, Second Edition. Elsevier Science, p. 41-62 22 p.

Research output: Chapter in Book/Report/Conference proceeding › Chapter

Optimization and architecture effects on GPU computing workload performance

Stratton, J. A., Anssari, N., Rodrigues, C., Sung, I. J., Obeid, N., Chang, L., Liu, G. D. & Hwu, W-M. W., Dec 12 2012, 2012 Innovative Parallel Computing, InPar 2012. 6339605. (2012 Innovative Parallel Computing, InPar 2012).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Hardware
Bandwidth
Dynamic random access storage
Coarsening
Throughput

Parallel patterns: Sparse matrix-vector multiplication: An introduction to compaction and regularization in parallel algorithms

Kirk, D. B. & Hwu, W-M. W., Jan 1 2012, Programming Massively Parallel Processors: A Hands-on Approach, Second Edition. Elsevier Science, p. 217-234 18 p.

Research output: Chapter in Book/Report/Conference proceeding › Chapter

Parallel algorithms
Compaction

Parallel patterns: Convolution: With an introduction to constant memory and caches

Kirk, D. B. & Hwu, W-M. W., Jan 1 2012, Programming Massively Parallel Processors: A Hands-on Approach, Second Edition. Elsevier Science, p. 173-196 24 p.

Research output: Chapter in Book/Report/Conference proceeding › Chapter

Convolution
Data storage equipment

Parallel patterns: Prefix sum: An introduction to work efficiency in parallel algorithms

Kirk, D. B. & Hwu, W-M. W., Jan 1 2012, Programming Massively Parallel Processors: A Hands-on Approach, Second Edition. Elsevier Science, p. 197-216 20 p.

Research output: Chapter in Book/Report/Conference proceeding › Chapter

Parallel algorithms

Parallel programming and computational thinking

Kirk, D. B. & Hwu, W-M. W., Jan 1 2012, Programming Massively Parallel Processors: A Hands-on Approach, Second Edition. Elsevier Science, p. 281-295 15 p.

Research output: Chapter in Book/Report/Conference proceeding › Chapter

Parallel programming

Performance analysis and tuning for general purpose graphics processing units (GPGPU)

Kim, H., Vuduc, R., Baghsorkhi, S., Hwu, W-M. W. & Jee Choi, C., Nov 21 2012, Performance Analysis and Tuning for General Purpose Graphics Processing Units (GPGPU). p. 1-94 94 p. (Synthesis Lectures on Computer Architecture; vol. 20).

Research output: Chapter in Book/Report/Conference proceeding › Chapter

Tuning
Data storage equipment
Hardware
Cache memory
Memory architecture

Performance considerations

Kirk, D. B. & Hwu, W-M. W., Jan 1 2012, Programming Massively Parallel Processors: A Hands-on Approach, Second Edition. Elsevier Science, p. 123-149 27 p.

Research output: Chapter in Book/Report/Conference proceeding › Chapter

Preface

Kirk, D. B. & Hwu, W-M. W., Jan 1 2012, Programming Massively Parallel Processors: A Hands-on Approach, Second Edition. Elsevier Science, p. xiii-xviii

Research output: Chapter in Book/Report/Conference proceeding › Foreword/postscript

TIGER: tiled iterative genome assembler.

Wu, X. L., Heo, Y., El Hajj, I., Hwu, W. M., Chen, D. & Ma, J., 2012, In : Unknown Journal. 13 Suppl 19

Research output: Contribution to journal › Article

Tigers
Genome
Genes
Data storage equipment
Sequencing

2011

Advanced MRI reconstruction toolbox with accelerating on GPU

Wu, X. L., Zhuo, Y., Gai, J., Lam, F., Fu, M., Haldar, J. P., Hwu, W. M., Liang, Z. P. & Sutton, B. P., Feb 11 2011, Proceedings of SPIE-IS and T Electronic Imaging - Parallel Processing for Imaging Applications. 78720Q. (Proceedings of SPIE - The International Society for Optical Engineering; vol. 7872).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Magnetic Resonance Imaging
Magnetic resonance
Imaging techniques
Reconstruction Algorithm

A scalable tridiagonal solver for GPUs

Kim, H. S., Wu, S., Chang, L. W. & Hwu, W-M. W., Nov 7 2011, Proceedings - 2011 International Conference on Parallel Processing, ICPP 2011. p. 444-453 10 p. 6047212. (Proceedings of the International Conference on Parallel Processing).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Tridiagonal matrix
Cyclic Reduction
Parallel algorithms
Tridiagonal Systems

A tiling-scheme Viterbi decoder in Software Defined Radio for GPUs

Lin, C. S., Liu, W. L., Yeh, W. T., Chang, L. W., Hwu, W-M. W., Chen, S. J. & Hsiung, P. A., Oct 31 2011, 7th International Conference on Wireless Communications, Networking and Mobile Computing, WiCOM 2011. 6036680. (7th International Conference on Wireless Communications, Networking and Mobile Computing, WiCOM 2011).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Decoding
radio
Hamming distance
Merging
Program processors

EcoG: A power-efficient GPU cluster architecture for scientific computing

Showerman, M., Enos, J. J., Steffen, C. P., Treichler, S., Gropp, W. D. & Hwu, W-M. W., Mar 1 2011, In : Computing in Science and Engineering. 13, 2, p. 83-87 5 p., 5725240.

Research output: Contribution to journal › Article

Natural sciences computing
Graphics processing unit

GPU Computing Gems Emerald Edition

Hwu, W-M. W., Jan 1 2011, Elsevier Inc.

Research output: Book/Report › Book

Gems
Computer vision
Parallel programming
Medical imaging
Graphics processing unit

Impatient MRI: Illinois Massively Parallel Acceleration Toolkit for image reconstruction with enhanced throughput in MRI

Wu, X. L., Gai, J., Lam, F., Fu, M., Haldar, J. P., Zhuo, Y., Liang, Z. P., Hwu, W. M. & Sutton, B. P., Nov 2 2011, 2011 8th IEEE International Symposium on Biomedical Imaging: From Nano to Macro, ISBI'11. p. 69-72 4 p. 5872356. (Proceedings - International Symposium on Biomedical Imaging).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Computer-Assisted Image Processing
Magnetic resonance
Image reconstruction
Magnetic Resonance Imaging
Throughput

Introduction

Hwu, W-M. W., Dec 1 2011, GPU Computing Gems Emerald Edition. Elsevier Inc.

Research output: Chapter in Book/Report/Conference proceeding › Foreword/postscript

Multilevel granularity parallelism synthesis on FPGAs

Papakonstantinou, A., Liang, Y., Stratton, J. A., Gururaj, K., Chen, D., Hwu, W-M. W. & Cong, J., Jun 17 2011, Proceedings - IEEE International Symposium on Field-Programmable Custom Computing Machines, FCCM 2011. p. 178-185 8 p. 5771270. (Proceedings - IEEE International Symposium on Field-Programmable Custom Computing Machines, FCCM 2011).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Field programmable gate arrays (FPGA)
Particle accelerators
Clocks
Hardware
High level synthesis

Parallel implementation of multi-dimensional ensemble empirical mode decomposition

Chang, L. W., Lo, M. T., Anssari, N., Hsu, K. H., Huang, N. E. & Hwu, W. M. W., Aug 18 2011, 2011 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2011 - Proceedings. p. 1621-1624 4 p. 5946808. (ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Program processors
Decomposition
Computer programming
Graphics processing unit

Using GPUs to accelerate advanced MRI reconstruction with field inhomogeneity compensation

Zhuo, Y., Wu, X. L., Haldar, J. P., Marin, T., Hwu, W. M. W., Liang, Z. P. & Sutton, B. P., Dec 1 2011, GPU Computing Gems Emerald Edition. Elsevier Inc., p. 709-722 14 p.

Research output: Chapter in Book/Report/Conference proceeding › Chapter

Magnetic resonance imaging
Biochemistry
Parallel programming
Image reconstruction
Data acquisition

2010

Accelerating iterative field-compensated MR image reconstruction on GPUs

Zhuo, Y., Wu, X. L., Haldar, J. P., Hwu, W. M., Liang, Z. P. & Sutton, B. P., Aug 9 2010, 2010 7th IEEE International Symposium on Biomedical Imaging: From Nano to Macro, ISBI 2010 - Proceedings. p. 820-823 4 p. 5490112. (2010 7th IEEE International Symposium on Biomedical Imaging: From Nano to Macro, ISBI 2010 - Proceedings).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Computer-Assisted Image Processing
Image reconstruction
Magnetic Fields
Physics

An adaptive performance modeling tool for GPU architectures

Baghsorkhi, S. S., Delahaye, M., Patel, S. J., Gropp, W. D. & Hwu, W. M. W., Mar 15 2010, PPoPP'10 - Proceedings of the 2010 ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming. p. 105-114 10 p. (Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPOPP).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Flow graphs
Data storage equipment
Graphics processing unit
Flow control
Analytical models

An adaptive performance modeling tool for GPU architectures

Baghsorkhi, S. S., Delahaye, M., Patel, S. J., Gropp, W. D. & Hwu, W-M. W., May 1 2010, In : ACM SIGPLAN Notices. 45, 5, p. 105-114 10 p.

Research output: Contribution to journal › Article

Flow graphs
Data storage equipment
Graphics processing unit
Flow control
Analytical models

An asymmetric distributed shared memory model for heterogeneous parallel systems

Gelado, I., Cabezas, J., Navarro, N., Stone, J. E., Patel, S. & Hwu, W. M. W., May 19 2010, ASPLOS XV - 15th International Conference on Architectural Support for Programming Languages and Operating Systems. p. 347-358 12 p. (International Conference on Architectural Support for Programming Languages and Operating Systems - ASPLOS).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Computer systems
Data storage equipment
Particle accelerators
Program processors
Data transfer

An asymmetric distributed shared memory model for heterogeneous parallel systems

Gelado, I., Cabezas, J., Navarro, N., Stone, J. E., Patel, S. & Hwu, W. M. W., Mar 1 2010, In : ACM SIGPLAN Notices. 45, 3, p. 347-358 12 p.

Research output: Contribution to journal › Article

Computer systems
Data storage equipment
Particle accelerators
Program processors
Data transfer

An effective GPU implementation of breadth-first search

Luo, L., Wong, M. & Hwu, W. M., Sep 7 2010, Proceedings of the 47th Design Automation Conference, DAC '10. p. 52-55 4 p. (Proceedings - Design Automation Conference).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Breadth-first Search
Program processors
Design Automation
Computational complexity
Accelerate

Data layout transformation exploiting memory-level parallelism in structured grid many-core applications

Sung, I. J., Stratton, J. A. & Hwu, W-M. W., Jan 1 2010, PACT'10 - Proceedings of the 19th International Conference on Parallel Architectures and Compilation Techniques. Institute of Electrical and Electronics Engineers Inc., p. 513-522 10 p. (Parallel Architectures and Compilation Techniques - Conference Proceedings, PACT; vol. 2010).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Many-core
Parallelism
Layout
Grid
Data storage equipment

Direct numerical simulation of turbulent flow in a square duct using a Graphics Processing Unit (GPU)

Shinn, A. F., Vanka, S. P. & Hwu, W-M. W., 2010, 40th AIAA Fluid Dynamics Conference. 2010-5029

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Direct numerical simulation
Ducts
Turbulent flow
Large eddy simulation
Reynolds number

Efficient compilation of fine-grained SPMD-threaded programs for multicore CPUs

Stratton, J. A., Grover, V., Marathe, J., Aarts, B., Murphy, M., Hu, Z. & Hwu, W-M. W., Jul 1 2010, Proceedings of the 2010 CGO - The 8th International Symposium on Code Generation and Optimization. p. 111-119 9 p.

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Compilation
Program processors
Thread
Programming Model
Multithreading

Exploiting more parallelism from applications having generalized reductions on GPU architectures

Wu, X. L., Obeid, N. & Hwu, W-M. W., Nov 19 2010, Proceedings - 10th IEEE International Conference on Computer and Information Technology, CIT-2010, 7th IEEE International Conference on Embedded Software and Systems, ICESS-2010, ScalCom-2010. p. 1175-1180 6 p. 5577899. (Proceedings - 10th IEEE International Conference on Computer and Information Technology, CIT-2010, 7th IEEE International Conference on Embedded Software and Systems, ICESS-2010, ScalCom-2010).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Graphics processing unit
Communication

High-performance computing with accelerators

Kindratenko, V., Wilhelmson, R., Brunner, R. J., Martínez, T. J. & Hwu, W-M. W., Jul 1 2010, In : Computing in Science and Engineering. 12, 4, p. 12-16 5 p., 5492949.

Research output: Contribution to journal › Editorial

Particle accelerators

Sparse regularization in MRI iterative reconstruction using GPUs

Zhuo, Y., Sutton, B., Wu, X. L., Haldar, J., Hwu, W. M. & Liang, Z. P., Dec 1 2010, Proceedings - 2010 3rd International Conference on Biomedical Engineering and Informatics, BMEI 2010. p. 578-582 5 p. 5640008. (Proceedings - 2010 3rd International Conference on Biomedical Engineering and Informatics, BMEI 2010; vol. 2).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Computer-Assisted Image Processing
Magnetic resonance imaging
Image reconstruction
Program processors
Communication

XMalloc: A scalable lock-free dynamic memory allocator for many-core machines

Huang, X., Rodrigues, C. I., Jones, S., Buck, I. & Hwu, W-M. W., Nov 19 2010, Proceedings - 10th IEEE International Conference on Computer and Information Technology, CIT-2010, 7th IEEE International Conference on Embedded Software and Systems, ICESS-2010, ScalCom-2010. p. 1134-1139 6 p. 5577907. (Proceedings - 10th IEEE International Conference on Computer and Information Technology, CIT-2010, 7th IEEE International Conference on Embedded Software and Systems, ICESS-2010, ScalCom-2010).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Data storage equipment
Processing
Graphics processing unit