Wen-Mei W Hwu

Research Output

Conference contribution
2006

Improved Superblock optimization in GCC

Kidd, R. & Hwu, W-M. W., 2006, Proceedings of the GCC Developers' Summit 2006. p. 85-96 12 p.

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

2007

Automatic discovery of coarse-grained parallelism in media applications

Ryoo, S., Ueng, S. Z., Rodrigues, C. I., Kidd, R. E., Frank, M. I. & Hwu, W. M. W., Dec 1 2007, Transactions on High-Performance Embedded Architectures and Compilers I. p. 194-213 20 p. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); vol. 4050 LNCS).

CIGAR: Application partitioning for a CPU/coprocessor architecture

Kelm, J. H., Gelado, I., Murphy, M. J., Navarro, N., Lumetta, S. S. & Hwu, W-M. W., Dec 1 2007, 16th International Conference on Parallel Architecture and Compilation Techniques, PACT 2007. p. 317-326 10 p. 4336222. (Parallel Architectures and Compilation Techniques - Conference Proceedings, PACT).

Corezilla: Build and tame the multicore beast?

Sarno, L., Hwu, W. M. W., Lund, C., Levy, M., Larus, J. R., Reinders, J., Cameron, G., Lennard, C. & Corporation, T., Aug 2 2007, 2007 44th ACM/IEEE Design Automation Conference, DAC'07. p. 632-633 2 p. 4261259. (Proceedings - Design Automation Conference).

Implicitly parallel programming models for thousand-core microprocessors

Hwu, W-M. W., Ryoo, S., Ueng, S. Z., Kelm, J. H., Gelado, I., Stone, S. S., Kidd, R. E., Baghsorkhi, S. S., Mahesri, A. A., Tsao, S. C., Navarro, N., Lumetta, S. S., Frank, M. I. & Patel, S. J., Aug 2 2007, 2007 44th ACM/IEEE Design Automation Conference, DAC'07. p. 754-759 6 p. 4261284. (Proceedings - Design Automation Conference).

2008

Accelerating advanced MRI reconstructions on GPUs

Stone, S. S., Haldar, J. P., Tsao, S. C., Hwu, W. M. W., Liang, Z. P. & Sutton, B. P., Jan 1 2008, Conference on Computing Frontiers - Proceedings of the 2008 Conference on Computing Frontiers, CF'08. Association for Computing Machinery, p. 261-272 12 p. (Conference on Computing Frontiers - Proceedings of the 2008 Conference on Computing Frontiers, CF'08).

CUBA: An architecture for efficient CPU/Co-processor data communication

Gelado, I., Kelm, J. H., Ryoo, S., Lumetta, S. S., Navarro, N. & Hwu, W. M. W., Dec 15 2008, ICS'08 - Proceedings of the 2008 ACM International Conference on Supercomputing. p. 299-308 10 p. (Proceedings of the International Conference on Supercomputing).

CUDA-Lite: Reducing GPU programming complexity

Ueng, S. Z., Lathara, M., Baghsorkhi, S. S. & Hwu, W. M. W., 2008, Languages and Compilers for Parallel Computing - 21st International Workshop, LCPC 2008, Revised Selected Papers. p. 1-15 15 p. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); vol. 5335 LNCS).

GPU acceleration of cutoff pair potentials for molecular modeling applications

Rodrigues, C. I., Hardy, D. J., Stone, J. E., Schulten, K. & Hwu, W. M. W., Dec 1 2008, Conference on Computing Frontiers - Proceedings of the 2008 Conference on Computing Frontiers, CF'08. p. 273-282 10 p. (Conference on Computing Frontiers - Proceedings of the 2008 Conference on Computing Frontiers, CF'08).

Iteration disambiguation for parallelism identification in time-sliced applications

Ryoo, S., Rodrigues, C. I. & Hwu, W. M. W., Oct 27 2008, Languages and Compilers for Parallel Computing - 20th International Workshop, LCPC 2007, Revised Selected Papers. p. 110-124 15 p. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); vol. 5234 LNCS).

MCUDA: An efficient implementation of CUDA kernels for multi-core CPUs

Stratton, J. A., Stone, S. S. & Hwu, W-M. W., Dec 1 2008, Languages and Compilers for Parallel Computing - 21st International Workshop, LCPC 2008, Revised Selected Papers. p. 16-30 15 p. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); vol. 5335 LNCS).

Optimization principles and application performance evaluation of a multithreaded GPU using CUDA

Ryoo, S., Rodrigues, C. I., Baghsorkhi, S. S., Stone, S. S., Kirk, D. B. & Hwu, W. M. W., Dec 1 2008, PPoPP'08 - Proceedings of the 2008 ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming. p. 73-82 10 p. (Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPOPP).

Program optimization space pruning for a multithreaded GPU

Ryoo, S., Rodrigues, C. I., Stone, S. S., Baghsorkhi, S. S., Ueng, S. Z., Stratton, J. A. & Hwu, W. M. W., May 19 2008, Proceedings of the 2008 CGO - Sixth International Symposium on Code Generation and Optimization. p. 195-204 10 p. (Proceedings of the 2008 CGO - Sixth International Symposium on Code Generation and Optimization).

Visualization and analysis of GPU summer school applicants and participants

Wah, E., Johnson, E., Auvil, L., Thakkar, U., Hwu, W. M., Kirk, D., Dunning, T. H. & Glotzer, S. C., Dec 1 2008, Proceedings - 4th IEEE International Conference on eScience, eScience 2008. p. 362-363 2 p. 4736797. (Proceedings - 4th IEEE International Conference on eScience, eScience 2008).

2009

Accelerating MR image reconstruction on GPUs

Hwu, W. M. W., Nandakumar, D., Haldar, J., Atkinson, I. C., Sutton, B., Liang, Z. P. & Thulborn, K. R., Nov 17 2009, Proceedings - 2009 IEEE International Symposium on Biomedical Imaging: From Nano to Macro, ISBI 2009. p. 1283-1286 4 p. 5193297. (Proceedings - 2009 IEEE International Symposium on Biomedical Imaging: From Nano to Macro, ISBI 2009).

FCUDA: Enabling efficient compilation of CUDA kernels onto FPGAs

Papakonstantinou, A., Gururaj, K., Stratton, J. A., Chen, D., Cong, J. & Hwu, W. M. W., Nov 11 2009, 2009 IEEE 7th Symposium on Application Specific Processors, SASP 2009. p. 35-42 8 p. 5226333. (2009 IEEE 7th Symposium on Application Specific Processors, SASP 2009).

GPU clusters for high-performance computing

Kindratenko, V., Enos, J. J., Shi, G., Showerman, M. T., Arnold, G. W., Stone, J. E., Phillips, J. C. & Hwu, W-M. W., Dec 21 2009, 2009 IEEE International Conference on Cluster Computing and Workshops, CLUSTER '09. 5289128. (Proceedings - IEEE International Conference on Cluster Computing, ICCC).

High performance computation and display of molecular orbitals on GPUs and multi-core CPUs

Stone, J. E., Saam, J., Hardy, D. J., Vandivort, K. L., Hwu, W-M. W. & Schulten, K. J., Jul 23 2009, Proceedings of 2nd Workshop on General Purpose Processing on Graphics Processing Units, GPGPU-2. 1 p. (Proceedings of 2nd Workshop on General Purpose Processing on Graphics Processing Units, GPGPU-2).

High-performance CUDA kernel execution on FPGAs

Papakonstantinou, A., Gururaj, K., Stratton, J. A., Chen, D., Cong, J. & Hwu, W-M. W., Nov 24 2009, ICS'09 - Proceedings of the 23rd International Conference on Supercomputing. p. 515-516 2 p. 1542357. (Proceedings of the International Conference on Supercomputing).

Long time-scale simulations of in vivo diffusion using GPU hardware

Roberts, E., Stone, J. E., Sepúlveda, L., Hwu, W-M. W. & Luthey-Schulten, Z. A., Nov 25 2009, IPDPS 2009 - Proceedings of the 2009 IEEE International Parallel and Distributed Processing Symposium. 5160930. (IPDPS 2009 - Proceedings of the 2009 IEEE International Parallel and Distributed Processing Symposium).

Optimization of tele-immersion codes

Sidelnik, A., Sung, I. J., Wu, W., Garzarán, M. J., Hwu, W. M., Nahrstedt, K., Padua, D. & Patel, S. J., Jul 23 2009, Proceedings of 2nd Workshop on General Purpose Processing on Graphics Processing Units, GPGPU-2. 1 p. (Proceedings of 2nd Workshop on General Purpose Processing on Graphics Processing Units, GPGPU-2).

2010

Accelerating iterative field-compensated MR image reconstruction on GPUs

Zhuo, Y., Wu, X. L., Haldar, J. P., Hwu, W. M., Liang, Z. P. & Sutton, B. P., Aug 9 2010, 2010 7th IEEE International Symposium on Biomedical Imaging: From Nano to Macro, ISBI 2010 - Proceedings. p. 820-823 4 p. 5490112. (2010 7th IEEE International Symposium on Biomedical Imaging: From Nano to Macro, ISBI 2010 - Proceedings).

An adaptive performance modeling tool for GPU architectures

Baghsorkhi, S. S., Delahaye, M., Patel, S. J., Gropp, W. D. & Hwu, W. M. W., Mar 15 2010, PPoPP'10 - Proceedings of the 2010 ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming. p. 105-114 10 p. (Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPOPP).

An asymmetric distributed shared memory model for heterogeneous parallel systems

Gelado, I., Cabezas, J., Navarro, N., Stone, J. E., Patel, S. & Hwu, W. M. W., 2010, ASPLOS XV - 15th International Conference on Architectural Support for Programming Languages and Operating Systems. p. 347-358 12 p. (International Conference on Architectural Support for Programming Languages and Operating Systems - ASPLOS).

An effective GPU implementation of breadth-first search

Luo, L., Wong, M. & Hwu, W. M., Sep 7 2010, Proceedings of the 47th Design Automation Conference, DAC '10. p. 52-55 4 p. (Proceedings - Design Automation Conference).

Data layout transformation exploiting memory-level parallelism in structured grid many-core applications

Sung, I. J., Stratton, J. A. & Hwu, W. M. W., Jan 1 2010, PACT'10 - Proceedings of the 19th International Conference on Parallel Architectures and Compilation Techniques. Institute of Electrical and Electronics Engineers Inc., p. 513-522 10 p. (Parallel Architectures and Compilation Techniques - Conference Proceedings, PACT; vol. 2010).

Direct numerical simulation of turbulent flow in a square duct using a Graphics Processing Unit (GPU)

Shinn, A. F., Vanka, S. P. & Hwu, W. W., Dec 2 2010, 40th AIAA Fluid Dynamics Conference. 2010-5029. (40th AIAA Fluid Dynamics Conference).

Efficient compilation of fine-grained SPMD-threaded programs for multicore CPUs

Stratton, J. A., Grover, V., Marathe, J., Aarts, B., Murphy, M., Hu, Z. & Hwu, W-M. W., Jul 1 2010, Proceedings of the 2010 CGO - The 8th International Symposium on Code Generation and Optimization. p. 111-119 9 p.

Exploiting more parallelism from applications having generalized reductions on GPU architectures

Wu, X. L., Obeid, N. & Hwu, W-M. W., Nov 19 2010, Proceedings - 10th IEEE International Conference on Computer and Information Technology, CIT-2010, 7th IEEE International Conference on Embedded Software and Systems, ICESS-2010, ScalCom-2010. p. 1175-1180 6 p. 5577899. (Proceedings - 10th IEEE International Conference on Computer and Information Technology, CIT-2010, 7th IEEE International Conference on Embedded Software and Systems, ICESS-2010, ScalCom-2010).

Sparse regularization in MRI iterative reconstruction using GPUs

Zhuo, Y., Sutton, B., Wu, X. L., Haldar, J., Hwu, W. M. & Liang, Z. P., Dec 1 2010, Proceedings - 2010 3rd International Conference on Biomedical Engineering and Informatics, BMEI 2010. p. 578-582 5 p. 5640008. (Proceedings - 2010 3rd International Conference on Biomedical Engineering and Informatics, BMEI 2010; vol. 2).

XMalloc: A scalable lock-free dynamic memory allocator for many-core machines

Huang, X., Rodrigues, C. I., Jones, S., Buck, I. & Hwu, W. M., Nov 19 2010, Proceedings - 10th IEEE International Conference on Computer and Information Technology, CIT-2010, 7th IEEE International Conference on Embedded Software and Systems, ICESS-2010, ScalCom-2010. p. 1134-1139 6 p. 5577907. (Proceedings - 10th IEEE International Conference on Computer and Information Technology, CIT-2010, 7th IEEE International Conference on Embedded Software and Systems, ICESS-2010, ScalCom-2010).

2011

Advanced MRI reconstruction toolbox with accelerating on GPU

Wu, X. L., Zhuo, Y., Gai, J., Lam, F., Fu, M., Haldar, J. P., Hwu, W. M., Liang, Z. P. & Sutton, B. P., Feb 11 2011, Proceedings of SPIE-IS and T Electronic Imaging - Parallel Processing for Imaging Applications. 78720Q. (Proceedings of SPIE - The International Society for Optical Engineering; vol. 7872).

A scalable tridiagonal solver for GPUs

Kim, H. S., Wu, S., Chang, L. W. & Hwu, W-M. W., Nov 7 2011, Proceedings - 2011 International Conference on Parallel Processing, ICPP 2011. p. 444-453 10 p. 6047212. (Proceedings of the International Conference on Parallel Processing).

A tiling-scheme Viterbi decoder in Software Defined Radio for GPUs

Lin, C. S., Liu, W. L., Yeh, W. T., Chang, L. W., Hwu, W. M. W., Chen, S. J. & Hsiung, P. A., 2011, 7th International Conference on Wireless Communications, Networking and Mobile Computing, WiCOM 2011. 6036680. (7th International Conference on Wireless Communications, Networking and Mobile Computing, WiCOM 2011).

Impatient MRI: Illinois Massively Parallel Acceleration Toolkit for image reconstruction with enhanced throughput in MRI

Wu, X. L., Gai, J., Lam, F., Fu, M., Haldar, J. P., Zhuo, Y., Liang, Z. P., Hwu, W. M. & Sutton, B. P., Nov 2 2011, 2011 8th IEEE International Symposium on Biomedical Imaging: From Nano to Macro, ISBI'11. p. 69-72 4 p. 5872356. (Proceedings - International Symposium on Biomedical Imaging).

Multilevel granularity parallelism synthesis on FPGAs

Papakonstantinou, A., Liang, Y., Stratton, J. A., Gururaj, K., Chen, D., Hwu, W-M. W. & Cong, J., Jun 17 2011, Proceedings - IEEE International Symposium on Field-Programmable Custom Computing Machines, FCCM 2011. p. 178-185 8 p. 5771270. (Proceedings - IEEE International Symposium on Field-Programmable Custom Computing Machines, FCCM 2011).

Parallel implementation of multi-dimensional ensemble empirical mode decomposition

Chang, L. W., Lo, M. T., Anssari, N., Hsu, K. H., Huang, N. E. & Hwu, W. M. W., Aug 18 2011, 2011 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2011 - Proceedings. p. 1621-1624 4 p. 5946808. (ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings).

2012

A scalable, numerically stable, high-performance tridiagonal solver using GPUs

Chang, L. W., Stratton, J. A., Kim, H. S. & Hwu, W-M. W., Dec 1 2012, 2012 International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2012. 6468510. (International Conference for High Performance Computing, Networking, Storage and Analysis, SC).

Design evaluation of OpenCL compiler framework for coarse-grained reconfigurable arrays

Kim, H. S., Ahn, M., Stratton, J. A. & Hwu, W. M. W., Dec 1 2012, FPT 2012 - 2012 International Conference on Field-Programmable Technology. p. 313-320 8 p. 6412155. (FPT 2012 - 2012 International Conference on Field-Programmable Technology).

DL: A data layout transformation system for heterogeneous computing

Sung, I. J., Liu, G. D. & Hwu, W. M. W., Dec 12 2012, 2012 Innovative Parallel Computing, InPar 2012. 6339606. (2012 Innovative Parallel Computing, InPar 2012).

Efficient pattern-based time series classification on GPU

Chang, K. W., Deka, B., Hwu, W-M. W. & Roth, D., 2012, Proceedings - 12th IEEE International Conference on Data Mining, ICDM 2012. p. 131-140 10 p. 6413748

Efficient performance evaluation of memory hierarchy for highly multithreaded graphics processors

Baghsorkhi, S. S., Gelado, I., Delahaye, M. & Hwu, W-M. W., Mar 22 2012, PPoPP'12 - Proceedings of the 2012 ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming. p. 23-33 11 p. (Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPOPP).

High-speed interferometric synthetic aperture microscopy on a graphics processing unit

Ahmad, A., Shemonski, N., Adie, S. G., Kim, H., Hwu, W. M. W., Carney, P. S. & Boppart, S. A., 2012, Frontiers in Optics, FIO 2012.

Implementing a GPU programming model on a non-GPU accelerator architecture

Kofsky, S. M., Johnson, D. R., Stratton, J. A., Hwu, W. M. W., Patel, S. J. & Lumetta, S. S., Mar 8 2012, Computer Architecture - ISCA 2010 International Workshops, A4MMC, AMAS-BT, EAMA, WEED, WIOSCA, Revised Selected Papers. p. 40-51 12 p. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); vol. 6161 LNCS).

Interferometric synthetic aperture microscopy with computational adaptive optics for high-resolution tomography of scattering tissue

Adie, S. G., Ahmad, A., Shemonski, N., Graf, B. W., Kim, H., Hwu, W. M. W., Carney, P. S. & Boppart, S. A., 2012, Biomedical Optics, BIOMED 2012.

Optimization and architecture effects on GPU computing workload performance

Stratton, J. A., Anssari, N., Rodrigues, C., Sung, I. J., Obeid, N., Chang, L., Liu, G. D. & Hwu, W-M. W., Dec 12 2012, 2012 Innovative Parallel Computing, InPar 2012. 6339605. (2012 Innovative Parallel Computing, InPar 2012).

2013

CLMPI: An OpenCL extension for interoperation with the message passing interface

Takizawa, H., Sugawara, M., Hirasawa, S., Gelado, I., Kobayashi, H. & Hwu, W. M. W., Jan 1 2013, Proceedings - IEEE 27th International Parallel and Distributed Processing Symposium Workshops and PhD Forum, IPDPSW 2013. IEEE Computer Society, p. 1138-1148 11 p. 6651000. (Proceedings - IEEE 27th International Parallel and Distributed Processing Symposium Workshops and PhD Forum, IPDPSW 2013).

Comparison based sorting for systems with multiple GPUs

Tanasic, I., Vilanova, L., Jordà, M., Cabezas, J., Gelado, I., Navarro, N. & Hwu, W. M., Apr 15 2013, Proceedings of the 6th Workshop on General Purpose Processor Using Graphics Processing Units, GPGPU 2013. p. 1-11 11 p. (ACM International Conference Proceeding Series).

Throughput-oriented kernel porting onto FPGAs

Papakonstantinou, A., Chen, D., Hwu, W. M., Cong, J. & Yun, L., Jul 12 2013, Proceedings of the 50th Annual Design Automation Conference, DAC 2013. 11. (Proceedings - Design Automation Conference).

2014

Adaptive cache bypass and insertion for many-core accelerators

Chen, X., Wu, S., Chang, L. W., Huang, W. S., Pearson, C., Wang, Z. & Hwu, W. M. W., Jan 1 2014, 2nd ACM International Workshop on Many-Core Embedded Systems, MES 2014 - In Conjunction with the 41st International Symposium on Computer Architecture, ISCA 2014. Association for Computing Machinery, p. 1-8 8 p. (ACM International Conference Proceeding Series).
