Wen-Mei W Hwu

Research Output 1984-2019

Triangle Counting and Truss Decomposition using FPGA

Huang, S., El-Hadedy, M., Hao, C., Li, Q., Mailthody, V. S., Date, K., Xiong, J., Chen, D., Nagi, R. & Hwu, W-M. W., Nov 26 2018, 2018 IEEE High Performance Extreme Computing Conference, HPEC 2018. Institute of Electrical and Electronics Engineers Inc., 8547536. (2018 IEEE High Performance Extreme Computing Conference, HPEC 2018).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Field programmable gate arrays (FPGA)
Decomposition
Graphics processing unit

Trimaran: An infrastructure for research in instruction-level parallelism

Chakrapani, L. N., Gyllenhaal, J., Hwu, W-M. W., Mahlke, S. A., Palem, K. V. & Rabbah, R. M., Oct 19 2005, In: Lecture Notes in Computer Science. 3602, p. 32-41 10 p.

Research output: Contribution to journal › Conference article

Instruction Level Parallelism
Performance Monitoring
Infrastructure
Compiler Optimization
Module

TrIMS: Transparent and isolated model sharing for low latency deep learning inference in function-as-a-service

Dakkak, A., Li, C., De Gonzalo, S. G., Xiong, J. & Hwu, W. M., Jul 2019, Proceedings - 2019 IEEE International Conference on Cloud Computing, CLOUD 2019 - Part of the 2019 IEEE World Congress on Services. Bertino, E., Chang, C. K., Chen, P., Damiani, E., Goul, M. & Oyama, K. (eds.). IEEE Computer Society, p. 372-382 11 p. 8814494. (IEEE International Conference on Cloud Computing, CLOUD; vol. 2019-July).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Containers
Pipelines
Image classification
Cloud computing
Program processors

Triolet: A programming system that unifies algorithmic skeleton interfaces for high-performance cluster computing

Rodrigues, C., Jablin, T., Dakkak, A. & Hwu, W. M., Mar 10 2014, PPoPP 2014 - Proceedings of the 2014 ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming. p. 247-258 12 p. (Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPOPP).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Cluster computing
Computer systems programming
Data storage equipment
Parallel programming
Electric fuses

Triolet: A programming system that unifies algorithmic skeleton interfaces for high-performance cluster computing

Rodrigues, C., Jablin, T., Dakkak, A. & Hwu, W. M., Aug 2014, In: ACM SIGPLAN Notices. 49, 8, p. 247-258 12 p.

Research output: Contribution to journal › Article

Cluster computing
Computer systems programming
Data storage equipment
Parallel programming
Electric fuses

Unrolling-based optimizations for modulo scheduling

Lavery, D. M. & Hwu, W. M. W., Dec 1 1995, In: Proceedings of the Annual International Symposium on Microarchitecture. p. 327-337 11 p.

Research output: Contribution to journal › Conference article

Scheduling
Throughput

Using GPUs to accelerate advanced MRI reconstruction with field inhomogeneity compensation

Zhuo, Y., Wu, X. L., Haldar, J. P., Marin, T., Hwu, W. M. W., Liang, Z. P. & Sutton, B. P., Dec 1 2011, GPU Computing Gems Emerald Edition. Elsevier Inc., p. 709-722 14 p.

Research output: Chapter in Book/Report/Conference proceeding › Chapter

Magnetic resonance imaging
Biochemistry
Parallel programming
Image reconstruction
Data acquisition

Using profile information to assist advanced compiler optimization and scheduling

Chen, W., Bringmann, R., Mahlke, S., Anik, S., Kiyohara, T., Warter, N., Lavery, D., Hwu, W. M., Hank, R. & Gyllenhaal, J., Jan 1 1993, Languages and Compilers for Parallel Computing - 5th International Workshop, Proceedings. Padua, D., Nicolau, A., Gelernter, D. & Banerjee, U. (eds.). Springer-Verlag, p. 31-48 18 p. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); vol. 757 LNCS).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Compiler Optimization
Global optimization
Instruction Level Parallelism
Flow control
Scheduling

Using profile information to assist classic code optimizations

Chang, P. P., Mahlke, S. A. & Hwu, W-M. W., Dec 1991, In: Software: Practice and Experience. 21, 12, p. 1301-1321 21 p.

Research output: Contribution to journal › Article

Global optimization

Vacuum packing: Extracting hardware-detected program phases for post-link optimization

Barnes, R. D., Nystrom, E. M., Merten, M. C. & Hwu, W-M. W., Jan 1 2002, Proceedings - 35th Annual IEEE/ACM International Symposium on Microarchitecture, MICRO 2002. IEEE Computer Society, p. 233-244 12 p. 1176253. (Proceedings of the Annual International Symposium on Microarchitecture, MICRO; vol. 2002-January).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Vacuum
Hardware
Phase transitions

Visualization and analysis of GPU summer school applicants and participants

Wah, E., Johnson, E., Auvil, L., Thakkar, U., Hwu, W. M., Kirk, D., Dunning, T. H. & Glotzer, S. C., Dec 1 2008, Proceedings - 4th IEEE International Conference on eScience, eScience 2008. p. 362-363 2 p. 4736797. (Proceedings - 4th IEEE International Conference on eScience, eScience 2008).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Visualization
Association rules
Parallel processing systems
Particle accelerators
Data mining

WebGPU: A scalable online development platform for GPU programming courses

Dakkak, A., Pearson, C. & Hwu, W. M., Jul 18 2016, Proceedings - 2016 IEEE 30th International Parallel and Distributed Processing Symposium, IPDPS 2016. Institute of Electrical and Electronics Engineers Inc., p. 942-949 8 p. 7529962. (Proceedings - 2016 IEEE 30th International Parallel and Distributed Processing Symposium, IPDPS 2016).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Computer programming
Students
Parallel programming
Computer science
Graphics processing unit

What is ahead for parallel computing

Hwu, W-M. W., Jul 2014, In: Journal of Parallel and Distributed Computing. 74, 7, p. 2574-2581 8 p.

Research output: Contribution to journal › Article

Parallel processing systems
Parallel Computing
Parallel algorithms
Many-core

XMalloc: A scalable lock-free dynamic memory allocator for many-core machines

Huang, X., Rodrigues, C. I., Jones, S., Buck, I. & Hwu, W. M., Nov 19 2010, Proceedings - 10th IEEE International Conference on Computer and Information Technology, CIT-2010, 7th IEEE International Conference on Embedded Software and Systems, ICESS-2010, ScalCom-2010. p. 1134-1139 6 p. 5577907. (Proceedings - 10th IEEE International Conference on Computer and Information Technology, CIT-2010, 7th IEEE International Conference on Embedded Software and Systems, ICESS-2010, ScalCom-2010).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Data storage equipment
Processing
Graphics processing unit

Xprof: Profiling the execution of X Window programs

Gupta, A. & Hwu, W. M. W., Jun 1 1992, Proceedings of the 1992 ACM SIGMETRICS Joint International Conference on Measurement and Modeling of Computer Systems, SIGMETRICS/PERFORMANCE 1992. Gaither, B. D. (ed.). Association for Computing Machinery, Inc, p. 253-254 2 p. (Proceedings of the 1992 ACM SIGMETRICS Joint International Conference on Measurement and Modeling of Computer Systems, SIGMETRICS/PERFORMANCE 1992).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution