Wen-Mei W Hwu

1984 …2019

Research Output 1984–2019

Conference contribution

Xprof: Profiling the execution of X Window programs

Gupta, A. & Hwu, W. M. W., Jun 1 1992, Proceedings of the 1992 ACM SIGMETRICS Joint International Conference on Measurement and Modeling of Computer Systems, SIGMETRICS/PERFORMANCE 1992. Gaither, B. D. (ed.). Association for Computing Machinery, Inc, p. 253-254 2 p. (Proceedings of the 1992 ACM SIGMETRICS Joint International Conference on Measurement and Modeling of Computer Systems, SIGMETRICS/PERFORMANCE 1992).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

XMalloc: A scalable lock-free dynamic memory allocator for many-core machines

Huang, X., Rodrigues, C. I., Jones, S., Buck, I. & Hwu, W. M., Nov 19 2010, Proceedings - 10th IEEE International Conference on Computer and Information Technology, CIT-2010, 7th IEEE International Conference on Embedded Software and Systems, ICESS-2010, ScalCom-2010. p. 1134-1139 6 p. 5577907. (Proceedings - 10th IEEE International Conference on Computer and Information Technology, CIT-2010, 7th IEEE International Conference on Embedded Software and Systems, ICESS-2010, ScalCom-2010).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Data storage equipment
Processing
Graphics processing unit

WebGPU: A scalable online development platform for GPU programming courses

Dakkak, A., Pearson, C. & Hwu, W. M., Jul 18 2016, Proceedings - 2016 IEEE 30th International Parallel and Distributed Processing Symposium, IPDPS 2016. Institute of Electrical and Electronics Engineers Inc., p. 942-949 8 p. 7529962. (Proceedings - 2016 IEEE 30th International Parallel and Distributed Processing Symposium, IPDPS 2016).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Computer programming
Students
Parallel programming
Computer science
Graphics processing unit

Visualization and analysis of GPU summer school applicants and participants

Wah, E., Johnson, E., Auvil, L., Thakkar, U., Hwu, W. M., Kirk, D., Dunning, T. H. & Glotzer, S. C., Dec 1 2008, Proceedings - 4th IEEE International Conference on eScience, eScience 2008. p. 362-363 2 p. 4736797. (Proceedings - 4th IEEE International Conference on eScience, eScience 2008).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Visualization
Association rules
Parallel processing systems
Particle accelerators
Data mining

Vacuum packing: Extracting hardware-detected program phases for post-link optimization

Barnes, R. D., Nystrom, E. M., Merten, M. C. & Hwu, W-M. W., Jan 1 2002, Proceedings - 35th Annual IEEE/ACM International Symposium on Microarchitecture, MICRO 2002. IEEE Computer Society, p. 233-244 12 p. 1176253. (Proceedings of the Annual International Symposium on Microarchitecture, MICRO; vol. 2002-January).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Vacuum
Hardware
Phase transitions

Using profile information to assist advanced compiler optimization and scheduling

Chen, W., Bringmann, R., Mahlke, S., Anik, S., Kiyohara, T., Warter, N., Lavery, D., Hwu, W. M., Hank, R. & Gyllenhaal, J., Jan 1 1993, Languages and Compilers for Parallel Computing - 5th International Workshop, Proceedings. Padua, D., Nicolau, A., Gelernter, D. & Banerjee, U. (eds.). Springer-Verlag, p. 31-48 18 p. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); vol. 757 LNCS).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Compiler Optimization
Global optimization
Instruction Level Parallelism
Flow control
Scheduling

Triolet: A programming system that unifies algorithmic skeleton interfaces for high-performance cluster computing

Rodrigues, C., Jablin, T., Dakkak, A. & Hwu, W. M., Mar 10 2014, PPoPP 2014 - Proceedings of the 2014 ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming. p. 247-258 12 p. (Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPOPP).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Cluster computing
Computer systems programming
Data storage equipment
Parallel programming
Electric fuses

TrIMS: Transparent and isolated model sharing for low latency deep learning inference in function-as-a-service

Dakkak, A., Li, C., De Gonzalo, S. G., Xiong, J. & Hwu, W. M., Jul 2019, Proceedings - 2019 IEEE International Conference on Cloud Computing, CLOUD 2019 - Part of the 2019 IEEE World Congress on Services. Bertino, E., Chang, C. K., Chen, P., Damiani, E., Goul, M. & Oyama, K. (eds.). IEEE Computer Society, p. 372-382 11 p. 8814494. (IEEE International Conference on Cloud Computing, CLOUD; vol. 2019-July).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Containers
Pipelines
Image classification
Cloud computing
Program processors

Triangle Counting and Truss Decomposition using FPGA

Huang, S., El-Hadedy, M., Hao, C., Li, Q., Mailthody, V. S., Date, K., Xiong, J., Chen, D., Nagi, R. & Hwu, W-M. W., Nov 26 2018, 2018 IEEE High Performance Extreme Computing Conference, HPEC 2018. Institute of Electrical and Electronics Engineers Inc., 8547536. (2018 IEEE High Performance Extreme Computing Conference, HPEC 2018).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Field programmable gate arrays (FPGA)
Decomposition
Graphics processing unit

Transitioning HPC software to exascale heterogeneous computing

Hwu, W-M. W., Chang, L. W., Kim, H. S., Dakkak, A. & El Hajj, I., Sep 2 2015, 2015 Computational Electromagnetics International Workshop, CEM 2015. Institute of Electrical and Electronics Engineers Inc., p. 4-5 2 p. 7237412. (2015 Computational Electromagnetics International Workshop, CEM 2015).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Heterogeneous Computing
Productivity
Computer systems programming
Unit
Software

Tolerating data access latency with register preloading

Chen, W. Y., Mahlke, S. A., Hwu, W-M. W., Kiyohara, T. & Chang, P. P., Aug 1 1992, Proceedings of the 6th International Conference on Supercomputing, ICS 1992. Association for Computing Machinery, p. 104-113 10 p. (Proceedings of the International Conference on Supercomputing; vol. Part F129617).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Supercomputers
Hardware
Data storage equipment

Throughput-oriented kernel porting onto FPGAs

Papakonstantinou, A., Chen, D., Hwu, W-M. W., Cong, J. & Yun, L., Jul 12 2013, Proceedings of the 50th Annual Design Automation Conference, DAC 2013. 11. (Proceedings - Design Automation Conference).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Field programmable gate arrays (FPGA)
Throughput
kernel
Coloring

Thoughts on massively-parallel heterogeneous computing for solving large problems

Hwu, W. M., Hidayetoğlu, M., Chew, W. C., Pearson, C., Garcia, S., Huang, S. & Dakkak, A., Jul 25 2017, CEM 2017 - 2017 Computing and Electromagnetics International Workshop. Gurel, L. (ed.). Institute of Electrical and Electronics Engineers Inc., p. 67-68 2 p. 7991890. (CEM 2017 - 2017 Computing and Electromagnetics International Workshop).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Supercomputers
Fruits
Innovation
Scattering
scaling

The future of computer architecture research: An industrial perspective

Hwu, W. M. & Patel, S., Dec 12 2005, Proceedings - 11th International Symposium on High-Performance Computer Architecture, HPCA-11 2005. 1 p. (Proceedings - International Symposium on High-Performance Computer Architecture).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Industrial research
Computer architecture
Industry
Hardware

The benefit of predicated execution for software pipelining

Warter, N. J., Lavery, D. R. & Hwu, W-M. W., Jan 1 1993, Proceedings of the 26th Hawaii International Conference on System Sciences, HICSS 1993. IEEE Computer Society, p. 497-506 10 p. 1198122. (Proceedings of the Annual Hawaii International Conference on System Sciences; vol. 1).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Scheduling algorithms
Microprocessor chips
Hardware
Costs
Experiments

The application of compiler-assisted multiple-instruction retry to VLIW architectures

Chen, S. K., Fuchs, W. K. & Hwu, W-M. W., Jan 1 1994, Proceedings of IEEE Workshop on Fault-Tolerant Parallel and Distributed Systems, FTPDS 1994. Pradhan, D. & Avresky, D. (eds.). Institute of Electrical and Electronics Engineers Inc., p. 51-58 8 p. 494474. (Proceedings of IEEE Workshop on Fault-Tolerant Parallel and Distributed Systems, FTPDS 1994).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Very long instruction word architecture
Hazards
Hardware

Systematic prototyping of superscalar computer architectures

Conte, T. M. & Hwu, W-M. W., Jan 1 1992, Proceedings - 3rd International Workshop on Rapid System Prototyping: Shortening the Path from Specification to Prototype, RSP 1992. IEEE Computer Society, p. 161-170 10 p. 243910. (Proceedings of the International Workshop on Rapid System Prototyping; vol. 1992-January).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Computer architecture
Architectural design
Hardware
Data storage equipment

Supercomputing for Full-Wave Tomographic Image Reconstruction in Near-Real Time

Hidayetoğlu, M., Hwu, W-M. W. & Chew, W. C., Jan 1 2018, 2018 IEEE Antennas and Propagation Society International Symposium and USNC/URSI National Radio Science Meeting, APSURSI 2018 - Proceedings. Institute of Electrical and Electronics Engineers Inc., p. 1841-1842 2 p. 8608869. (2018 IEEE Antennas and Propagation Society International Symposium and USNC/URSI National Radio Science Meeting, APSURSI 2018 - Proceedings).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Image reconstruction
multipoles
Program processors
Iterative methods

Superblock formation using static program analysis

Hank, R. E., Mahlke, S. A., Bringmann, R. A., Gyllenhaal, J. C. & Hwu, W-M. W., Jan 1 1994, Proceedings of the Annual International Symposium on Microarchitecture. Anon (ed.). Publ by IEEE, p. 247-255 9 p. (Proceedings of the Annual International Symposium on Microarchitecture).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Scheduling

Study of the cache and branch performance issues with running Java on current hardware platforms

Hsieh, C. H. A., Conte, M. T., Johnson, T. L., Gyllenhaal, J. C. & Hwu, W-M. W., 1997, Digest of Papers - COMPCON - IEEE Computer Society International Conference. Anon (ed.). IEEE, p. 211-216 6 p.

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Caffeine
Hardware

Speculative execution exception recovery using write-back suppression

Bringmann, R. A., Mahlke, S. A., Hank, R. E., Gyllenhaal, J. C. & Hwu, W-M. W., 1994, Proceedings of the Annual International Symposium on Microarchitecture. Anon (ed.). Publ by IEEE, p. 214-223 10 p.

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Hardware
Recovery
Experiments

SPEC ACCEL: A standard application suite for measuring hardware accelerator performance

Juckeland, G., Brantley, W., Chandrasekaran, S., Chapman, B., Che, S., Colgrove, M., Feng, H., Grund, A., Henschel, R., Hwu, W. M. W., Li, H., Müller, M. S., Nagel, W. E., Perminov, M., Shelepugin, P., Skadron, K., Stratton, J., Titov, A., Wang, K., Van Waveren, M., Whitney, B., Wienke, S., Xu, R. & Kumaran, K., Jan 1 2015, High Performance Computing Systems: Performance Modeling, Benchmarking, and Simulation - 5th International Workshop, PMBS 2014, Revised Selected Papers. Hammond, S. D., Jarvis, S. A. & Wright, S. A. (eds.). Springer-Verlag, p. 46-67 22 p. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); vol. 8966).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Hardware Accelerator
Accelerator
Particle accelerators
Hardware
Benchmark

Sparse regularization in MRI iterative reconstruction using GPUs

Zhuo, Y., Sutton, B., Wu, X. L., Haldar, J., Hwu, W. M. & Liang, Z. P., Dec 1 2010, Proceedings - 2010 3rd International Conference on Biomedical Engineering and Informatics, BMEI 2010. p. 578-582 5 p. 5640008. (Proceedings - 2010 3rd International Conference on Biomedical Engineering and Informatics, BMEI 2010; vol. 2).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Computer-Assisted Image Processing
Magnetic resonance imaging
Image reconstruction
Program processors
Communication

SpaceJMP: Programming with multiple virtual address spaces

El Hajj, I., Merritt, A., Zellweger, G., Milojicic, D., Achermann, R., Faraboschi, P., Hwu, W. M., Roscoe, T. & Schwan, K., Mar 25 2016, ASPLOS 2016 - 21st International Conference on Architectural Support for Programming Languages and Operating Systems. Association for Computing Machinery, p. 353-368 16 p. (International Conference on Architectural Support for Programming Languages and Operating Systems - ASPLOS; vol. 02-06-April-2016).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Virtual addresses
Computer programming
Data storage equipment
Computer operating systems
Data structures

Sentinel scheduling for VLIW and superscalar processors

Mahlke, S. A., Chen, W. Y., Hwu, W-M. W., Rau, B. R. & Schlansker, M. S., Jan 1 1992, International Conference on Architectural Support for Programming Languages and Operating Systems - ASPLOS. 9 ed. Publ by ACM, p. 238-247 10 p. (International Conference on Architectural Support for Programming Languages and Operating Systems - ASPLOS; vol. 27, no. 9).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Scheduling

Seeing the invisible: Limited-view imaging with multiple-scattering reconstruction

Hidayetoglu, M., Hwu, W-M. W. & Chew, W. C., Feb 21 2018, 2018 United States National Committee of URSI National Radio Science Meeting, USNC-URSI NRSM 2018. Institute of Electrical and Electronics Engineers Inc., p. 1-2 2 p. (2018 United States National Committee of URSI National Radio Science Meeting, USNC-URSI NRSM 2018; vol. 2018-January).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Multiple scattering
Image reconstruction
Imaging techniques
Forward scattering

Scalable parallel DBIM solutions of inverse-scattering problems

Hidayetoğlu, M., Pearson, C., Gurel, L., Hwu, W. M. & Chew, W. C., Jul 25 2017, CEM 2017 - 2017 Computing and Electromagnetics International Workshop. Gurel, L. (ed.). Institute of Electrical and Electronics Engineers Inc., p. 65-66 2 p. 7991889. (CEM 2017 - 2017 Computing and Electromagnetics International Workshop).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

inverse scattering
Iterative methods
Scattering
distributing
Lighting

Run-time generation of HPS microinstructions from a VAX instruction stream

Patt, Y. N., Melvin, S. W., Hwu, W. M., Shebanow, M. C., Chen, C. & Wei, J., Dec 1 1986, MICRO: Annual Microprogramming Workshop. IEEE, p. 75-81 7 p. (MICRO: Annual Microprogramming Workshop).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Engines
Specifications

Reverse if-conversion

Warter, N. J., Mahlke, S. A., Hwu, W. M. W. & Rau, B. R., Dec 1 1993, Proceedings of the ACM SIGPLAN '93 Conference on Programming Language Design and Implementation. Anon (ed.). Publ by ACM, p. 290-299 10 p. (Proceedings of the ACM SIGPLAN '93 Conference on Programming Language Design and Implementation).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Flow graphs
Scheduling

Rebooting the data access hierarchy of computing systems

Hwu, W-M. W., El Hajj, I., De Gonzalo, S. G., Pearson, C., Kim, N. S., Chen, D., Xiong, J. & Sura, Z., Nov 28 2017, 2017 IEEE International Conference on Rebooting Computing, ICRC 2017 - Proceedings. Institute of Electrical and Electronics Engineers Inc., p. 1-4 4 p. (2017 IEEE International Conference on Rebooting Computing, ICRC 2017 - Proceedings; vol. 2017-January).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

hierarchies
compilers
touch
reuse
lessons learned

RAI: A scalable project submission system for parallel programming courses

Dakkak, A., Pearson, C., Li, C. & Hwu, W. M., Jun 30 2017, Proceedings - 2017 IEEE 31st International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2017. Institute of Electrical and Electronics Engineers Inc., p. 315-322 8 p. 7965062. (Proceedings - 2017 IEEE 31st International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2017).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Parallel programming
Computer programming
Computer hardware
Scalability
Students

PUMA: A Programmable Ultra-efficient Memristor-based Accelerator for Machine Learning Inference

Ankit, A., El Hajj, I., Chalamalasetti, S. R., Ndu, G., Foltin, M., Williams, R. S., Faraboschi, P., Hwu, W. M., Strachan, J. P., Roy, K. & Milojicic, D. S., Apr 4 2019, ASPLOS 2019 - 24th International Conference on Architectural Support for Programming Languages and Operating Systems. Association for Computing Machinery, p. 715-731 17 p. (International Conference on Architectural Support for Programming Languages and Operating Systems - ASPLOS).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Open Access
Memristors
Particle accelerators
Learning systems
Image recognition
Energy efficiency

Program optimization space pruning for a multithreaded GPU

Ryoo, S., Rodrigues, C. I., Stone, S. S., Baghsorkhi, S. S., Ueng, S. Z., Stratton, J. A. & Hwu, W. M. W., May 19 2008, Proceedings of the 2008 CGO - Sixth International Symposium on Code Generation and Optimization. p. 195-204 10 p. (Proceedings of the 2008 CGO - Sixth International Symposium on Code Generation and Optimization).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Graphics processing unit
Tuning
Inspection

Performance insights on executing non-graphics applications on CUDA on the NVIDIA GeForce 8800 GTX

Hwu, W. M., Kirk, D., Ryoo, S., Rodrigues, C., Stratton, J. & Huang, K., May 31 2016, 2007 IEEE Hot Chips 19 Symposium, HCS 2007. Institute of Electrical and Electronics Engineers Inc., 7482492. (2007 IEEE Hot Chips 19 Symposium, HCS 2007).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Systems analysis
Specifications
Data storage equipment
Processing
Graphics processing unit

Parallel solutions of inverse multiple scattering problems with Born-type fast solvers

Hidayetoğlu, M., Yang, C., Wang, L., Podkowa, A., Oelze, M., Hwu, W. M. & Chew, W. C., Nov 3 2016, 2016 Progress In Electromagnetics Research Symposium, PIERS 2016 - Proceedings. Institute of Electrical and Electronics Engineers Inc., p. 916-920 5 p. 7734520. (2016 Progress In Electromagnetics Research Symposium, PIERS 2016 - Proceedings).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Multiple scattering
Iterative methods
multipoles
Scattering

Parallel implementation of multi-dimensional ensemble empirical mode decomposition

Chang, L. W., Lo, M. T., Anssari, N., Hsu, K. H., Huang, N. E. & Hwu, W. M. W., Aug 18 2011, 2011 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2011 - Proceedings. p. 1621-1624 4 p. 5946808. (ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Program processors
Decomposition
Computer programming
Graphics processing unit

Optimization principles and application performance evaluation of a multithreaded GPU using CUDA

Ryoo, S., Rodrigues, C. I., Baghsorkhi, S. S., Stone, S. S., Kirk, D. B. & Hwu, W. M. W., Dec 1 2008, PPoPP'08 - Proceedings of the 2008 ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming. p. 73-82 10 p. (Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPOPP).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Data storage equipment
Bandwidth
Graphics processing unit
Hardware

Optimization of tele-immersion codes

Sidelnik, A., Sung, I. J., Wu, W., Garzarán, M. J., Hwu, W. M., Nahrstedt, K., Padua, D. & Patel, S. J., Jul 23 2009, Proceedings of 2nd Workshop on General Purpose Processing on Graphics Processing Units, GPGPU-2. 1 p. (Proceedings of 2nd Workshop on General Purpose Processing on Graphics Processing Units, GPGPU-2).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Computer vision
Parallel programming
Tuning
Productivity
Graphics processing unit

Optimization and architecture effects on GPU computing workload performance

Stratton, J. A., Anssari, N., Rodrigues, C., Sung, I. J., Obeid, N., Chang, L., Liu, G. D. & Hwu, W-M. W., Dec 12 2012, 2012 Innovative Parallel Computing, InPar 2012. 6339605. (2012 Innovative Parallel Computing, InPar 2012).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Hardware
Bandwidth
Dynamic random access storage
Coarsening
Throughput

On tuning the microarchitecture of an HPS implementation of the VAX

Wilson, J. E., Melvin, S., Shebanow, M., Hwu, W-M. W. & Patt, Y. N., 1987, MICRO: Annual Microprogramming Workshop. ACM, p. 162-167 6 p.

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Tuning

NUMA-Aware Data-Transfer Measurements for Power/NVLink Multi-GPU Systems

Pearson, C., Chung, I. H., Sura, Z., Hwu, W. M. & Xiong, J., Jan 1 2018, High Performance Computing - ISC High Performance 2018 International Workshops, Revised Selected Papers. Weiland, M., Yokota, R., Alam, S. & Shalf, J. (eds.). Springer-Verlag, p. 448-454 7 p. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); vol. 11203 LNCS).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Data transfer
Particle accelerators
Computer systems programming
Hardware

Near-Memory and In-Storage FPGA Acceleration for Emerging Cognitive Computing Workloads

Dhar, A., Huang, S., Xiong, J., Jamsek, D., Mesnet, B., Huang, J., Kim, N. S., Hwu, W. M. & Chen, D., Jul 2019, Proceedings - 2019 IEEE Computer Society Annual Symposium on VLSI, ISVLSI 2019. IEEE Computer Society, p. 68-75 8 p. 8839401. (Proceedings of IEEE Computer Society Annual Symposium on VLSI, ISVLSI; vol. 2019-July).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Field programmable gate arrays (FPGA)
Particle accelerators
Data storage equipment
Bandwidth
Data transfer

Multilevel granularity parallelism synthesis on FPGAs

Papakonstantinou, A., Liang, Y., Stratton, J. A., Gururaj, K., Chen, D., Hwu, W-M. W. & Cong, J., Jun 17 2011, Proceedings - IEEE International Symposium on Field-Programmable Custom Computing Machines, FCCM 2011. p. 178-185 8 p. 5771270. (Proceedings - IEEE International Symposium on Field-Programmable Custom Computing Machines, FCCM 2011).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Field programmable gate arrays (FPGA)
Particle accelerators
Clocks
Hardware
High level synthesis

MLModelScope: Evaluate and introspect cognitive pipelines

Li, C., Dakkak, A., Xiong, J. & Hwu, W. M., Jul 2019, Proceedings - 2019 IEEE World Congress on Services, SERVICES 2019. Chang, C. K., Chen, P., Goul, M., Oyama, K., Reiff-Marganiec, S., Sun, Y., Wang, S. & Wang, Z. (eds.). Institute of Electrical and Electronics Engineers Inc., p. 335-338 4 p. 8817116. (Proceedings - 2019 IEEE World Congress on Services, SERVICES 2019).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Pipelines
Learning systems
Innovation
Hardware
Deep learning

MemXCT: Memory-centric X-ray CT reconstruction with massive parallelization

Hidayetoğlu, M., Biçer, T., De Gonzalo, S. G., Ren, B., Gürsoy, D., Kettimuthu, R., Foster, I. T. & Hwu, W. M. W., Nov 17 2019, Proceedings of SC 2019: The International Conference for High Performance Computing, Networking, Storage and Analysis. IEEE Computer Society, a85. (International Conference for High Performance Computing, Networking, Storage and Analysis, SC).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Data storage equipment
X rays
Supercomputers
Synchrotrons
Image quality

MCUDA: An efficient implementation of CUDA kernels for multi-core CPUs

Stratton, J. A., Stone, S. S. & Hwu, W-M. W., Dec 1 2008, Languages and Compilers for Parallel Computing - 21st International Workshop, LCPC 2008, Revised Selected Papers. p. 16-30 15 p. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); vol. 5335 LNCS).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Efficient Implementation
Program processors
Parallel programming
kernel

Long time-scale simulations of in vivo diffusion using GPU hardware

Roberts, E., Stone, J. E., Sepúlveda, L., Hwu, W-M. W. & Luthey-Schulten, Z. A., Nov 25 2009, IPDPS 2009 - Proceedings of the 2009 IEEE International Parallel and Distributed Processing Symposium. 5160930. (IPDPS 2009 - Proceedings of the 2009 IEEE International Parallel and Distributed Processing Symposium).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Hardware
Graphics processing unit

Locality-centric thread scheduling for bulk-synchronous programming models on CPU architectures

Kim, H. S., El Hajj, I., Stratton, J., Lumetta, S. S. & Hwu, W-M. W., Mar 3 2015, Proceedings of the 2015 IEEE/ACM International Symposium on Code Generation and Optimization, CGO 2015. Institute of Electrical and Electronics Engineers Inc., p. 257-268 12 p. 7054205. (Proceedings of the 2015 IEEE/ACM International Symposium on Code Generation and Optimization, CGO 2015).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Computer programming
Locality
Thread
Programming Model
Program processors

Large inverse-scattering solutions with DBIM on GPU-enabled supercomputers

Hidayetoğlu, M., Pearson, C., Chew, W. C., Gurel, L. & Hwu, W-M. W., May 1 2017, 2017 International Applied Computational Electromagnetics Society Symposium - Italy, ACES 2017. Institute of Electrical and Electronics Engineers Inc., 7916310. (2017 International Applied Computational Electromagnetics Society Symposium - Italy, ACES 2017).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Supercomputers
Inverse scattering

KLAP: Kernel launch aggregation and promotion for optimizing dynamic parallelism

El Hajj, I., Gomez-Luna, J., Li, C., Chang, L. W., Milojicic, D. & Hwu, W. M., Dec 14 2016, MICRO 2016 - 49th Annual IEEE/ACM International Symposium on Microarchitecture. IEEE Computer Society, 7783716. (Proceedings of the Annual International Symposium on Microarchitecture, MICRO; vol. 2016-December).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Agglomeration
Electric fuses
Throughput
Graphics processing unit