Wen-Mei W Hwu

1984 …2019

Research Output 1984–2019

Conference contribution

AccDNN: An IP-Based DNN Generator for FPGAs

Zhang, X., Wang, J., Zhu, C., Lin, Y., Xiong, J., Hwu, W-M. W. & Chen, D., Sep 7 2018, Proceedings - 26th IEEE International Symposium on Field-Programmable Custom Computing Machines, FCCM 2018. Institute of Electrical and Electronics Engineers Inc., 1 p. 8457659. (Proceedings - 26th IEEE International Symposium on Field-Programmable Custom Computing Machines, FCCM 2018).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Field programmable gate arrays (FPGA)
Data storage equipment
Network layers
Cloud computing
Resource allocation

Accelerating advanced MRI reconstructions on GPUs

Stone, S. S., Haldar, J. P., Tsao, S. C., Hwu, W. M. W., Liang, Z. P. & Sutton, B. P., Dec 1 2008, Conference on Computing Frontiers - Proceedings of the 2008 Conference on Computing Frontiers, CF'08. p. 261-272 12 p. (Conference on Computing Frontiers - Proceedings of the 2008 Conference on Computing Frontiers, CF'08).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Magnetic resonance
Imaging techniques
Image quality
Data storage equipment
Image reconstruction

Accelerating iterative field-compensated MR image reconstruction on GPUs

Zhuo, Y., Wu, X. L., Haldar, J. P., Hwu, W. M., Liang, Z. P. & Sutton, B. P., Aug 9 2010, 2010 7th IEEE International Symposium on Biomedical Imaging: From Nano to Macro, ISBI 2010 - Proceedings. p. 820-823 4 p. 5490112. (2010 7th IEEE International Symposium on Biomedical Imaging: From Nano to Macro, ISBI 2010 - Proceedings).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Computer-Assisted Image Processing
Image reconstruction
Magnetic Fields
Physics

Accelerating MR image reconstruction on GPUs

Hwu, W. M. W., Nandakumar, D., Haldar, J., Atkinson, I. C., Sutton, B., Liang, Z. P. & Thulborn, K. R., Nov 17 2009, Proceedings - 2009 IEEE International Symposium on Biomedical Imaging: From Nano to Macro, ISBI 2009. p. 1283-1286 4 p. 5193297. (Proceedings - 2009 IEEE International Symposium on Biomedical Imaging: From Nano to Macro, ISBI 2009).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Computer-Assisted Image Processing
Image reconstruction
Imaging techniques
Graphics processing unit

Accelerating reduction and scan using tensor core units

Dakkak, A., Li, C., Xiong, J., Gelado, I. & Hwu, W. M., Jun 26 2019, ICS 2019 - International Conference on Supercomputing. Association for Computing Machinery, p. 46-57 12 p. (Proceedings of the International Conference on Supercomputing).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Tensors
Energy efficiency
Electric power utilization
Bandwidth
Data storage equipment

Acceleration of the Pair-HMM Algorithm for DNA Variant Calling

Manikandan, G. J., Huang, S., Rupnow, K., Hwu, W-M. W. & Chen, D., Aug 16 2016, Proceedings - 24th IEEE International Symposium on Field-Programmable Custom Computing Machines, FCCM 2016. Institute of Electrical and Electronics Engineers Inc., 1 p. 7544765. (Proceedings - 24th IEEE International Symposium on Field-Programmable Custom Computing Machines, FCCM 2016).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

DNA
Particle accelerators
High level synthesis
System-on-chip

Adaptive cache bypass and insertion for many-core accelerators

Chen, X., Wu, S., Chang, L. W., Huang, W. S., Pearson, C., Wang, Z. & Hwu, W. M. W., Jan 1 2014, 2nd ACM International Workshop on Many-Core Embedded Systems, MES 2014 - In Conjunction with the 41st International Symposium on Computer Architecture, ISCA 2014. Association for Computing Machinery, p. 1-8 8 p. (ACM International Conference Proceeding Series).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Particle accelerators
Data storage equipment
Energy efficiency
Graphics processing unit

Adaptive Cache Management for Energy-Efficient GPU Computing

Chen, X., Chang, L. W., Rodrigues, C. I., Lv, J., Wang, Z. & Hwu, W-M. W., Jan 15 2015, Proceedings - 47th Annual IEEE/ACM International Symposium on Microarchitecture, MICRO 2014. January ed. IEEE Computer Society, p. 343-355 13 p. 7011400. (Proceedings of the Annual International Symposium on Microarchitecture, MICRO; vol. 2015-January, no. January).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Data storage equipment
Energy efficiency
Throughput
Graphics processing unit

Advanced MRI reconstruction toolbox with accelerating on GPU

Wu, X. L., Zhuo, Y., Gai, J., Lam, F., Fu, M., Haldar, J. P., Hwu, W. M., Liang, Z. P. & Sutton, B. P., Feb 11 2011, Proceedings of SPIE-IS and T Electronic Imaging - Parallel Processing for Imaging Applications. 78720Q. (Proceedings of SPIE - The International Society for Optical Engineering; vol. 7872).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Magnetic Resonance Imaging
Magnetic resonance
Imaging techniques
Reconstruction Algorithm

A fast and massively-parallel inverse solver for multiple-scattering tomographic image reconstruction

Hidayetoglu, M., Pearson, C., El Hajj, I., Gurel, L., Chew, W. C. & Hwu, W-M. W., Aug 3 2018, Proceedings - 2018 IEEE 32nd International Parallel and Distributed Processing Symposium, IPDPS 2018. Institute of Electrical and Electronics Engineers Inc., p. 64-74 11 p. 8425161. (Proceedings - 2018 IEEE 32nd International Parallel and Distributed Processing Symposium, IPDPS 2018).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Multiple scattering
Image reconstruction
Scattering
Forward scattering
Iterative methods

An adaptive performance modeling tool for GPU architectures

Baghsorkhi, S. S., Delahaye, M., Patel, S. J., Gropp, W. D. & Hwu, W. M. W., Mar 15 2010, PPoPP'10 - Proceedings of the 2010 ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming. p. 105-114 10 p. (Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPOPP).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Flow graphs
Data storage equipment
Graphics processing unit
Flow control
Analytical models

Analysis and modeling of collaborative execution strategies for heterogeneous CPU-FPGA architectures

Huang, S., De Gonzalo, S. G., El-Hadedy, M., Chang, L. W., Gómez-Luna, J., Milojicic, D., El Hajj, I., Chalamalasetti, S. R., Mutlu, O., Chen, D. & Hwu, W. M., Apr 4 2019, ICPE 2019 - Proceedings of the 2019 ACM/SPEC International Conference on Performance Engineering. Association for Computing Machinery, Inc, p. 79-90 12 p. (ICPE 2019 - Proceedings of the 2019 ACM/SPEC International Conference on Performance Engineering).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Program processors
Field programmable gate arrays (FPGA)
Particle accelerators
Data storage equipment
Computer programming languages

An analytical approach to scheduling code for superscalar and VLIW architectures

Chen, S. K., Fuchs, W. & Hwu, W-M. W., Jan 1 1994, Proceedings of the 1994 International Conference on Parallel Processing, ICPP 1994. Institute of Electrical and Electronics Engineers Inc., 4115732. (Proceedings of the International Conference on Parallel Processing; vol. 1).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Very long instruction word architecture
Superscalar
Scheduling
Speedup
Subroutines

An architecture framework for introducing predicated execution into embedded microprocessors

Connors, D. A., Puiatti, J. M., August, D. I., Crozier, K. M. & Hwu, W. M. W., Dec 1 1999, Euro-Par 1999 - Parallel Processing: 5th International Conference, Proceedings. p. 1301-1311 11 p. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); vol. 1685 LNCS).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Instruction Level Parallelism
Microprocessor
Microprocessor chips
Branch
High Performance

An asymmetric distributed shared memory model for heterogeneous parallel systems

Gelado, I., Cabezas, J., Navarro, N., Stone, J. E., Patel, S. & Hwu, W. M. W., May 19 2010, ASPLOS XV - 15th International Conference on Architectural Support for Programming Languages and Operating Systems. p. 347-358 12 p. (International Conference on Architectural Support for Programming Languages and Operating Systems - ASPLOS).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Computer systems
Data storage equipment
Particle accelerators
Program processors
Data transfer

An effective GPU implementation of breadth-first search

Luo, L., Wong, M. & Hwu, W. M., Sep 7 2010, Proceedings of the 47th Design Automation Conference, DAC '10. p. 52-55 4 p. (Proceedings - Design Automation Conference).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Breadth-first Search
Program processors
Design Automation
Computational complexity
Accelerate

An efficient GPU implementation technique for higher-order 3D stencils

Anjum, O., Simon, G. D. G., Hidayetoglu, M. & Hwu, W. M., Aug 2019, Proceedings - 21st IEEE International Conference on High Performance Computing and Communications, 17th IEEE International Conference on Smart City and 5th IEEE International Conference on Data Science and Systems, HPCC/SmartCity/DSS 2019. Xiao, Z., Yang, L. T., Balaji, P., Li, T., Li, K. & Zomaya, A. (eds.). Institute of Electrical and Electronics Engineers Inc., p. 552-561 10 p. 8855722. (Proceedings - 21st IEEE International Conference on High Performance Computing and Communications, 17th IEEE International Conference on Smart City and 5th IEEE International Conference on Data Science and Systems, HPCC/SmartCity/DSS 2019).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Data storage equipment
Bandwidth
Graphics processing unit
Grid
Scaling

An empirical study of function pointers using SPEC benchmarks

Cheng, B. C. & Hwu, W-M. W., Jan 1 2000, Languages and Compilers for Parallel Computing - 12th International Workshop, LCPC 1999, Proceedings. Carter, L. & Ferrante, J. (eds.). Springer-Verlag, p. 490-493 4 p. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); vol. 1863).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Empirical Study
Benchmark
Extractor
Graph in graph theory
Compiler

An experimental single-chip data flow CPU

Uvieghara, G. A., Hwu, W-M. W., Nakagome, Y., Jeong, D. K., Lee, D., Hodges, D. A. & Patt, Y., 1990, 90 Symp VLSI Circuits. Publ by IEEE, p. 119-120 2 p.

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Program processors
Data storage equipment
Interfaces (computer)
Transistors
Throughput

Application of compiler-assisted multiple-instruction retry to VLIW architectures

Chen, S. K., Fuchs, W. K. & Hwu, W-M. W., 1995, Proceedings of the Conference on Fault-Tolerant Parallel and Distributed Systems. IEEE, p. 51-58 8 p.

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Very long instruction word architecture
Hazards
Hardware

Application-transparent near-memory processing architecture with memory channel network

Alian, M., Min, S. W., Asgharimoghaddam, H., Dhar, A., Wang, D. K., Roewer, T., McPadden, A., O'Halloran, O., Chen, D., Xiong, J., Kim, D., Hwu, W. M. & Kim, N. S., Dec 12 2018, Proceedings - 51st Annual IEEE/ACM International Symposium on Microarchitecture, MICRO 2018. IEEE Computer Society, p. 802-814 13 p. 8574587. (Proceedings of the Annual International Symposium on Microarchitecture, MICRO; vol. 2018-October).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Computer networks
Data storage equipment
Processing
Servers
Distributed computer systems

A programming system for future proofing performance critical libraries

Chang, L. W., El Hajj, I., Kim, H. S., Gómez-Luna, J., Dakkak, A. & Hwu, W-M. W., Feb 27 2016, 21st ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP 2016 - Proceedings. Association for Computing Machinery, 32. (Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPOPP; vol. 12-16-March-2016).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Computer systems programming
Tuning
Knobs
Coarsening
Chemical analysis

Architectural support for compiler-synthesized dynamic branch prediction strategies: Rationale and initial results

August, D. I., Connors, D. A., Gyllenhaal, J. C. & Hwu, W-M. W., 1997, IEEE High-Performance Computer Architecture Symposium Proceedings. Anon (ed.). IEEE, p. 84-93 10 p.

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Hardware
Statistics
Experiments

A scalable, numerically stable, high-performance tridiagonal solver using GPUs

Chang, L. W., Stratton, J. A., Kim, H. S. & Hwu, W-M. W., Dec 1 2012, 2012 International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2012. 6468510. (International Conference for High Performance Computing, Networking, Storage and Analysis, SC).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Data storage equipment
Program processors
Throughput
Graphics processing unit

A scalable tridiagonal solver for GPUs

Kim, H. S., Wu, S., Chang, L. W. & Hwu, W-M. W., Nov 7 2011, Proceedings - 2011 International Conference on Parallel Processing, ICPP 2011. p. 444-453 10 p. 6047212. (Proceedings of the International Conference on Parallel Processing).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Tridiagonal matrix
Cyclic Reduction
Parallel algorithms
Tridiagonal Systems

A software based approach to achieving optimal performance for signature control flow checking

Warter, N. J. & Hwu, W. M. W., Dec 1 1990, Digest of Papers - FTCS (Fault-Tolerant Computing Symposium). Publ by IEEE, p. 442-449 8 p. (Digest of Papers - FTCS (Fault-Tolerant Computing Symposium)).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Flow control
Flow graphs
Computer architecture
Hardware

A study of code reuse and sharing characteristics of Java applications

Conte, M. T., Trick, A. R., Gyllenhaal, J. C. & Hwu, W-M. W., Jan 1 1998, Workload Characterization: Methodology and Case Studies - Based on the 1st Workshop on Workload Characterization. Maynard, A. M. G. & John, L. K. (eds.). Institute of Electrical and Electronics Engineers Inc., p. 27-35 9 p. 809356. (Workload Characterization: Methodology and Case Studies - Based on the 1st Workshop on Workload Characterization; vol. 1998-November).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Internet
Web crawler

A study of the effects of compiler-controlled speculation on instruction and data caches

Bringmann, R. A., Mahlke, S. A. & Hwu, W-M. W., Jan 1 1995, Proceedings of the 28th Annual Hawaii International Conference on System Sciences, HICSS 1995. IEEE Computer Society, p. 211-220 10 p. 375392. (Proceedings of the Annual Hawaii International Conference on System Sciences; vol. 1).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

A tiling-scheme Viterbi decoder in Software Defined Radio for GPUs

Lin, C. S., Liu, W. L., Yeh, W. T., Chang, L. W., Hwu, W-M. W., Chen, S. J. & Hsiung, P. A., Oct 31 2011, 7th International Conference on Wireless Communications, Networking and Mobile Computing, WiCOM 2011. 6036680. (7th International Conference on Wireless Communications, Networking and Mobile Computing, WiCOM 2011).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Decoding
radio
Hamming distance
Merging
Program processors

Automatic discovery of coarse-grained parallelism in media applications

Ryoo, S., Ueng, S. Z., Rodrigues, C. I., Kidd, R. E., Frank, M. I. & Hwu, W. M. W., Dec 1 2007, Transactions on High-Performance Embedded Architectures and Compilers I. p. 194-213 20 p. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); vol. 4050 LNCS).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Parallelism
Hardware
Processing
Computer programming languages
Particle accelerators

Automatic execution of single-GPU computations across multiple GPUs

Cabezas, J., Vilanova, L., Gelado, I., Jablin, T. B., Navarro, N. & Hwu, W-M. W., Jan 1 2014, PACT 2014 - Proceedings of the 23rd International Conference on Parallel Architectures and Compilation Techniques. Institute of Electrical and Electronics Engineers Inc., p. 467-468 2 p. (Parallel Architectures and Compilation Techniques - Conference Proceedings, PACT).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

kernel
Decompose
Runtime Systems
Data Distribution
Interconnect

Automatic Generation of Warp-Level Primitives and Atomic Instructions for Fast and Portable Parallel Reduction on GPUs

De Gonzalo, S. G., Huang, S., Gómez-Luna, J., Hammond, S., Mutlu, O. & Hwu, W. M., Mar 5 2019, CGO 2019 - Proceedings of the 2019 IEEE/ACM International Symposium on Code Generation and Optimization. Moseley, T., Jimborean, A. & Kandemir, M. T. (eds.). Institute of Electrical and Electronics Engineers Inc., p. 73-84 12 p. 8661187. (CGO 2019 - Proceedings of the 2019 IEEE/ACM International Symposium on Code Generation and Optimization).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Portability
Shuffle
Hardware
Domain-specific Languages
Programming

Automatic parallelization of kernels in shared-memory multi-GPU nodes

Cabezas, J., Vilanova, L., Gelado, I., Jablin, T. B., Navarro, N. & Hwu, W-M. W., Jun 8 2015, ICS 2015 - Proceedings of the 29th ACM International Conference on Supercomputing. Association for Computing Machinery, p. 3-13 11 p. (Proceedings of the International Conference on Supercomputing; vol. 2015-June).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Data storage equipment
Graphics processing unit
Scheduling
Costs

Beating in-order stalls with "flea-flicker" two-pass pipelining

Barnes, R. D., Patel, S. J., Nystrom, E. M., Navarro, N., Sias, J. W. & Hwu, W-M. W., Jan 1 2003, Proceedings - 36th International Symposium on Microarchitecture, MICRO 2003. IEEE Computer Society, p. 387-398 12 p. 1253243. (Proceedings of the Annual International Symposium on Microarchitecture, MICRO; vol. 2003-January).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Pipelines
Transistors

Benchmark characterization for experimental system evaluation

Conte, T. M. & Hwu, W. M. W., Jan 1 1990, Proceedings of the Hawaii International Conference on System Science. Hoevel, L. W., Shriver, B. D., Nunamaker, J. F. J., Sprague, R. H. J. & Milutinovic, V. (eds.). Publ by Western Periodicals Co, p. 6-18 13 p. (Proceedings of the Hawaii International Conference on System Science; vol. 1).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Data storage equipment
Benchmarking
Systems analysis

Branch recovery with compiler-assisted multiple instruction retry

Alewine, N. J., Chen, S. K., Li, C. C., Fuchs, W. K. & Hwu, W. M., Jan 1 1992, FTCS 1992 - 22nd Annual International Symposium on Fault-Tolerant Computing. Institute of Electrical and Electronics Engineers Inc., p. 66-73 8 p. 243614. (FTCS 1992 - 22nd Annual International Symposium on Fault-Tolerant Computing).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Hardware
Recovery
Hazards

Chai: Collaborative heterogeneous applications for integrated-architectures

Gómez-Luna, J., El Hajj, I., Chang, L. W., García-Flores, V., De Gonzalo, S. G., Jablin, T. B., Peña, A. J. & Hwu, W-M. W., Jul 11 2017, ISPASS 2017 - IEEE International Symposium on Performance Analysis of Systems and Software. Institute of Electrical and Electronics Engineers Inc., p. 43-54 12 p. 7975269. (ISPASS 2017 - IEEE International Symposium on Performance Analysis of Systems and Software).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Computer programming languages
Program processors
Data storage equipment
Specifications
Experiments

Characterizing the impact of predicated execution on branch prediction

Mahlke, S. A., Hank, R. E., Bringmann, R. A., Gyllenhaal, J. C., Gallagher, D. M. & Hwu, W-M. W., Nov 30 1994, Proceedings of the 27th Annual International Symposium on Microarchitecture, MICRO 1994. IEEE Computer Society, p. 217-227 11 p. (Proceedings of the Annual International Symposium on Microarchitecture, MICRO; vol. Part F129425).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Checkpoint repair for out-of-order execution machines

Hwu, W-M. W. & Patt, Y. N., 1987, Conference Proceedings - Annual Symposium on Computer Architecture. IEEE, p. 18-26 9 p.

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Repair
Cache memory
Supercomputers
Engines
Hardware

CIGAR: Application partitioning for a CPU/coprocessor architecture

Kelm, J. H., Gelado, I., Murphy, M. J., Navarro, N., Lumetta, S. S. & Hwu, W-M. W., Dec 1 2007, 16th International Conference on Parallel Architecture and Compilation Techniques, PACT 2007. p. 317-326 10 p. 4336222. (Parallel Architectures and Compilation Techniques - Conference Proceedings, PACT).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Program processors
Partitioning
Prototyping
Embedded Processor
Methodology

clMPI: An OpenCL extension for interoperation with the Message Passing Interface

Takizawa, H., Sugawara, M., Hirasawa, S., Gelado, I., Kobayashi, H. & Hwu, W. M. W., Jan 1 2013, Proceedings - IEEE 27th International Parallel and Distributed Processing Symposium Workshops and PhD Forum, IPDPSW 2013. IEEE Computer Society, p. 1138-1148 11 p. 6651000. (Proceedings - IEEE 27th International Parallel and Distributed Processing Symposium Workshops and PhD Forum, IPDPSW 2013).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Message Passing Interface
Message passing
Data transfer
Program processors

Code coverage and input variability: Effects on architecture and compiler research

Hunter, H. C. & Hwu, W-M. W., Dec 1 2002, Proceedings of the 2002 International Conference on Compilers, Architecture, and Synthesis for Embedded Systems, CASES '02. p. 79-87 9 p. (Proceedings of the 2002 International Conference on Compilers, Architecture, and Synthesis for Embedded Systems, CASES '02).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Telecommunication
Benchmarking
Experiments
Compliance

Collaborative (CPU + GPU) Algorithms for Triangle Counting and Truss Decomposition

Mailthody, V. S., Date, K., Qureshi, Z., Pearson, C., Nagi, R., Xiong, J. & Hwu, W-M. W., Nov 26 2018, 2018 IEEE High Performance Extreme Computing Conference, HPEC 2018. Institute of Electrical and Electronics Engineers Inc., 8547517. (2018 IEEE High Performance Extreme Computing Conference, HPEC 2018).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Program processors
Decomposition
Hardware
Graphics processing unit

Collaborative (CPU + GPU) algorithms for triangle counting and truss decomposition on the Minsky architecture: Static graph challenge: Subgraph isomorphism

Date, K., Feng, K., Nagi, R., Xiong, J., Kim, N. S. & Hwu, W-M. W., Oct 30 2017, 2017 IEEE High Performance Extreme Computing Conference, HPEC 2017. Institute of Electrical and Electronics Engineers Inc., 8091042. (2017 IEEE High Performance Extreme Computing Conference, HPEC 2017).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Program processors
Decomposition
Data storage equipment
Benchmarking
Graphics processing unit

Collaborative computing for heterogeneous integrated systems

Chang, L. W., Gómez-Luna, J., El Hajj, I., Huang, S., Chen, D. & Hwu, W-M. W., Apr 17 2017, ICPE 2017 - Proceedings of the 2017 ACM/SPEC International Conference on Performance Engineering. Association for Computing Machinery, Inc, p. 385-388 4 p. (ICPE 2017 - Proceedings of the 2017 ACM/SPEC International Conference on Performance Engineering).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Computer supported cooperative work
Program processors
Field programmable gate arrays (FPGA)
Computer systems
Data storage equipment

Comparative performance evaluation of multi-GPU MLFMA implementation for 2-D VIE problems

Pearson, C., Hidayetoglu, M., Ren, W., Chew, W. C. & Hwu, W-M. W., Jul 25 2017, CEM 2017 - 2017 Computing and Electromagnetics International Workshop. Gurel, L. (ed.). Institute of Electrical and Electronics Engineers Inc., p. 63-64 2 p. 7991888. (CEM 2017 - 2017 Computing and Electromagnetics International Workshop).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

multipoles
evaluation
two dimensional bodies
water
supercomputers

Comparing static and dynamic code scheduling for multiple-instruction-issue processors

Chang, P. P., Chen, W. Y., Mahlke, S. A. & Hwu, W-M. W., Sep 1 1991, MICRO 1991 - Proceedings of the 24th Annual International Symposium on Microarchitecture. IEEE Computer Society, p. 25-33 9 p. (Proceedings of the Annual International Symposium on Microarchitecture, MICRO).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Scheduling
Hardware
Experiments

Comparison based sorting for systems with multiple GPUs

Tanasic, I., Vilanova, L., Jordà, M., Cabezas, J., Gelado, I., Navarro, N. & Hwu, W-M. W., Apr 15 2013, Proceedings of the 6th Workshop on General Purpose Processor Using Graphics Processing Units, GPGPU 2013. p. 1-11 11 p. (ACM International Conference Proceeding Series).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Sorting
Graphics processing unit

Comparison of full and partial predicated execution support for ILP processors

Mahlke, S. A., Hank, R. E., McCormick, J. E., August, D. I. & Hwu, W-M. W., 1995, ACM SIGARCH (Association for Computing Machinery Special Interest Group on Computer Architecture) - Conference Proceedings. ACM, p. 138-149 12 p.

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Inductive logic programming (ILP)
Code generation

Comparison of full and partial predicated execution support for ILP processors

Mahlke, S. A., Hank, R. E., McCormick, J. E., August, D. I. & Hwu, W. M. W., Jan 1 1995, Conference Proceedings - Annual International Symposium on Computer Architecture, ISCA. p. 138-149 12 p. (Conference Proceedings - Annual International Symposium on Computer Architecture, ISCA).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Inductive logic programming (ILP)
Code generation