Wen-Mei W Hwu

Research Output

Conference contribution
2019

An efficient GPU implementation technique for higher-order 3D stencils

Anjum, O., De Gonzalo, S. G., Hidayetoglu, M. & Hwu, W. M., Aug 2019, Proceedings - 21st IEEE International Conference on High Performance Computing and Communications, 17th IEEE International Conference on Smart City and 5th IEEE International Conference on Data Science and Systems, HPCC/SmartCity/DSS 2019. Xiao, Z., Yang, L. T., Balaji, P., Li, T., Li, K. & Zomaya, A. (eds.). Institute of Electrical and Electronics Engineers Inc., p. 552-561 10 p. 8855722. (Proceedings - 21st IEEE International Conference on High Performance Computing and Communications, 17th IEEE International Conference on Smart City and 5th IEEE International Conference on Data Science and Systems, HPCC/SmartCity/DSS 2019).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Automatic Generation of Warp-Level Primitives and Atomic Instructions for Fast and Portable Parallel Reduction on GPUs

Gonzalo, S. G. D., Huang, S., Gomez-Luna, J., Hammond, S., Mutlu, O. & Hwu, W. M., Mar 5 2019, CGO 2019 - Proceedings of the 2019 IEEE/ACM International Symposium on Code Generation and Optimization. Moseley, T., Jimborean, A. & Kandemir, M. T. (eds.). Institute of Electrical and Electronics Engineers Inc., p. 73-84 12 p. 8661187. (CGO 2019 - Proceedings of the 2019 IEEE/ACM International Symposium on Code Generation and Optimization).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

DeepStore: In-storage acceleration for intelligent queries

Mailthody, V. S., Qureshi, Z., Liang, W., Feng, Z., Gonzalo, S. G. D., Li, Y., Franke, H., Xiong, J., Huang, J. & Hwu, W. M., Oct 12 2019, MICRO 2019 - 52nd Annual IEEE/ACM International Symposium on Microarchitecture, Proceedings. IEEE Computer Society, p. 224-238 15 p. (Proceedings of the Annual International Symposium on Microarchitecture, MICRO).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Evaluating characteristics of CUDA communication primitives on high-bandwidth interconnects

Pearson, C., Dakkak, A., Hashash, S., Li, C., Chung, I. H., Xiong, J. & Hwu, W. M., Apr 4 2019, ICPE 2019 - Proceedings of the 2019 ACM/SPEC International Conference on Performance Engineering. Association for Computing Machinery, Inc, p. 209-218 10 p. (ICPE 2019 - Proceedings of the 2019 ACM/SPEC International Conference on Performance Engineering).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Open Access

FlatFlash: Exploiting the Byte-Accessibility of SSDs within A Unified Memory-Storage Hierarchy

Abulila, A., Mailthody, V. S., Qureshi, Z., Huang, J., Kim, N. S., Xiong, J. & Hwu, W. M., Apr 4 2019, ASPLOS 2019 - 24th International Conference on Architectural Support for Programming Languages and Operating Systems. Association for Computing Machinery, p. 971-985 15 p. (International Conference on Architectural Support for Programming Languages and Operating Systems - ASPLOS).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Open Access

FPGA/DNN co-design: An efficient design methodology for IoT intelligence on the edge

Hao, C., Zhang, X., Li, Y., Huang, S., Xiong, J., Rupnow, K., Hwu, W. M. & Chen, D., Jun 2 2019, Proceedings of the 56th Annual Design Automation Conference 2019, DAC 2019. Institute of Electrical and Electronics Engineers Inc., a206. (Proceedings - Design Automation Conference).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Hardware-Software Co-Design for an Analog-Digital Accelerator for Machine Learning

Ambrosi, J., Ankit, A., Antunes, R., Chalamalasetti, S. R., Chatterjee, S., El Hajj, I., Fachini, G., Faraboschi, P., Foltin, M., Huang, S., Hwu, W. M., Knuppe, G., Lakshminarasimha, S. V., Milojicic, D., Parthasarathy, M., Ribeiro, F., Rosa, L., Roy, K., Silveira, P. & Strachan, J. P., Feb 8 2019, 2018 IEEE International Conference on Rebooting Computing, ICRC 2018. Institute of Electrical and Electronics Engineers Inc., 8638612. (2018 IEEE International Conference on Rebooting Computing, ICRC 2018).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Implementing neural machine translation with bi-directional GRU and attention mechanism on FPGAs using HLS

Li, Q., Zhang, X., Xiong, J. J., Hwu, W. M. & Chen, D., Jan 21 2019, ASP-DAC 2019 - 24th Asia and South Pacific Design Automation Conference. Institute of Electrical and Electronics Engineers Inc., p. 693-698 6 p. (Proceedings of the Asia and South Pacific Design Automation Conference, ASP-DAC).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

MemXCT: Memory-centric X-ray CT reconstruction with massive parallelization

Hidayetoğlu, M., Biçer, T., De Gonzalo, S. G., Ren, B., Gürsoy, D., Kettimuthu, R., Foster, I. T. & Hwu, W. M. W., Nov 17 2019, Proceedings of SC 2019: The International Conference for High Performance Computing, Networking, Storage and Analysis. IEEE Computer Society, a85. (International Conference for High Performance Computing, Networking, Storage and Analysis, SC).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

MLModelScope: Evaluate and introspect cognitive pipelines

Li, C., Dakkak, A., Xiong, J. & Hwu, W. M., Jul 2019, Proceedings - 2019 IEEE World Congress on Services, SERVICES 2019. Chang, C. K., Chen, P., Goul, M., Oyama, K., Reiff-Marganiec, S., Sun, Y., Wang, S. & Wang, Z. (eds.). Institute of Electrical and Electronics Engineers Inc., p. 335-338 4 p. 8817116. (Proceedings - 2019 IEEE World Congress on Services, SERVICES 2019).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

NAIS: Neural architecture and implementation search and its applications in autonomous driving

Hao, C., Hwu, W. M., Gu, J., Chen, D., Chen, Y., Liu, X., Sarwari, A., Sew, D., Dhar, A., Wu, B., Fu, D. & Xiong, J., Nov 2019, 2019 IEEE/ACM International Conference on Computer-Aided Design, ICCAD 2019 - Digest of Technical Papers. Institute of Electrical and Electronics Engineers Inc., 8942055. (IEEE/ACM International Conference on Computer-Aided Design, Digest of Technical Papers, ICCAD; vol. 2019-November).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Near-Memory and In-Storage FPGA Acceleration for Emerging Cognitive Computing Workloads

Dhar, A., Huang, S., Xiong, J., Jamsek, D., Mesnet, B., Huang, J., Kim, N. S., Hwu, W. M. & Chen, D., Jul 2019, Proceedings - 2019 IEEE Computer Society Annual Symposium on VLSI, ISVLSI 2019. IEEE Computer Society, p. 68-75 8 p. 8839401. (Proceedings of IEEE Computer Society Annual Symposium on VLSI, ISVLSI; vol. 2019-July).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

PUMA: A Programmable Ultra-efficient Memristor-based Accelerator for Machine Learning Inference

Ankit, A., El Hajj, I., Rahul Chalamalasetti, S., Ndu, G., Foltin, M., Williams, R. S., Faraboschi, P., Hwu, W. M., Paul Strachan, J., Roy, K. & Milojicic, D. S., Apr 4 2019, ASPLOS 2019 - 24th International Conference on Architectural Support for Programming Languages and Operating Systems. Association for Computing Machinery, p. 715-731 17 p. (International Conference on Architectural Support for Programming Languages and Operating Systems - ASPLOS).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Open Access

Reinforcement learning based text style transfer without parallel training corpus

Gong, H., Bhat, S., Wu, L., Xiong, J. & Hwu, W. M., 2019, Long and Short Papers. Association for Computational Linguistics (ACL), p. 3168-3180 13 p. (NAACL HLT 2019 - 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies - Proceedings of the Conference; vol. 1).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

SPGNet: Semantic prediction guidance for scene parsing

Cheng, B., Chen, L. C., Wei, Y., Zhu, Y., Huang, Z., Xiong, J., Huang, T., Hwu, W. M. & Shi, H., Oct 2019, Proceedings - 2019 International Conference on Computer Vision, ICCV 2019. Institute of Electrical and Electronics Engineers Inc., p. 5217-5227 11 p. 9008568. (Proceedings of the IEEE International Conference on Computer Vision; vol. 2019-October).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

TrIMS: Transparent and isolated model sharing for low latency deep learning inference in function-as-a-service

Dakkak, A., Li, C., De Gonzalo, S. G., Xiong, J. & Hwu, W. M., Jul 2019, Proceedings - 2019 IEEE International Conference on Cloud Computing, CLOUD 2019 - Part of the 2019 IEEE World Congress on Services. Bertino, E., Chang, C. K., Chen, P., Damiani, E., Goul, M. & Oyama, K. (eds.). IEEE Computer Society, p. 372-382 11 p. 8814494. (IEEE International Conference on Cloud Computing, CLOUD; vol. 2019-July).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Update on k-truss Decomposition on GPU

Almasri, M., Anjum, O., Pearson, C., Qureshi, Z., Mailthody, V. S., Nagi, R., Xiong, J. & Hwu, W. M., Sep 2019, 2019 IEEE High Performance Extreme Computing Conference, HPEC 2019. Institute of Electrical and Electronics Engineers Inc., 8916285. (2019 IEEE High Performance Extreme Computing Conference, HPEC 2019).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Update on triangle counting on GPU

Pearson, C., Almasri, M., Anjum, O., Mailthody, V. S., Qureshi, Z., Nagi, R., Xiong, J. & Hwu, W. M., Sep 2019, 2019 IEEE High Performance Extreme Computing Conference, HPEC 2019. Institute of Electrical and Electronics Engineers Inc., 8916547. (2019 IEEE High Performance Extreme Computing Conference, HPEC 2019).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

2020

Benanza: Automatic μbenchmark Generation to Compute "lower-bound" Latency and Inform Optimizations of Deep Learning Models on GPUs

Li, C., Dakkak, A., Xiong, J. & Hwu, W. M., May 2020, Proceedings - 2020 IEEE 34th International Parallel and Distributed Processing Symposium, IPDPS 2020. Institute of Electrical and Electronics Engineers Inc., p. 440-450 11 p. 9139782. (Proceedings - 2020 IEEE 34th International Parallel and Distributed Processing Symposium, IPDPS 2020).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

DLbricks: Composable benchmark generation to reduce deep learning benchmarking effort on CPUs

Li, C., Dakkak, A., Xiong, J. & Hwu, W. M., Apr 20 2020, ICPE 2020 - Proceedings of the ACM/SPEC International Conference on Performance Engineering. Association for Computing Machinery, Inc, p. 202-209 8 p. (ICPE 2020 - Proceedings of the ACM/SPEC International Conference on Performance Engineering).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Open Access

Pare: A paper-reviewer matching approach using a common topic space

Anjum, O., Gong, H., Bhat, S., Xiong, J. & Hwu, W. M., 2020, EMNLP-IJCNLP 2019 - 2019 Conference on Empirical Methods in Natural Language Processing and 9th International Joint Conference on Natural Language Processing, Proceedings of the Conference. Association for Computational Linguistics, p. 518-528 11 p. (EMNLP-IJCNLP 2019 - 2019 Conference on Empirical Methods in Natural Language Processing and 9th International Joint Conference on Natural Language Processing, Proceedings of the Conference).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

The design and implementation of the Wolfram Language compiler

Dakkak, A., Wickham-Jones, T. & Hwu, W. M., Feb 22 2020, CGO 2020 - Proceedings of the 18th ACM/IEEE International Symposium on Code Generation and Optimization. Mars, J., Tang, L., Xue, J. & Wu, P. (eds.). Association for Computing Machinery, Inc, p. 212-228 17 p. (CGO 2020 - Proceedings of the 18th ACM/IEEE International Symposium on Code Generation and Optimization).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Open Access

XSP: Across-Stack Profiling and Analysis of Machine Learning Models on GPUs

Li, C., Dakkak, A., Xiong, J., Wei, W., Xu, L. & Hwu, W. M., May 2020, Proceedings - 2020 IEEE 34th International Parallel and Distributed Processing Symposium, IPDPS 2020. Institute of Electrical and Electronics Engineers Inc., p. 326-327 2 p. 9139875. (Proceedings - 2020 IEEE 34th International Parallel and Distributed Processing Symposium, IPDPS 2020).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution