TY - JOUR
T1 - Secure Federated Learning Across Heterogeneous Cloud and High-Performance Computing Resources
T2 - A Case Study on Federated Fine-Tuning of LLaMA 2
AU - Li, Zilinghan
AU - He, Shilan
AU - Chaturvedi, Pranshu
AU - Kindratenko, Volodymyr
AU - Huerta, Eliu A.
AU - Kim, Kibaek
AU - Madduri, Ravi
N1 - This work was supported by Laboratory Directed Research and Development (LDRD) funding from Argonne National Laboratory, provided by the Director, Office of Science, of the U.S. Department of Energy under Contract DE-AC02-06CH11357. This research is also part of the Delta research computing project, which is supported by the National Science Foundation (Award OCI 2005572) and the state of Illinois. Delta is a joint effort of the University of Illinois at Urbana-Champaign and the National Center for Supercomputing Applications. The work of Eliu A. Huerta was supported in part by National Science Foundation awards OAC-1931561 and OAC-2209892.
PY - 2024
Y1 - 2024
AB - Federated learning enables multiple data owners to collaboratively train robust machine learning models without transferring large or sensitive local datasets, by sharing only the parameters of the locally trained models. In this article, we elaborate on the design of our Advanced Privacy-Preserving Federated Learning (APPFL) framework, which streamlines end-to-end secure and reliable federated learning experiments across cloud computing facilities and high-performance computing resources by leveraging Globus Compute, a distributed function-as-a-service platform, and Amazon Web Services. We further demonstrate the use case of APPFL in fine-tuning an LLaMA 2 7B model using several cloud resources and supercomputers.
UR - http://www.scopus.com/inward/record.url?scp=85189520580&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85189520580&partnerID=8YFLogxK
DO - 10.1109/MCSE.2024.3382583
M3 - Article
AN - SCOPUS:85189520580
SN - 1521-9615
VL - 26
SP - 52
EP - 58
JO - Computing in Science and Engineering
JF - Computing in Science and Engineering
IS - 3
ER -