TY - JOUR
T1 - Stochastic gradient descent-based support vector machines training optimization on Big Data and HPC frameworks
AU - Abeykoon, Vibhatha
AU - Fox, Geoffrey
AU - Kim, Minje
AU - Ekanayake, Saliya
AU - Kamburugamuve, Supun
AU - Govindarajan, Kannan
AU - Wickramasinghe, Pulasthi
AU - Perera, Niranda
AU - Widanage, Chathura
AU - Uyar, Ahmet
AU - Gunduz, Gurhan
AU - Akkas, Selahattin
N1 - Publisher Copyright:
© 2021 John Wiley & Sons Ltd.
PY - 2022/4/10
Y1 - 2022/4/10
N2 - The support vector machine (SVM) is a widely used machine learning algorithm. With the ever-increasing volume of research data, efficient training is more important than ever. This article discusses the performance optimizations and benchmarks involved in providing high-performance support for SVM training. In this research, we focused on a highly scalable gradient descent-based approach to implementing the core SVM algorithm. To provide a scalable solution, we designed optimized high-performance computing (HPC) and dataflow-oriented SVM implementations. In the HPC approach, the algorithm is implemented with the bulk synchronous parallel (BSP) model. In addition, we analyzed language-level and math-kernel optimizations in a prominent HPC programming language (C++) and a prominent dataflow programming language (Java). In our experiments, we compared the performance of classic HPC models, classic dataflow models, and hybrid models built on both programming models. Our research illustrates a scientific approach to designing the SVM algorithm at scale on classic HPC, dataflow, and hybrid systems.
AB - The support vector machine (SVM) is a widely used machine learning algorithm. With the ever-increasing volume of research data, efficient training is more important than ever. This article discusses the performance optimizations and benchmarks involved in providing high-performance support for SVM training. In this research, we focused on a highly scalable gradient descent-based approach to implementing the core SVM algorithm. To provide a scalable solution, we designed optimized high-performance computing (HPC) and dataflow-oriented SVM implementations. In the HPC approach, the algorithm is implemented with the bulk synchronous parallel (BSP) model. In addition, we analyzed language-level and math-kernel optimizations in a prominent HPC programming language (C++) and a prominent dataflow programming language (Java). In our experiments, we compared the performance of classic HPC models, classic dataflow models, and hybrid models built on both programming models. Our research illustrates a scientific approach to designing the SVM algorithm at scale on classic HPC, dataflow, and hybrid systems.
KW - dataflow
KW - high-performance computing
KW - hybrid systems
KW - machine learning
KW - SVM
UR - http://www.scopus.com/inward/record.url?scp=85103369700&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85103369700&partnerID=8YFLogxK
U2 - 10.1002/cpe.6292
DO - 10.1002/cpe.6292
M3 - Article
AN - SCOPUS:85103369700
SN - 1532-0626
VL - 34
JO - Concurrency and Computation: Practice and Experience
JF - Concurrency and Computation: Practice and Experience
IS - 8
M1 - e6292
ER -