TY - GEN
T1 - MirrorTaint
T2 - 45th IEEE/ACM International Conference on Software Engineering, ICSE 2023
AU - Ouyang, Yicheng
AU - Shao, Kailai
AU - Chen, Kunqiu
AU - Shen, Ruobing
AU - Chen, Chao
AU - Xu, Mingze
AU - Zhang, Yuqun
AU - Zhang, Lingming
N1 - Publisher Copyright:
© 2023 IEEE.
PY - 2023
Y1 - 2023
N2 - Taint analysis, i.e., labeling data and propagating the labels through data flows, has been widely used for analyzing program information flows and ensuring system/data security. Due to its important applications, various taint analysis techniques have been proposed, including static and dynamic taint analysis. However, existing taint analysis techniques can be hardly applied to the rising microservice systems for industrial applications. To address such a problem, in this paper, we proposed the first practical non-intrusive dynamic taint analysis technique MirrorTaint for extensively supporting microservice systems on JVMs. In particular, by instrumenting the microservice systems, MirrorTaint constructs a set of data structures with their respective policies for labeling/propagating taints in its mirrored space. Such data structures are essentially non-intrusive, i.e., modifying no program meta-data or runtime system. Then, during program execution, MirrorTaint replicates the stack-based JVM instruction execution in its mirrored space on-the-fly for dynamic taint tracking. We have evaluated MirrorTaint against state-of-the-art dynamic and static taint analysis systems on various popular open-source microservice systems. The results demonstrate that MirrorTaint can achieve better compatibility, quite close precision and higher recall (97.9%/100.0%) than state-of-the-art Phosphor (100.0%/9.9%) and FlowDroid (100%/28.2%). Also, MirrorTaint incurs lower runtime overhead than Phosphor (although both are dynamic techniques). Moreover, we have performed a case study in Ant Group, a global billion-user FinTech company, to compare MirrorTaint and their mature developer-experience-based data checking system for automatically generated fund documents. The result shows that the developer experience can be incomplete, causing the data checking system to only cover 84.0% total data relations, while MirrorTaint can automatically find 99.0% relations with 100.0% precision. Lastly, we also applied MirrorTaint to successfully detect a recently wide-spread Log4j2 security vulnerability.
AB - Taint analysis, i.e., labeling data and propagating the labels through data flows, has been widely used for analyzing program information flows and ensuring system/data security. Due to its important applications, various taint analysis techniques have been proposed, including static and dynamic taint analysis. However, existing taint analysis techniques can be hardly applied to the rising microservice systems for industrial applications. To address such a problem, in this paper, we proposed the first practical non-intrusive dynamic taint analysis technique MirrorTaint for extensively supporting microservice systems on JVMs. In particular, by instrumenting the microservice systems, MirrorTaint constructs a set of data structures with their respective policies for labeling/propagating taints in its mirrored space. Such data structures are essentially non-intrusive, i.e., modifying no program meta-data or runtime system. Then, during program execution, MirrorTaint replicates the stack-based JVM instruction execution in its mirrored space on-the-fly for dynamic taint tracking. We have evaluated MirrorTaint against state-of-the-art dynamic and static taint analysis systems on various popular open-source microservice systems. The results demonstrate that MirrorTaint can achieve better compatibility, quite close precision and higher recall (97.9%/100.0%) than state-of-the-art Phosphor (100.0%/9.9%) and FlowDroid (100%/28.2%). Also, MirrorTaint incurs lower runtime overhead than Phosphor (although both are dynamic techniques). Moreover, we have performed a case study in Ant Group, a global billion-user FinTech company, to compare MirrorTaint and their mature developer-experience-based data checking system for automatically generated fund documents. The result shows that the developer experience can be incomplete, causing the data checking system to only cover 84.0% total data relations, while MirrorTaint can automatically find 99.0% relations with 100.0% precision. Lastly, we also applied MirrorTaint to successfully detect a recently wide-spread Log4j2 security vulnerability.
KW - JVM
KW - dynamic taint analysis
KW - microservice
UR - http://www.scopus.com/inward/record.url?scp=85171790574&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85171790574&partnerID=8YFLogxK
U2 - 10.1109/ICSE48619.2023.00210
DO - 10.1109/ICSE48619.2023.00210
M3 - Conference contribution
AN - SCOPUS:85171790574
T3 - Proceedings - International Conference on Software Engineering
SP - 2514
EP - 2526
BT - Proceedings - 2023 IEEE/ACM 45th International Conference on Software Engineering, ICSE 2023
PB - IEEE Computer Society
Y2 - 15 May 2023 through 16 May 2023
ER -