TY - GEN
T1 - AUTOLOOP
T2 - 6th IEEE International Workshop on Policies for Distributed Systems and Networks, POLICY 2005
AU - Yin, Li
AU - Palmer, John
AU - Uttamchandani, Sandeep
AU - Katz, Randy
AU - Agha, Gul
PY - 2005
Y1 - 2005
N2 - Enterprise applications typically depend on guaranteed performance from the storage subsystem, lest they fail. However, changes in workload characteristics, component failures, and load surges make it unlikely that applications receive guaranteed performance. Given that widespread access protocols and scheduling policies are largely best-effort, meeting performance goals on a shared system is very difficult, and is currently handled by human administrators using a 24 × 7 Observe-Analyze-Act (OAA) loop. AUTOLOOP is an OAA automation framework that uses a combination of self-refining models and constrained optimization techniques. This paper gives an overview of the automation process and focuses on the analyze aspect of the loop, which selects the corrective action. The process of action selection today is "black magic": human administrators use their years of experience and coarse-grained heuristics to select from a spectrum of actions ranging from short-term tuning (such as throttling of workloads) to long-term modifications (such as migration of data among the available resources). AUTOLOOP is the first framework of its kind within storage systems to formalize the task of action selection as a machine-executable constraint-solving problem. AUTOLOOP exhaustively searches the solution space of corrective actions, uses skyline analysis to short-list a subset of low-cost, high-benefit actions, and selects the optimal set of actions along with a schedule to invoke them. The action selection takes into account the cost of action invocation, the expected benefit, the current and future workload needs, the overall load pattern on the system, and the application-level Service Level Objectives (SLOs).
AB - Enterprise applications typically depend on guaranteed performance from the storage subsystem, lest they fail. However, changes in workload characteristics, component failures, and load surges make it unlikely that applications receive guaranteed performance. Given that widespread access protocols and scheduling policies are largely best-effort, meeting performance goals on a shared system is very difficult, and is currently handled by human administrators using a 24 × 7 Observe-Analyze-Act (OAA) loop. AUTOLOOP is an OAA automation framework that uses a combination of self-refining models and constrained optimization techniques. This paper gives an overview of the automation process and focuses on the analyze aspect of the loop, which selects the corrective action. The process of action selection today is "black magic": human administrators use their years of experience and coarse-grained heuristics to select from a spectrum of actions ranging from short-term tuning (such as throttling of workloads) to long-term modifications (such as migration of data among the available resources). AUTOLOOP is the first framework of its kind within storage systems to formalize the task of action selection as a machine-executable constraint-solving problem. AUTOLOOP exhaustively searches the solution space of corrective actions, uses skyline analysis to short-list a subset of low-cost, high-benefit actions, and selects the optimal set of actions along with a schedule to invoke them. The action selection takes into account the cost of action invocation, the expected benefit, the current and future workload needs, the overall load pattern on the system, and the application-level Service Level Objectives (SLOs).
UR - http://www.scopus.com/inward/record.url?scp=33744980459&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=33744980459&partnerID=8YFLogxK
U2 - 10.1109/POLICY.2005.9
DO - 10.1109/POLICY.2005.9
M3 - Conference contribution
AN - SCOPUS:33744980459
SN - 0769522653
SN - 9780769522654
T3 - Proceedings - Sixth IEEE International Workshop on Policies for Distributed Systems and Networks, POLICY 2005
SP - 129
EP - 138
BT - Proceedings - Sixth IEEE International Workshop on Policies for Distributed Systems and Networks, POLICY 2005
Y2 - 6 June 2005 through 8 June 2005
ER -