TY - GEN
T1 - Visibly pushdown automata for streaming XML
AU - Kumar, Viraj
AU - Madhusudan, P.
AU - Viswanathan, Mahesh
PY - 2007
Y1 - 2007
N2 - We propose the study of visibly pushdown automata (VPA) for processing XML documents. VPAs are pushdown automata where the input determines the stack operation, and XML documents are naturally visibly pushdown with the VPA pushing onto the stack on open-tags and popping the stack on close-tags. In this paper we demonstrate the power and ease visibly pushdown automata give in the design of streaming algorithms for XML documents. We study the problems of type-checking streaming XML documents against SDTD schemas, and the problem of typing tags in a streaming XML document according to an SDTD schema. For the latter problem, we consider both pre-order typing and post-order typing of a document, which dynamically determines types at open-tags and close-tags respectively as soon as they are met. We also generalize the problems of pre-order and post-order typing to prefix querying. We show that a deterministic VPA yields an algorithm to the problem of answering in one pass the set of all answers to any query that has the property that a node satisfying the query is determined solely by the prefix leading to the node. All the streaming algorithms we develop in this paper are based on the construction of deterministic VPAs, and hence, for any fixed problem, the algorithms process each element of the input in constant time, and use space (d), where d is the depth of the document.
AB - We propose the study of visibly pushdown automata (VPA) for processing XML documents. VPAs are pushdown automata where the input determines the stack operation, and XML documents are naturally visibly pushdown with the VPA pushing onto the stack on open-tags and popping the stack on close-tags. In this paper we demonstrate the power and ease visibly pushdown automata give in the design of streaming algorithms for XML documents. We study the problems of type-checking streaming XML documents against SDTD schemas, and the problem of typing tags in a streaming XML document according to an SDTD schema. For the latter problem, we consider both pre-order typing and post-order typing of a document, which dynamically determines types at open-tags and close-tags respectively as soon as they are met. We also generalize the problems of pre-order and post-order typing to prefix querying. We show that a deterministic VPA yields an algorithm to the problem of answering in one pass the set of all answers to any query that has the property that a node satisfying the query is determined solely by the prefix leading to the node. All the streaming algorithms we develop in this paper are based on the construction of deterministic VPAs, and hence, for any fixed problem, the algorithms process each element of the input in constant time, and use space (d), where d is the depth of the document.
KW - Pushdown automata
KW - Query
KW - Schema
KW - Streaming algorithms
KW - Typing
KW - XML
UR - http://www.scopus.com/inward/record.url?scp=35348906173&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=35348906173&partnerID=8YFLogxK
U2 - 10.1145/1242572.1242714
DO - 10.1145/1242572.1242714
M3 - Conference contribution
AN - SCOPUS:35348906173
SN - 1595936548
SN - 9781595936547
T3 - 16th International World Wide Web Conference, WWW2007
SP - 1053
EP - 1062
BT - 16th International World Wide Web Conference, WWW2007
T2 - 16th International World Wide Web Conference, WWW2007
Y2 - 8 May 2007 through 12 May 2007
ER -