Learn More
Tree pattern matching is one of the most fundamental tasks for XML query processing. Holistic twig query processing techniques [4, 16] have been developed to minimize the intermediate results, namely, those root-to-leaf path matches that are not in the final twig results. However, useless path matches cannot be completely avoided, especially when there is a(More)
Fully automatic methods that extract lists of objects from the Web have been studied extensively. Record extraction, the first step of this object extraction process, identifies a set of Web page segments, each of which represents an individual object (e.g., a product). State-of-the-art methods suffice for simple search, but they often fail to handle more(More)
Tree pattern matching is one of the most fundamental tasks for XML query processing. Holistic twig query processing techniques [4, 16] have been developed to minimize the intermediate results, namely, those root-to-leaf path matches that are not in the final twig results. However, useless path matches cannot be completely avoided, especially when there is a(More)
XML message filtering problem involves searching for instances of a given, potentially large, set of patterns in a continuous stream of XML messages. Since the messages arrive continuously, it is essential that the filtering rate matches the data arrival rate. Therefore , the given set of filter patterns needs to be indexed appropriately to enable real-time(More)
Wide-area database replication technologies and the availability of content delivery networks allow Web applications to be hosted and served from powerful data centers. This form of application support requires a complete Web application suite to be distributed along with the database replicas. A major advantage of this approach is that dynamic content is(More)
XML message filtering problem involves searching for instances of a given, potentially large, set of patterns in a continuous stream of XML messages. Since the messages arrive continuously, it is essential that the filtering rate matches the data arrival rate. Therefore, the given set of filter patterns needs to be indexed appropriately to enable real-time(More)
An XML publish/subscribe system needs to filter a large number of queries over XML streams. Most existing systems only consider filtering the simple XPath statements. In this paper, we focus on filtering of the more complex generalized-tree-pattern (GTP) queries. Our filtering mechanism is based on a novel Tree-of-Path (TOP) encoding scheme, which compactly(More)
Web performance is a key differentiation among content providers. Snafus and slowdowns at major web sites demonstrate the difficulty that companies face trying to scale to a large amount of web traffic. One solution to this problem is to store web content at server-side and edge-caches for fast delivery to the end users. However, for many e-commerce sites,(More)