We describe efficient algorithms for accurately estimating the number of matches of a small node-labeled tree, i.e., a twig, in a large node-labeled tree, using a summary data structure. This problem is of interest for queries on XML and other hierarchical data, to provide query feedback and for cost-based query optimization. Our summary data structure…
Time series are recorded values of an interesting phenomenon such as stock prices, household incomes, or patient heart rates over a period of time. Time series data mining focuses on discovering interesting patterns in such data. This article introduces a wavelet-based time series data analysis to interested readers. It provides a systematic survey of…
Over the last decades, improvements in CPU speed have outpaced improvements in main memory and disk access rates by orders of magnitude, enabling the use of data compression techniques to improve the performance of database systems. Previous work describes the benefits of compression for numerical attributes, where data is stored in compressed format on…
Topic models have been widely used to discover latent topics in text documents. However, they may produce topics that are not interpretable for an application. Researchers have proposed to incorporate prior domain knowledge into topic models to help produce coherent topics. The knowledge used in existing models is typically domain dependent and assumed to…
Database queries are often exploratory, and users frequently find that their queries return too many answers, many of them irrelevant. Existing work either categorizes or ranks the results to help users locate interesting results. The success of both approaches depends on the utilization of user preferences. However, most existing work assumes that all users have the…
Ischemic postconditioning (Postcond) is defined as rapid intermittent interruptions of blood flow in the early phase of reperfusion and mechanically alters the hydrodynamics of reperfusion. Although Postcond has been demonstrated to attenuate ischemia/reperfusion (I/R) injury in the heart and brain, its role in renal I/R injury remains to be defined. In the…
Adriamycin (Adr) and docetaxel (Doc) are two chemotherapeutic agents commonly used in the treatment of breast cancer. However, patients with breast cancer who are treated with these drugs often develop resistance to them and to some other drugs. Recent studies have shown that microRNAs (miRNAs, miRs) play an important role in drug resistance. In the present study,…
Various index structures have been proposed to speed up the evaluation of XML path expressions. However, existing XML path indices suffer from at least one of three limitations: they focus only on indexing the structure (relying on a separate index for node content), they are useful only for simple path expressions such as root-to-leaf paths, or they cannot…