Makoto Onizuka

We consider the problem of evaluating a large number of XPath expressions on an XML stream. Our main contribution is to show that Deterministic Finite Automata (DFA) can be used effectively for this problem: in our experiments we achieve a throughput of about 5.4 MB/s, independent of the number of XPath expressions (up to 1,000,000 in our tests).
We consider the problem of evaluating a large number of XPath expressions on a stream of XML packets. We contribute two novel techniques. The first is to use a single Deterministic Finite Automaton (DFA). The contribution here is to show that the DFA can be used effectively for this problem: in our experiments we achieve a constant throughput, independently…
Graphs are fundamental data structures and have been employed for centuries to model real-world systems and phenomena. Random walk with restart (RWR) provides a good proximity score between two nodes in a graph, and it has been successfully used in many applications such as automatic image captioning, recommender systems, and link…
We describe a toolkit for highly scalable XML data processing, consisting of two components. The first is a collection of stand-alone XML tools, such as sorting, aggregation, nesting, and unnesting, that can be chained to express more complex restructurings. The second is a highly scalable XPath processor for XML streams that can be used to develop scalable…
Graphs are a fundamental data structure and have been employed to model objects as well as their relationships. The similarity of objects on the web (e.g., webpages, photos, music, micro-blogs, and social networking service users) is the key to identifying relevant objects in many recent applications. SimRank, proposed by Jeh and Widom, provides a good…
Personalized PageRank (PPR) is an effective relevance (proximity) measure in graph mining. The goal of this paper is to efficiently compute single-node relevance and top-k/highly relevant nodes without iteratively computing the relevances of all nodes. Based on a "random surfer model", PPR iteratively computes the relevances of all nodes in a graph until…
Interleukin-17 (IL-17) is a proinflammatory cytokine that has been implicated in the pathogenesis of various autoimmune diseases. The single nucleotide polymorphism (SNP) rs2275913, in the promoter region of the IL-17 gene, is associated with susceptibility to ulcerative colitis. When we examined the impact of rs2275913 in a cohort consisting of 438 pairs of…
Stem cells of highly regenerative organs, including blood, are susceptible to endogenous DNA damage caused by both intrinsic and extrinsic stress. The mechanisms by which hematopoietic stem cells (HSCs) respond to such stress are crucial in sustaining hematopoietic homeostasis but remain largely unknown. In this study, we demonstrate that serial…
Personalized PageRank (PPR) has been successfully applied to various applications. In real applications, it is important to set PPR parameters in an ad-hoc manner when finding similar nodes because of the dynamically changing nature of graphs. Through interactive actions, interactive similarity search helps users enhance the efficacy of…
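Several of the abstracts above refer to Personalized PageRank and random walk with restart, which they describe as computed iteratively over all nodes. As a rough illustration only, the baseline iteration can be sketched as below. This is the textbook power iteration, not the accelerated methods these papers propose; the function name, the adjacency-dict graph format, and the restart probability of 0.15 are assumptions made for the example.

```python
def personalized_pagerank(graph, source, restart=0.15, tol=1e-10, max_iter=1000):
    """Textbook power iteration for PPR / random walk with restart.

    Assumed (hypothetical) input format: `graph` maps every node to a list
    of its out-neighbors, and every neighbor also appears as a key.
    """
    nodes = list(graph)
    scores = {n: 0.0 for n in nodes}
    scores[source] = 1.0  # the walker starts at the source node

    for _ in range(max_iter):
        nxt = {n: 0.0 for n in nodes}
        for n in nodes:
            out = graph[n]
            if out:
                # With probability (1 - restart), follow a random out-edge.
                share = (1.0 - restart) * scores[n] / len(out)
                for m in out:
                    nxt[m] += share
            else:
                # Dangling node: send its walking mass back to the source.
                nxt[source] += (1.0 - restart) * scores[n]
        # With probability `restart`, jump back to the source.
        # The scores always sum to 1, so the restart mass is `restart`.
        nxt[source] += restart

        delta = sum(abs(nxt[n] - scores[n]) for n in nodes)
        scores = nxt
        if delta < tol:
            break
    return scores


# Small illustrative graph (made up for this example).
example_graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["a"]}
example_scores = personalized_pagerank(example_graph, "a")
ranking = sorted(example_scores, key=example_scores.get, reverse=True)
```

Each pass touches every edge, which is the per-query cost that the abstracts above describe wanting to avoid when only a single node's relevance or the top-k nodes are needed.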