The volume of RDF data has continued to grow over the past decade, and many known RDF datasets contain billions of triples. A grand challenge in managing such large RDF data is accessing it efficiently. A popular approach to this problem is to build a full set of permutations of (S, P, O) indexes. Although this approach has been shown to…
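The permutation-index idea mentioned in this abstract can be illustrated with a minimal in-memory sketch. It is not the paper's system: the class name `PermutationIndex` and its methods are hypothetical, and it assumes triples are plain Python strings held in nested dictionaries, one per ordering of subject (S), predicate (P), and object (O). A pattern is answered from the ordering whose leading positions are the bound ones.

```python
from collections import defaultdict
from itertools import permutations


class PermutationIndex:
    """Hypothetical toy store keeping all six orderings of (S, P, O)."""

    def __init__(self):
        # One three-level index per ordering: "spo", "sop", "pso", "pos", "osp", "ops".
        self.indexes = {
            "".join(order): defaultdict(lambda: defaultdict(set))
            for order in permutations("spo")
        }

    def add(self, s, p, o):
        triple = {"s": s, "p": p, "o": o}
        for order, index in self.indexes.items():
            a, b, c = (triple[k] for k in order)
            index[a][b].add(c)

    def match(self, s=None, p=None, o=None):
        """Return all triples matching the pattern; None means unbound."""
        pattern = {"s": s, "p": p, "o": o}
        bound = [k for k in "spo" if pattern[k] is not None]
        free = [k for k in "spo" if pattern[k] is None]
        # Pick the ordering that puts bound positions first, so lookups
        # on them are direct dictionary accesses rather than scans.
        k1, k2, k3 = bound + free
        level1 = self.indexes[k1 + k2 + k3]
        results = set()
        firsts = [pattern[k1]] if pattern[k1] is not None else list(level1)
        for a in firsts:
            level2 = level1.get(a, {})
            seconds = [pattern[k2]] if pattern[k2] is not None else list(level2)
            for b in seconds:
                level3 = level2.get(b, set())
                thirds = [pattern[k3]] if pattern[k3] is not None else level3
                for c in thirds:
                    if c in level3:
                        values = {k1: a, k2: b, k3: c}
                        results.add((values["s"], values["p"], values["o"]))
        return results


store = PermutationIndex()
store.add("alice", "knows", "bob")
store.add("alice", "knows", "carol")
store.add("bob", "knows", "carol")
store.add("alice", "age", "30")

friends_of_alice = store.match(s="alice", p="knows")
who_knows_carol = store.match(p="knows", o="carol")
```

The sketch stores every triple six times, which shows the trade-off of this approach: any triple pattern has a matching ordering, at the price of multiplied storage.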
Large-scale graph processing represents an interesting challenge due to the lack of locality. This paper presents PathGraph for improving iterative graph computation on graphs with billions of edges. Our system design has three unique features: First, we model a large graph using a collection of tree-based partitions and use a path-centric computation…
With the wide application of virtualization technology, the demand for performance analysis and system diagnosis in virtualized environments is increasing. There are some profiling toolkits based on hardware events, such as OProfile in native Linux and Xenoprof in the Xen virtual machine environment. However, sometimes users in different domains need to monitor…
The flexibility of the RDF data model has attracted an increasing number of organizations to store their data in RDF format. With the rapid growth of RDF datasets, we expect that deploying a cluster of computing nodes will become necessary to process large-scale RDF data with desirable query performance. In this paper, we address the…
Keywords: Group Steiner tree; Keyword search; RDF graph; Top-K. Abstract: This paper presents a novel IR-style keyword search model for semantic web data retrieval, distinguished from current retrieval methods. In this model, an answer to a keyword query is a connected subgraph that contains all the query keywords. In addition, the answer is minimal…
Server consolidation, one application of virtualization technology, consolidates multiple physical servers onto a single machine or a smaller number of real machines. It results in higher resource utilization and a smaller space footprint, and is considered a trend in enterprise application deployment. Some research has been carried out to evaluate its static performance.…
The emerging need for conducting complex analysis over big RDF datasets calls for scale-out solutions that can harness a computing cluster to process big RDF datasets. Queries over RDF data often involve complex self-joins, which would be very expensive to run if the data are not carefully partitioned across the cluster and hence distributed joins over…