The 2010 LarKC PhD Symposium, November 14th, 2010

Abstract

Interleaving reasoning and selection by knowledge summarization is a method for dealing with scalability in large-scale data search and reasoning. Many existing methods for improving search over large-scale data require high-performance hardware support (with an emphasis on parallelization and cloud architectures). Because the scale of semantic data grows faster than hardware performance, relying only on improvements in hardware and architecture will not always work, especially in resource-limited environments. Interleaving reasoning and selection by knowledge summarization does not rely on high-performance hardware. Instead of checking all the data globally, the method explores only part of the data, guided by heuristic strategies. To achieve this, some preprocessing must be done offline, namely dividing and summarization. The dividing process cuts the original dataset into small chunks, and the summarization process provides the heuristic information used to measure how far away the target chunks are. For each query, a function computes from this heuristic information an estimate of the possibility that answers appear in a given chunk. When a query arrives, the heuristic information is used to find the chunks with a high possibility of partially answering it. The query is then executed against those chunks in descending order of possibility. Interleaving reasoning and selection does not require completeness, so a search based on knowledge summarization can stop at any time, and the most promising chunks are selected first to answer the query. This method enables personal computers and even mobile devices to deal with large-scale data, and experiments on the RDF version of the MEDLINE dataset have been carried out to validate the proposed method.
The summarization describes the original dataset at a much smaller size. It can be expressed as ontologies from different perspectives. With these ontologies, and other external ontologies related to a query, the method can process the query with reasoning features. Based on this method, a prototype system named Knowledge Intensive Summarization System (KISS) has been developed, and the system indicates that the proposed method is potentially effective.
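
The divide, summarize, rank, and query pipeline described above can be illustrated with a minimal Python sketch. It is not the KISS implementation. The function names (`divide`, `summarize`, `possibility`, `anytime_search`), the term-overlap heuristic, and the toy triples are all assumptions made for illustration. The abstract does not specify the actual summarization format or scoring function.

```python
from collections import Counter

# Offline step 1 (dividing): cut the dataset into fixed-size chunks.
def divide(triples, chunk_size):
    return [triples[i:i + chunk_size] for i in range(0, len(triples), chunk_size)]

# Offline step 2 (summarization): a small term-frequency table per chunk.
# A real summary might be an ontology; a Counter is an assumed stand-in.
def summarize(chunk):
    return Counter(term for triple in chunk for term in triple)

# Assumed heuristic: fraction of query terms that occur in the summary.
def possibility(query_terms, summary):
    if not query_terms:
        return 0.0
    return sum(1 for t in query_terms if t in summary) / len(query_terms)

# Online step: visit chunks in descending order of possibility and yield
# partial answers lazily, so the caller can stop consuming at any time.
def anytime_search(chunks, summaries, query_terms):
    ranked = sorted(
        range(len(chunks)),
        key=lambda i: possibility(query_terms, summaries[i]),
        reverse=True,
    )
    for i in ranked:
        if possibility(query_terms, summaries[i]) == 0.0:
            break  # no query term occurs in this or any later chunk
        for triple in chunks[i]:
            if all(t in triple for t in query_terms):
                yield i, triple

# Toy data (hypothetical, not taken from MEDLINE).
triples = [
    ("paper1", "mentions", "aspirin"),
    ("paper1", "mentions", "headache"),
    ("paper2", "mentions", "insulin"),
    ("paper2", "mentions", "diabetes"),
    ("paper3", "mentions", "aspirin"),
    ("paper3", "mentions", "stroke"),
]
chunks = divide(triples, 2)
summaries = [summarize(c) for c in chunks]
query = {"mentions", "aspirin"}

first_answer = next(anytime_search(chunks, summaries, query))  # stop early
all_answers = list(anytime_search(chunks, summaries, query))   # run to the end
```

Because `anytime_search` is a generator, taking only the first result corresponds to stopping the search early, while exhausting it visits every chunk with a non-zero score. Only the small summaries are consulted when ranking, so the full chunks are read one at a time in order of estimated usefulness.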

Cite this paper

@inproceedings{Huang2010THE, title={The 2010 LarKC PhD Symposium, November 14th, 2010}, author={Zhisheng Huang and Yi Zeng and Reto Krummenacher and Danica Damljanovic and Matthias Assel}, year={2010} }