Capri: Programmable Analytics for Linked Data


Link Data (LD) initiative has fundamentally changed the way how data are published, distributed, and consumed. It advocates data transparency and accessibility to fulfill the Web of Data vision. Thus far, tens of billions of data items have been made publicly available in machine-understandable forms (e.g. RDF). The sheer size of LD data, however, has not resulted in a significant increase of data consumption and thus a self-sustainable consumption-driven publication. We contend that this is primarily due to the lack of tooling for exploiting LD. A new programming paradigm is necessary to simplify and encourage value-add LD data utilisation. This paper reports an on-going project towards programmable Linked Open Data. We propose to tap into a distributed computing environment underpinning the popular statistical toolkit R. Where possible, native R operators and functions are used in our approach so as to lower the learning curve for experienced data scientists. We believe a report to the relevant community at this stage can help us to collect critical requirements before moving into the next stage of development. The crux of our future work lies in comprehensive and extensive evaluations, in terms of, but not limited to, system performance, system stability, system scalability, programming productivity and user experience.

DOI: 10.1145/2684200.2684336

Extracted Key Phrases

2 Figures and Tables

Cite this paper

@inproceedings{Hu2014CapriPA, title={Capri: Programmable Analytics for Linked Data}, author={Bo Hu and Eduarda Mendes Rodrigues and Emeric Viel}, booktitle={iiWAS}, year={2014} }