Making Open Data Transparent: Data Discovery on Open Data
Open Data poses interesting new challenges for data integration research. One of those challenges is data discovery: how can one find new data sets within this ever-expanding sea of Open Data?
LabBook: Metadata-driven social collaborative data analysis
The key insight is to collect and use more metadata about all elements of the analytic ecosystem by means of an architecture and user experience that reduce the cost of contributing such metadata.
VIQS: Visual Interactive Exploration of Query Semantics
VIQS (Visual Interactive Query Semantics) is a system that extracts query semantics from query logs over multiple datasets and lets users explore the underlying patterns visually, helping system designers browse and understand patterns of use.
Barriers to adoption of information technology in healthcare
This report takes a systems thinking perspective to identify barriers to the application of information technology in healthcare, and to the adoption of those advances, through the prism of two use cases: electronic medical records (EMR) and remote patient monitoring (RPM) technology.
VizCurator: A Visual Tool for Curating Open Data
VizCurator permits the exploration, understanding, and curation of open RDF data, its schema, and how it has been linked to other sources. It can also be used to create new binary temporal relations by reifying base facts and linking them to temporal resources.
Pytheas: Pattern-based Table Discovery in CSV Files
This work proposes Pytheas, a principled method for automatically classifying lines in a CSV file and discovering tables within it, based on the intuition that tables maintain a coherency of values in each column. It also introduces a confidence measure for table discovery.
VoidWiz: Resolving incompleteness using network effects
A principled way of performing value imputation on missing values is introduced, allowing a user to choose a correct value after viewing possible values and why they were inferred.
CSV is a popular Open Data format widely used in a variety of domains for its simplicity and effectiveness in storing and disseminating data. Unfortunately, data published in this format often does…
Automated Conceptual Abstraction of Large Semantic Diagrams
The design and development of applications and systems is a multistep process involving developers and stakeholders. During the development lifecycle, numerous diagrams are often created in order to…