• Corpus ID: 14321514

XLSearch: A Search Engine for Spreadsheets

Michael Kohlhase, Corneliu-Claudiu Prodescu, Christian Liguda
Spreadsheets are end-user programs and domain models that are heavily employed in administration, financial forecasting, education, and science because of their intuitive, flexible, and direct approach to computation. As a result, institutions are swamped by millions of spreadsheets that are becoming increasingly difficult to manage, access, and control. This note presents the XLSearch system, a novel search engine for spreadsheets. It indexes spreadsheet formulae and efficiently answers… 


MathWebSearch at NTCIR-11
MWS 1.0 is described, the submission and results for the NTCIR-11 Math-2 Task are evaluated, and future work suggested by the task results is discussed to form a stable basis for future research into extended query languages and user-interaction issues.
Searching for Truth in a Database of Statistics
This work presents a novel algorithm enabling the exploitation of statistic databases such as those compiled by state agencies, by identifying the statistic datasets most relevant for a given fact-checking query, and extracting from each dataset the best specific query answer it may contain.
Augmenting Mathematical Formulae for More Effective Querying & Efficient Presentation
This thesis proposes to rethink the fundamentals of MIR systems and introduces the concept of context-free formulae, visualized by the idea of a Formula Home Page (FHP), so that a mathematically literate person can fully understand the formula semantics without needing to visit an FHP.
MathWebSearch at NTCIR-10
The MATHWEBSEARCH system's participation in the NTCIR-10 Math pilot task, a challenge in mathematical information retrieval, is presented and the results are analyzed.
Challenges of Mathematical Information Retrieval in the NTCIR-11 Math Wikipedia Task
This paper discusses the dataset preparation, topic generation and evaluation methods, and summarizes the results of the participants, with a special focus on the Wikipedia Task.
Vers une vérification automatique des affirmations statistiques
The proposed PhD project takes place within the ANR project ContentCheck, and will be developed based on the interactions with the authors' partner from Le Monde, interested in developing textual and semantic tools for analyzing content shared through digital media.


The EUSES spreadsheet corpus: a shared resource for supporting experimentation with spreadsheet dependability mechanisms
A corpus of spreadsheets is assembled that is suitable for evaluating dependability devices in Microsoft Excel, and a variety of features of these spreadsheets are measured to aid researchers in selecting subsets of the corpus appropriate to their needs.
Compensating the Computational Bias of Spreadsheets with MKM Techniques
It is shown that spreadsheets are interesting applications for MKM techniques, which can alleviate usability and maintenance problems as spreadsheet-based applications grow ever more complex and long-lived.
MathWebSearch 0.5: Scaling an Open Formula Search Engine
This work focuses on scalability issues in MathWebSearch to take advantage of corpora in the giga-formula range; it re-implements the index to make it distributable and makes all the APIs web standards conformant.
A methodology for testing spreadsheets
A testing methodology that adapts data flow adequacy criteria and coverage monitoring to the task of testing spreadsheets is presented and it is found that test suites created according to the methodology detected, on average, 81% of the faults in a set of faulty spreadsheets, significantly outperforming randomly generated test suites.
In Excel, Cell Names Spell Speed, Safety: Give a Cell a Name, and Your Work Will Go Faster and Be More Error-Free
This tutorial helps you to learn how to use a naming system called "named ranges" to make formulas easier to write and to read.
The Definitive ANTLR 4 Reference
This book teaches using real-world examples and shows you how to use ANTLR to build such things as a data file reader, a JSON to XML translator, an R parser, and a Java class-interface extractor.
OMDoc - An Open Markup Format for Mathematical Documents [version 1.2]
M. Kohlhase. Computer Science. Lecture Notes in Computer Science, 2006.
In contrast to the OMDoc format, this report is a total re-write; it closes many documentation gaps, clarifies various remaining issues, and adds a multitude of new examples.
Spreadsheet Errors: What We Know. What We Think We Can Do
To date, only one technique, cell-by-cell code inspection, has been demonstrated to be effective, and the degree to which other techniques can reduce spreadsheet errors needs to be determined.
Spreadsheet Auditing Software
This paper documents and tests office software tools designed to assist in the audit of spreadsheets to identify the success of software tools in detecting different types of errors, to identify how the software tools assist the auditor and to determine the usefulness of the tools.
A scalable module system