Data clone detection and visualization in spreadsheets
@inproceedings{Hermans2013DataCD,
  title={Data clone detection and visualization in spreadsheets},
  author={Felienne F. J. Hermans and B. M. W. Sedee and Martin Pinzger and Arie van Deursen},
  booktitle={2013 35th International Conference on Software Engineering (ICSE)},
  year={2013},
  pages={292--301}
}
Spreadsheets are widely used in industry: it is estimated that end-user programmers outnumber programmers by a factor of 5. However, spreadsheets are error-prone; numerous companies have lost money because of spreadsheet errors. One of the causes of spreadsheet problems is the prevalence of copy-pasting. In this paper, we study this cloning in spreadsheets. Based on existing text-based clone detection algorithms, we have developed an algorithm to detect data clones in spreadsheets: formulas whose…
60 Citations
Copy-Paste Detection in Spreadsheets
- Computer Science
- 2013
An algorithm is presented to detect data clones within spreadsheets, that is, formulas whose values are copied to a different location, and it is shown that this algorithm detects these data clones with precision rates similar to those achieved by state-of-the-art code clone detection algorithms.
Detecting table clones and smells in spreadsheets
- Computer Science
- SIGSOFT FSE
- 2016
Inspired by existing fingerprint-based code clone detection techniques, a detection algorithm was developed to detect table clones, and related smells caused by inconsistencies among them, in spreadsheets; it was applied to real-world spreadsheets from the EUSES corpus.
On the empirical evaluation of similarity coefficients for spreadsheets fault localization
- Computer Science
- Automated Software Engineering
- 2014
This paper studies the impact of different similarity coefficients on the accuracy of spectrum-based fault localization applied to the spreadsheet domain and shows that three of the 42 studied coefficients require less effort by the user while inspecting the diagnostic report, and can be used interchangeably without a loss of accuracy.
WARDER: Towards effective spreadsheet defect detection by validity-based cell cluster refinements
- Computer Science
- J. Syst. Softw.
- 2020
How effectively can spreadsheet anomalies be detected: An empirical study
- Computer Science
- J. Syst. Softw.
- 2017
WARDER: Towards Effective Spreadsheet Defect Detection by Validity-based Cell Cluster Refinements
- Computer Science
- 2020
WARDER is presented to improve one state-of-the-art technique, CUSTODES, which exploits spreadsheet cell clustering and defect detection, by extending its scope and making its detection patterns adaptive to varying spreadsheet styles.
Why Does my Spreadsheet Compute Wrong Values?
- Computer Science
- 2014 IEEE 25th International Symposium on Software Reliability Engineering
- 2014
This paper introduces a novel dependency-based approach for model-based fault localization in spreadsheets that improves diagnostic accuracy while keeping computation times short, thus making the automated fault localization more appropriate for practical applications.
Analyzing and Visualizing Spreadsheets
- Computer Science
- 2013
This dissertation aims to develop methods that help spreadsheet users understand, update and improve spreadsheets. It finds that methods from software engineering can be applied to spreadsheets very well, and that these methods support end-users in working with spreadsheets.
Using constraints to diagnose faulty spreadsheets
- Computer Science
- Software Quality Journal
- 2014
This work presents a constraint-based approach, ConBug, for debugging spreadsheets, which helps end users to pinpoint faulty cells in a spreadsheet and demonstrates that the approach is light-weight and efficient.
Smelling Faults in Spreadsheets
- Computer Science
- 2014 IEEE International Conference on Software Maintenance and Evolution
- 2014
A technique to automatically pinpoint potential faults in spreadsheets is proposed, which combines a catalog of spreadsheet smells that provide a first indication of a potential fault, with a generic spectrum-based fault localization strategy in order to improve on these initial results.
References
Exact and Near-miss Clone Detection in Spreadsheets
- Computer Science
- Tiny Trans. Comput. Sci.
- 2012
Clone detection in spreadsheets is useful both to reveal opportunities for improving the spreadsheet and to detect actual errors, and this work shows that this is a promising avenue.
Detecting code smells in spreadsheet formulas
- Computer Science
- 2012 28th IEEE International Conference on Software Maintenance (ICSM)
- 2012
A list of metrics by which to detect smelly formulas and a visualization technique to highlight these formulas in spreadsheets are presented; the findings indicate that formula smells are common and that they can reveal real errors and weaknesses in spreadsheet formulas.
Detection and analysis of near-miss software clones
- Computer Science
- 2009 IEEE International Conference on Software Maintenance
- 2009
A hybrid clone detection method is developed, and a mutation-based framework is developed that automatically and efficiently measures (and compares) the recall and precision of clone detection tools.
Detecting and visualizing inter-worksheet smells in spreadsheets
- Computer Science
- 2012 34th International Conference on Software Engineering (ICSE)
- 2012
The results of the evaluation indicate that smells can indeed reveal weaknesses in a spreadsheet's design, and that data flow diagrams are an appropriate way to show those weaknesses.
Using Slicing to Identify Duplication in Source Code
- Computer Science
- SAS
- 2001
The design and initial implementation of a tool that finds clones and displays them to the programmer is described; the tool uses program dependence graphs (PDGs) and program slicing to find isomorphic PDG subgraphs that represent clones.
Tracking the Evolution of Code Clones
- Computer Science
- SOFSEM
- 2011
An approach is presented for mapping code duplications from one version of the software to another, based on a similarity distance function; the work also introduces the term "clone smells", which gives a clue about why the reported code fragments might be dangerous.
Supporting professional spreadsheet users by generating leveled dataflow diagrams
- Computer Science
- 2011 33rd International Conference on Software Engineering (ICSE)
- 2011
This paper first studies the problems and information needs of professional spreadsheet users by means of a survey conducted at a large financial company, and then presents an approach that extracts this information from spreadsheets and presents it in a compact and easy-to-understand way, with leveled dataflow diagrams.
Automatically Extracting Class Diagrams from Spreadsheets
- Computer Science
- ECOOP
- 2010
This work creates a library of common spreadsheet usage patterns, which are localized in the spreadsheet using a two-dimensional parsing algorithm and then transformed and enriched with information from the library, in order to automatically extract information and transform it into class diagrams.
Near-miss function clones in open source software: an empirical study
- Computer Science
- J. Softw. Maintenance Res. Pract.
- 2010
This paper examines more than twenty open source C, Java and C# systems, including the entire Linux Kernel, Apache httpd, J2SDK-Swing and db4o, and compares their use of cloned code in several different dimensions, including language, clone size, clone similarity, clone location and clone density.