Clone Detection Using Abstract Syntax Suffix Trees

@inproceedings{Koschke2006CloneDU,
  title={Clone Detection Using Abstract Syntax Suffix Trees},
  author={Rainer Koschke and Raimar Falke and Pierre Frenzel},
  booktitle={2006 13th Working Conference on Reverse Engineering},
  year={2006},
  pages={253--262}
}
Reusing software through copying and pasting is a continuous plague in software development despite the fact that it creates serious maintenance problems. Various techniques have been proposed to find duplicated redundant code (also known as software clones). A recent study has compared these techniques and shown that token-based clone detection based on suffix trees is extremely fast but yields clone candidates that are often not syntactic units. Current techniques based on abstract syntax…
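
The point about token-based candidates not being syntactic units can be illustrated with a small sketch. It is not the paper's algorithm: it sorts token suffixes (a naive suffix array) as a simple stand-in for a suffix tree, and the names `longest_repeated_run` and `TOKENS`, as well as the toy token stream, are hypothetical.

```python
from typing import List, Tuple


def common_prefix_len(a: List[str], b: List[str]) -> int:
    """Number of leading tokens that two token lists share."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def longest_repeated_run(tokens: List[str]) -> Tuple[int, int, int]:
    """Return (length, first_start, second_start) of the longest repeated token run.

    Sorting all suffixes places the two occurrences of any longest repeat
    next to each other, so comparing neighbours in sorted order is enough.
    This is quadratic in the worst case; a suffix tree would be linear.
    """
    order = sorted(range(len(tokens)), key=lambda i: tokens[i:])
    best = (0, -1, -1)
    for i, j in zip(order, order[1:]):
        k = common_prefix_len(tokens[i:], tokens[j:])
        if k > best[0]:
            best = (k, min(i, j), max(i, j))
    return best


# Toy token stream (assumed): an `if` block and a `while` block with the
# same body, each followed by the same assignment.
TOKENS = ("if ( p ) { x = 1 ; } y = 2 ; "
          "while ( q ) { x = 1 ; } y = 2 ;").split()

length, first, second = longest_repeated_run(TOKENS)
print(" ".join(TOKENS[first:first + length]))
```

In this toy input the reported run begins at a closing parenthesis and continues past the end of the block into the next statement, so it does not correspond to any single syntax-tree node. That is the kind of candidate a purely token-based detector can report.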

Empirical evaluation of clone detection using syntax suffix trees

TLDR
This paper describes how to use suffix trees to find syntactic clones in abstract syntax trees and reports the results of a large case study in which the authors empirically compare the new technique to other techniques using the Bellon benchmark for clone detectors.

Design and Development of an Efficient Software Clone Detection Technique

TLDR
The study reported the use of clone detection in finding commonalities in the form of domain concepts in source code which will help analysts in understanding the design of the system for better maintenance.

Semantic Clone Detection Using Machine Learning

TLDR
A machine learning framework to automatically detect clones in software, which is able to detect Type-3 clones and the most complicated kind of clones, Type-4 clones, is presented.

Code Clone Detection based on Logical Similarity : A Review

TLDR
An algorithm for clone detection based on comparing parts of the abstract syntax trees of programs and finding semantic coding styles is presented, which represents a challenge in the current scenario.

Code Clone Detection Using Various Approaches

In the last few decades many techniques for software clone detection have been investigated by various researchers to detect the duplicated code in

Extracting the similarity in detected software clones using metrics

TLDR
This proposal is a new technique for finding similar code blocks and for quantifying their similarity, which can be used to find clone clusters, sets of code blocks all within a user-supplied similarity.

Abstract Syntax Tree Based Clone Detection for Java Projects

TLDR
A software engineering process is examined to create an abstract syntax tree based clone detector for java projects to reduce the time and effort as well as maintenance cost.

Developing a Novel and Effective Clone Detection Using Data Mining Technique

TLDR
This new algorithm detects the code clone for control structures such as for, while and do statements by splitting the original source code called source units and assigning index values for each statement.

Clone Detection Using Abstract Syntax Trees

TLDR
The goal in cloning is to create a new software program that mimics everything the original software does and the way in which it does.

An Empirical Study on Retrieving Structural Clones Using Sequence Pattern Mining Algorithms

TLDR
A new approach for detection of structural clones in source code is presented, which is parse-tree-based and enhanced by frequent subsequence mining, and proposes an encoding algorithm for control statements and method identifiers.
...

References

Effective Clone Detection Without Language Barriers

TLDR
This thesis investigates how the premises of simplicity and adaptability influence all phases of the clone detection process, and examines how line-based string matching as basic feature comparison technique can be augmented by minimal parsing to improve detection sensitivity.

Using Slicing to Identify Duplication in Source Code

TLDR
The design and initial implementation of a tool that finds clones and displays them to the programmer and uses program dependence graphs (PDGs) and program slicing to find isomorphic PDG subgraphs that represent clones is described.

CCFinder: A Multilinguistic Token-Based Code Clone Detection System for Large Scale Source Code

TLDR
A new clone detection technique, which consists of the transformation of input source text and a token-by-token comparison, is proposed; it has effectively found clones, and the metrics have been able to effectively identify the characteristics of the systems.

A language independent approach for detecting duplicated code

  • Stéphane Ducasse, M. Rieger, S. Demeyer
  • Computer Science
    Proceedings IEEE International Conference on Software Maintenance - 1999 (ICSM'99). 'Software Maintenance for Business Change' (Cat. No.99CB36360)
  • 1999
TLDR
This paper shows that it is possible to circumvent this hindrance by applying a language independent and visual approach, i.e. a tool that requires no parsing, yet is able to detect a significant amount of code duplication.

Improved tool support for the investigation of duplication in software

TLDR
The criteria for a complete tool that is designed to aid in the comprehension of cloning within a software system is described and a prototype of such a tool is presented and the value of its features is demonstrated through a case study on the Apache httpd Web server.

Identification of high-level concept clones in source code

  • Andrian Marcus, J. Maletic
  • Computer Science
    Proceedings 16th Annual International Conference on Automated Software Engineering (ASE 2001)
  • 2001
TLDR
The intention of the approach is to enhance and augment existing clone detection methods that are based on structural analysis and improve the quality of clone detection.

Clone detection in source code by frequent itemset techniques

TLDR
A new approach for the detection of clones in source code, which is inspired by the concept of frequent itemsets from data mining, is described; the source code is represented as an abstract syntax tree in XML.

Experiment on the automatic detection of function clones in a software system using metrics

TLDR
A technique to automatically identify duplicate and near duplicate functions in a large software system using metrics extracted from the source code using the tool Datrix™.

On Software Maintenance Process Improvement Based on Code Clone Analysis

TLDR
This paper intends to extend the functionality of Gemini to cope with the problems, applies the extended Gemini to several software systems, and evaluates the applicability of the new functions.

Finding function clones in Web applications

  • F. Lanubile, Teresa Mallardo
  • Computer Science
    Seventh European Conference on Software Maintenance and Reengineering, 2003. Proceedings.
  • 2003
TLDR
The results obtained from applying a simple semi-automated approach to three Web applications show that the approach is useful for a fast selection of script function clones, and can be applied to prevent clone spreading or to remove redundant scripting code.