Identifying syntactic differences between two programs

  • Wuu Yang
  • Published 1 June 1991
  • Computer Science
  • Software: Practice and Experience
Programmers frequently face the need to identify the differences between two programs, usually two different versions of a program. Text‐based tools such as the UNIX utility diff often produce unsatisfactory comparisons because they cannot accurately pinpoint the differences and because they sometimes produce irrelevant differences. Since programs have a rigid syntactic structure as described by the grammar of the programming language in which they are written, we develop a comparison… 
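
As a small illustration of the contrast drawn in this abstract (and not the paper's own algorithm), the Python sketch below compares two versions of a function that differ only in layout and comments. A line-based comparison through the standard `difflib` module reports a change, while a comparison of parse trees through the standard `ast` module reports none. The sample programs and the helper names `text_differs` and `syntax_differs` are assumptions made for this example.

```python
import ast
import difflib


def text_differs(old: str, new: str) -> bool:
    """Line-based comparison, similar in spirit to a text diff tool."""
    diff = difflib.unified_diff(old.splitlines(), new.splitlines(), lineterm="")
    return len(list(diff)) > 0


def syntax_differs(old: str, new: str) -> bool:
    """Compare parse trees; layout and comments do not appear in the tree."""
    return ast.dump(ast.parse(old)) != ast.dump(ast.parse(new))


# Hypothetical sample programs, chosen only for illustration.
OLD = "def f(x):\n    return x + 1\n"
NEW = "def f( x ):\n    # add one\n    return x+1\n"   # layout change only
CHANGED = "def f(x):\n    return x + 2\n"              # real change

if __name__ == "__main__":
    print(text_differs(OLD, NEW), syntax_differs(OLD, NEW))
    print(text_differs(OLD, CHANGED), syntax_differs(OLD, CHANGED))
```

The tree comparison here only answers whether the two programs differ; locating the differing subtrees is the harder problem that a syntactic differencing tool has to solve.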

Comparing Java programs: syntactic and contextual semantic differences

This thesis introduces algorithms that handle both ordered and unordered nodes in an abstract syntax tree and implements them in the functional language Haskell, using the Strafunski libraries for generic programming to transform Java programs into abstract syntax trees and partly to traverse the trees.

How to merge program texts

  • Wuu Yang
  • Computer Science
    J. Syst. Softw.
  • 1994

Semantic comparison of structured visual dataflow programs

This algorithm performs depth-first search of call structures comparing embedded diagrams using subgraph isomorphism, to determine if two programs are semantically equivalent, and if they are not, discovers the differences.

Detecting and investigating the source code changes using logical rules

  • E. Kodhai, B. Dhivya
  • Computer Science
    2014 International Conference on Circuits, Power and Computing Technologies [ICCPCT-2014]
  • 2014
The proposed system uses a rule-based program differencing approach that represents changes as logical rules, giving a concise description of systematic changes and helping software engineers recognize program differences.

Analyzing and inferring the structure of code change

This dissertation discovered that, in contrast to conventional wisdom, programmers often create and maintain code duplicates with clear intent and that immediate and aggressive refactoring may not be the best solution for managing code clones.

Identifying and Summarizing Systematic Code Changes via Rule Inference

A rule-based program differencing approach is proposed that automatically discovers and represents systematic changes as logic rules; it is demonstrated through its application to several open source projects and a focus group study with professional software engineers from a large e-commerce company.

An evaluation of duplicate code detection using anti-unification

An algorithm for finding software clones, which works at the level of abstract syntax trees and is thus conceptually independent of the source language of the analyzed programs, and is formally based on the notion of anti-unification.

Refactoring Dynamic Languages

This tool provides several refactoring operations for the typical mistakes made by beginners and is intended to be used as part of their learning process in DrRacket, a simple and pedagogical IDE.

Duplicate code detection using anti-unification

A new algorithm for finding software clones, conceptually independent of the source language of the analyzed programs, working at the level of abstract syntax trees, that considers that two sequences of statements form a clone if one of them can be obtained from the other by replacing some subtrees.
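
The idea of obtaining one tree from another by replacing some subtrees can be sketched with a first-order anti-unification routine. The Python code below is a minimal sketch under stated assumptions, not the algorithm from the cited paper: trees are modeled as nested tuples `(label, child, ...)` with strings as leaves, and the function name `anti_unify` and the `?n` variable naming are choices made for this example.

```python
from itertools import count


def anti_unify(a, b, fresh=None, subst=None):
    """Return (generalization, substitutions) for two trees.

    Differing subtrees are replaced by variables "?0", "?1", ...;
    subst maps each variable to the pair of subtrees it stands for.
    """
    if fresh is None:
        fresh = count()
    if subst is None:
        subst = {}
    if a == b:
        return a, subst
    both_nodes = isinstance(a, tuple) and isinstance(b, tuple)
    if both_nodes and a[0] == b[0] and len(a) == len(b):
        # Same label and arity: generalize the children pairwise.
        children = tuple(
            anti_unify(x, y, fresh, subst)[0] for x, y in zip(a[1:], b[1:])
        )
        return (a[0],) + children, subst
    # Reuse the variable if this pair of subtrees was already abstracted.
    for var, pair in subst.items():
        if pair == (a, b):
            return var, subst
    var = f"?{next(fresh)}"
    subst[var] = (a, b)
    return var, subst


if __name__ == "__main__":
    s1 = ("assign", "x", ("add", "y", "1"))
    s2 = ("assign", "x", ("add", "z", "1"))
    print(anti_unify(s1, s2))
```

In this sketch, two statements could be treated as clone candidates when the generalization keeps most of the structure and the substitutions are small; any such threshold is an assumption and is not taken from the paper.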

An implementation of and experiment with semantic differencing

The first semantic differencing implementation for the C language is presented and studied; a large collection of semantic differences was computed for 10 programs, with an average size reduction of 37.70%.



A technique for isolating differences between files

A simple algorithm is described for isolating the differences between two files that corresponds closely to the intuitive notion of difference, is easy to implement, and is computationally efficient, with time linear in the file length.

A specification schema for indenting programs

Most indentation styles appearing in the literature can be specified with precision using methods developed in this paper, and although specifications for real‐life programs can be given using simple mathematics, the effort required is still considerable.

Identifying the semantic and textual differences between two versions of a program

This paper describes a technique for comparing two versions of a program, determining which program components represent changes, and classifying each changed component as representing either a semantic or a textual change.

Managing Multi-Version Programs with an Editor

A new method of automating much of the bookkeeping involved in dealing with multi-version programs is described here, which entails use of a special editor that enables a multi-version program to be seen and modified in a fashion that is far closer to that normally permitted for a single-version program.

C, a reference manual

The Third Edition of C: A Reference Manual provides a complete discussion of the language, the run-time libraries, and a style of C programming that emphasizes correctness, portability, and maintainability.

A file comparison program

A simple method for computing a shortest sequence of insertion and deletion commands that converts one given file to another, which is particularly efficient when the difference between the two files is small compared to the files' lengths.
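
To make the notion of a shortest sequence of insertion and deletion commands concrete, the sketch below derives one from a longest-common-subsequence table. This is a plain quadratic dynamic-programming approach swapped in for illustration; it is not the method of the cited paper, which is described as particularly efficient when the two files differ only slightly. The function name `edit_script` and the `("delete", line)` / `("insert", line)` command format are assumptions made for this example.

```python
def edit_script(a, b):
    """Return a shortest list of ("delete", x) / ("insert", x) commands
    that converts sequence a into sequence b."""
    n, m = len(a), len(b)
    # lcs[i][j] = length of a longest common subsequence of a[i:] and b[j:]
    lcs = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if a[i] == b[j]:
                lcs[i][j] = lcs[i + 1][j + 1] + 1
            else:
                lcs[i][j] = max(lcs[i + 1][j], lcs[i][j + 1])

    # Walk the table from the start, keeping matched lines untouched.
    script = []
    i = j = 0
    while i < n and j < m:
        if a[i] == b[j]:
            i += 1
            j += 1
        elif lcs[i + 1][j] >= lcs[i][j + 1]:
            script.append(("delete", a[i]))
            i += 1
        else:
            script.append(("insert", b[j]))
            j += 1
    script.extend(("delete", x) for x in a[i:])
    script.extend(("insert", x) for x in b[j:])
    return script


if __name__ == "__main__":
    print(edit_script(["a", "b", "c"], ["a", "c", "d"]))
```

The length of the resulting script equals `len(a) + len(b)` minus twice the length of the longest common subsequence, which is the minimum possible when only insertions and deletions are allowed.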

Some concerns about Modula-2

Because all I/O has been moved to procedures whose syntax is limited to what the programmer is given, Modula has been left with only low-level I/O syntax. The I/O procedures in most high-level

Formatted programming languages

This paper presents a systematic approach to formatted language design that incorporates formatting within the syntax of programming languages and a set of guidelines for language designers to enhance readability within the constraints of the metasyntax.

C++ Programming Language

Bjarne Stroustrup makes C++ even more accessible to those new to the language, while adding advanced information and techniques that even expert C++ programmers will find invaluable.

Production trees: a compact representation of parsed programs

This work develops the necessary analysis to characterize the storage requirements of parse trees, abstract syntax trees, and production trees and relate the size of all three to the size of the program's text.