A file comparison program

  • Webb Miller, Eugene Wimberly Myers
  • Published 1 November 1985
  • Computer Science
  • Software: Practice and Experience
This paper presents a simple method for computing a shortest sequence of insertion and deletion commands that converts one given file to another. The method is particularly efficient when the difference between the two files is small compared to the files' lengths. In experiments performed on typical files, the program often ran four times faster than the UNIX diff command. 
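To illustrate why the size of the difference can matter more than the file lengths, here is a minimal sketch of a greedy "furthest-reaching diagonal" search that returns the length of a shortest insert/delete script. It is not taken from the paper: the function name `shortest_edit_length` and the dictionary `furthest` are hypothetical, and the code follows the widely known greedy formulation rather than the authors' program. The work done grows with the number of edits explored, so near-identical inputs finish quickly.

```python
def shortest_edit_length(a, b):
    """Length of a shortest script of insertions and deletions turning a into b.

    a and b may be strings or lists of lines. Illustrative sketch only.
    """
    n, m = len(a), len(b)
    # furthest[k] is the largest x reached so far on diagonal k, where k = x - y.
    furthest = {1: 0}
    for d in range(n + m + 1):            # d = number of edits tried so far
        for k in range(-d, d + 1, 2):     # diagonals reachable with exactly d edits
            if k == -d or (k != d and furthest[k - 1] < furthest[k + 1]):
                x = furthest[k + 1]       # step down: insert an element of b
            else:
                x = furthest[k - 1] + 1   # step right: delete an element of a
            y = x - k
            # Follow matching elements for free along the diagonal.
            while x < n and y < m and a[x] == b[y]:
                x += 1
                y += 1
            furthest[k] = x
            if x >= n and y >= m:
                return d
    return n + m


if __name__ == "__main__":
    print(shortest_edit_length("abcabba", "cbabac"))
```

Passing lists of lines instead of strings gives a line-oriented comparison in the style of a file comparison tool.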

Identifying syntactic differences between two programs

  • Wuu Yang
  • Computer Science
    Softw. Pract. Exp.
  • 1991
A comparison algorithm based on a dynamic programming scheme is developed that points out the differences between two programs more accurately than previous text comparison tools.

An approximation to the greedy algorithm for differential compression of very large files

A new differential compression algorithm combines hash values with the suffix array technique and relies on three new data structures: the block hash table, the quick index array, and the pointer array. These improve the algorithm's run time and allow it to compress very large files.

An analysis on computation of longest common subsequence algorithm

This paper compares various algorithms that work on two or more strings and puts forward proposals for developing new algorithms that handle more strings.

A Semantic Difference Algorithm for Structured Visual Dataflow Programs

This paper presents an algorithm for semantic comparison of programs in controlled visual dataflow languages, that is, languages in which dataflow diagrams are embedded in control structures. The algorithm performs a depth-first search of call structures to determine whether two programs are semantically equivalent and, if they are not, discovers the differences.

Implementation of Java Program Similarity Measurement Tool Using Token Structure and Execution Control Structure

This paper proposes a similarity measurement method for Java programs that uses software metrics calculated from the token structure and execution control structure of the target source program; it compares the resulting metric values without using expensive string comparison.

Syntactic Software Merging

The fundamentals of merging are described, the known methods of software merging are surveyed, including a method based on programming-language syntax, and a set of tools that perform syntactic merging is discussed.

Measuring the accuracy of page-reading systems

It is shown that the universe of cost functions is divided into equivalence classes, and the cost functions related to the longest common subsequence (LCS) are identified.

New Refinement Techniques for Longest Common Subsequence Algorithms

It has turned out to be difficult to develop an LCS algorithm that is superior for all problem instances, and implementing the most evolved LCS algorithms presented recently is laborious.

Semantic comparison of structured visual dataflow programs

This algorithm performs a depth-first search of call structures, comparing embedded diagrams using subgraph isomorphism, to determine whether two programs are semantically equivalent and, if they are not, discovers the differences.



Optimal Code Generation for Expression Trees

A dynamic programming algorithm is presented that produces optimal code for any machine in this class of machines and runs in time linearly proportional to the size of the input.

The string-to-string correction problem with block moves

An algorithm is presented that produces the shortest edit sequence transforming one string into another; it is optimal in the sense that it generates a minimal covering set of common substrings of one string with respect to another.

A linear space algorithm for computing maximal common subsequences

The problem of finding a longest common subsequence of two strings has been solved in quadratic time and space. An algorithm is presented which will solve this problem in quadratic time and in linear

The String-to-String Correction Problem

An algorithm is presented which solves the string-to-string correction problem in time proportional to the product of the lengths of the two strings.
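A running time proportional to the product of the two lengths is what a straightforward dynamic programming table gives. The sketch below is an illustration under stated assumptions, not the cited paper's formulation: it assumes unit costs for insertion, deletion, and substitution, and the function name `edit_distance` is hypothetical. It keeps only two rows of the table, so the time is still proportional to `len(s) * len(t)` while the memory stays proportional to `len(t)`.

```python
def edit_distance(s, t):
    """Minimum number of unit-cost insertions, deletions, and substitutions
    needed to turn s into t. Illustrative sketch only."""
    # prev[j] holds the distance between the processed prefix of s and t[:j].
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, 1):
        curr = [i]  # turning s[:i] into the empty string takes i deletions
        for j, ct in enumerate(t, 1):
            substitution = prev[j - 1] + (0 if cs == ct else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    print(edit_distance("kitten", "sitting"))
```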

Rcs — a system for version control

  • W. Tichy
  • Computer Science
    Softw. Pract. Exp.
  • 1985
Basic version control concepts are introduced and the practice of version control using RCS is discussed, and usage statistics show that RCS's delta method is space and time efficient.

Approximate String Matching

Approximate matching of strings is reviewed with the aim of surveying techniques suitable for finding an item in a database when there may be a spelling mistake or other error in the keyword. The

A redisplay algorithm

The algorithm is interesting because it applies results from the theoretical string-to-string correction problem (a generalization of the problem of finding a longest common subsequence) to a problem that is usually approached with crude ad-hoc techniques.

A fast algorithm for computing longest common subsequences

An algorithm for finding the longest common subsequence of two sequences of length n which has a running time of O((r + n) log n), where r is the total number of ordered pairs of positions at which the two sequences match.
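The dependence on r, the number of matching position pairs, can be seen in a short sketch that processes only the matches and performs one binary search per match. This is an illustration in the spirit of that bound, not the cited paper's code: the names `lcs_length`, `positions`, and `thresholds` are hypothetical, and only the length of the subsequence is returned.

```python
from bisect import bisect_left
from collections import defaultdict


def lcs_length(a, b):
    """Length of a longest common subsequence of a and b. Illustrative sketch only."""
    # For each symbol, record the positions where it occurs in b.
    positions = defaultdict(list)
    for j, sym in enumerate(b):
        positions[sym].append(j)

    # thresholds[k] is the smallest end position in b of any common
    # subsequence of length k + 1 found so far.
    thresholds = []
    for sym in a:
        # Visit matches in decreasing order of j so that a single element
        # of a is never used twice in the same subsequence.
        for j in reversed(positions.get(sym, ())):
            k = bisect_left(thresholds, j)
            if k == len(thresholds):
                thresholds.append(j)
            else:
                thresholds[k] = j
    return len(thresholds)


if __name__ == "__main__":
    print(lcs_length("abcabba", "cbabac"))
```

When the inputs are lines of text with few repeated lines, r stays small and the inner loop runs only a handful of times per element.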

The source code control system

  • M. Rochkind
  • Computer Science
    IEEE Transactions on Software Engineering
  • 1975
The SCCS approach to source code control is discussed, including how it is used and how the system is implemented.

Bounds on the Complexity of the Longest Common Subsequence Problem

It is shown that unless a bound on the total number of distinct symbols is assumed, every solution to the problem can consume an amount of time that is proportional to the product of the lengths of the two strings.