Many tasks in software engineering can be characterized as source-to-source transformations. Design recovery, software restructuring, forward engineering, language translation, platform migration and code reuse can all be understood as transformations from one source text to another. TXL, the Tree Transformation Language, is a programming language and rapid …
Recently there has been considerable work toward standardizing SEFs (software exchange formats) for interchange of information about source programs. An exchange format is a common textual form for data extracted from source programs and used by a variety of software analysis and visualization tools. An SEF can be further specified by a schema, analogous to …
While graph-based techniques show good results in finding exactly similar subgraphs in graphical models, they have great difficulty in finding near-miss matches. Text-based clone detectors, on the other hand, do very well with near-miss matching in source code. In this paper we introduce SIMONE, an adaptation of the mature text-based code clone detector …
In this paper we introduce a general, extensible diagrammatic syntax for expressing software architectures based on typed nodes and connections and formalized using set theory. The syntax provides a notion of abstraction corresponding to the concept of a subsystem, and exploits this notion in a general mechanism for pattern matching over architectures. We …
Syntactic analysis forms a foundation of many source analysis and reverse engineering tools. However, a single standard grammar is not always appropriate for all source analysis and manipulation tasks. Small custom modifications to the grammar can make the programs used to implement these tasks simpler, clearer and more efficient. This leads to a new …
Previous research shows that most software systems contain significant amounts of duplicated, or cloned, code. Some clones are exact duplicates of each other, while others differ in small details only. We designate these almost-perfect clones as "near-miss" clones. While technically difficult, detection of near-miss clones has many benefits, both academic …
Any attempt at automated software analysis or modification must be preceded by a comprehension step, i.e. parsing. This task, while often considered straightforward, can in fact be made very challenging depending on the source code in question. Files that make up web applications serve as an example of such difficult-to-parse artifacts, for two reasons. …