Annotation-less Unit Type Inference for C


Types in programming languages are crucial for catching errors at compile time. Similarly, in scientific applications, the system of units forms a type discipline, because a correct equation must have equivalent units on both sides. Many scientific and numerically oriented programs are written in C, but the language provides no support for attaching units to variables. Thus, programmers cannot easily tell whether the units of the variables in their programs are consistent. We propose to create an analysis that can infer the unit types of variables in C programs such that the unit type discipline holds across all expressions in the program. Unlike previous work, which relies on the user to annotate some types and then checks the rest for consistency, we propose to infer types with no user annotations at all. This completely automatic analysis produces a most-general system of units for a program, which includes any consistent unit system the programmer may have intended as a special case. This most-general solution can then be specialized interactively by the programmer to give human-readable names to units, a process that requires much less programmer interaction than specifying each variable’s units one by one. Programmers can use our analysis to find bugs, which show up as inferred unit types that do not match their intuition, and as a starting point for annotating their code with units using an annotation framework.

The correct use of physical units and dimensions is an important aspect of correctness for many computer programs. On one hand, many programming errors lead to programs whose results do not have the expected units, analogously to the common experience of students in introductory physics courses. For instance, an experiment performed by Brown [Bro01] showed that checking the units in a short procedure written to calculate a function used in particle physics revealed three separate errors.
On the other hand, programs with unit errors can harbor subtle bugs. To cite a particularly costly error, an on-ground system used in the navigation of the NASA Mars Climate Orbiter spacecraft failed to convert between pound-seconds and newton-seconds in calculating the impulse produced by thruster firings, due to a programmer error when updating a program used for a previous spacecraft to include the specification of a new model of thruster [EJC01]. This root cause, along with inadequate testing, operational failures, and bad luck, eventually led to the loss of the spacecraft, which was destroyed in the Mars atmosphere on September 23, 1999 [Mar99].

As these examples indicate, more careful attention to physical units could improve software quality, and because the rules governing unit correctness are simple, much of this checking can be performed automatically. In fact, extensions exist for many languages that allow programmers to specify the units of variables and constants so that a compiler can check them. However, though many of the potential benefits of unit annotations would be realized in the maintenance of existing systems, adding unit annotations by hand to large existing programs would be very expensive.

Instead of requiring a developer to specify each unit type individually, we believe that a better approach is to automatically infer a general set of unit types based only on the assumption that a program’s use of units is consistent. Such inferred types could be useful for many kinds of automatic checking, such as warning a programmer when a change to one function is inconsistent with the units of variables in another one. Such a most-general unit system also provides a more efficient way for a developer to add human-readable units to a program: he or she need only specify a few of the unit types in a program, and the rest can be automatically assigned to be consistent.
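
To make the idea of inferring units without annotations concrete, here is a minimal sketch in C. It is not the analysis proposed in this paper: it assumes a deliberately simplified setting in which the only constraints are "these two variables must have the same unit", as arises from addition, comparison, and assignment. Constraints from multiplication and division, which relate units by products and quotients, are omitted. The names `units_init`, `units_same`, `units_equal`, and the `VAR_*` identifiers are hypothetical and chosen only for illustration.

```c
#include <assert.h>

/* Hypothetical sketch: every program variable gets an integer id, and
   variables that must share a unit are merged into one equivalence class
   with a union-find structure. Each class stands for one unnamed unit. */
#define MAX_VARS 16

/* Hypothetical ids for the variables of a fragment such as:
     total = total + burn;   if (elapsed < limit) ...            */
enum { VAR_TOTAL, VAR_BURN, VAR_ELAPSED, VAR_LIMIT };

static int parent[MAX_VARS];

/* Start with every variable in its own class: no unit is assumed. */
void units_init(void) {
    for (int i = 0; i < MAX_VARS; i++)
        parent[i] = i;
}

/* Return the representative of v's unit class, shortening paths as we go. */
int units_find(int v) {
    assert(v >= 0 && v < MAX_VARS);
    while (parent[v] != v) {
        parent[v] = parent[parent[v]];  /* path halving */
        v = parent[v];
    }
    return v;
}

/* Record the constraint from `a + b`, `a < b`, or `a = b`: same unit. */
void units_same(int a, int b) {
    parent[units_find(a)] = units_find(b);
}

/* Nonzero if the constraints seen so far force a and b to share a unit. */
int units_equal(int a, int b) {
    return units_find(a) == units_find(b);
}
```

In this sketch, calling `units_init()` and then `units_same(VAR_TOTAL, VAR_BURN)` and `units_same(VAR_ELAPSED, VAR_LIMIT)` yields two classes. A programmer who names the unit of one variable in a class has effectively named the unit of every variable in it. If a later statement forced the two classes to merge, for example by adding an impulse to a time, the merged class would contradict the programmer's intuition and so point to a possible bug.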


Cite this paper

@inproceedings{Guo2005AnnotationlessUT, title={Annotation-less Unit Type Inference for C}, author={Philip J. Guo and Stephen McCamant}, year={2005} }