sparse can now generate executables

How can you be confident that a source code analysis tool is correct when its analysis of a file does not generate any warnings? The best way I know of addressing this issue is for the analysis tool to be capable of generating an executable from the source. Executing a moderately large program requires that the translator get all sorts of complicated analysis correct, and provides a huge boost to confidence in its correctness compared to an analysis tool that does not have this ability.

Congratulations to the maintainers of sparse (a Linux kernel specific analysis tool started by Linus Torvalds in 2003) for becoming one of the very small number of source code analysis tools capable of generating executable programs.

The Model C Implementation is another tool capable of generating executables. I would love to be able to say that the reason for this was dedication to perfection by the project team; however, the truth is that it started life as a compiler and became an analyzer later.

Clang, the C front end+analyzer to llvm, is often referred to as a static analyzer. While it does perform some static checking (like gcc does when the -Wall option is specified), a lot more checks need to be supported before it can be considered on a par with modern static analysis tools.

GCC’s Treehydra project works via a plug-in to the compiler. This project has yet to live up to its potential so we can delay discussion of whether it should be classified as a standalone system or an executable generator.

I cannot think of any other ‘full language’ static analysis tools capable of generating executable programs (the C-semantics tool restricts its checks to those required by the C Standard and I think tools need to do a lot more than this to be considered static analysis tools). Corrections to my lack of knowledge welcome.

I was a little concerned that there were plans afoot to migrate sparse to becoming the build compiler for the Linux kernel. Linus answered my query by saying that this was never a goal.

  1. Peter
    August 30, 2011 01:00 | #1
  2. August 30, 2011 01:17 | #2

    @Peter
    Yes I did, thanks for pointing out my mistake; I have updated the post.

  3. August 30, 2011 02:09 | #3

    You say “the C-semantics tool restricts its checks to those required by the C Standard”, but that is not entirely accurate. The C standard requires only a few checks: things that must be given diagnostic messages. Our tool actually identifies lots of other problems, such as accessing memory that is no longer alive. The standard does not require an implementation to diagnose this.

    I think we are better thought of as a tool that checks for undefined behavior. Our tool is not (and cannot be) complete, but it will catch many kinds of errors.

  4. August 30, 2011 11:46 | #4

    @Chucky Ellison
    Yes, my wording was rather sloppy; I was trying to write with a broad brush and got overly broad. How best to briefly specify to a non-C expert exactly what C-semantics does is very hard. I used to tell people either that the Model Implementation “flags any behavior that prevents a program being strictly conforming” or “flags all unspecified, implementation-defined or undefined behavior”; both of which tended to result in blank looks from whoever I was talking to.

    Your tool will also catch some unspecified behaviors.
