C++ automatic rule checkers

Compiler warnings

One advantage of using an ahead-of-time compiler, as C++ programmers normally do, is that the compiler can check some aspects of program correctness without even running the program. Unfortunately, for historical and cultural reasons, many of these checks are disabled by default.

Compiler warnings, also known as “diagnostics” or “lints”, usually point at code which is legal according to the C++ standard, but invokes treacherous language corner cases that are unlikely to match the programmer's intent or understanding. Code which triggers compiler warnings should normally be reviewed carefully for correctness, and ultimately rewritten into a more readable/correct form that does not trigger the warning.

In this section, we will explore some useful options of the GCC and clang compilers that you may want to enable to let them detect more issues with your code. We will focus on the behavior of GCC, which is largely imitated by clang. But if you are using another compiler, such as Microsoft Visual C++, you may want to check its documentation.

-Wall

Contrary to what its name suggests, this incorrectly named GCC and clang compiler option does not enable all error-checking features. It does, however, enable a set of compiler warnings that the GCC and clang developers consider to provide a good trade-off between protection and false positives.

Here are some examples of common problems that you can detect with -Wall:
  • Comparison of a char* pointer to a hardcoded string: The following code will compare the memory addresses of two strings, and not their contents as the developer likely intended, a classic mistake in C and old-style C++.
void charptr_eq_str(const char* s) {
    if (s == "reference string") { /* ... */ }
}
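
One possible fix, sketched below with illustrative function names, is to compare the string contents explicitly:

#include <cstring>
#include <string_view>

// C-style fix: strcmp() compares characters and returns 0 on equality
bool charptr_eq_str_c(const char* s) {
    return std::strcmp(s, "reference string") == 0;
}

// Modern C++ fix: std::string_view's operator== compares contents
bool charptr_eq_str_cpp(const char* s) {
    return s == std::string_view{"reference string"};
}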
  • Statically out-of-bounds array accesses: A C++ program that performs reads or writes beyond the end of an array invokes undefined behavior. This is the worst kind of error that a C++ programmer can make, because a compiler is allowed to assume that it does not happen when optimizing code, and thus unpredictable chaos will ensue at execution time. Using analysis that is carried out during the code optimization process (i.e. only when optimizations are enabled), GCC can detect this error in easy cases where all relevant parameters are known at compile time, as in the following example:
float numbers[42];
for (int i = 1; i <= 42; ++i) {  // Whoops! This is C++ not Fortran/Julia
    numbers[i] = i;
}
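
For reference, here is a corrected version of the loop that stays within the valid index range:

float numbers[42];
for (int i = 0; i < 42; ++i) {  // Valid indices go from 0 to 41
    numbers[i] = i;
}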
  • Misleading indentation: One unfortunate design decision of C++ is that control flow operations like if and while do not start a new code block. As a result, it is very easy to accidentally write code like the following, which -Wall will rightly report as suspicious:
if (condition)
    foo();  // Runs only if condition is true
    bar();  // Runs always, no matter if condition is true or false
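
Adding braces makes the intended control flow explicit and silences the warning:

if (condition) {
    foo();  // Runs only if condition is true
}
bar();      // Runs unconditionally, as the indentation now suggests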
  • Missing enum variant in switch: One problem with enums and switch statements is that it is very easy to start from correct usage…
enum class Color {
    Red,
    Green,
    Blue
};

// ... much later, possibly in a different source file ...

void print_color(Color color) { 
    switch (color) {
      case Color::Red:
        std::cout << "Red" << std::endl;
        break;
      case Color::Green:
        std::cout << "Green" << std::endl;
        break;
      case Color::Blue:
        std::cout << "Blue" << std::endl;
    }
}

…and later break the code by adding new enum variants without updating switch statements:

enum class Color {
    Red,
    Green,
    Blue,
    Pink  // This is new and support must be added to the switch statement
};

With -Wall, GCC will correctly point out that the switch statement must be updated.

-Wextra

As previously mentioned, the -Wall option only enables some of the available compiler warnings. More warnings can be enabled by additionally using the -Wextra option, which historically covered warnings that the compiler authors considered to provide a less favorable trade-off between true and false positives.

Nowadays, however, most programmers would consider the warnings enabled by -Wextra to point at code patterns that are still highly objectionable and worth eliminating.

Here are some examples of common problems that you can detect with -Wextra:
  • Implicit switch fallthrough: Starting from the incorrect switch statement example from the -Wall section, this would be one incorrect way to fix it:
void print_more_colors(Color color) { 
    switch (color) {
      case Color::Red:
        std::cout << "Red" << std::endl;
        break;
      case Color::Green:
        std::cout << "Green" << std::endl;
        break;
      case Color::Blue:
        std::cout << "Blue" << std::endl;
      case Color::Pink:  // This breaks the former Color::Blue case
        std::cout << "Pink" << std::endl;
    }
}

This is incorrect because C++ switch statements infamously fall through by default, which means that if the switch statement is invoked with color set to Color::Blue, the program will continue executing code for other cases (in this case Color::Pink), and thus output both “Blue” and “Pink” in succession. With -Wextra, GCC will rightly point out that a break; statement should be inserted before the start of the Color::Pink case.
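
Here is a sketch of the fixed function (renamed for illustration). Note that when fallthrough is genuinely intended, the C++17 [[fallthrough]]; attribute can be used to document that intent and silence the warning:

void print_more_colors_fixed(Color color) {
    switch (color) {
      case Color::Red:
        std::cout << "Red" << std::endl;
        break;
      case Color::Green:
        std::cout << "Green" << std::endl;
        break;
      case Color::Blue:
        std::cout << "Blue" << std::endl;
        break;  // This break restores the expected Color::Blue behavior
      case Color::Pink:
        std::cout << "Pink" << std::endl;
    }
}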

  • Missing field initializations: If you have a struct with multiple fields such as…
struct MultiField {
    int a;
    float b;
    int c;
};

…then C++ allows only some of the fields to be initialized during construction:

MultiField make_multi_field() {
    return { 12, 3.4 };
}

When you do this, only the explicitly listed struct fields receive the specified values, and the remaining ones are value-initialized to zero, which is rarely what was intended. This surprising behavior commonly happens as a result of adding new struct fields without updating code that instantiates the struct, and because silently zeroed fields are easily mistaken for meaningful data, it is a dangerous mistake. GCC will therefore warn about this pattern in -Wextra mode.
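
Two common defenses against this, sketched below with illustrative names: give the struct fields default member initializers, or explicitly initialize every field at the construction site:

struct SaferMultiField {
    int a = 0;
    float b = 0.0f;
    int c = 0;  // Fields added later should also get an explicit default
};

MultiField make_full_multi_field() {
    return { 12, 3.4f, 56 };  // All three fields explicitly initialized
}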

  • Signed/unsigned integer type confusion: The C++ community has had many heated debates about where signed vs unsigned integer types should be used. One thing remains constant, however: programmers often pick an inappropriate integer type for their application, with disastrous results. Here are two examples that are detected by GCC in -Wextra mode:
void suspicious_int_conditions(int s, unsigned u) {
    // You probably meant to use a signed integer here
    if (u < 0) { /* Never executed */ }

    // The signed integer is reinterpreted as unsigned, with negative numbers
    // mapped into positive ones. This is likely to produce surprising results.
    while (s < u) { /* ... */ }
}
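
If C++20 is available to you, the <utility> header provides integer comparison helpers that handle mixed signedness correctly, as in this sketch:

#include <utility>

void safer_int_conditions(int s, unsigned u) {
    // std::cmp_less compares mathematical values: a negative s is
    // correctly considered smaller than any unsigned u
    while (std::cmp_less(s, u)) { /* ... */ }
}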

Hopefully, you will agree that there is little point in using only -Wall without -Wextra these days, and both options should normally be used.
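
In a simple command-line build, this just means adding both options to every compiler invocation, e.g.:

g++ -Wall -Wextra -c your_program.cpp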

Other correctness-oriented warnings

At this point, it will probably not surprise you that the -Wall -Wextra combination still does not enable the full error-checking power of GCC and clang. Among the many other available options, you may specifically want to enable…

  • -Wshadow: This detects situations where a local variable has the same name as a variable from an enclosing scope, such as a global variable. This is well-defined by the language (the innermost variable takes precedence) but can easily confuse a reader who is aware of the other variable's existence. See the sketch after this list for an example.
  • -Wnon-virtual-dtor: This detects situations where a class has virtual methods, and is thus meant to be used via a pointer-to-base-class, but does not have the virtual destructor that this usage normally requires for correct resource cleanup.
  • -Woverloaded-virtual: This detects situations where code creates an overload of a virtual method, rather than an override, which is usually a mistake coming from an incorrect or outdated copy of the virtual method’s signature.
  • -Wconversion detects risks of loss of information caused by implicit conversions (e.g. trying to store a size_t into a char will normally lead to truncation).
  • -Wduplicated-cond detects the if (cond) { /*...*/ } else if (cond) { /*...*/ } duplicate condition pattern, which usually comes from copy-and-paste errors.
  • -Wduplicated-branches detects the if (cond1) { stuff(); } else { stuff(); } duplicate branch pattern, which also normally comes from copy-and-paste errors.
  • -Wnull-dereference detects simple attempts to dereference a null pointer, which is undefined behavior and thus strictly forbidden in any C++ program.
  • -Wdouble-promotion detects situations where a computation was seemingly intended to be carried out in single precision, but is actually carried out in (potentially slower) double precision due to the use of double-precision literals.
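
As an illustration, here is a small sketch (all names are ours) that triggers two of these warnings when compiled with -Wshadow and -Wconversion:

#include <cstddef>

int global_count = 0;

char shadow_and_truncate(std::size_t size) {
    int global_count = 42;       // -Wshadow: hides ::global_count
    return size + global_count;  // -Wconversion: size_t truncated into char
}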

-Wpedantic aka -pedantic

While both -Wall and -Wextra are mainly about correctness, this option is mainly about portability. By default, GCC and clang both provide a number of “GNU” extensions to the C and C++ standards which are not available in other compilers, including but far from limited to…

  • Mathematical constants like M_PI in <cmath>/<math.h>.
  • Support for the void main() main function signature.
  • C-style arrays whose length is 0 or determined at runtime.
  • __restrict__ manual alias analysis optimizations.
  • Hardware-agnostic SIMD support.

Many of these extensions are extremely widely used and could be considered bugfixes for shortcomings of the C++ standard. Nonetheless, they are not part of the standard, not available in other compilers, and a program that uses them is technically not a true C++ program.

The synonymous -pedantic and -Wpedantic command line options emit warnings when some of these extensions are used. If you care in any capacity about the portability of your code to compilers other than GCC and clang, including the Microsoft, NVidia and Intel compilers, you should consider using these compiler options in order to reduce the risk of accidentally relying on GCC- or clang-specific features.

Beware, however, that -Wpedantic does not detect usage of all compiler extensions and is therefore not a substitute for running test builds for all compilers you care about in your project’s continuous integration infrastructure.
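
For example, the following sketch uses a variable-length array, a C feature that GCC and clang accept in C++ as an extension, and that -Wpedantic will flag as non-standard:

void runtime_length_array(int n) {
    float buffer[n];  // -Wpedantic: ISO C++ forbids variable length arrays
    buffer[0] = 0.0f;
    /* ... */
}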

-Weverything

This clang-specific command line option is what most programmers agree -Wall should have been. It enables every single clang warning. It is commonly combined with a set of -Wno-xyz options (e.g. -Wno-shadow), which disable specific compiler warnings that the developers themselves (not the compiler authors) deemed to have an unfavorable cost/benefit trade-off for their project.
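
A typical invocation might therefore look like this, with the exact set of opt-outs being a per-project choice:

clang++ -Weverything -Wno-c++98-compat -Wno-padded -c your_program.cpp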

-Werror

Because most compiler warnings point at code constructs that are suspicious, but not illegal according to the language rules, warnings do not normally cause compilation to fail.

For everyday development builds, this is undesirable: it commonly leads to warnings being ignored by developers, or to scripts happily proceeding to run tests or other steps on a build that does not even pass the compiler’s internal checks, burying the outcome of these checks deep inside old build logs.

In these situations, the -Werror compiler option, which turns warnings into hard errors that fail the build, can be used to ensure that the build fails promptly when warnings are present, without burying the warnings below thousands of lines of unrelated script output.

However, -Werror should not be made a mandatory part of every non-development build of the project, because it means that a single compiler update that adds a new warning is all it takes to break the build for everyone, including mere users of the project who are not working on it and would not want to write patches to fix the warnings.

Exercise

The analyzers/mdbook_snippets.cpp file contains all the source code examples of this chapter, and can be compiled with the make command, using the Makefile located in the same directory.

Try to modify the Makefile’s first line (CXXFLAGS variable definition) to enable some of the warnings discussed in this chapter. Then rebuild using this new configuration (you may need to run make clean before make), and see how the compiler starts to report the various issues that were discussed previously.

clang-format

Modern programming languages allow a single program to be written in many different ways, with different amounts of spaces, line feeds, parentheses, and so on. This creates a conundrum:

  • On one hand, programmers who learn/work on their own tend to develop, over time, a distinctive way to lay out code in the source file that they are most comfortable with.
  • On the other hand, programs which are written by many different programmers are a lot more readable when everyone agrees to follow a common style than when each individual developer lays out the code in their own preferred/signature style.

The universally accepted solution to this problem is to standardize on a common coding style whose rules are simple enough to be easily machine-enforced. Tools can then be used to automatically translate the entire project’s code to this style, both at adoption time and whenever new code gets integrated into the common repository.

It is inevitable that the simple rules enforced by the tool will, from time to time, produce results that everyone agrees look bad. To some extent, this can be addressed by making the rules more complex, which is why automatic code formatters tend to have a bewildering number of configuration options.

One should not forget, however, that attempts to improve code formatting often translate into hours of team arguments and configuration trial and error with near zero useful code production. The best is the enemy of the good, and staying as close as possible to the default tool configuration, however perfectible it may be, has the major advantage of quickly giving you the main benefit of automatic code formatting (standardizing the project’s coding style with minimal manual effort) without paying too many costs in return (endless team arguments about ideal code style).

In C++, the generally agreed standard for automatic code formatting is the clang-format utility. It internally uses the clang compiler, but can be used on any project even if the build is done using GCC or MSVC, as long as clang still manages to make basic sense of the source code. It is very easy to use: just point it at a source file with the -i flag for in-place reformatting, and it will adjust the formatting according to the default style:

clang-format -i source.cpp
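
If you do end up deviating from the defaults, the configuration lives in a .clang-format file at the root of the repository. A minimal sketch, with option values chosen purely for illustration:

# .clang-format
BasedOnStyle: LLVM
IndentWidth: 4
ColumnLimit: 100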

With a bit of extra work, you can also configure your code editor to automatically reformat source files whenever they are saved, or configure version control systems like git so that each modified source file is reformatted before creating a commit. Finally, as a last resort, your project’s continuous integration should check that the source files are formatted correctly before code is integrated into the project’s master branch. We’ll talk more about continuous integration later in this course.

Exercise

Because it is used as part of a static site generation pipeline and must match the indentation expectations of the underlying HTML generation tool, the mdbook_snippets.cpp file has a very unusual style that you may find hard to read.

Try to automatically reformat it using clang-format, then visually inspect the result using your favorite text editor. Are there any formatter decisions that you find yourself disagreeing with?

Dynamic analysis

clang-format and compiler warnings are two examples of static program analysis: they can analyze and transform source code without running the underlying program, and reach conclusions that are valid no matter which input values the program is called with, or which operating system and hardware is used to execute the program. Ultimately, however, there are many classes of problems in your code that static analysis cannot find, for two key reasons.

First, static analysis is hard. When working on static analysis tools, it is surprisingly easy to encounter edge cases where a precise analysis algorithm, which tries to always return the correct result, would take an extremely long time to run or not manage to reach a conclusion at all.

Second, some information that is critical to program correctness is not known until runtime. For example, if a C++ program indexes an array with an integer that is specified by the user through the command line interface or API, whether that array access is in bounds (correct well-defined behavior) or out of bounds (incorrect undefined behavior) cannot be determined at compile time. A static analyzer could pessimistically flag this code pattern as potentially flawed, and some do. But this would inevitably have false positives, which would reduce programmer trust in the analyzer.
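
For example, in the following sketch, only the value typed by the user decides whether the array access is in bounds:

#include <cstdlib>

int main(int argc, char** argv) {
    int values[4] = {1, 2, 3, 4};
    if (argc < 2) return 0;
    // In bounds or not? Only the user-provided value decides.
    return values[std::atoi(argv[1])];
}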

The solution to both of these problems is to stop relying on static analysis alone, and instead complement it with dynamic analysis, which analyzes the runtime behavior of the program during the execution of one or more specific test jobs, in a specific environment and with specific inputs.

Of course, one general drawback of dynamic analysis is that it will only find errors in that particular execution configuration. If your program is susceptible to out-of-bounds array accesses, but your test workloads do not trigger any such access, then dynamic analysis will not report the problem. In other words, dynamic analysis is only as good as your test suite is.

Valgrind

Valgrind uses a form of dynamic analysis called dynamic binary instrumentation. You can think of it as running your program in a kind of x86 CPU emulator (technically a JIT) that supports injection of arbitrary checks every time an instruction is executed, a basic block is entered/exited, etc.

One advantage of this approach is that it can be applied to any program without any preparation (although availability of debug info will greatly improve error messages). The drawbacks are that…

  • Some information from the source code (e.g. various categories of undefined behavior) is missing in the final binary, which limits the amount of problems that can be detected.
  • Instrumentation must be shoehorned into an existing binary, without invalidating code and data pointers or modifying existing control flow. As a result, it cannot be implemented very efficiently, which commonly results in execution being slowed down by ~30x. You will want to account for this by running valgrind-instrumented programs on smaller workloads.

To keep the implementation simple, valgrind also serializes execution of all program threads onto a single OS thread. This means that parallel programs will not get any speedup from the availability of multiple CPU cores, resulting in an even greater valgrind-induced slowdown.

For program correctness checking purposes, the most useful valgrind tool is Memcheck, which is conveniently the one that is enabled by default when you run valgrind without any extra option. Memcheck checks for various memory usage errors, including…

  • Out-of-bounds accesses aka “buffer overflow”.
  • Use of uninitialized memory, which is undefined behavior.
  • Use-after-free of heap-allocated data.
  • Invalid frees, aka trying to free a pointer that was not previously allocated by the matching allocator (or was already freed before).
  • Memory leaks, aka allocating data and never freeing it.

To reduce overhead and false positives, some checks are disabled by default. See the command-line options documentation for more info about these and how they can be enabled when needed.
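
As an illustration, running the following deliberately buggy sketch under valgrind will produce several Memcheck reports:

int main() {
    int* data = new int[10];  // Allocated but never freed: memory leak
    data[10] = 42;            // Out-of-bounds write: "Invalid write of size 4"
    return data[10];          // Out-of-bounds read: "Invalid read of size 4"
}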

The easiest way to run your program through memcheck is to first make sure that your program is built with debug symbols (enabled by GCC/clang option -g) and then run it through valgrind using this sort of command line invocation:

# Simple version: just run "your_program" under valgrind's watch.
valgrind ./your_program

# More complex version, with extra arguments to both valgrind and your program
#
# To make the command more readable, using -- to separate valgrind's options
# from the command line that valgrind is monitoring is recommended.
valgrind --leak-check=full --track-origins=yes -- ./your_program arg1 arg2

Sanitizers

Some problems cannot be found by binary instrumentation like valgrind, either because the relevant information is not present in final binaries or because the slowdown brought by the dynamic instrumentation is so great that getting to the point of program execution where the error is triggered would take an excessive amount of time and machine resources.

In those situations, better results can be obtained by using a more invasive form of dynamic analysis, where the compiler injects supplementary error-checking code during program compilation. In C++, this is most commonly done using the Sanitizer tool family:

  • AddressSanitizer and MemorySanitizer cover memory errors similar to those found by valgrind’s Memcheck, but at a much lower runtime performance cost.
  • ThreadSanitizer and UndefinedBehaviorSanitizer cover many more cases of undefined behavior that Valgrind cannot reliably detect because un-instrumented binaries do not contain enough information about what the original source code was doing.
  • Other sanitizers are available, but they cover quite specialized failure modes and require more involved setup, so they are only used in situations where the high cost of failure justifies extra effort (e.g. security-critical software that processes untrusted user input).

In exchange for detecting more bugs and slowing down program execution less than valgrind, sanitizers bring several limitations of their own:

  • Due to their deep integration with compiler internals, they work best when used with the compiler and libc/STL implementation that they were developed against, typically clang/libc++. Other compilers like GCC may try to support them too, but will often do so less reliably (expect hard-to-debug crashes and some missed error detection).
  • Undefined behavior is a program-wide property: it may not directly happen in the code that you wrote, but instead in libc or other libraries that your program uses, as a result of an invalid API request. Complete error detection therefore requires recompiling every single dependency of the program, down to libc, with sanitizer instrumentation. Source-based package managers like Spack may help with this, but in general achieving good sanitizer coverage in larger software projects is best done with help from the project’s computing infrastructure experts.

To use sanitizers, just pass the right -fsanitize=xyz option to your compiler during the build, as documented in your compiler’s manual. Beware that some sanitizers are incompatible with each other, and that even those which can be combined will use more runtime resources together, so you may need multiple full-stack software builds to cover all the sanitizers that you are interested in. Use of a dedicated, highly specced build & test server is therefore recommended.
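
For example, with clang, one could set up separate instrumented builds along these lines (file names are illustrative):

# AddressSanitizer and UndefinedBehaviorSanitizer can be combined in one build
clang++ -fsanitize=address,undefined -g -O1 your_program.cpp -o your_program_asan

# ThreadSanitizer is incompatible with AddressSanitizer and needs its own build
clang++ -fsanitize=thread -g -O1 your_program.cpp -o your_program_tsan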

It’s also worth noting that, by default, sanitizers don’t always stop the program when an error is detected, which can result in duplicated error reports and errors going unnoticed in automated test runs. On operating systems where fail-on-error is not the default behavior, you can enforce it using environment variables, e.g.

export ASAN_OPTIONS=abort_on_error=1:halt_on_error=1
export UBSAN_OPTIONS=abort_on_error=1:halt_on_error=1

Exercise

As previously discussed, the ability of GCC’s warnings to detect undefined behavior, like incorrect memory usage, is unfortunately very limited.

In particular, the out-of-bounds array access that the fortran_iteration() function performs is not detected when compiler optimizations are not enabled.

Experiment with valgrind and sanitizers, and see if you can find a tool configuration that succeeds at detecting this out-of-bounds memory access.

Conclusion

Where tools won’t help

It is easy to get carried away and add lots and lots of automated tools to your software build and test infrastructure, in the hope of finding more and more errors, but keep in mind that…

  • Each tool you add slows down your automated test routine and requires additional long-term maintenance effort. Because of this, you should normally only use one tool per category of error that you want to detect. Prefer a few good tools over many bad ones.
  • Tools are mainly focused on helping you follow the language rules and avoid suspicious constructs (areas of the language that are known to be hard to use, or useless code that suggests a typo has been made). Tools will not normally find more advanced logic errors, such as incorrect math, hardcoded reduced-precision math constants (e.g. #define PI 3.14) or use of exact comparison for validation of floating-point results.
  • To detect these sorts of errors and make the most of dynamic analysis tools, you will want to have a good test suite that covers as much of your code as possible, in a way that is as fine-grained as possible. We will explore this further in the remainder of this course.

Want more tools anyway?

Here are some other tools that you may want to experiment with, which may find extra issues beyond the ones covered by the tools we discussed here.