Detecting superfluous #includes in C/C++?

Tags: C++, C, Refactoring, Include, Dependencies

C++ Problem Overview


I often find that the headers section of a file gets larger and larger over time but never gets smaller. Throughout the life of a source file, classes may have been moved and refactored, and it's very possible that quite a few #includes no longer need to be there. Leaving them in only prolongs the compile time and adds unnecessary compilation dependencies. Trying to figure out which ones are still needed can be quite tedious.

Is there some kind of tool that can detect superfluous #include directives and suggest which ones I can safely remove?
Does lint do this maybe?

C++ Solutions


Solution 1 - C++

Google's cppclean can find several categories of C++ problems, and it can now find superfluous #includes.

There's also a Clang-based tool, include-what-you-use, that can do this. include-what-you-use can even suggest forward declarations (so you don't have to #include so much) and optionally clean up your #includes for you.
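To illustrate the kind of forward declaration such a tool might suggest, here is a minimal sketch. The names Engine, Car and "engine.hpp" are hypothetical and not taken from any real project; the point is only that a pointer member does not need the full class definition.

```cpp
#include <cassert>

// Forward declaration: enough for pointers and references to Engine,
// so this header does not need #include "engine.hpp" (hypothetical file).
class Engine;

class Car {
public:
    explicit Car(Engine* engine = nullptr) : engine_(engine) {}
    bool has_engine() const { return engine_ != nullptr; }

private:
    Engine* engine_;  // a pointer to an incomplete type is allowed
};

// Tiny usage check: Car works without Engine's definition being visible.
inline void forward_declaration_demo() {
    Car parked;
    assert(!parked.has_engine());
}
```

Only the .cpp file that actually calls Engine's member functions would then need to include the real header.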

Current versions of Eclipse CDT also have this functionality built in: going under the Source menu and clicking Organize Includes will alphabetize your #includes, add any headers that Eclipse thinks you're using without directly including them, and comment out any headers that it doesn't think you need. This feature isn't 100% reliable, however.

Solution 2 - C++

Also check out [include-what-you-use][1], which solves a similar problem.

[1]: https://github.com/include-what-you-use/include-what-you-use "include-what-you-use"

Solution 3 - C++

It's not automatic, but doxygen will produce dependency diagrams for #included files. You will have to go through them visually, but they can be very useful for getting a picture of what is using what.

Solution 4 - C++

The problem with detecting superfluous includes is that it can't be just a type dependency checker. A superfluous include is a file which provides nothing of value to the compilation and does not alter another item on which other files depend. There are many ways a header file can alter a compile: by defining a constant, by redefining and/or deleting a used macro, or by adding a namespace which alters the lookup of a name some way down the line. To detect things like the namespace case you need much more than a preprocessor; in fact, you almost need a full compiler.

Lint is more of a style checker and certainly won't have this full capability.

I think you'll find the only way to detect a superfluous include is to remove, compile and run suites.
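As a rough illustration of why "remove and compile" is not enough without tests, the sketch below simulates a header whose removal still compiles but changes behavior. The two namespaces stand in for the same source file built with and without a hypothetical "int_label.hpp"; none of these names come from a real code base.

```cpp
#include <cassert>
#include <string>

namespace with_header {
// Pretend this overload is provided by the hypothetical "int_label.hpp".
inline std::string label(int) { return "int overload"; }
// Pretend this one comes from a header the file always includes.
inline std::string label(double) { return "double overload"; }
inline std::string label_answer() { return label(42); }
}  // namespace with_header

namespace without_header {
// Same call site after the "superfluous-looking" include was removed.
inline std::string label(double) { return "double overload"; }
inline std::string label_answer() { return label(42); }  // int converts to double
}  // namespace without_header

// Both variants compile; only the selected overload differs.
inline void show_silent_change() {
    assert(with_header::label_answer() == "int overload");
    assert(without_header::label_answer() == "double overload");
}
```

A compile-only check would accept both variants, which is why running the test suites matters here.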

Solution 5 - C++

I thought that PCLint would do this, but it has been a few years since I've looked at it. You might check it out.

I looked at this blog and the author talked a bit about configuring PCLint to find unused includes. Might be worth a look.

Solution 6 - C++

The CScout refactoring browser can detect superfluous include directives in C (unfortunately not C++) code. You can find a description of how it works in this journal article.

Solution 7 - C++

Sorry to (re-)post here, people often don't expand comments.

Check my comment to crashmstr, FlexeLint / PC-Lint will do this for you. Informational message 766. Section 11.8.1 of my manual (version 8.0) discusses this.

Also, and this is important, keep iterating until the message goes away. In other words, after removing unused headers, re-run lint: more header files might have become "unneeded" once you remove some unneeded headers. (That might sound silly, but read it slowly and parse it; it makes sense.)

Solution 8 - C++

I've never found a full-fledged tool that accomplishes what you're asking. The closest thing I've used is IncludeManager, which graphs your header inclusion tree so you can visually spot things like headers included in only one file and circular header inclusions.

Solution 9 - C++

You can write a quick script that erases a single #include directive, compiles the project, and, if no compilation errors occurred, logs the name in the #include and the file it was removed from.

Let it run during the night, and the next day you will have a 100% correct list of include files you can remove.

Sometimes brute-force just works :-)


edit: and sometimes it doesn't :-). Here's a bit of information from the comments:

  1. Sometimes you can remove two header files separately, but not both together. A solution is to remove the header files during the run and not bring them back. This will find a list of files you can safely remove, although there might be a solution with more files to remove which this algorithm won't find. (It's a greedy search over the space of include files to remove. It will only find a local maximum.)
  2. There may be subtle changes in behavior if you have some macros redefined differently depending on some #ifdefs. I think these are very rare cases, and the unit tests which are part of the build should catch these changes.
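The greedy variant described in point 1 can be sketched as follows. This is only an outline under stated assumptions: greedy_prune is a hypothetical helper, and the builds callback stands in for "rewrite the file, compile the project and run the tests", which a real script would have to implement.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Greedy pass: try dropping each include in turn and keep it dropped
// whenever the (caller-supplied) build check still succeeds.
// Returns the includes that were removed.
std::vector<std::string> greedy_prune(
    std::vector<std::string> includes,
    const std::function<bool(const std::vector<std::string>&)>& builds) {
    assert(builds(includes));  // the starting point must build
    std::vector<std::string> removed;
    for (std::size_t i = 0; i < includes.size();) {
        std::vector<std::string> trial = includes;
        trial.erase(trial.begin() + static_cast<std::ptrdiff_t>(i));
        if (builds(trial)) {
            removed.push_back(includes[i]);
            includes = std::move(trial);  // never bring the header back
        } else {
            ++i;  // this include is still needed; try the next one
        }
    }
    return removed;
}
```

If either of two headers can be dropped but not both, this pass removes whichever it tries first and keeps the other, which is the local-maximum behavior mentioned above.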

Solution 10 - C++

I've tried using Flexelint (the Unix version of PC-Lint) and had somewhat mixed results. This is likely because I'm working on a very large and knotty code base. I recommend carefully examining each file that is reported as unused.

The main worry is false positives. Multiple includes of the same header are reported as an unneeded header. This is bad since Flexelint does not tell you what line the header is included on or where it was included before.

One of the ways automated tools can get this wrong:

In A.hpp:

class A { 
  // ...
};

In B.hpp:

#include "A.hpp"

class B {
    public:
        A foo;
};

In C.cpp:

#include "C.hpp"  

#include "B.hpp"  // <-- Unneeded, but lint reports it as needed
#include "A.hpp"  // <-- Needed, but lint reports it as unneeded

If you blindly follow the messages from Flexelint you'll muck up your #include dependencies. There are more pathological cases, but basically you're going to need to inspect the headers yourself for best results.

I highly recommend this article on Physical Structure and C++ from the blog Games from Within. It recommends a comprehensive approach to cleaning up the #include mess:

> Guidelines
>
> Here’s a distilled set of guidelines from Lakos’ book that minimize the number of physical dependencies between files. I’ve been using them for years and I’ve always been really happy with the results.
>
> 1. Every cpp file includes its own header file first. [snip]
> 2. A header file must include all the header files necessary to parse it. [snip]
> 3. A header file should have the bare minimum number of header files necessary to parse it. [snip]

Solution 11 - C++

If you are using Eclipse CDT you can try http://includator.com, which is free for beta testers (at the time of this writing) and automatically removes superfluous #includes or adds missing ones. For those users who have FlexeLint or PC-Lint and are using Eclipse CDT, http://linticator.com might be an option (also free for beta test). While it uses Lint's analysis, it provides quick-fixes that automatically remove the superfluous #include statements.

Solution 12 - C++

This article explains a technique for removing #includes that relies on Doxygen's parsing. It's just a Perl script, so it's quite easy to use.

Solution 13 - C++

CLion, the C/C++ IDE from JetBrains, detects redundant includes out-of-the-box. These are grayed-out in the editor, but there are also functions to optimise includes in the current file or whole project.

I've found that you pay for this functionality though; CLion takes a while to scan and analyse your project when first loaded.

Solution 14 - C++

Maybe a little late, but I once found a WebKit perl script that did just what you wanted. It'll need some adapting I believe (I'm not well versed in perl), but it should do the trick:

http://trac.webkit.org/browser/branches/old/safari-3-2-branch/WebKitTools/Scripts/find-extra-includes

(this is an old branch because trunk doesn't have the file anymore)

Solution 15 - C++

There are two types of superfluous #include files:

  1. A header file that the module (.c, .cpp) does not need at all.
  2. A header file that the module needs but that is included more than once, directly or indirectly.

In my experience, there are two ways that work well for detecting them:

  • gcc -H or cl.exe /showIncludes (addresses problem 2)

    In practice, you can export CFLAGS=-H before running make, provided the Makefiles do not override CFLAGS. Alternatively, as I did, you can create a cc/g++ wrapper that forcibly adds the -H option to each invocation of $(CC) and $(CXX), and prepend the wrapper's directory to the $PATH variable, so that make uses your wrapper command instead. Of course, your wrapper should invoke the real gcc compiler. This trick needs adjusting if your Makefile calls gcc directly instead of through $(CC), $(CXX), or the implicit rules.

You can also compile a single file by adjusting the command line. But if you want to clean up headers for the whole project, you can capture all the output with:

make clean

make 2>&1 | tee result.txt
  • PC-Lint/FlexeLint (addresses both problems 1 and 2)

    Make sure to add the +e766 option; this warning is about unused header files.

    pclint/flint  -vf   ...
    

    This makes pclint print the included header files, with nested header files indented appropriately.
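The gcc -H output captured in result.txt can be summarized with a small program. The sketch below assumes the usual -H line shape (one or more dots giving the nesting depth, a space, then the header path) and skips everything else; count_header_lines is a hypothetical helper, and whether a guarded header shows up again on re-inclusion depends on the compiler, so treat the counts as a starting point for manual inspection.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <map>
#include <sstream>  // std::istringstream, handy for feeding sample text
#include <string>

// Count how often each header path appears in `gcc -H` style output.
// Lines that do not look like "<dots> <path>" (compiler messages, the
// trailing include-guard hints, relative paths such as "./x.h") are ignored.
std::map<std::string, int> count_header_lines(std::istream& log) {
    std::map<std::string, int> counts;
    std::string line;
    while (std::getline(log, line)) {
        const std::size_t depth = line.find_first_not_of('.');
        if (depth == 0 || depth == std::string::npos || line[depth] != ' ') {
            continue;  // not a header line
        }
        assert(depth > 0);  // at least one leading dot at this point
        ++counts[line.substr(depth + 1)];
    }
    return counts;
}
```

Opening result.txt with std::ifstream and passing it to count_header_lines yields a map in which any path with a count above 1 was listed more than once.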

Solution 16 - C++

There is a free tool, Include File Dependencies Watcher, which can be integrated into Visual Studio. It shows superfluous #includes in red.

Solution 17 - C++

Here is a simple brute force way of identifying superfluous header includes. It's not perfect but eliminates the "obvious" unnecessary includes. Getting rid of these goes a long way in cleaning up the code.

The scripts can be accessed directly on GitHub.

Solution 18 - C++

Gimpel Software's PC Lint (http://www.gimpel.com/html/pcl.htm) can report when an include file has been included more than once in a compilation unit, but it can't find include files which are not needed in the way you are looking for.

Edit: It can. See itsmatt's answer (https://stackoverflow.com/questions/614794/c-c-detecting-superfluous-includes/614811#614811).

Solution 19 - C++

To end this discussion: the C++ preprocessor is Turing complete. Whether an include is superfluous is a semantic property. Hence, it follows from Rice's theorem that it is undecidable whether an include is superfluous. There CAN'T be a program that (always correctly) detects whether an include is superfluous.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

| Content Type | Original Author | Original Content on Stackoverflow |
| --- | --- | --- |
| Question | shoosh | View Question on Stackoverflow |
| Solution 1 - C++ | Josh Kelley | View Answer on Stackoverflow |
| Solution 2 - C++ | Tzafrir | View Answer on Stackoverflow |
| Solution 3 - C++ | anon | View Answer on Stackoverflow |
| Solution 4 - C++ | JaredPar | View Answer on Stackoverflow |
| Solution 5 - C++ | itsmatt | View Answer on Stackoverflow |
| Solution 6 - C++ | Diomidis Spinellis | View Answer on Stackoverflow |
| Solution 7 - C++ | Dan | View Answer on Stackoverflow |
| Solution 8 - C++ | Dan Olson | View Answer on Stackoverflow |
| Solution 9 - C++ | Gilad Naor | View Answer on Stackoverflow |
| Solution 10 - C++ | Ben Martin | View Answer on Stackoverflow |
| Solution 11 - C++ | PeterSom | View Answer on Stackoverflow |
| Solution 12 - C++ | Steve Gury | View Answer on Stackoverflow |
| Solution 13 - C++ | congusbongus | View Answer on Stackoverflow |
| Solution 14 - C++ | rubenvb | View Answer on Stackoverflow |
| Solution 15 - C++ | zhaorufei | View Answer on Stackoverflow |
| Solution 16 - C++ | Vladimir | View Answer on Stackoverflow |
| Solution 17 - C++ | ap-osd | View Answer on Stackoverflow |
| Solution 18 - C++ | crashmstr | View Answer on Stackoverflow |
| Solution 19 - C++ | Algoman | View Answer on Stackoverflow |