How should I detect unnecessary #include files in a large C++ project?

C++, Visual Studio 2008, Include, Header, Dependencies

C++ Problem Overview


I am working on a large C++ project in Visual Studio 2008, and there are a lot of files with unnecessary #include directives. Sometimes the #includes are just artifacts and everything will compile fine with them removed, and in other cases classes could be forward declared and the #include could be moved to the .cpp file. Are there any good tools for detecting both of these cases?

C++ Solutions


Solution 1 - C++

While it won't reveal unneeded include files, Visual Studio has a setting /showIncludes (right click on a .cpp file, Properties->C/C++->Advanced) that will output a tree of all included files at compile time. This can help in identifying files that shouldn't need to be included.
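The /showIncludes output gets long quickly, so it can help to summarise it. The following is a minimal sketch, assuming the English-locale output format in which each line starts with "Note: including file:" followed by spaces that grow with nesting depth. The function name count_included_headers is made up for this example.

```cpp
#include <cstddef>
#include <map>
#include <sstream>
#include <string>

// Count how often each header path appears in a build log that contains
// /showIncludes output. Assumption: include notes look like
// "Note: including file:   path" (English-locale MSVC output).
std::map<std::string, int> count_included_headers(const std::string& build_log) {
    const std::string prefix = "Note: including file:";
    std::map<std::string, int> counts;
    std::istringstream lines(build_log);
    std::string line;
    while (std::getline(lines, line)) {
        // Skip everything that is not an include note.
        if (line.compare(0, prefix.size(), prefix) != 0) continue;
        // The path starts after the indentation spaces.
        std::size_t path_start = line.find_first_not_of(' ', prefix.size());
        if (path_start == std::string::npos) continue;
        std::string path = line.substr(path_start);
        // Tolerate logs with Windows line endings.
        if (!path.empty() && path.back() == '\r') path.pop_back();
        if (path.empty()) continue;
        ++counts[path];
    }
    return counts;
}
```

Run over the log of a full build, headers with the highest counts are the ones pulled into the most places, which makes them reasonable candidates to inspect first.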

You can also look at the pimpl idiom, which lets you get away with fewer header file dependencies and makes it easier to see the cruft that you can remove.
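For readers who haven't used pimpl, here is a minimal sketch of the idea. It assumes a compiler with std::unique_ptr (so newer than Visual Studio 2008, where a raw pointer or a Boost smart pointer would play the same role), and the names Widget, widget.h and widget.cpp are hypothetical.

```cpp
#include <memory>
#include <string>
#include <utility>

// --- what would live in widget.h ---
class Widget {
public:
    explicit Widget(std::string name);
    ~Widget();                          // defined where Impl is a complete type
    const std::string& name() const;
private:
    struct Impl;                        // forward declaration only
    std::unique_ptr<Impl> impl_;        // the header needs no details of Impl
};

// --- what would live in widget.cpp ---
// Includes needed only by the implementation would go here,
// so clients of widget.h never see them.
struct Widget::Impl {
    std::string name;
};

Widget::Widget(std::string name) : impl_(new Impl{std::move(name)}) {}
Widget::~Widget() = default;
const std::string& Widget::name() const { return impl_->name; }
```

The header now only depends on what the public interface mentions, so anything else it still includes is more likely to be removable.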

Solution 2 - C++

[PC-Lint][1] works quite well for this, and it finds all sorts of other goofy problems for you too. It has command line options that can be used to create External Tools in Visual Studio, but I've found that the [Visual Lint][2] add-in is easier to work with. Even the free version of Visual Lint helps. But give PC-Lint a shot. Configuring it so it doesn't give you too many warnings takes a bit of time, but you'll be amazed at what it turns up.

[1]: http://www.gimpel.com/html/pcl.htm "PC Lint"
[2]: http://www.riverblade.co.uk/products/visual_lint/index.html "Visual Lint"

Solution 3 - C++

There's a new Clang-based tool, include-what-you-use, that aims to do this.

Solution 4 - C++

!!DISCLAIMER!! I work on a commercial static analysis tool (not PC Lint). !!DISCLAIMER!!

There are several issues with a simple non-parsing approach:

  1. Overload Sets:

It's possible that an overloaded function has declarations that come from different files. It might be that removing one header file results in a different overload being chosen rather than a compile error! The result will be a silent change in semantics that may be very difficult to track down afterwards.

  2. Template specializations:

Similar to the overload example, if you have partial or explicit specializations for a template you want them all to be visible when the template is used. It might be that specializations for the primary template are in different header files. Removing the header with the specialization will not cause a compile error, but may result in undefined behaviour if that specialization would have been selected. (See: https://stackoverflow.com/questions/59331/visibility-of-template-specialization-of-c-function)
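Both hazards can be sketched in a single file. In the example below, two namespaces stand in for the same translation unit compiled with and without the extra headers. All names, including the header names in the comments, are hypothetical. In a real project the specialization case can additionally lead to undefined behaviour, which a single-file sketch like this cannot reproduce.

```cpp
#include <string>

// Situation A: both (hypothetical) headers are included.
namespace both_headers {
    // from "number_format.h"
    inline std::string describe(double) { return "double overload"; }
    // from "int_format.h"
    inline std::string describe(int) { return "int overload"; }

    // primary template, from "traits.h"
    template <typename T> struct Traits {
        static std::string name() { return "generic"; }
    };
    // explicit specialization, from "int_traits.h"
    template <> struct Traits<int> {
        static std::string name() { return "int specialization"; }
    };

    inline std::string call_overload() { return describe(42); }
    inline std::string call_traits() { return Traits<int>::name(); }
}

// Situation B: the headers supplying the int versions were removed.
namespace header_removed {
    inline std::string describe(double) { return "double overload"; }

    template <typename T> struct Traits {
        static std::string name() { return "generic"; }
    };

    // Still compiles: 42 is silently converted to double.
    inline std::string call_overload() { return describe(42); }
    // Still compiles: the primary template is used instead.
    inline std::string call_traits() { return Traits<int>::name(); }
}
```

The calling code is identical in both namespaces and compiles in both, yet it reaches different functions. A tool that only checks "does it still build" would not notice.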

As pointed out by 'msalters', performing a full analysis of the code also allows for analysis of class usage. By checking how a class is used through a specific path of files, it is possible that the definition of the class (and therefore all of its dependencies) can be removed completely or at least moved to a level closer to the main source in the include tree.

Solution 5 - C++

I don't know of any such tools, and I have thought about writing one in the past, but it turns out that this is a difficult problem to solve.

Say your source file includes a.h and b.h; a.h contains #define USE_FEATURE_X and b.h uses #ifdef USE_FEATURE_X. If #include "a.h" is commented out, your file may still compile, but may not do what you expect. Detecting this programmatically is non-trivial.

Whatever tool does this would need to know your build environment as well. If a.h looks like:

#if defined( WINNT )
   #define USE_FEATURE_X
#endif

Then USE_FEATURE_X is only defined if WINNT is defined, so the tool would need to know what directives are generated by the compiler itself as well as which ones are specified in the compile command rather than in a header file.
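To make the dependency concrete, here is a single-file sketch with the two headers inlined. The function active_code_path and the constant kWinntDefined are made up for illustration; only WINNT and USE_FEATURE_X come from the example above.

```cpp
#include <string>

// --- stand-in for a.h ---
#if defined(WINNT)
#define USE_FEATURE_X
#endif

// --- stand-in for b.h ---
// Which branch is compiled depends on a macro that b.h never defines itself.
inline std::string active_code_path() {
#ifdef USE_FEATURE_X
    return "feature X";
#else
    return "fallback";
#endif
}

// Records whether the build environment (compiler or command line) defined WINNT.
#if defined(WINNT)
const bool kWinntDefined = true;
#else
const bool kWinntDefined = false;
#endif
```

Deleting the a.h section leaves the file compiling either way, but on a build where WINNT is defined the function would quietly switch from "feature X" to "fallback".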

Solution 6 - C++

Like Timmermans, I'm not familiar with any tools for this. But I have known programmers who wrote a Perl (or Python) script to try commenting out each include line one at a time and then compile each file.
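The core of such a script is small. Below is a rough C++ sketch of just the text-manipulation step: given the contents of a source file, it produces one variant per #include line with that line commented out. The function names are made up for this example, and actually writing each variant to disk and invoking the compiler is left out because it depends on the build setup.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Split text into lines, dropping the '\n' terminators.
static std::vector<std::string> split_lines(const std::string& text) {
    std::vector<std::string> lines;
    std::istringstream in(text);
    std::string line;
    while (std::getline(in, line)) lines.push_back(line);
    return lines;
}

// True for "#include ..." lines, allowing leading whitespace and "# include".
static bool is_include_line(const std::string& line) {
    std::size_t i = line.find_first_not_of(" \t");
    if (i == std::string::npos || line[i] != '#') return false;
    i = line.find_first_not_of(" \t", i + 1);
    return i != std::string::npos && line.compare(i, 7, "include") == 0;
}

// One variant of the source per #include line, with that line commented out.
std::vector<std::string> variants_without_each_include(const std::string& source) {
    const std::vector<std::string> lines = split_lines(source);
    std::vector<std::string> variants;
    for (std::size_t skip = 0; skip < lines.size(); ++skip) {
        if (!is_include_line(lines[skip])) continue;
        std::string variant;
        for (std::size_t i = 0; i < lines.size(); ++i) {
            if (i == skip) variant += "// ";
            variant += lines[i];
            variant += '\n';
        }
        variants.push_back(variant);
    }
    return variants;
}
```

Each variant that still compiles marks a candidate for removal, not a proof: the overload, specialization and macro problems described in the other answers mean a successful build does not guarantee unchanged behaviour.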


It appears that now Eric Raymond has a tool for this.

Google's cpplint.py has an "include what you use" rule (among many others), but as far as I can tell, no "include only what you use." Even so, it can be useful.

Solution 7 - C++

If you're interested in this topic in general, you might want to check out Lakos' [Large Scale C++ Software Design][1]. It's a bit dated, but goes into lots of "physical design" issues like finding the absolute minimum of headers that need to be included. I haven't really seen this sort of thing discussed anywhere else.

[1]: http://www.amazon.com/Large-Scale-Software-Addison-Wesley-Professional-Computing/dp/0201633620/ref=pd_bbs_sr_1?ie=UTF8&s=books&qid=1226092585&sr=8-1 "Large-Scale C++ Software Design"

Solution 8 - C++

Give Include Manager a try. It integrates easily in Visual Studio and visualizes your include paths which helps you to find unnecessary stuff. Internally it uses Graphviz but there are many more cool features. And although it is a commercial product it has a very low price.

Solution 9 - C++

You can build an include graph using C/C++ Include File Dependencies Watcher, and find unneeded includes visually.

Solution 10 - C++

If your header files generally start with

#ifndef __SOMEHEADER_H__
#define __SOMEHEADER_H__
// header contents
#endif

(as opposed to using #pragma once) you could change that to:

#ifndef __SOMEHEADER_H__
#define __SOMEHEADER_H__
// header contents
#else 
#pragma message("Someheader.h superfluously included")
#endif

And since the compiler outputs the name of the cpp file being compiled, that would let you know at least which cpp file is causing the header to be brought in multiple times.

Solution 11 - C++

PC-Lint can indeed do this. One easy way to do this is to configure it to detect just unused include files and ignore all other issues. This is pretty straightforward - to enable just message 766 ("Header file not used in module"), just include the options -w0 +e766 on the command line.

The same approach can also be used with related messages such as 964 ("Header file not directly used in module") and 966 ("Indirectly included header file not used in module").

FWIW I wrote about this in more detail in a blog post last week at http://www.riverblade.co.uk/blog.php?archive=2008_09_01_archive.xml#3575027665614976318.

Solution 12 - C++

Adding one or both of the following #defines before the Windows headers are included will exclude often unnecessary header files and may substantially improve compile times, especially if the code is not using many Windows API functions.

#define WIN32_LEAN_AND_MEAN
#define VC_EXTRALEAN

See http://support.microsoft.com/kb/166474

Solution 13 - C++

If you are looking to remove unnecessary #include files in order to decrease build times, your time and money might be better spent parallelizing your build process using cl.exe /MP, make -j, Xoreax IncrediBuild, distcc/icecream, etc.

Of course, if you already have a parallel build process and you're still trying to speed it up, then by all means clean up your #include directives and remove those unnecessary dependencies.

Solution 14 - C++

Start with each include file, and ensure that each include file only includes what is necessary to compile itself. Any include files that are then missing for the C++ files can be added to the C++ files themselves.

For each include and source file, comment out each include file one at a time and see if it compiles.

It is also a good idea to sort the include files alphabetically, and where this is not possible, add a comment.

Solution 15 - C++

If you aren't already, using a precompiled header to include everything that you're not going to change (platform headers, external SDK headers, or static already completed pieces of your project) will make a huge difference in build times.

http://msdn.microsoft.com/en-us/library/szfdksca(VS.71).aspx

Also, although it may be too late for your project, organizing your project into sections and not lumping all local headers into one big main header is a good practice, although it takes a little extra work.

Solution 16 - C++

If you work with Eclipse CDT, you could try out http://includator.com to optimize your include structure. However, Includator might not know enough about VC++'s predefined includes, and setting up CDT to use VC++ with correct includes is not built into CDT yet.

Solution 17 - C++

The latest Jetbrains IDE, CLion, automatically shows (in gray) the includes that are not used in the current file.

It is also possible to have the list of all the unused includes (and also functions, methods, etc...) from the IDE.

Solution 18 - C++

Some of the existing answers state that it's hard. That's indeed true, because you need a full compiler to detect the cases in which a forward declaration would be appropriate. You can't parse C++ without knowing what the symbols mean; the grammar is simply too ambiguous for that. You must know whether a certain name names a class (could be forward-declared) or a variable (can't). Also, you need to be namespace-aware.
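For reference, the case such a tool has to recognise looks roughly like the sketch below. The classes Car and Engine and the file names in the comments are hypothetical. The point is that the header only uses Engine through a pointer and a reference, so a forward declaration is enough there, while the source file needs the full definition.

```cpp
#include <string>

// --- what would live in car.h ---
// Forward declaration: enough for pointers, references, and
// function declarations that mention the type.
class Engine;

class Car {
public:
    explicit Car(Engine& engine) : engine_(&engine) {}
    std::string describe() const;   // defined where Engine is complete
private:
    Engine* engine_;                // a pointer does not need Engine's definition
};

// --- what would live in car.cpp, after #include "engine.h" ---
class Engine {
public:
    std::string type() const { return "V8"; }
};

// Calling a member function requires the complete type.
std::string Car::describe() const { return "car with " + engine_->type(); }
```

Deciding that the header is in the first situation means knowing that Engine names a class and seeing every way it is used, which is exactly the information only a full front end has.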

Solution 19 - C++

Maybe a little late, but I once found a WebKit perl script that did just what you wanted. It'll need some adapting I believe (I'm not well versed in perl), but it should do the trick:

http://trac.webkit.org/browser/branches/old/safari-3-2-branch/WebKitTools/Scripts/find-extra-includes

(this is an old branch because trunk doesn't have the file anymore)

Solution 20 - C++

If there's a particular header that you think isn't needed anymore (say string.h), you can comment out that include then put this below all the includes:

#ifdef _STRING_H_
#  error string.h is included indirectly
#endif

Of course your interface headers might use a different #define convention to record their inclusion in CPP memory. Or no convention, in which case this approach won't work.

Then rebuild. There are three possibilities:

  • It builds ok. string.h wasn't compile-critical, and the include for it can be removed.

  • The #error trips. string.h was included indirectly somehow. You still don't know if string.h is required. If it is required, you should directly #include it (see below).

  • You get some other compilation error. string.h was needed and isn't being included indirectly, so the include was correct to begin with.

Note that depending on indirect inclusion when your .h or .c directly uses another .h is almost certainly a bug: you are in effect promising that your code will only require that header as long as some other header you're using requires it, which probably isn't what you meant.

The caveats mentioned in other answers about headers that modify behavior, rather than declaring things whose absence would cause build failures, apply here as well.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

| Content Type | Original Author | Original Content on Stackoverflow |
| --- | --- | --- |
| Question | shambolic | View Question on Stackoverflow |
| Solution 1 - C++ | Eclipse | View Answer on Stackoverflow |
| Solution 2 - C++ | Joe | View Answer on Stackoverflow |
| Solution 3 - C++ | Josh Kelley | View Answer on Stackoverflow |
| Solution 4 - C++ | Richard Corden | View Answer on Stackoverflow |
| Solution 5 - C++ | Graeme Perrow | View Answer on Stackoverflow |
| Solution 6 - C++ | Max Lybbert | View Answer on Stackoverflow |
| Solution 7 - C++ | Adrian | View Answer on Stackoverflow |
| Solution 8 - C++ | Alex | View Answer on Stackoverflow |
| Solution 9 - C++ | Vladimir | View Answer on Stackoverflow |
| Solution 10 - C++ | Sam | View Answer on Stackoverflow |
| Solution 11 - C++ | Anna-Jayne Metcalfe | View Answer on Stackoverflow |
| Solution 12 - C++ | Roger Nelson | View Answer on Stackoverflow |
| Solution 13 - C++ | bk1e | View Answer on Stackoverflow |
| Solution 14 - C++ | selwyn | View Answer on Stackoverflow |
| Solution 15 - C++ | anon6439 | View Answer on Stackoverflow |
| Solution 16 - C++ | PeterSom | View Answer on Stackoverflow |
| Solution 17 - C++ | Jean-Michaël Celerier | View Answer on Stackoverflow |
| Solution 18 - C++ | MSalters | View Answer on Stackoverflow |
| Solution 19 - C++ | rubenvb | View Answer on Stackoverflow |
| Solution 20 - C++ | Britton Kerin | View Answer on Stackoverflow |