Maven docker cache dependencies

Tags: Java, Docker, Maven, Caching, Dockerfile

Java Problem Overview


I'm trying to use Docker to automate Maven builds. The project I want to build takes nearly 20 minutes to download all its dependencies, so I tried to build a Docker image that would cache them, but it doesn't seem to save them. My Dockerfile is:

FROM maven:alpine
RUN mkdir -p /usr/src/app
WORKDIR /usr/src/app
ADD pom.xml /usr/src/app
RUN mvn dependency:go-offline

The image builds, and it does download everything. However, the resulting image is the same size as the base maven:alpine image, so it doesn't seem to have cached the dependencies in the image. When I use the image to run mvn compile, it goes through the full 20 minutes of re-downloading everything.

Is it possible to build a Maven image that caches my dependencies so they don't have to be downloaded every time I use the image to perform a build?

I'm running the following commands:

docker build -t my-maven .

docker run -it --rm --name my-maven-project -v "$PWD":/usr/src/mymaven -w /usr/src/mymaven my-maven mvn compile

My understanding is that whatever RUN does during the docker build process becomes part of the resulting image.

Java Solutions


Solution 1 - Java

Usually pom.xml does not change between image builds; only the rest of the source code does. In that case you can order the Dockerfile like this:

FROM maven:3-jdk-8

ENV HOME=/home/usr/app

RUN mkdir -p $HOME

WORKDIR $HOME

# 1. add pom.xml only here

ADD pom.xml $HOME

# 2. start downloading dependencies

RUN ["/usr/local/bin/mvn-entrypoint.sh", "mvn", "verify", "clean", "--fail-never"]

# 3. add all source code and start compiling

ADD . $HOME

RUN ["mvn", "package"]

EXPOSE 8005

CMD ["java", "-jar", "./target/dist.jar"]

So the key is:

  1. Add the pom.xml file.

  2. Run mvn verify --fail-never, which downloads the Maven dependencies.

  3. Add all your source files, then start the compilation (mvn package).

When there are changes in your pom.xml file, or you are running this build for the first time, Docker will do 1 -> 2 -> 3. When there are no changes in pom.xml, Docker reuses the cached layers for steps 1 and 2 and goes directly to step 3.

This simple trick can be used with many other package managers (gradle, yarn, npm, pip).
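As an illustration of the same layering trick with another package manager, here is a minimal sketch for npm. The node:18 base image, the /app directory, the presence of a package-lock.json, and the "build" script name are all assumptions made for the example, not part of the original answer.

```dockerfile
FROM node:18
WORKDIR /app

# 1. copy only the dependency manifests (assumes a package-lock.json exists)
COPY package.json package-lock.json ./

# 2. install dependencies; this layer is reused until a manifest changes
RUN npm ci

# 3. copy the rest of the source and build
#    (assumes package.json defines a "build" script)
COPY . .
RUN npm run build
```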

Edit:

You should also consider using mvn dependency:resolve or mvn dependency:go-offline accordingly as other comments & answers suggest.

Solution 2 - Java

Using BuildKit

From Docker v18.09 onwards you can use BuildKit instead of the volumes mentioned in the other answers. It lets you mount caches that persist between builds, so you avoid downloading the contents of the corresponding .m2/repository every time.

Assuming that the Dockerfile is in the root of your project:

# syntax = docker/dockerfile:1.0-experimental

FROM maven:3.6.0-jdk-11-slim AS build
COPY . /home/build
RUN mkdir /home/.m2
WORKDIR /home/.m2
USER root
RUN --mount=type=cache,target=/root/.m2 mvn -f /home/build/pom.xml clean compile

target=/root/.m2 mounts the cache at the location the Maven image uses for its local repository (see the Maven image's Dockerfile docs).

For building you can run the following command:

DOCKER_BUILDKIT=1 docker build --rm --no-cache .

More info on BuildKit can be found in the Docker documentation.
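The cache mount can also be combined with the pom-first layering used in the other answers. The following is a minimal sketch, not the answer author's exact setup: it assumes the sources live under src/ and uses the stable docker/dockerfile:1 syntax line rather than the experimental one.

```dockerfile
# syntax=docker/dockerfile:1
FROM maven:3.6.0-jdk-11-slim AS build
WORKDIR /home/build

# Copy only the pom first so this step is re-run only when pom.xml changes
COPY pom.xml .
RUN --mount=type=cache,target=/root/.m2 mvn -B dependency:go-offline

# Copy the sources (assumed to be under src/) and compile against the same cache
COPY src/ src/
RUN --mount=type=cache,target=/root/.m2 mvn -B clean compile
```

Note that the contents of a cache mount are not part of the resulting image; they are only available to RUN steps that mount the cache during a build.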

Solution 3 - Java

It turns out the image I'm using as a base has a parent image which defines

VOLUME "$USER_HOME_DIR/.m2"

see: https://github.com/carlossg/docker-maven/blob/322d0dff5d0531ccaf47bf49338cb3e294fd66c8/jdk-8/Dockerfile

The result is that during the build, all the files are written to $USER_HOME_DIR/.m2, but because it is expected to be a volume, none of those files are persisted with the container image.

Currently in Docker there isn't any way to unregister that volume definition, so it would be necessary to build a separate maven image, rather than use the official maven image.
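One way to sidestep the volume without building a custom base image might be to point Maven at a local repository outside the volume path, using the maven.repo.local property (the same property another answer here uses). A minimal sketch, where /opt/m2-repo is a hypothetical location chosen for illustration:

```dockerfile
FROM maven:alpine
WORKDIR /usr/src/app
COPY pom.xml .

# /opt/m2-repo is a hypothetical path; it lies outside the
# $USER_HOME_DIR/.m2 volume, so files written here stay in the image layer
RUN mvn -B -Dmaven.repo.local=/opt/m2-repo dependency:go-offline
```

The same -Dmaven.repo.local=/opt/m2-repo flag would then have to be passed when running mvn compile in the container; otherwise Maven falls back to its default ~/.m2 location.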

Solution 4 - Java

I don't think the other answers here are optimal. For example, the mvn verify answer executes the following phases, and does a lot more than just resolving dependencies:

> validate - validate the project is correct and all necessary information is available
>
> compile - compile the source code of the project
>
> test - test the compiled source code using a suitable unit testing framework. These tests should not require the code be packaged or deployed
>
> package - take the compiled code and package it in its distributable format, such as a JAR.
>
> verify - run any checks on results of integration tests to ensure quality criteria are met

None of these phases and their associated goals need to be run if you only want to resolve dependencies.

If you only want to resolve dependencies, you can use the dependency:go-offline goal:

FROM maven:3-jdk-12
WORKDIR /tmp/example/

COPY pom.xml .
RUN mvn dependency:go-offline

COPY src/ src/
RUN mvn package

Solution 5 - Java

@Kim is closest, but it's not quite there yet. I don't think adding --fail-never is correct, even though it gets the job done.

The verify command causes a lot of plugins to execute, which is a problem (for me): I don't think they should be executing when all I want is to install dependencies! I also have a multi-module build and a JavaScript sub-build, which further complicates the setup.

But running only verify is not enough, because if you run install in the following commands, more plugins will be used, which means more dependencies to download; Maven refuses to download them otherwise. Relevant read: Maven: Introduction to the Build Lifecycle

You basically have to find what properties disable each plugin and add them one-by-one, so they don't break your build.

WORKDIR /srv

# cache Maven dependencies
ADD cli/pom.xml /srv/cli/
ADD core/pom.xml /srv/core/
ADD parent/pom.xml /srv/parent/
ADD rest-api/pom.xml /srv/rest-api/
ADD web-admin/pom.xml /srv/web-admin/
ADD pom.xml /srv/
RUN mvn -B clean install -DskipTests -Dcheckstyle.skip -Dasciidoctor.skip -Djacoco.skip -Dmaven.gitcommitid.skip -Dspring-boot.repackage.skip -Dmaven.exec.skip=true -Dmaven.install.skip -Dmaven.resources.skip

# cache YARN dependencies
ADD ./web-admin/package.json ./web-admin/yarn.lock /srv/web-admin/
RUN yarn --non-interactive --frozen-lockfile --no-progress --cwd /srv/web-admin install

# build the project
ADD . /srv
RUN mvn -B clean install

Some plugins are not that easily skipped, though. I'm not a Maven expert (so I don't know why it ignores the CLI option; it might be a bug), but the following works as expected for org.codehaus.mojo:exec-maven-plugin:

<project>
    <properties>
        <maven.exec.skip>false</maven.exec.skip>
    </properties>

    <build>
        <plugins>
            <plugin>
                <groupId>org.codehaus.mojo</groupId>
                <artifactId>exec-maven-plugin</artifactId>
                <version>1.3.2</version>
                <executions>
                    <execution>
                        <id>yarn install</id>
                        <goals>
                            <goal>exec</goal>
                        </goals>
                        <phase>initialize</phase>
                        <configuration>
                            <executable>yarn</executable>
                            <arguments>
                                <argument>install</argument>
                            </arguments>
                            <skip>${maven.exec.skip}</skip>
                        </configuration>
                    </execution>
                    <execution>
                        <id>yarn run build</id>
                        <goals>
                            <goal>exec</goal>
                        </goals>
                        <phase>compile</phase>
                        <configuration>
                            <executable>yarn</executable>
                            <arguments>
                                <argument>run</argument>
                                <argument>build</argument>
                            </arguments>
                            <skip>${maven.exec.skip}</skip>
                        </configuration>
                    </execution>
                </executions>
            </plugin>
        </plugins>
    </build>
</project>

Please notice the explicit <skip>${maven.exec.skip}</skip>. Other plugins pick this up from the CLI params, but not this one (neither -Dmaven.exec.skip=true nor -Dexec.skip=true works on its own).

Hope this helps

Solution 6 - Java

There are two ways to cache maven dependencies:

  1. Execute "mvn verify" as part of a container execution, NOT build, and make sure you mount .m2 from a volume. This is efficient, but it does not play well with cloud builds and multiple build slaves.

  2. Use a "dependencies cache container", and update it periodically. Here is how:

a. Create a Dockerfile that copies the pom and downloads the dependencies for offline use:

    FROM maven:3.5.3-jdk-8-alpine
    WORKDIR /build
    COPY pom.xml .
    RUN mvn dependency:go-offline

b. Build it periodically (e.g. nightly) and tag it "deps:latest" (Docker image names must be lowercase).

c. Create another Dockerfile to actually build the system per commit (preferably multi-stage), and make sure it starts FROM deps.

Using this system you will have fast, reconstructable builds with a mostly good-enough cache.
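Step (c) above might look roughly like the following. This is a sketch under assumptions: the image from step (a) has been tagged deps:latest (lowercase, as Docker requires), and the sources live under src/. Whether the downloaded dependencies actually persist in the deps image depends on the base image's volume declaration, as discussed in Solution 3.

```dockerfile
# Assumes the image from step (a) has been built and tagged deps:latest
FROM deps:latest AS build
WORKDIR /build

# Copy the pom again so changes made since the last nightly build are picked up;
# Maven then only needs to fetch what the cached repository is missing
COPY pom.xml .
COPY src/ src/
RUN mvn -B package
```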

Solution 7 - Java

Similar to @Kim's answer, but I use the dependency:resolve goal instead. Here's my complete Dockerfile:

FROM maven:3.5.0-jdk-8-alpine

WORKDIR /usr/src/app

# First copy only the pom file. This is the file that changes least often
COPY ./pom.xml .

# Download the dependencies so they are cached in a Docker image layer
RUN mvn -B -f ./pom.xml -s /usr/share/maven/ref/settings-docker.xml dependency:resolve

# Copy the actual code
COPY ./ .

# Then build the code
RUN mvn -B -f ./pom.xml -s /usr/share/maven/ref/settings-docker.xml package

# The rest is same as usual
EXPOSE 8888

CMD ["java", "-jar", "./target/YOUR-APP.jar"]

Solution 8 - Java

After a few days of struggling, I managed to do this caching using an intermediate container, and I'd like to summarize my findings here, as this topic is so useful and frequently shows up on the front page of Google search results:

  1. Kim's answer only works under a certain condition: pom.xml must not change. In addition, Maven performs a regular update on a daily basis by default
  2. mvn dependency:go-offline -B --fail-never has a similar drawback, so if you need to pull fresh code from the repo, chances are high that Maven will trigger a full checkout every time
  3. Mounting a volume does not work either, because the dependencies need to be resolved while the image is being built
  4. Finally, I have a workable combined solution (it may not work for others):
  • Build an image that resolves all the dependencies first (not an intermediate image)
  • Create another Dockerfile that uses it as an intermediate image. Sample Dockerfiles look like this:
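# Dockerfile 1: build it with "docker build -t dependencies ."
# (the base image must provide mvn)
FROM ubuntu
COPY pom.xml pom.xml
RUN mvn dependency:go-offline -B --fail-never

# Dockerfile 2: uses the image above as an intermediate stage
FROM dependencies AS intermediate

FROM tomcat
# fetch the source from your repo (whatsoever)
RUN git pull repo.git
RUN mvn package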

The idea is to keep all the dependencies in a different image that Maven can use immediately

There may be other scenarios I haven't encountered yet, but this solution relieves me a bit from downloading 3GB of rubbish every time. I cannot imagine why Java became such a fat whale in today's lean world.

Solution 9 - Java

I think the general game plan presented among the other answers is the right idea:

  1. Copy pom.xml
  2. Get dependencies
  3. Copy source
  4. Build

However, exactly how you do step #2 is the real key. For me, using the same command I used for building to fetch dependencies was the right solution:

FROM java/java:latest

# Work dir
WORKDIR /app
RUN mkdir -p .
# Copy pom and get dependencies
COPY pom.xml pom.xml
RUN mvn -Dmaven.repo.local=./.m2 install assembly:single

# Copy and build source
COPY . .
RUN mvn -Dmaven.repo.local=./.m2 install assembly:single

Any other command used to fetch dependencies resulted in many things needing to be downloaded during the build step. It makes sense that running the exact command you plan on running will get you the closest to everything you need to actually run that command.

Solution 10 - Java

I had this issue just a little while ago. There are many solutions on the web, but the one that worked for me is to simply mount a volume for the Maven modules directory:

mkdir /opt/myvolumes/m2

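then declare the volume in the Dockerfile (VOLUME accepts only a container path, not a host:container mapping):

...
VOLUME /root/.m2
...

and bind the host directory when running the container with -v /opt/myvolumes/m2:/root/.m2.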

There are better solutions, but not as straightforward.

This blog post goes the extra mile in helping you to cache everything:

https://keyholesoftware.com/2015/01/05/caching-for-maven-docker-builds/

Solution 11 - Java

A local Nexus 3 Image running in Docker and acting as a local Proxy is an acceptable solution:

The idea is similar to Dockerizing an apt-cacher-ng service.

A comprehensive step-by-step guide is available in the GitHub repo.

It's really fast.

Solution 12 - Java

Another solution would be to use a repository manager such as Sonatype Nexus or Artifactory. You can set up a Maven proxy inside the repository manager, then use it as your source of Maven repositories.

Solution 13 - Java

I had to deal with the same issue.

Unfortunately, as another contributor already said, dependency:go-offline and the other goals don't fully solve the problem: many dependencies are not downloaded.

I found a working solution, as follows.

# Cache dependencies

ADD settings.xml .
ADD pom.xml .

RUN mvn -B -s settings.xml -Ddocker.build.skip=true package test

# Build artifact

ADD src src/
RUN mvn -B -s settings.xml -DskipTests package

The trick is to do a full build without sources, which produces a full dependency scan.

To avoid errors in some plugins (for example, the OpenAPI Maven generator plugin or the Spring Boot Maven plugin) I had to skip their goals while still letting Maven download all their dependencies, by adding a configuration setting like the following to each one:

<configuration>
    <skip>${docker.build.skip}</skip>
</configuration>

Regards.

Solution 14 - Java

If the dependencies are downloaded after the container is already up, then you need to commit the changes on this container and create a new image with the downloaded artifacts.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | Daniel Watrous | View Question on Stackoverflow
Solution 1 - Java | Kim | View Answer on Stackoverflow
Solution 2 - Java | Farzad Vertigo | View Answer on Stackoverflow
Solution 3 - Java | Daniel Watrous | View Answer on Stackoverflow
Solution 4 - Java | Krzysztof Czelusniak | View Answer on Stackoverflow
Solution 5 - Java | Filip Procházka | View Answer on Stackoverflow
Solution 6 - Java | Omri Spector | View Answer on Stackoverflow
Solution 7 - Java | ikandars | View Answer on Stackoverflow
Solution 8 - Java | Calvin Zhou | View Answer on Stackoverflow
Solution 9 - Java | Valevalorin | View Answer on Stackoverflow
Solution 10 - Java | Bruno9779 | View Answer on Stackoverflow
Solution 11 - Java | Armando Ballaci | View Answer on Stackoverflow
Solution 12 - Java | Seyed Vahid Hashemi | View Answer on Stackoverflow
Solution 13 - Java | Antonio Petricca | View Answer on Stackoverflow
Solution 14 - Java | vstrom coder | View Answer on Stackoverflow