Maven docker cache dependencies
Problem Overview
I'm trying to use docker to automate maven builds. The project I want to build takes nearly 20 minutes to download all the dependencies, so I tried to build a docker image that would cache these dependencies, but it doesn't seem to save it. My Dockerfile is
FROM maven:alpine
RUN mkdir -p /usr/src/app
WORKDIR /usr/src/app
ADD pom.xml /usr/src/app
RUN mvn dependency:go-offline
The image builds, and it does download everything. However, the resulting image is the same size as the base maven:alpine image, so it doesn't seem to have cached the dependencies in the image. When I try to use the image to mvn compile, it goes through the full 20 minutes of redownloading everything.
Is it possible to build a maven image that caches my dependencies so they don't have to be downloaded every time I use the image to perform a build?
I'm running the following commands:
docker build -t my-maven .
docker run -it --rm --name my-maven-project -v "$PWD":/usr/src/mymaven -w /usr/src/mymaven my-maven mvn compile
My understanding is that whatever RUN does during the docker build process becomes part of the resulting image.
Java Solutions
Solution 1 - Java
Usually the pom.xml file does not change between image builds; only other source code does. In that case you can do this:
FROM maven:3-jdk-8
ENV HOME=/home/usr/app
RUN mkdir -p $HOME
WORKDIR $HOME
# 1. add pom.xml only here
ADD pom.xml $HOME
# 2. start downloading dependencies
RUN ["/usr/local/bin/mvn-entrypoint.sh", "mvn", "verify", "clean", "--fail-never"]
# 3. add all source code and start compiling
ADD . $HOME
RUN ["mvn", "package"]
EXPOSE 8005
CMD ["java", "-jar", "./target/dist.jar"]
So the key is:
1. Add the pom.xml file.
2. Then mvn verify --fail-never it, which will download the Maven dependencies.
3. Then add all your source files and start your compilation (mvn package).
When there are changes in your pom.xml file or you are running this script for the first time, docker will do 1 -> 2 -> 3. When there are no changes in the pom.xml file, docker will skip steps 1 and 2 and do 3 directly.
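To see why a source-only change leaves steps 1 and 2 cached, here is a rough shell sketch that fingerprints pom.xml before and after editing a source file. It is only an illustration: Docker computes its own cache key for ADD/COPY internally, and the temporary directory, the file names and the fingerprint helper below are made up for this demonstration.

```shell
#!/bin/sh
# Illustration only: mimics the idea behind Docker's ADD/COPY cache check
# with a plain checksum. It does not call Docker at all.
set -eu

workdir=$(mktemp -d)
mkdir -p "$workdir/src"
printf '<project/>\n' > "$workdir/pom.xml"
printf 'class A {}\n' > "$workdir/src/A.java"

# Hypothetical helper: checksum of a single file's contents.
fingerprint() {
    cksum < "$1"
}

before=$(fingerprint "$workdir/pom.xml")
# Change only the source code, as in a typical rebuild.
printf 'class A { int x; }\n' > "$workdir/src/A.java"
after=$(fingerprint "$workdir/pom.xml")

if [ "$before" = "$after" ]; then
    echo "pom.xml unchanged: steps 1 and 2 can come from the cache"
else
    echo "pom.xml changed: steps 1, 2 and 3 all run again"
fi

rm -rf "$workdir"
```

The same reasoning explains why ADD . placed before the dependency download would defeat the cache: any source change would alter the input of that earlier step.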
This simple trick can be used with many other package managers (Gradle, Yarn, npm, pip).
Edit:
You should also consider using mvn dependency:resolve or mvn dependency:go-offline accordingly, as other comments and answers suggest.
Solution 2 - Java
Using BuildKit
From Docker v18.09 onwards you can use BuildKit instead of the volumes that were mentioned in the other answers. It allows mounting caches that persist between builds, so you can avoid downloading the contents of the corresponding .m2/repository every time.
Assuming that the Dockerfile is in the root of your project:
# syntax = docker/dockerfile:1.0-experimental
FROM maven:3.6.0-jdk-11-slim AS build
COPY . /home/build
RUN mkdir /home/.m2
WORKDIR /home/.m2
USER root
RUN --mount=type=cache,target=/root/.m2 mvn -f /home/build/pom.xml clean compile
target=/root/.m2 mounts the cache at the specified location in the Maven image (see the Dockerfile docs).
For building you can run the following command:
DOCKER_BUILDKIT=1 docker build --rm --no-cache .
More info on BuildKit can be found here.
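One way to check that the cache mount is actually reused is to time two consecutive builds and compare them. The sketch below wraps the build command from this answer in a small helper; the name timed_buildkit_build is hypothetical, and it assumes the docker CLI and a running daemon are available.

```shell
#!/bin/sh
# Hypothetical helper: runs the BuildKit build from this answer and
# reports how long it took. Extra arguments are passed to docker build.
timed_buildkit_build() {
    start=$(date +%s)
    DOCKER_BUILDKIT=1 docker build --rm --no-cache "$@"
    end=$(date +%s)
    echo "build took $((end - start)) seconds"
}

# Example (run it twice from the project root and compare the times):
#   timed_buildkit_build .
#   timed_buildkit_build .
```

If the second run still downloads every artifact, the cache target probably does not match the directory Maven uses as its local repository in that image.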
Solution 3 - Java
It turns out the image I'm using as a base has a parent image which defines
VOLUME "$USER_HOME_DIR/.m2"
The result is that during the build, all the files are written to $USER_HOME_DIR/.m2, but because it is expected to be a volume, none of those files are persisted with the container image.
Currently in Docker there isn't any way to unregister that volume definition, so it would be necessary to build a separate maven image, rather than use the official maven image.
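To check whether a given base image declares such a volume, you can inspect its configuration. This is a minimal sketch, assuming the docker CLI is installed and the image has already been pulled; image_volumes is a hypothetical helper name, not a Docker command.

```shell
#!/bin/sh
# Hypothetical helper: print the volumes declared in an image's config.
# Uses docker inspect with a Go template on the Config.Volumes field.
image_volumes() {
    docker inspect --format '{{json .Config.Volumes}}' "$1"
}

# Example (the image must be present locally, e.g. after docker pull):
#   image_volumes maven:alpine
```

If the output lists a path ending in .m2, files written there during docker build will not end up in the image layers.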
Solution 4 - Java
I don't think the other answers here are optimal. For example, the mvn verify answer executes the following phases, and does a lot more than just resolving dependencies:
> validate - validate the project is correct and all necessary information is available
>
> compile - compile the source code of the project
>
> test - test the compiled source code using a suitable unit testing framework. These tests should not require the code be packaged or deployed
>
> package - take the compiled code and package it in its distributable format, such as a JAR.
>
> verify - run any checks on results of integration tests to ensure quality criteria are met
All of these phases and their associated goals don't need to be run if you only want to resolve dependencies.
If you only want to resolve dependencies, you can use the dependency:go-offline goal:
FROM maven:3-jdk-12
WORKDIR /tmp/example/
COPY pom.xml .
RUN mvn dependency:go-offline
COPY src/ src/
RUN mvn package
Solution 5 - Java
@Kim is closest, but it's not quite there yet. I don't think adding --fail-never is correct, even though it gets the job done.
The verify command causes a lot of plugins to execute, which is a problem (for me): I don't think they should be executing when all I want is to install dependencies! I also have a multi-module build and a JavaScript sub-build, so this further complicates the setup.
But running only verify is not enough, because if you run install in the following commands, more plugins will be used, which means more dependencies to download; Maven refuses to download them otherwise. Relevant read: Maven: Introduction to the Build Lifecycle
You basically have to find which properties disable each plugin and add them one by one, so they don't break your build.
WORKDIR /srv
# cache Maven dependencies
ADD cli/pom.xml /srv/cli/
ADD core/pom.xml /srv/core/
ADD parent/pom.xml /srv/parent/
ADD rest-api/pom.xml /srv/rest-api/
ADD web-admin/pom.xml /srv/web-admin/
ADD pom.xml /srv/
RUN mvn -B clean install -DskipTests -Dcheckstyle.skip -Dasciidoctor.skip -Djacoco.skip -Dmaven.gitcommitid.skip -Dspring-boot.repackage.skip -Dmaven.exec.skip=true -Dmaven.install.skip -Dmaven.resources.skip
# cache YARN dependencies
ADD ./web-admin/package.json ./web-admin/yarn.lock /srv/web-admin/
RUN yarn --non-interactive --frozen-lockfile --no-progress --cwd /srv/web-admin install
# build the project
ADD . /srv
RUN mvn -B clean install
But some plugins are not that easily skipped. I'm not a Maven expert (so I don't know why it ignores the CLI option; it might be a bug), but the following works as expected for org.codehaus.mojo:exec-maven-plugin:
<project>
  <properties>
    <maven.exec.skip>false</maven.exec.skip>
  </properties>
  <build>
    <plugins>
      <plugin>
        <groupId>org.codehaus.mojo</groupId>
        <artifactId>exec-maven-plugin</artifactId>
        <version>1.3.2</version>
        <executions>
          <execution>
            <id>yarn install</id>
            <goals>
              <goal>exec</goal>
            </goals>
            <phase>initialize</phase>
            <configuration>
              <executable>yarn</executable>
              <arguments>
                <argument>install</argument>
              </arguments>
              <skip>${maven.exec.skip}</skip>
            </configuration>
          </execution>
          <execution>
            <id>yarn run build</id>
            <goals>
              <goal>exec</goal>
            </goals>
            <phase>compile</phase>
            <configuration>
              <executable>yarn</executable>
              <arguments>
                <argument>run</argument>
                <argument>build</argument>
              </arguments>
              <skip>${maven.exec.skip}</skip>
            </configuration>
          </execution>
        </executions>
      </plugin>
    </plugins>
  </build>
</project>
Please notice the explicit <skip>${maven.exec.skip}</skip>: other plugins pick this up from the CLI parameters, but this one does not (neither -Dmaven.exec.skip=true nor -Dexec.skip=true works by itself).
Hope this helps
Solution 6 - Java
There are two ways to cache maven dependencies:
1. Execute "mvn verify" as part of a container execution, NOT build, and make sure you mount .m2 from a volume. This is efficient but it does not play well with cloud build and multiple build slaves.
2. Use a "dependencies cache container", and update it periodically. Here is how:
a. Create a Dockerfile that copies the pom and build offline dependencies:
FROM maven:3.5.3-jdk-8-alpine
WORKDIR /build
COPY pom.xml .
RUN mvn dependency:go-offline
b. Build it periodically (e.g. nightly) as "deps:latest" (Docker image names must be lowercase)
c. Create another Dockerfile to actually build the system per commit (preferably use multi-stage) - and make sure it is FROM deps.
Using this system you will have fast, reconstructible builds with a mostly good-enough cache.
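The steps above can be sketched as a small shell script that writes both Dockerfiles. The file names Dockerfile.deps and Dockerfile.build, the image name my-app and the mvn -B package step are assumptions for illustration, and the sketch assumes the base image keeps the downloaded artifacts in the image layers rather than in a declared volume.

```shell
#!/bin/sh
# Sketch of the two-Dockerfile flow; file and image names are assumptions.
set -eu

# a. Dependency image definition (same content as the Dockerfile in step a).
cat > Dockerfile.deps <<'EOF'
FROM maven:3.5.3-jdk-8-alpine
WORKDIR /build
COPY pom.xml .
RUN mvn dependency:go-offline
EOF

# c. Per-commit build definition, starting FROM the dependency image.
cat > Dockerfile.build <<'EOF'
FROM deps:latest
WORKDIR /build
COPY . .
RUN mvn -B package
EOF

# b. Nightly job (run where a Docker daemon is available):
#   docker build -f Dockerfile.deps -t deps:latest .
# Per commit:
#   docker build -f Dockerfile.build -t my-app .
```

Because the per-commit build starts from an image that already contains the resolved artifacts, only dependencies added since the last nightly run need to be downloaded.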
Solution 7 - Java
Similar to @Kim's answer, but I use the dependency:resolve mvn goal. So here's my complete Dockerfile:
FROM maven:3.5.0-jdk-8-alpine
WORKDIR /usr/src/app
# First copy only the pom file. This is the file that changes least often
COPY ./pom.xml .
# Download the dependencies so they are cached in a Docker image layer
RUN mvn -B -f ./pom.xml -s /usr/share/maven/ref/settings-docker.xml dependency:resolve
# Copy the actual code
COPY ./ .
# Then build the code
RUN mvn -B -f ./pom.xml -s /usr/share/maven/ref/settings-docker.xml package
# The rest is same as usual
EXPOSE 8888
CMD ["java", "-jar", "./target/YOUR-APP.jar"]
Solution 8 - Java
After a few days of struggling, I managed to get this caching working using an intermediate container. I'd like to summarize my findings here, as this topic is so useful and frequently shows up on the first page of Google search results:
- Kim's answer only works under one condition: pom.xml must not change. In addition, Maven checks for updates daily by default.
- mvn dependency:go-offline -B --fail-never has a similar drawback, so if you need to pull fresh code from the repo, chances are high that Maven will trigger a full checkout every time.
- Mounting a volume does not work either, because the dependencies need to be resolved while the image is being built.
- Finally, I have a workable combined solution (it may not work for others):
  - Build an image that resolves all the dependencies first (not an intermediate image)
  - Create another Dockerfile that uses it as an intermediate image. Sample Dockerfiles look like this:
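# Dockerfile 1, built with: docker build -t dependencies .
# (sample only: the base image must have Maven installed for mvn to run)
FROM ubuntu
COPY pom.xml pom.xml
RUN mvn dependency:go-offline -B --fail-never

# Dockerfile 2: uses the image above as an intermediate stage
FROM dependencies AS intermediate
FROM tomcat
RUN git pull repo.git (whatsoever)
RUN mvn package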
The idea is to keep all the dependencies in a different image that Maven can use immediately.
There may be other scenarios I haven't encountered yet, but this solution spares me from downloading 3 GB of rubbish every time. I cannot imagine why Java became such a fat whale in today's lean world.
Solution 9 - Java
I think the general game plan presented among the other answers is the right idea:
- Copy pom.xml
- Get dependencies
- Copy source
- Build
However, exactly how you do step #2 is the real key. For me, using the same command I used for building to fetch dependencies was the right solution:
FROM java/java:latest
# Work dir
WORKDIR /app
RUN mkdir -p .
# Copy pom and get dependencies
COPY pom.xml pom.xml
RUN mvn -Dmaven.repo.local=./.m2 install assembly:single
# Copy and build source
COPY . .
RUN mvn -Dmaven.repo.local=./.m2 install assembly:single
Any other command used to fetch dependencies resulted in many things needing to be downloaded during the build step. It makes sense that running the exact command you plan on running will get you the closest to everything you need to actually run that command.
Solution 10 - Java
I had this issue just a little while ago. There are many solutions on the web, but the one that worked for me is simply to mount a volume for the Maven modules directory:
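mkdir -p /opt/myvolumes/m2
then mount it when you run the container. Note that the Dockerfile VOLUME instruction only takes container paths and cannot map a host directory, so the host:container mapping has to be passed to docker run:
docker run -v /opt/myvolumes/m2:/root/.m2 ...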
There are better solutions, but not as straightforward.
This blog post goes the extra mile in helping you to cache everything:
https://keyholesoftware.com/2015/01/05/caching-for-maven-docker-builds/
Solution 11 - Java
A local Nexus 3 image running in Docker and acting as a local proxy is an acceptable solution.
The idea is similar to the "Dockerize an apt-cacher-ng service" example (apt-cacher-ng).
A comprehensive step-by-step guide is available in a GitHub repo.
It's really fast.
Solution 12 - Java
Another solution would be to use a repository manager such as Sonatype Nexus or Artifactory. You can set up a Maven proxy inside the repository manager and then use it as your source of Maven repositories.
Solution 13 - Java
I had to deal with the same issue.
Unfortunately, as another contributor already said, dependency:go-offline and the other goals don't fully solve the problem: many dependencies are not downloaded.
I found a working solution, as follows.
# Cache dependencies
ADD settings.xml .
ADD pom.xml .
RUN mvn -B -s settings.xml -Ddocker.build.skip=true package test
# Build artifact
ADD src src/
RUN mvn -B -s settings.xml -DskipTests package
The trick is to do a full build without sources, which produces a full dependency scan.
To avoid errors from some plugins (for example, the OpenAPI Maven generator plugin or the Spring Boot Maven plugin) I had to skip their goals while still letting Maven download all their dependencies, by adding to each one a configuration setting like the following:
<configuration>
<skip>${docker.build.skip}</skip>
</configuration>
Regards.
Solution 14 - Java
If the dependencies are downloaded after the container is already up, then you need to commit the changes on this container and create a new image with the downloaded artifacts.
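A minimal sketch of that commit step is shown below. The helper name commit_with_dependencies, the container name deps-build, the image name my-maven-deps and the /opt/m2 repository path are all assumptions. One caveat that is certain: docker commit does not include data stored in volumes, so the example points Maven at a local repository path outside any declared volume.

```shell
#!/bin/sh
# Hypothetical helper: snapshot a stopped container (first argument)
# into a new image (second argument) using docker commit.
commit_with_dependencies() {
    docker commit "$1" "$2"
}

# Example (run from the project root; names and paths are assumptions):
#   docker run --name deps-build -v "$PWD":/usr/src/mymaven -w /usr/src/mymaven \
#       maven:alpine mvn -Dmaven.repo.local=/opt/m2 dependency:go-offline
#   commit_with_dependencies deps-build my-maven-deps
#   docker run -it --rm -v "$PWD":/usr/src/mymaven -w /usr/src/mymaven \
#       my-maven-deps mvn -Dmaven.repo.local=/opt/m2 compile
```

Later builds have to pass the same -Dmaven.repo.local value, otherwise Maven looks in its default repository location and downloads everything again.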