Jenkins + Play 1.2.4 : problems with cobertura lock files / report
Problem Overview
We have a Play 1.2.4 application that we build with Jenkins (on Ubuntu), and we're having problems with Cobertura.
After the tests run (successfully), every now and then we get the following error:
---------------------------------------
java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at net.sourceforge.cobertura.util.FileLocker.lock(FileLocker.java:124)
at play.modules.cobertura.CoberturaPlugin$CoberturaPluginShutdownThread.run(Unknown Source)
Caused by: java.nio.channels.OverlappingFileLockException
at sun.nio.ch.FileChannelImpl$SharedFileLockTable.checkList(FileChannelImpl.java:1166)
at sun.nio.ch.FileChannelImpl$SharedFileLockTable.add(FileChannelImpl.java:1068)
at sun.nio.ch.FileChannelImpl.lock(FileChannelImpl.java:824)
at java.nio.channels.FileChannel.lock(FileChannel.java:860)
... 6 more
---------------------------------------
Unable to get lock on /var/lib/jenkins/jobs/project/workspace/cobertura.ser.lock: null
This is known to happen on Linux kernel 2.6.20.
Make sure cobertura.jar is in the root classpath of the jvm
process running the instrumented code. If the instrumented code
is running in a web server, this means cobertura.jar should be in
the web server's lib directory.
Don't put multiple copies of cobertura.jar in different WEB-INF/lib directories.
Only one classloader should load cobertura. It should be the root classloader.
---------------------------------------
lock file could not be deleted
This doesn't seem to "break the build", but further down the build we get the following, which causes the Cobertura report step to fail:
Publishing Cobertura coverage report...
No coverage results were found using the pattern 'test-result/code-coverage/coverage.xml' relative to '/var/lib/jenkins/jobs/project/workspace'. Did you enter a pattern relative to the correct directory? Did you generate the XML report(s) for Cobertura?
Build step 'Publish Cobertura Coverage Report' changed build result to FAILURE
Running a subsequent build manually usually passes.
Following https://stackoverflow.com/questions/2148886/zero-code-coverage-with-cobertura-1-9-2-but-tests-are-working , I tried appending -Dcobertura.use.java.nio=false to the play auto-test command.
Since the error only happened now and then, I'm not sure whether this helped. But after that change, we ran into a problem with play auto-test hanging:
...
Executing /opt/play-1.2.4/play auto-test "/var/lib/jenkins/jobs/project/workspace" -Dcobertura.use.java.nio=false
[workspace] $ /opt/play-1.2.4/play auto-test "/var/lib/jenkins/jobs/project/workspace" -Dcobertura.use.java.nio=false
<build stuck here for a couple of days>
Since nothing has been fully deterministic, it's difficult to say anything about causality here. (The hang seems to happen one or two builds after a Jenkins/server restart.)
Currently I'm considering disabling Cobertura in our project, but if somebody has other ideas, that would be great =)
Java Solutions
Solution 1 - Java
Clearly this is due to JVM locking issues, either in your JVM implementation or, more likely, in the way you are deploying your Cobertura JAR.
Jenkins can spawn a lot of JVM threads, and if Cobertura is on your global classpath, it's possible that some odd collisions are happening.
I assume that, ultimately, this should be attributed to a minor bug in Cobertura (unless the complex Cobertura file locking is solving some other, more important problem).
According to the source code for Cobertura's FileLocker (cobertura/src/main/java/net/sourceforge/cobertura/util/FileLocker.java), there are some issues around multiple JVMs loading the Cobertura JAR.
To solve this, make sure there is only one copy of Cobertura and only one application launching and using it.
The reason your VM implementation fixed it is, most likely, that you've reduced the variability in the way Cobertura can be loaded. You may also be restarting your VM more frequently than your Jenkins server.
In our Jenkins Cobertura builds we just use the Maven plugin, and this seems to work without issue (but then again, we are not using Java 1.7, nor are we using Play).
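To make the OverlappingFileLockException in the stack trace concrete, here is a minimal sketch of how that exception can arise when the same file is locked twice from within one JVM. It is not Cobertura's code: the class name OverlappingLockDemo, the method name, and the temporary file are made up for illustration, and it assumes Java 7 or later (the stack trace in the question looks like an older JVM).

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.channels.OverlappingFileLockException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical demo class, not part of Cobertura or Play.
public class OverlappingLockDemo {

    // Returns true if a second lock attempt on the same file, made from the
    // same JVM while the first lock is still held, is rejected with
    // OverlappingFileLockException.
    static boolean secondLockOverlaps(Path lockFile) throws IOException {
        try (FileChannel first = FileChannel.open(lockFile,
                     StandardOpenOption.CREATE, StandardOpenOption.WRITE);
             FileChannel second = FileChannel.open(lockFile,
                     StandardOpenOption.WRITE);
             FileLock held = first.lock()) {
            // File locks are held on behalf of the whole JVM, so a second
            // channel in the same process cannot lock the same region.
            try (FileLock again = second.tryLock()) {
                return false;
            } catch (OverlappingFileLockException e) {
                return true;
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Temporary stand-in for a file such as cobertura.ser.lock.
        Path lockFile = Files.createTempFile("cobertura.ser", ".lock");
        try {
            System.out.println("second lock rejected: " + secondLockOverlaps(lockFile));
        } finally {
            Files.deleteIfExists(lockFile);
        }
    }
}
```

If something similar happens inside the build, for example two threads or two copies of Cobertura in one JVM trying to lock cobertura.ser.lock at the same time, it would match the "only one classloader should load cobertura" hint in the error output.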
Solution 2 - Java
This has been bothering us for a while (Play 1.2.4/Jenkins). There seems to be a problem with overlapping sequences between the Jenkins Cobertura plugin (report publishing) and the Play framework Cobertura module. I believe it's purely a timing coincidence, hence intermittent. For lack of a better resolution, we use the following workaround.
We removed the Jenkins Cobertura report publish action from the main build job and created a new Jenkins job that is set up with the publish Cobertura coverage report action. In the new job, a shell action copies coverage.xml from the main build job's workspace to the new job's workspace, so that the publish action can run there. The copy is done to avoid running both the Play Cobertura module and the Jenkins Cobertura plugin in the same job.
It's not the best, but we're happy to see the coverage report/graphs :-)
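The answer above does the copy in a shell step and does not show the command. As a rough illustration of the same idea in Java, here is a sketch that copies the report from one workspace to another. The relative report path is taken from the publisher pattern in the question; the class name CopyCoverageReport and the two workspace arguments are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

// Hypothetical helper, not part of Jenkins or Play.
public class CopyCoverageReport {

    // Report location relative to a workspace, as in the publisher pattern
    // quoted in the question.
    static final String REPORT = "test-result/code-coverage/coverage.xml";

    // Copies the report from the main job's workspace into the reporting
    // job's workspace, creating the target directories if needed.
    static Path copyReport(Path sourceWorkspace, Path targetWorkspace) throws IOException {
        Path source = sourceWorkspace.resolve(REPORT);
        Path target = targetWorkspace.resolve(REPORT);
        Files.createDirectories(target.getParent());
        return Files.copy(source, target, StandardCopyOption.REPLACE_EXISTING);
    }

    public static void main(String[] args) throws IOException {
        // args[0]: main build job workspace, args[1]: reporting job workspace
        // (both supplied by the caller; no paths are assumed here).
        Path copied = copyReport(Paths.get(args[0]), Paths.get(args[1]));
        System.out.println("Copied report to " + copied);
    }
}
```

Keeping the same relative path in the new workspace means the publisher pattern from the original job can be reused unchanged in the reporting job.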
Solution 3 - Java
The trick is to use one data file (cobertura.ser) per module, so that parallel tasks do not contend for the same lock.
With Ant:
<cobertura-instrument todir="${build.dir}" datafile="cobertura.ser.${modulename}">
...
At the end, merge the per-module data files into a single Cobertura data file:
<target name="merge-coverage">
  <cobertura-merge datafile="cobertura.ser">
    <fileset dir="${build.dir}">
      <include name="cobertura.ser.*" />
    </fileset>
  </cobertura-merge>
</target>
Solution 4 - Java
-Dcobertura.use.java.nio=false: this setting appears to need changing to true for file locking to work, as your error message explains.
Also, the application probably needs the full path of the Cobertura folder added to its classpath somewhere.
It appears you are using something similar to a COF (constantly open file): the error message refers to a file that exists, but regions of the file are locked on the drive, not simply the file as a whole.
Solution 5 - Java
Did you set
%test.play.tmp=none
in your application.conf file?