Java - shutting down on Out of Memory Error
Java Problem Overview
I've heard very contradictory things on how to best handle this, and am stuck with the following dilemma:
- an OOME brings down a thread, but not the whole application
- and I need to bring down the whole application but can't because the thread doesn't have any memory left
I've always understood that best practice is to let these errors propagate so the JVM can die, because the JVM is in an inconsistent state at that point, but that doesn't seem to be working here.
Java Solutions
Solution 1 - Java
OutOfMemoryError is just like any other error. If it escapes from Thread.run(), it causes that thread to die. Nothing more. Also, when a thread dies it is no longer a GC root, so all objects referenced only by that thread become eligible for garbage collection. This means the JVM is very likely to recover from an OOME.
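The behaviour described above can be sketched as follows. This is only an illustration: the class and method names are hypothetical, and the OutOfMemoryError is thrown by hand so that the example is deterministic and does not really exhaust the heap.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical demo class: shows that an Error escaping run() ends only that thread.
class ThreadDeathDemo {

    /** Starts a worker that fails with a simulated OOME and returns what its handler saw. */
    static Throwable runFailingWorker() {
        AtomicReference<Throwable> seen = new AtomicReference<>();
        Thread worker = new Thread(() -> {
            // Simulated failure; a real OOME would come from a failed allocation.
            throw new OutOfMemoryError("simulated");
        }, "worker");
        // Record the error instead of letting the default handler print it.
        worker.setUncaughtExceptionHandler((t, e) -> seen.set(e));
        worker.start();
        try {
            worker.join(); // returns once the worker thread has died
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return seen.get();
    }

    public static void main(String[] args) {
        Throwable error = runFailingWorker();
        // The calling thread is still running here: only the worker died.
        System.out.println("worker died with: " + error);
        System.out.println("main thread alive: " + Thread.currentThread().isAlive());
    }
}
```

The main thread keeps running after the worker has died, which is exactly why an OOME in one thread does not bring down the application by itself.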
If you want to kill your JVM no matter what because you suspect it may be in an inconsistent state, add this to your java options:
-XX:OnOutOfMemoryError="kill -9 %p"
%p is a placeholder for the PID of the current Java process. The rest is self-explanatory.
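Because this option is easy to forget in a start script, an application could check at startup whether it was passed. The sketch below is one possible way to do that using the standard RuntimeMXBean; the class and method names are hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

// Hypothetical startup check: warns if the JVM was started without an OOME command.
class OomFlagCheck {

    /** Returns true if any argument in the list sets -XX:OnOutOfMemoryError. */
    static boolean hasOnOomHandler(List<String> jvmArgs) {
        for (String arg : jvmArgs) {
            if (arg.startsWith("-XX:OnOutOfMemoryError=")) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // getInputArguments() returns the options passed to the JVM itself,
        // not the arguments passed to main().
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        if (!hasOnOomHandler(jvmArgs)) {
            System.err.println("Warning: JVM started without -XX:OnOutOfMemoryError");
        }
    }
}
```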
Of course, you can also try catching OutOfMemoryError and handling it somehow. But that's tricky.
Solution 2 - Java
In Java version 8u92 the VM arguments
-XX:+ExitOnOutOfMemoryError
-XX:+CrashOnOutOfMemoryError
were added, see the release notes.
> ExitOnOutOfMemoryError
> When you enable this option, the JVM exits on the
> first occurrence of an out-of-memory error. It can be used if you
> prefer restarting an instance of the JVM rather than handling out of
> memory errors.
>
> CrashOnOutOfMemoryError
> If this option is enabled, when an
> out-of-memory error occurs, the JVM crashes and produces text and
> binary crash files.
Enhancement request: JDK-8138745 (the parameter naming there is wrong, though; see JDK-8154713: the option is ExitOnOutOfMemoryError, not ExitOnOutOfMemory).
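To confirm that a running JVM actually picked up one of these options, the flag values can be read back at runtime. The following is a minimal sketch that assumes a HotSpot-based JVM, where com.sun.management.HotSpotDiagnosticMXBean is available; the class name and the "unsupported" fallback value are hypothetical.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper: reads the current value of a -XX flag (HotSpot only).
class OomFlagStatus {

    /** Returns the flag's value as a string, or "unsupported" if it cannot be read. */
    static String flagValue(String flagName) {
        try {
            HotSpotDiagnosticMXBean bean =
                    ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            return bean.getVMOption(flagName).getValue();
        } catch (RuntimeException e) {
            // Unknown flag, or a JVM that does not provide this MXBean.
            return "unsupported";
        }
    }

    public static void main(String[] args) {
        System.out.println("ExitOnOutOfMemoryError = " + flagValue("ExitOnOutOfMemoryError"));
        System.out.println("CrashOnOutOfMemoryError = " + flagValue("CrashOnOutOfMemoryError"));
    }
}
```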
Solution 3 - Java
With version 8u92 there's now a JVM option in the Oracle JDK to make the JVM exit when an OutOfMemoryError occurs:
From the release notes:
> ExitOnOutOfMemoryError - When you enable this option, the JVM exits on the first occurrence of an out-of-memory error. It can be used if you prefer restarting an instance of the JVM rather than handling out of memory errors.
Solution 4 - Java
If you want to bring down your program, take a look at the -XX:OnOutOfMemoryError="<cmd args>;<cmd args>" option (documented here) on the command line. Just point it to a kill script for your application.
In general, I have never had any luck gracefully handling this error without restarting the application. There was always some corner case slipping through, so I personally suggest that you do stop your application, but also investigate the source of the problem.
Solution 5 - Java
You can force your program to terminate in several ways once the error occurs. As others have suggested, you can catch the error and call System.exit afterwards if needed. But I suggest you also use -XX:+HeapDumpOnOutOfMemoryError; that way the JVM writes a heap dump file with the contents of your application's memory when the error occurs. You can then open the dump in a profiler (I recommend Eclipse MAT) to investigate it. This way you will find the cause of the issue fairly quickly and can react properly. If you are not using Eclipse, you can use Eclipse MAT as a standalone product; see http://wiki.eclipse.org/index.php/MemoryAnalyzer.
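If changing the start script is inconvenient, the heap dump flag can usually also be switched on from inside a running application. This is a sketch under the assumption of a HotSpot-based JVM, where HeapDumpOnOutOfMemoryError is a manageable flag; the class and method names are hypothetical.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper: enables heap dumps on OOME at runtime (HotSpot only).
class HeapDumpConfig {

    /** Returns true if the flag is enabled after the call, false if it could not be set. */
    static boolean enableHeapDumpOnOom() {
        try {
            HotSpotDiagnosticMXBean bean =
                    ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            bean.setVMOption("HeapDumpOnOutOfMemoryError", "true");
            return "true".equals(bean.getVMOption("HeapDumpOnOutOfMemoryError").getValue());
        } catch (RuntimeException e) {
            // Not a HotSpot JVM, flag not writable, or access denied.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("heap dump on OOME enabled: " + enableHeapDumpOnOom());
    }
}
```

Setting the flag on the command line remains the simpler choice, since it also covers errors that happen before this code runs.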
Solution 6 - Java
I suggest handling all uncaught exceptions from within the application to ensure it tries to give you the best possible data before terminating. Then have an external script that restarts your process when it crashes.
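    import java.io.PrintWriter;
    import java.io.StringWriter;
    import java.lang.Thread.UncaughtExceptionHandler;
    import java.lang.management.ManagementFactory;
    import java.lang.management.MemoryPoolMXBean;
    import java.lang.management.MemoryUsage;
    import java.text.SimpleDateFormat;
    import java.util.List;
    import java.util.Map;
    import java.util.Map.Entry;

    /** Prints diagnostics for any uncaught exception, then halts the JVM. */
    public class ExitProcessOnUncaughtException implements UncaughtExceptionHandler
    {
        /** Installs this handler as the default for all threads. */
        public static void register()
        {
            Thread.setDefaultUncaughtExceptionHandler(new ExitProcessOnUncaughtException());
        }

        private ExitProcessOnUncaughtException() {}

        @Override
        public void uncaughtException(Thread t, Throwable e)
        {
            try {
                StringWriter writer = new StringWriter();
                e.printStackTrace(new PrintWriter(writer));
                System.out.println("Uncaught exception caught in thread: " + t);
                System.out.flush();
                System.out.println();
                System.err.println(writer.getBuffer().toString());
                System.err.flush();
                printFullCoreDump();
            } finally {
                // Always halt, even if producing the diagnostics above failed.
                Runtime.getRuntime().halt(1);
            }
        }

        /** Prints a timestamp, the stack traces of all threads, and heap usage. */
        public static void printFullCoreDump()
        {
            SimpleDateFormat sdf = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
            System.out.println("\n" +
                sdf.format(System.currentTimeMillis()) + "\n" +
                "All Stack Trace:\n" +
                getAllStackTraces() +
                "\nHeap\n" +
                getHeapInfo() +
                "\n");
        }

        public static String getAllStackTraces()
        {
            StringBuilder ret = new StringBuilder();
            Map<Thread, StackTraceElement[]> allStackTraces = Thread.getAllStackTraces();
            for (Entry<Thread, StackTraceElement[]> entry : allStackTraces.entrySet())
                ret.append(getThreadInfo(entry.getKey(), entry.getValue())).append("\n");
            return ret.toString();
        }

        public static String getHeapInfo()
        {
            StringBuilder ret = new StringBuilder();
            List<MemoryPoolMXBean> memBeans = ManagementFactory.getMemoryPoolMXBeans();
            for (MemoryPoolMXBean mpool : memBeans) {
                MemoryUsage usage = mpool.getUsage();
                if (usage == null)
                    continue; // the pool is no longer valid
                String name = mpool.getName();
                long used = usage.getUsed();
                long max = usage.getMax(); // -1 if the pool has no defined maximum
                if (max > 0) {
                    int pctUsed = (int) (used * 100 / max);
                    ret.append(" ").append(name).append(" total: ").append(max / 1000)
                       .append("K, ").append(pctUsed).append("% used\n");
                } else {
                    // Avoid dividing by an undefined (or zero) maximum.
                    ret.append(" ").append(name).append(" used: ").append(used / 1000)
                       .append("K, max undefined\n");
                }
            }
            return ret.toString();
        }

        public static String getThreadInfo(Thread thread, StackTraceElement[] stack)
        {
            StringBuilder ret = new StringBuilder();
            ret.append("\n\"").append(thread.getName()).append("\"");
            if (thread.isDaemon())
                ret.append(" daemon");
            ret.append(" prio=").append(thread.getPriority())
               .append(" tid=").append(String.format("0x%08x", thread.getId()));
            if (stack.length > 0)
                ret.append(" in ").append(stack[0].getClassName()).append(".")
                   .append(stack[0].getMethodName()).append("()");
            ret.append("\n java.lang.Thread.State: ").append(thread.getState()).append("\n");
            ret.append(getStackTrace(stack));
            return ret.toString();
        }

        public static String getStackTrace(StackTraceElement[] stack)
        {
            StringBuilder ret = new StringBuilder();
            for (StackTraceElement element : stack)
                ret.append("\tat ").append(element).append("\n");
            return ret.toString();
        }
    }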
Solution 7 - Java
Generally speaking, you should never write a catch block that catches java.lang.Error or any of its subclasses, including OutOfMemoryError. The only exception would be a third-party library that throws a custom subclass of Error when it should have subclassed RuntimeException. That is really just a workaround for an error in their code, though.
From the JavaDoc for java.lang.Error:
> An Error is a subclass of Throwable that indicates serious problems that a reasonable application should not try to catch.
If you are having problems with your application continuing to run even after one of the threads dies because of an OOME, you have a couple of options.
First, you might want to check whether it's possible to mark the remaining threads as daemon threads. If there is ever a point when only daemon threads remain in the JVM, it will run all the shutdown hooks and terminate in as orderly a way as possible. To do this you'll need to call setDaemon(true) on the thread object before it is started. If the threads are actually created by a framework or some other code, you might have to use a different means to set that flag.
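As a small illustration of the ordering requirement, the sketch below marks a thread as a daemon before starting it and shows that setDaemon is rejected once a thread is running. The class and method names are hypothetical.

```java
import java.util.concurrent.CountDownLatch;

// Hypothetical demo of the daemon flag and when it may be set.
class DaemonThreadDemo {

    /** Creates (but does not start) a thread that will not keep the JVM alive. */
    static Thread newDaemon(Runnable task, String name) {
        Thread t = new Thread(task, name);
        t.setDaemon(true); // must be called before start()
        return t;
    }

    /** Returns true if setDaemon() is rejected on a thread that is already running. */
    static boolean rejectsLateSetDaemon() {
        CountDownLatch release = new CountDownLatch(1);
        Thread t = new Thread(() -> {
            try {
                release.await(); // keep the thread alive until released
            } catch (InterruptedException ignored) {
                // exit quietly
            }
        });
        t.start();
        try {
            t.setDaemon(true); // too late: the thread is alive
            return false;
        } catch (IllegalThreadStateException expected) {
            return true;
        } finally {
            release.countDown(); // let the helper thread finish
        }
    }

    public static void main(String[] args) {
        Thread background = newDaemon(() -> { }, "background");
        System.out.println("daemon flag set: " + background.isDaemon());
        System.out.println("late setDaemon rejected: " + rejectsLateSetDaemon());
    }
}
```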
The other option is to assign an uncaught exception handler to the threads in question and call either System.exit() or, if absolutely necessary, Runtime.getRuntime().halt(). Calling halt is very dangerous because shutdown hooks won't even attempt to run, but in certain situations halt might work where System.exit would have failed because an OOME has already been thrown.
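A handler along these lines might look like the sketch below. The class name is hypothetical, and the exit action is passed in as a parameter purely so that the choice between System.exit and halt is explicit (and so the handler can be exercised without actually ending the JVM).

```java
import java.util.function.IntConsumer;

// Hypothetical per-thread handler: terminates the JVM when a thread dies from an Error.
class ExitOnErrorHandler implements Thread.UncaughtExceptionHandler {
    private final IntConsumer exitAction;

    ExitOnErrorHandler(IntConsumer exitAction) {
        this.exitAction = exitAction;
    }

    /** Variant that runs shutdown hooks before exiting. */
    static ExitOnErrorHandler exiting() {
        return new ExitOnErrorHandler(status -> System.exit(status));
    }

    /** Variant that stops immediately; shutdown hooks do not run. */
    static ExitOnErrorHandler halting() {
        return new ExitOnErrorHandler(status -> Runtime.getRuntime().halt(status));
    }

    @Override
    public void uncaughtException(Thread t, Throwable e) {
        if (e instanceof Error) {
            try {
                System.err.println("Fatal error in thread " + t.getName() + ": " + e);
            } finally {
                exitAction.accept(1); // runs even if logging itself fails
            }
        } else {
            // Ordinary exceptions only end the thread, as usual.
            e.printStackTrace();
        }
    }
}
```

It would be attached with something like `thread.setUncaughtExceptionHandler(ExitOnErrorHandler.halting())` before the thread is started.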
Solution 8 - Java
Since the JVM options
-XX:+ExitOnOutOfMemoryError
-XX:+CrashOnOutOfMemoryError
-XX:OnOutOfMemoryError=...
don't work if the OutOfMemoryError occurs because threads are exhausted (see the corresponding JDK bug report), it may be worth trying the tool jkill. It registers via JVMTI and exits the VM if memory or the available threads are exhausted.
In my tests it works as expected (and the way I would expect the JVM options to work).
Solution 9 - Java
You can surround your thread code with a try/catch for the OOME and do some manual cleanup if it occurs. One trick is to make your thread's run function nothing but a try/catch around another function. When the memory error propagates out of that inner function, its stack frames are unwound, which releases the references they held and gives you room to do some quick cleanup. This should work if you drop references to some resources and request a garbage collection immediately after catching, and/or set a dying flag that tells other threads to quit.
Once the thread that hit the OOME dies and its objects have been collected, you should have more than enough free space for the other threads to quit in an orderly fashion. This is a more graceful exit, with an opportunity to log the problem before dying.
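The wrapper pattern described here could be sketched as follows. The class name and the shared flag are hypothetical; other threads are assumed to poll the flag and stop when it becomes true.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical wrapper: run() is only a try/catch around the real work.
class GuardedWorker implements Runnable {
    private final Runnable body;
    private final AtomicBoolean shuttingDown;

    GuardedWorker(Runnable body, AtomicBoolean shuttingDown) {
        this.body = body;
        this.shuttingDown = shuttingDown;
    }

    @Override
    public void run() {
        try {
            body.run();
        } catch (OutOfMemoryError e) {
            // The frames of body.run() are gone by now, so objects that only
            // they referenced can be collected.
            shuttingDown.set(true); // tell the other threads to quit
            System.gc();            // only a request; the JVM may ignore it
            System.err.println("Out of memory in " + Thread.currentThread().getName());
        }
    }

    public static void main(String[] args) {
        AtomicBoolean flag = new AtomicBoolean(false);
        // The error is thrown by hand here to keep the example deterministic.
        new GuardedWorker(() -> { throw new OutOfMemoryError("simulated"); }, flag).run();
        System.out.println("shutdown requested: " + flag.get());
    }
}
```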