Reflection type inference on Java 8 Lambdas

Java Problem Overview

I was experimenting with the new Lambdas in Java 8, and I am looking for a way to use reflection on the lambda classes to get the return type of a lambda function. I am especially interested in cases where the lambda implements a generic superinterface. In the code example below, MapFunction<F, T> is the generic superinterface, and I am looking for a way to find out what type binds to the generic parameter T.

While Java throws away a lot of generic type information during compilation, subclasses (and anonymous subclasses) of generic superclasses and generic superinterfaces do preserve that type information, and it is accessible via reflection. In the example below (case 1), reflection tells me that the MyMapper implementation of MapFunction binds java.lang.Integer to the generic type parameter T.

Even for subclasses that are themselves generic, there are ways to find out what binds to one generic parameter if some others are known. Consider case 2 in the example below, the IdentityMapper, where both F and T bind to the same type. Once we know that, we know the return type T as soon as we know the input type F (which in my case we do).

The question is now, how can I realize something similar for the Java 8 lambdas? Since they are actually not regular subclasses of the generic superinterface, the above described method does not work. Specifically, can I figure out that the parseLambda binds java.lang.Integer to T, and the identityLambda binds the same to F and T?

PS: In theory it should be possible to decompile the lambda code and then use an embedded compiler (like the JDT) and tap into its type inference. I hope that there is a simpler way to do this ;-)

/**
 * The superinterface.
 */
public interface MapFunction<F, T> {
	
	T map(F value);
}

/**
 * Case 1: A non-generic subclass.
 */
public class MyMapper implements MapFunction<String, Integer> {

	public Integer map(String value) {
		return Integer.valueOf(value);
	}
}

/**
 * Case 2: A generic subclass.
 */
public class IdentityMapper<E> implements MapFunction<E, E> {

	public E map(E value) {
		return value;
	}
	
}

/**
 * Instantiation through lambda
 */

public MapFunction<String, Integer> parseLambda = (String str) -> { return Integer.valueOf(str); };

// E must be a type parameter of the enclosing class for this field to compile
public MapFunction<E, E> identityLambda = (value) -> { return value; };


public static void main(String[] args)
{
	// case 1
	getReturnType(MyMapper.class);    // -> returns java.lang.Integer
	
	// case 2
	getReturnTypeRelativeToParameter(IdentityMapper.class, String.class);    // -> returns java.lang.String
}

private static Class<?> getReturnType(Class<?> implementingClass)
{
	Type superType = implementingClass.getGenericInterfaces()[0];
	
	if (superType instanceof ParameterizedType) {
		ParameterizedType parameterizedType = (ParameterizedType) superType;
		return (Class<?>) parameterizedType.getActualTypeArguments()[1];
	}
	else return null;
}

private static Class<?> getReturnTypeRelativeToParameter(Class<?> implementingClass, Class<?> parameterType)
{
	Type superType = implementingClass.getGenericInterfaces()[0];
	
	if (superType instanceof ParameterizedType) {
		ParameterizedType parameterizedType = (ParameterizedType) superType;
		TypeVariable<?> inputType = (TypeVariable<?>) parameterizedType.getActualTypeArguments()[0];
		TypeVariable<?> returnType = (TypeVariable<?>) parameterizedType.getActualTypeArguments()[1];
		
		if (inputType.getName().equals(returnType.getName())) {
			return parameterType;
		}
		else {
			// some logic that figures out composed return types
		}
	}
	
	return null;
}
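
To make the failure concrete, here is a minimal self-contained sketch that runs the case 1 check against both the named class and a lambda. The class name LambdaErasureCheck and the helper boundReturnType are made up for this illustration. On the JDKs I am aware of, the named class yields java.lang.Integer while the lambda's runtime class only exposes the raw MapFunction interface, so the helper returns null for it; a different runtime could in principle behave differently.

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;

// Hypothetical demo class; not part of the original question.
class LambdaErasureCheck {

    interface MapFunction<F, T> {
        T map(F value);
    }

    // Case 1 from the question: a named, non-generic implementation.
    static class MyMapper implements MapFunction<String, Integer> {
        public Integer map(String value) {
            return Integer.valueOf(value);
        }
    }

    // Returns the type bound to T, or null if the first interface
    // is only available as a raw type (no generic signature recorded).
    static Type boundReturnType(Class<?> implementingClass) {
        Type superType = implementingClass.getGenericInterfaces()[0];
        if (superType instanceof ParameterizedType) {
            return ((ParameterizedType) superType).getActualTypeArguments()[1];
        }
        return null;
    }

    public static void main(String[] args) {
        MapFunction<String, Integer> parseLambda = str -> Integer.valueOf(str);

        System.out.println("MyMapper: " + boundReturnType(MyMapper.class));
        System.out.println("lambda:   " + boundReturnType(parseLambda.getClass()));
    }
}
```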

Java Solutions

Solution 1 - Java

The exact decision of how to map lambda code to interface implementations is left to the actual runtime environment. In principle, all lambdas implementing the same raw interface could share a single runtime class, just like MethodHandleProxies does. Using different classes for specific lambdas is an optimization performed by the actual LambdaMetafactory implementation, not a feature intended to aid debugging or reflection.

So even if you find more detailed information in the actual runtime class of a lambda interface implementation, it will be an artifact of the currently used runtime environment, which might not be available in a different implementation or even in other versions of your current environment.

If the lambda is Serializable you can use the fact that the serialized form contains the method signature of the instantiated interface type to puzzle the actual type variable values together.
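
As a rough sketch of that idea: a serializable lambda has a writeReplace method that returns a java.lang.invoke.SerializedLambda, and its getInstantiatedMethodType() holds the interface method's descriptor after the type variables have been substituted. The class name SerializedFormDemo and the helper instantiatedSignature are invented for this example, and the MapFunction here is assumed to extend Serializable, unlike the one in the question. Descriptors only contain erased types, so a List<String> would still show up as plain List.

```java
import java.io.Serializable;
import java.lang.invoke.SerializedLambda;
import java.lang.reflect.Method;

// Hypothetical demo class; not part of the original answer.
class SerializedFormDemo {

    // Assumption: the functional interface is made Serializable.
    interface MapFunction<F, T> extends Serializable {
        T map(F value);
    }

    // Returns the JVM descriptor of the interface method with the
    // type variables replaced by the types used at the lambda site.
    static String instantiatedSignature(Serializable lambda) {
        try {
            Method writeReplace = lambda.getClass().getDeclaredMethod("writeReplace");
            writeReplace.setAccessible(true);
            SerializedLambda form = (SerializedLambda) writeReplace.invoke(lambda);
            return form.getInstantiatedMethodType();
        } catch (ReflectiveOperationException | ClassCastException e) {
            throw new IllegalStateException("Not a serializable lambda", e);
        }
    }

    public static void main(String[] args) {
        MapFunction<String, Integer> parseLambda = str -> Integer.valueOf(str);

        // Argument and return types appear as class descriptors.
        System.out.println(instantiatedSignature(parseLambda));
    }
}
```

For parseLambda the descriptor names java.lang.String as the argument and java.lang.Integer as the return type, which is exactly the binding of F and T.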

Solution 2 - Java

This is currently solvable, but only in a pretty hacky way. Let me first explain a few things:

When you write a lambda, the compiler inserts an invokedynamic instruction pointing to the LambdaMetafactory, plus a private static synthetic method containing the body of the lambda. The synthetic method and the method handle in the constant pool both contain the generic type (if the lambda uses the type or it is explicit, as in your examples).
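
The synthetic method can be observed with plain reflection on the class that declares the lambda. The sketch below is only an illustration: the class name SyntheticLambdaMethods and the helper syntheticMethods are made up, and the exact method name (something like lambda$static$0) depends on the compiler.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; not part of the original answer.
class SyntheticLambdaMethods {

    interface MapFunction<F, T> {
        T map(F value);
    }

    // The compiler moves this lambda body into a synthetic method of this class.
    static final MapFunction<String, Integer> PARSE =
            (String str) -> Integer.valueOf(str);

    // Collects all compiler-generated (synthetic) methods declared by the class.
    static List<Method> syntheticMethods(Class<?> owner) {
        List<Method> result = new ArrayList<>();
        for (Method m : owner.getDeclaredMethods()) {
            if (m.isSynthetic()) {
                result.add(m);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        for (Method m : syntheticMethods(SyntheticLambdaMethods.class)) {
            // The lambda body shows up with String as parameter and Integer as return type.
            System.out.println(m);
        }
    }
}
```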

Now at runtime the LambdaMetafactory is called and generates a class, using ASM, that implements the functional interface; the body of its method simply calls the private static method with any arguments passed. The class is then injected into the original class using Unsafe.defineAnonymousClass (see John Rose's post) so it can access private members, etc.

Unfortunately, the generated class does not store the generic signatures (it could), so you can't use the usual reflection methods that allow you to get around erasure.

For a normal class you could inspect the bytecode using Class.getResource(ClassName + ".class"), but for anonymous classes defined using Unsafe you are out of luck. However, you can make the LambdaMetafactory dump them with the JVM argument:

java -Djdk.internal.lambda.dumpProxyClasses=/some/folder

By looking at the dumped class file (using javap -p -s -v), one can see that it does indeed call the static method. But the problem remains how to get the bytecode from within Java itself.

This, unfortunately, is where it gets hacky:

Using reflection we can call Class.getConstantPool and then access the MethodRefInfo to get the type descriptors. We can then use ASM to parse this and return the argument types. Putting it all together:

// On JDK 8 the ConstantPool class is sun.reflect.ConstantPool (an internal, unsupported API)
Method getConstantPool = Class.class.getDeclaredMethod("getConstantPool");
getConstantPool.setAccessible(true);
ConstantPool constantPool = (ConstantPool) getConstantPool.invoke(lambda.getClass());
// Entry layout: [0] = owner class, [1] = method name, [2] = method descriptor
String[] methodRefInfo = constantPool.getMemberRefInfoAt(constantPool.size() - 2);

int argumentIndex = 0;
String argumentType = jdk.internal.org.objectweb.asm.Type.getArgumentTypes(methodRefInfo[2])[argumentIndex].getClassName();
Class<?> type = Class.forName(argumentType);

Updated with jonathan's suggestion

Ideally, the classes generated by the LambdaMetafactory should store the generic type signatures (I might see if I can submit a patch to the OpenJDK), but currently this is the best we can do. The code above has the following problems:

It uses undocumented methods and classes
It is extremely vulnerable to code changes in the JDK
It doesn't preserve the generic types, so if you pass List<String> into a lambda it will come out as List

Solution 3 - Java

I recently added support for resolving lambda type arguments to TypeTools. Ex:

MapFunction<String, Integer> fn = str -> Integer.valueOf(str);
Class<?>[] typeArgs = TypeResolver.resolveRawArguments(MapFunction.class, fn.getClass());

The resolved type args are as expected:

assert typeArgs[0] == String.class;
assert typeArgs[1] == Integer.class;

To handle a passed lambda:

public void call(Callable<?> c) {
  // Assumes c is a lambda
  Class<?> callableType = TypeResolver.resolveRawArgument(Callable.class, c.getClass());
}

Note: The underlying implementation uses the ConstantPool approach outlined by @danielbodart which is known to work on Oracle JDK and OpenJDK (and possibly others).

Solution 4 - Java

Parameterized type information is only available at runtime for elements of code that are bound - that is, specifically compiled into a type. Lambdas do the same thing, but as your Lambda is de-sugared to a method rather than to a type, there is no type to capture that information.

Consider the following:

import java.util.Arrays;
import java.util.function.Function;

public class Erasure {

    static class RetainedFunction implements Function<Integer,String> {
        public String apply(Integer t) {
            return String.valueOf(t);
        }
    }

    public static void main(String[] args) throws Exception {
        Function<Integer,String> f0 = new RetainedFunction();
        Function<Integer,String> f1 = new Function<Integer,String>() {
            public String apply(Integer t) {
                return String.valueOf(t);
            }
        };
        Function<Integer,String> f2 = String::valueOf;
        Function<Integer,String> f3 = i -> String.valueOf(i);
	
        for (Function<Integer,String> f : Arrays.asList(f0, f1, f2, f3)) {
            try {
                System.out.println(f.getClass().getMethod("apply", Integer.class).toString());
            } catch (NoSuchMethodException e) {
                System.out.println(f.getClass().getMethod("apply", Object.class).toString());
            }
            System.out.println(Arrays.toString(f.getClass().getGenericInterfaces()));
        }
    }
}

f0 and f1 both retain their generic type information, as you'd expect. But as they're unbound methods that have been erased to Function<Object,Object>, f2 and f3 do not.

Solution 5 - Java

I have found a way of doing it for serializable lambdas. All my lambdas are serializable, so that works.

Thanks, Holger, for pointing me to the SerializedLambda.

The generic parameters are captured in the lambda's synthetic static method and can be retrieved from there. Finding the static method that implements the lambda is possible with the information from the SerializedLambda.

The steps are as follows:

Get the SerializedLambda via the write replacement method that is auto-generated for all serializable lambdas
Find the class that contains the lambda implementation (as a synthetic static method)
Get the java.lang.reflect.Method for the synthetic static method
Get generic types from that Method

UPDATE: Apparently, this does not work with all compilers. I have tried it with the compiler of Eclipse Luna (works) and the Oracle javac (does not work).

// sample how to use
public static interface SomeFunction<I, O> extends java.io.Serializable {
	
	List<O> applyTheFunction(Set<I> value);
}

public static void main(String[] args) throws Exception {
	
	SomeFunction<Double, Long> lambda = (set) -> Collections.singletonList(set.iterator().next().longValue());
			
	SerializedLambda sl = getSerializedLambda(lambda);		
	Method m = getLambdaMethod(sl);
	
	System.out.println(m);
	System.out.println(m.getGenericReturnType());
	for (Type t : m.getGenericParameterTypes()) {
		System.out.println(t);
	}

    // prints the following
    // (the method) private static java.util.List test.ClassWithLambdas.lambda$0(java.util.Set)
    // (the return type, including *Long* as the generic list type) java.util.List<java.lang.Long>
    // (the parameter, including *Double* as the generic set type) java.util.Set<java.lang.Double>
}

// getting the SerializedLambda
public static SerializedLambda getSerializedLambda(Object function) {
	if (function == null || !(function instanceof java.io.Serializable)) {
		throw new IllegalArgumentException();
	}
	
	for (Class<?> clazz = function.getClass(); clazz != null; clazz = clazz.getSuperclass()) {
        try {
            Method replaceMethod = clazz.getDeclaredMethod("writeReplace");
            replaceMethod.setAccessible(true);
            Object serializedForm = replaceMethod.invoke(function);

            if (serializedForm instanceof SerializedLambda) {
                return (SerializedLambda) serializedForm;
            }
        }
        catch (NoSuchMethodException e) {
            // fall through the loop and try the next class
        }
        catch (Throwable t) {
            throw new RuntimeException("Error while extracting serialized lambda", t);
        }
    }
	
	throw new IllegalStateException("writeReplace method not found");
}

// getting the synthetic static lambda method
public static Method getLambdaMethod(SerializedLambda lambda) throws Exception {
	String implClassName = lambda.getImplClass().replace('/', '.');
	Class<?> implClass = Class.forName(implClassName);
	
	String lambdaName = lambda.getImplMethodName();
	
	for (Method m : implClass.getDeclaredMethods()) {
		if (m.getName().equals(lambdaName)) {
			return m;
		}
	}
	
	throw new Exception("Lambda Method not found");
}

Content Type	Original Author	Original Content on Stackoverflow
Question	Stephan Ewen	View Question on Stackoverflow
Solution 1 - Java	Holger	View Answer on Stackoverflow
Solution 2 - Java	Daniel Worthington-Bodart	View Answer on Stackoverflow
Solution 3 - Java	Jonathan	View Answer on Stackoverflow
Solution 4 - Java	MrPotes	View Answer on Stackoverflow
Solution 5 - Java	Stephan Ewen	View Answer on Stackoverflow