Is it possible in java make something like Comparator but for implementing custom equals() and hashCode()

JavaCollectionsEqualsHashcode

Java Problem Overview


I have an array of objects and I want to concatenate it with another array of objects, except that objects that have same id's. That objects are used in many places in the system and don't have hash code or equals implemented. So I don't want to implement hashCode() and equals(), cause I'm afraid to break something somewhere in the system where that objects are used and I don't know about that.

I want to put all that objects in a set, but somehow make the objects use custom hashCode() and equals(). Something like custom Comparator, but for equals.

Java Solutions


Solution 1 - Java

Yes it is possible to do such a thing. (And people have done it.) But it won't allow you to put your objects into a HashMap, HashSet, etc. That is because the standard collection classes expect the key objects themselves to provide the equals and hashCode methods. (That is the way they are designed to work ...)

Alternatives:

  1. Implement a wrapper class that holds an instance of the real class, and provides its own implementation of equals and hashCode.

  2. Implement your own hashtable-based classes which can use a "hashable" object to provide equals and hashcode functionality.

  3. Bite the bullet and implement equals and hashCode overrides on the relevant classes.

In fact, the 3rd option is probably the best, because your codebase most likely needs to to be using a consistent notion of what it means for these objects to be equal. There are other things that suggest that your code needs an overhaul. For instance, the fact that it is currently using an array of objects instead of a Set implementation to represent what is apparently supposed to be a set.

On the other hand, maybe there was/is some real (or imagined) performance reason for the current implementation; e.g. reduction of memory usage. In that case, you should probably write a bunch of helper methods for doing operations like concatenating 2 sets represented as arrays.

Solution 2 - Java

90% of the time when a user wants an equivalence relation there is already a more straightforward solution. You want to de-duplicate a bunch of things based on ids only? Can you just put them all into a Map with the ids as keys, then get the values() collection of that?

Solution 3 - Java

HashingStrategy is the concept you're looking for. It's a strategy interface that allows you to define custom implementations of equals and hashcode.

public interface HashingStrategy<E>
{
    int computeHashCode(E object);
    boolean equals(E object1, E object2);
}

As others have pointed out, you can't use a HashingStrategy with the built in HashSet or HashMap. Eclipse Collections includes a set called UnifiedSetWithHashingStrategy and a map called UnifiedMapWithHashingStrategy.

Let's look at an example. Here's a simple Data class we can use in a UnifiedSetWithHashingStrategy.

public class Data
{
    private final int id;

    public Data(int id)
    {
        this.id = id;
    }

    public int getId()
    {
        return id;
    }

    // No equals or hashcode
}

Here's how you might set up a UnifiedSetWithHashingStrategy and use it.

java.util.Set<Data> set =
  new UnifiedSetWithHashingStrategy<>(HashingStrategies.fromFunction(Data::getId));
Assert.assertTrue(set.add(new Data(1)));

// contains returns true even without hashcode and equals
Assert.assertTrue(set.contains(new Data(1)));

// Second call to add() doesn't do anything and returns false
Assert.assertFalse(set.add(new Data(1)));

Why not just use a Map? UnifiedSetWithHashingStrategy uses half the memory of a UnifiedMap, and one quarter the memory of a HashMap. And sometimes you don't have a convenient key and have to create a synthetic one, like a tuple. That can waste more memory.

How do we perform lookups? Remember that Sets have contains(), but not get(). UnifiedSetWithHashingStrategy implements Pool in addition to MutableSet, so it also implements a form of get().

Note: I am a committer for Eclipse Collections.

Solution 4 - Java

Of course you can create some external object providing an equality comparison and a HashCode. But the build-in collections of Java do not use such an object for their comparisons/lookup.

I once did create an interface like this in my package-collection (just freshly translated to english):

public interface HashableEquivalenceRelation {

    /**
     * Returns true if two objects are considered equal.
     *
     * This should form an equivalence relation, meaning it
     * should fulfill these properties:
     *  <ul>
     *    <li>Reflexivity:  {@code areEqual(o, o)}
     *            should always return true.</li>
     *    <li>Symmetry: {@code areEqual(o1,o2) == areEqual(o2,o1)}
     *            for all objects o1 and o2</li>
     *    <li>Transitivity: If {@code areEqual(o1, o2)} and {@code areEqual(o2,o3)},
     *            then {@code areEqual(o1,o3}} should hold too.</li>
     *  </ul>
     * Additionally, the relation should be temporary consistent, i.e. the
     * result of this method for the same two objects should not change as
     * long as the objects do not change significantly (the precise meaning of
     * <em>change significantly</em> is dependent on the implementation).
     *
     * Also, if {@code areEqual(o1, o2)} holds true, then {@code hashCode(o1) == hashCode(o2)}
     * must be true too.
     */
    public boolean areEqual(Object o1, Object o2);

    /**
     * Returns a hashCode for an arbitrary object.
     *
     * This should be temporary consistent, i.e. the result for the same
     * objects should not change as long as the object does not change significantly
     * (with change significantly having the same meaning as for {@link areEqual}).
     *
     * Also, if {@code areEqual(o1, o2)} holds true, then {@code hashCode(o1) == hashCode(o2)}
     * must be true too.
     */
    public int hashCode(Object o);

}

Than I had a group of interfaces CustomCollection, CustomSet, CustomList, CustomMap, etc. defined like the interfaces in java.util, but using such an equivalence relation for all the methods instead of the build-in relation given by Object.equals. I had some default implementations, too:

/**
 * The equivalence relation induced by Object#equals.
 */
public final static EquivalenceRelation DEFAULT =
    new EquivalenceRelation() {
        public boolean areEqual(Object o1, Object o2)
        {
            return
                o1 == o2 ||
                o1 != null &&
                o1.equals(o2);
        }
        public int hashCode(Object ob)
        {
            return
                ob == null?
                0 :
                ob.hashCode();
        }
        public String toString() { return "<DEFAULT>"; }
    };

/**
 * The equivalence relation induced by {@code ==}.
 * (The hashCode used is {@link System#identityHashCode}.)
 */
public final static EquivalenceRelation IDENTITY =
    new EquivalenceRelation() {
        public boolean areEqual(Object o1, Object o2) { return o1 == o2; }
        public int hashCode(Object ob) { return System.identityHashCode(ob); }
        public String toString() { return "<IDENTITY>"; }
    };

/**
 * The all-relation: every object is equivalent to every other one.
 */
public final static EquivalenceRelation ALL =
    new EquivalenceRelation() {
        public boolean areEqual(Object o1, Object o2) { return true; }
        public int hashCode(Object ob) { return 0; }
        public String toString() { return "<ALL>"; }
    };

/**
 * An equivalence relation partitioning the references
 * in two groups: the null reference and any other reference.
 */
public final static EquivalenceRelation NULL_OR_NOT_NULL =
    new EquivalenceRelation() {
        public boolean areEqual(Object o1, Object o2)
        {
            return (o1 == null && o2 == null) ||
                (o1 != null && o2 != null);
        }
        public int hashCode(Object o) { return o == null ? 0 : 1; }
        public String toString() { return "<NULL_OR_NOT_NULL>"; }
    };

/**
 * Two objects are equivalent if they are of the same (actual) class.
 */
public final static EquivalenceRelation SAME_CLASS =
    new EquivalenceRelation() {
        public boolean areEqual(Object o1, Object o2)
        {
            return o1 == o2 || o1 != null && o2 != null &&
                o1.getClass() == o2.getClass();
        }
        public int hashCode(Object o) { return o == null ? 0 : o.getClass().hashCode(); }
        public String toString() { return "<SAME_CLASS>"; }
    };


/**
 * Compares strings ignoring case.
 * Other objects give a {@link ClassCastException}.
 */
public final static EquivalenceRelation STRINGS_IGNORE_CASE =
    new EquivalenceRelation() {
        public boolean areEqual(Object o1, Object o2)
        {
            return o1 == null ?
                o2 == null :
                ((String)o1).equalsIgnoreCase((String)o2);
        }
        public int hashCode(Object o)
        {
            return o == null ? -12345 : ((String)o).toUpperCase().hashCode();
        }
        public String toString() { return "<STRINGS_IGNORE_CASE>"; }
    };


/**
 * Compares {@link CharSequence} implementations by content.
 * Other object give a {@link ClassCastException}.
 */
public final static EquivalenceRelation CHAR_SEQUENCE_CONTENT =
    new EquivalenceRelation() {
        public boolean areEqual(Object o1, Object o2) 
        {
            CharSequence seq1 = (CharSequence)o1;
            CharSequence seq2 = (CharSequence)o2;
            if (seq1 == null ^ seq2 == null) // nur eins von beiden null
                return false;
            if (seq1 == seq2)   // umfasst auch den Fall null == null
                return true;
            int size = seq1.length();
            if (seq2.length() != size)
                return false;
            for (int i = 0; i < size; i++)
                {
                    if (seq1.charAt(i) != seq2.charAt(i))
                        return false;
                }
            return true;
        }
        /**
         * Entrspricht String.hashCode
         */
        public int hashCode(Object o)
        {
            CharSequence sequence = (CharSequence)o;
            if (sequence == null)
                return 0;
            int hash = 0;
            int size = sequence.length();
            for (int i = 0; i < size; i++)
                {
                    hash = hash * 31 + sequence.charAt(i);
                }
            return hash;
        }
    };
    

Solution 5 - Java

Use Guava Equivalence:

Equivalence<T> equivalence = new Equivalence<T>() {
    @Override
    protected boolean doEquivalent(T a, T b) {
        return CustomComparator.equals(a, b);
    }

    @Override
    protected int doHash(T item) {
        return CustomHashCodeGenerator.hashCode(item);
    }
};
List<T> items = getItems();
Set<Equivalence.Wrapper<T>> setWithWrappedObjects = items.stream()
    .map(item -> equivalence.wrap(item))
    .collect(Collectors.toSet());

Solution 6 - Java

Would using a TreeSet help here? A TreeSet actually performs ordering and Set based behavior using compare/compareTo and allows you to define a custom Comparator for use in one of the constructors.

Solution 7 - Java

Just had this problem and worked up a simple solution. Not sure how memory-intensive it is; I'm sure people can refine it down the line.

When the Comparator returns 0, the elements match.

public static <E> Set<E> filterSet(Set<E> set, Comparator<E> comparator){
	Set<E> output = new HashSet<E>();
	for(E eIn : set){
		boolean add = true;
		for(E eOut : output){
			if(comparator.compare(eIn, eOut) == 0){
				add = false;
				break;
			}
		}
		if(add) output.add(eIn);
	}
	return output;
}

My use case was that I needed to filter out duplicate URLs, as in URLs that point to the same document. The URL object has a samePage() method that will return true if everything except the fragment are the same.

filtered = Misc.filterSet(filtered, (a, b) -> a.sameFile(b) ? 0 : 1);

Solution 8 - Java

You will not succeed doing your de-duplicating concatenation with a Comparator. Presumably you are looking to do something like this:

List<Object> list = new ArrayList<Object>();
list.addAll( a );
list.addAll( b );
Collections.sort( list, new MyCustomComparator() );

The problem is that Comparator needs to compare not just for equals/not-equals, but also for relative order. Given objects x and y that are not equal, you have to answer if one is greater than the other. You will not be able to do that, since you aren't actually trying to compare the objects. If you don't give a consistent answer, you will send the sorting algorithm into an infinite loop.

I do have a solution for you. Java has a class called LinkedHashSet, whose virtue is that it doesn't allow duplicates to be inserted, but maintains insertion order. Rather than implementing a comparator, implement a wrapper class to hold the actual object and implement hashCode/equals.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestiondhblahView Question on Stackoverflow
Solution 1 - JavaStephen CView Answer on Stackoverflow
Solution 2 - JavaKevin BourrillionView Answer on Stackoverflow
Solution 3 - JavaCraig P. MotlinView Answer on Stackoverflow
Solution 4 - JavaPaŭlo EbermannView Answer on Stackoverflow
Solution 5 - JavaizogfifView Answer on Stackoverflow
Solution 6 - JavawhaleyView Answer on Stackoverflow
Solution 7 - Javandm13View Answer on Stackoverflow
Solution 8 - JavaKonstantin KomissarchikView Answer on Stackoverflow