Most efficient way to see if an ArrayList contains an object in Java

JavaAlgorithmOptimizationSearchArraylist

Java Problem Overview


I have an ArrayList of objects in Java. The objects have four fields, two of which I'd use to consider the object equal to another. I'm looking for the most efficient way, given those two fields, to see if the array contains that object.

The wrench is that these classes are generated based on XSD objects, so I can't modify the classes themselves to overwrite the .equals.

Is there any better way than just looping through and manually comparing the two fields for each object and then breaking when found? That just seems so messy, looking for a better way.

Edit: the ArrayList comes from a SOAP response that is unmarshalled into objects.

Java Solutions


Solution 1 - Java

It depends on how efficient you need things to be. Simply iterating over the list looking for the element which satisfies a certain condition is O(n), but so is ArrayList.Contains if you could implement the Equals method. If you're not doing this in loops or inner loops this approach is probably just fine.

If you really need very efficient look-up speeds at all cost, you'll need to do two things:

  1. Work around the fact that the class is generated: Write an adapter class which can wrap the generated class and which implement equals() based on those two fields (assuming they are public). Don't forget to also implement hashCode() (*)
  2. Wrap each object with that adapter and put it in a HashSet. HashSet.contains() has constant access time, i.e. O(1) instead of O(n).

Of course, building this HashSet still has a O(n) cost. You are only going to gain anything if the cost of building the HashSet is negligible compared to the total cost of all the contains() checks that you need to do. Trying to build a list without duplicates is such a case.


* () Implementing hashCode() is best done by XOR'ing (^ operator) the hashCodes of the same fields you are using for the equals implementation (but multiply by 31 to reduce the chance of the XOR yielding 0)

Solution 2 - Java

You could use a Comparator with Java's built-in methods for sorting and binary search. Suppose you have a class like this, where a and b are the fields you want to use for sorting:

class Thing { String a, b, c, d; }

You would define your Comparator:

Comparator<Thing> comparator = new Comparator<Thing>() {
  public int compare(Thing o1, Thing o2) {
    if (o1.a.equals(o2.a)) {
      return o1.b.compareTo(o2.b);
    }
    return o1.a.compareTo(o2.a);
  }
};

Then sort your list:

Collections.sort(list, comparator);

And finally do the binary search:

int i = Collections.binarySearch(list, thingToFind, comparator);

Solution 3 - Java

Given your constraints, you're stuck with brute force search (or creating an index if the search will be repeated). Can you elaborate any on how the ArrayList is generated--perhaps there is some wiggle room there.

If all you're looking for is prettier code, consider using the Apache Commons Collections classes, in particular CollectionUtils.find(), for ready-made syntactic sugar:

ArrayList haystack = // ...
final Object needleField1 = // ...
final Object needleField2 = // ...

Object found = CollectionUtils.find(haystack, new Predicate() {
   public boolean evaluate(Object input) {
      return needleField1.equals(input.field1) && 
             needleField2.equals(input.field2);
   }
});

Solution 4 - Java

If the list is sorted, you can use a binary search. If not, then there is no better way.

If you're doing this a lot, it would almost certainly be worth your while to sort the list the first time. Since you can't modify the classes, you would have to use a Comparator to do the sorting and searching.

Solution 5 - Java

Even if the equals method were comparing those two fields, then logically, it would be just the same code as you doing it manually. OK, it might be "messy", but it's still the correct answer

Solution 6 - Java

If you are a user of my ForEach DSL, it can be done with a Detect query.

Foo foo = ...
Detect<Foo> query = Detect.from(list);
for (Detect<Foo> each: query) 
    each.yield = each.element.a == foo.a && each.element.b == foo.b;
return query.result();

Solution 7 - Java

>Is there any better way than just looping through and manually comparing the two fields for each object and then breaking when found? That just seems so messy, looking for a better way.

If your concern is maintainability you could do what Fabian Steeg suggest ( that's what I would do ) although it probably isn't the "most efficient" ( because you have to sort the array first and then perform the binary search ) but certainly the cleanest and better option.

If you're really concerned with efficiency, you can create a custom List implementation that uses the field in your object as the hash and use a HashMap as storage. But probably this would be too much.

Then you have to change the place where you fill the data from ArrayList to YourCustomList.

Like:

 List list = new ArrayList();

 fillFromSoap( list );

To:

 List list = new MyCustomSpecialList();

 fillFromSoap( list );

The implementation would be something like the following:

class MyCustomSpecialList extends AbstractList  { 
    private Map<Integer, YourObject> internalMap;

    public boolean add( YourObject o ) { 
         internalMap.put( o.getThatFieldYouKnow(), o );
    }

    public boolean contains( YourObject o ) { 
        return internalMap.containsKey( o.getThatFieldYouKnow() );
    }

}

Pretty much like a HashSet, the problem here is the HashSet relies on the good implementation of the hashCode method, which probably you don't have. Instead you use as the hash "that field you know" which is the one that makes one object equals to the other.

Of course implementing a List from the scratch lot more tricky than my snippet above, that's why I say the Fabian Steeg suggestion would be better and easier to implement ( although something like this would be more efficient )

Tell us what you did at the end.

Solution 8 - Java

Maybe a List isn't what you need.

Maybe a TreeSet would be a better container. You get O(log N) insertion and retrieval, and ordered iteration (but won't allow duplicates).

LinkedHashMap might be even better for your use case, check that out too.

Solution 9 - Java

Building a HashMap of these objects based on the field value as a key could be worthwhile from the performance perspective, e.g. populate Maps once and find objects very efficiently

Solution 10 - Java

If you need to search many time in the same list, it may pay off to build an index.

Iterate once through, and build a HashMap with the equals value you are looking for as the key and the appropriate node as the value. If you need all instead of anyone of a given equals value, then let the map have a value type of list and build the whole list in the initial iteration.

Please note that you should measure before doing this as the overhead of building the index may overshadow just traversing until the expected node is found.

Solution 11 - Java

There are three basic options:

  1. If retrieval performance is paramount and it is practical to do so, use a form of hash table built once (and altered as/if the List changes).

  2. If the List is conveniently sorted or it is practical to sort it and O(log n) retrieval is sufficient, sort and search.

  3. If O(n) retrieval is fast enough or if it is impractical to manipulate/maintain the data structure or an alternate, iterate over the List.

Before writing code more complex than a simple iteration over the List, it is worth thinking through some questions.

  • Why is something different needed? (Time) performance? Elegance? Maintainability? Reuse? All of these are okay reasons, apart or together, but they influence the solution.

  • How much control do you have over the data structure in question? Can you influence how it is built? Managed later?

  • What is the life cycle of the data structure (and underlying objects)? Is it built up all at once and never changed, or highly dynamic? Can your code monitor (or even alter) its life cycle?

  • Are there other important constraints, such as memory footprint? Does information about duplicates matter? Etc.

Solution 12 - Java

I would say the simplest solution would be to wrap the object and delegate the contains call to a collection of the wrapped class. This is similar to the comparator but doesn't force you to sort the resulting collection, you can simply use ArrayList.contains().

public class Widget {
    	private String name;
    	private String desc;
    
    	public String getName() {
    		return name;
    	}
    
    	public void setName(String name) {
    		this.name = name;
    	}
    
    	public String getDesc() {
    		return desc;
    	}
    
    	public void setDesc(String desc) {
    		this.desc = desc;
    	}
    }
    
    
    
    public abstract class EqualsHashcodeEnforcer<T> {
    
    	protected T wrapped;
    
    	public T getWrappedObject() {
    		return wrapped;
    	}
    
    	@Override
    	public boolean equals(Object obj) {
    		return equalsDelegate(obj);
    	}
    
    	@Override
    	public int hashCode() {
    		return hashCodeDelegate();
    	}
    
    	protected abstract boolean equalsDelegate(Object obj);
    
    	protected abstract int hashCodeDelegate();
    }
    

    public class WrappedWidget extends EqualsHashcodeEnforcer<Widget> {
    
    	@Override
    	protected boolean equalsDelegate(Object obj) {
    		if (obj == null) {
    			return false;
    		}
    		if (obj == getWrappedObject()) {
    			return true;
    		}
    		if (obj.getClass() != getWrappedObject().getClass()) {
    			return false;
    		}
    		Widget rhs = (Widget) obj;
    
    		return new EqualsBuilder().append(getWrappedObject().getName(),
    				rhs.getName()).append(getWrappedObject().getDesc(),
    				rhs.getDesc()).isEquals();
    	}
    
    	@Override
    	protected int hashCodeDelegate() {
    
    		return new HashCodeBuilder(121, 991).append(
    				getWrappedObject().getName()).append(
    				getWrappedObject().getDesc()).toHashCode();
    	}
    
    }

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionParrotsView Question on Stackoverflow
Solution 1 - JavaWim CoenenView Answer on Stackoverflow
Solution 2 - JavaFabian SteegView Answer on Stackoverflow
Solution 3 - JavaMichael Brewer-DavisView Answer on Stackoverflow
Solution 4 - JavaMichael MyersView Answer on Stackoverflow
Solution 5 - Javaoxbow_lakesView Answer on Stackoverflow
Solution 6 - JavaakuhnView Answer on Stackoverflow
Solution 7 - JavaOscarRyzView Answer on Stackoverflow
Solution 8 - JavaBen HardyView Answer on Stackoverflow
Solution 9 - JavaRocket SurgeonView Answer on Stackoverflow
Solution 10 - JavaThorbjørn Ravn AndersenView Answer on Stackoverflow
Solution 11 - JavaJeremy RishelView Answer on Stackoverflow
Solution 12 - Javajonathan.coneView Answer on Stackoverflow