The JPA hashCode() / equals() dilemma

Tags: Java, Hibernate, JPA, Identity, EclipseLink

Java Problem Overview


There have been some discussions here about JPA entities and which hashCode()/equals() implementation should be used for JPA entity classes. Most (if not all) of them depend on Hibernate, but I'd like to discuss them JPA-implementation-neutrally (I am using EclipseLink, by the way).

Each possible implementation has its own advantages and disadvantages regarding:

  • hashCode()/equals() contract conformity (immutability) for List/Set operations
  • Whether identical objects (e.g. from different sessions, dynamic proxies from lazily-loaded data structures) can be detected
  • Whether entities behave correctly in detached (or non-persisted) state

As far as I can see, there are three options:

  1. Do not override them; rely on Object.equals() and Object.hashCode()
    • hashCode()/equals() work
    • cannot identify identical objects, problems with dynamic proxies
    • no problems with detached entities
  2. Override them, based on the primary key
    • hashCode()/equals() are broken
    • correct identity (for all managed entities)
    • problems with detached entities
  3. Override them, based on the Business-Id (non-primary key fields; what about foreign keys?)
    • hashCode()/equals() are broken
    • correct identity (for all managed entities)
    • no problems with detached entities

My questions are:

  1. Did I miss an option and/or pro/con point?
  2. What option did you choose and why?



UPDATE 1:

By "hashCode()/equals() are broken", I mean that successive hashCode() invocations may return differing values, which is (when correctly implemented) not broken in the sense of the Object API documentation, but which causes problems when trying to retrieve a changed entity from a Map, Set or other hash-based Collection. Consequently, JPA implementations (at least EclipseLink) will not work correctly in some cases.
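The effect described in UPDATE 1 can be reproduced without any JPA provider. The following is a minimal sketch, assuming a hypothetical IdBasedEntity whose key is assigned after the object is already in a HashSet; setId() merely stands in for what a provider does on persist.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical entity; JPA annotations are omitted so the sketch runs on its own.
class IdBasedEntity {
    private Long id; // null until the entity is "persisted"

    void setId(Long id) { this.id = id; }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof IdBasedEntity)) return false;
        return id != null && id.equals(((IdBasedEntity) o).id);
    }

    @Override
    public int hashCode() {
        return Objects.hashCode(id); // 0 while id is null, different once the id is set
    }
}

class LostInSetDemo {
    // Returns whether the set still finds the entity after its id was assigned.
    static boolean foundAfterIdAssignment() {
        Set<IdBasedEntity> set = new HashSet<>();
        IdBasedEntity entity = new IdBasedEntity();
        set.add(entity);             // stored under the hash of a null id
        entity.setId(42L);           // simulates the provider assigning the key
        return set.contains(entity); // looked up under the hash of 42
    }

    public static void main(String[] args) {
        System.out.println(foundAfterIdAssignment());
    }
}
```

The object is still physically in the set (iteration would show it), but the hash-based lookup no longer reaches it.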

UPDATE 2:

Thank you for your answers -- most of them have remarkable quality.
Unfortunately, I am still unsure which approach will be the best for a real-life application, or how to determine the best approach for my application. So, I'll keep the question open and hope for some more discussions and/or opinions.

Java Solutions


Solution 1 - Java

Read this very nice article on the subject: Don't Let Hibernate Steal Your Identity.

The conclusion of the article goes like this:

> Object identity is deceptively hard to implement correctly when objects are persisted to a database. However, the problems stem entirely from allowing objects to exist without an id before they are saved. We can solve these problems by taking the responsibility of assigning object IDs away from object-relational mapping frameworks such as Hibernate. Instead, object IDs can be assigned as soon as the object is instantiated. This makes object identity simple and error-free, and reduces the amount of code needed in the domain model.

Solution 2 - Java

I always override equals/hashcode and implement it based on the business id. Seems the most reasonable solution for me. See the following link.

> To sum all this stuff up, here is a listing of what will work or won't work with the different ways to handle equals/hashCode: [image not preserved]

EDIT:

To explain why this works for me:

  1. I don't usually use hash-based collections (HashMap/HashSet) in my JPA applications. If I must, I prefer to create a UniqueList solution.
  2. I think changing the business id at runtime is not a best practice for any database application. In the rare cases where there is no other solution, I'd give it special treatment, such as removing the element from the hash-based collection and putting it back.
  3. In my model, I set the business id in the constructor and don't provide setters for it. I let the JPA implementation change the field instead of the property.
  4. The UUID solution seems to be overkill. Why use a UUID if you have a natural business id? I would set the uniqueness of the business id in the database anyway. Why have THREE indexes for each table in the database then?
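A minimal sketch of this business-id approach, assuming a hypothetical Product entity whose business id is a SKU string. The class and field names are illustrative only, and the JPA annotations and the no-arg constructor a provider would require are left out so that the code compiles on its own.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical entity; in a real mapping the provider would populate the
// fields directly (field access), so the business id needs no setter.
class Product {
    private Long id;            // database key, not used for equality
    private final String sku;   // business id: set in the constructor, no setter
    private String description; // mutable, not part of equality

    Product(String sku, String description) {
        this.sku = Objects.requireNonNull(sku, "sku");
        this.description = description;
    }

    void setId(Long id) { this.id = id; }
    void setDescription(String description) { this.description = description; }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Product)) return false;
        return sku.equals(((Product) o).sku);
    }

    @Override
    public int hashCode() {
        return sku.hashCode();
    }
}

class BusinessKeyDemo {
    // True if the product is still found after its id and description changed.
    static boolean stillFoundAfterChanges() {
        Set<Product> set = new HashSet<>();
        Product p = new Product("SKU-1", "old text");
        set.add(p);
        p.setId(7L);                  // simulates persisting
        p.setDescription("new text"); // ordinary update
        return set.contains(new Product("SKU-1", "anything"));
    }
}
```

Because neither the database key nor the mutable fields take part in equality, persisting or updating the object does not move it to a different hash bucket.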

Solution 3 - Java

I have personally used all three of these strategies in different projects, and I must say that option 1 is, in my opinion, the most practicable in a real-life app. In my experience, breaking hashCode()/equals() conformity leads to many crazy bugs, because sooner or later you end up in a situation where the result of equality changes after an entity has been added to a collection.

But there are further options (also with their pros and cons):


a) hashCode/equals based on a set of immutable, non-null, constructor-assigned fields

(+) all three criteria are guaranteed

(-) field values must be available to create a new instance

(-) complicates handling if you must change one of them


b) hashCode/equals based on a primary key that is assigned by the application (in the constructor) instead of JPA

(+) all three criteria are guaranteed

(-) you cannot take advantage of simple, reliable ID generation strategies like DB sequences

(-) complicated if new entities are created in a distributed environment (client/server) or app server cluster


c) hashCode/equals based on a UUID assigned by the constructor of the entity

(+) all three criteria are guaranteed

(-) overhead of UUID generation

(-) there may be a small risk that the same UUID is used twice, depending on the algorithm used (this may be detected by a unique index in the DB)

Solution 4 - Java

If you want to use equals()/hashCode() for your Sets, in the sense that the same entity can only be in there once, then there is only one option: Option 2. That's because a primary key for an entity by definition never changes (if somebody does update it, it's not the same entity anymore).

You should take that literally: Since your equals()/hashCode() are based on the primary key, you must not use these methods, until the primary key is set. So you shouldn't put entities in the set, until they're assigned a primary key. (Yes, UUIDs and similar concepts may help to assign primary keys early.)

Now, it's theoretically also possible to achieve that with Option 3, even though so-called "business-keys" have the nasty drawback that they can change: "All you'll have to do is delete the already inserted entities from the set(s), and re-insert them." That is true - but it also means, that in a distributed system, you'll have to make sure, that this is done absolutely everywhere the data has been inserted to (and you'll have to make sure, that the update is performed, before other things occur). You'll need a sophisticated update mechanism, especially if some remote systems aren't currently reachable...

Option 1 can only be used, if all the objects in your sets are from the same Hibernate session. The Hibernate documentation makes this very clear in chapter 13.1.3. Considering object identity:

> Within a Session the application can safely use == to compare objects.
>
> However, an application that uses == outside of a Session might produce unexpected results. This might occur even in some unexpected places. For example, if you put two detached instances into the same Set, both might have the same database identity (i.e., they represent the same row). JVM identity, however, is by definition not guaranteed for instances in a detached state. The developer has to override the equals() and hashCode() methods in persistent classes and implement their own notion of object equality.

It continues to argue in favor of Option 3:

> There is one caveat: never use the database identifier to implement equality. Use a business key that is a combination of unique, usually immutable, attributes. The database identifier will change if a transient object is made persistent. If the transient instance (usually together with detached instances) is held in a Set, changing the hashcode breaks the contract of the Set.

This is true, if you

  • cannot assign the id early (e.g. by using UUIDs)
  • and yet you absolutely want to put your objects in sets while they're in transient state.

Otherwise, you're free to choose Option 2.

Then it mentions the need for a relative stability:

> Attributes for business keys do not have to be as stable as database primary keys; you only have to guarantee stability as long as the objects are in the same Set.

This is correct. The practical problem I see with this is: if you can't guarantee absolute stability, how will you be able to guarantee stability "as long as the objects are in the same Set"? I can imagine some special cases (like using sets only for a conversation and then throwing them away), but I would question the general practicability of this.


Short version:

  • Option 1 can only be used with objects within a single session.
  • If you can, use Option 2. (Assign PK as early as possible, because you can't use the objects in sets until the PK is assigned.)
  • If you can guarantee relative stability, you can use Option 3. But be careful with this.
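The limitation of Option 1 can be illustrated with plain objects. This is only a sketch: the hypothetical Customer class does not override equals()/hashCode(), and the two instances stand in for the same database row loaded by two different sessions.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical entity relying on Object.equals()/Object.hashCode() (option 1).
class Customer {
    final Long id;

    Customer(Long id) { this.id = id; }
}

class DetachedDuplicatesDemo {
    // Returns the size of a set holding two copies of the "same row".
    static int sizeWithTwoCopiesOfSameRow() {
        Customer fromSessionA = new Customer(1L); // stand-in for a detached instance
        Customer fromSessionB = new Customer(1L); // same row, other session
        Set<Customer> set = new HashSet<>();
        set.add(fromSessionA);
        set.add(fromSessionB);
        return set.size(); // identity-based equality keeps both copies
    }
}
```

With identity-based equality the set ends up holding both copies, which is exactly the situation the quoted documentation warns about.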

Solution 5 - Java

We usually have two IDs in our entities:

  1. Is for persistence layer only (so that persistence provider and database can figure out relationships between objects).
  2. Is for our application needs (equals() and hashCode() in particular)

Take a look:

@Entity
public class User {

	@Id
	private int id;  // Persistence ID
	private UUID uuid; // Business ID

	// assuming all fields are subject to change
	// If we forbid users change their email or screenName we can use these
	// fields for business ID instead, but generally that's not the case
	private String screenName;
	private String email;

	// I don't put UUID generation in constructor for performance reasons. 
	// I call setUuid() when I create a new entity
	public User() {
	}

	// This method is only called when a brand new entity is added to 
	// persistence context - I add it as a safety net only but it might work 
	// for you. In some cases (say, when I add this entity to some set before 
	// calling em.persist()) setting a UUID might be too late. If I get a log 
	// output it means that I forgot to call setUuid() somewhere.
	@PrePersist
	public void ensureUuid() {
		if (getUuid() == null) {
			log.warn(format("User's UUID wasn't set on time. " 
				+ "uuid: %s, name: %s, email: %s",
				getUuid(), getScreenName(), getEmail()));
			setUuid(UUID.randomUUID());
		}
	}

	// equals() and hashCode() rely on non-changing data only. Thus we 
	// guarantee that no matter how field values are changed we won't 
	// lose our entity in hash-based Sets.
	@Override
	public int hashCode() {
		return getUuid().hashCode();
	}

	// Note that I don't use direct field access inside my entity classes and
	// call getters instead. That's because Persistence provider (PP) might
	// want to load entity data lazily. And I don't use 
	//    this.getClass() == other.getClass() 
	// for the same reason. In order to support laziness PP might need to wrap
	// my entity object in some kind of proxy, i.e. subclassing it.
	@Override
	public boolean equals(final Object obj) {
		if (this == obj)
			return true;
		if (!(obj instanceof User))
			return false;
		return getUuid().equals(((User) obj).getUuid());
	}

	// Getters and setters follow
}

EDIT: to clarify my point regarding calls to setUuid() method. Here's a typical scenario:

User user = new User();
// user.setUuid(UUID.randomUUID()); // I should have called it here
user.setScreenName("Master Yoda");
user.setEmail("[email protected]");

jediSet.add(user); // here's bug - we forgot to set UUID and 
                   //we won't find Yoda in Jedi set

em.persist(user); // ensureUuid() was called and printed the log for me.

jediCouncilSet.add(user); // Ok, we got a UUID now

When I run my tests and see the log output I fix the problem:

User user = new User();
user.setUuid(UUID.randomUUID());

Alternatively, one can provide a separate constructor:

@Entity
public class User {

	@Id
	private int id;  // Persistence ID
	private UUID uuid; // Business ID

	... // fields

	// Constructor for Persistence provider to use
	public User() {
	}

	// Constructor I use when creating new entities
	public User(UUID uuid) {
		setUuid(uuid);
	}

	... // rest of the entity.
}

So my example would look like this:

User user = new User(UUID.randomUUID());
...
jediSet.add(user); // no bug this time

em.persist(user); // and no log output

I use a default constructor and a setter, but you may find two-constructors approach more suitable for you.

Solution 6 - Java

  1. If you have a business key, then you should use that for equals and hashCode.

  2. If you don't have a business key, you should not leave it with the default Object equals and hashCode implementations because that does not work after you merge an entity.

  3. You can use the entity identifier in the equals method only if the hashCode implementation returns a constant value, like this:

    @Entity
    public class Book implements Identifiable<Long> {
     
        @Id
        @GeneratedValue
        private Long id;
     
        private String title;
     
        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Book)) return false;
            Book book = (Book) o;
            return getId() != null && Objects.equals(getId(), book.getId());
        }
     
        @Override
        public int hashCode() {
            return getClass().hashCode();
        }
     
        //Getters and setters omitted for brevity
    }
    

Check out this test case on GitHub that proves this solution works like a charm.
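The reasoning behind the constant hash code can also be checked without a persistence provider. The following sketch assumes a hypothetical Article class that mirrors the equals()/hashCode() pair above; setId() simulates the identifier being generated at persist time.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for an entity; JPA annotations omitted.
class Article {
    private Long id;

    Article() { }
    Article(Long id) { this.id = id; }

    Long getId() { return id; }
    void setId(Long id) { this.id = id; }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Article)) return false;
        Article other = (Article) o;
        return getId() != null && getId().equals(other.getId());
    }

    @Override
    public int hashCode() {
        return getClass().hashCode(); // same value before and after the id is set
    }
}

class ConstantHashDemo {
    // True if the entity is found by itself and by an equal copy after persist.
    static boolean foundAfterIdAssignment() {
        Set<Article> set = new HashSet<>();
        Article article = new Article();
        set.add(article);    // added while the id is still null
        article.setId(1L);   // simulates the generated identifier
        return set.contains(article) && set.contains(new Article(1L));
    }
}
```

The price is that every instance of the class lands in the same hash bucket, so large hash-based collections of such entities degrade toward a linear scan.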

Solution 7 - Java

Although using a business key (option 3) is the most commonly recommended approach (Hibernate community wiki, "Java Persistence with Hibernate" p. 398), and this is what we mostly use, there's a Hibernate bug which breaks this for eager-fetched sets: HHH-3799. In this case, Hibernate can add an entity to a set before its fields are initialized. I'm not sure why this bug hasn't gotten more attention, as it really makes the recommended business-key approach problematic.

I think the heart of the matter is that equals and hashCode should be based on immutable state (reference Odersky et al.), and a Hibernate entity with Hibernate-managed primary key has no such immutable state. The primary key is modified by Hibernate when a transient object becomes persistent. The business key is also modified by Hibernate, when it hydrates an object in the process of being initialized.

That leaves only option 1, inheriting the java.lang.Object implementations based on object identity, or using an application-managed primary key as suggested by James Brundege in "Don't Let Hibernate Steal Your Identity" (already referenced by Stijn Geukens's answer) and by Lance Arlaus in "Object Generation: A Better Approach to Hibernate Integration".

The biggest problem with option 1 is that detached instances can't be compared with persistent instances using .equals(). But that's OK; the contract of equals and hashCode leaves it up to the developer to decide what equality means for each class. So just let equals and hashCode inherit from Object. If you need to compare a detached instance to a persistent instance, you can create a new method explicitly for that purpose, perhaps boolean sameEntity or boolean dbEquivalent or boolean businessEquals.
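A minimal sketch of such a dedicated comparison method, assuming a hypothetical Account entity; the name sameEntity is taken from the suggestions above, everything else is illustrative.

```java
// Hypothetical entity that keeps Object's equals()/hashCode() (option 1)
// and offers a separate, explicit database-identity comparison.
class Account {
    private final Long id; // null while the entity is not persisted

    Account(Long id) { this.id = id; }

    Long getId() { return id; }

    // True only if both objects carry the same non-null database id.
    boolean sameEntity(Account other) {
        return other != null && id != null && id.equals(other.getId());
    }
}

class SameEntityDemo {
    public static void main(String[] args) {
        Account detached = new Account(10L);
        Account managed = new Account(10L);
        System.out.println(detached.equals(managed));     // identity comparison
        System.out.println(detached.sameEntity(managed)); // database identity
    }
}
```

Hash-based collections keep working on object identity, while code that really wants to compare rows says so explicitly.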

Solution 8 - Java

I agree with Andrew's answer. We do the same thing in our application but instead of storing UUIDs as VARCHAR/CHAR, we split it into two long values. See UUID.getLeastSignificantBits() and UUID.getMostSignificantBits().

One more thing to consider is that calls to UUID.randomUUID() are pretty slow, so you might want to look into lazily generating the UUID only when needed, such as during persistence or calls to equals()/hashCode().

@MappedSuperclass
public abstract class AbstractJpaEntity extends AbstractMutable implements Identifiable, Modifiable {

	private static final long	serialVersionUID	= 1L;

	@Version
	@Column(name = "version", nullable = false)
	private int					version				= 0;

	@Column(name = "uuid_least_sig_bits")
	private long				uuidLeastSigBits	= 0;

	@Column(name = "uuid_most_sig_bits")
	private long				uuidMostSigBits		= 0;

	private transient int		hashCode			= 0;

	public AbstractJpaEntity() {
		//
	}

	public abstract Integer getId();

	public abstract void setId(final Integer id);

	public boolean isPersisted() {
		return getId() != null;
	}

	public int getVersion() {
		return version;
	}

	//calling UUID.randomUUID() is pretty expensive, 
	//so this is to lazily initialize uuid bits.
	private void initUUID() {
		final UUID uuid = UUID.randomUUID();
		uuidLeastSigBits = uuid.getLeastSignificantBits();
		uuidMostSigBits = uuid.getMostSignificantBits();
	}

	public long getUuidLeastSigBits() {
		//it's safe to assume uuidMostSigBits of a valid UUID is never zero
		if (uuidMostSigBits == 0) {
			initUUID();
		}
		return uuidLeastSigBits;
	}

	public long getUuidMostSigBits() {
		//it's safe to assume uuidMostSigBits of a valid UUID is never zero
		if (uuidMostSigBits == 0) {
			initUUID();
		}
		return uuidMostSigBits;
	}

	public UUID getUuid() {
		return new UUID(getUuidMostSigBits(), getUuidLeastSigBits());
	}

	@Override
	public int hashCode() {
		if (hashCode == 0) {
			hashCode = (int) (getUuidMostSigBits() >> 32 ^ getUuidMostSigBits() ^ getUuidLeastSigBits() >> 32 ^ getUuidLeastSigBits());
		}
		return hashCode;
	}

	@Override
	public boolean equals(final Object obj) {
		if (obj == null) {
			return false;
		}
		if (!(obj instanceof AbstractJpaEntity)) {
			return false;
		}
		//UUID guarantees a pretty good uniqueness factor across distributed systems, so we can safely
		//dismiss getClass().equals(obj.getClass()) here since the chance of two different objects (even 
		//if they have different types) having the same UUID is astronomical
		final AbstractJpaEntity entity = (AbstractJpaEntity) obj;
		return getUuidMostSigBits() == entity.getUuidMostSigBits() && getUuidLeastSigBits() == entity.getUuidLeastSigBits();
	}

	@PrePersist
	public void prePersist() {
		// make sure the uuid is set before persisting
		getUuidLeastSigBits();
	}

}

Solution 9 - Java

Jakarta Persistence 3.0, section 4.12 writes:

> Two entities of the same abstract schema type are equal if and only if they have the same primary key value.

I see no reason why Java code should behave differently.

If the entity class is in a so called "transient" state, i.e. it's not yet persisted and it has no identifier, then the hashCode/equals methods can not return a value, they ought to blow up, ideally implicitly with a NullPointerException when the method attempts to traverse the ID. Either way, this will effectively stop application code from putting a non-managed entity into a hash-based data structure. In fact, why not go one step further and blow up if the class and identifier are equal, but other important attributes such as the version are unequal (IllegalStateException)! Fail-fast in a deterministic way is always the preferred option.

Word of caution: Also document the blowing-up behavior. Documentation is important in and of itself, but it will hopefully also stop junior developers in the future from doing something stupid with your code (they have this tendency to suppress NullPointerException where it happened and the last thing on their mind is side-effects lol).

Oh, and always use getClass() instead of instanceof. The equals-method requires symmetry. If b is equal to a, then a must be equal to b. With subclasses, instanceof breaks this relationship (a is not instance of b).

Although I personally always use getClass() even when implementing non-entity classes (the type is state, and so a subclass adds state even if the subclass is empty or only contains behavior), instanceof would've been fine only if the class is final. But entity classes must not be final (§2.1) so we're really out of options here.

Some folks may not like getClass(), because of the persistence provider's proxy wrapping the object. This might have been a problem in the past, but it really shouldn't be. A provider not returning different proxy classes for different entities, well, I'd say that's not a very smart provider lol. Generally, we shouldn't solve a problem until there is a problem. And, it seems like Hibernate's own documentation doesn't even see it worthwhile mentioning. In fact, they elegantly use getClass() in their own examples (see this).

Lastly, if an entity has a subclass that is itself an entity, and the inheritance mapping strategy is not the default ("single table") but "joined subtype", then the primary key in the subclass table will be the same as in the superclass table. If the mapping strategy is "table per concrete class", the primary key may be the same as in the superclass. An entity subclass is very likely to add state and is therefore just as likely to be logically a different thing. An equals implementation that uses instanceof therefore cannot rely on the ID alone, because, as we just saw, the ID may be the same for different entities.

In my opinion, instanceof has no place at all in a non-final Java class, ever. This is especially true for persistent entities.
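The symmetry argument can be made concrete with a small sketch. The classes below are hypothetical and deliberately not JPA entities: Person and Employee use instanceof, while StrictPerson and StrictEmployee use getClass().

```java
// instanceof-based equality: the superclass accepts subclass instances.
class Person {
    final long id;

    Person(long id) { this.id = id; }

    @Override
    public boolean equals(Object o) {
        return o instanceof Person && ((Person) o).id == id;
    }

    @Override
    public int hashCode() { return Long.hashCode(id); }
}

// The subclass adds state and only accepts other Employee instances.
class Employee extends Person {
    final String badge;

    Employee(long id, String badge) { super(id); this.badge = badge; }

    @Override
    public boolean equals(Object o) {
        return o instanceof Employee && super.equals(o)
                && badge.equals(((Employee) o).badge);
    }
}

// getClass()-based equality: only objects of exactly the same class can match.
class StrictPerson {
    final long id;

    StrictPerson(long id) { this.id = id; }

    @Override
    public boolean equals(Object o) {
        if (o == null || getClass() != o.getClass()) return false;
        return ((StrictPerson) o).id == id;
    }

    @Override
    public int hashCode() { return Long.hashCode(id); }
}

class StrictEmployee extends StrictPerson {
    StrictEmployee(long id) { super(id); }
}
```

Here new Person(1).equals(new Employee(1, "x")) is true while the reverse call is false, so the instanceof pair is not symmetric; the getClass() pair answers false in both directions. How getClass() behaves with provider-generated proxies is the point debated elsewhere in this thread and is not settled by this sketch.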

Solution 10 - Java

There are obviously already very informative answers here but I will tell you what we do.

We do nothing (ie do not override).

If we do need equals/hashcode to work for collections we use UUIDs. You just create the UUID in the constructor. We use http://wiki.fasterxml.com/JugHome for UUID. UUID is a little more expensive CPU wise but is cheap compared to serialization and db access.

Solution 11 - Java

Please consider the following approach based on predefined type identifier and the ID.

The specific assumptions for JPA:

  • entities of the same "type" and the same non-null ID are considered equal
  • non-persisted entities (assuming no ID) are never equal to other entities

The abstract entity:

@MappedSuperclass
public abstract class AbstractPersistable<K extends Serializable> {

  @Id @GeneratedValue
  private K id;

  @Transient
  private final String kind;

  public AbstractPersistable(final String kind) {
    this.kind = requireNonNull(kind, "Entity kind cannot be null");
  }

  @Override
  public final boolean equals(final Object obj) {
    if (this == obj) return true;
    if (!(obj instanceof AbstractPersistable)) return false;
    final AbstractPersistable<?> that = (AbstractPersistable<?>) obj;
    return null != this.id
        && Objects.equals(this.id, that.id)
        && Objects.equals(this.kind, that.kind);
  }

  @Override
  public final int hashCode() {
    return Objects.hash(kind, id);
  }

  public K getId() {
    return id;
  }

  protected void setId(final K id) {
    this.id = id;
  }
}

Concrete entity example:

static class Foo extends AbstractPersistable<Long> {
  public Foo() {
    super("Foo");
  }
}

Test example:

@Test
public void test_EqualsAndHashcode_GivenSubclass() {
  // Check contract
  EqualsVerifier.forClass(Foo.class)
    .suppress(Warning.NONFINAL_FIELDS, Warning.TRANSIENT_FIELDS)
    .withOnlyTheseFields("id", "kind")
    .withNonnullFields("id", "kind")
    .verify();
  // Ensure new objects are not equal
  assertNotEquals(new Foo(), new Foo());
}

Main advantages here:

  • simplicity
  • ensures subclasses provide type identity
  • predicted behavior with proxied classes

Disadvantages:

  • Requires each entity to call super()

Notes:

  • Needs attention when using inheritance. E.g. instance equality of class A and class B extends A may depend on concrete details of the application.
  • Ideally, use a business key as the ID

Looking forward to your comments.

Solution 12 - Java

I have always used option 1 in the past because I was aware of these discussions and thought it was better to do nothing until I knew the right thing to do. Those systems are all still running successfully.

However, next time I may try option 2 - using the database generated Id.

Hashcode and equals will throw IllegalStateException if the id is not set.

This will prevent subtle errors involving unsaved entities from appearing unexpectedly.

What do people think of this approach?
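A minimal sketch of that idea, assuming a hypothetical Invoice entity. Throwing from equals() is a deliberate choice here and goes beyond what most callers expect, so treat it as an illustration rather than a recommendation.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical entity whose equals()/hashCode() fail fast until an id exists.
class Invoice {
    private Long id; // assigned by the database on persist

    void setId(Long id) { this.id = id; }

    private Long requireId() {
        if (id == null) {
            throw new IllegalStateException(
                "equals()/hashCode() used before the id was assigned");
        }
        return id;
    }

    @Override
    public boolean equals(Object o) {
        Long mine = requireId(); // checked first, so even x.equals(x) fails fast
        if (this == o) return true;
        if (!(o instanceof Invoice)) return false;
        return mine.equals(((Invoice) o).requireId());
    }

    @Override
    public int hashCode() {
        return requireId().hashCode();
    }
}

class FailFastDemo {
    // True if putting an unsaved entity into a HashSet is rejected.
    static boolean addingUnsavedEntityFails() {
        Set<Invoice> set = new HashSet<>();
        try {
            set.add(new Invoice()); // HashSet calls hashCode(), which throws
            return false;
        } catch (IllegalStateException expected) {
            return true;
        }
    }
}
```

Once setId() has been called, the same entity can be added to and found in hash-based collections as usual.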

Solution 13 - Java

The business-key approach doesn't suit us. We use a DB-generated ID plus a temporary transient tempId, and override equals()/hashCode() to solve the dilemma. All entities are descendants of Entity. Pros:

  1. No extra fields in the DB
  2. No extra coding in descendant entities; one approach for all
  3. No performance issues (as with UUID); the DB generates the ID
  4. No problems with HashMaps (no need to keep the use of equals() etc. in mind)
  5. The hash code of a new entity doesn't change over time, even after persisting

Cons:

  1. There may be problems with serializing and deserializing non-persisted entities
  2. The hash code of a saved entity may change after reloading it from the DB
  3. Non-persisted objects are always considered different (maybe this is right?)
  4. What else?

Look at our code:

@MappedSuperclass
abstract public class Entity implements Serializable {

    @Id
    @GeneratedValue
    @Column(nullable = false, updatable = false)
    protected Long id;

    @Transient
    private Long tempId;

    public void setId(Long id) {
        this.id = id;
    }

    public Long getId() {
        return id;
    }

    private void setTempId(Long tempId) {
        this.tempId = tempId;
    }

    // Fix the id on the first call from equals() or hashCode()
    private Long getTempId() {
        if (tempId == null)
            // if we have id already, use it, else use 0
            setTempId(getId() == null ? 0 : getId());
        return tempId;
    }

    @Override
    public boolean equals(Object obj) {
        if (super.equals(obj))
            return true;
        // take proxied object into account
        if (obj == null || !Hibernate.getClass(obj).equals(this.getClass()))
            return false;
        Entity o = (Entity) obj;
        return getTempId() != 0 && o.getTempId() != 0 && getTempId().equals(o.getTempId());
    }

    // hash doesn't change in time
    @Override
    public int hashCode() {
        return getTempId() == 0 ? super.hashCode() : getTempId().hashCode();
    }
}

Solution 14 - Java

IMO you have 3 options for implementing equals/hashCode

  • Use an application generated identity i.e. a UUID
  • Implement it based on a business key
  • Implement it based on the primary key

Using an application generated identity is the easiest approach, but comes with a few downsides

  • Joins are slower when using it as PK because 128 bits are simply bigger than 32 or 64 bits
  • "Debugging is harder" because checking with your own eyes whether some data is correct is pretty hard

If you can work with these downsides, just use this approach.

To overcome the join issue, one could use the UUID as the natural key and a sequence value as the primary key, but then you might still run into equals/hashCode implementation problems in compositional child entities that have embedded ids, since you will want to join based on the primary key. Using the natural key in the child entity's id and the primary key for referring to the parent is a good compromise.

@Entity class Parent {
  @Id @GeneratedValue Long id;
  @NaturalId UUID uuid;
  @OneToMany(mappedBy = "parent") Set<Child> children;
  // equals/hashCode based on uuid
}

@Entity class Child {
  @EmbeddedId ChildId id;
  @ManyToOne Parent parent;

  @Embeddable class ChildId {
    UUID parentUuid;
    UUID childUuid;
    // equals/hashCode based on parentUuid and childUuid
  }
  // equals/hashCode based on id
}

IMO this is the cleanest approach as it will avoid all downsides and at the same time provide you a value(the UUID) that you can share with external systems without exposing system internals.

Implementing it based on a business key, if you can expect one from the user, is a nice idea, but it comes with a few downsides as well

Most of the time this business key will be some kind of code that the user provides and less often a composite of multiple attributes.

  • Joins are slower because joining based on variable length text is simply slow. Some DBMS might even have problems creating an index if the key exceeds a certain length.
  • In my experience, business keys tend to change which will require cascading updates to objects referring to it. This is impossible if external systems refer to it

IMO you shouldn't implement or work with a business key exclusively. It's a nice add-on i.e. users can quickly search by that business key, but the system shouldn't rely on it for operating.

Implementing it based on the primary key has its problems, but maybe it's not such a big deal

If you need to expose ids to external system, use the UUID approach I suggested. If you don't, you could still use the UUID approach but you don't have to. The problem of using a DBMS generated id in equals/hashCode stems from the fact that the object might have been added to hash based collections before assigning the id.

The obvious way to get around this is to simply not add the object to hash based collections before assigning the id. I understand that this is not always possible because you might want deduplication before assigning the id already. To still be able to use the hash based collections, you simply have to rebuild the collections after assigning the id.

You could do something like this:

@Entity class Parent {
  @Id @GeneratedValue Long id;
  @OneToMany(mappedBy = "parent") Set<Child> children;
  // equals/hashCode based on id
}

@Entity class Child {
  @EmbeddedId ChildId id;
  @ManyToOne Parent parent;

  @PrePersist void prePersist() {
    parent.children.remove(this);
  }
  @PostPersist void postPersist() {
    parent.children.add(this);
  }

  @Embeddable class ChildId {
    Long parentId;
    @GeneratedValue Long childId;
    // equals/hashCode based on parentId and childId
  }
  // equals/hashCode based on id
}

I haven't tested the exact approach myself, so I'm not sure how changing collections in pre- and post-persist events works but the idea is:

  • Temporarily remove the object from hash based collections
  • Persist it
  • Re-add the object to the hash based collections

Another way of solving this is to simply rebuild all your hash based models after an update/persist.
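Rebuilding can be as simple as copying the collection, because the copy re-inserts every element under its current hash code. A minimal sketch, assuming a hypothetical Item class with id-based equals()/hashCode(); rebuild() is an illustrative helper, not part of any library.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical entity with id-based equality; JPA annotations omitted.
class Item {
    private Long id;

    void setId(Long id) { this.id = id; }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Item)) return false;
        return id != null && id.equals(((Item) o).id);
    }

    @Override
    public int hashCode() {
        return Objects.hashCode(id);
    }
}

class RebuildDemo {
    // Copying re-inserts each element, so hashes reflect the current ids.
    static <T> Set<T> rebuild(Set<T> stale) {
        return new HashSet<>(stale);
    }

    // Returns { found in stale set, found in rebuilt set }.
    static boolean[] lookupBeforeAndAfterRebuild() {
        Set<Item> set = new HashSet<>();
        Item item = new Item();
        set.add(item);      // hashed while the id is null
        item.setId(5L);     // simulates the id assigned on persist
        boolean before = set.contains(item);
        boolean after = rebuild(set).contains(item);
        return new boolean[] { before, after };
    }
}
```

The stale set must be replaced by the rebuilt one wherever it is referenced; the copy does not repair the original in place.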

In the end, it's up to you. I personally use the sequence based approach most of the time and only use the UUID approach if I need to expose an identifier to external systems.

Solution 15 - Java

This is a common problem in every IT system that uses Java and JPA. The pain point extends beyond implementing equals() and hashCode(); it affects how an organization refers to an entity and how its clients refer to the same entity. I've seen enough pain from not having a business key that I wrote my own blog post to express my view.

In short: use a short, human readable, sequential ID with meaningful prefixes as business key that's generated without any dependency on any storage other than RAM. Twitter's Snowflake is a very good example.

Solution 16 - Java

If UUID is the answer for many people, why don't we just use factory methods from business layer to create the entities and assign primary key at creation time?

for example:

@ManagedBean
public class MyCarFacade {
  @PersistenceContext
  private EntityManager em; // injected by the container

  public Car createCar(){
    Car car = new Car();
    em.persist(car);
    return car;
  }
}

This way we would get a default primary key for the entity from the persistence provider, and our hashCode() and equals() functions could rely on that.

We could also declare Car's constructors protected and then use reflection in our business method to access them. This way developers would not be tempted to instantiate Car with new, but would go through the factory method instead.

How'bout that?

Solution 17 - Java

I tried to answer this question myself and was never totally happy with the solutions I found until I read this post, and especially Drew's answer. I liked the way he lazily created the UUID and stored it optimally.

But I wanted to add even more flexibility, i.e. lazily create the UUID ONLY when hashCode()/equals() is accessed before the first persistence of the entity, while keeping each solution's advantages:

  • equals() means "object refers to the same logical entity"
  • use database ID as much as possible because why would I do the work twice (performance concern)
  • prevent problem while accessing hashCode()/equals() on not yet persisted entity and keep the same behaviour after it is indeed persisted

I would really apreciate feedback on my mixed-solution below

import java.util.UUID;
import javax.persistence.Column;
import javax.persistence.GeneratedValue;
import javax.persistence.GenerationType;
import javax.persistence.Id;
import javax.persistence.Transient;

public class MyEntity {

    @Id
    @Column(name = "ID", length = 20, nullable = false, unique = true)
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Long id = null;

    @Transient
    private UUID uuid = null;

    @Column(name = "UUID_MOST", nullable = true, unique = false, updatable = false)
    private Long uuidMostSignificantBits = null;
    @Column(name = "UUID_LEAST", nullable = true, unique = false, updatable = false)
    private Long uuidLeastSignificantBits = null;

    // Getter used by getUuid() below
    public Long getId() {
        return this.id;
    }

    @Override
    public final int hashCode() {
        return this.getUuid().hashCode();
    }

    @Override
    public final boolean equals(Object toBeCompared) {
        if (this == toBeCompared) {
            return true;
        }
        if (toBeCompared == null) {
            return false;
        }
        if (!this.getClass().isInstance(toBeCompared)) {
            return false;
        }
        return this.getUuid().equals(((MyEntity) toBeCompared).getUuid());
    }

    public final UUID getUuid() {
        // UUID already accessed on this physical object
        if (this.uuid != null) {
            return this.uuid;
        }
        // UUID one day generated on this entity before it was persisted
        if (this.uuidMostSignificantBits != null) {
            this.uuid = new UUID(this.uuidMostSignificantBits, this.uuidLeastSignificantBits);
        // UUID never generated on this entity before it was persisted
        } else if (this.getId() != null) {
            this.uuid = new UUID(this.getId(), this.getId());
        // UUID never accessed on this not yet persisted entity
        } else {
            this.setUuid(UUID.randomUUID());
        }
        return this.uuid;
    }

    private void setUuid(UUID uuid) {
        if (uuid == null) {
            return;
        }
        // For the one hypothetical case where a generated UUID could collide with a UUID built from IDs
        if (uuid.getMostSignificantBits() == uuid.getLeastSignificantBits()) {
            // Unchecked exception, so the method compiles without a throws clause
            throw new IllegalStateException("UUID: " + uuid + " format is only for internal use");
        }
        this.uuidMostSignificantBits = uuid.getMostSignificantBits();
        this.uuidLeastSignificantBits = uuid.getLeastSignificantBits();
        this.uuid = uuid;
    }
}

Solution 18 - Java

In practice it seems that Option 2 (primary key) is the most frequently used. A natural and IMMUTABLE business key is rare, and creating and maintaining synthetic keys is too heavy a price for situations that will probably never happen. Have a look at the spring-data-jpa AbstractPersistable implementation (https://github.com/spring-projects/spring-data-jpa/blob/master/src/main/java/org/springframework/data/jpa/domain/AbstractPersistable.java). The only caveat: with Hibernate, use Hibernate.getClass (see https://blog.oio.de/2010/09/24/instanceof-fails-with-hibernate-lazy-loading-and-entity-class-hierarchy/).

public boolean equals(Object obj) {
	if (null == obj) {
		return false;
	}
	if (this == obj) {
		return true;
	}
	if (!getClass().equals(ClassUtils.getUserClass(obj))) {
		return false;
	}
	AbstractPersistable<?> that = (AbstractPersistable<?>) obj;
	return null == this.getId() ? false : this.getId().equals(that.getId());
}

@Override
public int hashCode() {
	int hashCode = 17;
	hashCode += null == getId() ? 0 : getId().hashCode() * 31;
	return hashCode;
}

Just be careful when manipulating new (not yet persisted) objects in a HashSet/HashMap. By contrast, Option 1 (keeping the Object implementation) breaks right after a merge, which is a very common situation.
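
The caveat about new objects can be reproduced without any persistence provider. In the sketch below, `Customer` is a hypothetical entity that copies the id-based equals()/hashCode() shown above, and `setId(42L)` stands in for what a provider would do on persist.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical entity using the id-based equals()/hashCode() shown above.
class Customer {
    private Long id; // assigned by the persistence provider in real code

    Long getId() { return id; }
    void setId(Long id) { this.id = id; }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) return true;
        if (obj == null || getClass() != obj.getClass()) return false;
        Customer that = (Customer) obj;
        return id != null && id.equals(that.id);
    }

    @Override
    public int hashCode() {
        int hashCode = 17;
        hashCode += (id == null) ? 0 : id.hashCode() * 31;
        return hashCode;
    }
}

class NewEntityInHashSetPitfall {
    // Returns whether the set still finds the entity after its id was assigned.
    static boolean foundAfterIdAssigned() {
        Set<Customer> customers = new HashSet<>();
        Customer customer = new Customer();
        customers.add(customer); // stored under the hash code for id == null
        customer.setId(42L);     // simulates persist(): the hash code changes
        return customers.contains(customer);
    }

    public static void main(String[] args) {
        System.out.println(foundAfterIdAssigned()); // prints "false"
    }
}
```

The entity is still in the set, but the lookup uses the new hash code and therefore never reaches it.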

If you have no business key and have a REAL need to manipulate new entities in hash structures, override hashCode to return a constant, as Vlad Mihalcea advised (Solution 6).
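
A minimal sketch of that constant-hashCode variant, assuming a hypothetical `PurchaseOrder` entity and using `setId(42L)` as a stand-in for persist. The constant 31 is arbitrary.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical entity: id-based equals() combined with a constant hashCode().
class PurchaseOrder {
    private Long id;

    Long getId() { return id; }
    void setId(Long id) { this.id = id; }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) return true;
        if (obj == null || getClass() != obj.getClass()) return false;
        PurchaseOrder that = (PurchaseOrder) obj;
        return id != null && id.equals(that.id);
    }

    @Override
    public int hashCode() {
        return 31; // never changes, so assigning the id cannot move the entity to another bucket
    }
}

class ConstantHashCodeDemo {
    // Returns whether the set still finds the entity after its id was assigned.
    static boolean foundAfterIdAssigned() {
        Set<PurchaseOrder> orders = new HashSet<>();
        PurchaseOrder order = new PurchaseOrder();
        orders.add(order);
        order.setId(42L); // simulates persist()
        return orders.contains(order);
    }

    public static void main(String[] args) {
        System.out.println(foundAfterIdAssigned()); // prints "true"
    }
}
```

The price is that all instances land in the same bucket, so lookups in large hash-based collections degrade toward a scan over that bucket. For the small collections typical of entity associations this is often acceptable.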

Solution 19 - Java

Below is a simple (and tested) solution for Scala.

  • Note that this solution does not fit into any of the 3 categories given in the question.

  • All my Entities are subclasses of the UUIDEntity so I follow the don't-repeat-yourself (DRY) principle.

  • If needed, the UUID generation can be made more robust against collisions (by using more pseudo-random bits than a single Long).

Scala Code:

import javax.persistence._
import scala.util.Random

@Entity
@Inheritance(strategy = InheritanceType.TABLE_PER_CLASS)
abstract class UUIDEntity {
  @Id  @GeneratedValue(strategy = GenerationType.TABLE)
  var id:java.lang.Long=null
  var uuid:java.lang.Long=Random.nextLong()
  override def equals(o:Any):Boolean= 
    o match{
      case o : UUIDEntity => o.uuid==uuid
      case _ => false
    }
  override def hashCode() = uuid.hashCode()
}

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | MRalwasser | View Question on Stackoverflow
Solution 1 - Java | Stijn Geukens | View Answer on Stackoverflow
Solution 2 - Java | nanda | View Answer on Stackoverflow
Solution 3 - Java | lweller | View Answer on Stackoverflow
Solution 4 - Java | Chris Lercher | View Answer on Stackoverflow
Solution 5 - Java | Andrei Андрей Листочкин | View Answer on Stackoverflow
Solution 6 - Java | Vlad Mihalcea | View Answer on Stackoverflow
Solution 7 - Java | jbyler | View Answer on Stackoverflow
Solution 8 - Java | Drew | View Answer on Stackoverflow
Solution 9 - Java | Martin Andersson | View Answer on Stackoverflow
Solution 10 - Java | Adam Gent | View Answer on Stackoverflow
Solution 11 - Java | aux | View Answer on Stackoverflow
Solution 12 - Java | Neil Stevens | View Answer on Stackoverflow
Solution 13 - Java | Demel | View Answer on Stackoverflow
Solution 14 - Java | Christian Beikov | View Answer on Stackoverflow
Solution 15 - Java | Christopher Yang | View Answer on Stackoverflow
Solution 16 - Java | illEatYourPuppies | View Answer on Stackoverflow
Solution 17 - Java | user2083808 | View Answer on Stackoverflow
Solution 18 - Java | Grigory Kislin | View Answer on Stackoverflow
Solution 19 - Java | jhegedus | View Answer on Stackoverflow