JPA entities come with a few pitfalls. Understanding how JPA manages and synchronizes the state of an entity is essential to avoid unexpected behavior, and that behavior can be unintuitive when you pass entities as method parameters in Spring Data, both inside and outside a transaction. In this post I explain when it is okay to pass a JPA entity as a method parameter and when you should avoid it.

Source Code

You can find the example code on GitHub

Consider this scenario:

@Service
class UpdateUsernameService {

    @Transactional
    fun updateUsername(user: User, username: String) {
        user.username = username
    }

}

Reading this method, you would expect the username of the User to be updated. The pitfall lies in how the method is used. Let’s assume it is called from the outside, without an active persistence context:

@Test
fun `user is not updated when passing a detached entity`() {
    log.info("Find user")
    val user = userRepository.findByIdOrNull(id)!!
    // user is in a detached state since we are outside of a persistence context
    log.info("Call update username")
    updateUsernameService.updateUsername(user, "updated")

    log.info("Assert")
    assertThat(userRepository.findByIdOrNull(id)!!.username)
        .isNotEqualTo("updated")
}

This test demonstrates that the user is not updated as one might expect. Enabling trace logging for Hibernate’s SessionImpl with org.hibernate.internal.SessionImpl=trace makes it clear why the update was not persisted:

What is SessionImpl?

A session is Hibernate’s representation of a persistence context. Conceptually, it wraps a JDBC connection and acts as a factory for transactions. More information can be found in Hibernate’s architecture overview.

d.h.p.PassingjpaentitiesApplicationTests : Find user
org.hibernate.internal.SessionImpl       : Opened Session [4eae183f-0437-4177-802d-6d4617f25330] at timestamp: 1616259914962
org.hibernate.SQL                        : select user0_.id as id1_0_0_, user0_.username as username2_0_0_ from user user0_ where user0_.id=?
org.hibernate.internal.SessionImpl       : SessionImpl#beforeTransactionCompletion()
org.hibernate.internal.SessionImpl       : SessionImpl#afterTransactionCompletion(successful=true, delayed=false)
org.hibernate.internal.SessionImpl       : Closing session [4eae183f-0437-4177-802d-6d4617f25330]
d.h.p.PassingjpaentitiesApplicationTests : Call update username
org.hibernate.internal.SessionImpl       : Opened Session [5dcba3ac-02df-41e5-b165-e6dd64f471c2] at timestamp: 1616259914974
org.hibernate.internal.SessionImpl       : SessionImpl#beforeTransactionCompletion()
org.hibernate.internal.SessionImpl       : Automatically flushing session
org.hibernate.internal.SessionImpl       : SessionImpl#afterTransactionCompletion(successful=true, delayed=false)
org.hibernate.internal.SessionImpl       : Closing session [5dcba3ac-02df-41e5-b165-e6dd64f471c2]
d.h.p.PassingjpaentitiesApplicationTests : Assert
org.hibernate.internal.SessionImpl       : Opened Session [32c1f7f9-307f-44f5-8329-47d76b45703b] at timestamp: 1616259914984
org.hibernate.SQL                        : select user0_.id as id1_0_0_, user0_.username as username2_0_0_ from user user0_ where user0_.id=?
org.hibernate.internal.SessionImpl       : SessionImpl#beforeTransactionCompletion()
org.hibernate.internal.SessionImpl       : SessionImpl#afterTransactionCompletion(successful=true, delayed=false)
org.hibernate.internal.SessionImpl       : Closing session [32c1f7f9-307f-44f5-8329-47d76b45703b]

As the log shows, three sessions are opened: one when userRepository.findByIdOrNull(id) fetches the entity, one when the transaction for updateUsernameService.updateUsername(user, "updated") starts, and one for userRepository.findByIdOrNull(id) in the assertion. The SQL log contains no update statement. This is expected: the session opened for updateUsername did not load the user entity we pass as a parameter and therefore does not track it. The entity is in a detached state, so the change to username is never flushed. With this knowledge, there are two ways to fix the situation.
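To make the detached state more tangible, here is a deliberately simplified model in plain Kotlin. It is not how Hibernate is implemented: TrackedUser, FakeDatabase and FakeSession are made-up names, and the toy compares entities with the stored row on flush, whereas Hibernate compares them with a snapshot taken at load time. The point it illustrates is only that a session flushes changes for entities it loaded itself.

```kotlin
// Toy stand-in for a JPA entity; not the User class from the example project.
class TrackedUser(val id: Int, var username: String)

// Toy stand-in for a database table: id -> username.
class FakeDatabase(val rows: MutableMap<Int, String>)

// Toy persistence context: it only tracks entities that it loaded itself.
class FakeSession(private val db: FakeDatabase) {
    private val managed = mutableMapOf<Int, TrackedUser>()

    // Loads a row, registers the resulting object as managed and returns it.
    fun find(id: Int): TrackedUser? =
        managed[id] ?: db.rows[id]?.let { TrackedUser(id, it) }?.also { managed[id] = it }

    // Writes back every managed entity that differs from its row
    // and returns the number of updates that were issued.
    fun flush(): Int {
        var updates = 0
        for ((id, user) in managed) {
            if (db.rows[id] != user.username) {
                db.rows[id] = user.username
                updates++
            }
        }
        return updates
    }
}

fun main() {
    val db = FakeDatabase(mutableMapOf(1 to "original"))

    // Session 1 loads the user and is then discarded: the object is now detached.
    val detached = FakeSession(db).find(1)!!

    // Session 2 never loaded the user, so it has nothing to compare on flush.
    val session2 = FakeSession(db)
    detached.username = "updated"
    println(session2.flush()) // 0
    println(db.rows[1])       // original

    // Session 3 loads the user itself, so the change is detected and written.
    val session3 = FakeSession(db)
    session3.find(1)!!.username = "updated"
    println(session3.flush()) // 1
    println(db.rows[1])       // updated
}
```

Session 2 plays the role of the transaction opened by updateUsername in the failing test: it flushes, but it has nothing to flush.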

Open a session before the update

One option is opening the session earlier:

@Test
fun `user is updated inside of a persistent context`() {
    transactionTemplate.execute {
        log.info("Find user")
        val user = userRepository.findByIdOrNull(id)!!
        // This time the user is inside a persistence context and JPA takes care of persisting it
        log.info("Call update username")
        updateUsernameService.updateUsername(user, "updated")
    }

    log.info("Assert")
    assertThat(userRepository.findByIdOrNull(id)!!.username)
        .isEqualTo("updated")
}

Here we use the TransactionTemplate to open a transaction, and with it a session, around both the lookup of the User entity and the call to updateUsernameService.updateUsername(user, "updated"). With this approach, @Transactional on updateUsername does not open a nested transaction; it joins the existing one.

Info

You can change the default behavior of @Transactional by setting its propagation. The default is REQUIRED, which joins an existing transaction and creates a new one when none exists.
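As a rough sketch of what setting the propagation looks like, the following contrasts the default with REQUIRES_NEW. PropagationSketchService and its method names are hypothetical and not part of the example project; the annotation, the enum and its values are Spring's.

```kotlin
import org.springframework.stereotype.Service
import org.springframework.transaction.annotation.Propagation
import org.springframework.transaction.annotation.Transactional

// Hypothetical service, only here to contrast two propagation settings.
@Service
class PropagationSketchService {

    // Default: joins the caller's transaction, or starts one if none exists.
    @Transactional(propagation = Propagation.REQUIRED)
    fun joinOrCreate() {
    }

    // Suspends a running transaction, if any, and always starts a new one.
    @Transactional(propagation = Propagation.REQUIRES_NEW)
    fun alwaysNew() {
    }
}
```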

The log output demonstrates that this time we do execute an update statement on the database:

org.hibernate.internal.SessionImpl       : Opened Session [e0c016d4-e924-414a-b05b-bbcd58cadfe0] at timestamp: 1616265054604
d.h.p.PassingjpaentitiesApplicationTests : Find user
org.hibernate.SQL                        : select user0_.id as id1_0_0_, user0_.username as username2_0_0_ from user user0_ where user0_.id=?
d.h.p.PassingjpaentitiesApplicationTests : Call update username
org.hibernate.internal.SessionImpl       : SessionImpl#beforeTransactionCompletion()
org.hibernate.internal.SessionImpl       : Automatically flushing session
org.hibernate.SQL                        : update user set username=? where id=?
org.hibernate.internal.SessionImpl       : SessionImpl#afterTransactionCompletion(successful=true, delayed=false)
org.hibernate.internal.SessionImpl       : Closing session [e0c016d4-e924-414a-b05b-bbcd58cadfe0]
d.h.p.PassingjpaentitiesApplicationTests : Assert
org.hibernate.internal.SessionImpl       : Opened Session [f0223820-821d-417f-9d15-f0cdf852fd17] at timestamp: 1616265054626
org.hibernate.SQL                        : select user0_.id as id1_0_0_, user0_.username as username2_0_0_ from user user0_ where user0_.id=?
org.hibernate.internal.SessionImpl       : SessionImpl#beforeTransactionCompletion()
org.hibernate.internal.SessionImpl       : SessionImpl#afterTransactionCompletion(successful=true, delayed=false)
org.hibernate.internal.SessionImpl       : Closing session [f0223820-821d-417f-9d15-f0cdf852fd17]

Avoid passing JPA Entities

Another solution, and my favorite, is not to pass the JPA entity as a method parameter in the first place. The consequence is that you need to look up the entity every time you want to modify it.

@Transactional
fun updateUsernameById(userId: UUID, username: String) {
    val user = userRepository.findByIdOrNull(userId) ?: throw IllegalStateException()
    user.username = username
}

Doesn’t that hurt performance, since we issue multiple selects against the database?

No. Hibernate uses a concept called the first-level cache. This cache is enabled by default and holds every entity loaded during a session/transaction, keyed by its identifier. Once an entity has been loaded, even by a query on another attribute, a later lookup by id in the same session is served from the cache instead of the database. Imagine this other service that calls our UpdateUsernameService:

@Service
class BanUserService(
    private val userRepository: UserRepository,
    private val updateUsernameService: UpdateUsernameService
) {

    @Transactional
    fun banUser(username: String) {
        val user = userRepository.findByUsername(username) ?: throw IllegalStateException() // We do a select on the database
        updateUsernameService.updateUsernameById(user.id, "Banned")
        // do some other operations
    }

}

When we call banUser(username: String), the logs confirm that only one select is executed, even though the user is looked up in two different methods: first by username, then by id.

org.hibernate.internal.SessionImpl       : Opened Session [80093c1f-c424-4e79-8bc8-e5760ee35d42] at timestamp: 1616270839745
org.hibernate.SQL                        : select user0_.id as id1_0_, user0_.username as username2_0_ from user user0_ where user0_.username=?
org.hibernate.internal.SessionImpl       : SessionImpl#beforeTransactionCompletion()
org.hibernate.internal.SessionImpl       : Automatically flushing session
org.hibernate.SQL                        : update user set username=? where id=?
org.hibernate.internal.SessionImpl       : SessionImpl#afterTransactionCompletion(successful=true, delayed=false)
org.hibernate.internal.SessionImpl       : Closing session [80093c1f-c424-4e79-8bc8-e5760ee35d42]

This behavior makes it acceptable, performance-wise, to never pass JPA entities between (public) methods and to keep them an implementation detail that the caller never sees.
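The lookup pattern from the log can be sketched with a small identity map in plain Kotlin. This is a toy model under stated assumptions, not Hibernate's implementation: CachedUser and IdentityMapSession are made-up names, and the counter stands in for select statements sent to the database.

```kotlin
// Toy row type; not the User entity from the example project.
class CachedUser(val id: Int, var username: String)

// Toy first-level cache: an identity map keyed by entity id, plus a select counter.
class IdentityMapSession(private val rows: Map<Int, String>) {
    private val byId = mutableMapOf<Int, CachedUser>()
    var selects = 0
        private set

    // A lookup by id is answered from the map when the entity is already loaded.
    fun findById(id: Int): CachedUser? {
        byId[id]?.let { return it }
        selects++
        return rows[id]?.let { name -> byId.getOrPut(id) { CachedUser(id, name) } }
    }

    // A query by another attribute always runs a select, but it reuses the
    // instance that is already registered under the resulting id.
    fun findByUsername(username: String): CachedUser? {
        selects++
        val id = rows.entries.firstOrNull { it.value == username }?.key ?: return null
        return byId.getOrPut(id) { CachedUser(id, username) }
    }
}

fun main() {
    val session = IdentityMapSession(mapOf(1 to "alice"))
    val byName = session.findByUsername("alice") // runs the only select
    val byId = session.findById(1)               // served from the map
    println(session.selects)   // 1
    println(byName === byId)   // true
}
```

In this model the order matters: a query by username always counts as a select, so the saving comes from the later lookup by id.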

Another Option - Mandatory Transactions

There is another option. Instead of fetching the entity every time, you can declare the transaction as mandatory. Spring then throws an exception (IllegalTransactionStateException) when the method is called without an existing transaction.

@Transactional(propagation = Propagation.MANDATORY)
fun updateUsernameWithMandatoryTransaction(user: User, username: String) {
    user.username = username
}

This approach saves us from fetching the entity again, but it also introduces another way for our code to fail at runtime. And by Murphy’s law, anything that can go wrong eventually will.
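A test along the following lines could pin down that runtime failure. It is a sketch in the style of the tests above and makes a few assumptions: it lives in the same test class (so userRepository, updateUsernameService and id are available), the test itself does not run inside a transaction, and updateUsernameWithMandatoryTransaction is defined on UpdateUsernameService. The test name is made up.

```kotlin
import org.assertj.core.api.Assertions.assertThatThrownBy
import org.junit.jupiter.api.Test
import org.springframework.data.repository.findByIdOrNull
import org.springframework.transaction.IllegalTransactionStateException

@Test
fun `mandatory propagation rejects a call without a transaction`() {
    val user = userRepository.findByIdOrNull(id)!!

    // No transaction is active here, so Spring refuses to invoke the method body.
    assertThatThrownBy {
        updateUsernameService.updateUsernameWithMandatoryTransaction(user, "updated")
    }.isInstanceOf(IllegalTransactionStateException::class.java)
}
```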

Passing the ID in this example can also fail at runtime

Of course, passing the (primitive) id can also fail at runtime when you pass the wrong identifier by accident. That issue can be solved with typed ids, which I might cover in another blog post at some point.
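As a hint of what typed ids could look like, here is a minimal sketch using Kotlin value classes (available since Kotlin 1.5). UserId and OrderId are hypothetical types, the function is a stand-in for a service method, and how to map such a type in JPA is left out here.

```kotlin
import java.util.UUID

// Hypothetical wrapper types; they are not part of the example project.
@JvmInline
value class UserId(val value: UUID)

@JvmInline
value class OrderId(val value: UUID)

// Stand-in for a service method that takes a typed id instead of a raw UUID.
fun updateUsernameById(userId: UserId, username: String): String =
    "user ${userId.value} renamed to $username"

fun main() {
    val userId = UserId(UUID.fromString("00000000-0000-0000-0000-000000000001"))
    val orderId = OrderId(UUID.fromString("00000000-0000-0000-0000-000000000002"))

    println(updateUsernameById(userId, "Banned"))
    // updateUsernameById(orderId, "Banned") // does not compile: OrderId is not a UserId
}
```

With raw UUID parameters both calls would compile, and the mix-up would only surface at runtime.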

Conclusion

All three options have caveats. Passing the entity directly leaves it to the client to ensure that a transaction is present. Enforcing that with a mandatory transaction is slightly better, but still requires the client to be aware of the requirement. Passing the identifier every time is tedious, especially when you consider that the safest implementation would declare every method as transactional and fetch the entity from the database. But let’s be honest: this is too much boilerplate.

So, what should we do then? I’d say: be consistent in your code. How about relying on the rule that a transaction is opened before the service layer is accessed? Most of the time this will happen in the controller, and in 99% of the code it will work fine. In the rare cases where this is not the right way to open the persistence context, it will be clear that you don’t want that behavior, and you’ll make it explicit, maybe in the safest way possible.

Update - 2021-03-24

A former colleague of mine made me aware of the option to declare a transaction as mandatory as a third option. Also, after giving the topic more thought and getting good input from colleagues, I completely changed my conclusion.