
JPA and Spring Data Awesomeness

Software Development, Spring, Software Solutions

By Scott Smith
Since the dreary days of JDBC the Java community has developed a myriad ways of persisting data to a relational datastore. I'm going to spend the next few paragraphs taking a look at a few of the better options out there. But first a little history...
Object-Relational Mapping (ORM) has been around for more than 20 years. In the Java world there were also proprietary ORM solutions dating back to the mid-90s. But in 2001, Gavin King released Hibernate as open source and things have never been the same. At the time, EJBs were the enterprise solution for persistence, and JDBC was the Java API for database connectivity. But people were beginning to notice the brittle nature of these solutions. Code maintenance costs were high, development cycles were long, and general dissatisfaction was brewing over the boilerplate code that had to be arduously written over and over for the same types of solutions.
 
Hibernate, and soon after Spring, became a new paradigm for faster and lighter Java-based solutions. By 2006 Sun Microsystems publicly acknowledged their previously duplicitous EJB persistence solutions by releasing EJB 3.0. I cynically say duplicitous because Container Managed Persistence was never an efficient solution to the problem of persistence. They sidelined Container Managed Persistence and wholly embraced the new Java Persistence API, JPA. They released a reference implementation based on TopLink, another ORM solution. And soon thereafter, Hibernate began offering JPA support as well.
 
Revisions and enhancements to both TopLink and Hibernate have increased the options we have in approaching a given persistence problem. The Java Persistence API has likewise offered enhancements. Making matters more confusing, and also more convenient, Spring Data was recently released to offer even more streamlined development of persistence. Now we hardly have to develop a formal Data Access Object, or DAO, to encapsulate persistence. I'm going to take a look at a few of these options, what they can do, and how to decide which option is best for a given situation.
We'll start with an EJB 3.0 Entity Bean and a DAO that uses JPA constructs with the Java Persistence Query Language, JPQL. Then we'll move into Criteria Queries. After that we'll bring in Spring Data to generate basic database access. Then we'll show how to override the generated behavior, both with our own implementation code and with declared queries.
Sometimes the wiring for these sorts of projects can get a little tricky, but that only has to be endured once for a project. After that initial configuration, things get a bit easier.
Our sample Entity Bean is a Person. Java imports, accessors, and mutators are left out for brevity:
    @Entity
    @Table(name = "person")
    @NamedQueries({
        // Person.findAll does not conform to proper naming so it does not override.
        @NamedQuery(name = "Person.findAll", query = "select p from Person p order by p.name"), // Never gets called, only the generated one is used.
        @NamedQuery(name = "Person.findById", query = "select p from Person p where p.id = :id"), // Only called because of the Repository Implementation
        /*
         * Person.findByName is intentionally wrong to show that this does override by using the proper naming
         */
        @NamedQuery(name = "Person.findByName", query = "select p from Person p where p.name != ?1") // Overrides the generated method
    })
    public class Person implements Serializable {
        @Id @GeneratedValue
        private String id;
        private String name;
        private int age;
    }

Our PersonDao has a few simple methods using the JPA Entity Manager to make queries and insert data into the database.
 
    @Repository
    public class PersonDao {
        private static final Logger logger = LoggerFactory.getLogger(PersonDao.class);

        @PersistenceContext
        EntityManager em;

        @Transactional
        public List<Person> findAllNamedQuery() {
            List<Person> personList = new LinkedList<>(); // Null Object Pattern
            try {
                personList = em.createNamedQuery("Person.findAll", Person.class)
                        .getResultList();
            } catch (NoResultException nre) {
                // Defensive only: getResultList() returns an empty list when the table is empty.
            }
            return personList;
        }

        @Transactional
        public List<Person> findAllCriteriaQuery() {
            List<Person> personList = new LinkedList<>(); // Null Object Pattern
            try {
                CriteriaBuilder criteriaBuilder = em.getCriteriaBuilder();
                CriteriaQuery<Person> personCriteriaQuery = criteriaBuilder.createQuery(Person.class);
                Root<Person> personRoot = personCriteriaQuery.from(Person.class);
                personCriteriaQuery.select(personRoot);
                TypedQuery<Person> personTypedQuery = em.createQuery(personCriteriaQuery);
                personList = personTypedQuery.getResultList();
            } catch (NoResultException nre) {
                // Defensive only: getResultList() returns an empty list when the table is empty.
            }
            return personList;
        }

        @Transactional
        public void persist(Person person) {
            em.persist(person);
        }

        @Transactional
        public Person merge(Person person) {
            // merge() returns the managed copy; the instance passed in stays detached.
            return em.merge(person);
        }
    }

Let's walk through a few key points.

The Entity Bean is a simple Java bean with annotation markers to decorate attributes for use by the Entity Manager.

In the old days we used Hibernate Query Language, HQL, to write simplified, SQL-like object queries. These queries were compiled at runtime and corresponding SQL was generated to access the database. They were not validated by the compiler or container and they had to be recompiled again and again. There was a performance penalty and ORM gained a fair amount of criticism for it, even to the point that developers would resort to JDBC or even calling stored procedures to get the necessary throughput. There was also the potential for more bugs because code had to be specifically tested to trigger the HQL compilation to SQL.

@NamedQueries are special. They are validated at deployment time, and sometimes at compile time as well. The whole code base is scanned for @NamedQueries, and each one is compiled and validated once. So errors are uncovered faster and more reliably. This also means less runtime penalty because the compilation does not have to happen again and again for a given query. Therefore, @NamedQueries are generally the fastest way to run JPQL. JPQL/HQL should never be used in strings, even dynamically generated ones. I like to keep my @NamedQuery definitions on the Entity Bean that corresponds to the returned object type for the query. It makes locating and reusing them easier and more likely.

Now we'll take a look at the PersonDao.
It's marked @Repository, which makes it a Spring-managed singleton bean. @Repository is a specialization of @Component: it indicates that the class is a repository and makes it eligible for Spring's persistence exception translation. I'm calling it a DAO because that's what it is in this case. But you can see how it could return richer domain types and become more than a DAO. I don't want to get mired in the difference between the two here.

An Entity Manager is being dependency injected. Incidentally, Entity Managers are not thread safe. Only Entity Manager Factories are thread safe. But I'm using Spring to manage it, and Spring injects a shared proxy that hands each call to the Entity Manager bound to the current thread's transaction, so the injected field is safe to use from a singleton. I could also inject the Entity Manager Factory, but then my code would have to pull out an Entity Manager on the local execution stack to do database operations.
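One way to picture the injected Entity Manager is as a single shared proxy that forwards every call to an instance owned by the current thread. The sketch below shows that idea in plain Java. It is a simplification, not Spring's implementation: Spring binds the real Entity Manager to the current transaction, while this sketch uses a bare ThreadLocal. The names Session and SharedSessionProxy are made up for illustration.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical stand-in for a non-thread-safe resource such as an EntityManager.
interface Session {
    int id();
}

final class SharedSessionProxy {
    private SharedSessionProxy() {}

    /** Returns one proxy that is safe to share; each thread lazily gets its own Session behind it. */
    static Session create(Supplier<Session> factory) {
        ThreadLocal<Session> perThread = ThreadLocal.withInitial(factory);
        InvocationHandler handler = (proxy, method, methodArgs) -> {
            try {
                // Forward the call to the Session that belongs to the calling thread.
                return method.invoke(perThread.get(), methodArgs);
            } catch (InvocationTargetException e) {
                throw e.getCause(); // rethrow what the real Session threw
            }
        };
        return (Session) Proxy.newProxyInstance(
                Session.class.getClassLoader(), new Class<?>[] {Session.class}, handler);
    }

    public static void main(String[] args) {
        AtomicInteger created = new AtomicInteger();
        Session shared = create(() -> {
            int id = created.incrementAndGet();
            return () -> id;
        });
        int here = shared.id();
        int elsewhere = CompletableFuture.supplyAsync(shared::id).join(); // runs on another thread
        System.out.println("same thread reuses its session: " + (shared.id() == here));
        System.out.println("other thread got its own session: " + (elsewhere != here));
    }
}
```

The singleton DAO holds only the proxy, so it never shares a real session between threads.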

@Transactional annotations are used. Either Spring Proxies or AspectJ can be used to weave in transaction code. The alternative would be to start and stop transactions with the Entity Manager directly in code.
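To make the proxy option concrete, here is a minimal sketch of transaction weaving with a JDK dynamic proxy: every call through the interface is wrapped in begin, then commit or rollback. It only illustrates the idea and is not how Spring implements @Transactional. Tx, RecordingTx, PersonWriter, and TransactionalProxy are hypothetical names invented for the example.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

// Hypothetical minimal transaction API, standing in for a real transaction manager.
interface Tx {
    void begin();
    void commit();
    void rollback();
}

// Hypothetical business interface whose calls should run inside a transaction.
interface PersonWriter {
    void save(String name);
}

// Records transaction events so the weaving is observable.
final class RecordingTx implements Tx {
    final List<String> events = new ArrayList<>();
    public void begin() { events.add("begin"); }
    public void commit() { events.add("commit"); }
    public void rollback() { events.add("rollback"); }
}

final class TransactionalProxy {
    private TransactionalProxy() {}

    /** Wraps target so each call through iface runs between begin() and commit() or rollback(). */
    static <T> T wrap(Class<T> iface, T target, Tx tx) {
        Object proxy = Proxy.newProxyInstance(iface.getClassLoader(), new Class<?>[] {iface},
                (p, method, methodArgs) -> {
                    tx.begin();
                    try {
                        Object result = method.invoke(target, methodArgs);
                        tx.commit();
                        return result;
                    } catch (InvocationTargetException e) {
                        tx.rollback();      // the target method failed
                        throw e.getCause(); // surface the original exception
                    }
                });
        return iface.cast(proxy);
    }

    public static void main(String[] args) {
        RecordingTx tx = new RecordingTx();
        PersonWriter writer = wrap(PersonWriter.class, name -> { /* pretend to persist */ }, tx);
        writer.save("Ada");
        System.out.println(tx.events);
    }
}
```

One consequence of proxy-based weaving: only calls that go through the proxy are wrapped, so a method calling another method on the same object bypasses the transaction logic.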

So far we've used @NamedQueries to validate and compile our JPQL code, lowering the overhead of ORM, which improves performance and reduces the likelihood of bugs. Thanks to JPA, our mappings are defined in annotations instead of XML, which is easier to read and maintain. We're using Spring to manage this bean, and only one instance is created, so the memory footprint is low and the overhead of object creation is minimized. We're adding convenience to the inconvenient fact that Entity Managers are not thread safe: we don't have to pull Entity Managers out of Entity Manager Factories in code for each thread. We're weaving transactions so we don't have to write transaction code. By doing these things, we've already eliminated a lot of boilerplate code. Our findAllNamedQuery() method is now reduced to a simple call to the Entity Manager using a @NamedQuery. A NoResultException is caught as a precaution, although getResultList() returns an empty list rather than throwing when the result set is empty; only getSingleResult() throws it.
 
This is pretty good stuff, but what about the use case of dynamically generating the contents of a query using programming logic? I thought you told us never to do that? Sometimes we have to. That's what the Criteria API is all about. It gives us strong typing, validation, and performance. It's more difficult to use than @NamedQueries but it's a much better construct than JPQL Strings. It is a bit verbose, as findAllCriteriaQuery() shows: it expresses the same query as findAllNamedQuery(), minus the ordering, with the Criteria API.
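For the dynamic case, a sketch along these lines shows where the Criteria API earns its verbosity: the WHERE clause is assembled from whichever filters the caller supplies. PersonSearch and its parameters are hypothetical, the Person entity is the one shown earlier, and the imports assume the jakarta.persistence namespace (older JPA versions use javax.persistence). Attribute names are given as strings for brevity; the generated static metamodel would be needed for compile-time checking of them.

```java
import jakarta.persistence.EntityManager;
import jakarta.persistence.criteria.CriteriaBuilder;
import jakarta.persistence.criteria.CriteriaQuery;
import jakarta.persistence.criteria.Predicate;
import jakarta.persistence.criteria.Root;
import java.util.ArrayList;
import java.util.List;

final class PersonSearch {
    private PersonSearch() {}

    /** Builds the WHERE clause from whichever filters are present; null means "no filter". */
    static List<Person> find(EntityManager em, String name, Integer minAge) {
        CriteriaBuilder cb = em.getCriteriaBuilder();
        CriteriaQuery<Person> query = cb.createQuery(Person.class);
        Root<Person> person = query.from(Person.class);

        List<Predicate> filters = new ArrayList<>();
        if (name != null) {
            filters.add(cb.equal(person.get("name"), name));
        }
        if (minAge != null) {
            filters.add(cb.greaterThanOrEqualTo(person.<Integer>get("age"), minAge));
        }

        // An empty predicate array leaves the query unrestricted.
        query.select(person)
             .where(filters.toArray(new Predicate[0]))
             .orderBy(cb.asc(person.get("name")));
        return em.createQuery(query).getResultList();
    }
}
```

No JPQL string is concatenated anywhere, so the parameter values cannot alter the shape of the query.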

Finally, we cap off our DAO with the equivalents of create and update: persist() and merge().

You may already be familiar with a lot of these constructs and ideas. That's great. But can we simplify our code even more? This is where Spring Data comes in. We can simply define an interface and Spring Data will generate the implementation!
Let's take a look at another class, PersonRepositoryGenerated.
    public interface PersonRepositoryGenerated extends PagingAndSortingRepository<Person, String> {
        public List<Person> findAll();

        /**
         * This is overridden by the Person Entity @NamedQuery. That @NamedQuery is intentionally
         * wrong. Comment it out to test what Spring Data generates from this to see the difference.
         * @param name
         * @return
         */
        public Person findByName(String name);

        /**
         * Again, this @Query is intentionally wrong to illustrate that the generated one
         * is overridden. Comment this @Query out to see what Spring Data will do (correctly).
         * @param age
         * @return
         */
        @Query("select p from Person p where p.age != ?1")
        public Person findByAge(int age);
    }

This particular interface extends PagingAndSortingRepository. There are others you can extend. I've found this one useful when pagination is needed in the application. We're not doing that here, but it's still nice to show it in this example. Also, the configuration specifies the packages included in the Spring Data scanning, so implementations are generated for those interfaces when the application context starts. I am not showing the necessary configuration in this example either. It's a good idea to keep your package name spacing logical, consistent, and structured. It's easy to be confused by the code organization when there are multiple data access strategies and some don't even have implementations! Try to be disciplined so your code communicates your intentions to future maintainers.

I put some interesting notes in the code. In an existing application, retrofitting Spring Data can have some pitfalls. That's what we're exploring here. It's worth noting that @NamedQueries, again, wherever they exist in your code base, will override generated Spring Data code - *if* they conform to Spring Data naming conventions, that is, if they are named after the entity and the repository method, as in Person.findByName. A notable exception is the findAll @NamedQuery, in this example Person.findAll. It does *not* override the generated Spring Data code! At first glance, it would seem that it should. The most likely reason is that findAll() is one of the CRUD methods Spring Data implements in its base repository class, so it is never looked up as a named query.
The findByName() interface method is also generated but overridden by the @NamedQuery on the Person Entity class. I made the @NamedQuery intentionally wrong to illustrate that the @NamedQuery is picked up instead of the Spring Data generated one.
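The override rule can be pictured as a lookup by name that happens before any query is derived. The sketch below is a toy model of that order, not Spring Data's actual code: QueryLookupSketch and its methods are invented for illustration, the Person class is an empty stand-in for the entity, and the derivation only understands findBy followed by a single property.

```java
import java.util.Map;

// Empty stand-in for the Person entity shown earlier.
final class Person {}

final class QueryLookupSketch {
    private QueryLookupSketch() {}

    /** Name under which a repository method looks for a named query: "<Entity>.<method>". */
    static String namedQueryKey(Class<?> entityType, String repositoryMethod) {
        return entityType.getSimpleName() + "." + repositoryMethod;
    }

    /** Prefers a declared named query; otherwise falls back to a query derived from the method name. */
    static String resolve(Map<String, String> namedQueries, Class<?> entityType, String repositoryMethod) {
        String declared = namedQueries.get(namedQueryKey(entityType, repositoryMethod));
        return declared != null ? declared : derive(entityType, repositoryMethod);
    }

    /** Toy derivation that only understands findBy<Property>. */
    static String derive(Class<?> entityType, String repositoryMethod) {
        String prefix = "findBy";
        if (!repositoryMethod.startsWith(prefix) || repositoryMethod.length() == prefix.length()) {
            throw new IllegalArgumentException("cannot derive a query from " + repositoryMethod);
        }
        String property = repositoryMethod.substring(prefix.length());
        property = Character.toLowerCase(property.charAt(0)) + property.substring(1);
        return "select p from " + entityType.getSimpleName() + " p where p." + property + " = ?1";
    }

    public static void main(String[] args) {
        Map<String, String> named =
                Map.of("Person.findByName", "select p from Person p where p.name != ?1");
        System.out.println(resolve(named, Person.class, "findByName")); // declared query wins
        System.out.println(resolve(named, Person.class, "findByAge"));  // nothing declared, so derived
    }
}
```

This is why a misnamed @NamedQuery is silently ignored: if the key does not match the entity and method name exactly, the lookup misses and the derived query runs instead.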

The next example is the findByAge() interface method. It uses a new construct called @Query to override the default Spring Data generated query. This obviously expresses the developer's intention a little more clearly. I'm using Spring Data to generate all the boilerplate data access code, but I'm overriding the query itself. A notable drawback is this JPQL query is not available to be reused in other places. But if that's not an issue, this is a better way to go.

What happens when a @NamedQuery and a @Query exist? I'll leave that experiment as an exercise for the reader. But I would advise against it because your intentions will be confusing to other developers in the future.

You can see that Spring Data can eliminate a lot of rote data access coding, but you do not get to skip knowing the fundamentals, because there will always be situations where Spring Data doesn't make sense. In this example, we've seen that each construct has a valid use case, and which one you choose is a value judgement based on your particular needs. Spring Data is another great tool that will eliminate a lot of rote coding for simpler use cases. The basic accesses can be handled in this way. More complex use cases may require more comprehensive approaches as shown in the PersonDao.
 
