Sunday, February 8, 2015

Replacing the JPA Java serialization engine with JSON

By default, JPA stores complex types (classes, containers, etc.) that are used as fields and are not annotated with @Embedded, @OneToOne, @OneToMany, @ElementCollection, or a similar annotation as a byte array in the database.
The byte array is written and read with Java serialization.
Although Java serialization works well in many scenarios, it has a number of downsides.
For small objects (which is often the case with these blob fields) it generates relatively large byte arrays because of the class metadata it adds to every entry.
Another downside is that you can't read the values when examining the database.
If you have the following classes:

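// Stored as a serialized blob, so the class must be Serializable.
class Address implements java.io.Serializable {
  String country;
  String street;
  String zipCode;
}

class Person {
  public int id;
  public Address address; // no mapping annotation, so it is serialized
}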

you will find a column named address of type byte array in the database, whose contents you can't read at the database level.
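
To make the size overhead concrete, here is a minimal sketch that serializes an Address-like object with default Java serialization and compares the result with a hand-written JSON string. The BlobSizeDemo class, its helper methods, and the sample values are illustrative assumptions, and the JSON is built by hand (without escaping) only to keep the example free of dependencies. Exact byte counts depend on the class and field names.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class BlobSizeDemo {

    // Same shape as the Address class above (hypothetical copy for the demo).
    static class Address implements Serializable {
        private static final long serialVersionUID = 1L;
        String country;
        String street;
        String zipCode;

        Address(String country, String street, String zipCode) {
            this.country = country;
            this.street = street;
            this.zipCode = zipCode;
        }
    }

    // Bytes produced by default Java serialization, including class metadata.
    static byte[] javaSerialize(Object value) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Hand-written JSON for illustration only: no escaping, no null handling.
    static String toJson(Address a) {
        return "{\"country\":\"" + a.country
                + "\",\"street\":\"" + a.street
                + "\",\"zipCode\":\"" + a.zipCode + "\"}";
    }

    public static void main(String[] args) {
        Address address = new Address("IL", "1 Main St", "12345");
        int javaSize = javaSerialize(address).length;
        int jsonSize = toJson(address).getBytes(StandardCharsets.UTF_8).length;
        System.out.println("Java serialization: " + javaSize + " bytes");
        System.out.println("JSON:               " + jsonSize + " bytes");
    }
}
```

Running main prints both sizes, so you can try it with your own field values and see how much of the blob is metadata rather than data.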

This is why replacing the default JPA serialization with another engine can be very helpful.
For example, by switching the engine to Jackson (a JSON serialization library) with a few lines of code, we managed to reduce the size of blob fields by 40%, and serialization became quicker too.
But the best part is that afterwards, instead of finding binary data in the database, you will find JSON strings, which you can read and even manipulate from the database layer if needed (for example during an application version upgrade, in stored procedures, or in selects).
In addition, Postgres and other databases let you create indexes on inner JSON fields, which adds even more value.
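
As a rough illustration of the idea (not necessarily the exact mechanism used in our project), the sketch below shows the shape of a converter that turns an Address into a JSON string on the way to the database and back on the way out. To keep it runnable without any jars, it declares a local stand-in interface with the same two methods as JPA 2.1's javax.persistence.AttributeConverter, and it builds and parses the JSON by hand instead of calling Jackson. The names JsonConverterSketch and AddressJsonConverter are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class JsonConverterSketch {

    // Local stand-in mirroring the two methods of javax.persistence.AttributeConverter,
    // declared here only so the sketch compiles without a JPA dependency.
    interface AttributeConverter<X, Y> {
        Y convertToDatabaseColumn(X attribute);
        X convertToEntityAttribute(Y dbData);
    }

    static class Address {
        String country;
        String street;
        String zipCode;
    }

    // Hypothetical converter: entity attribute <-> JSON text column.
    static class AddressJsonConverter implements AttributeConverter<Address, String> {

        // Matches flat "name":"value" pairs; assumes values contain no quotes.
        private static final Pattern FIELD = Pattern.compile("\"(\\w+)\":\"([^\"]*)\"");

        @Override
        public String convertToDatabaseColumn(Address a) {
            if (a == null) {
                return null;
            }
            // Hand-written JSON for illustration: no escaping, no null fields.
            return "{\"country\":\"" + a.country
                    + "\",\"street\":\"" + a.street
                    + "\",\"zipCode\":\"" + a.zipCode + "\"}";
        }

        @Override
        public Address convertToEntityAttribute(String json) {
            if (json == null) {
                return null;
            }
            Address a = new Address();
            Matcher m = FIELD.matcher(json);
            while (m.find()) {
                switch (m.group(1)) {
                    case "country": a.country = m.group(2); break;
                    case "street":  a.street = m.group(2);  break;
                    case "zipCode": a.zipCode = m.group(2); break;
                    default: break; // ignore unknown fields
                }
            }
            return a;
        }
    }

    public static void main(String[] args) {
        Address a = new Address();
        a.country = "IL";
        a.street = "1 Main St";
        a.zipCode = "12345";

        AddressJsonConverter converter = new AddressJsonConverter();
        String column = converter.convertToDatabaseColumn(a);
        System.out.println(column);
        System.out.println(converter.convertToEntityAttribute(column).street);
    }
}
```

In a real project you would implement the actual JPA interface, register the class with @Converter and @Convert, and replace the hand-written JSON with Jackson's ObjectMapper (writeValueAsString and readValue), which handles escaping, nulls, and nested types.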

If you are using EclipseLink 2.6 as your JPA provider and Postgres as your database, you can benefit from these new possibilities by combining the following two features:
java-persistence-performance (Check out the EclipseLink JPA section at the bottom)
datatype-json (Check indexing options)

So you can now store full objects or object fields as JSON in the database and still index and query them.

* It can also be done with EclipseLink versions prior to 2.6.
* If you need code examples, let me know.
* See also: stackoverflow.com/questions/changing-jpa-eclipselink-java-serialization-to-using-jackson-json

Saturday, January 7, 2012

Combining the power of Hibernate Search and Solr

For people working with Hibernate to manage their objects' persistence, Hibernate Search is a real savior. After trying to develop similar functionality (collecting all object changes and sending them to a full text search engine upon transaction commit) you find out pretty fast that there are lots of pitfalls out there.
On the other hand, Solr has some significant advantages (like 1:m facets, if you need them) which make it a better choice in some scenarios.
Since Hibernate Search is written in a very extensible way, it turns out there is an easy option to combine the two. This way you get Hibernate Search to do all the hard work of monitoring Hibernate and collecting the changes, while you use Solr to do all the search and faceting work.
By configuring the Hibernate Search default worker backend with the "hibernate.search.default.worker.backend" configuration parameter (or by configuring Hibernate Search to forward all changes through JMS) you can catch all changes in a very convenient format and forward them to Solr using the SolrJ client interface.
Although in most scenarios you will probably prefer working with Hibernate Search for both updates and queries, if for some reason you need Solr and you use Hibernate, you can still get a lot from Hibernate Search.
You can build a system in a very short time where FTS fields are annotated using Hibernate Search annotations and all updates end up in the Solr server. Then you can use Solr for queries and configure it to return only the IDs of the relevant documents / objects. Afterwards you can use the list of IDs with Hibernate to translate it into Hibernate objects.
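
One detail worth sketching for that last step: when you load entities by a list of IDs, the database does not generally return them in Solr's relevance order, so you may want to reorder them yourself. The helper below is a minimal, dependency-free sketch of that reordering. The SolrIdOrdering and Person classes and the orderByIds method are hypothetical names, and the list of loaded entities stands in for whatever your Hibernate query returns.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

public class SolrIdOrdering {

    // Returns the loaded entities in the order given by rankedIds.
    // IDs with no matching entity (e.g. deleted since indexing) are skipped.
    static <T, ID> List<T> orderByIds(List<ID> rankedIds,
                                      Collection<T> loaded,
                                      Function<T, ID> idOf) {
        Map<ID, T> byId = new HashMap<>();
        for (T entity : loaded) {
            byId.put(idOf.apply(entity), entity);
        }
        List<T> ordered = new ArrayList<>(rankedIds.size());
        for (ID id : rankedIds) {
            T entity = byId.get(id);
            if (entity != null) {
                ordered.add(entity);
            }
        }
        return ordered;
    }

    // Hypothetical entity used only for the demo.
    static class Person {
        final long id;
        final String name;

        Person(long id, String name) {
            this.id = id;
            this.name = name;
        }
    }

    public static void main(String[] args) {
        // IDs as they might come back from Solr, best match first (assumed values).
        List<Long> rankedIds = Arrays.asList(3L, 1L, 2L);
        // Entities as they might come back from an "id in (...)" query.
        List<Person> loaded = Arrays.asList(
                new Person(1L, "Ann"), new Person(2L, "Bob"), new Person(3L, "Cid"));

        for (Person p : orderByIds(rankedIds, loaded, person -> person.id)) {
            System.out.println(p.id + " " + p.name);
        }
    }
}
```

Skipping missing IDs rather than failing is a design choice for the case where the index is slightly behind the database; you could throw instead if the two must always agree.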
Any comments / inputs are always welcome...
If you are interested, I've written a small example which shows the main concepts at:
https://github.com/avner-levy/hibernate_search_solr_integration

    Avner