NoSQL Distilled : Book Review



I have finished reading the book titled "NoSQL Distilled: A Brief Guide to the Emerging World of Polyglot Persistence" written by  Martin Fowler and Pramod J. Sadalage  recently and thought of a writing short review. First of all it short book and you can run through the pages fast. As the title says, the book is really an introduction of No SQL databases.



It was..

The number one reason for the use of NoSQL databases is not performance and should be our use case. This means if our data model does not fit well in to the relational model, rather than twisting the data we can choose alternate data stores. This is absolutely critical if you want NoSQL to be used in most of the mid sized enterprise applications.  Also companies will start to believe that they should consider NoSQL it even if they are not Google or Amazon. 

Normally the technology books written by its creators of it will only focus on the advantages of using it. It can be quite an issue if you do not do proper research before choosing your application architecture. Since both Martin and Pramod do not belong to this category, they does a fair evaluation of various choices available to us. The section which says where to use it and where not to use is particularly impressive on that front.

Even though the book is selling you a new idea of NoSQL databases, it does not plays down the significance and importance of existing RDBMS. The book also brings light to some of the fundamental trade offs in the world of data stores like  consistency, availability and performance.

and is also..

When I was reading the book I thought that Polygot persistence is about using multiple data stores in same application depending on the use cases. But authors also seems to suggest that applications should wrap around the data stores with services ( like web services or REST). If we need to do this ideally we could only have one data store accessed by one service. Which would conflict the first statement. I am not sure about the reverence this question but I would love to have read more about using the different data store in the same application and its pros and cons.

I mean to say, 

This is a very great book if you are  looking get in to wild jungle of NoSQL databases. The authors does a great job in giving the brief introduction and making sure the basics are covered.

Package your classes by Feature and not by Layers


Most of the enterprise Java applications share some similarities in their design. Mostly the packaging of these applications are driven by the framework used by them like Spring, EJBs or Hibernate etc. Alternatively you can group you packages by features. Like any other item regarding modeling this is not free from any issues. Lets discuss some trade-off and how get around them. In this post we will discuss the pros and cons of both approaches against common usage scenarios.

Package By Layer (PBL)

This is a the first thing that developers do when the create an enterprise application in to split it to number of layers like DAO, SERVICE, VIEW etc.. This gives nice separation of code when we use different frameworks at different layers. For example if I were to use Hibernate, Spring and JSF, then I will have all my Hibernate dependent code in the DAO layer and JSF related code in the VIEW Layer. This is sort of good in case I need to migrate to a new framework only at the view layer or DAO Layer.




Package By Feature (PBF)

Package-by-feature uses packages to reflect the feature set. It places all items related to a single feature (and only that feature) into a single directory/package. This results in packages with high cohesion and high modularity, and with minimal coupling between packages. Items that work closely together are placed next to each other. They aren't spread out all over the application.

This also increases coherence as a large percentage of a the dependencies of a class are located close to that class.
 



Comparing the approaches

Let me compare both the approaches in below dimensions.

1. Adding a new Feature.

In case of PBL code has to be added to VIEW, SERVICE and DAO Layers and it can be tedious. PBF solves this problem by grouping all the code related to same feature in to a single directory.

2. Changing a framework used.

As discussed already PBL makes it easier to change a framework as all the related code are kept at same place. Here we know exactly the scope of the change and its impact. In case of PBF we need to dig into all the feature set to see the framework related classes.

If you choose to migrate the framework module by module, then it could be argued that PBF is better than PBL.

3. Code Navigation.

As developers needs to work on the features most of the time using PBF is easier for code navigation. When you know exactly know what has to be done its not much advantage.

4. Keeping the Common Code.

Every application will have some components which will be reused across the features ie, the features are not always exclusive. In such case if we package all the features separately the interactions between them can be quite messy. As a general principle we need to reduce such interactions and increase the cohesion inside the package. We can get around the situation be adding such entities to a common package. This approach is used in many projects including Hibernate.

Conclusion

Most application will have 4 types of classes. ie,

1. Domain Objects
2. Business Services
3. Data Retrieval Logic
4. Data Representation Logic

If we use PBF it gives us a good structural representation but does not give any functional representation. We need our architecture to resemble the problem domain. So its better to use Package By Feature style. We could internally classify the classes by using naming conventions like MyFeatureDAO or MyFeatureService . This way we could communicate the intend of the classes.

I have found some discussions on this topic and hope that might also help you choose.