There are a lot of ways to create software. In fact, there are even a lot of ways to create good software. When it comes to application server development, one of them has stood the test of time, and for good reason: Java Enterprise Edition.
Java EE is much more than a software library. It is also an architecture and a philosophy. JEE is not for the faint of heart; if you are building a throw-away prototype, you are headed the wrong way. But if you came to read about an architecture that can carry major enterprises and support massive-scale applications, then you're right on track. JEE is the heavy artillery of software development. And it's freaking awesome.
In this article, we will explore the architectural side of JEE. An article about the implementation will follow soon.
A little side note. When I talk about Java Enterprise in this article, I'm not specifically talking about the library, but the architecture and the philosophy. The official Java EE libraries are just one way of implementing this stack.
So you are ready for some serious software development? Cast aside any untyped languages, fancy scripting, hype-driven development and hipster-tech. Let's get serious and write software that runs for the next 20+ years. Here is the Java EE architecture in a nutshell:
Okay, granted, you need a big nutshell. Let's start with some initial observations:
- Every communication with the outside world is strictly request-response based.
- The incoming request passes through several layers before it reaches your application code. Each layer can refuse, redirect or alter the request. The motivation behind these layers is separation of concerns.
- Many of the layers are already implemented and you just need to make use of them.
- JEE is all about allowing developers to focus on business functionality only. 99% of everything else has been taken care of for you.
When a request reaches our server machine over the network, it is first passed to the operating system. The OS will determine to which application to forward the request, based on the port it has been sent to. In this case, the application is the Java Virtual Machine. The JVM internally runs an application container (such as Apache Tomcat or Glassfish). An application container implements the Java Servlet API. An application container has several responsibilities:
- It manages one or more applications which are delivered as servlets. In practice, most containers only hold a single servlet, but in theory one Tomcat can hold arbitrarily many servlets.
- It provides an implementation of the servlet API. This allows the contained applications to talk to the container. The most prominent use of this feature is to establish a filter chain (more on that later).
- It provides integration with the Services API of the operating system. That way, an application running inside the container can be started, terminated and rebooted as an OS-level service. For that reason, even though Java is multi-platform, many application containers contain platform-specific code (so the contained applications remain platform-independent).
- It redirects incoming requests to the correct application based on the path mapping. It is not uncommon to see one registered servlet for static content (bound to /static) and one for the dynamic API of your application server (bound to /api).
- It manages the thread pool for requests and binds each request to a thread. As the thread holds the request context, it is discouraged to manually start new threads in a JEE environment (unless you know exactly what you are doing).
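The routing and threading responsibilities from the list above can be modeled in a few lines of plain Java. This is only a simplified sketch and not the real Servlet API: `MiniContainer`, `Handler` and the registered path prefixes are hypothetical names made up for illustration, and a real container does far more (lifecycle, configuration, error handling).

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical stand-in for a servlet: turns a request path into a response body.
interface Handler {
    String handle(String path);
}

// Toy model of an application container (NOT the real Servlet API).
class MiniContainer {
    // Path prefix -> handler, checked in registration order.
    private final Map<String, Handler> mappings = new LinkedHashMap<>();
    // The container, not the application code, owns the request threads.
    private final ExecutorService pool = Executors.newFixedThreadPool(4);

    void register(String prefix, Handler handler) {
        mappings.put(prefix, handler);
    }

    // Binds the request to a pooled thread and routes it by path prefix.
    Future<String> submit(String path) {
        return pool.submit(() -> {
            for (Map.Entry<String, Handler> entry : mappings.entrySet()) {
                if (path.startsWith(entry.getKey())) {
                    return entry.getValue().handle(path);
                }
            }
            return "404 Not Found";
        });
    }

    void shutdown() {
        pool.shutdown();
    }
}

class MiniContainerDemo {
    public static void main(String[] args) throws Exception {
        MiniContainer container = new MiniContainer();
        container.register("/static", path -> "file contents for " + path);
        container.register("/api", path -> "JSON for " + path);

        System.out.println(container.submit("/api/users").get());
        System.out.println(container.submit("/static/logo.png").get());
        System.out.println(container.submit("/unknown").get());
        container.shutdown();
    }
}
```

Note that the handlers never create threads themselves; they simply run on whatever thread the container assigned to the request.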
Traditionally, Java EE applications are deployed via archive files known as WAR files (for Web ARchive) or EAR files (for Enterprise ARchive). The internal file structure of these archives is standardized. The application containers extract the contained files on startup and launch the contained servlet(s). While doing so, the container binds your servlet to the specified port (either specified in code or a configuration file).
Usually when working in the JEE architecture, you will not implement everything from scratch. Many tasks are exactly the same from one JEE application to the next, so it makes a lot of sense to use a suitable framework. The two predominant choices are the actual JEE reference implementation and the Spring framework. I can't say much about "vanilla" JEE, as I have exclusively used the Spring framework so far. We will discuss it in more detail in the next article.
Every incoming request, before it is handed to your application, must pass a series of so-called Servlet Filters, which form a filter chain. Once a request passes the first filter, the second filter kicks in, and so on. Each filter has the option to block a request. Application containers allow you to customize the filter chain via the Servlet API. The JEE framework implementations use the filter chain for many tasks, including session management and security. Filters can also have side effects; if there is a task you want to perform per request, you will often see the implementation in the form of a servlet filter. Also, if you need to bind some information to the request itself, servlet filters are a common place to do so.
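To make the mechanics concrete, here is a dependency-free sketch of a filter chain. It mimics the general shape of servlet filters but does not use the real Servlet API: `Request`, `Response`, `Filter`, `Chain` and `FilterPipeline` are hypothetical types invented for this example, and treating the `Authorization` header value as the user name is a deliberate oversimplification.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Minimal request/response stand-ins (hypothetical, not Servlet API types).
class Request {
    final Map<String, String> headers = new HashMap<>();
    final Map<String, Object> attributes = new HashMap<>();
}

class Response {
    int status = 200;
    String body = "";
}

// The rest of the chain, as seen from inside one filter.
interface Chain {
    void proceed(Request req, Response res);
}

// A filter may block the request by simply not calling chain.proceed().
interface Filter {
    void doFilter(Request req, Response res, Chain chain);
}

class FilterPipeline {
    private final List<Filter> filters = new ArrayList<>();
    private final Chain application;

    FilterPipeline(Chain application) {
        this.application = application;
    }

    FilterPipeline add(Filter filter) {
        filters.add(filter);
        return this;
    }

    Response handle(Request req) {
        Response res = new Response();
        chainFrom(0).proceed(req, res);
        return res;
    }

    // Builds the chain lazily: filter i receives a chain starting at filter i + 1.
    private Chain chainFrom(int index) {
        if (index == filters.size()) {
            return application;
        }
        return (req, res) -> filters.get(index).doFilter(req, res, chainFrom(index + 1));
    }
}

class FilterChainDemo {
    static FilterPipeline build() {
        // Security filter: blocks requests that carry no credentials.
        Filter auth = (req, res, chain) -> {
            if (!req.headers.containsKey("Authorization")) {
                res.status = 401;
                return;
            }
            chain.proceed(req, res);
        };
        // Side-effect filter: binds information to the request itself.
        Filter bindUser = (req, res, chain) -> {
            req.attributes.put("user", req.headers.get("Authorization"));
            chain.proceed(req, res);
        };
        Chain app = (req, res) -> res.body = "Hello, " + req.attributes.get("user");
        return new FilterPipeline(app).add(auth).add(bindUser);
    }

    public static void main(String[] args) {
        Request anonymous = new Request();
        System.out.println(build().handle(anonymous).status); // prints 401

        Request known = new Request();
        known.headers.put("Authorization", "alice");
        System.out.println(build().handle(known).body); // prints Hello, alice
    }
}
```

The order of registration matters: the application code only runs if every filter before it decides to pass the request on.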
The presentation layer is where your actual application code meets an incoming request for the first time. This request has passed the servlet filter chain, so the user session is set up and ready to go, and all authentication has already been taken care of. In the early days of JEE, the presentation layer was the place where server-side generation of HTML pages occurred. Nowadays, the presentation layer consists of a collection of REST controllers whose endpoints make up your REST API. If you are faced with older applications, you will also encounter XML web services in the presentation layer. A common thing to do in the presentation layer is server-side validation of user input and general request validation. In the same way as you should never write SQL queries in your GUI code, the presentation layer must not attempt to access the database directly. A class in the presentation layer is only allowed to talk to another presentation layer class, a service layer class, or an element of the data model which was returned by the service layer.
The service layer is where your actual application code resides. This is the place to put your business rules into code. The service layer is where you move data around in your data model, create new elements, delete old ones, etc. Depending on your use case, the service layer may be as small as "forward this call to the repository layer", or an extremely involved process. Classes from the service layer may only talk to other services, or to classes from the repository layer.
The repository layer is the last layer in your code that modifies your data before it hits the database. The predominant elements in this layer are Repositories (also known as Data Access Objects, or DAOs). These classes simply offer a number of methods that allow you to persist, load, delete and query your data in the database. What is important here is that you must never let any specifics of your data store escape the repository layer - its very purpose is to make sure that you can exchange the data store with a different one (potentially even an SQL database with a NoSQL store!). Internally, your repository methods will contain the actual query statements. If you are working with a standard JEE stack, then you will have a Java Persistence API (JPA) Provider such as Hibernate in place. JPA allows you to convert your domain model to SQL tables and back with relative ease. It still has a lot of pitfalls and would deserve its own article. As you probably already guessed, the repository layer classes do not call any other classes outside of their own layer, except for JPA classes.
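The call rules for the three layers (presentation talks to service, service talks to repository, nothing skips a layer) can be sketched in plain Java as follows. All names here (`Customer`, `CustomerRepository`, `CustomerService`, `CustomerController`) are hypothetical, and the in-memory map only stands in for a real database so that the example has no dependencies; a real repository would issue queries, for instance through JPA.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Domain model element, shared by all three layers.
class Customer {
    final long id;
    final String name;

    Customer(long id, String name) {
        this.id = id;
        this.name = name;
    }
}

// Repository layer: the interface exposes nothing about the data store.
interface CustomerRepository {
    void save(Customer customer);
    Optional<Customer> findById(long id);
}

// Stand-in for the database; could be swapped for an SQL or NoSQL implementation.
class InMemoryCustomerRepository implements CustomerRepository {
    private final Map<Long, Customer> table = new ConcurrentHashMap<>();

    public void save(Customer customer) {
        table.put(customer.id, customer);
    }

    public Optional<Customer> findById(long id) {
        return Optional.ofNullable(table.get(id));
    }
}

// Service layer: business rules live here; it only talks to repositories.
class CustomerService {
    private final CustomerRepository repository;

    CustomerService(CustomerRepository repository) {
        this.repository = repository;
    }

    Customer register(long id, String name) {
        if (name == null || name.trim().isEmpty()) {
            throw new IllegalArgumentException("name must not be blank");
        }
        Customer customer = new Customer(id, name.trim());
        repository.save(customer);
        return customer;
    }

    Optional<Customer> find(long id) {
        return repository.findById(id);
    }
}

// Presentation layer: validates raw input, then delegates to the service.
class CustomerController {
    private final CustomerService service;

    CustomerController(CustomerService service) {
        this.service = service;
    }

    String getCustomer(String rawId) {
        long id;
        try {
            id = Long.parseLong(rawId);
        } catch (NumberFormatException e) {
            return "400 Bad Request";
        }
        return service.find(id)
                .map(customer -> "200 " + customer.name)
                .orElse("404 Not Found");
    }
}
```

Because the controller only knows the service and the service only knows the repository interface, replacing `InMemoryCustomerRepository` with another implementation would not touch the upper two layers.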
The data model represents the data in your domain. It is the only architectural element that will be used by all three layers of your application. It is therefore crucial that the domain model classes have NO references to any other classes, except for classes that reside within the domain model themselves. In contrast to the presentation-, service- and repository-layer classes, the domain model is stateful. Typically, you will not want to have a lot of logic in the domain model; it mostly exists to hold your data and provide a clean API, while the actual complex modifications are done in the service layer. Although JEE does not explicitly require it, the domain model almost always follows the Java Bean pattern. Proper getters and setters are not negotiable here if you want to make use of standard frameworks for easily handling your domain model, such as Bean Validation and JPA (more on that later). A domain model element is your typical POJO - private fields, a constructor, and getters and setters. Usually, frameworks like JPA, Jackson and JAXB will in addition force you to give each class a default constructor, because these classes need to be instantiable via Java reflection. In contrast to almost all other classes in the JEE architecture, having a clean implementation of equals() and hashCode() is crucial for domain model POJOs. Usually, each domain model element has a unique ID for this purpose, which also coincides with its ID in the database tables.
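A domain model element along these lines might look like the following sketch. `Person` and its fields are hypothetical; `equals()` is overridden together with `hashCode()` because the two must always agree, and basing both on the ID alone is one common convention rather than the only valid one.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical domain model POJO following the Java Bean pattern.
class Person {
    private Long id;      // unique ID, matching the ID in the database table
    private String name;

    // Default constructor so that reflection-based frameworks can instantiate the class.
    public Person() {
    }

    public Person(Long id, String name) {
        this.id = id;
        this.name = name;
    }

    public Long getId() {
        return id;
    }

    public void setId(Long id) {
        this.id = id;
    }

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    // Two persons are equal if they carry the same non-null ID.
    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof Person)) {
            return false;
        }
        Person that = (Person) other;
        return id != null && id.equals(that.id);
    }

    // Must agree with equals(): derived from the ID only.
    @Override
    public int hashCode() {
        return Objects.hashCode(id);
    }
}

class PersonDemo {
    public static void main(String[] args) {
        Set<Person> people = new HashSet<>();
        people.add(new Person(42L, "Ada"));
        people.add(new Person(42L, "Ada Lovelace")); // same ID, so same element
        System.out.println(people.size()); // prints 1
    }
}
```

With this convention, two copies of the same database row loaded by different requests compare as equal even if one of them has been modified in memory.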
A request is always bound to a thread in JEE which is instantiated and managed by the Application Container (typically in a Thread Pool). This means that a JEE server application is always inherently concurrent; you cannot avoid that. As we all know, properly dealing with concurrency is hard. Thankfully, the JEE architecture has you covered when it comes to concurrency. If you look at the picture above, you see four users working with the application in parallel, each being represented by a request/response bound to a thread. There is one particular detail worth noting: the threads never intersect. The application performs no synchronization, and instead leaves it to a component which is really good at doing so: the database.
How is this possible? How can we have all these layers above the database without having to consider multi-threading? Recall when concurrency becomes an issue: when several threads access the same data. You want to avoid this case at all costs in a JEE application (there are exceptions, such as application-level caches). In order to do so, all classes that belong to the repository layer and the service layer are stateless in JEE. They have no fields, neither private nor public ones, which hold mutable state. So what about the data? The data is loaded per user and per request. When a request arrives at the service layer (the presentation layer is a bit of an exception here) then a new database transaction is opened for exclusive use by this user. The services then gather the requested data and/or perform the requested modifications, all inside this single transaction. Before the result is passed to the presentation layer, the transaction is committed and closed.
This architecture has two big advantages:
- The server is stateless, which is a nice property to have, e.g. for testing. It helps to keep business logic very simple and works well with a more functional programming style.
- The only place where concurrent modifications ever meet is at the database, and databases are specifically engineered to handle that.
The cost is of course that each thread builds its own (partial) view of the data model. So if two users request the same piece of data, it will be held in memory twice.
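The stateless, transaction-per-request pattern described above can be illustrated with a small dependency-free sketch. `ToyDatabase`, `CounterService` and `StatelessDemo` are hypothetical names, and the "database" here simply serializes all transactions with a lock, which is a heavy simplification of what a real database does. The point is only that the service itself contains no synchronization and no mutable fields.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Function;

// Toy database: the ONLY component in this sketch that synchronizes.
class ToyDatabase {
    private final Map<String, Integer> rows = new HashMap<>();

    // Runs the work inside one "transaction" on a private copy of the data.
    // If the work throws, nothing is committed.
    synchronized <T> T inTransaction(Function<Map<String, Integer>, T> work) {
        Map<String, Integer> workingCopy = new HashMap<>(rows);
        T result = work.apply(workingCopy);
        rows.clear();
        rows.putAll(workingCopy); // commit
        return result;
    }
}

// Stateless service: no mutable fields, so all request threads can share one instance.
class CounterService {
    private final ToyDatabase db;

    CounterService(ToyDatabase db) {
        this.db = db;
    }

    int increment(String key) {
        return db.inTransaction(rows -> rows.merge(key, 1, Integer::sum));
    }

    int read(String key) {
        return db.inTransaction(rows -> rows.getOrDefault(key, 0));
    }
}

class StatelessDemo {
    // Simulates several users, each sending a number of requests in parallel.
    static int run(int users, int requestsPerUser) throws InterruptedException {
        CounterService service = new CounterService(new ToyDatabase());
        ExecutorService pool = Executors.newFixedThreadPool(users);
        for (int u = 0; u < users; u++) {
            pool.submit(() -> {
                for (int i = 0; i < requestsPerUser; i++) {
                    service.increment("visits");
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        return service.read("visits");
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(run(4, 1000)); // prints 4000
    }
}
```

Each transaction works on its own copy of the data, which also shows the cost mentioned above: the same rows exist in memory once per concurrent request.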
There would be so much more to say about JEE. I often feel that it gets a lot of undeserved criticism simply because it is misunderstood. It plays really well with modern programming styles and languages and it helps to build very stable applications. In a way, JEE is not so much about what it provides to you as a programmer, but rather what it protects you from (concurrency issues, data integrity issues, ...). The JEE architecture is a prime example of defensive programming in this regard - it is all about safety first. This architecture has a proven track record of being well-suited for large projects and teams.
In the next article, we will take a closer look at the actual implementation of this architecture using a concrete example - it will take a lot less code on our part to make all of this happen than you think.