Architecture Pitfalls: Don't let your persistence layer bleed into your presentation layer

By Marc Savy, founder of Black Parrot Labs and co-founder of Apiman.

Often, when developers start building a new Java web application, they begin by defining the set of JPA entities that form their persistence layer (e.g. using an ORM like Hibernate). These entities are then returned in response to various queries and passed through the service layer until they ultimately reach the presentation layer. When you add something new to your JPA model, it automatically appears in the results returned to the client; this seems an appealing feature to many, at first blush.

The belief that this is a good idea is reinforced by many frameworks' examples, which use this problematic pattern out of the box because doing it correctly has historically involved lots of boilerplate-y glue code that would clutter the samples.

However, what appeared to be a good way to get your new project rolling rapidly soon becomes a morass, for several reasons:

Tight Coupling

By letting your data layer spill out to your clients, you are implicitly binding together your data model and your public API. They are now the same thing, and any time you need to change your data model, you are changing your public API — sometimes radically.

Ultimately, the specifics of the persistence layer implementation should be a detail that your consumers don't know about.

API Brittleness

You severely limit your ability to refactor your data model, as any change will automatically be reflected in the payloads returned to your users. If you want to change a datatype, or find that your existing model is inefficient, you can't change it without rewriting your clients to match.

This is API brittleness, and it imposes significant work on downstream teams, who have to continually refactor their applications in response to data model changes that they should not have to care about.

It is critical to the long-term maintainability of any software that the persistence layer can be changed without rewriting the clients that depend on it.
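To make the brittleness concrete, here is a minimal sketch in plain Java. It assumes a serializer that exposes every instance field of an entity; `payloadKeys` is a hypothetical stand-in for that behaviour, and the two `CustomerEntity` versions are invented for illustration. Splitting one column into two changes the payload keys without anyone having touched the API code.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical demo: the payload shape is derived directly from the entity class.
class PayloadShapeDemo {
    public static void main(String[] args) {
        System.out.println("v1 payload keys: " + payloadKeys(CustomerEntityV1.class));
        System.out.println("v2 payload keys: " + payloadKeys(CustomerEntityV2.class));
    }

    // Stand-in for a serializer that exposes every instance field of the entity.
    static Set<String> payloadKeys(Class<?> entityType) {
        Set<String> keys = new TreeSet<>();
        for (Field field : entityType.getDeclaredFields()) {
            if (!Modifier.isStatic(field.getModifiers()) && !field.isSynthetic()) {
                keys.add(field.getName());
            }
        }
        return keys;
    }
}

// Entity before the refactoring (illustrative only, no JPA annotations).
class CustomerEntityV1 {
    Long id;
    String fullName;
}

// Entity after splitting the name into two columns.
class CustomerEntityV2 {
    Long id;
    String firstName;
    String lastName;
}
```

A client that reads `fullName` works against the first version and silently breaks against the second, even though the change was purely a persistence concern.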

Data Bloat

Figure: We only want the highlighted part exposed for certain calls, but we may end up pulling in a massive number of related entities unintentionally.

Often, JPA data models contain a significant number of entity relationships, allowing you to access associated data conveniently.

By passing relationship-containing entities to your presentation layer, you are typically forcing the entire collection to be fetched so that it can be serialized into the response, irrespective of whether it is appropriate or necessary. I have seen cases where several megabytes of data were inadvertently returned in relationships that were not relevant to the specific response.

There are some workarounds that can keep this anti-pattern limping along for a bit longer, typically involving labelling certain fields with @JsonIgnore or nulling out the field before returning it. A number of other hacks exist which are variations on this theme, but whilst they solve the immediate data bloat issue, they are blunt instruments that affect calls elsewhere in your API which may not want to ignore the relationship — but will now have those objects 'disappear'.

Like beheading the proverbial hydra, solving one issue with a hack typically causes two more problems to pop up in its place.
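The data bloat problem can be simulated without any real ORM or JSON library. The sketch below is a toy: `CustomerEntity`, its `Supplier`-backed lazy collection, `NaiveSerializer` and `CustomerSummary` are all hypothetical names, and the loader merely stands in for the query an ORM might run on first access. It shows that serializing the entity touches the relationship, whereas mapping to a narrow view does not.

```java
import java.util.List;
import java.util.function.Supplier;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class DataBloatDemo {
    public static void main(String[] args) {
        CustomerEntity entity = new CustomerEntity(1L, "Ada", DataBloatDemo::loadOrders);

        // Mapping to a narrow view never touches the relationship.
        CustomerSummary summary = toSummary(entity);
        System.out.println(summary + " | orders loaded: " + entity.ordersLoaded());

        // Serializing the entity itself walks every getter, including getOrders().
        String json = NaiveSerializer.toJson(entity);
        System.out.println(json.length() + " chars | orders loaded: " + entity.ordersLoaded());
    }

    // Stand-in for the query an ORM would run when a lazy collection is first accessed.
    static List<String> loadOrders() {
        return IntStream.rangeClosed(1, 1000)
                .mapToObj(i -> "order-" + i)
                .collect(Collectors.toList());
    }

    static CustomerSummary toSummary(CustomerEntity entity) {
        return new CustomerSummary(entity.getId(), entity.getName());
    }
}

// Toy entity with a lazily loaded relationship (no real JPA involved).
class CustomerEntity {
    private final long id;
    private final String name;
    private final Supplier<List<String>> orderLoader;
    private List<String> orders; // stays null until first access

    CustomerEntity(long id, String name, Supplier<List<String>> orderLoader) {
        this.id = id;
        this.name = name;
        this.orderLoader = orderLoader;
    }

    long getId() { return id; }

    String getName() { return name; }

    List<String> getOrders() {
        if (orders == null) {
            orders = orderLoader.get();
        }
        return orders;
    }

    boolean ordersLoaded() { return orders != null; }
}

// The narrow view we actually want to expose for this call.
record CustomerSummary(long id, String name) {}

// Toy serializer that writes out everything reachable from the entity.
class NaiveSerializer {
    static String toJson(CustomerEntity c) {
        String orders = c.getOrders().stream()
                .map(o -> "\"" + o + "\"")
                .collect(Collectors.joining(","));
        return "{\"id\":" + c.getId() + ",\"name\":\"" + c.getName()
                + "\",\"orders\":[" + orders + "]}";
    }
}
```

Unlike an annotation on the entity, the narrow view only affects the call that uses it; another endpoint that genuinely needs the orders can map to a different view.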

Cycles

In data models it is typically permitted to have cycles to model bidirectional relationships. However, many serialization frameworks really do not appreciate this, and you may end up with infinite loops as your JSON serializer chases its own tail until your program crashes.

You can mitigate this by using @JsonIgnore, or @JsonManagedReference / @JsonBackReference, and similar serialization tricks. Indeed, some languages and frameworks support replacing "seen-before" objects with references. But when this happens, it is a glaring sign that your abstractions are leaky.
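As a rough illustration of the cycle problem, the following plain-Java sketch uses a hypothetical `Author`/`Book` pair with a back-reference and a toy `GraphWalker` in place of a real JSON library. The depth guard is an assumption added purely to make the failure deterministic; a truly naive walker would recurse until the stack overflows. An acyclic view (`AuthorView`, also hypothetical) sidesteps the issue without any serializer-specific annotations.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

class CycleDemo {
    public static void main(String[] args) {
        Author author = new Author("Ada");
        author.addBook(new Book("Notes"));

        try {
            GraphWalker.toJson(author, 0);
        } catch (IllegalStateException e) {
            System.out.println("entity graph: " + e.getMessage());
        }

        System.out.println("view: " + AuthorView.from(author).toJson());
    }
}

// Toy entities with a bidirectional relationship (no JPA annotations).
class Author {
    final String name;
    final List<Book> books = new ArrayList<>();

    Author(String name) { this.name = name; }

    void addBook(Book book) {
        books.add(book);
        book.author = this; // back-reference closes the cycle
    }
}

class Book {
    final String title;
    Author author;

    Book(String title) { this.title = title; }
}

// Toy serializer that follows every reference it finds.
class GraphWalker {
    static final int MAX_DEPTH = 50;

    static String toJson(Author a, int depth) {
        guard(depth);
        String books = a.books.stream()
                .map(b -> toJson(b, depth + 1))
                .collect(Collectors.joining(","));
        return "{\"name\":\"" + a.name + "\",\"books\":[" + books + "]}";
    }

    static String toJson(Book b, int depth) {
        guard(depth);
        return "{\"title\":\"" + b.title + "\",\"author\":" + toJson(b.author, depth + 1) + "}";
    }

    // Without this guard the two methods would call each other until the stack overflows.
    private static void guard(int depth) {
        if (depth > MAX_DEPTH) {
            throw new IllegalStateException("cycle suspected: nesting deeper than " + MAX_DEPTH);
        }
    }
}

// Acyclic view: books are flattened to titles, so there is nothing to chase.
record AuthorView(String name, List<String> bookTitles) {
    static AuthorView from(Author a) {
        return new AuthorView(a.name,
                a.books.stream().map(b -> b.title).collect(Collectors.toList()));
    }

    String toJson() {
        String titles = bookTitles.stream()
                .map(t -> "\"" + t + "\"")
                .collect(Collectors.joining(","));
        return "{\"name\":\"" + name + "\",\"bookTitles\":[" + titles + "]}";
    }
}
```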

Refactoring Anxiety

If making changes to the data model causes instability elsewhere in the application, your engineering team will go to extreme lengths to avoid refactoring — for fear of causing an unanticipated incident.

I call this refactoring anxiety. It causes development to slow down and technical debt to pile up as your engineering team avoids change at all costs.

Resistance to refactoring is antithetical to modern agile software design practices that encourage frequent, smaller refactoring, rather than infrequent 'big bang' changes that are more likely to fail.

Unexpected Data Modification

Transactional boundaries become unclear, and actions automatically triggered when the presentation layer serializes your entities might require a transaction (or even start one). Mega-sized transactions are also bad for performance, and you are often forced to have a transaction spanning all the way from the presentation layer, even if you don't really want to start it there.

Furthermore, because you are operating directly on data model entities in many different contexts (e.g. business logic layer, presentation layer), it's very easy to accidentally trigger an action that your ORM will end up persisting. This is particularly true when you have cascade relationships. Whoops.
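The accidental-persist scenario can be imitated with a toy persistence context. Everything here is hypothetical (`ToyPersistenceContext`, `UserEntity`, `UserView`, `mask`), and real ORMs are far more sophisticated, but the shape of the bug is similar: presentation code tweaks a managed entity for display purposes, and the next flush writes that tweak back.

```java
import java.util.HashMap;
import java.util.Map;

class DirtyCheckDemo {
    public static void main(String[] args) {
        ToyPersistenceContext ctx = new ToyPersistenceContext();
        ctx.database.put(1L, "ada@example.com");

        // Safe: mask the email on an independent view object.
        UserView view = new UserView(1L, mask(ctx.find(1L).email));
        ctx.flush();
        System.out.println("shown: " + view.email() + " | stored: " + ctx.database.get(1L));

        // Unsafe: mask the email directly on the managed entity.
        UserEntity entity = ctx.find(1L);
        entity.email = mask(entity.email);
        ctx.flush();
        System.out.println("shown: " + entity.email + " | stored: " + ctx.database.get(1L));
    }

    // Display-only formatting, e.g. "ada@example.com" becomes "a***@example.com".
    static String mask(String email) {
        return email.charAt(0) + "***" + email.substring(email.indexOf('@'));
    }
}

// Toy managed entity (illustrative only).
class UserEntity {
    final long id;
    String email;

    UserEntity(long id, String email) {
        this.id = id;
        this.email = email;
    }
}

// Independent view handed to the presentation layer.
record UserView(long id, String email) {}

// Very loosely imitates an ORM session: tracks loaded entities and flushes changes.
class ToyPersistenceContext {
    final Map<Long, String> database = new HashMap<>(); // id -> email "table"
    private final Map<Long, UserEntity> managed = new HashMap<>();

    UserEntity find(long id) {
        return managed.computeIfAbsent(id, key -> new UserEntity(key, database.get(key)));
    }

    // Any managed entity whose state differs from its row is written back.
    void flush() {
        for (UserEntity e : managed.values()) {
            if (!e.email.equals(database.get(e.id))) {
                database.put(e.id, e.email);
            }
        }
    }
}
```

In the first case the stored address is untouched; in the second the masked value ends up in the "database", which is the kind of surprise that is very hard to trace back to a formatting helper.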

How to avoid this?

The moral of the story is: don't mess around with persistence entities unless you really mean to interact with your persistence layer in some manner.

I'll write another blog post (and link to it here in future) with some advice and patterns covering the variety of approaches for solving this. As always, there are several different ways, and it depends a bit upon which architecture you are using; there are lots of different vocabularies, and people often don't really agree on what the terms mean. But I'd like to provide an abridged version here to get people started:

Consider creating a representation of the view that consumers should see which is independent of your persistence model, i.e. a set of classes that precisely and stably represents what you want consumers of your persistence layer to consume.

Depending on your architecture, these classes will have different names, but the concept is similar. Some people reflexively shout "ANTI-PATTERN" when certain terminology is mentioned, but inherently we're talking about abstracting the design of your persistence layer from presentation (and other layers, in most architectures suited for larger applications). Consider it like an API contract that you really want to avoid changing.

Reduce boilerplate and hand-coding of 'dumb glue code' by using mappers such as MapStruct or ModelMapper (my favourite is MapStruct as it uses code generation rather than reflection). In many cases, you just want to map stuff across in fairly simple patterns. These mapper frameworks allow you to do that.

I suggest doing this from the start, as it's often difficult to retrofit due to consumers inadvertently depending on behaviour you never intended (e.g. additional data sneaking into responses to certain requests).
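The independent representation and the glue code described above might look roughly like this in plain Java. All names (`CustomerEntity`, `CustomerResponse`, `CustomerResponseMapper`) are hypothetical, and the mapper is written by hand here only to keep the sketch dependency-free.

```java
class ContractDemo {
    public static void main(String[] args) {
        CustomerEntity entity = new CustomerEntity(42L, "Ada", "Lovelace", "VIP, call before 9am");
        CustomerResponse response = CustomerResponseMapper.toResponse(entity);
        System.out.println(response);
    }
}

// Persistence-side class; in a real application this would be the JPA entity.
class CustomerEntity {
    final long id;
    final String firstName;
    final String lastName;
    final String internalNotes; // never meant to leave the service

    CustomerEntity(long id, String firstName, String lastName, String internalNotes) {
        this.id = id;
        this.firstName = firstName;
        this.lastName = lastName;
        this.internalNotes = internalNotes;
    }
}

// The stable, public view: only what consumers are meant to see.
record CustomerResponse(long id, String displayName) {}

// The single place that knows about both shapes.
final class CustomerResponseMapper {
    private CustomerResponseMapper() {}

    static CustomerResponse toResponse(CustomerEntity c) {
        return new CustomerResponse(c.id, c.firstName + " " + c.lastName);
    }
}
```

If the entity is later refactored, only `toResponse` needs to change and `CustomerResponse` stays as it is. With a code-generating mapper such as MapStruct, a mapping like this would typically be declared rather than hand-written.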

Book Recommendations

A few book recommendations, as people have been asking.

High-Performance Java Persistence, Vlad Mihalcea

Vlad is very well known in the Java and JPA community. His book is a great guide to all things JPA and Hibernate, with many patterns that are applicable across different ORM stacks. If you are having performance problems with your Hibernate/JPA application, High-Performance Java Persistence is very helpful.

HPJP is fairly broad in its purview, covering the fundamentals of relational databases, and how to design your application to be sympathetic to the underlying technologies. It's also a good reference when kicking off new projects, helping instil best practices in your engineering and architecture teams. It's much easier to establish beneficial design patterns early on, rather than piling up technical debt that's difficult to undo later.

SQL Antipatterns: Avoiding the Pitfalls of Database Programming, Bill Karwin

Often you can learn a lot from seeing what not to do; that's the essence of this post, in many ways! Bill Karwin covers a wide range of common SQL anti-patterns that he's seen over the years, and provides alternative solutions that are much friendlier to your RDBMS. It's another great book to have when kicking off projects and as a general reference.

