The best way to hide the JPA entity identifier

source link: https://vladmihalcea.com/hide-jpa-entity-identifier/
Last modified: Aug 22, 2023

Introduction

In this article, I’m going to show you the best way to hide the JPA entity identifier so that the users of your application won’t be able to guess and access data that belongs to other users.

This has been a recurring question that I’ve been getting when running training or workshops, so I decided it’s a good idea to formalize it in an article.

Domain Model

Let’s assume we are using the following Post and PostComment entities:

[Figure: Masquerade Id Entities]

Notice that both the Post and PostComment use numerical identifiers, and, most often, this is a great choice since auto-incremented identifiers work very well with B+Tree indexes.

If you’re using MySQL, MariaDB, or SQL Server, where the database table is organized as a Clustered Index, it’s much more efficient to use an auto-incremented numerical identifier for the Primary Key instead of a random UUID one.

Why hide the JPA entity identifier

Numeric identifiers have a lot of advantages:

  • They can be as compact as possible since we can choose between 1, 2, 4, or 8 bytes (tinyint, smallint, int, bigint).
  • Being monotonically increasing, they are very suitable for B+Tree indexes that you will create for your Primary and Foreign Key columns.

However, numeric record identifiers also have one big disadvantage. They are very easy to guess, and this may allow clients to fabricate HTTP requests with identifier values that expose data they are not supposed to access.

So, for this single reason, it’s quite common for developers to choose UUIDs instead of numeric identifiers. However, as I explained in this article, UUIDs are not a great choice from a performance perspective because:

  • They are huge (128 bits), and this puts pressure on the number of table records and index entries you can cache in the Buffer Pool.
  • The v4 UUID values are random, which affects the B+Tree page fill factor and the number of balancing operations that will be triggered due to the randomness of new entries.
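
To make the size difference concrete, here is a minimal sketch that compares the storage footprint of a random UUID with that of a bigint. The IdentifierSizeSketch class and its toBytes helper are hypothetical names used only for this illustration.

```java
import java.nio.ByteBuffer;
import java.util.UUID;

// Hypothetical illustration class, not part of the article's code base.
final class IdentifierSizeSketch {

    // Serializes a UUID into its raw 128-bit (16-byte) binary form.
    static byte[] toBytes(UUID uuid) {
        return ByteBuffer.allocate(2 * Long.BYTES)
            .putLong(uuid.getMostSignificantBits())
            .putLong(uuid.getLeastSignificantBits())
            .array();
    }

    public static void main(String[] args) {
        UUID uuid = UUID.randomUUID(); // a random (version 4) UUID
        System.out.println("UUID version: " + uuid.version());
        System.out.println("UUID bytes:   " + toBytes(uuid).length);
        System.out.println("bigint bytes: " + Long.BYTES);
    }
}
```

Even in its compact binary form, the UUID takes twice the space of a bigint, and that cost is paid in the Primary Key and in every Foreign Key column that references it.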

So, instead of choosing a UUID and suffering from performance issues, there is a much better way.

Masquerading the JPA entity identifier

Since the numerical identifiers are fine from the database storage perspective, we don’t really need to switch to random column values just to make it harder for people to guess the record identifiers.

Instead, we can simply encrypt the row identifiers when we send them to the client and decrypt them back when the client sends back the values in a subsequent request.
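
As a rough idea of what such an encrypt and decrypt pair could look like, here is a minimal sketch built on the JDK's javax.crypto API. The IdCipher class, the hard-coded demo key, the choice of the AES/ECB/PKCS5Padding transformation, and the URL-safe Base64 encoding are all assumptions made for this illustration, and they are not necessarily what the CryptoUtils class used later in this article does.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.util.Base64;
import javax.crypto.Cipher;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical helper for illustration only.
final class IdCipher {

    // Assumption: a hard-coded 16-byte demo key. A real application
    // would load the key from secure configuration instead.
    private static final SecretKeySpec KEY = new SecretKeySpec(
        "0123456789abcdef".getBytes(StandardCharsets.US_ASCII), "AES");

    // Assumption: a deterministic single-block transformation, so that
    // a given id always maps to the same token.
    private static final String TRANSFORMATION = "AES/ECB/PKCS5Padding";

    private IdCipher() {
    }

    // Encrypts the numeric id and encodes the result as URL-safe text.
    static String encrypt(long id) {
        try {
            Cipher cipher = Cipher.getInstance(TRANSFORMATION);
            cipher.init(Cipher.ENCRYPT_MODE, KEY);
            byte[] plain = ByteBuffer.allocate(Long.BYTES).putLong(id).array();
            return Base64.getUrlEncoder().withoutPadding()
                .encodeToString(cipher.doFinal(plain));
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("Cannot encrypt identifier", e);
        }
    }

    // Decodes and decrypts a token back to the numeric id.
    static long decrypt(String token) {
        try {
            Cipher cipher = Cipher.getInstance(TRANSFORMATION);
            cipher.init(Cipher.DECRYPT_MODE, KEY);
            byte[] plain = cipher.doFinal(Base64.getUrlDecoder().decode(token));
            return ByteBuffer.wrap(plain).getLong();
        } catch (GeneralSecurityException e) {
            throw new IllegalArgumentException("Invalid identifier token", e);
        }
    }

    public static void main(String[] args) {
        String token = encrypt(50L);
        System.out.println(token);          // opaque text token
        System.out.println(decrypt(token)); // 50
    }
}
```

The transformation is deliberately deterministic in this sketch, since a client has to be able to send back the same token it received and have it resolve to the same row.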

For instance, let’s say that we want to extract the latest Post records that have been created, and since we might have a lot of entries, we use Keyset Pagination, as I explained in this article.
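
For readers who are not familiar with the technique, the following in-memory sketch shows the idea behind the Top-N and Next-N steps of Keyset Pagination. The KeysetSketch class and its method names are hypothetical, and a plain list of identifiers stands in for the database table.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.LongStream;

// Hypothetical in-memory illustration of keyset (seek) pagination.
final class KeysetSketch {

    // Top-N: the first page, taken from ids sorted in descending order.
    static List<Long> topN(List<Long> idsDescending, int pageSize) {
        return idsDescending.stream()
            .limit(pageSize)
            .collect(Collectors.toList());
    }

    // Next-N: continue right after the last id of the previous page.
    // The SQL counterpart would filter with: id < :lastSeenId
    static List<Long> nextN(
            List<Long> idsDescending, long lastSeenId, int pageSize) {
        return idsDescending.stream()
            .filter(id -> id < lastSeenId)
            .limit(pageSize)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Ids 50, 49, ..., 1, mimicking the latest posts first.
        List<Long> ids = LongStream.rangeClosed(1, 50)
            .map(i -> 51 - i)
            .boxed()
            .collect(Collectors.toList());

        List<Long> first = topN(ids, 3);
        List<Long> second = nextN(ids, first.get(first.size() - 1), 3);

        System.out.println(first);  // [50, 49, 48]
        System.out.println(second); // [47, 46, 45]
    }
}
```

Because the next page is located by the last seen key rather than by an offset, the identifier effectively travels between the client and the server, which is why hiding it matters here.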

Since we are using Keyset Pagination, we are implementing the Top-N and Next-N data access methods in the following custom Repository:

    public class CustomPostRepositoryImpl implements CustomPostRepository {

        private final EntityManager entityManager;
        private final CriteriaBuilderFactory criteriaBuilderFactory;

        public CustomPostRepositoryImpl(
                EntityManager entityManager,
                CriteriaBuilderFactory criteriaBuilderFactory) {
            this.entityManager = entityManager;
            this.criteriaBuilderFactory = criteriaBuilderFactory;
        }

        @Override
        public PagedList<PostDTO> findTopN(Sort sortBy, int pageSize) {
            return sortedCriteriaBuilder(sortBy)
                .page(0, pageSize)
                .withKeysetExtraction(true)
                .getResultList();
        }

        @Override
        public PagedList<PostDTO> findNextN(
                Sort sortBy,
                PagedList<PostDTO> previousPage) {
            return sortedCriteriaBuilder(sortBy)
                .page(
                    previousPage.getKeysetPage(),
                    previousPage.getPage() * previousPage.getMaxResults(),
                    previousPage.getMaxResults()
                )
                .getResultList();
        }

        private CriteriaBuilder<PostDTO> sortedCriteriaBuilder(
                Sort sortBy) {
            CriteriaBuilder<Post> criteriaBuilder = criteriaBuilderFactory
                .create(entityManager, Post.class)
                .from(Post.class, "p");

            sortBy.forEach(order -> {
                criteriaBuilder.orderBy(
                    order.getProperty(),
                    order.isAscending()
                );
            });

            return criteriaBuilder.selectNew(PostDTO.class)
                .with("p.id")
                .with("p.title")
                .end();
        }
    }

Notice that both the Top-N and Next-N methods don’t fetch the Post entity. Instead, they return a paginated list of PostDTO instances.

The PostDTO class looks as follows:

    public class PostDTO {

        private final String id;
        private final String title;

        public PostDTO(Long id, String title) {
            this.id = CryptoUtils.encrypt(id);
            this.title = title;
        }

        public String getId() {
            return id;
        }

        public String getTitle() {
            return title;
        }
    }

Notice that the type of the id is String and that it will store the encrypted value of the actual Post identifier. By encrypting the entity identifier, the client will no longer be able to guess its value or the value of any other identifier of the Post table records.

The CryptoUtils class defines the encrypt and decrypt methods, and if you’re curious about it, you can take a look at it on GitHub.

Now, when fetching the first page of PostDTO entries, we can see that while the identifier values are encrypted for the external users, we are still able to decrypt their values when needed:

    PagedList<PostDTO> topPage = forumService.firstLatestPosts(PAGE_SIZE);

    List<String> topIds = topPage.stream()
        .map(PostDTO::getId)
        .toList();

    assertEquals(
        "3qEiB21WnB/yQ4muQe6cpw==",
        topIds.get(0)
    );
    assertEquals(
        Long.valueOf(50),
        CryptoUtils.decrypt(topIds.get(0), Long.class)
    );

    assertEquals(
        "9jfsI1A92KIzd34ZfRxgtQ==",
        topIds.get(1)
    );
    assertEquals(
        Long.valueOf(49),
        CryptoUtils.decrypt(topIds.get(1), Long.class)
    );

If the clients want to access the comments of a given post entry, they can call the following service method:

    public List<PostCommentDTO> findCommentsByPost(String postId) {
        return postRepository.findCommentsByPost(
            CryptoUtils.decrypt(postId, Long.class)
        );
    }

Notice that we are decrypting the Post identifier prior to calling the PostRepository method that extracts the list of PostCommentDTO entries.

The PostCommentDTO can also masquerade the actual PostComment identifier:

    public class PostCommentDTO {

        private final String id;
        private final String review;

        public PostCommentDTO(Long id, String review) {
            this.id = CryptoUtils.encrypt(id);
            this.review = review;
        }

        public String getId() {
            return id;
        }

        public String getReview() {
            return review;
        }
    }

And you can see that comment identifiers are encrypted as well prior to sending them back to the client:

    List<PostCommentDTO> comments = forumService.findCommentsByPost(
        firstPost.getId()
    );

    assertEquals(
        10,
        comments.size()
    );

    assertEquals(
        "ltAKs4jLw8N7q7SHeUR2Kw==",
        comments.get(0).getId()
    );
    assertEquals(
        Long.valueOf(1),
        CryptoUtils.decrypt(comments.get(0).getId(), Long.class)
    );

That’s it!

Conclusion

Encrypting and decrypting the entity identifier is a very straightforward technique that allows us to hide the underlying numerical value.

By using this strategy, we can enjoy the advantages of using numerical identifiers on the database side while also making sure that users cannot guess the identifiers of records they are not supposed to access.
