API with NestJS #14. Improving performance of our Postgres database with indexes

October 19, 2020

This entry is part 14 of 21 in the API with NestJS

As our system grows, certain queries on our database might fail us in terms of performance. One of the popular ways of dealing with this issue are indexes. This article explores how we can use them both through TypeORM and writing our own Postgres queries.

Introduction to indexes

When we store information on a disk, we do so with blocks of data. When searching through it, we need to scan the entirety of it to find matching entries. Iterating over it from cover to cover does not seem like the most performant approach.

In the second part of this series, we’ve created a table of posts.

One of the most common queries that we might want to run here is to find posts of a

SELECT * FROM post WHERE "authorId" = 1;

Unfortunately, this means scanning the entire post table to find matching entries. As our table grows, this is going to take more and more time. We can improve this with the help of indexes.

The job of indexes is to make our queries faster. It requires quite a bit of disk space by holding a copy of the indexed field values and pointing to the record they relate to.

CREATE INDEX post_authorId_index ON post ("authorId");

Postgres folds column names that we don’t put in double quotes to lower case. This is why we need to write "authorId" above

We can imagine them as key and value pairs. In our case, the keys would be ids of the authors, and the values would be pointers to the posts. This way, Postgres has a lot easier time finding all of the posts of a certain author.

This information is stored in a separate data structure. Whenever we query the data, Postgres can use it under the hood to increase the speed.

Unfortunately, it takes a noticeable amount of space, and Postgres needs to keep it synchronized. Every time we insert or update the data, Postgres needs to update the indexes too. When thinking about adding indexes, we need to consider the pros and cons.

Indexes could benefit our update queries if they have some search conditions, though.

Types of scans

Please note that the above select query that we perform needs to extract the data because we want to access the posts’ contents. Postgres has a concept of index-only scans when the index contains all information required by a query. For example, when we count the number of posts, we might experience an even greater improvement in speed because Postgres does not need to read our table’s contents.

The type of scan is chosen under the hood by Postgres. We can inspect it using the EXPLAIN command.

EXPLAIN SELECT * FROM post WHERE authorId = 1;

We can expect one of a few different scans to be applied:

sequential scan
- sequentially scanning all items of a table
index scan
- uses indexes to increase the performance of the scan. Accesses the data from the index and uses it to fetch the data from the actual table
index-only scan
- also uses indexes but only scans the index data structure
bitmap scan
- a process between an index scan and sequential scan

For a more detailed comparison of various scan methods, check out this article.

Implementing Indexes with TypeORM

So far, in this series, we’ve been using TypeORM. We can use it to generate indexes for certain columns using the @Index() decorator.

import { Entity, ManyToOne, PrimaryGeneratedColumn, Index } from 'typeorm';

import User from '../users/user.entity';

@Entity()

class Post {

@PrimaryGeneratedColumn()

public id: number;

// ...

@Index()

@ManyToOne(() => User, (author: User) => author.posts)

public author: User;

export default Post;

After firing up pgAdmin, we can see that TypeORM generated a name for our index.

We can avoid the above behavior by providing a name when using the @Index() decorator.

import { Entity, ManyToOne, PrimaryGeneratedColumn, Index } from 'typeorm';

import User from '../users/user.entity';

@Entity()

class Post {

@PrimaryGeneratedColumn()

public id: number;

// ...

@Index('post_authorId_index')

@ManyToOne(() => User, (author: User) => author.posts)

public author: User;

export default Post;

Multicolumn indexes

We might sometimes find ourselves making queries with multiple conditions, such as:

SELECT * FROM post WHERE "authorId" = 1 AND "categoryId" = 2

The performance of the above might be improved by creating an index that uses two columns.

CREATE INDEX post_authorId_columnId_index ON post ("authorId", "columnId");

TypeORM also supports indexes with multiple columns. To specify it, we need to use the @Index() decorator on the entity.

@Entity()

@Index(['postId', 'authorId'])

class Post {

// ...

Keep in mind that Postgres states in its documentation that multicolumn indexes should be used sparingly. Usually, an index on a single column is enough, and using more than three columns probably won’t be helpful.

Index types

Postgres has a few index types available under the hood. By default, it uses B-tree indexes that fit most cases. We also have a few other options:

Generalized Inverted Indexes (GIN)
- designed to handle cases where the values contain more than one key – for example, arrays
Hash indexes
- can only handle simple equality checks
Block Range Indexes (BRIN)
- used for large tables with columns that have a linear sort order
Generalized Search Try (GIST)
- useful for indexing geometric data and text search

Unfortunately, TypeORM does not support creating indexes with custom types. If we’d need one of the above types, we would have to write the query ourselves. For example, we could write a migration with it.

Summary

In this article, we’ve looked into the basics of creating indexes in the Postgres database. We’ve also briefly touched on the subject of various index types. To better understand how our database works, we also used the EXPLAIN command to see how effective our indexes are. Since indexes can substantially improve our application’s performance if used currently, they are definitely worth checking out.

Series Navigation<< API with NestJS #13. Implementing refresh tokens using JWT API with NestJS #15. Defining transactions with PostgreSQL and TypeORM >>

Introduction to indexes

Types of scans

Implementing Indexes with TypeORM

Multicolumn indexes

Index types

Summary

Recommend

API with NestJS #15. Defining transactions with PostgreSQL and TypeORM

API with NestJS #16. Using the array data type with PostgreSQL and TypeORM

API with NestJS #17. Offset and keyset pagination with PostgreSQL and TypeORM

API with NestJS #18. Exploring the idea of microservices

API with NestJS #19. Using RabbitMQ to communicate with microservices

API with NestJS #20. Communicating with microservices using the gRPC framework

API with NestJS #21. An introduction to CQRS

Apple II Library: The 4am Collection

Apple II Compilation #011 (19xx)(-)

Ted Nelson's Junk Mail Cartons

About Joyk