9

SQL vs NoSQL Database Comparison | Payload CMS | Blog

 1 year ago
source link: https://payloadcms.com/blog/sql-vs-nosql-cutting-through-the-tech-twitter-noise
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.
neoserver,ios ssh client

SQL vs. NoSQL - cutting through the Tech Twitter noise

James Mikrut

May 11, 2023
SQL vs NoSQL

We've done a lot of thinking around database support lately, and even hosted a community discussion on Discord. Right now we're MongoDB-only, but our most upvoted roadmap item is adding support for additional databases. And transitioning from NoSQL thinking to an SQL mindset has really spotlighted the concrete differences.

If you've ever wondered exactly what NoSQL is good for, and alternatively, why SQL might be a better fit for certain applications, and you're tired of Tech Twitter telling you what to think, then read on for a real-world comparison.

In this post, I'm going to cover the big challenges that we'll face when adapting our features and functionalities over from NoSQL to SQL, and along the way, I'll highlight the strengths and weaknesses of each paradigm with some objective ramifications.

Data Definition Language

Since we're currently solely on MongoDB, we don't need to do things manually like adding a new table column when you add a new field to your Payload config. But with SQL, we need to think about that (and make it painless).

DDL, or Data Definition Language, is how SQL databases define and maintain their structure. Even when you use something like Prisma, which takes care of a lot of the headache for you, you need to become intimately familiar with how Prisma migrations work. There, Prisma translates its own schema syntax into SQL migrations for you.

Don't know what any of this means? Go ahead and try to simply follow the "Create a column in a table" example. If you know SQL and if this is not new to you, then you're probably okay with this. But there's work involved here.

When you decide you want to store a new field in a NoSQL database like MongoDB, you literally have to do nothing besides just start storing the data. No migrations, no generated DDL, no schema / database "sync".

That's obviously a big win for NoSQL, but that also means that the NoSQL structure is significantly more loosey-goosey and that can lead to errors. Lots of devs (not to mention project requirements) straight up require the structural constraints of SQL, and allowing any type of data to be stored in a NoSQL document can present issues in terms of data integrity. So there are definitely pros and cons to both approaches.

Migrations

Fascinatingly, another very hotly requested Payload feature is a first-party way of handling migrations. And that got us thinking. When you look at the Prisma migration resources above, clearly, they are using the word "migration" to describe the "migration" of your database shape from A to B. This is the expected and most widely used meaning of "migration".

But we're on MongoDB, and that doesn't apply. But developers are still asking for migration support. There is a specific distinction here that I'd like to make. Migrations can mean many things in many different contexts.

Changing database shape

Most often, they are referred to as simply modifying your database structure (DDL). This is a requirement for SQL in general. For example, adding a column to a table is a manual `ALTER` that needs to be run at some point.

Transforming existing data

But there's a whole different side to the story. What about if you need to change or transform your existing data? What if you had a full_name field, and you needed to split it up into two separate fields, first_name and last_name? Stuff like this is super common, and requires manual, programmatic management. Whenever anyone has asked Payload for migration support, this is what they meant.

Due to this second aspect of "migrations", that means that the concept is relevant for both NoSQL and SQL databases alike. Application frameworks like Laravel handle this very well, with a full, true "migrations" workflow that allows you to handle both DDL and data-based transformations within the same paradigm, all based on database transactions. I personally think this is by far the best way to solve this and this is an often-overlooked aspect of ORMs in general. Some ORMs come with great programmatic migration support, while others solely handle the DDL side (the pain in the ass side) in raw, generated SQL. Excuse my French, but I've been spoiled here by MongoDB, and I spoiled myself intentionally.

If you end up listening to our community planning call, you'll see that we cover our goals and future vision in-depth on that call. We want to model ourselves after Laravel's consistent (solve for both) approach by not outputting raw SQL, but rather, outputting TS files that give you granular, functional control.

Relational data and performance

One of the biggest wins for SQL in the minds of engineers is its heavily structured relational architecture. Without a doubt, relationships are a first-class citizen in SQL. Much of the querying language itself is dedicated specifically to combining and relating data from separate tables. But relationships can certainly be done in NoSQL, so what's the deal? What are the real-world differences?

First up, there is an entire array of functionality within SQL that actually enforces relational data integrity. Within MongoDB, those types of constraints would all need to be written manually, and you'll need to ultimately "own" the integrity of your data. In NoSQL, you might have a relationship to another document. You could go delete that document, but unless you've handled this yourself, the original relationship would remain, yet be completely invalid. That's not ideal, and that places more work on you.

Let's take Payload, for example. Surprise, surprise. We have a relationship field, and it can store IDs to other related documents which are seamlessly merged in when you retrieve documents from the DB. We leverage the dataloader pattern to batch together all "populations" required for a given query, returning them all super fast and with as few separate queries to the DB as possible. We actually even outperform SQL-based frameworks quite a bit. In a purely relational test, we were 3x faster than Directus and 7x faster than Strapi while both were running Postgres, and we were on MongoDB.

Without knowing what's under the hood of either of those platforms, my takeaway here is that relationships are possible in both NoSQL and SQL. If you were to write out queries deliberately for very specific joins between tables, and have the ability to define queries so that a single query is made vs. dynamic populations, then SQL will likely outshine MongoDB in terms of relational performance. But - your application logic will determine which is faster, and it's easy to get it wrong with either side.

One thing to note here is that if you are using a GraphQL API, or any type of dynamic / expressive querying language, it's impossible to make a single query that returns all populated "relations". This is because it's up to the shape of the query to determine what to "populate". This is known as the N+1 problem. You first need to load the main document, and only then can you determine what to load after. If this applies to you, then most relational performance gains from SQL are non-applicable.

Handling complex data structures

One absolutely massive win that NoSQL has over SQL is that you can store complex data directly in a single document. It can be stored in the same shape as you want it to be in when you read it back from the database. Some SQL databases like Postgres have pretty great JSON support via JSON columns, but if you over-use JSON columns in a SQL database, you're not taking advantage of the strengths of SQL in the first place and might as well go with NoSQL.

So what's it look like to store a complex data structure in SQL, without losing the benefits of SQL? For a simple example, take Payload's Group field:


About Joyk


Aggregate valuable and interesting links.
Joyk means Joy of geeK