7

Should You Keep Your Business Logic In Your Database?

 1 year ago
source link: https://www.percona.com/blog/should-you-keep-your-business-logic-in-your-database/
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.
neoserver,ios ssh client

Should You Keep Your Business Logic In Your Database?

July 31, 2023

David Stokes

Open source database architects usually do not implement business logic in their databases. This is in stark contrast to many commercial databases where this is a common practice. In the first case, all the heuristics are kept at the application layer, and the database has little or no effect on the data quality. The second case has the rules required by management are handled by the database itself, not relying on the software developers to be fully cognizant of rules.

Now part of this stance in the open source world is that there was a lack of tooling for business logic in the past in the MySQL universe. Constraint checks that actually do what they are supposed to do is still a very recent addition to MySQL and found only in 8.0.16 or later, which is another reason to upgrade. Previous MySQL versions checked the syntax but did not perform the required work for a constraint check. But now that the tooling exists, it may be time to reexamine the traditional stance. PostgreSQL has long featured the ability to do what you will see in the following examples, and MongoDB also has some capabilities in this area. So the capabilities are there if you are inclined to move your business logic to your database layer.

Keeping bad data out of a database is much less expensive in time and money than correcting it after it is snuggly placed in your tables. You may already use some of the following tools to filter your data. But you can have more rigor in assuring your data is pristine before it becomes a row or a document.

Some examples? There are many ways to protect your data, but we will start with ENUMs and VIEWs, as they have been around MySQL for a very long time.

ENUMs

An ENUM is a string object with a value chosen from a list of permitted values that are enumerated explicitly in the column specification at table creation time. – MySQL Manual

The concept of the ENUM is great. You set up predefined values that you want to include in your tables, kick out data that does not match those predefined values, and the data is compliant with what you need. MongoDB, MySQL, and PostgreSQL all accommodate ENUMs. In the following example, an ENUM is created that will take only three values and reject other values, supposedly.

SQL > CREATE TABLE enum_example (id int, is_good ENUM ('yes','no','maybe'));
Query OK, 0 rows affected (0.0188 sec)
SQL > INSERT INTO enum_example (id, is_good) values (1,'yes');
Query OK, 1 row affected (0.0066 sec)
SQL > INSERT INTO enum_example (id, is_good) values (2,'no');
Query OK, 1 row affected (0.0034 sec)
SQL > INSERT INTO enum_example (id, is_good) values (3,'unknown');
ERROR: 1265: Data truncated for column 'is_good' at row 1
SQL > show warnings;
+-------+------+----------------------------------------------+
| Level | Code | Message |
+-------+------+----------------------------------------------+
| Error | 1265 | Data truncated for column 'is_good' at row 1 |
+-------+------+----------------------------------------------+
1 row in set (0.0009 sec)

In the above example, all is good when we input the predefined values, and, as expected, the row with non-compliant data is rejected outright. But what happens when we try everybody’s favorite value of NULL?

SQL > INSERT INTO enum_example (id, is_good) values (3,NULL);
Query OK, 1 row affected (0.0040 sec)
SQL > SELECT * FROM enum_example;
+----+---------+
| id | is_good |
+----+---------+
| 1 | yes |
| 2 | no |
| 3 | NULL |
+----+---------+
3 rows in set (0.0027 sec)

Dang! NULL is not one of our predefined good values, and yet there it sits, fat and happy, in our database table. This is different from what we want either logically or as data. This NULL problem can be avoided by setting the SQL Mode to STRICT.

Another reason ENUMs are not extremely popular is that they are tricky to sort because ENUMS are sorted based on their index number. In the above example, ‘yes’ would sort first, ‘no’ second, and ‘unknown’ would be third. It can be tricky if that order is not good for you or if you need to sort the values alphabetically.

SQL > SELECT id, is_good FROM enum_example ORDER BY is_good;
+----+---------+
| id | is_good |
+----+---------+
| 3 | NULL |
| 1 | yes |
| 2 | no |
+----+---------+
3 rows in set (0.0012 sec)

And NULL again does what NULLs do and gets sorted first. The concept of ENUMs is pretty spiffy, but the implementation can cause headaches. This does not mean that you should not use them, but you need to be aware of their shortcomings and make accommodations for them.

VIEWS

You may not think of views as part of the realm of business logic, but they are valuable for obfuscating data from prying eyes. Views can be treated like tables, and you can define them so that the permissions of the person using the view are much less than the creator of the view. The original table has restricted access for approved users only, and the view can be established so that the caller can have indirect access to some of that table.

In the following example, we have a table with a secret column. The various permissions to restrict access are omitted for clarity. This table is created so that only selected accounts can see that secret column.

SQL > CREATE TABLE secret_stuff (id int, secret int);
Query OK, 0 rows affected (0.0412 sec)
SQL > INSERT INTO secret_stuff (id, secret) VALUES (1,1),(2,2);
Query OK, 2 rows affected (0.0062 sec)
Records: 2 Duplicates: 0 Warnings: 0
SQL > SELECT * FROM secret_stuff;
+----+--------+
| id | secret |
+----+--------+
| 1 | 1 |
| 2 | 2 |
+----+--------+
2 rows in set (0.0011 sec)

A view is then created. The user of this view has no direct access to the secret column but can get to other less protected data in the table.

SQL > CREATE VIEW see_secret AS SELECT id, 'XYZ' FROM secret_stuff;
Query OK, 0 rows affected (0.0084 sec)SQL > select * from see_secret;
+----+-----+
| id | XYZ |
+----+-----+
| 1 | XYZ |
| 2 | XYZ |
+----+-----+
2 rows in set (0.0045 sec)

The data in the table is protected from seeing the secret columns directly as they can not see the name of, let alone access, that column.

VIEWS with Data Masking in Percona Server for MySQL

A cool feature of Percona Server for MySQL is the ability to mask data. This way, part of the secret column is kept away from the user of the view, such as the last four numbers of a phone number or some identification number.

More details can be found in Data Masking With Percona Server for MySQL – An Enterprise Feature at a Community Price.

Check constraints

MySQL 8.0.16 finally delivered working check constraints. In earlier versions, they were ignored. That caused much gnashing of teeth by MySQL DBAs, and many other fans of other databases would justifiably point out this problem while making ‘toy database jokes. But for the past several years, check constraints have been available if under-utilized.

There are two types of constraints- one for columns ad one for tables. In the next example, a constraint is established on column b to ensure it has a value greater than one. Note that the constraint is named b_gt_1 and that you can use that constraint name once per schema. It is highly recommended that you name the constraints to make it easier to identify when troubles arise.

SQL > CREATE TABLE cc (a INT,
                       b INT CONSTRAINT b_gt_1 CHECK (b > 1));
Query OK, 0 rows affected (0.0189 sec)
SQL > INSERT INTO cc (a,b) VALUES (1,2);
Query OK, 1 row affected (0.0064 sec)
SQL > INSERT INTO cc (a,b) VALUES (2,0);
ERROR: 3819: Check constraint 'b_gt_1' is violated.

In the next example, the last constraint defined is on the table, and two columns are compared.

SQL > CREATE TABLE cc2 (a INT,
                        b INT CONSTRAINT b_gt_2 CHECK (b > 2),
                        check (a > b));
Query OK, 0 rows affected (0.0216 sec)
SQL > INSERT INTO cc2 VALUES (1,2);
ERROR: 3819: Check constraint 'b_gt_2' is violated.
SQL > INSERT INTO cc2 VALUES (1,3);
ERROR: 3819: Check constraint 'cc2_chk_1' is violated.

Notice in the above that when the table constraint is violated, the server has assigned the name ‘cc2_chk_1’. When the constraint fails, and the error message pops up, you are forced to look for the unnamed constraint and, in this case, find the first one. So multiple unnamed constraints can be a pain to find.

Column and table constraints can be combined and as complicated as you are willing to make them.

Triggers

The use of triggers can fix a lot of problems. They can be executed before, after, or upon deletion of a row in a table, allowing that data to be saved in another table. Why is that important? It allows you to create events around changes in the data. Maybe when a customer deletes their account, you want to add their email address to the ‘do not bulk email’ table, record the timestamp of their leaving in another table, and check on any orders being processed they may have.

The following example is an audit trail of when customer representatives change on a customer trouble ticket account. For some reason, management wants to see how often the current customer representative changes. For reasons of clarity, timestamp columns are omitted in the tables below.

The first table is for the trouble tickets.

SQL > CREATE TABLE ticket (id int, customer int, rep_current char(10);
Query OK, 0 rows affected (0.0205 sec)

The next table is the log for changes in those trouble tickets.

SQL > create table ticket_log (id int, customer int, representative char(10));
Query OK, 0 rows affected (0.0156 sec)

Now a trigger is set up so that changes in the ticket table get stored in the ticket_log table.

SQL> DELIMITER $$
SQL > CREATE TRIGGER ticket_change AFTER UPDATE ON ticket FOR EACH ROW
-> BEGIN
-> INSERT INTO ticket_log VALUES (OLD.id, OLD.customer, NEW.rep_current);
-> END $$
Query OK, 0 rows affected (0.0088 sec)
SQL> DELIMITER ;
SQL > UPDATE ticket SET rep_current='Mary' WHERE id=10;
Query OK, 1 row affected (0.0034 sec)
Rows matched: 1 Changed: 1 Warnings: 0
SQL > select * from ticket_log;
+----+----------+----------------+
| id | customer | representative |
+----+----------+----------------+
| 10 | 234 | Mary |
+----+----------+----------------+
1 row in set (0.0009 sec)

Depending on your database, you can have multiple triggers on your table, specify the order of their execution, and define the account that uses the trigger when you need extra data security. Examples of logical checks that could be implemented at this level are customer credit limits, minimum order policies, and stock-on-hand reorder quantity checks that let a business keep humming along.

Sadly for MongoDB users, triggers are only available on the Atlas platform.

Stored procedures

PostgreSQL has fantastic support for stored procedures, while MongoDB has a ‘sort of, kind of’ equivalent, and MySQL has minimal support. A stored procedure can be more secure than ad-hoc queries, can sometimes reduce network traffic, and can keep the code centralized on the database server.
Do We Want The Smarts In The Code Or The Database?

Conclusion

As previously mentioned, most open source database practitioners keep the business logic in their code. That is a practical and reasonable approach as long as all those doing the coding understand the rules and properly implement those rules. But as staff grows, the dissemination of that knowledge may not be passed on properly. Costly mistakes can be made, laws unintentionally violated, and rows are converted to useless gibberish when this happens.

Putting the business logic in the database makes sense in many places. If you sell products only to adults, you will want to check the customer’s age. Promotional details like having the minimum order value for free shipping is $50 or ordering ten or more and getting a 15% discount are easy to implement. Do you have enough stock on hand to ship immediately, or do you need to provide an estimated delivery date from a supplier’s warehouse? That logic is not spread out over several applications but is kept at the database level as a firewall, an arbitrator, and a sole reference.

In the modern world where the number of databases is mushrooming, and the only thing growing faster is the number of projects using those databases, the ability to enforce business rules and logic by relying on an ever-expanding codebase is not practical. Ask yourself if application code change created by an increase in postal shipping rates for your operation is best served by searching a voluminous codebase, updating programs, and deploying new code with a high probability of missing some applications or making one change at the database level.

Now is the time to start looking at your systems to see where you can use some of the above techniques to start moving some of your business logic to the database. There will be cases where having your database do the work is the obvious solution and the sole implementation of the business logic.

Percona Distribution for MySQL is the most complete, stable, scalable, and secure open source MySQL solution available, delivering enterprise-grade database environments for your most critical business applications… and it’s free to use!

Try Percona Distribution for MySQL today!

Share This Post!

Subscribe
Connect with
guest

Label

0 Comments

About Joyk


Aggregate valuable and interesting links.
Joyk means Joy of geeK