Using the JOIN'ed mapping tables better than several fields in the same table?

advertisements

I have a table of about 360,000 records and here's performing a query on two indexed fields:

SELECT COUNT(*)
FROM emails
WHERE
department_id IN(1,2,3,4)
AND category_id IN (5,6,7,8)

(Time: 0.9624802)

           id: 1
  select_type: SIMPLE
        table: emails
         type: range
possible_keys: emails_department_id_idx,emails_category_id_idx
          key: emails_category_id_idx
      key_len: 5
          ref: NULL
         rows: 54018
        Extra: Using where

So only one index is being used there. (I can get an index merge to work when using simpler comparisons or range criteria, but I need to do checks like this against a list of IDs).

Here I created two new tables to map this relationship, and using JOIN's I replicated the same results:

SELECT COUNT(*)
FROM emails
LEFT JOIN email_to_department ON (email_to_department.email_id = emails.id AND email_to_department.department_id IN (1,2,3,4))
LEFT JOIN email_to_category ON (email_to_category.email_id = emails.id AND email_to_category.category_id IN (5,6,7,8))
WHERE
email_to_department.department_id IS NOT NULL
AND email_to_category.category_id IS NOT NULL

(Time: 0.5217777)

*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: email_to_category
         type: range
possible_keys: PRIMARY,category_id
          key: category_id
      key_len: 4
      ref: NULL
     rows: 61282
    Extra: Using where; Using index
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: email_to_department
         type: ref
possible_keys: PRIMARY,department_id
          key: PRIMARY
      key_len: 4
          ref: testdb.email_to_category.email_id
         rows: 1
        Extra: Using where; Using index
*************************** 3. row ***************************
           id: 1
  select_type: SIMPLE
        table: emails
         type: eq_ref
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 4
          ref: testdb.email_to_category.email_id
         rows: 1
        Extra: Using index
3 rows in set (0.38 sec)

So each query now uses an index, and trims almost half the time off. Is this bad design? Should I write the rest of these relationships in the same way?

If I add more criteria in the same way, the JOIN'd version of the query seems to get faster and the other remains more or less the same.

Doing a simple query on just a single indexed field is very very fast of course:

SELECT COUNT(*)
FROM emails
WHERE department_id IN(1,2,3,4)

Is there another strategy I might use to make these kinds of queries faster still? There are other properties that need to be filtered on as well, and in different combinations, so creating multi-column indexes won't really help.

This is really a question about database normalization. You can look for information on that topic in many places.

Basic answer Since there is a huge literature about this, and there are a lot of differences, I will just point out that it boils down to trade-offs; Speed versus storage requirements, or ease of use versus data duplication. You may want to understand what normalization is in order to understand why you would or would not want to do it.

Further reading This is a deep topic, you may want to learn more - there are hundreds of books and thousands of scholarly papers on these issues. For instance, look at this previous SO question about designing a database: Database design: one huge table or separate tables?, or this one: First-time database design: am I overengineering? or Database Normalization Basics on About.com.

Using the JOIN'ed mapping tables better than several fields in the same table?

Using the JOIN'ed mapping tables better than several fields in the same table?

Recommend

神策数据发布《新国货连锁餐饮品牌数字营销研究报告》

Hosting an Alexa Skill on Pipedream

每周分享第154期

Migrating to utf8mb4: Things to Consider

海伦司赚钱难，10 元小酒馆留不住年轻人？

使用CSS size-adjust和unicode-range改变任意文字尺寸

What is QAOps?

NASA Continues Slow And Steady Pace Towards Moon

Select a branch via URL param for Jenkins job build

2021年如何对安卓app抓包教程

About Joyk