Nine times out of ten, redundancies impede performance. That’s why we remove as many redundancies from databases as possible.

But this isn’t always the best approach to streamlining query speed. Sometimes, a little redundancy is healthy for a database, and denormalization proves exactly that.

Here, we’ll discuss what denormalization is, why we use it, and when we should or shouldn’t implement it—along with a few actionable tips.

Database Denormalization Defined

In the broadest strokes, we can describe denormalization as a way to improve query performance by adding otherwise redundant data to tables.

Instead of removing duplicate data (i.e., normalizing the database), we place it in tables where, upon a given query, it will be more readily available for reads. The data duplicates take up more memory, but they also make certain queries faster since complex joins become unnecessary and fewer tables are involved.

Normalization Vs Denormalization

In contrast to denormalization, normalization entails removing redundant data to save memory space. Cutting out this clutter improves general database performance, especially when a lot of insert, delete, or update operations are used.

So, denormalization isn’t a lack of normalization, nor is it the act of undoing normalization per se. It’s more of an exception to the rule of normalization: sometimes, redundancies are actually performant.

Why Denormalize Databases?

Denormalization offers a few benefits that make it worthwhile:

Lower complexity: fewer tables are needed overall; thus, the schema becomes simpler and more manageable.

Faster queries: denormalization reduces the number of necessary joins for particular operations.

More readable data: data becomes more accessible for reads since it has fewer tables.

Fewer errors: lower number of tables translates to a lower number of sticking points.

The Bad Side of Denormalization

Denormalization isn’t optimal in every way, as it also has its disadvantages:

Inconsistencies are more likely: since duplicated data is in multiple places, updates are more likely to create inconsistencies.

Harder to update and insert data: update and insert operations become more costly and the respective code more difficult to write.

Greater memory usage: more redundancy means more data taking up disk and memory space.

Other queries may slow down: the queries you want to improve through denormalization will be faster, but other queries might suffer as a result.

When to Use Denormalization

In some cases, the pros of denormalization outweigh the cons. Here are instances where denormalization is really useful:

Streamlined reporting, especially involving data we use often

Preemptively computing frequently used values

Maintaining data history

Optimizing operations that heavily use joins

Databases where you can afford some queries slowing down

OLAP databases for applications that mostly read data

Common Denormalization Techniques

Mirrored Tables

A mirrored table partially or completely copies an existing table, is stored in another location, and is optimized for faster queries. The data in that table is more easily available, which in turn makes reading data faster.

Pre-joining Tables

The main purpose of pre-joining tables is to put together data an application uses frequently. To avoid repeatedly processing complex join operations, we create a table that holds data from other tables that get read the most.

Table Splitting

This technique entails splitting normalized tables into smaller ones that are queried and managed more easily. We have two methods of table splitting:

Horizontal: splitting tables by rows while keeping the same columns; useful for departmentalized operations

Vertical: splitting tables by columns and keeping the same rows; useful when some columns are accessed much more than others

Storing Details With Their Master

In most cases, we keep the main information in master tables and more specific info in slave tables. Storing these particular details in the master table can help fetch them faster. That said, too many detail-related records in the master table can slow down queries.

Repeating a Detail With the Master

This technique is kind of a solution to the problem mentioned above. If a single detail record is fetched a lot, it’s a good idea to keep only that record on the master table. That way, we don’t over-clutter the table.

Short-Circuit Keys

Certain master-detail relationships are several layers deep (e.g., four tables in between), making them more complex. To optimize queries that connect master tables with the farthest details, a short-circuit key essentially removes the tables between them.

Database Denormalization: Best Practices

Below are a few things to keep in mind if you’re denormalizing a database:

Be sure you need denormalization in the first place.

Analyze query performance to see which queries could actually benefit from denormalization

Test denormalized tables thoroughly before going live

Don’t denormalize too much (keep the mantra of “normalize till it hurts, denormalize till it doesn’t” in mind)

Document and monitor performance changes carefully to see how effective your denormalization effort is

Carefully design denormalized tables

Keep redundant data consistent via triggers that apply updates to all tables with said data

In case you want to revert changes made by denormalization, keep database backups on hand

Inery•

2 years ago

Our Vision for the Internet: By the users, for the users

Leading the forefront of the web3 to give control over data back to the users by ensuring that the internet is open, accessible and collaborative. ...READ MORE

Inery•

2 years ago

The True Cost of Centralized Database Management

The faults of centralized database management harm companies and consumers every day. Click here to see how—and how to avoid the cost. ...READ MORE

Inery•

2 years ago

Inery Community Updates Edition #1

Community Corner ...READ MORE

Inery•

7 months ago

Data Disasters: The Worst Cases of Poor Database Management (And What We Can Learn)

Poor database management has led to some of the worst data disasters ever. Dive into these cases and find out how Inery can keep data secure. ...READ MORE

Inery•

What Is Database Denormalization? Guide + Tips

Databases Unchained: Navigating the Centralized-Decentralized Landscape

How To Transfer MongoDB Data Into IneryDB

Database Denormalization Defined

Normalization Vs Denormalization

Why Denormalize Databases?

The Bad Side of Denormalization

When to Use Denormalization

Common Denormalization Techniques

Mirrored Tables

Pre-joining Tables

Table Splitting

Storing Details With Their Master

Repeating a Detail With the Master

Short-Circuit Keys

Database Denormalization: Best Practices

Inery•

Our Vision for the Internet: By the users, for the users

Inery•

The True Cost of Centralized Database Management

Inery•

Inery Community Updates Edition #1

Inery•

Data Disasters: The Worst Cases of Poor Database Management (And What We Can Learn)