The Relational DB Headaches I No Longer Have

I’ve been working with columnar databases for several years, and I’ve internalized many of their advantages from using Vertica and Redshift. I always run them in AWS. So I’ll just talk about the way I design, from a practical standpoint.

I never shy away from using human-readable field names.
I never shy away from using materialized views.
I never worry about complicated joins from a performance POV but only from a readability POV.
I almost always use denormalized fact tables. The only time I don’t is when the BI front-end can’t handle it.
I do 90% ELT and 10% ETL. Unless there is XML or JSON involved, I perform all transformations and cleansing in the database.
I do NOT depend on BI tools for complex metrics.
I love using very tall key-value pair (KVP) tables that run into the billions of rows, and I never worry about table scans.
I never worry about SELECT DISTINCT queries on non-indexed fields.
I never worry about storage requirements for indices.
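
To make the ELT and tall-KVP points concrete, here is a minimal sketch of loading raw key-value rows untouched and then cleansing and pivoting them inside the database with plain SQL. It uses Python's built-in SQLite purely as a stand-in so the example runs anywhere; SQLite is a row store, not a columnar engine, and the table names, column names, and the `pivot_kvp` helper are all hypothetical.

```python
import sqlite3


def pivot_kvp(rows):
    """Load raw (entity_id, key, value) rows as-is, then transform them in SQL."""
    conn = sqlite3.connect(":memory:")  # stand-in for a warehouse connection

    # Extract + Load: land the raw data without touching it.
    conn.execute(
        "CREATE TABLE raw_kvp (entity_id INTEGER, attr_key TEXT, attr_value TEXT)"
    )
    conn.executemany("INSERT INTO raw_kvp VALUES (?, ?, ?)", rows)

    # Transform: cleanse and pivot inside the database, producing a
    # denormalized table with human-readable column names.
    conn.execute(
        """
        CREATE TABLE customer_fact AS
        SELECT entity_id AS customer_id,
               MAX(CASE WHEN attr_key = 'country'
                        THEN UPPER(TRIM(attr_value)) END) AS country_code,
               MAX(CASE WHEN attr_key = 'plan'
                        THEN LOWER(TRIM(attr_value)) END) AS subscription_plan
        FROM raw_kvp
        GROUP BY entity_id
        """
    )
    result = conn.execute(
        "SELECT customer_id, country_code, subscription_plan "
        "FROM customer_fact ORDER BY customer_id"
    ).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    raw = [(1, "country", " us "), (1, "plan", "Pro"), (2, "country", "de")]
    for row in pivot_kvp(raw):
        print(row)
```

The same SQL shape carries over to a columnar warehouse, where the scan over the tall table is the part I no longer worry about.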

Since I mostly deal with data warehouses, I don’t spend much time concerning myself with commits and rollbacks. I have differences with my colleagues on that point, but in general I prefer dealing with transaction crap in in-memory databases like Redis and VoltDB. So my basic philosophy is to let designated source systems do overwrites for some fixed period of time and let volatility be what it will be. Especially when I’m working with a data lake, I don’t worry. I’m generous with space, so I will parse message queues to get the latest updates as they come, and store the transactions in compressed flat files. That way if push comes to shove I can recreate the input stream and literally know everything that is slowly changing. So, commit everything and keep all versions. Like git.
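
The "commit everything and keep all versions" idea can be sketched roughly like this: append every update from the queue to a compressed flat file, then replay the file to recover both the latest state and the full history of each key. This is only an illustration under assumptions of my own choosing: the gzip-compressed JSON Lines format, the `key`/`value` event shape, and the `append_events` and `replay` helpers are hypothetical, not a description of any particular pipeline.

```python
import gzip
import json
import os
import tempfile


def append_events(path, events):
    """Append events to a gzip-compressed JSON Lines file, never overwriting."""
    # Appending in gzip mode adds a new compressed member to the file;
    # Python's gzip reader handles multi-member files transparently.
    with gzip.open(path, "at", encoding="utf-8") as f:
        for event in events:
            f.write(json.dumps(event) + "\n")


def replay(path):
    """Recreate the input stream: return (latest value per key, all versions per key)."""
    history = {}
    with gzip.open(path, "rt", encoding="utf-8") as f:
        for line in f:
            event = json.loads(line)
            history.setdefault(event["key"], []).append(event["value"])
    latest = {key: versions[-1] for key, versions in history.items()}
    return latest, history


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "events.jsonl.gz")
    append_events(path, [{"key": "cust-1", "value": "bronze"}])
    append_events(path, [{"key": "cust-1", "value": "gold"}])
    latest, history = replay(path)
    print(latest)   # current state
    print(history)  # every version that ever arrived
```

Because nothing is ever overwritten, any slowly changing attribute can be reconstructed after the fact by replaying the file up to the point of interest.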

Michael David Cobb Bowen
