
Stop Arguing, The AI Era Database Has Been Settled

Ruohang Feng, Pigsty Founder, @Vonng

This morning the industry lit up with news of another acquisition: right after Databricks’ $1 billion acquisition of Neon, its rival Snowflake immediately answered by acquiring CrunchyData.

According to insiders, the deal is priced at $250 million. That is a quarter of Neon’s price, but unlike Databricks’ stock swap, Snowflake is paying real cash, which gives the deal a distinctly confrontational “whatever Databricks buys, I buy” flavor.


But this isn’t just a grudge match between two data warehouse giants - PostgreSQL really is in the right place at the right time as databases rise in the AI era. Add the industry rumors of OpenAI acquiring Supabase, and the common thread among these acquisitions (actual or rumored) is clear: the targets are all PostgreSQL companies, and PostgreSQL companies are becoming the hottest commodities in the capital markets.

WSJ: Snowflake to acquire Crunchy Data for $250 million[1]


Why PostgreSQL?

Why is this happening? Microsoft CEO Nadella has already made it very clear - the constant in the AI era is the database ("SaaS is Dead? In the AI Era, Software Starts from Databases"). The frontend might shrink to a dialog box or to voice interaction alone, while part of the backend gets replaced by Agents and another part merges into the database (like Supabase). Across the entire IT field, only databases remain indispensable in the AI era.

So who will become the database of the AI era? Among developers worldwide, this question was settled long ago: PostgreSQL became the most used, most loved, and most in-demand database among global developers three years ago.


For example, when I asked friends at OpenAI why they chose PostgreSQL, they asked in return: “Isn’t PostgreSQL the default choice and the safe bet now? Not using PostgreSQL is what would need a special reason!” ("OpenAI: Scaling PostgreSQL to New Heights").

Companies like OpenAI and Cursor support their business at “true Web Scale” with just a single primary-replica PostgreSQL setup - other companies’ scenarios are naturally even less demanding.


Now PostgreSQL has not only become the consensus among developers, entrepreneurs, and the industry, but has also won capital’s favor. Capital has already voted with real money - PostgreSQL is the database of the AI era.

Many ask: why PostgreSQL? Lao Feng already explained this in “PostgreSQL is Eating the Database World”: PostgreSQL is the only framework capable of devouring the entire database world.

Open source and advanced technology are PG’s backbone, while its edge is “extensibility.” More and more database niches are being absorbed into the PostgreSQL ecosystem as extensions (“plugins”). This extensibility has not only made PostgreSQL the de facto standard in the OLTP world, but also gives it a head start in integrating the OLAP and big data ecosystems.


About CrunchyData

The acquired CrunchyData is one of the main players in the race to stitch DuckDB into PostgreSQL. Their recent focus has been PostgreSQL data warehousing (Crunchy Bridge). They also maintain a related open-source project, pg_parquet, which lets PG read and write Parquet files on S3. When it first came out, I packaged it into the Pigsty extension repository, and some users are actually using it.

CrunchyData is a well-known company in the PostgreSQL ecosystem. Tom Lane, a core member of the PostgreSQL community, works at this company. Their core business can be roughly summarized as:


A PostgreSQL database distribution: Crunchy Certified PostgreSQL. It is basically the usual high availability, monitoring, and backup/recovery tooling, with distinctive enterprise security features such as SELinux integration, TDE, and compliance certifications, plus remote DBA, training, and certification services.

A Postgres Kubernetes Operator. Lao Feng isn’t fond of putting databases in K8S, but CrunchyData’s PGO is clearly a first-tier player in this field.

And the PostgreSQL data warehouse they’ve been pushing since last year - yes, stitching DuckDB and Iceberg into PostgreSQL.

Lao Feng’s Commentary

Lao Feng thinks Snowflake’s acquisition of CrunchyData is very wise. Besides PostgreSQL itself being genuinely useful to Snowflake, which has always wanted to enter the OLTP field (earning a share by building value), there is also an important hidden thread (earning a share by being able to destroy value).

Big Data Is a Dead Man Walking

This involves a key industry insight - as the DuckDB manifesto puts it: Big Data is Dead (a dead man walking, so to speak). The trend already showed signs ten years ago ("The Lost Decade of Small Data: The Misdirection of Distributed Analytics"), but the real impact has only become visible in recent years: with modern hardware, a single machine (PostgreSQL/DuckDB) is sufficient to handle data analysis for the vast majority (let’s say 99.99%) of application scenarios.

CrunchyData happened to start pushing in exactly this direction last year, as covered in my article “PostgreSQL is eating the database world”, and this would have a devastating effect on Snowflake, which built its business on data warehousing.

Simply put, if PG is already the de facto standard for OLTP, isn’t it more convenient, cost-effective, and worry-free for users to run OLAP directly on PG rather than ETL their data into Snowflake or another big data solution? We did this at Apple a few years ago, using PostgreSQL as both OLTP and OLAP for industrial control systems, solving every problem with one database and eliminating the entire “big data” department. Five years ago this was niche, cutting-edge exploration; today the practice has entered the mainstream.

The final push that makes this practice mainstream is stitching PG together with DuckDB (DuckLake or Iceberg). Once the stitching is good enough, PG’s OLAP performance jumps straight into the top (T0) tier, and the existing OLAP/big data solutions have no way to survive - I describe this as “Mars Hitting Earth” in the database world.

The key obstacle is PG’s storage engine interface, the table access method (TAM) API. And that happens to be in the hands of Tom Lane at CrunchyData.

PG Kernel’s Veto Power

Over the past year, Tom Lane at CrunchyData has thrown quite a few wrenches into changes to PG’s table access method (TAM) interface, giving CrunchyBridge some advantages in data warehouse stitching (Duck/Iceberg) and causing some controversy in the community. For example, De Ge spoke about this directly:

Now, if you’re Snowflake’s CEO, what’s the most effective way to prevent (or guide and control) the PG + Duck convergence trend? Directly control a core member of the PG community, gain veto power over new features in the PG community, and effectively block the evolution of the PG TAM interface, thus capping how far pg and duckdb stitching can go. Acquiring CrunchyData achieves exactly this effect.

Moreover, Snowflake can push changes that favor PG/Snowflake integration into the PG kernel, gaining the advantage and the initiative in OLAP integration while PG devours the database world. Their competitors (like Supabase-acquired OrioleDB, pg_duckdb, and pg_mooncake) will face some constraints, with a vague “using the emperor to command the princes” feeling.

For example, Neon’s founder invested in pg_mooncake, and Databricks (Snowflake’s rival) acquired Neon. Since the archrival already has a position in PG OLAP analytics, this acquisition can also constrain competitors.

Of course, this path can at most be called “containment”; it can’t block things completely. For example, pg_mooncake recently started a complete rewrite in Rust, simply using TAM rather than being locked into it. Where there’s a will, there’s a way.


On the other hand, Supabase (rumored to be acquired by OpenAI) plans to use the OrioleDB kernel, which also depends on several table access method patches that have been stuck and haven’t entered the PG 18 kernel. This acquisition can also constrain other companies wanting to take this path - killing two birds with one stone.

Talent is the Most Critical Factor
#

Databricks, Snowflake, and (reportedly) OpenAI have undoubtedly launched a new round of acquisition battles in the database market.

The logic behind this is clear: databases remain a solid core component in the AI era, and PostgreSQL is “unifying and conquering” the entire database world. Cultivating or acquiring proxies in this field in time therefore becomes very important. Very few companies truly excel in the PostgreSQL field, so this is a game of “musical chairs.” Whoever can grab the core talent from these companies and bring them into the fold will be able to occupy larger ecological niches in the future.

In this regard, Lao Feng is quite proud, because among all these PostgreSQL companies, only Lao Feng is a “one-person company.” Lao Feng knows very well how lively this field is - even I, an “individual entrepreneur,” have a valuation of 100 million (by Lu Qi) - and more than one cloud vendor has offered 20 million to acquire it, though they’re all quite cunning, wanting to lock me in personally at a cheaper price. Anyway, Lao Feng is already profitable and stable - I’m not the one who’s anxious.

The logic behind this is that a single top-tier talent can wreck attempts to build an industry monopoly alliance. Based on the current deployment scale of open-source Pigsty, causing over 100 million in losses to RDS annually is a very conservative estimate, and it’s still growing. After all, who can beat zero-yuan shopping powered by love in a price war?

What’s more, this “open-source cancer” has already spilled over from China to compete globally (40%+ of users are overseas). Honestly, this kind of world-changing, table-flipping fun can’t be matched by any amount of money. Lao Feng is also working hard to see whether I can make Pigsty the DeepSeek of the database field, haha.

Ad Time

As usual, what’s the point of writing articles without ads? 😁

Open-source free PostgreSQL distribution: Look for Pigsty

https://pigsty.io

