PipelineDB extends PostgreSQL with streaming primitives—continuous views over unbounded input. Although the upstream project was discontinued, the concepts remain useful.
Install & configure#
PipelineDB ships as an extension. Install the RPM/DEB, then edit postgresql.conf:
shared_preload_libraries = 'pipelinedb'
max_worker_processes = 128
Restart Postgres. PipelineDB runs its continuous queries in background worker processes, so you must raise max_worker_processes or it will fail to launch.
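With the library preloaded, the extension still has to be created in each database that will use it (a minimal sketch of the standard extension step):

```sql
-- run once per database that should use PipelineDB
CREATE EXTENSION pipelinedb;
```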
Example: Wikipedia page views#
- Create a stream (foreign table backed by the PipelineDB handler):
CREATE FOREIGN TABLE wiki_stream (
  hour timestamp,
  project text,
  title text,
  view_count bigint,
  size bigint
) SERVER pipelinedb;
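Streams accept writes like ordinary tables, so a quick smoke test can use a plain INSERT (the values below are illustrative):

```sql
-- events written to a stream are consumed by continuous views, not stored
INSERT INTO wiki_stream (hour, project, title, view_count, size)
VALUES (now(), 'en', 'Main_Page', 1, 5621);
```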
- Create a continuous view to materialize rolling aggregates:
CREATE VIEW wiki_stats WITH (action = materialize) AS
  SELECT hour,
         project,
         count(*) AS total_pages,
         sum(view_count) AS total_views,
         min(view_count) AS min_views,
         max(view_count) AS max_views,
         avg(view_count) AS avg_views,
         percentile_cont(0.99) WITHIN GROUP (ORDER BY view_count) AS p99_views,
         sum(size) AS total_bytes_served
  FROM wiki_stream
  GROUP BY hour, project;
- Ingest data via COPY:
curl -sL http://pipelinedb.com/data/wiki-pagecounts | gunzip |
psql -c "COPY wiki_stream (hour, project, title, view_count, size) FROM STDIN"
wiki_stats now updates continuously as new rows arrive.
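The continuous view is queried like any other relation. For example, a sketch of ranking projects by accumulated views:

```sql
-- top projects by total views across all hours
SELECT project, sum(total_views) AS views
FROM wiki_stats
GROUP BY project
ORDER BY views DESC
LIMIT 10;
```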
Core concepts#
- Streams – foreign tables representing append-only input.
- Continuous views – materialized aggregates that update incrementally as stream events arrive.
- Transforms – continuous queries that filter or reshape incoming events and emit the results as a new stream, useful for preprocessing before aggregation.
PipelineDB lets you express streaming jobs with plain SQL and reuse the Postgres toolchain you already know.
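As an illustration of a transform, the sketch below filters the stream down to one project before aggregating; it assumes PipelineDB 1.x syntax, where a transform's output stream is read back with output_of():

```sql
-- keep only English-project rows, discarding the rest
CREATE VIEW en_only WITH (action = transform) AS
  SELECT hour, title, view_count
  FROM wiki_stream
  WHERE project = 'en';

-- aggregate the filtered stream
CREATE VIEW en_stats WITH (action = materialize) AS
  SELECT hour, sum(view_count) AS views
  FROM output_of('en_only')
  GROUP BY hour;
```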