Skip to content
pg_ducklake

pg_ducklake

pg_ducklake : DuckLake lakehouse extension for PostgreSQL, backed by DuckDB and Parquet

Overview

ID Extension Package Version Category License Language
2490
pg_ducklake
pg_ducklake
1.0.0
OLAP
MIT
C++
Attribute Has Binary Has Library Need Load Has DDL Relocatable Trusted
--sLd--
No
Yes
Yes
Yes
no
no
Relationships
Schemas ducklake
See Also
pg_duckdb
duckdb_fdw
pg_mooncake
pg_analytics
pg_parquet
columnar
citus_columnar

Packages

Type Repo Version PG Major Compatibility Package Pattern Dependencies
EXT
PIGSTY
1.0.0
18
17
16
15
14
pg_ducklake -
RPM
PIGSTY
1.0.0
18
17
16
15
14
pg_ducklake_$v -
DEB
PIGSTY
1.0.0
18
17
16
15
14
postgresql-$v-pg-ducklake -
Linux / PG PG18 PG17 PG16 PG15 PG14
el8.x86_64
MISS
MISS
MISS
MISS
MISS
el8.aarch64
MISS
MISS
MISS
MISS
MISS
el9.x86_64
PIGSTY 1.0.0
PIGSTY 1.0.0
PIGSTY 1.0.0
PIGSTY 1.0.0
PIGSTY 1.0.0
el9.aarch64
PIGSTY 1.0.0
PIGSTY 1.0.0
PIGSTY 1.0.0
PIGSTY 1.0.0
PIGSTY 1.0.0
el10.x86_64
PIGSTY 1.0.0
PIGSTY 1.0.0
PIGSTY 1.0.0
PIGSTY 1.0.0
PIGSTY 1.0.0
el10.aarch64
PIGSTY 1.0.0
PIGSTY 1.0.0
PIGSTY 1.0.0
PIGSTY 1.0.0
PIGSTY 1.0.0
d12.x86_64
PIGSTY 1.0.0
PIGSTY 1.0.0
PIGSTY 1.0.0
PIGSTY 1.0.0
PIGSTY 1.0.0
d12.aarch64
PIGSTY 1.0.0
PIGSTY 1.0.0
PIGSTY 1.0.0
PIGSTY 1.0.0
PIGSTY 1.0.0
d13.x86_64
PIGSTY 1.0.0
PIGSTY 1.0.0
PIGSTY 1.0.0
PIGSTY 1.0.0
PIGSTY 1.0.0
d13.aarch64
PIGSTY 1.0.0
PIGSTY 1.0.0
PIGSTY 1.0.0
PIGSTY 1.0.0
PIGSTY 1.0.0
u22.x86_64
PIGSTY 1.0.0
PIGSTY 1.0.0
PIGSTY 1.0.0
PIGSTY 1.0.0
PIGSTY 1.0.0
u22.aarch64
PIGSTY 1.0.0
PIGSTY 1.0.0
PIGSTY 1.0.0
PIGSTY 1.0.0
PIGSTY 1.0.0
u24.x86_64
PIGSTY 1.0.0
PIGSTY 1.0.0
PIGSTY 1.0.0
PIGSTY 1.0.0
PIGSTY 1.0.0
u24.aarch64
PIGSTY 1.0.0
PIGSTY 1.0.0
PIGSTY 1.0.0
PIGSTY 1.0.0
PIGSTY 1.0.0
u26.x86_64
PIGSTY 1.0.0
PIGSTY 1.0.0
PIGSTY 1.0.0
PIGSTY 1.0.0
PIGSTY 1.0.0
u26.aarch64
PIGSTY 1.0.0
PIGSTY 1.0.0
PIGSTY 1.0.0
PIGSTY 1.0.0
PIGSTY 1.0.0
Package Version OS ORG SIZE File URL
pg_ducklake_18 1.0.0 el9.x86_64 pigsty 16.0 MiB pg_ducklake_18-1.0.0-1PIGSTY.el9.x86_64.rpm
pg_ducklake_18 1.0.0 el9.aarch64 pigsty 14.6 MiB pg_ducklake_18-1.0.0-1PIGSTY.el9.aarch64.rpm
pg_ducklake_18 1.0.0 el10.x86_64 pigsty 14.6 MiB pg_ducklake_18-1.0.0-1PIGSTY.el10.x86_64.rpm
pg_ducklake_18 1.0.0 el10.aarch64 pigsty 12.9 MiB pg_ducklake_18-1.0.0-1PIGSTY.el10.aarch64.rpm
postgresql-18-pg-ducklake 1.0.0 d12.x86_64 pigsty 13.5 MiB postgresql-18-pg-ducklake_1.0.0-1PIGSTY~bookworm_amd64.deb
postgresql-18-pg-ducklake 1.0.0 d12.aarch64 pigsty 11.6 MiB postgresql-18-pg-ducklake_1.0.0-1PIGSTY~bookworm_arm64.deb
postgresql-18-pg-ducklake 1.0.0 d13.x86_64 pigsty 14.5 MiB postgresql-18-pg-ducklake_1.0.0-1PIGSTY~trixie_amd64.deb
postgresql-18-pg-ducklake 1.0.0 d13.aarch64 pigsty 12.4 MiB postgresql-18-pg-ducklake_1.0.0-1PIGSTY~trixie_arm64.deb
postgresql-18-pg-ducklake 1.0.0 u22.x86_64 pigsty 12.3 MiB postgresql-18-pg-ducklake_1.0.0-1PIGSTY~jammy_amd64.deb
postgresql-18-pg-ducklake 1.0.0 u22.aarch64 pigsty 11.2 MiB postgresql-18-pg-ducklake_1.0.0-1PIGSTY~jammy_arm64.deb
postgresql-18-pg-ducklake 1.0.0 u24.x86_64 pigsty 12.5 MiB postgresql-18-pg-ducklake_1.0.0-1PIGSTY~noble_amd64.deb
postgresql-18-pg-ducklake 1.0.0 u24.aarch64 pigsty 11.5 MiB postgresql-18-pg-ducklake_1.0.0-1PIGSTY~noble_arm64.deb
postgresql-18-pg-ducklake 1.0.0 u26.x86_64 pigsty 13.1 MiB postgresql-18-pg-ducklake_1.0.0-1PIGSTY~resolute_amd64.deb
postgresql-18-pg-ducklake 1.0.0 u26.aarch64 pigsty 12.2 MiB postgresql-18-pg-ducklake_1.0.0-1PIGSTY~resolute_arm64.deb

Source

pig build pkg pg_ducklake;		# build rpm/deb

Install

Make sure PGDG and PIGSTY repo available:

pig repo add pgsql -u   # add both repo and update cache

Install this extension with pig:

pig install pg_ducklake;		# install via package name, for the active PG version

pig install pg_ducklake -v 18;   # install for PG 18
pig install pg_ducklake -v 17;   # install for PG 17
pig install pg_ducklake -v 16;   # install for PG 16
pig install pg_ducklake -v 15;   # install for PG 15
pig install pg_ducklake -v 14;   # install for PG 14

Config this extension to shared_preload_libraries:

shared_preload_libraries = 'pg_ducklake';

Create this extension with:

CREATE EXTENSION pg_ducklake;

Usage

Sources: README, v1.0.0 release, project docs

pg_ducklake adds DuckLake tables to PostgreSQL. DuckLake metadata lives in PostgreSQL while table data is stored in Parquet and queried through DuckDB, giving PostgreSQL SQL clients access to lakehouse features such as snapshots, time travel, partitioning, sort keys, and external object storage.

Create A DuckLake Table

CREATE EXTENSION pg_ducklake;

CREATE TABLE events (
  id int,
  kind text,
  ts timestamptz
) USING ducklake;

INSERT INTO events VALUES
  (1, 'login', now()),
  (2, 'click', now());

SELECT * FROM events ORDER BY id;

Set a table path explicitly when data should live outside the default path:

CREATE TABLE lake_events (
  id int,
  payload jsonb
) WITH (
  ducklake.table_path = 's3://my-bucket/prefix/'
) USING ducklake;

Time Travel

Each commit creates a snapshot. Capture a snapshot id before a change, then query the older state:

SELECT max(snapshot_id) AS before_delete
FROM ducklake.ducklake_snapshot \gset

DELETE FROM events WHERE id = 1;

SELECT * FROM ducklake.time_travel('events'::regclass, :before_delete);

Convert And Load Data

Create DuckLake tables from existing PostgreSQL heap tables or external data readers:

CREATE TABLE row_store AS
SELECT i AS id, 'hello pg_ducklake' AS msg
FROM generate_series(1, 10000) AS i;

CREATE TABLE col_store USING ducklake AS
SELECT * FROM row_store;

CREATE TABLE titanic USING ducklake AS
SELECT * FROM ducklake.read_csv('https://raw.githubusercontent.com/datasciencedojo/datasets/master/titanic.csv');

Inlining, Partitioning, And Maintenance

Small writes are inlined in metadata by default to avoid creating many tiny Parquet files. Tune the row limit or flush explicitly:

CALL ducklake.set_option('data_inlining_row_limit', 100);
SELECT * FROM ducklake.flush_inlined_data('events'::regclass);

Partition and sort tables for pruning and analytics:

CALL ducklake.set_partition('events'::regclass, 'bucket(4, id)', 'month(ts)');
CREATE INDEX ON events USING ducklake_sorted (id, ts);

Run maintenance on demand when automatic background maintenance is not enough:

SELECT * FROM ducklake.merge_adjacent_files('events'::regclass);
CALL ducklake.set_option('expire_older_than', '7 days');
SELECT * FROM ducklake.expire_snapshots();
SELECT * FROM ducklake.cleanup_old_files();

External DuckDB Access

DuckDB clients can attach the same DuckLake metadata:

INSTALL ducklake;
LOAD ducklake;
ATTACH 'ducklake:postgres:dbname=postgres host=localhost' AS my_ducklake
  (METADATA_SCHEMA 'ducklake');

SELECT * FROM my_ducklake.public.events;

Caveats

  • Version 1.0.0 supports PostgreSQL 14-18.
  • The README lists Ubuntu 22.04-24.04 and macOS as source-build targets.
  • Cloud credentials are stored through a ducklake_secret foreign server and per-user mappings; protect those catalog objects like other database secrets.
  • For incremental heap-to-DuckLake conversion, upstream points to the separate pg_duckpipe project.
Last updated on