pg_ducklake
pg_ducklake : DuckLake lakehouse extension for PostgreSQL, backed by DuckDB and Parquet
Overview
| ID | Extension | Package | Version | Category | License | Language |
|---|---|---|---|---|---|---|
| 2490 | pg_ducklake
|
pg_ducklake
|
1.0.0 |
OLAP
|
MIT
|
C++
|
| Attribute | Has Binary | Has Library | Need Load | Has DDL | Relocatable | Trusted |
|---|---|---|---|---|---|---|
--sLd--
|
No
|
Yes
|
Yes
|
Yes
|
no
|
no
|
| Relationships | |
|---|---|
| Schemas | ducklake |
| See Also | pg_duckdb
duckdb_fdw
pg_mooncake
pg_analytics
pg_parquet
columnar
citus_columnar
|
Packages
| Type | Repo | Version | PG Major Compatibility | Package Pattern | Dependencies |
|---|---|---|---|---|---|
| EXT | PIGSTY
|
1.0.0 |
18
17
16
15
14
|
pg_ducklake |
- |
| RPM | PIGSTY
|
1.0.0 |
18
17
16
15
14
|
pg_ducklake_$v |
- |
| DEB | PIGSTY
|
1.0.0 |
18
17
16
15
14
|
postgresql-$v-pg-ducklake |
- |
| Linux / PG | PG18 | PG17 | PG16 | PG15 | PG14 |
|---|---|---|---|---|---|
el8.x86_64
|
MISS
|
MISS
|
MISS
|
MISS
|
MISS
|
el8.aarch64
|
MISS
|
MISS
|
MISS
|
MISS
|
MISS
|
el9.x86_64
|
PIGSTY 1.0.0
|
PIGSTY 1.0.0
|
PIGSTY 1.0.0
|
PIGSTY 1.0.0
|
PIGSTY 1.0.0
|
el9.aarch64
|
PIGSTY 1.0.0
|
PIGSTY 1.0.0
|
PIGSTY 1.0.0
|
PIGSTY 1.0.0
|
PIGSTY 1.0.0
|
el10.x86_64
|
PIGSTY 1.0.0
|
PIGSTY 1.0.0
|
PIGSTY 1.0.0
|
PIGSTY 1.0.0
|
PIGSTY 1.0.0
|
el10.aarch64
|
PIGSTY 1.0.0
|
PIGSTY 1.0.0
|
PIGSTY 1.0.0
|
PIGSTY 1.0.0
|
PIGSTY 1.0.0
|
d12.x86_64
|
PIGSTY 1.0.0
|
PIGSTY 1.0.0
|
PIGSTY 1.0.0
|
PIGSTY 1.0.0
|
PIGSTY 1.0.0
|
d12.aarch64
|
PIGSTY 1.0.0
|
PIGSTY 1.0.0
|
PIGSTY 1.0.0
|
PIGSTY 1.0.0
|
PIGSTY 1.0.0
|
d13.x86_64
|
PIGSTY 1.0.0
|
PIGSTY 1.0.0
|
PIGSTY 1.0.0
|
PIGSTY 1.0.0
|
PIGSTY 1.0.0
|
d13.aarch64
|
PIGSTY 1.0.0
|
PIGSTY 1.0.0
|
PIGSTY 1.0.0
|
PIGSTY 1.0.0
|
PIGSTY 1.0.0
|
u22.x86_64
|
PIGSTY 1.0.0
|
PIGSTY 1.0.0
|
PIGSTY 1.0.0
|
PIGSTY 1.0.0
|
PIGSTY 1.0.0
|
u22.aarch64
|
PIGSTY 1.0.0
|
PIGSTY 1.0.0
|
PIGSTY 1.0.0
|
PIGSTY 1.0.0
|
PIGSTY 1.0.0
|
u24.x86_64
|
PIGSTY 1.0.0
|
PIGSTY 1.0.0
|
PIGSTY 1.0.0
|
PIGSTY 1.0.0
|
PIGSTY 1.0.0
|
u24.aarch64
|
PIGSTY 1.0.0
|
PIGSTY 1.0.0
|
PIGSTY 1.0.0
|
PIGSTY 1.0.0
|
PIGSTY 1.0.0
|
u26.x86_64
|
PIGSTY 1.0.0
|
PIGSTY 1.0.0
|
PIGSTY 1.0.0
|
PIGSTY 1.0.0
|
PIGSTY 1.0.0
|
u26.aarch64
|
PIGSTY 1.0.0
|
PIGSTY 1.0.0
|
PIGSTY 1.0.0
|
PIGSTY 1.0.0
|
PIGSTY 1.0.0
|
Source
pig build pkg pg_ducklake; # build rpm/debInstall
Make sure PGDG and PIGSTY repo available:
pig repo add pgsql -u # add both repo and update cacheInstall this extension with pig:
pig install pg_ducklake; # install via package name, for the active PG version
pig install pg_ducklake -v 18; # install for PG 18
pig install pg_ducklake -v 17; # install for PG 17
pig install pg_ducklake -v 16; # install for PG 16
pig install pg_ducklake -v 15; # install for PG 15
pig install pg_ducklake -v 14; # install for PG 14Config this extension to shared_preload_libraries:
shared_preload_libraries = 'pg_ducklake';Create this extension with:
CREATE EXTENSION pg_ducklake;Usage
Sources: README, v1.0.0 release, project docs
pg_ducklake adds DuckLake tables to PostgreSQL. DuckLake metadata lives in PostgreSQL while table data is stored in Parquet and queried through DuckDB, giving PostgreSQL SQL clients access to lakehouse features such as snapshots, time travel, partitioning, sort keys, and external object storage.
Create A DuckLake Table
CREATE EXTENSION pg_ducklake;
CREATE TABLE events (
id int,
kind text,
ts timestamptz
) USING ducklake;
INSERT INTO events VALUES
(1, 'login', now()),
(2, 'click', now());
SELECT * FROM events ORDER BY id;Set a table path explicitly when data should live outside the default path:
CREATE TABLE lake_events (
id int,
payload jsonb
) WITH (
ducklake.table_path = 's3://my-bucket/prefix/'
) USING ducklake;Time Travel
Each commit creates a snapshot. Capture a snapshot id before a change, then query the older state:
SELECT max(snapshot_id) AS before_delete
FROM ducklake.ducklake_snapshot \gset
DELETE FROM events WHERE id = 1;
SELECT * FROM ducklake.time_travel('events'::regclass, :before_delete);Convert And Load Data
Create DuckLake tables from existing PostgreSQL heap tables or external data readers:
CREATE TABLE row_store AS
SELECT i AS id, 'hello pg_ducklake' AS msg
FROM generate_series(1, 10000) AS i;
CREATE TABLE col_store USING ducklake AS
SELECT * FROM row_store;
CREATE TABLE titanic USING ducklake AS
SELECT * FROM ducklake.read_csv('https://raw.githubusercontent.com/datasciencedojo/datasets/master/titanic.csv');Inlining, Partitioning, And Maintenance
Small writes are inlined in metadata by default to avoid creating many tiny Parquet files. Tune the row limit or flush explicitly:
CALL ducklake.set_option('data_inlining_row_limit', 100);
SELECT * FROM ducklake.flush_inlined_data('events'::regclass);Partition and sort tables for pruning and analytics:
CALL ducklake.set_partition('events'::regclass, 'bucket(4, id)', 'month(ts)');
CREATE INDEX ON events USING ducklake_sorted (id, ts);Run maintenance on demand when automatic background maintenance is not enough:
SELECT * FROM ducklake.merge_adjacent_files('events'::regclass);
CALL ducklake.set_option('expire_older_than', '7 days');
SELECT * FROM ducklake.expire_snapshots();
SELECT * FROM ducklake.cleanup_old_files();External DuckDB Access
DuckDB clients can attach the same DuckLake metadata:
INSTALL ducklake;
LOAD ducklake;
ATTACH 'ducklake:postgres:dbname=postgres host=localhost' AS my_ducklake
(METADATA_SCHEMA 'ducklake');
SELECT * FROM my_ducklake.public.events;Caveats
- Version 1.0.0 supports PostgreSQL 14-18.
- The README lists Ubuntu 22.04-24.04 and macOS as source-build targets.
- Cloud credentials are stored through a
ducklake_secretforeign server and per-user mappings; protect those catalog objects like other database secrets. - For incremental heap-to-DuckLake conversion, upstream points to the separate
pg_duckpipeproject.