Smithery Logo
MCPsSkillsDocsPricing
Login
Smithery Logo

Accelerating the Agent Economy

Resources

DocumentationPrivacy PolicySystem Status

Company

PricingAboutBlog

Connect

© 2026 Smithery. All rights reserved.

    zmre

    postgresql-table-design

    zmre/postgresql-table-design
    Data & Analytics
    3 installs

    About

    SKILL.md

    Install

    Install via Skills CLI

    or add to your agent
    • Claude Code
      Claude Code
    • Codex
      Codex
    • OpenClaw
      OpenClaw
    • Cursor
      Cursor
    • Amp
      Amp
    • GitHub Copilot
      GitHub Copilot
    • Gemini CLI
      Gemini CLI
    • Kilo Code
      Kilo Code
    • Junie
      Junie
    • Replit
      Replit
    • Windsurf
      Windsurf
    • Cline
      Cline
    • Continue
      Continue
    • OpenCode
      OpenCode
    • OpenHands
      OpenHands
    • Roo Code
      Roo Code
    • Augment
      Augment
    • Goose
      Goose
    • Trae
      Trae
    • Zencoder
      Zencoder
    • Antigravity
      Antigravity
    ├─
    ├─
    └─

    About

    Design a PostgreSQL-specific schema. Covers best-practices, data types, indexing, constraints, performance patterns, and advanced features

    SKILL.md

    PostgreSQL Table Design

    Core Rules

    • Define a PRIMARY KEY for reference tables (users, orders, etc.). Not always needed for time-series/event/log data. When used, prefer BIGINT GENERATED ALWAYS AS IDENTITY; use UUID only when global uniqueness/opacity is needed.
    • Normalize first (to 3NF) to eliminate data redundancy and update anomalies; denormalize only for measured, high-ROI reads where join performance is proven problematic. Premature denormalization creates maintenance burden.
    • Add NOT NULL everywhere it's semantically required; use DEFAULTs for common values.
    • Create indexes for access paths you actually query: PK/unique (auto), FK columns (manual!), frequent filters/sorts, and join keys.
    • Prefer TIMESTAMPTZ for event time; NUMERIC for money; TEXT for strings; BIGINT for integer values, DOUBLE PRECISION for floats (or NUMERIC for exact decimal arithmetic).

    PostgreSQL "Gotchas"

    • Identifiers: unquoted → lowercased. Avoid quoted/mixed-case names. Convention: use snake_case for table/column names.
    • Unique + NULLs: UNIQUE allows multiple NULLs. Use UNIQUE (...) NULLS NOT DISTINCT (PG15+) to restrict to one NULL.
    • FK indexes: PostgreSQL does not auto-index FK columns. Add them.
    • No silent coercions: length/precision overflows error out (no truncation). Example: inserting 999 into NUMERIC(2,0) fails with error, unlike some databases that silently truncate or round.
    • Sequences/identity have gaps (normal; don't "fix"). Rollbacks, crashes, and concurrent transactions create gaps in ID sequences (1, 2, 5, 6...). This is expected behavior—don't try to make IDs consecutive.
    • Heap storage: no clustered PK by default (unlike SQL Server/MySQL InnoDB); CLUSTER is one-off reorganization, not maintained on subsequent inserts. Row order on disk is insertion order unless explicitly clustered.
    • MVCC: updates/deletes leave dead tuples; vacuum handles them—design to avoid hot wide-row churn.

    Data Types

    • IDs: BIGINT GENERATED ALWAYS AS IDENTITY preferred (GENERATED BY DEFAULT also fine); UUID when merging/federating/used in a distributed system or for opaque IDs. Generate with uuidv7() (preferred if using PG18+) or gen_random_uuid() (if using an older PG version).
    • Integers: prefer BIGINT unless storage space is critical; INTEGER for smaller ranges; avoid SMALLINT unless constrained.
    • Floats: prefer DOUBLE PRECISION over REAL unless storage space is critical. Use NUMERIC for exact decimal arithmetic.
    • Strings: prefer TEXT; if length limits needed, use CHECK (LENGTH(col) <= n) instead of VARCHAR(n); avoid CHAR(n). Use BYTEA for binary data. Large strings/binary (>2KB default threshold) automatically stored in TOAST with compression. TOAST storage: PLAIN (no TOAST), EXTENDED (compress + out-of-line), EXTERNAL (out-of-line, no compress), MAIN (compress, keep in-line if possible). Default EXTENDED usually optimal. Control with ALTER TABLE tbl ALTER COLUMN col SET STORAGE strategy and ALTER TABLE tbl SET (toast_tuple_target = 4096) for threshold. Case-insensitive: for locale/accent handling use non-deterministic collations; for plain ASCII use expression indexes on LOWER(col) (preferred unless column needs case-insensitive PK/FK/UNIQUE) or CITEXT.
    • Money: NUMERIC(p,s) (never float).
    • Time: TIMESTAMPTZ for timestamps; DATE for date-only; INTERVAL for durations. Avoid TIMESTAMP (without timezone). Use now() for transaction start time, clock_timestamp() for current wall-clock time.
    • Booleans: BOOLEAN with NOT NULL constraint unless tri-state values are required.
    • Enums: CREATE TYPE ... AS ENUM for small, stable sets (e.g. US states, days of week). For business-logic-driven and evolving values (e.g. order statuses) → use TEXT (or INT) + CHECK or lookup table.
    • Arrays: TEXT[], INTEGER[], etc. Use for ordered lists where you query elements. Index with GIN for containment (@>, <@) and overlap (&&) queries. Access: arr[1] (1-indexed), arr[1:3] (slicing). Good for tags, categories; avoid for relations—use junction tables instead. Literal syntax: '{val1,val2}' or ARRAY[val1,val2].
    • Range types: daterange, numrange, tstzrange for intervals. Support overlap (&&), containment (@>), operators. Index with GiST. Good for scheduling, versioning, numeric ranges. Pick a bounds scheme and use it consistently; prefer [) (inclusive/exclusive) by default.
    • Network types: INET for IP addresses, CIDR for network ranges, MACADDR for MAC addresses. Support network operators (<<, >>, &&).
    • Geometric types: POINT, LINE, POLYGON, CIRCLE for 2D spatial data. Index with GiST. Consider PostGIS for advanced spatial features.
    • Text search: TSVECTOR for full-text search documents, TSQUERY for search queries. Index tsvector with GIN. Always specify language: to_tsvector('english', col) and to_tsquery('english', 'query'). Never use single-argument versions. This applies to both index expressions and queries.
    • Domain types: CREATE DOMAIN email AS TEXT CHECK (VALUE ~ '^[^@]+@[^@]+$') for reusable custom types with validation. Enforces constraints across tables.
    • Composite types: CREATE TYPE address AS (street TEXT, city TEXT, zip TEXT) for structured data within columns. Access with (col).field syntax.
    • JSONB: preferred over JSON; index with GIN. Use only for optional/semi-structured attrs. ONLY use JSON if the original ordering of the contents MUST be preserved.
    • Vector types: vector type by pgvector for vector similarity search for embeddings.

    Do not use the following data types

    • DO NOT use timestamp (without time zone); DO use timestamptz instead.
    • DO NOT use char(n) or varchar(n); DO use text instead.
    • DO NOT use money type; DO use numeric instead.
    • DO NOT use timetz type; DO use timestamptz instead.
    • DO NOT use timestamptz(0) or any other precision specification; DO use timestamptz instead
    • DO NOT use serial type; DO use generated always as identity instead.

    Table Types

    • Regular: default; fully durable, logged.
    • TEMPORARY: session-scoped, auto-dropped, not logged. Faster for scratch work.
    • UNLOGGED: persistent but not crash-safe. Faster writes; good for caches/staging.

    Row-Level Security

    Enable with ALTER TABLE tbl ENABLE ROW LEVEL SECURITY. Create policies: CREATE POLICY user_access ON orders FOR SELECT TO app_users USING (user_id = current_user_id()). Built-in user-based access control at the row level.

    Constraints

    • PK: implicit UNIQUE + NOT NULL; creates a B-tree index.
    • FK: specify ON DELETE/UPDATE action (CASCADE, RESTRICT, SET NULL, SET DEFAULT). Add explicit index on referencing column—speeds up joins and prevents locking issues on parent deletes/updates. Use DEFERRABLE INITIALLY DEFERRED for circular FK dependencies checked at transaction end.
    • UNIQUE: creates a B-tree index; allows multiple NULLs unless NULLS NOT DISTINCT (PG15+). Standard behavior: (1, NULL) and (1, NULL) are allowed. With NULLS NOT DISTINCT: only one (1, NULL) allowed. Prefer NULLS NOT DISTINCT unless you specifically need duplicate NULLs.
    • CHECK: row-local constraints; NULL values pass the check (three-valued logic). Example: CHECK (price > 0) allows NULL prices. Combine with NOT NULL to enforce: price NUMERIC NOT NULL CHECK (price > 0).
    • EXCLUDE: prevents overlapping values using operators. EXCLUDE USING gist (room_id WITH =, booking_period WITH &&) prevents double-booking rooms. Requires appropriate index type (often GiST).

    Indexing

    • B-tree: default for equality/range queries (=, <, >, BETWEEN, ORDER BY)
    • Composite: order matters—index used if equality on leftmost prefix (WHERE a = ? AND b > ? uses index on (a,b), but WHERE b = ? does not). Put most selective/frequently filtered columns first.
    • Covering: CREATE INDEX ON tbl (id) INCLUDE (name, email) - includes non-key columns for index-only scans without visiting table.
    • Partial: for hot subsets (WHERE status = 'active' → CREATE INDEX ON tbl (user_id) WHERE status = 'active'). Any query with status = 'active' can use this index.
    • Expression: for computed search keys (CREATE INDEX ON tbl (LOWER(email))). Expression must match exactly in WHERE clause: WHERE LOWER(email) = 'user@example.com'.
    • GIN: JSONB containment/existence, arrays (@>, ?), full-text search (@@)
    • GiST: ranges, geometry, exclusion constraints
    • BRIN: very large, naturally ordered data (time-series)—minimal storage overhead. Effective when row order on disk correlates with indexed column (insertion order or after CLUSTER).

    Partitioning

    • Use for very large tables (>100M rows) where queries consistently filter on partition key (often time/date).
    • Alternate use: use for tables where data maintenance tasks dictates e.g. data pruned or bulk replaced periodically
    • RANGE: common for time-series (PARTITION BY RANGE (created_at)). Create partitions: CREATE TABLE logs_2024_01 PARTITION OF logs FOR VALUES FROM ('2024-01-01') TO ('2024-02-01'). TimescaleDB automates time-based or ID-based partitioning with retention policies and compression.
    • LIST: for discrete values (PARTITION BY LIST (region)). Example: FOR VALUES IN ('us-east', 'us-west').
    • HASH: for even distribution when no natural key (PARTITION BY HASH (user_id)). Creates N partitions with modulus.
    • Constraint exclusion: requires CHECK constraints on partitions for query planner to prune. Auto-created for declarative partitioning (PG10+).
    • Prefer declarative partitioning or hypertables. Do NOT use table inheritance.
    • Limitations: no global UNIQUE constraints—include partition key in PK/UNIQUE. FKs from partitioned tables not supported; use triggers.

    Special Considerations

    Update-Heavy Tables

    • Separate hot/cold columns—put frequently updated columns in separate table to minimize bloat.
    • Use fillfactor=90 to leave space for HOT updates that avoid index maintenance.
    • Avoid updating indexed columns—prevents beneficial HOT updates.
    • Partition by update patterns—separate frequently updated rows in a different partition from stable data.

    Insert-Heavy Workloads

    • Minimize indexes—only create what you query; every index slows inserts.
    • Use COPY or multi-row INSERT instead of single-row inserts.
    • UNLOGGED tables for rebuildable staging data—much faster writes.
    • Defer index creation for bulk loads—>drop index, load data, recreate indexes.
    • Partition by time/hash to distribute load. TimescaleDB automates partitioning and compression of insert-heavy data.
    • Use a natural key for primary key such as a (timestamp, device_id) if enforcing global uniqueness is important many insert-heavy tables don't need a primary key at all.
    • If you do need a surrogate key, Prefer BIGINT GENERATED ALWAYS AS IDENTITY over UUID.

    Upsert-Friendly Design

    • Requires UNIQUE index on conflict target columns—ON CONFLICT (col1, col2) needs exact matching unique index (partial indexes don't work).
    • Use EXCLUDED.column to reference would-be-inserted values; only update columns that actually changed to reduce write overhead.
    • DO NOTHING faster than DO UPDATE when no actual update needed.

    Safe Schema Evolution

    • Transactional DDL: most DDL operations can run in transactions and be rolled back—BEGIN; ALTER TABLE...; ROLLBACK; for safe testing.
    • Concurrent index creation: CREATE INDEX CONCURRENTLY avoids blocking writes but can't run in transactions.
    • Volatile defaults cause rewrites: adding NOT NULL columns with volatile defaults (e.g., now(), gen_random_uuid()) rewrites entire table. Non-volatile defaults are fast.
    • Drop constraints before columns: ALTER TABLE DROP CONSTRAINT then DROP COLUMN to avoid dependency issues.
    • Function signature changes: CREATE OR REPLACE with different arguments creates overloads, not replacements. DROP old version if no overload desired.

    Generated Columns

    • ... GENERATED ALWAYS AS (<expr>) STORED for computed, indexable fields. PG18+ adds VIRTUAL columns (computed on read, not stored).

    Extensions

    • pgcrypto: crypt() for password hashing.
    • uuid-ossp: alternative UUID functions; prefer pgcrypto for new projects.
    • pg_trgm: fuzzy text search with % operator, similarity() function. Index with GIN for LIKE '%pattern%' acceleration.
    • citext: case-insensitive text type. Prefer expression indexes on LOWER(col) unless you need case-insensitive constraints.
    • btree_gin/btree_gist: enable mixed-type indexes (e.g., GIN index on both JSONB and text columns).
    • hstore: key-value pairs; mostly superseded by JSONB but useful for simple string mappings.
    • timescaledb: essential for time-series—automated partitioning, retention, compression, continuous aggregates.
    • postgis: comprehensive geospatial support beyond basic geometric types—essential for location-based applications.
    • pgvector: vector similarity search for embeddings.
    • pgaudit: audit logging for all database activity.

    JSONB Guidance

    • Prefer JSONB with GIN index.
    • Default: CREATE INDEX ON tbl USING GIN (jsonb_col); → accelerates:
      • Containment jsonb_col @> '{"k":"v"}'
      • Key existence jsonb_col ? 'k', any/all keys ?\|, ?&
      • Path containment on nested docs
      • Disjunction jsonb_col @> ANY(ARRAY['{"status":"active"}', '{"status":"pending"}'])
    • Heavy @> workloads: consider opclass jsonb_path_ops for smaller/faster containment-only indexes:
      • CREATE INDEX ON tbl USING GIN (jsonb_col jsonb_path_ops);
      • Trade-off: loses support for key existence (?, ?|, ?&) queries—only supports containment (@>)
    • Equality/range on a specific scalar field: extract and index with B-tree (generated column or expression):
      • ALTER TABLE tbl ADD COLUMN price INT GENERATED ALWAYS AS ((jsonb_col->>'price')::INT) STORED;
      • CREATE INDEX ON tbl (price);
      • Prefer queries like WHERE price BETWEEN 100 AND 500 (uses B-tree) over WHERE (jsonb_col->>'price')::INT BETWEEN 100 AND 500 without index.
    • Arrays inside JSONB: use GIN + @> for containment (e.g., tags). Consider jsonb_path_ops if only doing containment.
    • Keep core relations in tables; use JSONB for optional/variable attributes.
    • Use constraints to limit allowed JSONB values in a column e.g. config JSONB NOT NULL CHECK(jsonb_typeof(config) = 'object')

    Examples

    Users

    CREATE TABLE users (
      user_id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
      email TEXT NOT NULL UNIQUE,
      name TEXT NOT NULL,
      created_at TIMESTAMPTZ NOT NULL DEFAULT now()
    );
    CREATE UNIQUE INDEX ON users (LOWER(email));
    CREATE INDEX ON users (created_at);
    

    Orders

    CREATE TABLE orders (
      order_id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
      user_id BIGINT NOT NULL REFERENCES users(user_id),
      status TEXT NOT NULL DEFAULT 'PENDING' CHECK (status IN ('PENDING','PAID','CANCELED')),
      total NUMERIC(10,2) NOT NULL CHECK (total > 0),
      created_at TIMESTAMPTZ NOT NULL DEFAULT now()
    );
    CREATE INDEX ON orders (user_id);
    CREATE INDEX ON orders (created_at);
    

    JSONB

    CREATE TABLE profiles (
      user_id BIGINT PRIMARY KEY REFERENCES users(user_id),
      attrs JSONB NOT NULL DEFAULT '{}',
      theme TEXT GENERATED ALWAYS AS (attrs->>'theme') STORED
    );
    CREATE INDEX profiles_attrs_gin ON profiles USING GIN (attrs);
    
    Recommended Servers
    Neon
    Neon
    ThinAir Data
    ThinAir Data
    Prisma
    Prisma
    Repository
    zmre/nix-pai
    Files