适用人群:有SQL基础,想深入数据库
难度:中高
预计学习时间:30-40小时
为什么学 PostgreSQL?
| 优势 | 说明 |
|---|
| 功能最全 | JSON/全文搜索/地理信息/向量搜索 |
| 开源免费 | 无许可证费用 |
| 企业级 | 可靠性、性能、扩展性一流 |
| 云原生 | 所有主流云都提供托管服务 |
| AI友好 | pgvector扩展支持向量搜索 |
高级特性
-- JSONB 操作
CREATE TABLE orders (
id SERIAL PRIMARY KEY,
data JSONB NOT NULL
);
INSERT INTO orders (data) VALUES (
'{"customer": "Alice", "items": [{"name": "Book", "qty": 2}]}'
);
-- JSON查询
SELECT data->>'customer' AS customer,
data->'items'->0->>'name' AS first_item
FROM orders;
-- JSONB索引
CREATE INDEX idx_orders_data ON orders USING GIN (data);
-- 窗口函数
SELECT
name,
department,
salary,
ROW_NUMBER() OVER (PARTITION BY department ORDER BY salary DESC) AS rank,
AVG(salary) OVER (PARTITION BY department) AS dept_avg,
salary - LAG(salary) OVER (ORDER BY salary) AS diff
FROM employees;
-- CTE(公用表表达式)
WITH monthly_sales AS (
SELECT
date_trunc('month', created_at) AS month,
SUM(amount) AS total
FROM orders
GROUP BY 1
)
SELECT
month,
total,
LAG(total) OVER (ORDER BY month) AS prev_month,
ROUND((total - LAG(total) OVER (ORDER BY month)) /
LAG(total) OVER (ORDER BY month) * 100, 2) AS growth_pct
FROM monthly_sales;
-- 物化视图
CREATE MATERIALIZED VIEW mv_daily_stats AS
SELECT
date(created_at) AS day,
COUNT(*) AS orders,
SUM(amount) AS revenue
FROM orders
GROUP BY 1;
-- 定期刷新
REFRESH MATERIALIZED VIEW CONCURRENTLY mv_daily_stats;
-- 分区表
CREATE TABLE orders (
id SERIAL,
created_at DATE NOT NULL,
amount DECIMAL(10,2)
) PARTITION BY RANGE (created_at);
CREATE TABLE orders_2024_01 PARTITION OF orders
FOR VALUES FROM ('2024-01-01') TO ('2024-02-01');
CREATE TABLE orders_2024_02 PARTITION OF orders
FOR VALUES FROM ('2024-02-01') TO ('2024-03-01');
-- 全文搜索
ALTER TABLE articles ADD COLUMN tsvector_col tsvector;
UPDATE articles SET tsvector_col = to_tsvector('chinese', title || ' ' || content);
CREATE INDEX idx_fts ON articles USING GIN (tsvector_col);
SELECT title FROM articles
WHERE tsvector_col @@ to_tsquery('chinese', '数据库 & 优化');
-- pgvector 向量搜索
CREATE EXTENSION vector;
CREATE TABLE documents (
id SERIAL PRIMARY KEY,
content TEXT,
embedding vector(1536) -- OpenAI维度
);
-- 创建索引
CREATE INDEX ON documents USING ivfflat (embedding vector_cosine_ops)
WITH (lists = 100);
-- 相似度搜索
SELECT content, 1 - (embedding <=> '[0.1, 0.2, ...]') AS similarity
FROM documents
ORDER BY embedding <=> '[0.1, 0.2, ...]'
LIMIT 5;
-- 性能分析
EXPLAIN ANALYZE SELECT * FROM orders WHERE created_at > '2024-01-01';
-- 连接池状态
SELECT * FROM pg_stat_activity;
-- 索引使用情况
SELECT * FROM pg_stat_user_indexes;
-- 慢查询日志
ALTER SYSTEM SET log_min_duration_statement = 1000; -- 记录超过1秒的查询
性能优化清单
| 优化项 | 操作 | 说明 |
|---|
| 索引 | 分析查询计划,创建合适的索引 | 最重要 |
| 连接池 | 使用PgBouncer | 减少连接开销 |
| VACUUM | 定期清理死元组 | 自动或手动 |
| EXPLAIN | 分析慢查询 | 找到瓶颈 |
| 配置调优 | shared_buffers/work_mem | 根据服务器配置 |
| 分区 | 大表按时间分区 | 提高查询性能 |
推荐资源
| 资源 | 说明 |
|---|
| PostgreSQL官方文档 | 最权威的参考 |
| pgvector文档 | 向量搜索扩展 |
| pgMustard | 查询计划分析工具 |
| Neon | Serverless PostgreSQL |
| Supabase | 开源Firebase替代 |