--- generated_by: repo-insight version: 1 created: 2026-03-03 last_updated: 2026-03-03 source_commit: 0d9dffa coverage: standard --- # 04 — Data Model ## Purpose Documents all PostgreSQL tables, columns, relations, constraints, storage design, and partitioning strategy. ## TL;DR - Single PostgreSQL database `arb_engine`; 15+ tables defined in `db.py` `SCHEMA_SQL` + `auth.py` `AUTH_SCHEMA`. - `agg_trades` is a range-partitioned table (by `time_ms` in milliseconds); monthly partitions auto-created by `ensure_partitions()`. - Dual-write: local PG is primary; Cloud SQL at `10.106.0.3` receives same writes via a secondary psycopg2 pool (non-fatal if down). - All timestamps: `ts` columns are Unix seconds (integer); `time_ms` columns are Unix milliseconds (bigint); `created_at` columns are PG `TIMESTAMP`. - JSONB used for `score_factors` in `paper_trades`/`live_trades`, `detail` in `live_events`, `value` in `market_indicators`. - Auth tokens stored in DB: refresh tokens in `refresh_tokens` table (revocable); no session table. ## Canonical Facts ### Tables #### `rate_snapshots` — Funding Rate Snapshots Populated every 2 s by `background_snapshot_loop()` in `main.py`. | Column | Type | Description | |--------|------|-------------| | `id` | BIGSERIAL PK | | | `ts` | BIGINT NOT NULL | Unix seconds | | `btc_rate` | DOUBLE PRECISION | BTC funding rate (decimal) | | `eth_rate` | DOUBLE PRECISION | ETH funding rate | | `btc_price` | DOUBLE PRECISION | BTC mark price USD | | `eth_price` | DOUBLE PRECISION | ETH mark price USD | | `btc_index_price` | DOUBLE PRECISION | BTC index price | | `eth_index_price` | DOUBLE PRECISION | ETH index price | Index: `idx_rate_snapshots_ts` on `ts`. --- #### `agg_trades` — Aggregate Trades (Partitioned) Partitioned by `RANGE(time_ms)`; monthly child tables named `agg_trades_YYYYMM`. | Column | Type | Description | |--------|------|-------------| | `agg_id` | BIGINT NOT NULL | Binance aggTrade ID | | `symbol` | TEXT NOT NULL | e.g. `BTCUSDT` | | `price` | DOUBLE PRECISION | Trade price | | `qty` | DOUBLE PRECISION | Trade quantity (BTC/ETH/etc.) | | `time_ms` | BIGINT NOT NULL | Trade timestamp ms | | `is_buyer_maker` | SMALLINT | 0=taker buy, 1=taker sell | PK: `(time_ms, symbol, agg_id)`. Indexes: `idx_agg_trades_sym_time` on `(symbol, time_ms DESC)`, `idx_agg_trades_sym_agg` on `(symbol, agg_id)`. Partitions auto-created for current + next 2 months. Named `agg_trades_YYYYMM`. --- #### `agg_trades_meta` — Collection State | Column | Type | Description | |--------|------|-------------| | `symbol` | TEXT PK | e.g. `BTCUSDT` | | `last_agg_id` | BIGINT | Last processed aggTrade ID | | `last_time_ms` | BIGINT | Timestamp of last trade | | `earliest_agg_id` | BIGINT | Oldest buffered ID | | `earliest_time_ms` | BIGINT | Oldest buffered timestamp | | `updated_at` | TEXT | Human-readable update time | --- #### `signal_indicators` — Signal Engine Output (15 s cadence) | Column | Type | Description | |--------|------|-------------| | `id` | BIGSERIAL PK | | | `ts` | BIGINT | Unix seconds | | `symbol` | TEXT | | | `cvd_fast` | DOUBLE PRECISION | CVD 30 m window | | `cvd_mid` | DOUBLE PRECISION | CVD 4 h window | | `cvd_day` | DOUBLE PRECISION | CVD UTC day | | `cvd_fast_slope` | DOUBLE PRECISION | CVD momentum | | `atr_5m` | DOUBLE PRECISION | ATR (5 m candles, 14 periods) | | `atr_percentile` | DOUBLE PRECISION | ATR rank in 24 h history | | `vwap_30m` | DOUBLE PRECISION | VWAP 30 m | | `price` | DOUBLE PRECISION | Current mark price | | `p95_qty` | DOUBLE PRECISION | P95 large-order threshold | | `p99_qty` | DOUBLE PRECISION | P99 large-order threshold | | `buy_vol_1m` | DOUBLE PRECISION | 1 m buy volume | | `sell_vol_1m` | DOUBLE PRECISION | 1 m sell volume | | `score` | INTEGER | Signal score 0–100 | | `signal` | TEXT | `LONG`, `SHORT`, or null | Indexes: `idx_si_ts`, `idx_si_sym_ts`. --- #### `signal_indicators_1m` — 1-Minute Signal Snapshot Subset of `signal_indicators` columns; written at 1 m cadence for lightweight chart queries. --- #### `signal_trades` — Signal Engine Trade Tracking | Column | Type | Description | |--------|------|-------------| | `id` | BIGSERIAL PK | | | `ts_open` | BIGINT | Open timestamp (Unix s) | | `ts_close` | BIGINT | Close timestamp | | `symbol` | TEXT | | | `direction` | TEXT | `LONG` / `SHORT` | | `entry_price` | DOUBLE PRECISION | | | `exit_price` | DOUBLE PRECISION | | | `qty` | DOUBLE PRECISION | | | `score` | INTEGER | Signal score at entry | | `pnl` | DOUBLE PRECISION | Realized PnL | | `sl_price` | DOUBLE PRECISION | Stop-loss level | | `tp1_price` | DOUBLE PRECISION | Take-profit 1 level | | `tp2_price` | DOUBLE PRECISION | Take-profit 2 level | | `status` | TEXT DEFAULT `open` | `open`, `closed`, `stopped` | --- #### `paper_trades` — Paper Trading Records | Column | Type | Description | |--------|------|-------------| | `id` | BIGSERIAL PK | | | `symbol` | TEXT | | | `direction` | TEXT | `LONG`/`SHORT` | | `score` | INT | Signal score | | `tier` | TEXT | `light`/`standard`/`heavy` | | `entry_price` | DOUBLE PRECISION | | | `entry_ts` | BIGINT | Unix ms | | `exit_price` | DOUBLE PRECISION | | | `exit_ts` | BIGINT | | | `tp1_price` | DOUBLE PRECISION | | | `tp2_price` | DOUBLE PRECISION | | | `sl_price` | DOUBLE PRECISION | | | `tp1_hit` | BOOLEAN DEFAULT FALSE | | | `status` | TEXT DEFAULT `active` | `active`, `tp1`, `tp2`, `sl`, `timeout` | | `pnl_r` | DOUBLE PRECISION | PnL in R units | | `atr_at_entry` | DOUBLE PRECISION | ATR snapshot at entry | | `score_factors` | JSONB | Breakdown of signal score components | | `strategy` | VARCHAR(32) DEFAULT `v51_baseline` | Strategy name | | `created_at` | TIMESTAMP | | --- #### `live_trades` — Live Trading Records | Column | Type | Description | |--------|------|-------------| | `id` | BIGSERIAL PK | | | `symbol` | TEXT | | | `strategy` | TEXT | | | `direction` | TEXT | `LONG`/`SHORT` | | `status` | TEXT DEFAULT `active` | | | `entry_price` / `exit_price` | DOUBLE PRECISION | | | `entry_ts` / `exit_ts` | BIGINT | Unix ms | | `sl_price`, `tp1_price`, `tp2_price` | DOUBLE PRECISION | | | `tp1_hit` | BOOLEAN | | | `score` | DOUBLE PRECISION | | | `tier` | TEXT | | | `pnl_r` | DOUBLE PRECISION | | | `fee_usdt` | DOUBLE PRECISION | Exchange fees | | `funding_fee_usdt` | DOUBLE PRECISION | Funding fees paid while holding | | `risk_distance` | DOUBLE PRECISION | Entry to SL distance | | `atr_at_entry` | DOUBLE PRECISION | | | `score_factors` | JSONB | | | `signal_id` | BIGINT | FK → signal_indicators.id | | `binance_order_id` | TEXT | Binance order ID | | `fill_price` | DOUBLE PRECISION | Actual fill price | | `slippage_bps` | DOUBLE PRECISION | Slippage in basis points | | `protection_gap_ms` | BIGINT | Time between SL order and fill | | `signal_to_order_ms` | BIGINT | Latency: signal → order placed | | `order_to_fill_ms` | BIGINT | Latency: order → fill | | `qty` | DOUBLE PRECISION | | | `created_at` | TIMESTAMP | | --- #### `live_config` — Runtime Configuration KV Store | Column | Type | Description | |--------|------|-------------| | `key` | TEXT PK | Config key | | `value` | TEXT | Config value (string) | | `label` | TEXT | Human label | | `updated_at` | TIMESTAMP | | Known keys: `risk_per_trade_usd`, `max_positions`, `circuit_break` (inferred). --- #### `live_events` — Trade Event Log | Column | Type | Description | |--------|------|-------------| | `id` | BIGSERIAL PK | | | `ts` | BIGINT | Unix ms (default: NOW()) | | `level` | TEXT | `info`/`warning`/`error` | | `category` | TEXT | Event category | | `symbol` | TEXT | | | `message` | TEXT | | | `detail` | JSONB | Structured event data | --- #### `signal_logs` — Legacy Signal Log Kept for backwards compatibility with the original funding-rate signal system. | Column | Type | |--------|------| | `id` | BIGSERIAL PK | | `symbol` | TEXT | | `rate` | DOUBLE PRECISION | | `annualized` | DOUBLE PRECISION | | `sent_at` | TEXT | | `message` | TEXT | --- #### Auth Tables (defined in `auth.py` AUTH_SCHEMA) **`users`** | Column | Type | |--------|------| | `id` | BIGSERIAL PK | | `email` | TEXT UNIQUE NOT NULL | | `password_hash` | TEXT NOT NULL | | `discord_id` | TEXT | | `role` | TEXT DEFAULT `user` | | `banned` | INTEGER DEFAULT 0 | | `created_at` | TEXT | **`subscriptions`** | Column | Type | |--------|------| | `user_id` | BIGINT PK → users | | `tier` | TEXT DEFAULT `free` | | `expires_at` | TEXT | **`invite_codes`** | Column | Type | |--------|------| | `id` | BIGSERIAL PK | | `code` | TEXT UNIQUE | | `created_by` | INTEGER | | `max_uses` | INTEGER DEFAULT 1 | | `used_count` | INTEGER DEFAULT 0 | | `status` | TEXT DEFAULT `active` | | `expires_at` | TEXT | **`invite_usage`** | Column | Type | |--------|------| | `id` | BIGSERIAL PK | | `code_id` | BIGINT → invite_codes | | `user_id` | BIGINT → users | | `used_at` | TEXT | **`refresh_tokens`** | Column | Type | |--------|------| | `id` | BIGSERIAL PK | | `user_id` | BIGINT → users | | `token` | TEXT UNIQUE | | `expires_at` | TEXT | | `revoked` | INTEGER DEFAULT 0 | --- #### `market_indicators` — Market Indicator JSONB Store Populated by `market_data_collector.py`. | Column | Type | Description | |--------|------|-------------| | `symbol` | TEXT | | | `indicator_type` | TEXT | `long_short_ratio`, `top_trader_position`, `open_interest_hist`, `coinbase_premium`, `funding_rate` | | `timestamp_ms` | BIGINT | | | `value` | JSONB | Raw indicator payload | Query pattern: `WHERE symbol=? AND indicator_type=? ORDER BY timestamp_ms DESC LIMIT 1`. ### Storage Design Decisions - **Partitioning**: `agg_trades` partitioned by month to avoid table bloat; partition maintenance is automated. - **Dual-write**: Cloud SQL secondary is best-effort (errors logged, never fatal). - **JSONB `score_factors`**: allows schema-free storage of per-strategy signal breakdowns without migrations. - **Timestamps**: mix of Unix seconds (`ts`), Unix ms (`time_ms`, `timestamp_ms`, `entry_ts`), ISO strings (`created_at` TEXT in auth tables), and PG `TIMESTAMP`; be careful when querying across tables. ## Interfaces / Dependencies - `db.py:init_schema()` — creates all tables in `SCHEMA_SQL` - `auth.py:ensure_tables()` — creates auth tables from `AUTH_SCHEMA` - `db.py:ensure_partitions()` — auto-creates monthly `agg_trades_YYYYMM` partitions ## Unknowns & Risks - [unknown] `market_indicators` table schema not in `SCHEMA_SQL`; likely created by `market_data_collector.py` separately — verify before querying. - [risk] Timestamp inconsistency: some tables use TEXT for timestamps (auth tables), others use BIGINT, others use PG TIMESTAMP — cross-table JOINs on time fields require explicit casting. - [inference] `live_config` circuit-break key name not confirmed from source; inferred from `risk_guard.py` behavior. - [risk] `users` table defined in both `SCHEMA_SQL` (db.py) and `AUTH_SCHEMA` (auth.py); duplicate CREATE TABLE IF NOT EXISTS; actual schema diverges between the two definitions (db.py version lacks `discord_id`, `banned`). ## Source Refs - `backend/db.py:166-356` — `SCHEMA_SQL` with all table definitions - `backend/auth.py:28-71` — `AUTH_SCHEMA` auth tables - `backend/db.py:360-414` — `ensure_partitions()`, `init_schema()` - `backend/signal_engine.py:123-158` — `market_indicators` query pattern