---
generated_by: repo-insight
version: 1
created: 2026-03-03
last_updated: 2026-03-03
source_commit: 0d9dffa
coverage: standard
---

# 04 — Data Model

## Purpose
Documents all PostgreSQL tables, columns, relations, constraints, storage design, and partitioning strategy.

## TL;DR
- Single PostgreSQL database `arb_engine`; 15+ tables defined in `SCHEMA_SQL` (`db.py`) and `AUTH_SCHEMA` (`auth.py`).
- `agg_trades` is range-partitioned by `time_ms` (Unix milliseconds); monthly partitions are auto-created by `ensure_partitions()`.
- Dual-write: the local PG instance is primary; Cloud SQL at `10.106.0.3` receives the same writes through a secondary psycopg2 pool (non-fatal if down).
- Timestamps are mixed: `ts` columns are Unix seconds (BIGINT), except `live_events.ts`, which is documented as Unix ms; `time_ms`, `timestamp_ms`, and `entry_ts` columns are Unix milliseconds (BIGINT); `created_at` is PG `TIMESTAMP` in the trade tables but TEXT in the auth tables.
- JSONB is used for `score_factors` in `paper_trades`/`live_trades`, `detail` in `live_events`, and `value` in `market_indicators`.
- Auth tokens are stored in the DB: refresh tokens live in the `refresh_tokens` table (revocable); there is no session table.

## Canonical Facts

### Tables

#### `rate_snapshots` — Funding Rate Snapshots
Populated every 2 s by `background_snapshot_loop()` in `main.py`.

| Column | Type | Description |
|--------|------|-------------|
| `id` | BIGSERIAL PK | |
| `ts` | BIGINT NOT NULL | Unix seconds |
| `btc_rate` | DOUBLE PRECISION | BTC funding rate (decimal) |
| `eth_rate` | DOUBLE PRECISION | ETH funding rate |
| `btc_price` | DOUBLE PRECISION | BTC mark price USD |
| `eth_price` | DOUBLE PRECISION | ETH mark price USD |
| `btc_index_price` | DOUBLE PRECISION | BTC index price |
| `eth_index_price` | DOUBLE PRECISION | ETH index price |

Index: `idx_rate_snapshots_ts` on `ts`.

---

#### `agg_trades` — Aggregate Trades (Partitioned)
Partitioned by `RANGE(time_ms)`; monthly child tables named `agg_trades_YYYYMM`.

| Column | Type | Description |
|--------|------|-------------|
| `agg_id` | BIGINT NOT NULL | Binance aggTrade ID |
| `symbol` | TEXT NOT NULL | e.g. `BTCUSDT` |
| `price` | DOUBLE PRECISION | Trade price |
| `qty` | DOUBLE PRECISION | Trade quantity (BTC/ETH/etc.) |
| `time_ms` | BIGINT NOT NULL | Trade timestamp (Unix ms) |
| `is_buyer_maker` | SMALLINT | 0=taker buy, 1=taker sell |

PK: `(time_ms, symbol, agg_id)`.
Indexes: `idx_agg_trades_sym_time` on `(symbol, time_ms DESC)`, `idx_agg_trades_sym_agg` on `(symbol, agg_id)`.

Partitions are auto-created for the current month and the next two months.

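A minimal sketch of how the monthly partition names and `time_ms` bounds could be derived, assuming partitions are cut at 00:00 UTC on the first day of each month. The helpers `month_start_ms`, `partition_specs`, and `partition_ddl` are hypothetical names for illustration; the real logic lives in `ensure_partitions()` and may differ.

```python
from datetime import datetime, timezone


def month_start_ms(year: int, month: int) -> int:
    """Unix milliseconds at 00:00 UTC on the first day of the given month."""
    return int(datetime(year, month, 1, tzinfo=timezone.utc).timestamp() * 1000)


def partition_specs(now: datetime, months_ahead: int = 2) -> list[tuple[str, int, int]]:
    """Return (name, from_ms, to_ms) for the current month plus `months_ahead` more."""
    specs = []
    year, month = now.year, now.month
    for _ in range(months_ahead + 1):
        next_year, next_month = (year + 1, 1) if month == 12 else (year, month + 1)
        specs.append((
            f"agg_trades_{year:04d}{month:02d}",
            month_start_ms(year, month),
            month_start_ms(next_year, next_month),  # exclusive upper bound
        ))
        year, month = next_year, next_month
    return specs


def partition_ddl(name: str, from_ms: int, to_ms: int) -> str:
    """Build the DDL for one child partition (assumed shape, not copied from db.py)."""
    return (
        f"CREATE TABLE IF NOT EXISTS {name} PARTITION OF agg_trades "
        f"FOR VALUES FROM ({from_ms}) TO ({to_ms});"
    )


# Example: a run in mid-December rolls over into the next year.
specs = partition_specs(datetime(2026, 12, 15, tzinfo=timezone.utc))
```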
---

#### `agg_trades_meta` — Collection State

| Column | Type | Description |
|--------|------|-------------|
| `symbol` | TEXT PK | e.g. `BTCUSDT` |
| `last_agg_id` | BIGINT | Last processed aggTrade ID |
| `last_time_ms` | BIGINT | Timestamp of last trade |
| `earliest_agg_id` | BIGINT | Oldest buffered ID |
| `earliest_time_ms` | BIGINT | Oldest buffered timestamp |
| `updated_at` | TEXT | Human-readable update time |

---

#### `signal_indicators` — Signal Engine Output (15 s cadence)

| Column | Type | Description |
|--------|------|-------------|
| `id` | BIGSERIAL PK | |
| `ts` | BIGINT | Unix seconds |
| `symbol` | TEXT | |
| `cvd_fast` | DOUBLE PRECISION | CVD 30 m window |
| `cvd_mid` | DOUBLE PRECISION | CVD 4 h window |
| `cvd_day` | DOUBLE PRECISION | CVD UTC day |
| `cvd_fast_slope` | DOUBLE PRECISION | CVD momentum |
| `atr_5m` | DOUBLE PRECISION | ATR (5 m candles, 14 periods) |
| `atr_percentile` | DOUBLE PRECISION | ATR rank in 24 h history |
| `vwap_30m` | DOUBLE PRECISION | VWAP 30 m |
| `price` | DOUBLE PRECISION | Current mark price |
| `p95_qty` | DOUBLE PRECISION | P95 large-order threshold |
| `p99_qty` | DOUBLE PRECISION | P99 large-order threshold |
| `buy_vol_1m` | DOUBLE PRECISION | 1 m buy volume |
| `sell_vol_1m` | DOUBLE PRECISION | 1 m sell volume |
| `score` | INTEGER | Signal score 0–100 |
| `signal` | TEXT | `LONG`, `SHORT`, or null |

Indexes: `idx_si_ts`, `idx_si_sym_ts`.

---

#### `signal_indicators_1m` — 1-Minute Signal Snapshot
Subset of `signal_indicators` columns; written at 1 m cadence for lightweight chart queries.

---

#### `signal_trades` — Signal Engine Trade Tracking

| Column | Type | Description |
|--------|------|-------------|
| `id` | BIGSERIAL PK | |
| `ts_open` | BIGINT | Open timestamp (Unix s) |
| `ts_close` | BIGINT | Close timestamp |
| `symbol` | TEXT | |
| `direction` | TEXT | `LONG` / `SHORT` |
| `entry_price` | DOUBLE PRECISION | |
| `exit_price` | DOUBLE PRECISION | |
| `qty` | DOUBLE PRECISION | |
| `score` | INTEGER | Signal score at entry |
| `pnl` | DOUBLE PRECISION | Realized PnL |
| `sl_price` | DOUBLE PRECISION | Stop-loss level |
| `tp1_price` | DOUBLE PRECISION | Take-profit 1 level |
| `tp2_price` | DOUBLE PRECISION | Take-profit 2 level |
| `status` | TEXT DEFAULT `open` | `open`, `closed`, `stopped` |

---

#### `paper_trades` — Paper Trading Records

| Column | Type | Description |
|--------|------|-------------|
| `id` | BIGSERIAL PK | |
| `symbol` | TEXT | |
| `direction` | TEXT | `LONG`/`SHORT` |
| `score` | INT | Signal score |
| `tier` | TEXT | `light`/`standard`/`heavy` |
| `entry_price` | DOUBLE PRECISION | |
| `entry_ts` | BIGINT | Unix ms |
| `exit_price` | DOUBLE PRECISION | |
| `exit_ts` | BIGINT | |
| `tp1_price` | DOUBLE PRECISION | |
| `tp2_price` | DOUBLE PRECISION | |
| `sl_price` | DOUBLE PRECISION | |
| `tp1_hit` | BOOLEAN DEFAULT FALSE | |
| `status` | TEXT DEFAULT `active` | `active`, `tp1`, `tp2`, `sl`, `timeout` |
| `pnl_r` | DOUBLE PRECISION | PnL in R units |
| `atr_at_entry` | DOUBLE PRECISION | ATR snapshot at entry |
| `score_factors` | JSONB | Breakdown of signal score components |
| `strategy` | VARCHAR(32) DEFAULT `v51_baseline` | Strategy name |
| `created_at` | TIMESTAMP | |

---

#### `live_trades` — Live Trading Records

| Column | Type | Description |
|--------|------|-------------|
| `id` | BIGSERIAL PK | |
| `symbol` | TEXT | |
| `strategy` | TEXT | |
| `direction` | TEXT | `LONG`/`SHORT` |
| `status` | TEXT DEFAULT `active` | |
| `entry_price` / `exit_price` | DOUBLE PRECISION | |
| `entry_ts` / `exit_ts` | BIGINT | Unix ms |
| `sl_price`, `tp1_price`, `tp2_price` | DOUBLE PRECISION | |
| `tp1_hit` | BOOLEAN | |
| `score` | DOUBLE PRECISION | |
| `tier` | TEXT | |
| `pnl_r` | DOUBLE PRECISION | |
| `fee_usdt` | DOUBLE PRECISION | Exchange fees |
| `funding_fee_usdt` | DOUBLE PRECISION | Funding fees paid while holding |
| `risk_distance` | DOUBLE PRECISION | Entry to SL distance |
| `atr_at_entry` | DOUBLE PRECISION | |
| `score_factors` | JSONB | |
| `signal_id` | BIGINT | FK → signal_indicators.id |
| `binance_order_id` | TEXT | Binance order ID |
| `fill_price` | DOUBLE PRECISION | Actual fill price |
| `slippage_bps` | DOUBLE PRECISION | Slippage in basis points |
| `protection_gap_ms` | BIGINT | Time between SL order and fill |
| `signal_to_order_ms` | BIGINT | Latency: signal → order placed |
| `order_to_fill_ms` | BIGINT | Latency: order → fill |
| `qty` | DOUBLE PRECISION | |
| `created_at` | TIMESTAMP | |

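The table does not say how `pnl_r` or `slippage_bps` are computed. The sketch below shows the conventional definitions only, as an aid to reading the columns: an R-multiple is the favourable price move divided by the entry-to-stop distance (`risk_distance`), and slippage in basis points is the adverse difference between fill and expected price times 10,000. The function names, the sign convention, and the omission of fees and partial TP1 exits are all assumptions; the engine's actual formulas may differ.

```python
def pnl_in_r(direction: str, entry_price: float, exit_price: float, sl_price: float) -> float:
    """R-multiple: move in the trade's favour divided by the entry-to-stop distance."""
    risk_distance = abs(entry_price - sl_price)
    if risk_distance == 0:
        raise ValueError("entry_price and sl_price must differ")
    if direction == "LONG":
        move = exit_price - entry_price
    else:  # SHORT profits when price falls
        move = entry_price - exit_price
    return move / risk_distance


def slippage_bps(direction: str, expected_price: float, fill_price: float) -> float:
    """Adverse slippage in basis points (positive means a worse fill than expected)."""
    if direction == "LONG":
        diff = fill_price - expected_price   # paying more is worse for a buy
    else:
        diff = expected_price - fill_price   # receiving less is worse for a sell
    return diff / expected_price * 10_000
```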
---

#### `live_config` — Runtime Configuration KV Store

| Column | Type | Description |
|--------|------|-------------|
| `key` | TEXT PK | Config key |
| `value` | TEXT | Config value (string) |
| `label` | TEXT | Human label |
| `updated_at` | TIMESTAMP | |

Known keys: `risk_per_trade_usd`, `max_positions`, `circuit_break` (inferred).

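Because `value` is TEXT, callers have to parse each setting themselves. The following is a sketch of one defensive way to do that, assuming the rows have already been loaded into a `dict`; `get_config` and `parse_bool` are hypothetical helpers, and the example value formats (`"25.0"`, `"3"`, `"off"`) are assumptions, since the source only infers the key names.

```python
from typing import Callable, Mapping, TypeVar

T = TypeVar("T")


def get_config(rows: Mapping[str, str], key: str, cast: Callable[[str], T], default: T) -> T:
    """Parse a TEXT config value, falling back to `default` if missing or malformed."""
    raw = rows.get(key)
    if raw is None:
        return default
    try:
        return cast(raw)
    except (TypeError, ValueError):
        return default


def parse_bool(raw: str) -> bool:
    """Accept a few common string spellings of a boolean flag."""
    value = raw.strip().lower()
    if value in {"1", "true", "on", "yes"}:
        return True
    if value in {"0", "false", "off", "no"}:
        return False
    raise ValueError(f"not a boolean: {raw!r}")


# Illustrative rows only; real keys and formats must be checked against the database.
rows = {"risk_per_trade_usd": "25.0", "max_positions": "3", "circuit_break": "off"}
risk = get_config(rows, "risk_per_trade_usd", float, 10.0)
max_positions = get_config(rows, "max_positions", int, 1)
circuit_break = get_config(rows, "circuit_break", parse_bool, False)
```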
---

#### `live_events` — Trade Event Log

| Column | Type | Description |
|--------|------|-------------|
| `id` | BIGSERIAL PK | |
| `ts` | BIGINT | Unix ms (default: NOW()) |
| `level` | TEXT | `info`/`warning`/`error` |
| `category` | TEXT | Event category |
| `symbol` | TEXT | |
| `message` | TEXT | |
| `detail` | JSONB | Structured event data |

---

#### `signal_logs` — Legacy Signal Log
Kept for backwards compatibility with the original funding-rate signal system.

| Column | Type |
|--------|------|
| `id` | BIGSERIAL PK |
| `symbol` | TEXT |
| `rate` | DOUBLE PRECISION |
| `annualized` | DOUBLE PRECISION |
| `sent_at` | TEXT |
| `message` | TEXT |

---

#### Auth Tables (defined in `auth.py` `AUTH_SCHEMA`)

**`users`**

| Column | Type |
|--------|------|
| `id` | BIGSERIAL PK |
| `email` | TEXT UNIQUE NOT NULL |
| `password_hash` | TEXT NOT NULL |
| `discord_id` | TEXT |
| `role` | TEXT DEFAULT `user` |
| `banned` | INTEGER DEFAULT 0 |
| `created_at` | TEXT |

**`subscriptions`**

| Column | Type |
|--------|------|
| `user_id` | BIGINT PK → users |
| `tier` | TEXT DEFAULT `free` |
| `expires_at` | TEXT |

**`invite_codes`**

| Column | Type |
|--------|------|
| `id` | BIGSERIAL PK |
| `code` | TEXT UNIQUE |
| `created_by` | INTEGER |
| `max_uses` | INTEGER DEFAULT 1 |
| `used_count` | INTEGER DEFAULT 0 |
| `status` | TEXT DEFAULT `active` |
| `expires_at` | TEXT |

**`invite_usage`**

| Column | Type |
|--------|------|
| `id` | BIGSERIAL PK |
| `code_id` | BIGINT → invite_codes |
| `user_id` | BIGINT → users |
| `used_at` | TEXT |

**`refresh_tokens`**

| Column | Type |
|--------|------|
| `id` | BIGSERIAL PK |
| `user_id` | BIGINT → users |
| `token` | TEXT UNIQUE |
| `expires_at` | TEXT |
| `revoked` | INTEGER DEFAULT 0 |

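Since `refresh_tokens.expires_at` is TEXT and `revoked` is an integer flag, a validity check has to parse the string and test the flag explicitly. Here is a minimal sketch, assuming `expires_at` holds an ISO 8601 string and that values without an offset are UTC; `refresh_token_usable` is a hypothetical helper, not a function from `auth.py`.

```python
from datetime import datetime, timezone


def refresh_token_usable(row: dict, now: datetime) -> bool:
    """True if the token row is neither revoked nor expired."""
    if row["revoked"]:
        return False
    expires_at = datetime.fromisoformat(row["expires_at"])
    if expires_at.tzinfo is None:
        # Assumption: naive strings in the table are UTC.
        expires_at = expires_at.replace(tzinfo=timezone.utc)
    return now < expires_at


now = datetime(2026, 3, 3, tzinfo=timezone.utc)
active = {"revoked": 0, "expires_at": "2026-04-01T00:00:00"}
revoked = {"revoked": 1, "expires_at": "2026-04-01T00:00:00"}
expired = {"revoked": 0, "expires_at": "2026-03-01T00:00:00+00:00"}
```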
---

#### `market_indicators` — Market Indicator JSONB Store
Populated by `market_data_collector.py`.

| Column | Type | Description |
|--------|------|-------------|
| `symbol` | TEXT | |
| `indicator_type` | TEXT | `long_short_ratio`, `top_trader_position`, `open_interest_hist`, `coinbase_premium`, `funding_rate` |
| `timestamp_ms` | BIGINT | |
| `value` | JSONB | Raw indicator payload |

Query pattern: `WHERE symbol=? AND indicator_type=? ORDER BY timestamp_ms DESC LIMIT 1`.

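The "latest value per symbol and indicator" pattern can be tried out in isolation. The sketch below uses an in-memory SQLite table as a stand-in for PostgreSQL so it runs without a server; the JSONB column is stored as JSON text, the payload shape `{"rate": ...}` is invented for illustration, and `latest_indicator` is a hypothetical helper. With psycopg2 the placeholders would be `%s` rather than `?`.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE market_indicators ("
    "symbol TEXT, indicator_type TEXT, timestamp_ms INTEGER, value TEXT)"
)
sample_rows = [
    ("BTCUSDT", "funding_rate", 1_000, json.dumps({"rate": 0.0001})),
    ("BTCUSDT", "funding_rate", 2_000, json.dumps({"rate": 0.0002})),
    ("ETHUSDT", "funding_rate", 3_000, json.dumps({"rate": 0.0003})),
]
conn.executemany("INSERT INTO market_indicators VALUES (?, ?, ?, ?)", sample_rows)


def latest_indicator(conn, symbol, indicator_type):
    """Return (timestamp_ms, payload) for the newest matching row, or None."""
    row = conn.execute(
        "SELECT timestamp_ms, value FROM market_indicators "
        "WHERE symbol = ? AND indicator_type = ? "
        "ORDER BY timestamp_ms DESC LIMIT 1",
        (symbol, indicator_type),
    ).fetchone()
    if row is None:
        return None
    return row[0], json.loads(row[1])


latest = latest_indicator(conn, "BTCUSDT", "funding_rate")
missing = latest_indicator(conn, "BTCUSDT", "coinbase_premium")
```

Returning `None` for a missing indicator keeps the "no data yet" case explicit for callers.
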
### Storage Design Decisions
- **Partitioning**: `agg_trades` is partitioned by month to avoid table bloat; partition maintenance is automated.
- **Dual-write**: the Cloud SQL secondary is best-effort (errors are logged, never fatal).
- **JSONB `score_factors`**: allows schema-free storage of per-strategy signal breakdowns without migrations.
- **Timestamps**: a mix of Unix seconds (`ts`), Unix ms (`time_ms`, `timestamp_ms`, `entry_ts`), ISO strings (`created_at` TEXT in auth tables), and PG `TIMESTAMP`; normalize units before comparing or joining across tables.

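The best-effort dual-write rule can be sketched as a small wrapper: the primary write is allowed to raise, while a failure on the secondary is only logged. This is an illustration of the policy, not the code in `db.py`; `dual_write`, `local_pg`, and `cloud_sql_down` are hypothetical names, and the two writers stand in for the real psycopg2 pools.

```python
import logging
from typing import Callable, Sequence

logger = logging.getLogger("dual_write")

Writer = Callable[[str, Sequence], None]


def dual_write(primary: Writer, secondary: Writer, sql: str, params: Sequence) -> bool:
    """Write to primary (errors propagate), then to secondary (errors are logged only).

    Returns True if the secondary write also succeeded.
    """
    primary(sql, params)
    try:
        secondary(sql, params)
    except Exception:
        logger.exception("secondary write failed; continuing")
        return False
    return True


# Hypothetical stand-ins for the local PG pool and an unreachable Cloud SQL pool.
written = []


def local_pg(sql, params):
    written.append(("local", sql, tuple(params)))


def cloud_sql_down(sql, params):
    raise ConnectionError("Cloud SQL unreachable")


mirrored = dual_write(
    local_pg, cloud_sql_down,
    "INSERT INTO live_events (message) VALUES (%s)", ("hello",),
)
```

Returning a flag rather than raising lets callers count secondary failures without changing the primary write path.
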
## Interfaces / Dependencies
- `db.py:init_schema()` — creates all tables in `SCHEMA_SQL`
- `auth.py:ensure_tables()` — creates auth tables from `AUTH_SCHEMA`
- `db.py:ensure_partitions()` — auto-creates monthly `agg_trades_YYYYMM` partitions

## Unknowns & Risks
- [unknown] The `market_indicators` table schema is not in `SCHEMA_SQL`; it is likely created separately by `market_data_collector.py`. Verify before querying.
- [risk] Timestamp inconsistency: the auth tables use TEXT for timestamps, other tables use BIGINT, and others use PG `TIMESTAMP`, so cross-table JOINs on time fields require explicit casting.
- [inference] The `live_config` circuit-break key name is not confirmed from source; it is inferred from `risk_guard.py` behavior.
- [risk] The `users` table is defined in both `SCHEMA_SQL` (`db.py`) and `AUTH_SCHEMA` (`auth.py`). Both use `CREATE TABLE IF NOT EXISTS`, so whichever runs first determines the schema, and the two definitions diverge (the `db.py` version lacks `discord_id` and `banned`).

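One way to handle the mixed timestamp encodings on the application side is to convert every value to a timezone-aware UTC `datetime` before comparing. The helpers below are a sketch with hypothetical names; they assume the TEXT columns hold ISO 8601 strings and that strings without an offset are UTC. On the SQL side, PostgreSQL's `to_timestamp(ts)` and `to_timestamp(time_ms / 1000.0)` perform the equivalent casts.

```python
from datetime import datetime, timezone


def from_unix_s(ts: int) -> datetime:
    """Unix seconds (e.g. rate_snapshots.ts) to aware UTC datetime."""
    return datetime.fromtimestamp(ts, tz=timezone.utc)


def from_unix_ms(ts_ms: int) -> datetime:
    """Unix milliseconds (e.g. agg_trades.time_ms) to aware UTC datetime."""
    return datetime.fromtimestamp(ts_ms / 1000, tz=timezone.utc)


def from_iso_text(text: str) -> datetime:
    """ISO 8601 TEXT (e.g. users.created_at); naive values are assumed to be UTC."""
    parsed = datetime.fromisoformat(text)
    return parsed if parsed.tzinfo else parsed.replace(tzinfo=timezone.utc)


# The same instant expressed in all three encodings.
a = from_unix_s(1_772_323_200)
b = from_unix_ms(1_772_323_200_000)
c = from_iso_text("2026-03-01T00:00:00")
```
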
## Source Refs
- `backend/db.py:166-356` — `SCHEMA_SQL` with all table definitions
- `backend/auth.py:28-71` — `AUTH_SCHEMA` auth tables
- `backend/db.py:360-414` — `ensure_partitions()`, `init_schema()`
- `backend/signal_engine.py:123-158` — `market_indicators` query pattern