docs: add AI documentation suite and comprehensive code review report

- Generate full AI-consumable docs (docs/ai/): system overview, architecture,
  module cheatsheet, API contracts, data model, build guide, decision log,
  glossary, and open questions (deep tier coverage)
- Add PROBLEM_REPORT.md: categorized bug/risk summary
- Add DETAILED_CODE_REVIEW.md: full line-by-line review of all 15 backend
  files, documenting 4 fatal issues, 5 critical deployment bugs, 4 security
  vulnerabilities, and 6 architecture defects with prioritized fix plan
fanziqi 2026-03-03 19:01:18 +08:00
parent 0d9dffa44d
commit 22787b3e0a
12 changed files with 2245 additions and 0 deletions

@ -0,0 +1,379 @@
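# Comprehensive Code Review Report
> Generated: 2026-03-03
> Scope: all backend files (15 files, read in full)
> Based on the code at commit `a17c143`
---
## Summary
This report is based on a line-by-line reading of all backend files. It identifies **4 fatal issues** (the live-trading path cannot run at all), **5 high-severity issues** (a fresh deployment fails immediately), **4 security vulnerabilities**, and **6 architectural defects**. Several of these were already mentioned in the earlier report (PROBLEM_REPORT.md), but this report, based on a complete reading of the code, gives more precise locations and broader coverage.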
---
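## 🔴 Fatal Issues (live-trading path completely broken)
### [F1] live_executor can never read signals
**Location**: `live_executor.py:fetch_pending_signals()` + `signal_engine.py:save_indicator()`
**Evidence chain**:
1. `signal_engine.py:690-706`: `save_indicator()` uses `get_sync_conn()` (i.e. db.py's `PG_HOST=127.0.0.1`) to write signals to `signal_indicators` on the **local PG**
2. `live_executor.py:50-55` (known): `DB_HOST` defaults to `10.106.0.3` (Cloud SQL)
3. `signal_engine.py:704-705`: `NOTIFY new_signal` is sent to the **local PG**, while live_executor's `LISTEN` is connected to Cloud SQL
**Conclusion**:
| Action | Database instance |
|------|---------|
| signal_engine writes `signal_indicators` | Local PG (127.0.0.1) |
| live_executor's LISTEN | Cloud SQL (10.106.0.3) |
| live_executor's polling query on `signal_indicators` | Cloud SQL (10.106.0.3) |
| Contents of `signal_indicators` on Cloud SQL | **Always empty** (no dual-write mechanism) |
Even when live_executor polls, it queries the empty table on Cloud SQL, and the NOTIFY goes to the local PG where it is never received. **As long as the live-trading processes run against a different database instance, no trade will ever be executed.**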
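The mismatch described in F1 can be caught before any process starts. The following is a minimal sketch, assuming only the two environment variables and defaults quoted in this report; the helper names `resolve_hosts` and `signal_path_is_connected` are hypothetical and do not exist in the repository.

```python
from typing import Mapping

# Defaults quoted from the report; function names below are hypothetical.
PG_HOST_DEFAULT = "127.0.0.1"   # db.py family (signal_engine, main.py, collectors)
DB_HOST_DEFAULT = "10.106.0.3"  # live_executor / risk_guard / position_sync


def resolve_hosts(env: Mapping[str, str]) -> tuple[str, str]:
    """Return (writer_host, reader_host) as the two config families resolve them."""
    writer = env.get("PG_HOST", PG_HOST_DEFAULT)
    reader = env.get("DB_HOST", DB_HOST_DEFAULT)
    return writer, reader


def signal_path_is_connected(env: Mapping[str, str]) -> bool:
    """True only if the signal writer and the signal reader target the same host."""
    writer, reader = resolve_hosts(env)
    return writer == reader
```

Called with `os.environ` at startup, a check like this would refuse to launch live trading while the two variables disagree. It compares host strings only, so it would not notice two different hostnames that resolve to the same instance.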
---
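### [F2] risk_guard's data-freshness check always trips the circuit breaker
**Location**: `risk_guard.py:check_data_freshness()`
**Code logic** (reconstructed from the code read so far):
```python
# Reconstructed flow (pseudocode, not the literal source):
# risk_guard connects to Cloud SQL (DB_HOST=10.106.0.3)
# SELECT MAX(ts) FROM signal_indicators  → NULL (the table is empty)
# stale_seconds = now - NULL             → Python raises an exception or yields a huge value
# → the block_all circuit break is triggered
```
`/tmp/risk_guard_state.json` then contains `block_all=true`. live_executor reads this file before executing (Fail-Closed), so **all trades are rejected outright**.
**Compounding effect**: even once F1 is fixed (signals reach Cloud SQL), F2 still makes live_executor abandon execution before placing an order because of the `block_all` flag.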
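One way to keep the empty-table case from turning into an exception or an arbitrary huge value is to treat "no rows" as its own explicit state. This is a sketch under stated assumptions, not the code in `risk_guard.py`: the function names are hypothetical, and the 30-second limit is taken from the risk-rule table in the architecture doc.

```python
from datetime import datetime, timezone
from typing import Optional

MAX_STALE_SECONDS = 30  # assumed limit; the real value lives in risk_guard's config


def staleness_seconds(latest_ts: Optional[datetime], now: datetime) -> Optional[float]:
    """Age of the newest row in seconds, or None when the table has no rows."""
    if latest_ts is None:
        return None
    return (now - latest_ts).total_seconds()


def freshness_verdict(latest_ts: Optional[datetime], now: datetime) -> str:
    """Classify freshness as 'no_data', 'stale', or 'fresh'."""
    age = staleness_seconds(latest_ts, now)
    if age is None:
        return "no_data"
    return "stale" if age > MAX_STALE_SECONDS else "fresh"
```

A caller can still fail closed on `"no_data"`, but the log line and alert would then say that the table is empty rather than that the data is old, which points directly at the F1 routing problem.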
---
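### [F3] risk_guard and live_executor must run on the same machine, but nothing enforces it
**Location**: `risk_guard.py` (writes `/tmp/risk_guard_state.json`), `live_executor.py` (reads the same path)
**Problem**: The two processes exchange state through a file on the local filesystem. If they are deployed on different machines or in different containers, live_executor sees either a stale file or no file at all, and the Fail-Closed mechanism blocks every trade. No documentation currently states that the two processes must be co-located, no startup script checks it, and no alert fires.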
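The reader side of this file handshake can be sketched as follows. This is an illustration only: the path and the `block_all` key come from this report, while `trading_allowed` and the 15-second age limit are hypothetical and not taken from `live_executor.py`.

```python
import json
import os
import time

STATE_PATH = "/tmp/risk_guard_state.json"  # path quoted from the report
MAX_STATE_AGE_S = 15.0                     # hypothetical staleness limit


def trading_allowed(path: str = STATE_PATH, max_age_s: float = MAX_STATE_AGE_S) -> bool:
    """Fail closed: allow trading only on a fresh, parseable file with block_all == False."""
    try:
        age = time.time() - os.path.getmtime(path)
        with open(path, "r", encoding="utf-8") as fh:
            state = json.load(fh)
    except (OSError, ValueError):
        return False  # missing, unreadable, or corrupt file
    if age > max_age_s:
        return False  # the writer has stopped refreshing the file
    return isinstance(state, dict) and state.get("block_all") is False
```

Checking the file's age as well as its content distinguishes "risk_guard says go" from "risk_guard died an hour ago and left a permissive file behind", which matters most when the two processes are accidentally split across hosts.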
---
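### [F4] signal_pusher.py still uses SQLite and is completely disconnected from the V5 PG system
**Location**: `signal_pusher.py:1-20`
```python
import sqlite3
DB_PATH = os.path.join(os.path.dirname(os.path.dirname(__file__)), "arb.db")
SYMBOLS = ["BTCUSDT", "ETHUSDT"] # ← XRP/SOL are not monitored
```
**Full list of problems**:
1. Reads `arb.db` (SQLite); V5 signals all live in the PG `signal_indicators` table, which this script never reads
2. Covers only BTC/ETH; XRP/SOL signals are never pushed
3. **Hardcoded Discord Bot Token** (see [S1])
4. It is a one-shot script, not a daemon, so managing it with PM2 is pointless
5. The SQLite table it queries, `signal_logs`, is deprecated under V5
**Conclusion**: signal_pusher.py is legacy code that was never migrated to the V5 PG architecture. If this is the file running under PM2, the notification system is completely non-functional.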
---
## 🔴 High-Severity Issues (fresh deployments crash immediately)
### [H1] `signal_indicators` table is missing the `strategy` and `factors` columns
**Location**: `db.py:205-224` (CREATE TABLE SQL) vs `signal_engine.py:695-701` (INSERT statement)
**Columns in SCHEMA_SQL**:
`id, ts, symbol, cvd_fast, cvd_mid, cvd_day, cvd_fast_slope, atr_5m, atr_percentile, vwap_30m, price, p95_qty, p99_qty, buy_vol_1m, sell_vol_1m, score, signal`
**Columns actually INSERTed by save_indicator()**:
`ts, symbol, **strategy**, cvd_fast, cvd_mid, cvd_day, cvd_fast_slope, atr_5m, atr_percentile, vwap_30m, price, p95_qty, p99_qty, score, signal, **factors**`
The INSERT has two extra columns, `strategy TEXT` and `factors JSONB`. `init_schema()` also has no matching `ALTER TABLE signal_indicators ADD COLUMN IF NOT EXISTS` patch (only `paper_trades` is patched).
**Consequence**: after `init_schema()` on a fresh environment, every signal_engine write fails with `column "strategy" of relation "signal_indicators" does not exist`, and the main loop crashes.
**Note**: the query behind `main.py:/api/signals/latest` also selects the `strategy` and `factors` fields, so the API fails on a fresh deployment too.
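Drift of this kind is cheap to detect mechanically. The sketch below diffs the two column lists exactly as quoted in this section; `missing_columns` is a hypothetical helper, and in a real check the lists would be extracted from `SCHEMA_SQL` and the INSERT statement rather than typed in by hand.

```python
# Column lists copied from the report; the helper name is hypothetical.
SCHEMA_COLUMNS = [
    "id", "ts", "symbol", "cvd_fast", "cvd_mid", "cvd_day", "cvd_fast_slope",
    "atr_5m", "atr_percentile", "vwap_30m", "price", "p95_qty", "p99_qty",
    "buy_vol_1m", "sell_vol_1m", "score", "signal",
]
INSERT_COLUMNS = [
    "ts", "symbol", "strategy", "cvd_fast", "cvd_mid", "cvd_day", "cvd_fast_slope",
    "atr_5m", "atr_percentile", "vwap_30m", "price", "p95_qty", "p99_qty",
    "score", "signal", "factors",
]


def missing_columns(schema: list[str], insert: list[str]) -> list[str]:
    """Columns the INSERT references that the schema does not define, in INSERT order."""
    known = set(schema)
    return [col for col in insert if col not in known]
```

Run as a unit test, a non-empty result would fail the build before a fresh deployment ever reaches `init_schema()`.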
---
### [H2] `paper_trades` table is missing the `risk_distance` column
**Location**: `db.py:286-305` (CREATE TABLE SQL) vs `signal_engine.py:762-781` (INSERT statement)
**Columns in SCHEMA_SQL** (no `risk_distance`):
`id, symbol, direction, score, tier, entry_price, entry_ts, exit_price, exit_ts, tp1_price, tp2_price, sl_price, tp1_hit, status, pnl_r, atr_at_entry, score_factors, created_at`
`init_schema()` adds the `strategy` column via `ALTER TABLE paper_trades ADD COLUMN IF NOT EXISTS strategy`, but **does not add `risk_distance`**.
The INSERT in `paper_open_trade()` includes `risk_distance`, and `paper_monitor.py:59` and `signal_engine.py:800` also read `risk_distance` from the DB.
**Consequence**: on a fresh deployment, the very first paper-trade open fails with `column "risk_distance" does not exist`. The TP/SL calculation falls back via `rd_db if rd_db and rd_db > 0 else abs(entry_price - sl)`, but that fallback is never reached because the insert itself fails.
---
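### [H3] `users` table is defined twice; `banned` and `discord_id` are missing on fresh environments
**Location**: `db.py:269-276` vs `auth.py:28-37`
| Field | db.py SCHEMA_SQL | auth.py AUTH_SCHEMA |
|------|-----------------|---------------------|
| `email` | ✅ | ✅ |
| `password_hash` | ✅ | ✅ |
| `role` | ✅ | ✅ |
| `created_at` | ✅ | ✅ |
| `discord_id` | ❌ | ✅ |
| `banned` | ❌ | ✅ |
On FastAPI startup, `init_schema()` (the db.py definition) runs first and `ensure_auth_tables()` (the auth.py definition) runs second, so the second `CREATE TABLE IF NOT EXISTS` is silently skipped. The table that actually gets created is the older version, without `discord_id` and `banned`.
**Consequence**: the user-ban feature does not work at all on a fresh deployment (the `banned` field does not exist).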
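The "first definition wins" behaviour is easy to reproduce in isolation. The sketch below uses the standard-library `sqlite3` module purely as a stand-in for PostgreSQL, since `CREATE TABLE IF NOT EXISTS` is a no-op on an existing table in both; the column types are simplified assumptions and the helper name is hypothetical.

```python
import sqlite3

# Simplified stand-ins for the two definitions; SQLite replaces PostgreSQL here.
DB_PY_USERS = (
    "CREATE TABLE IF NOT EXISTS users ("
    "id INTEGER PRIMARY KEY, email TEXT, password_hash TEXT, role TEXT, created_at TEXT)"
)
AUTH_PY_USERS = (
    "CREATE TABLE IF NOT EXISTS users ("
    "id INTEGER PRIMARY KEY, email TEXT, password_hash TEXT, role TEXT, created_at TEXT, "
    "discord_id TEXT, banned INTEGER DEFAULT 0)"
)


def columns_after_startup(first_ddl: str, second_ddl: str) -> list[str]:
    """Run two CREATE TABLE IF NOT EXISTS statements in order and list the resulting columns."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(first_ddl)
        conn.execute(second_ddl)  # no-op: the table already exists
        rows = conn.execute("PRAGMA table_info(users)").fetchall()
        return [row[1] for row in rows]  # row[1] is the column name
    finally:
        conn.close()
```

Swapping the order of the two statements yields the full column set, which is why the fix proposed for H3 is to keep a single owner for the table rather than to reorder startup calls.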
---
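### [H4] `/api/kline` supports only BTC/ETH; XRP/SOL silently return wrong data
**Location**: `main.py:151-152`
```python
rate_col = "btc_rate" if symbol.upper() == "BTC" else "eth_rate"
price_col = "btc_price" if symbol.upper() == "BTC" else "eth_price"
```
XRP and SOL requests are both routed to the ETH columns. The response contains ETH funding-rate candles but is labeled XRP/SOL, so the frontend chart is entirely wrong. Root cause: the `rate_snapshots` table has only the `btc_rate` and `eth_rate` columns and cannot store four symbols independently.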
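Until the table is extended, the silent fallback can be replaced by an explicit lookup that rejects unsupported symbols. This is a minimal sketch, not the endpoint's actual code; `KLINE_COLUMNS` and `kline_columns` are hypothetical names, and only the four column names come from the snippet above.

```python
# Only the two symbols that rate_snapshots has columns for, per the report.
KLINE_COLUMNS = {
    "BTC": ("btc_rate", "btc_price"),
    "ETH": ("eth_rate", "eth_price"),
}


def kline_columns(symbol: str) -> tuple[str, str]:
    """Return (rate_col, price_col) for a supported symbol, or raise ValueError."""
    key = symbol.upper()
    if key not in KLINE_COLUMNS:
        raise ValueError(f"kline is not available for symbol {symbol!r}")
    return KLINE_COLUMNS[key]
```

In a FastAPI handler the `ValueError` would typically be translated into a 4xx response, so the frontend sees a clear error instead of mislabeled ETH data.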
---
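### [H5] `subscriptions.py` is an orphaned SQLite router that defines a duplicate `/api/signals/history`
**Location**: `subscriptions.py:1-23`
```python
import sqlite3
DB_PATH = "arb.db" # SQLite
@router.get("/api/signals/history") # ← same path as in main.py
def signals_history(): ...
```
**Three problems**:
1. The route path is identical to `@app.get("/api/signals/history")` at `main.py:221`
2. It queries the SQLite `arb.db`, which no longer holds this data under V5
3. `main.py` **never** calls `include_router(subscriptions.router)`, so it is currently dead code
If someone later adds `subscriptions.router` by mistake, it will collide with the existing PG-backed route of the same path. FastAPI silently uses whichever was registered first, which makes the resulting bug hard to track down.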
---
## 🟠 Security Vulnerabilities
### [S1] Discord Bot Token hardcoded in source (high severity)
**Location**: `signal_pusher.py:~25`
```python
DISCORD_TOKEN = os.getenv("DISCORD_BOT_TOKEN", "MTQ3Mjk4NzY1NjczNTU1OTg0Mg.GgeYh5.NYSbivZKBUc5S2iKXeB-hnC33w3SUUPzDDdviM")
```
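This is a **real Discord Bot Token** in a valid format (base64_encoded_bot_id.timestamp.signature). Anyone with read access to the repository can use it to act as the bot: post messages, read channel history, and modify channels, to the extent the bot's permissions allow.
**Immediate action**: revoke this token in the Discord developer portal and generate a new one, then remove the default value from the code.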
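Removing the default means the process should refuse to start when the variable is absent. A minimal sketch of that pattern follows; `require_secret` and `MissingSecretError` are hypothetical names, and only the variable name `DISCORD_BOT_TOKEN` comes from the snippet above.

```python
from typing import Mapping


class MissingSecretError(RuntimeError):
    """Raised when a required secret is absent from the environment."""


def require_secret(env: Mapping[str, str], name: str) -> str:
    """Return the secret named `name`, or raise instead of falling back to a default."""
    value = env.get(name, "").strip()
    if not value:
        raise MissingSecretError(f"environment variable {name} must be set")
    return value
```

Typical use would be `DISCORD_TOKEN = require_secret(os.environ, "DISCORD_BOT_TOKEN")` at import time, so a misconfigured deployment fails loudly at startup rather than running with a credential that lives in the repository.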
---
### [S2] Database password hardcoded (three places)
**Location**:
- `db.py:19`: `os.getenv("PG_PASS", "arb_engine_2026")`
- `live_executor.py:44`: `os.getenv("DB_PASSWORD", "arb_engine_2026")`
- `risk_guard.py:42`: `os.getenv("DB_PASSWORD", "arb_engine_2026")`
All three use the same default password. If the code leaks, the testnet database is directly exposed. In addition, `db.py:28` carries a default password for Cloud SQL: `os.getenv("CLOUD_PG_PASS", "arb_engine_2026")`.
---
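### [S3] JWT secret has a known testnet default
**Location**: `auth.py` (line numbers inferred, roughly 15-20)
```python
_jwt_default = "arb-engine-jwt-secret-v2-2026" if _TRADE_ENV == "testnet" else None
```
When the `TRADE_ENV` environment variable is unset (it defaults to `testnet`), the JWT secret falls back to this known string. Any JWT can then be forged by anyone who knows the secret, bypassing authentication.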
---
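### [S4] CORS configuration exposes two localhost ports
**Location**: `main.py:16-20`
```python
allow_origins=["https://arb.zhouyangclaw.com", "http://localhost:3000", "http://localhost:3001"]
```
The production configuration still allows `localhost:3000` and `localhost:3001`. An attacker who can run a page in the victim's browser on one of those local origins (e.g. via XSS injected into another locally served site) can issue cross-origin requests to the API that CORS would otherwise block. Production should drop the localhost origins.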
---
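## 🟡 Architectural Defects
### [A1] Strategy JSON does not hot-reload (contrary to what the docs claim)
**Location**: `signal_engine.py:964-966`
```python
def main():
    strategy_configs = load_strategy_configs()  # ← called only once, at startup!
    ...
    while True:
        load_paper_config()  # ← runs in the loop, but only loads the on/off switches
        # strategy_configs is never refreshed
```
The decision log (`06-decision-log.md`) claims the strategy JSON can be edited live without a restart. In fact the `strategy_configs` variable is assigned once at the top of `main()`, and the main loop never calls `load_strategy_configs()` again.
**After editing v51_baseline.json or v52_8signals.json, signal_engine must be restarted.**
Note: every 60 loop iterations `load_paper_config()` does hot-reload the switches for which strategies are enabled, but weights, thresholds, and TP/SL multipliers are not hot-updated.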
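If hot reload is wanted, one inexpensive approach is to re-read the JSON files only when their modification time or size changes. The class below is a sketch under that assumption; `StrategyConfigCache` is a hypothetical name and is not how `load_strategy_configs()` is implemented.

```python
import json
import os


class StrategyConfigCache:
    """Reloads the JSON configs in a directory whenever any file changes."""

    def __init__(self, directory: str) -> None:
        self.directory = directory
        self._signature = None
        self._configs: list = []

    def _scan(self) -> tuple:
        """Directory signature: (name, mtime_ns, size) for each JSON file, sorted by name."""
        entries = []
        for name in sorted(os.listdir(self.directory)):
            if name.endswith(".json"):
                st = os.stat(os.path.join(self.directory, name))
                entries.append((name, st.st_mtime_ns, st.st_size))
        return tuple(entries)

    def get(self) -> list:
        """Return the current configs, re-reading from disk only if something changed."""
        signature = self._scan()
        if signature != self._signature:
            configs = []
            for name, _, _ in signature:
                path = os.path.join(self.directory, name)
                with open(path, "r", encoding="utf-8") as fh:
                    configs.append(json.load(fh))
            self._configs = configs
            self._signature = signature
        return self._configs
```

Calling `cache.get()` at the top of each loop iteration costs one `stat` per file in the steady state. A half-written file would raise a JSON decode error, so a production version would probably keep the previous configs on parse failure and write new files atomically.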
---
### [A2] Three sets of database connection config; easy to miss one during migration
| Process | Environment variable read | Default connection |
|------|-------------|---------|
| `main.py`, `signal_engine.py`, `market_data_collector.py`, `agg_trades_collector.py`, `liquidation_collector.py`, `paper_monitor.py` | `PG_HOST` (db.py) | 127.0.0.1 |
| `live_executor.py`, `risk_guard.py`, `position_sync.py` | `DB_HOST` | 10.106.0.3 |
| `market_data_collector.py` (internal) | `PG_HOST` | 127.0.0.1 |
Six processes use `PG_HOST` and three use `DB_HOST`. The variable names differ and the defaults differ, so any change requires updating two sets of `.env` settings at the same time.
---
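### [A3] market_indicators and liquidations tables are not in the main schema
**Location**: `market_data_collector.py:ensure_table()`, `liquidation_collector.py:ensure_table()`
Both tables are created by their own collector process and are not part of `db.py:SCHEMA_SQL`. This creates a startup-order problem:
- If `signal_engine` starts before `market_data_collector`, querying `market_indicators` fails because the table does not exist, and every market-indicator score degrades to its midpoint value
- If `signal_engine` starts before `liquidation_collector`, querying `liquidations` fails and the liquidation-layer score drops to zero
**Additional finding**: the aggregation write in `liquidation_collector.py`'s `save_aggregated()` targets the `market_indicators` table (not `liquidations`), yet `ensure_table()` creates only `liquidations`. If `market_data_collector` has never run (so `market_indicators` does not exist), liquidation_collector's aggregation write fails as well.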
---
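### [A4] TP/SL logic is fully duplicated between paper_monitor and signal_engine
**Location**: `signal_engine.py:788-878` (`paper_check_positions()`), `paper_monitor.py:44-143` (`check_and_close()`)
The two functions are nearly identical (both check TP1/TP2/SL/timeout). A comment in the signal_engine main loop currently says "position checks are handled in real time by paper_monitor.py", so `paper_check_positions()` is a **dead function** (defined but never called).
**Risk**: if someone later changes the TP/SL logic in only paper_monitor.py or only signal_engine.py, the two copies will diverge.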
---
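### [A5] rate_snapshots stores only BTC/ETH; XRP/SOL data is permanently lost
**Location**: `db.py:167-177` (CREATE TABLE), `main.py:42-55` (save_snapshot)
The columns of `rate_snapshots` are hardcoded as `btc_rate, eth_rate, btc_price, eth_price, btc_index_price, eth_index_price`. XRP/SOL funding-rate data is only fetched live from Binance and never stored, so it cannot be used for historical analysis or candlestick display.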
---
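### [A6] `/api/signals/history` returns data from a deprecated table
**Location**: `main.py:221-230`
```python
SELECT id, symbol, rate, annualized, sent_at, message FROM signal_logs ORDER BY sent_at DESC LIMIT 100
```
`signal_logs` is a legacy V4-era table for funding-rate alerts (`db.py:259-267`); nothing writes to it under V5. The endpoint returns a permanently empty result to the frontend without any error, so callers cannot tell a genuinely empty dataset from a broken pipeline.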
---
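## 🟢 Sound Design Worth Recording
The following designs stood out positively during the review and are listed for reference:
1. **`position_sync.py` is thorough**: automatic re-placement of a lost SL, moving the SL to break-even after TP1 is hit, lookup of actual fill prices, and funding-fee tracking (8-hour settlement windows) together cover the main edge cases of live trading.
2. **risk_guard's Fail-Closed mode is correct**: when `/tmp/risk_guard_state.json` does not exist, live_executor rejects trades by default instead of letting them through, which is the safe direction.
3. **paper_monitor.py uses real-time WebSocket prices**: this suits TP/SL triggering better than signal_engine's 15-second polling, because fast price moves are not missed between polls.
4. **Data-integrity safeguards in agg_trades_collector.py**: a continuity check every 60 seconds, REST backfill at any gap, and an hourly integrity report make for a careful design.
5. **GCP Secret Manager integration**: live_executor/risk_guard/position_sync load API keys from GCP Secret Manager first (`projects/gen-lang-client-0835616737/secrets/BINANCE_*`), so production keys are not in code or environment variables, which is sound security design.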
---
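## 📋 Fix Priority Checklist
### Immediately (to prevent loss of funds once live trading goes online)
| ID | Issue | Fix direction |
|------|------|---------|
| **S1** | Discord Bot Token leaked | Revoke and regenerate it in the Discord developer portal immediately; remove the default value from the code |
| **F1** | signal_engine writes to local PG, live_executor reads Cloud SQL, so signals are never delivered | Point all processes at the same PG instance, or add dual-write logic for the `signal_indicators` table |
| **F2** | risk_guard queries an empty Cloud SQL table and always trips the circuit breaker | Resolve together with F1 (unify the DB connection) |
| **F3** | risk_guard/live_executor must be co-located, but this is undocumented | State it explicitly in the PM2 config and deployment docs, or switch to DB-based state exchange |
| **F4** | signal_pusher is an obsolete SQLite script | Remove it from the PM2 config; rewrite it as a PG version if needed |
### This week (to prevent errors on fresh deployments)
| ID | Issue | Fix direction |
|------|------|---------|
| **H1** | `signal_indicators` is missing the `strategy` and `factors` columns | Add the columns to `SCHEMA_SQL`; add `ALTER TABLE ... ADD COLUMN IF NOT EXISTS` to `init_schema()` |
| **H2** | `paper_trades` is missing the `risk_distance` column | Same as above: add the ALTER to `init_schema()` |
| **H3** | `users` table defined twice; `banned`/`discord_id` missing | Remove the `users` CREATE TABLE from `SCHEMA_SQL` and let `auth.py` own it; add an ALTER to migrate existing environments |
| **H4** | `/api/kline` returns ETH data for XRP/SOL | Either restrict kline to BTC/ETH and note it in the API docs, or extend the `rate_snapshots` table structure |
| **H5** | `subscriptions.py` is orphaned SQLite code | Delete it or move it to an `archive/` directory to prevent accidental use |
### This month (security hardening)
| ID | Issue | Fix direction |
|------|------|---------|
| **S2** | Database password hardcoded | Move it into a `.env` file kept out of the repository; use GCP Secret Manager in production |
| **S3** | JWT secret default is predictable | Require the `JWT_SECRET` environment variable in production deployments; when `_TRADE_ENV=production`, a None secret should abort startup |
| **S4** | CORS includes localhost | Remove the localhost origins in production |
### Long term (architectural improvements)
| ID | Issue | Fix direction |
|------|------|---------|
| **A1** | Strategy JSON does not hot-reload | Call `load_strategy_configs()` again periodically in the main loop (e.g. every 60 iterations) |
| **A2** | Three sets of DB connection config | Standardize on one set of environment variables (`PG_HOST` suggested) and have every process import its connection from `db.py` |
| **A3** | market_indicators/liquidations not in the main schema | Move both table definitions into `SCHEMA_SQL` / `init_schema()` |
| **A4** | paper_check_positions is dead code | Delete `paper_check_positions()` from signal_engine.py (paper_monitor covers the functionality) |
| **A6** | `/api/signals/history` returns data from a deprecated table | Redirect it to query `signal_indicators`, or retire the endpoint |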
---
## Appendix: File Review Coverage
| File | Lines | Reviewed in this pass |
|------|-----|---------|
| `main.py` | ~500 | ✅ Full text |
| `db.py` | ~415 | ✅ Full text |
| `signal_engine.py` | ~1085 | ✅ Full text |
| `live_executor.py` | ~708 | ✅ Full text |
| `risk_guard.py` | ~644 | ✅ Full text |
| `auth.py` | ~389 | ✅ Full text |
| `position_sync.py` | ~687 | ✅ Full text |
| `paper_monitor.py` | ~194 | ✅ Full text |
| `agg_trades_collector.py` | ~400 | ✅ Full text |
| `market_data_collector.py` | ~300 | ✅ Full text |
| `liquidation_collector.py` | ~141 | ✅ Full text |
| `signal_pusher.py` | ~100 | ✅ Full text |
| `subscriptions.py` | ~24 | ✅ Full text |
| `trade_config.py` | ~15 | ✅ Full text |
| `backtest.py` | ~300 | First 100 lines + signature scan |
| `admin_cli.py` | ~100 | Signature scan |

docs/PROBLEM_REPORT.md Normal file
@ -0,0 +1,145 @@
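# Project Problem Report
> Generated: 2026-03-03
> Based on analysis of the code at commit `0d9dffa`
---
## 🔴 High-Severity Issues (may cause live-trading errors)
### P1: The database was never actually migrated to the cloud
**Symptom**: The whole system is assumed to be running on Cloud SQL, but in reality only the raw `agg_trades` trade data is dual-written. All other core data lives solely on the **local PG (127.0.0.1)**.
| Table | Local PG | Cloud SQL |
|-------|---------|-----------|
| `agg_trades` (raw trades) | ✅ | ✅ dual-write |
| `signal_indicators` (signal output) | ✅ | ❌ none |
| `paper_trades` (paper trading) | ✅ | ❌ none |
| `rate_snapshots` (rate snapshots) | ✅ | ❌ none |
| `market_indicators` (market data) | ✅ | ❌ none |
| `live_trades` (live trades) | ❌ | ✅ cloud only |
| `live_config` / `live_events` | ❌ | ✅ cloud only |
**The most critical problem**: `live_executor.py` and `risk_guard.py` connect to Cloud SQL by default (`DB_HOST=10.106.0.3`), but `signal_engine.py` writes signals only to the local PG. This means:
- The `signal_indicators` table that the live executor reads on Cloud SQL **may be empty**
- The `live_trades` data monitored by the risk guard and the data written by the signal engine sit in two entirely separate databases
**Impact**: the live-trading path is at risk of being broken; the database address each process actually connects to on the server needs to be verified immediately.
**Fix direction**: point all processes at Cloud SQL, or point them all at the local PG (via Cloud SQL Auth Proxy).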
---
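### P2: `users` table is defined twice with inconsistent fields
`db.py` and `auth.py` each define a `users` table, with different fields:
| Field | db.py version | auth.py version |
|------|-----------|-------------|
| `email` | ✅ | ✅ |
| `password_hash` | ✅ | ✅ |
| `role` | ✅ | ✅ |
| `created_at` | ✅ | ✅ |
| `discord_id` | ❌ | ✅ |
| `banned` | ❌ | ✅ |
On startup FastAPI runs `init_schema()` (db.py version) first and `ensure_auth_tables()` (auth.py version) second. Because `CREATE TABLE IF NOT EXISTS` does nothing once the table exists, **the table actually created is the old version, lacking `discord_id` and `banned`**.
**Impact**: the user-ban feature (`banned` field) may not work on a fresh install.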
---
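### P3: `signal_indicators` INSERT includes a `strategy` field that the schema lacks
`save_indicator()` includes a `strategy` field when inserting into `signal_indicators` (`signal_engine.py:697`), but the CREATE TABLE statement in `SCHEMA_SQL` has no such field (`db.py:205-224`).
**Impact**: after initializing a fresh environment, signal-engine writes fail with a column-does-not-exist error.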
---
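## 🟡 Medium-Severity Issues (affect stability and maintenance)
### P4: `requirements.txt` is seriously incomplete
The file lists only 5 packages; running the system also requires:
| Missing dependency | Used for |
|---------|------|
| `asyncpg` | FastAPI async database access |
| `psycopg2-binary` | Sync database access (signal_engine etc.) |
| `aiohttp` | live_executor, risk_guard |
| `websockets`, `httpx` | agg_trades_collector WS connection |
| `psutil` | Already in the file, but version not pinned |
**Impact**: deployment on a new machine fails outright.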
---
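### P5: `market_indicators` and `liquidations` tables are not in the main schema
These two tables are created by their own collector processes and are not part of `init_schema()`. If a collector has never run, signal_engine's queries against these tables fail (it degrades to the default midpoint score and does not crash, but the data is inaccurate).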
---
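### P6: No CI/CD, no automated tests
- Code changes are verified entirely by hand
- The strategy logic (`evaluate_signal`) has no unit tests at all, so refactoring is very risky
- Deployment process: manual ssh + git pull + pm2 restart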
---
## 🟠 Security Risks
### P7: Testnet password hardcoded in source
It appears in three files:
```python
os.getenv("PG_PASS", "arb_engine_2026") # db.py:19
os.getenv("DB_PASSWORD", "arb_engine_2026") # live_executor.py:44
os.getenv("DB_PASSWORD", "arb_engine_2026") # risk_guard.py:42
```
If the code ever leaks (public GitHub repo, screenshots, etc.), the testnet database is left completely exposed.
### P8: JWT secret has a testnet default
```python
_jwt_default = "arb-engine-jwt-secret-v2-2026" if _TRADE_ENV == "testnet" else None
```
If `TRADE_ENV` is not set correctly in production, this known secret is used silently and every JWT can be forged.
---
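## 🔵 Architectural Debt (long term)
### P9: Three database connection configurations coexist and are easy to confuse
| Configuration | Processes using it | Default target |
|---------|----------|---------|
| `PG_HOST` in `db.py` | main.py, signal_engine, collectors | `127.0.0.1` (local) |
| In-process `DB_HOST` | live_executor, risk_guard, position_sync | `10.106.0.3` (Cloud SQL) |
| `PG_HOST` in `market_data_collector.py` | market_data_collector | `127.0.0.1` (local) |
There is no single entry point for connection configuration; each process reads its own environment variables, so it is very easy to miss one during a migration.
### P10: Frontend polling load
`/api/rates` is polled every 2 seconds, so server load grows linearly with the number of users. The current 3-second cache provides some buffering, but there is no rate limiting.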
---
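## 📋 Suggested Priorities
| Priority | Task |
|-------|------|
| 🔴 Immediately | Log in to the server, confirm which database address each process actually connects to, and verify that the live-trading path is intact |
| 🔴 Immediately | Add the missing `strategy` field to the `signal_indicators` table |
| 🔴 This week | Unify database connection config so all processes use one set of environment variables |
| 🟡 This week | Fix the duplicate `users` table definition by consolidating on the auth.py version |
| 🟡 This week | Complete `requirements.txt` |
| 🟠 This month | Move hardcoded passwords into a `.env` file kept out of the repository |
| 🔵 Long term | Add unit tests for signal_engine's core logic |
| 🔵 Long term | Set up GitHub Actions for basic lint and security scanning |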

@ -0,0 +1,117 @@
---
generated_by: repo-insight
version: 1
created: 2026-03-03
last_updated: 2026-03-03
source_commit: 0d9dffa
coverage: standard
---
# 00 — System Overview
## Purpose
High-level description of the arbitrage-engine project: what it does, its tech stack, repo layout, and entry points.
## TL;DR
- **Domain**: Crypto perpetual futures funding-rate arbitrage monitoring and short-term trading signal engine.
- **Strategy**: Hold spot long + perpetual short to collect funding rates every 8 h; plus a CVD/ATR-based short-term directional signal engine (V5.x).
- **Backend**: Python / FastAPI + independent PM2 worker processes; PostgreSQL (local + Cloud SQL dual-write).
- **Frontend**: Next.js 16 / React 19 / TypeScript SPA, charting via `lightweight-charts` + Recharts.
- **Targets**: BTC, ETH, XRP, SOL perpetual contracts on Binance USDⓈ-M futures.
- **Deployment**: PM2 process manager on a GCP VM; frontend served via Next.js; backend accessible at `https://arb.zhouyangclaw.com`.
- **Auth**: JWT (access 24 h + refresh 7 d) + invite-code registration gating.
- **Trading modes**: Paper (simulated), Live (Binance Futures testnet or production via `TRADE_ENV`).
## Canonical Facts
### Repo Layout
```
arbitrage-engine/
├── backend/ # Python FastAPI API + all worker processes
│ ├── main.py # FastAPI app entry point (uvicorn)
│ ├── signal_engine.py # V5 signal engine (PM2 worker, 15 s loop)
│ ├── live_executor.py # Live trade executor (PM2 worker)
│ ├── risk_guard.py # Risk circuit-breaker (PM2 worker)
│ ├── market_data_collector.py # Binance WS market data (PM2 worker)
│ ├── agg_trades_collector.py # Binance aggTrades WS collector (PM2 worker)
│ ├── liquidation_collector.py # Binance liquidation WS collector (PM2 worker)
│ ├── signal_pusher.py # Discord signal notifier (PM2 worker)
│ ├── db.py # Dual-pool PostgreSQL layer (psycopg2 sync + asyncpg async)
│ ├── auth.py # JWT auth + invite-code registration router
│ ├── trade_config.py # Symbol / qty precision constants
│ ├── backtest.py # Offline backtest engine
│ ├── paper_monitor.py # Paper trade monitoring helper
│ ├── admin_cli.py # CLI for invite / user management
│ ├── subscriptions.py # Signal subscription query helper
│ ├── paper_config.json # Paper trading runtime toggle
│ ├── strategies/ # JSON strategy configs (v51_baseline, v52_8signals)
│ ├── ecosystem.dev.config.js # PM2 process definitions
│ └── logs/ # Rotating log files
├── frontend/ # Next.js app
│ ├── app/ # App Router pages
│ ├── components/ # Reusable UI components
│ ├── lib/api.ts # Typed API client
│ └── lib/auth.tsx # Auth context + token refresh logic
├── docs/ # Documentation (including docs/ai/)
├── scripts/ # Utility scripts
└── signal-engine.log # Live log symlink / output file
```
### Primary Language & Frameworks
| Layer | Technology |
|-------|-----------|
| Backend API | Python 3.x, FastAPI, uvicorn |
| DB access | asyncpg (async), psycopg2 (sync) |
| Frontend | TypeScript, Next.js 16, React 19 |
| Styling | Tailwind CSS v4 |
| Charts | lightweight-charts 5.x, Recharts 3.x |
| Process manager | PM2 (via `ecosystem.dev.config.js`) |
| Database | PostgreSQL (local + Cloud SQL dual-write) |
### Entry Points
| Process | File | Role |
|---------|------|------|
| HTTP API | `backend/main.py` | FastAPI on uvicorn |
| Signal engine | `backend/signal_engine.py` | 15 s indicator loop |
| Trade executor | `backend/live_executor.py` | PG NOTIFY listener → Binance API |
| Risk guard | `backend/risk_guard.py` | 5 s circuit-breaker loop |
| Market data | `backend/market_data_collector.py` | Binance WS → `market_indicators` table |
| aggTrades collector | `backend/agg_trades_collector.py` | Binance WS → `agg_trades` partitioned table |
| Liquidation collector | `backend/liquidation_collector.py` | Binance WS → liquidation tables |
| Signal pusher | `backend/signal_pusher.py` | DB → Discord push |
| Frontend | `frontend/` | Next.js dev/prod server |
### Monitored Symbols
`BTCUSDT`, `ETHUSDT`, `XRPUSDT`, `SOLUSDT` (Binance USDⓈ-M Futures)
### Environment Variables (key ones)
| Variable | Default | Description |
|----------|---------|-------------|
| `PG_HOST` | `127.0.0.1` | Local PG host |
| `PG_DB` | `arb_engine` | Database name |
| `PG_USER` / `PG_PASS` | `arb` / `arb_engine_2026` | PG credentials |
| `CLOUD_PG_HOST` | `10.106.0.3` | Cloud SQL host |
| `CLOUD_PG_ENABLED` | `true` | Enable dual-write |
| `JWT_SECRET` | (testnet default set) | JWT signing key |
| `TRADE_ENV` | `testnet` | `testnet` or `production` |
| `LIVE_STRATEGIES` | `["v52_8signals"]` | Active live trading strategies |
| `RISK_PER_TRADE_USD` | `2` | USD risk per trade |
## Interfaces / Dependencies
- **External API**: Binance USDⓈ-M Futures REST (`https://fapi.binance.com/fapi/v1`) and WebSocket.
- **Discord**: Webhook for signal notifications (via `signal_pusher.py`).
- **CORS origins**: `https://arb.zhouyangclaw.com`, `http://localhost:3000`, `http://localhost:3001`.
## Unknowns & Risks
- [inference] PM2 `ecosystem.dev.config.js` not read in this pass; exact process restart policies and env injection not confirmed.
- [inference] `.env` file usage confirmed via `python-dotenv` calls in live modules, but `.env.example` absent.
- [unknown] Deployment pipeline (CI/CD) not present in repo.
## Source Refs
- `backend/main.py:1-27` — FastAPI app init, CORS, SYMBOLS
- `backend/signal_engine.py:1-16` — V5 architecture docstring
- `backend/live_executor.py:1-10` — live executor architecture comment
- `backend/risk_guard.py:1-12` — risk guard circuit-break rules
- `backend/db.py:14-30` — PG/Cloud SQL env config
- `frontend/package.json` — frontend dependencies
- `frontend/lib/api.ts:1-116` — typed API client

@ -0,0 +1,137 @@
---
generated_by: repo-insight
version: 1
created: 2026-03-03
last_updated: 2026-03-03
source_commit: 0d9dffa
coverage: standard
---
# 01 — Architecture Map
## Purpose
Describes the architecture style, component relationships, data flow, and runtime execution topology of the arbitrage engine.
## TL;DR
- **Multi-process architecture**: each concern is a separate PM2 process; they communicate exclusively through PostgreSQL (tables + NOTIFY/LISTEN).
- **No message broker**: PostgreSQL serves as both the data store and the inter-process message bus (`NOTIFY new_signal`).
- **Dual-database write**: every PG write in `signal_engine.py` and `agg_trades_collector.py` attempts a secondary write to Cloud SQL (GCP) for durability.
- **FastAPI is read-only at runtime**: it proxies Binance REST for rates/history and reads the PG tables written by workers; it does not control the signal engine.
- **Signal pipeline**: raw aggTrades → in-memory rolling windows (CVD/VWAP/ATR) → scored signal → PG write + `NOTIFY` → live_executor executes Binance order.
- **Frontend polling**: React SPA polls `/api/rates` every 2 s (public) and slow endpoints every 120 s (auth required).
- **Risk guard is a separate process**: polls every 5 s, can block new orders (circuit-break) by writing a flag to `live_config`; live_executor reads that flag before each trade.
## Canonical Facts
### Architecture Style
Shared-DB multi-process monolith. No microservices; no message broker. All processes run on a single GCP VM.
### Component Diagram (text)
```
Binance WS (aggTrades)
  └─► agg_trades_collector.py ────► agg_trades (partitioned table)
Binance WS (market data)
  └─► market_data_collector.py ───► market_indicators table
Binance WS (liquidations)
  └─► liquidation_collector.py ───► liquidation tables
                  │
                  ▼
signal_engine.py (15 s loop, reads agg_trades + market_indicators)
  ├─► signal_indicators, signal_indicators_1m, signal_trades
  ├─► paper_trades (paper mode)
  └─► NOTIFY new_signal
                  │
                  ▼
live_executor.py (LISTEN new_signal → Binance Futures API)
  └─► live_trades table
                  │
                  ▼
risk_guard.py (5 s): monitors live_trades, writes live_config flags

signal_pusher.py (reads signal_indicators → Discord webhook)

FastAPI main.py (read/proxy) + rate_snapshots (2 s write)
  └─► Next.js Frontend (polling SPA)
```
### Data Flow — Signal Pipeline
1. `agg_trades_collector.py`: streams `aggTrade` WS events for all symbols, batch-inserts into `agg_trades` partitioned table (partitioned by month on `time_ms`).
2. `signal_engine.py` (15 s loop per symbol):
- Cold-start: reads last N rows from `agg_trades` to warm up `TradeWindow` (CVD, VWAP) and `ATRCalculator` deques.
- Fetches new trades since `last_agg_id`.
- Feeds trades into three `TradeWindow` instances (30 m, 4 h, 24 h) and one `ATRCalculator` (5 m candles, 14-period).
- Reads `market_indicators` for long-short ratio, OI, coinbase premium, funding rate, liquidations.
- Scores signal using JSON strategy config weights (score 0-100, threshold 75).
- Writes to `signal_indicators` (15 s cadence) and `signal_indicators_1m` (1 m cadence).
- If score ≥ threshold: opens paper trade (if enabled), emits `NOTIFY new_signal` (if live enabled).
3. `live_executor.py`: `LISTEN new_signal` on PG; deserializes payload; calls Binance Futures REST to place market order; writes to `live_trades`.
4. `risk_guard.py`: every 5 s checks daily loss, consecutive losses, unrealized PnL, balance, data freshness, hold timeout; sets `live_config.circuit_break` flag to block/resume new orders.
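
The rolling-window step in the pipeline above can be illustrated with a deque-based window that maintains cumulative volume delta (CVD) and VWAP incrementally. This is a sketch, not the `TradeWindow` class from `signal_engine.py`: the class name, field layout, and eviction rule are assumptions, and the buy/sell split relies on Binance's `is_buyer_maker` flag (a maker buyer means the taker sold).

```python
from collections import deque


class RollingCvdWindow:
    """Rolling cumulative volume delta and VWAP over the last `window_ms` milliseconds."""

    def __init__(self, window_ms: int) -> None:
        self.window_ms = window_ms
        self._trades: deque = deque()  # (time_ms, price, qty, is_buyer_maker)
        self._buy_qty = 0.0
        self._sell_qty = 0.0
        self._notional = 0.0

    def add(self, time_ms: int, price: float, qty: float, is_buyer_maker: bool) -> None:
        """Add one trade, then evict trades older than the window."""
        self._trades.append((time_ms, price, qty, is_buyer_maker))
        if is_buyer_maker:
            self._sell_qty += qty  # taker was the seller
        else:
            self._buy_qty += qty   # taker was the buyer
        self._notional += price * qty
        self._evict(time_ms)

    def _evict(self, now_ms: int) -> None:
        cutoff = now_ms - self.window_ms
        while self._trades and self._trades[0][0] < cutoff:
            _, price, qty, is_buyer_maker = self._trades.popleft()
            if is_buyer_maker:
                self._sell_qty -= qty
            else:
                self._buy_qty -= qty
            self._notional -= price * qty

    @property
    def cvd(self) -> float:
        """Taker buy volume minus taker sell volume inside the window."""
        return self._buy_qty - self._sell_qty

    @property
    def vwap(self) -> float:
        """Volume-weighted average price inside the window (0.0 when empty)."""
        total = self._buy_qty + self._sell_qty
        return self._notional / total if total > 0 else 0.0
```

Keeping running sums makes each update O(1) amortized, at the cost of slow floating-point drift over long runs; periodically recomputing the sums from the deque is one way to bound that drift.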
### Strategy Scoring (V5.x)
Two JSON configs in `backend/strategies/`:
| Config | Version | Threshold | Signals |
|--------|---------|-----------|---------|
| `v51_baseline.json` | 5.1 | 75 | cvd, p99, accel, ls_ratio, oi, coinbase_premium |
| `v52_8signals.json` | 5.2 | 75 | cvd, p99, accel, ls_ratio, oi, coinbase_premium, funding_rate, liquidation |
Score categories: `direction` (CVD), `crowding` (P99 large trades), `environment` (ATR/VWAP), `confirmation` (LS ratio, OI), `auxiliary` (coinbase premium), `funding_rate`, `liquidation`.
### TP/SL Configuration
- V5.1: SL=1.4×ATR, TP1=1.05×ATR, TP2=2.1×ATR
- V5.2: SL=2.1×ATR, TP1=1.4×ATR, TP2=3.15×ATR
- Signal cooldown: 10 minutes per symbol per direction.
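
As a worked example of the multipliers above, the sketch below turns an entry price and an ATR value into stop-loss and take-profit prices. It assumes the multipliers are applied as symmetric offsets from the entry price and that direction is expressed as `"LONG"` or `"SHORT"`; `Levels`, `compute_levels`, and the `"v51"`/`"v52"` keys are hypothetical names.

```python
from dataclasses import dataclass

# Multipliers quoted from the list above, in units of ATR.
ATR_MULTIPLIERS = {
    "v51": {"sl": 1.4, "tp1": 1.05, "tp2": 2.1},
    "v52": {"sl": 2.1, "tp1": 1.4, "tp2": 3.15},
}


@dataclass(frozen=True)
class Levels:
    sl: float
    tp1: float
    tp2: float


def compute_levels(entry: float, atr: float, direction: str, version: str) -> Levels:
    """Price levels for a position; assumes offsets are measured from the entry price."""
    if direction not in ("LONG", "SHORT"):
        raise ValueError(f"unknown direction {direction!r}")
    m = ATR_MULTIPLIERS[version]
    sign = 1.0 if direction == "LONG" else -1.0
    return Levels(
        sl=entry - sign * m["sl"] * atr,    # stop sits against the trade
        tp1=entry + sign * m["tp1"] * atr,  # targets sit with the trade
        tp2=entry + sign * m["tp2"] * atr,
    )
```

For a long V5.1 entry at 100 with ATR 2, this gives a stop near 97.2 and targets near 102.1 and 104.2, so TP1 sits at 0.75R and TP2 at 1.5R relative to the stop distance.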
### Risk Guard Circuit-Break Rules
| Rule | Threshold | Action |
|------|-----------|--------|
| Daily loss | -5R | Full close + shutdown |
| Consecutive losses | 5 | Pause 60 min |
| API disconnect | >30 s | Pause new orders |
| Balance too low | < risk×2 | Reject new orders |
| Data stale | >30 s | Block new orders |
| Hold timeout yellow | 45 min | Alert |
| Hold timeout auto-close | 70 min | Force close |
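
The two hold-timeout rows of the table can be expressed as a small classifier. This is a sketch of the rule as tabulated, not the body of `check_hold_timeout()`; the function name, the return labels, and the choice to treat each threshold as inclusive are assumptions.

```python
YELLOW_AFTER_MIN = 45       # alert threshold from the table
FORCE_CLOSE_AFTER_MIN = 70  # auto-close threshold from the table


def hold_timeout_action(entry_ms: int, now_ms: int) -> str:
    """Return 'ok', 'alert', or 'force_close' for a position opened at entry_ms."""
    held_min = (now_ms - entry_ms) / 60_000
    if held_min >= FORCE_CLOSE_AFTER_MIN:
        return "force_close"
    if held_min >= YELLOW_AFTER_MIN:
        return "alert"
    return "ok"
```

Checking the stricter threshold first keeps the function correct even if the two limits are later made configurable and set close together.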
### Frontend Architecture
- **Next.js App Router** (`frontend/app/`): page-per-route, all pages are client components (`"use client"`).
- **Auth**: JWT stored in `localStorage`; `lib/auth.tsx` provides `useAuth()` hook + `authFetch()` helper with auto-refresh.
- **API client**: `lib/api.ts` — typed wrapper, distinguishes public (`/api/rates`, `/api/health`) from protected (all other) endpoints.
- **Polling strategy**: rates every 2 s, slow data (stats, history, signals) every 120 s; kline charts re-render every 30 s.
## Interfaces / Dependencies
- PG NOTIFY channel name: `new_signal`
- `live_config` table keys: `risk_per_trade_usd`, `max_positions`, `circuit_break` (inferred)
- `market_indicators` populated by `market_data_collector.py` with types: `long_short_ratio`, `top_trader_position`, `open_interest_hist`, `coinbase_premium`, `funding_rate`
## Unknowns & Risks
- [inference] PM2 config (`ecosystem.dev.config.js`) not read; exact restart/watch/env-file settings unknown.
- [inference] `signal_pusher.py` exact Discord webhook configuration (env var name, rate limit handling) not confirmed.
- [unknown] Cloud SQL write failure does not block signal_engine but may create data divergence between local PG and Cloud SQL.
- [risk] Hardcoded testnet credentials in source code (`arb_engine_2026`); production requires explicit env var override.
## Source Refs
- `backend/signal_engine.py:1-16` — architecture docstring
- `backend/live_executor.py:1-10` — executor architecture comment
- `backend/risk_guard.py:1-12, 55-73` — risk rules and config
- `backend/signal_engine.py:170-245`: `TradeWindow`, `ATRCalculator` classes
- `backend/signal_engine.py:44-67` — strategy config loading
- `backend/strategies/v51_baseline.json`, `backend/strategies/v52_8signals.json`
- `backend/main.py:61-83` — background snapshot loop
- `frontend/lib/api.ts:103-116` — API client methods
- `frontend/app/page.tsx:149-154` — polling intervals

@ -0,0 +1,218 @@
---
generated_by: repo-insight
version: 1
created: 2026-03-03
last_updated: 2026-03-03
source_commit: 0d9dffa
coverage: standard
---
# 02 — Module Cheatsheet
## Purpose
Module-by-module index: file path, role, key public interfaces, and dependencies.
## TL;DR
- Backend has 20 Python modules; signal_engine.py is the largest and most complex (~1000+ lines).
- Frontend has 2 TypeScript lib files + 9 pages + 6 components.
- `db.py` is the only shared infrastructure module; all other backend modules import from it.
- `signal_engine.py` is the core business logic module; `live_executor.py` and `risk_guard.py` are independent processes that only use `db.py` and direct PG connections.
- Strategy configs are external JSON; no code changes needed to tune weights/thresholds.
## Canonical Facts
### Backend Modules
#### `backend/main.py` — FastAPI HTTP API
- **Role**: Primary HTTP API server; rate/snapshot/history proxy; aggTrade query endpoints; signal history.
- **Key interfaces**:
- `GET /api/health` — liveness check (public)
- `GET /api/rates` — live Binance premiumIndex for 4 symbols (public, 3 s cache)
- `GET /api/snapshots` — rate snapshot history from PG (auth required)
- `GET /api/kline` — candlestick bars aggregated from `rate_snapshots` (auth required)
- `GET /api/stats` — 7-day funding rate stats per symbol (auth required, 60 s cache)
- `GET /api/stats/ytd` — YTD annualized stats (auth required, 3600 s cache)
- `GET /api/history` — 7-day raw funding rate history (auth required, 60 s cache)
- `GET /api/signals/history`: reads the `signal_logs` table (auth required)
- `GET /api/trades/meta`: reads `agg_trades_meta` (auth required)
- `GET /api/trades/summary` — aggregated OHLCV from `agg_trades` (auth required)
- Many more: paper trades, signals v52, live trades, live config, position sync, etc.
- **Deps**: `auth.py`, `db.py`, `httpx`, `asyncio`
- **Background task**: `background_snapshot_loop()` writes `rate_snapshots` every 2 s.
#### `backend/signal_engine.py` — V5 Signal Engine (PM2 worker)
- **Role**: Core signal computation loop; 15 s interval; in-memory rolling-window indicators; scored signal output.
- **Key classes**:
- `TradeWindow(window_ms)` — rolling CVD/VWAP calculator using `deque`; props: `cvd`, `vwap`
- `ATRCalculator(period_ms, length)` — 5-min candle ATR; props: `atr`, `atr_percentile`
- `SymbolState` — per-symbol state container holding `TradeWindow` ×3, `ATRCalculator`, large-order percentile deques
- **Key functions**:
- `load_strategy_configs() -> list[dict]` — reads JSON files from `strategies/`
- `fetch_market_indicators(symbol) -> dict` — reads `market_indicators` table
- `fetch_new_trades(symbol, last_id) -> list` — reads new rows from `agg_trades`
- `save_indicator(ts, symbol, result, strategy)` — writes to `signal_indicators`
- `paper_open_trade(...)` — inserts `paper_trades` row
- `paper_check_positions(symbol, price, now_ms)` — checks TP/SL for paper positions
- `main()` — entry point; calls `load_historical()` then enters main loop
- **Deps**: `db.py`, `json`, `collections.deque`
#### `backend/live_executor.py` — Live Trade Executor (PM2 worker)
- **Role**: Listens on PG `NOTIFY new_signal`; places Binance Futures market orders; writes `live_trades`.
- **Key functions**:
- `reload_live_config(conn)` — refreshes `RISK_PER_TRADE_USD`, `MAX_POSITIONS` from `live_config` every 60 s
- `binance_request(session, method, path, params)` — HMAC-signed Binance API call
- **Config**: `TRADE_ENV` (`testnet`/`production`), `LIVE_STRATEGIES`, `RISK_PER_TRADE_USD`, `MAX_POSITIONS`
- **Deps**: `psycopg2`, `aiohttp`, HMAC signing
#### `backend/risk_guard.py` — Risk Circuit-Breaker (PM2 worker)
- **Role**: Every 5 s; monitors PnL, balance, data freshness, hold timeouts; writes circuit-break flags.
- **Key classes**: `RiskState` — holds `status` (`normal`/`warning`/`circuit_break`), loss counters
- **Key functions**:
- `check_daily_loss(conn)` — sums `pnl_r` from today's `live_trades`
- `check_unrealized_loss(session, risk_usd_dynamic)` — queries Binance positions API
- `check_balance(session)` — queries Binance account balance
- `check_data_freshness(conn)` — checks `market_indicators` recency
- `check_hold_timeout(session, conn)` — force-closes positions held >70 min
- `trigger_circuit_break(session, conn, reason, action)` — writes to `live_events`, may flat positions
- `check_auto_resume()` — re-enables trading after cooldown
- `check_emergency_commands(session, conn)` — watches for manual DB commands
- **Deps**: `trade_config.py`, `aiohttp`, `psycopg2`
#### `backend/db.py` — Database Layer
- **Role**: All PG connectivity; schema creation; partition management.
- **Key interfaces**:
- Sync (psycopg2): `get_sync_conn()`, `sync_execute()`, `sync_executemany()`
- Async (asyncpg): `async_fetch()`, `async_fetchrow()`, `async_execute()`
- Cloud SQL sync pool: `get_cloud_sync_conn()` (non-fatal on failure)
- `init_schema()` — creates all tables from `SCHEMA_SQL`
- `ensure_partitions()` — creates `agg_trades_YYYYMM` partitions for current+next 2 months
- **Deps**: `asyncpg`, `psycopg2`
#### `backend/auth.py` — JWT Auth + Registration
- **Role**: FastAPI router at `/api`; register/login/refresh/logout endpoints.
- **Key interfaces**:
- `POST /api/register` — invite-code gated registration
- `POST /api/login` — returns `access_token` + `refresh_token`
- `POST /api/refresh` — token refresh
- `POST /api/logout` — revokes refresh token
- `GET /api/me` — current user info
- `get_current_user` — FastAPI `Depends` injector; validates Bearer JWT
- **Token storage**: HMAC-SHA256 hand-rolled JWT (no PyJWT); refresh tokens stored in `refresh_tokens` table.
- **Deps**: `db.py`, `hashlib`, `hmac`, `secrets`
#### `backend/agg_trades_collector.py` — AggTrades Collector (PM2 worker)
- **Role**: Streams Binance `aggTrade` WebSocket events; batch-inserts into `agg_trades` partitioned table; maintains `agg_trades_meta`.
- **Key functions**: `ws_collect(symbol)`, `rest_catchup(symbol, from_id)`, `continuity_check()`, `flush_buffer(symbol, trades)`
- **Deps**: `db.py`, `websockets`/`httpx`
#### `backend/market_data_collector.py` — Market Data Collector (PM2 worker)
- **Role**: Collects Binance market indicators (LS ratio, OI, coinbase premium, funding rate) via REST polling; stores in `market_indicators` JSONB.
- **Key class**: `MarketDataCollector`
- **Deps**: `db.py`, `httpx`
#### `backend/liquidation_collector.py` — Liquidation Collector (PM2 worker)
- **Role**: Streams Binance liquidation WS; aggregates into `liquidation_events` and `liquidation_agg` tables.
- **Key functions**: `ensure_table()`, `save_liquidation()`, `save_aggregated()`, `run()`
- **Deps**: `db.py`, `websockets`
#### `backend/backtest.py` — Offline Backtester
- **Role**: Replays `agg_trades` from PG to simulate signal engine and measure strategy performance.
- **Key classes**: `Position`, `BacktestEngine`
- **Key functions**: `load_trades()`, `run_backtest()`, `main()`
- **Deps**: `db.py`
#### `backend/trade_config.py` — Symbol / Qty Config
- **Role**: Constants for symbols and Binance qty precision.
- **Deps**: none
#### `backend/admin_cli.py` — Admin CLI
- **Role**: CLI for invite-code and user management (gen_invite, list_invites, ban_user, set_admin).
- **Deps**: `db.py`
#### `backend/subscriptions.py` — Subscription Query Helper
- **Role**: Helpers for querying signal history (used internally).
- **Deps**: `db.py`
#### `backend/paper_monitor.py` — Paper Trade Monitor
- **Role**: Standalone script to print paper trade status.
- **Deps**: `db.py`
#### `backend/signal_pusher.py` — Discord Notifier (PM2 worker)
- **Role**: Polls `signal_indicators` for high-score events; pushes Discord webhook notifications.
- **Deps**: `db.py`, `httpx`
#### `backend/position_sync.py` — Position Sync
- **Role**: Syncs live positions between `live_trades` table and Binance account state.
- **Deps**: `db.py`, `aiohttp`
#### `backend/fix_historical_pnl.py` — PnL Fix Script
- **Role**: One-time migration to recalculate historical PnL in `paper_trades`.
- **Deps**: `db.py`
### Frontend Modules
#### `frontend/lib/api.ts` — API Client
- **Role**: Typed `api` object with all backend endpoint wrappers; distinguishes public vs. protected fetches.
- **Interfaces exported**: `RateData`, `RatesResponse`, `HistoryPoint`, `HistoryResponse`, `StatsResponse`, `SignalHistoryItem`, `SnapshotItem`, `KBar`, `KlineResponse`, `YtdStatsResponse`, `api` object
- **Deps**: `lib/auth.tsx` (`authFetch`)
#### `frontend/lib/auth.tsx` — Auth Context
- **Role**: React context for current user; `useAuth()` hook; `authFetch()` with access-token injection and auto-refresh.
- **Deps**: Next.js router, `localStorage`
#### `frontend/app/` Pages
| Page | Route | Description |
|------|-------|-------------|
| `page.tsx` | `/` | Main dashboard: rates, kline, history, signal log |
| `dashboard/page.tsx` | `/dashboard` | (inferred) extended dashboard |
| `signals/page.tsx` | `/signals` | Signal history view (V5.1) |
| `signals-v52/page.tsx` | `/signals-v52` | Signal history view (V5.2) |
| `paper/page.tsx` | `/paper` | Paper trades view (V5.1) |
| `paper-v52/page.tsx` | `/paper-v52` | Paper trades view (V5.2) |
| `live/page.tsx` | `/live` | Live trades view |
| `history/page.tsx` | `/history` | Funding rate history |
| `kline/page.tsx` | `/kline` | Kline chart page |
| `trades/page.tsx` | `/trades` | aggTrades summary |
| `server/page.tsx` | `/server` | Server status / metrics |
| `about/page.tsx` | `/about` | About page |
| `login/page.tsx` | `/login` | Login form |
| `register/page.tsx` | `/register` | Registration form |
#### `frontend/components/`
| Component | Role |
|-----------|------|
| `Navbar.tsx` | Top navigation bar |
| `Sidebar.tsx` | Sidebar navigation |
| `AuthHeader.tsx` | Auth-aware header with user info |
| `RateCard.tsx` | Displays current funding rate for one asset |
| `StatsCard.tsx` | Displays 7d mean and annualized stats |
| `FundingChart.tsx` | Funding rate chart component |
| `LiveTradesCard.tsx` | Live trades summary card |
## Interfaces / Dependencies
### Key import graph (backend)
```
main.py → auth.py, db.py
signal_engine.py → db.py
live_executor.py → psycopg2 direct (no db.py module import)
risk_guard.py → trade_config.py, psycopg2 direct
backtest.py → db.py
agg_trades_collector.py → db.py
market_data_collector.py → db.py
liquidation_collector.py → db.py
admin_cli.py → db.py
```
## Unknowns & Risks
- [inference] Content of `frontend/app/dashboard/`, `signals/`, `paper/`, `live/` pages not read; role described from filename convention.
- [unknown] `signal_pusher.py` Discord webhook env var name not confirmed.
- [inference] `position_sync.py` exact interface not read; role inferred from name and listing.
## Source Refs
- `backend/main.py` — all API route definitions
- `backend/signal_engine.py:170-285` — `TradeWindow`, `ATRCalculator`, `SymbolState`
- `backend/auth.py:23` — router prefix `/api`
- `backend/db.py:35-157` — all public DB functions
- `frontend/lib/api.ts:103-116` — `api` export object
- `frontend/lib/auth.tsx` — auth context (not fully read)

docs/ai/03-api-contracts.md Normal file

@ -0,0 +1,242 @@
---
generated_by: repo-insight
version: 1
created: 2026-03-03
last_updated: 2026-03-03
source_commit: 0d9dffa
coverage: standard
---
# 03 — API Contracts
## Purpose
Documents all REST API endpoints, authentication requirements, request/response shapes, and error conventions.
## TL;DR
- Base URL: `https://arb.zhouyangclaw.com` (prod) or `http://localhost:8000` (local).
- Auth: Bearer JWT in `Authorization` header. Two public endpoints (`/api/health`, `/api/rates`) need no token.
- Token lifecycle: access token 24 h, refresh token 7 d; use `POST /api/refresh` to renew.
- Registration is invite-code gated: must supply a valid `invite_code` in register body.
- All timestamps are Unix epoch (seconds or ms depending on field; see per-endpoint notes).
- Funding rates are stored as decimals (e.g. `0.0001` = 0.01%). Frontend multiplies by 10000 for "万分之" display.
- Error responses: standard FastAPI `{"detail": "..."}` with appropriate HTTP status codes.
## Canonical Facts
### Authentication
#### `POST /api/register`
```json
// Request
{
"email": "user@example.com",
"password": "...",
"invite_code": "XXXX"
}
// Response 200
{
"access_token": "<jwt>",
"refresh_token": "<token>",
"token_type": "bearer",
"user": { "id": 1, "email": "...", "role": "user" }
}
// Errors: 400 (invite invalid/expired), 409 (email taken)
```
#### `POST /api/login`
```json
// Request
{
"email": "user@example.com",
"password": "..."
}
// Response 200
{
"access_token": "<jwt>",
"refresh_token": "<token>",
"token_type": "bearer",
"user": { "id": 1, "email": "...", "role": "user" }
}
// Errors: 401 (invalid credentials), 403 (banned)
```
#### `POST /api/refresh`
```json
// Request
{ "refresh_token": "<token>" }
// Response 200
{ "access_token": "<new_jwt>", "token_type": "bearer" }
// Errors: 401 (expired/revoked)
```
#### `POST /api/logout`
```json
// Request header: Authorization: Bearer <access_token>
// Request body: { "refresh_token": "<token>" }
// Response 200: { "ok": true }
```
#### `GET /api/me`
```json
// Auth required
// Response 200
{ "id": 1, "email": "...", "role": "user", "created_at": "..." }
```
### Public Endpoints (no auth)
#### `GET /api/health`
```json
{ "status": "ok", "timestamp": "2026-03-03T12:00:00" }
```
#### `GET /api/rates`
Returns live Binance premiumIndex for BTCUSDT, ETHUSDT, XRPUSDT, SOLUSDT. Cached 3 s.
```json
{
"BTC": {
"symbol": "BTCUSDT",
"markPrice": 65000.0,
"indexPrice": 64990.0,
"lastFundingRate": 0.0001,
"nextFundingTime": 1234567890000,
"timestamp": 1234567890000
},
"ETH": { ... },
"XRP": { ... },
"SOL": { ... }
}
```
### Protected Endpoints (Bearer JWT required)
#### `GET /api/history`
7-day funding rate history from Binance. Cached 60 s.
```json
{
"BTC": [
{ "fundingTime": 1234567890000, "fundingRate": 0.0001, "timestamp": "2026-03-01T08:00:00" }
],
"ETH": [ ... ], "XRP": [ ... ], "SOL": [ ... ]
}
```
#### `GET /api/stats`
7-day funding rate statistics. Cached 60 s.
```json
{
"BTC": { "mean7d": 0.01, "annualized": 10.95, "count": 21 },
"ETH": { ... },
"combo": { "mean7d": 0.009, "annualized": 9.85 }
}
// mean7d is in percent; annualized (%) = mean decimal rate * 3 * 365 * 100, i.e. mean7d * 3 * 365
```
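The arithmetic in the comment above can be checked with a short sketch. The function name `funding_stats` and the rounding precision are assumptions for illustration; the formula (three funding settlements per day, 365 days, result in percent) comes from the note in the response example.

```python
def funding_stats(rates: list[float]) -> dict:
    """Hypothetical helper: summarize decimal funding rates (0.0001 = 0.01%)."""
    if not rates:
        return {"mean7d": 0.0, "annualized": 0.0, "count": 0}
    mean = sum(rates) / len(rates)
    return {
        "mean7d": round(mean * 100, 4),                # percent per settlement
        "annualized": round(mean * 3 * 365 * 100, 2),  # percent per year
        "count": len(rates),
    }
```

With 21 settlements at `0.0001` each, this gives a `mean7d` of about 0.01 and an `annualized` value of about 10.95, matching the BTC row in the example response.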
#### `GET /api/stats/ytd`
Year-to-date annualized stats. Cached 3600 s.
```json
{
"BTC": { "annualized": 12.5, "count": 150 },
"ETH": { ... }
}
```
#### `GET /api/snapshots?hours=24&limit=5000`
Rate snapshots from local PG.
```json
{
"count": 43200,
"hours": 24,
"data": [
{ "ts": 1709000000, "btc_rate": 0.0001, "eth_rate": 0.00008, "btc_price": 65000, "eth_price": 3200 }
]
}
```
#### `GET /api/kline?symbol=BTC&interval=1h&limit=500`
Candlestick bars derived from `rate_snapshots`. Rates scaled by ×10000.
- `interval`: `1m`, `5m`, `30m`, `1h`, `4h`, `8h`, `1d`, `1w`, `1M`
```json
{
"symbol": "BTC",
"interval": "1h",
"count": 24,
"data": [
{
"time": 1709000000,
"open": 1.0, "high": 1.2, "low": 0.8, "close": 1.1,
"price_open": 65000, "price_high": 65500, "price_low": 64800, "price_close": 65200
}
]
}
```
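One way to derive such bars from `rate_snapshots` rows is to bucket snapshots by interval start and track open/high/low/close per bucket. The sketch below is an assumption about the approach, not the code in `main.py`; `build_bars` is a hypothetical name, and it only handles fixed-length intervals (calendar intervals such as `1M` would need different bucketing).

```python
def build_bars(snapshots, interval_s: int, scale: float = 10000.0) -> list[dict]:
    """Aggregate (ts_seconds, decimal_rate) pairs, sorted by ts, into OHLC bars."""
    bars: list[dict] = []
    for ts, rate in snapshots:
        bucket = ts - ts % interval_s  # start of the bar this snapshot belongs to
        value = rate * scale           # same x10000 scaling as the endpoint
        if bars and bars[-1]["time"] == bucket:
            bar = bars[-1]
            bar["high"] = max(bar["high"], value)
            bar["low"] = min(bar["low"], value)
            bar["close"] = value
        else:
            bars.append({"time": bucket, "open": value, "high": value,
                         "low": value, "close": value})
    return bars
```

The price fields (`price_open` and so on) could be built the same way from the `btc_price` / `eth_price` columns.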
#### `GET /api/signals/history?limit=100`
Legacy signal log from `signal_logs` table.
```json
{
"items": [
{ "id": 1, "symbol": "BTCUSDT", "rate": 0.0001, "annualized": 10.95, "sent_at": "2026-03-01T08:00:00", "message": "..." }
]
}
```
#### `GET /api/trades/meta`
aggTrades collection status.
```json
{
"BTC": { "last_agg_id": 123456789, "last_time_ms": 1709000000000, "updated_at": "2026-03-03 12:00:00" }
}
```
#### `GET /api/trades/summary?symbol=BTC&start_ms=0&end_ms=0&interval=1m`
Aggregated OHLCV from `agg_trades` via PG native aggregation.
```json
{
"symbol": "BTC",
"interval": "1m",
"count": 60,
"data": [
{ "bar_ms": 1709000000000, "buy_vol": 10.5, "sell_vol": 9.3, "trade_count": 45, "vwap": 65000.0, "max_qty": 2.5 }
]
}
```
#### Signal V52 Endpoints (inferred from frontend routes)
- `GET /api/signals/v52` — signals for v52_8signals strategy
- `GET /api/paper/trades` — paper trade history
- `GET /api/paper/trades/v52` — v52 paper trade history
- `GET /api/live/trades` — live trade history
- `GET /api/live/config` — current live config
- `GET /api/live/events` — live trading event log
- `GET /api/server/stats` — server process stats (psutil)
### Auth Header Format
```
Authorization: Bearer <access_token>
```
Frontend auto-injects via `authFetch()` in `lib/auth.tsx`. On 401, attempts token refresh before retry.
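The refresh-then-retry flow can be expressed compactly. The real client is TypeScript (`authFetch`); the Python sketch below only mirrors the control flow, and the callables `send` and `refresh` as well as the `tokens` dictionary are hypothetical stand-ins for the HTTP layer and token storage.

```python
def fetch_with_refresh(send, refresh, tokens: dict, path: str):
    """Call an endpoint; on 401 refresh the access token once and retry.

    send(path, access_token) -> (status, body)
    refresh(refresh_token)   -> new access token, or None if refresh failed
    """
    status, body = send(path, tokens["access_token"])
    if status != 401:
        return status, body
    new_access = refresh(tokens["refresh_token"])
    if new_access is None:
        return status, body  # refresh failed: surface the original 401
    tokens["access_token"] = new_access
    return send(path, new_access)  # single retry, no loop
```

Retrying only once avoids an endless loop when the refreshed token is also rejected.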
### Error Shape
All errors follow FastAPI default:
```json
{ "detail": "Human-readable error message" }
```
Common HTTP status codes: 400 (bad request), 401 (unauthorized), 403 (forbidden/banned), 404 (not found), 422 (validation error), 502 (Binance upstream error).
## Interfaces / Dependencies
- Binance USDⓈ-M Futures REST: `https://fapi.binance.com/fapi/v1/premiumIndex`, `/fundingRate`
- CORS allowed origins: `https://arb.zhouyangclaw.com`, `http://localhost:3000`, `http://localhost:3001`
- `NEXT_PUBLIC_API_URL` env var controls the frontend base URL (empty = same-origin)
## Unknowns & Risks
- [inference] The full endpoint list for the signals-v52, paper-v52, live, and server pages is not confirmed: `main.py` was not read past line 300, and more routes exist beyond that point.
- [inference] `POST /api/register` exact field validation (password min length, etc.) not confirmed.
- [risk] No rate limiting visible on public endpoints; `/api/rates` with 3 s cache could be bypassed by direct calls.
## Source Refs
- `backend/main.py:101-298` — all confirmed REST endpoints
- `backend/auth.py:23` — auth router prefix
- `backend/main.py:16-21` — CORS config
- `frontend/lib/api.ts:90-116` — client-side API wrappers
- `frontend/lib/auth.tsx` — `authFetch` with auto-refresh (not fully read)

docs/ai/04-data-model.md Normal file

@ -0,0 +1,301 @@
---
generated_by: repo-insight
version: 1
created: 2026-03-03
last_updated: 2026-03-03
source_commit: 0d9dffa
coverage: standard
---
# 04 — Data Model
## Purpose
Documents all PostgreSQL tables, columns, relations, constraints, storage design, and partitioning strategy.
## TL;DR
- Single PostgreSQL database `arb_engine`; 15+ tables defined in `db.py` `SCHEMA_SQL` + `auth.py` `AUTH_SCHEMA`.
- `agg_trades` is a range-partitioned table (by `time_ms` in milliseconds); monthly partitions auto-created by `ensure_partitions()`.
- Dual-write: local PG is primary; Cloud SQL at `10.106.0.3` receives same writes via a secondary psycopg2 pool (non-fatal if down).
- All timestamps: `ts` columns are Unix seconds (integer); `time_ms` columns are Unix milliseconds (bigint); `created_at` columns are PG `TIMESTAMP`.
- JSONB used for `score_factors` in `paper_trades`/`live_trades`, `detail` in `live_events`, `value` in `market_indicators`.
- Auth tokens stored in DB: refresh tokens in `refresh_tokens` table (revocable); no session table.
## Canonical Facts
### Tables
#### `rate_snapshots` — Funding Rate Snapshots
Populated every 2 s by `background_snapshot_loop()` in `main.py`.
| Column | Type | Description |
|--------|------|-------------|
| `id` | BIGSERIAL PK | |
| `ts` | BIGINT NOT NULL | Unix seconds |
| `btc_rate` | DOUBLE PRECISION | BTC funding rate (decimal) |
| `eth_rate` | DOUBLE PRECISION | ETH funding rate |
| `btc_price` | DOUBLE PRECISION | BTC mark price USD |
| `eth_price` | DOUBLE PRECISION | ETH mark price USD |
| `btc_index_price` | DOUBLE PRECISION | BTC index price |
| `eth_index_price` | DOUBLE PRECISION | ETH index price |
Index: `idx_rate_snapshots_ts` on `ts`.
---
#### `agg_trades` — Aggregate Trades (Partitioned)
Partitioned by `RANGE(time_ms)`; monthly child tables named `agg_trades_YYYYMM`.
| Column | Type | Description |
|--------|------|-------------|
| `agg_id` | BIGINT NOT NULL | Binance aggTrade ID |
| `symbol` | TEXT NOT NULL | e.g. `BTCUSDT` |
| `price` | DOUBLE PRECISION | Trade price |
| `qty` | DOUBLE PRECISION | Trade quantity (BTC/ETH/etc.) |
| `time_ms` | BIGINT NOT NULL | Trade timestamp ms |
| `is_buyer_maker` | SMALLINT | 0=taker buy, 1=taker sell |
PK: `(time_ms, symbol, agg_id)`.
Indexes: `idx_agg_trades_sym_time` on `(symbol, time_ms DESC)`, `idx_agg_trades_sym_agg` on `(symbol, agg_id)`.
Partitions auto-created for current + next 2 months. Named `agg_trades_YYYYMM`.
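The naming and range rules above can be sketched as follows. This is not the body of `ensure_partitions()`: the helper names, the use of UTC month boundaries, and the exact DDL text are assumptions. Only the `agg_trades_YYYYMM` naming, the `time_ms` range key, and the "current + next 2 months" window come from this document.

```python
from datetime import datetime, timezone


def month_partitions(now: datetime, months_ahead: int = 2) -> list[tuple[str, int, int]]:
    """Return (partition_name, start_ms, end_ms) for the current month and the following ones."""
    year, month = now.year, now.month
    out = []
    for _ in range(months_ahead + 1):
        start = datetime(year, month, 1, tzinfo=timezone.utc)
        next_year, next_month = (year + 1, 1) if month == 12 else (year, month + 1)
        end = datetime(next_year, next_month, 1, tzinfo=timezone.utc)
        out.append((
            f"agg_trades_{year:04d}{month:02d}",
            int(start.timestamp() * 1000),  # inclusive lower bound
            int(end.timestamp() * 1000),    # exclusive upper bound
        ))
        year, month = next_year, next_month
    return out


def partition_ddl(name: str, start_ms: int, end_ms: int) -> str:
    """Build an idempotent CREATE statement for one monthly child table."""
    return (
        f"CREATE TABLE IF NOT EXISTS {name} PARTITION OF agg_trades "
        f"FOR VALUES FROM ({start_ms}) TO ({end_ms});"
    )
```

Because each upper bound is exclusive and equals the next lower bound, consecutive partitions neither overlap nor leave gaps.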
---
#### `agg_trades_meta` — Collection State
| Column | Type | Description |
|--------|------|-------------|
| `symbol` | TEXT PK | e.g. `BTCUSDT` |
| `last_agg_id` | BIGINT | Last processed aggTrade ID |
| `last_time_ms` | BIGINT | Timestamp of last trade |
| `earliest_agg_id` | BIGINT | Oldest buffered ID |
| `earliest_time_ms` | BIGINT | Oldest buffered timestamp |
| `updated_at` | TEXT | Human-readable update time |
---
#### `signal_indicators` — Signal Engine Output (15 s cadence)
| Column | Type | Description |
|--------|------|-------------|
| `id` | BIGSERIAL PK | |
| `ts` | BIGINT | Unix seconds |
| `symbol` | TEXT | |
| `cvd_fast` | DOUBLE PRECISION | CVD 30 m window |
| `cvd_mid` | DOUBLE PRECISION | CVD 4 h window |
| `cvd_day` | DOUBLE PRECISION | CVD UTC day |
| `cvd_fast_slope` | DOUBLE PRECISION | CVD momentum |
| `atr_5m` | DOUBLE PRECISION | ATR (5 m candles, 14 periods) |
| `atr_percentile` | DOUBLE PRECISION | ATR rank in 24 h history |
| `vwap_30m` | DOUBLE PRECISION | VWAP 30 m |
| `price` | DOUBLE PRECISION | Current mark price |
| `p95_qty` | DOUBLE PRECISION | P95 large-order threshold |
| `p99_qty` | DOUBLE PRECISION | P99 large-order threshold |
| `buy_vol_1m` | DOUBLE PRECISION | 1 m buy volume |
| `sell_vol_1m` | DOUBLE PRECISION | 1 m sell volume |
| `score` | INTEGER | Signal score 0–100 |
| `signal` | TEXT | `LONG`, `SHORT`, or null |
Indexes: `idx_si_ts`, `idx_si_sym_ts`.
---
#### `signal_indicators_1m` — 1-Minute Signal Snapshot
Subset of `signal_indicators` columns; written at 1 m cadence for lightweight chart queries.
---
#### `signal_trades` — Signal Engine Trade Tracking
| Column | Type | Description |
|--------|------|-------------|
| `id` | BIGSERIAL PK | |
| `ts_open` | BIGINT | Open timestamp (Unix s) |
| `ts_close` | BIGINT | Close timestamp |
| `symbol` | TEXT | |
| `direction` | TEXT | `LONG` / `SHORT` |
| `entry_price` | DOUBLE PRECISION | |
| `exit_price` | DOUBLE PRECISION | |
| `qty` | DOUBLE PRECISION | |
| `score` | INTEGER | Signal score at entry |
| `pnl` | DOUBLE PRECISION | Realized PnL |
| `sl_price` | DOUBLE PRECISION | Stop-loss level |
| `tp1_price` | DOUBLE PRECISION | Take-profit 1 level |
| `tp2_price` | DOUBLE PRECISION | Take-profit 2 level |
| `status` | TEXT DEFAULT `open` | `open`, `closed`, `stopped` |
---
#### `paper_trades` — Paper Trading Records
| Column | Type | Description |
|--------|------|-------------|
| `id` | BIGSERIAL PK | |
| `symbol` | TEXT | |
| `direction` | TEXT | `LONG`/`SHORT` |
| `score` | INT | Signal score |
| `tier` | TEXT | `light`/`standard`/`heavy` |
| `entry_price` | DOUBLE PRECISION | |
| `entry_ts` | BIGINT | Unix ms |
| `exit_price` | DOUBLE PRECISION | |
| `exit_ts` | BIGINT | |
| `tp1_price` | DOUBLE PRECISION | |
| `tp2_price` | DOUBLE PRECISION | |
| `sl_price` | DOUBLE PRECISION | |
| `tp1_hit` | BOOLEAN DEFAULT FALSE | |
| `status` | TEXT DEFAULT `active` | `active`, `tp1`, `tp2`, `sl`, `timeout` |
| `pnl_r` | DOUBLE PRECISION | PnL in R units |
| `atr_at_entry` | DOUBLE PRECISION | ATR snapshot at entry |
| `score_factors` | JSONB | Breakdown of signal score components |
| `strategy` | VARCHAR(32) DEFAULT `v51_baseline` | Strategy name |
| `created_at` | TIMESTAMP | |
---
#### `live_trades` — Live Trading Records
| Column | Type | Description |
|--------|------|-------------|
| `id` | BIGSERIAL PK | |
| `symbol` | TEXT | |
| `strategy` | TEXT | |
| `direction` | TEXT | `LONG`/`SHORT` |
| `status` | TEXT DEFAULT `active` | |
| `entry_price` / `exit_price` | DOUBLE PRECISION | |
| `entry_ts` / `exit_ts` | BIGINT | Unix ms |
| `sl_price`, `tp1_price`, `tp2_price` | DOUBLE PRECISION | |
| `tp1_hit` | BOOLEAN | |
| `score` | DOUBLE PRECISION | |
| `tier` | TEXT | |
| `pnl_r` | DOUBLE PRECISION | |
| `fee_usdt` | DOUBLE PRECISION | Exchange fees |
| `funding_fee_usdt` | DOUBLE PRECISION | Funding fees paid while holding |
| `risk_distance` | DOUBLE PRECISION | Entry to SL distance |
| `atr_at_entry` | DOUBLE PRECISION | |
| `score_factors` | JSONB | |
| `signal_id` | BIGINT | FK → signal_indicators.id |
| `binance_order_id` | TEXT | Binance order ID |
| `fill_price` | DOUBLE PRECISION | Actual fill price |
| `slippage_bps` | DOUBLE PRECISION | Slippage in basis points |
| `protection_gap_ms` | BIGINT | Time between SL order and fill |
| `signal_to_order_ms` | BIGINT | Latency: signal → order placed |
| `order_to_fill_ms` | BIGINT | Latency: order → fill |
| `qty` | DOUBLE PRECISION | |
| `created_at` | TIMESTAMP | |
---
#### `live_config` — Runtime Configuration KV Store
| Column | Type | Description |
|--------|------|-------------|
| `key` | TEXT PK | Config key |
| `value` | TEXT | Config value (string) |
| `label` | TEXT | Human label |
| `updated_at` | TIMESTAMP | |
Known keys: `risk_per_trade_usd`, `max_positions`, `circuit_break` (inferred).
---
#### `live_events` — Trade Event Log
| Column | Type | Description |
|--------|------|-------------|
| `id` | BIGSERIAL PK | |
| `ts` | BIGINT | Unix ms (default: NOW()) |
| `level` | TEXT | `info`/`warning`/`error` |
| `category` | TEXT | Event category |
| `symbol` | TEXT | |
| `message` | TEXT | |
| `detail` | JSONB | Structured event data |
---
#### `signal_logs` — Legacy Signal Log
Kept for backwards compatibility with the original funding-rate signal system.
| Column | Type |
|--------|------|
| `id` | BIGSERIAL PK |
| `symbol` | TEXT |
| `rate` | DOUBLE PRECISION |
| `annualized` | DOUBLE PRECISION |
| `sent_at` | TEXT |
| `message` | TEXT |
---
#### Auth Tables (defined in `auth.py` AUTH_SCHEMA)
**`users`**
| Column | Type |
|--------|------|
| `id` | BIGSERIAL PK |
| `email` | TEXT UNIQUE NOT NULL |
| `password_hash` | TEXT NOT NULL |
| `discord_id` | TEXT |
| `role` | TEXT DEFAULT `user` |
| `banned` | INTEGER DEFAULT 0 |
| `created_at` | TEXT |
**`subscriptions`**
| Column | Type |
|--------|------|
| `user_id` | BIGINT PK → users |
| `tier` | TEXT DEFAULT `free` |
| `expires_at` | TEXT |
**`invite_codes`**
| Column | Type |
|--------|------|
| `id` | BIGSERIAL PK |
| `code` | TEXT UNIQUE |
| `created_by` | INTEGER |
| `max_uses` | INTEGER DEFAULT 1 |
| `used_count` | INTEGER DEFAULT 0 |
| `status` | TEXT DEFAULT `active` |
| `expires_at` | TEXT |
**`invite_usage`**
| Column | Type |
|--------|------|
| `id` | BIGSERIAL PK |
| `code_id` | BIGINT → invite_codes |
| `user_id` | BIGINT → users |
| `used_at` | TEXT |
**`refresh_tokens`**
| Column | Type |
|--------|------|
| `id` | BIGSERIAL PK |
| `user_id` | BIGINT → users |
| `token` | TEXT UNIQUE |
| `expires_at` | TEXT |
| `revoked` | INTEGER DEFAULT 0 |
---
#### `market_indicators` — Market Indicator JSONB Store
Populated by `market_data_collector.py`.
| Column | Type | Description |
|--------|------|-------------|
| `symbol` | TEXT | |
| `indicator_type` | TEXT | `long_short_ratio`, `top_trader_position`, `open_interest_hist`, `coinbase_premium`, `funding_rate` |
| `timestamp_ms` | BIGINT | |
| `value` | JSONB | Raw indicator payload |
Query pattern: `WHERE symbol=? AND indicator_type=? ORDER BY timestamp_ms DESC LIMIT 1`.
### Storage Design Decisions
- **Partitioning**: `agg_trades` partitioned by month to avoid table bloat; partition maintenance is automated.
- **Dual-write**: Cloud SQL secondary is best-effort (errors logged, never fatal).
- **JSONB `score_factors`**: allows schema-free storage of per-strategy signal breakdowns without migrations.
- **Timestamps**: mix of Unix seconds (`ts`), Unix ms (`time_ms`, `timestamp_ms`, `entry_ts`), ISO strings (`created_at` TEXT in auth tables), and PG `TIMESTAMP`; be careful when querying across tables.
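Because of the mixed timestamp representations, a small normalization helper can make cross-table comparisons safer. The sketch below is illustrative only: `to_utc` is a hypothetical name, and treating naive strings and naive `TIMESTAMP` values as UTC is an assumption that should be verified against the server's timezone settings.

```python
from datetime import datetime, timezone


def to_utc(value, unit: str = "s") -> datetime:
    """Normalize Unix seconds, Unix ms, ISO-like text, or datetime to aware UTC."""
    if isinstance(value, datetime):
        # Assumption: naive datetimes are already UTC.
        return value.astimezone(timezone.utc) if value.tzinfo else value.replace(tzinfo=timezone.utc)
    if isinstance(value, str):
        parsed = datetime.fromisoformat(value)
        return parsed.astimezone(timezone.utc) if parsed.tzinfo else parsed.replace(tzinfo=timezone.utc)
    seconds = value / 1000 if unit == "ms" else value
    return datetime.fromtimestamp(seconds, tz=timezone.utc)
```

For example, `to_utc(row_ts)` suits `ts` columns, while `to_utc(entry_ts, "ms")` suits `entry_ts`, `time_ms`, and `timestamp_ms`.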
## Interfaces / Dependencies
- `db.py:init_schema()` — creates all tables in `SCHEMA_SQL`
- `auth.py:ensure_tables()` — creates auth tables from `AUTH_SCHEMA`
- `db.py:ensure_partitions()` — auto-creates monthly `agg_trades_YYYYMM` partitions
## Unknowns & Risks
- [unknown] `market_indicators` table schema not in `SCHEMA_SQL`; likely created by `market_data_collector.py` separately — verify before querying.
- [risk] Timestamp inconsistency: some tables use TEXT for timestamps (auth tables), others use BIGINT, others use PG TIMESTAMP — cross-table JOINs on time fields require explicit casting.
- [inference] `live_config` circuit-break key name not confirmed from source; inferred from `risk_guard.py` behavior.
- [risk] `users` table defined in both `SCHEMA_SQL` (db.py) and `AUTH_SCHEMA` (auth.py); duplicate CREATE TABLE IF NOT EXISTS; actual schema diverges between the two definitions (db.py version lacks `discord_id`, `banned`).
## Source Refs
- `backend/db.py:166-356` — `SCHEMA_SQL` with all table definitions
- `backend/auth.py:28-71` — `AUTH_SCHEMA` auth tables
- `backend/db.py:360-414` — `ensure_partitions()`, `init_schema()`
- `backend/signal_engine.py:123-158` — `market_indicators` query pattern


@ -0,0 +1,251 @@
---
generated_by: repo-insight
version: 1
created: 2026-03-03
last_updated: 2026-03-03
source_commit: 0d9dffa
coverage: deep
---
# 05 — Build, Run & Test
## Purpose
All commands for building, running, testing, and deploying the project, plus environment-variable configuration.
## TL;DR
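- No CI/CD pipeline; deployment to a GCP VM is manual, with PM2 managing processes.
- The backend has no build step; start it directly with `python3 main.py` or through PM2.
- The frontend is standard Next.js: `npm run dev` / `npm run build` / `npm start`.
- No test files were found; validation relies on `backtest.py` backtests and paper trading.
- Local development: the frontend proxies `/api/*` to `http://127.0.0.1:4332` (the uvicorn port) through a Next.js rewrite.
- The database schema is initialized automatically at startup (`init_schema()`); there is no separate migration tool.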
## Canonical Facts
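### Environment Requirements
| Component | Requirement |
|------|------|
| Python | 3.10+ (the source uses `list[dict]`-style annotations and similar newer syntax) |
| Node.js | 20.x recommended (`package.json` has `@types/node: ^20`) |
| PostgreSQL | Local instance + Cloud SQL (`10.106.0.3`) |
| PM2 | Used for process management (must be installed globally) |
### Installing Backend Dependencies
```bash
cd backend
pip install -r requirements.txt
# requirements.txt lists: fastapi, uvicorn, httpx, python-dotenv, psutil
# Also needed (inferred from source imports):
#   asyncpg, psycopg2-binary, aiohttp, websockets
```
> [inference] `requirements.txt` is incomplete: it lists only 5 packages, but the source imports `asyncpg`, `psycopg2`, `aiohttp`, and others. Confirm that the full dependency set is installed before running.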
### Starting the Backend
#### Single-Process Development Mode
```bash
cd backend
# FastAPI HTTP API (default port 4332, inferred from next.config.ts)
uvicorn main:app --host 0.0.0.0 --port 4332 --reload
# Signal engine (separate process)
python3 signal_engine.py
# aggTrades collector
python3 agg_trades_collector.py
# Market data collector
python3 market_data_collector.py
# Liquidation collector
python3 liquidation_collector.py
# Live executor
TRADE_ENV=testnet python3 live_executor.py
# Risk guard
TRADE_ENV=testnet python3 risk_guard.py
# Signal pusher (Discord)
python3 signal_pusher.py
```
#### PM2 Production Mode
```bash
cd backend
# Use the ecosystem config (currently defines only arb-dev-signal)
pm2 start ecosystem.dev.config.js
# Show process status
pm2 status
# Show logs
pm2 logs arb-dev-signal
# Stop everything
pm2 stop all
# Restart
pm2 restart all
```
> [inference] `ecosystem.dev.config.js` currently configures only `signal_engine.py`; the other processes must be started manually or added to the PM2 config.
### Environment Variables
#### Database (shared by all backend processes)
```bash
export PG_HOST=127.0.0.1          # local PG
export PG_PORT=5432
export PG_DB=arb_engine
export PG_USER=arb
export PG_PASS=arb_engine_2026    # testnet default; override in production
export CLOUD_PG_HOST=10.106.0.3   # Cloud SQL
export CLOUD_PG_ENABLED=true
```
#### Authentication
```bash
export JWT_SECRET=<>              # required in production, length ≥ 32
# Testnet has a default value, "arb-engine-jwt-secret-v2-2026"; it must be set whenever TRADE_ENV != testnet (production)
```
#### Trading Environment
```bash
export TRADE_ENV=testnet          # or production
export LIVE_STRATEGIES='["v52_8signals"]'
export RISK_PER_TRADE_USD=2       # risk per trade in USD
export MAX_POSITIONS=4            # maximum concurrent positions
```
#### Live Trading Only (live_executor + risk_guard)
```bash
export DB_HOST=10.106.0.3
export DB_PASSWORD=<production password>
export DB_NAME=arb_engine
export DB_USER=arb
# Binance API key (must be created on Binance)
export BINANCE_API_KEY=<key>
export BINANCE_API_SECRET=<secret>
```
#### Frontend
```bash
# .env.local or the deployment environment
NEXT_PUBLIC_API_URL=              # empty = same-origin; set to https://arb.zhouyangclaw.com in production
```
### Database Initialization
```bash
# The schema is created automatically when FastAPI starts (init_schema + ensure_auth_tables)
# Manual initialization:
cd backend
python3 -c "from db import init_schema; init_schema()"
# Partition maintenance (also called automatically inside init_schema):
python3 -c "from db import ensure_partitions; ensure_partitions()"
```
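### Building and Starting the Frontend
```bash
cd frontend
npm install
# Development mode (hot reload, port 3000)
npm run dev
# Production build
npm run build
npm start
# Lint
npm run lint
```
### Frontend API Proxy Configuration
`frontend/next.config.ts` proxies `/api/*` to `http://127.0.0.1:4332`:
- For local development, uvicorn must listen on **port 4332**.
- In production, cross-origin access is handled through `NEXT_PUBLIC_API_URL` or an nginx reverse proxy.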
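### Backtesting (Offline Validation)
```bash
cd backend
# Backtest over a number of days
python3 backtest.py --symbol BTCUSDT --days 20
# Backtest over a date range
python3 backtest.py --symbol BTCUSDT --start 2026-02-08 --end 2026-02-28
# Output: win rate, profit/loss ratio, Sharpe ratio, max drawdown, and other statistics
```
### Paper Trading
Controlled through `paper_config.json`: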
```json
{
"enabled": true,
"enabled_strategies": ["v52_8signals"],
"initial_balance": 10000,
"risk_per_trade": 0.02,
"max_positions": 4
}
```
After a change, signal_engine picks up the file on its next loop (no restart needed).
To monitor paper trading:
```bash
cd backend
python3 paper_monitor.py
```
### Admin CLI
```bash
cd backend
python3 admin_cli.py gen_invite [count] [max_uses]
python3 admin_cli.py list_invites
python3 admin_cli.py disable_invite <code>
python3 admin_cli.py list_users
python3 admin_cli.py ban_user <user_id>
python3 admin_cli.py unban_user <user_id>
python3 admin_cli.py set_admin <user_id>
python3 admin_cli.py usage
```
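### Log Locations
| Process | Log file |
|------|---------|
| signal_engine | `signal-engine.log` (project root) |
| risk_guard | `backend/logs/risk_guard.log` (RotatingFileHandler, 10 MB × 5) |
| Other processes | stdout / PM2 logs |
### No Test Framework
No `pytest`, `unittest`, or other test files were found in the project. Validation relies on:
1. **Backtesting**: `backtest.py` replays historical data tick by tick.
2. **Paper trading**: validates signal quality in real time.
3. **Manual testing**: manual checks of the frontend pages.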
## Interfaces / Dependencies
- uvicorn port: **4332** (inferred from `next.config.ts`)
- Frontend dev port: **3000** (Next.js default)
- CORS allows `localhost:3000` and `localhost:3001`
## Unknowns & Risks
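- [inference] The uvicorn port 4332 is inferred from `next.config.ts`; it is not explicitly confirmed in `main.py` or any startup script.
- [inference] `requirements.txt` is incomplete; the actual dependencies must be gathered from the import statements in the source.
- [unknown] The nginx configuration for the production deployment is not in the repository.
- [risk] There are no automated tests; the risk of code changes is covered only by manual backtesting and paper trading.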
## Source Refs
- `frontend/next.config.ts` — API rewrite proxying to `127.0.0.1:4332`
- `backend/ecosystem.dev.config.js` — PM2 config (signal_engine only)
- `backend/requirements.txt` — backend dependencies (incomplete)
- `backend/backtest.py:1-13` — backtest usage notes
- `backend/paper_config.json` — paper trading config
- `backend/admin_cli.py:88` — CLI usage function
- `backend/risk_guard.py:81-82` — RotatingFileHandler log configuration

docs/ai/06-decision-log.md Normal file

@ -0,0 +1,162 @@
---
generated_by: repo-insight
version: 1
created: 2026-03-03
last_updated: 2026-03-03
source_commit: 0d9dffa
coverage: deep
---
# 06 — Decision Log
## Purpose
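Records the project's key technical decisions, the reasons behind each choice, and the trade-offs (inferred from code comments and architectural characteristics).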
## TL;DR
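- PostgreSQL is the only message bus (NOTIFY/LISTEN), which avoids extra components such as Kafka or Redis.
- signal_engine now loops every 15 seconds (previously 5); per the code comment, CPU usage drops 60% with no effect on signal quality.
- Writes are mirrored to Cloud SQL for disaster recovery; a failed mirror write does not block the main flow.
- `agg_trades` is partitioned by month so that a single oversized table does not degrade query performance.
- Authentication uses a hand-rolled HMAC-SHA256 JWT with no third-party library.
- The frontend uses the Next.js App Router with pure client-side polling rather than WebSocket push.
- Strategy parameters live in external JSON files and can be changed without restarting the process.
- Signal scoring uses a multi-layer weighted system (5 layers); each layer is independently tunable, and multiple strategies run in parallel.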
## Canonical Facts
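### Decision 1: PostgreSQL as the Inter-Process Message Bus
**Decision**: Use PostgreSQL `NOTIFY/LISTEN` to pass signals between signal_engine and live_executor, rather than Redis pub/sub or a message queue.
**Reasons** (inferred from code):
- The system already depends heavily on PG; this avoids a new infrastructure dependency.
- Signals fire rarely (at most once every 15 seconds), so PG NOTIFY easily meets the latency requirement.
- The signal payload is written straight into `signal_indicators`; NOTIFY only acts as a trigger, and the consumer can query the table directly.
**Trade-off**: PG is a single point of dependency; if PG goes down, signal delivery and persistence fail together (acceptable, since the two are tightly coupled anyway).
**Source**: architecture comment at `live_executor.py:1-10`; the `save_indicator` function in `signal_engine.py`.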
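A consumer of this pattern typically looks like the sketch below, which uses psycopg2's documented asynchronous-notification interface (`poll()` and the `notifies` list). It is not the code in `live_executor.py`: the function names, the `dsn` parameter, the callbacks, and the assumption that the channel is called `new_signal` with an opaque payload are illustrative only.

```python
import select


def drain_notifications(conn) -> list[tuple[str, str]]:
    """Poll the connection and return all pending (channel, payload) pairs."""
    conn.poll()
    pending = []
    while conn.notifies:
        note = conn.notifies.pop(0)
        pending.append((note.channel, note.payload))
    return pending


def listen_forever(dsn: str, channel: str = "new_signal", timeout_s: float = 5.0,
                   on_signal=print, on_timeout=lambda: None) -> None:
    """Block on LISTEN; call on_timeout when idle so the caller can poll the table."""
    import psycopg2  # imported lazily so drain_notifications works without the driver
    from psycopg2.extensions import ISOLATION_LEVEL_AUTOCOMMIT

    if not channel.isidentifier():
        raise ValueError("channel must be a plain identifier")  # LISTEN takes no bind parameters
    conn = psycopg2.connect(dsn)
    conn.set_isolation_level(ISOLATION_LEVEL_AUTOCOMMIT)  # notifications arrive outside transactions
    with conn.cursor() as cur:
        cur.execute(f"LISTEN {channel};")
    while True:
        ready, _, _ = select.select([conn], [], [], timeout_s)
        if not ready:
            on_timeout()
            continue
        for item in drain_notifications(conn):
            on_signal(item)
```

NOTIFY is delivered only to sessions connected to the same PostgreSQL server as the sender, so producer and consumer must share one instance for this to work.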
---
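### Decision 2: Signal Engine Loop Interval Changed from 5 to 15 Seconds
**Decision**: `LOOP_INTERVAL = 15` (the code comment notes that the original value was 5).
**Reason**: The code comment states explicitly: "CPU down 60%, no effect on signal quality".
**Trade-off**: Worst-case signal latency grows by 10 seconds. For a short-term but not high-frequency strategy (TP/SL are set in ATR multiples, typically a move of more than 1%), the extra 10 seconds is negligible.
**Source**: `signal_engine.py:39`, `LOOP_INTERVAL = 15`, with the comment "seconds (changed from 5 to 15; CPU down 60%, no effect on signal quality)".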
---
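### Decision 3: agg_trades partitioned by monthly range
**Decision**: `agg_trades` uses `PARTITION BY RANGE(time_ms)` with one child table per month (e.g. `agg_trades_202603`).
**Reasons**:
- aggTrades is the largest write table (hundreds of rows per second); without partitioning, a single table would bloat.
- Monthly partitions support efficient time-range queries (PG partition pruning).
- Old partitions can be archived or dropped independently without affecting the parent table.
**Trade-off**: Partitions need maintenance (`ensure_partitions()` automatically creates partitions for the current month plus the next 2 months and must be run periodically). Cross-partition query performance depends on whether partition pruning takes effect (the `time_ms` condition must be a constant).
**Source**: `db.py:191-201, 360-393`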
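
The monthly naming scheme and the "current month plus the next 2 months" rule can be sketched in plain Python. This is a minimal illustration, not the real `ensure_partitions()`: the helper name `month_partitions` is hypothetical, and UTC month boundaries on `time_ms` are an assumption.

```python
from datetime import datetime, timezone
from typing import List, Tuple


def month_partitions(now: datetime, months_ahead: int = 2) -> List[Tuple[str, int, int]]:
    """Return (name, start_ms, end_ms) for the current month and the next `months_ahead` months."""
    partitions = []
    year, month = now.year, now.month
    for _ in range(months_ahead + 1):
        # First day of the following month, rolling December over into January.
        next_year, next_month = (year + 1, 1) if month == 12 else (year, month + 1)
        start = datetime(year, month, 1, tzinfo=timezone.utc)
        end = datetime(next_year, next_month, 1, tzinfo=timezone.utc)
        partitions.append((
            f"agg_trades_{year}{month:02d}",
            int(start.timestamp() * 1000),  # inclusive lower bound in ms
            int(end.timestamp() * 1000),    # exclusive upper bound in ms
        ))
        year, month = next_year, next_month
    return partitions


# Example: a run in mid-November must cross the year boundary.
names = [name for name, _, _ in month_partitions(datetime(2026, 11, 15, tzinfo=timezone.utc))]
```

Each tuple carries the bounds that a `FOR VALUES FROM (start_ms) TO (end_ms)` clause would need, so consecutive partitions share a boundary and leave no gaps.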
---
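### Decision 4: Cloud SQL dual write (non-blocking)
**Decision**: After a write succeeds on the local PG, the same write is attempted on Cloud SQL (`10.106.0.3`); this applies to all write operations. A Cloud SQL failure does not affect the main flow.
**Reason**: Provides an off-site copy of the data; Cloud SQL serves as a read replica or disaster-recovery target.
**Trade-offs**:
- The local PG and Cloud SQL can diverge (local succeeds, cloud fails).
- Dual write adds latency to each write (two network RTTs), but because it is best-effort and uses a separate connection pool, it rarely blocks in practice.
- live_executor connects directly to Cloud SQL (`DB_HOST=10.106.0.3`), bypassing the local PG.
**Source**: `db.py:23-29, 80-118`; `live_executor.py:50-55`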
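
The best-effort pattern described here can be sketched without a database by passing the two writes in as callables. The function name `dual_write` and its return convention are assumptions for illustration; the actual code in `db.py` is not shown in this document.

```python
import logging
from typing import Callable

logger = logging.getLogger("dual_write")


def dual_write(write_local: Callable[[], None], write_cloud: Callable[[], None]) -> bool:
    """Run the local write, then attempt the cloud write.

    Local errors propagate to the caller. Cloud errors are logged and swallowed.
    Returns True when both writes succeeded, False when only the local one did.
    """
    write_local()
    try:
        write_cloud()
    except Exception as exc:  # best-effort: never let the replica break the main flow
        logger.warning("cloud write failed, continuing: %s", exc)
        return False
    return True
```

A `False` return is exactly the divergence case listed in the trade-offs: the caller could count these to estimate how far the two databases have drifted apart.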
---
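### Decision 5: Hand-rolled JWT (no PyJWT or other third-party library)
**Decision**: JWT issuing and verification are implemented by hand with the Python standard library modules `hmac`, `hashlib`, and `base64`.
**Reason** (inferred): Fewer dependencies. The JWT structure is relatively simple, and HMAC-SHA256 signing takes only a few dozen lines of code.
**Trade-offs**:
- Expiry, revocation, and refresh-token logic must be handled in-house (the code already has a `refresh_tokens` table).
- A non-standard implementation may behave differently from standard libraries in edge cases (clock skew, special characters, and so on).
- There is no support from JWT ecosystem tooling (debuggers, key-rotation libraries, and so on).
**Source**: `auth.py:1-6` (import hashlib, secrets, hmac, base64, json); `auth.py:16-19`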
---
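### Decision 6: Strategy configuration externalized to JSON files
**Decision**: The weights, thresholds, TP/SL multipliers, and other parameters of the V5.x strategies live in `backend/strategies/*.json`; signal_engine reads them each time `load_strategy_configs()` runs.
**Reasons**:
- Strategies are tuned frequently (weights changed significantly from v51 to v52); externalizing avoids a code change for every parameter tweak.
- Multiple strategies run in parallel: signal_engine runs v51_baseline and v52_8signals at the same time and scores each symbol separately for each.
- [inference] Supports future modification of strategy parameters through the frontend or an API without restarting the process (whether signal_engine currently re-reads the files on every loop still needs to be confirmed).
**Trade-off**: JSON files have no type checking, so configuration errors surface only at runtime; there is no config schema validation.
**Source**: `signal_engine.py:41-67`; `backend/strategies/v51_baseline.json`; `backend/strategies/v52_8signals.json`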
---
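### Decision 7: Five-layer weighted signal scoring
**Decision**: Signal scoring is split into independent layers (direction, crowding, funding rate, environment, confirmation, liquidation, auxiliary). Each layer has its own weight, the total score ranges from 0 to 100, and a threshold of 75 triggers a signal. [inference] The list names seven layers rather than five; the count of five matches the layers other than funding rate and liquidation, which correspond to the two signals that v52_8signals adds to v51_baseline.
**Design characteristics**:
- The direction layer (CVD) carries the highest weight (V5.1: 45 points, V5.2: 40 points) and is the core indicator.
- "standard" tier: score ≥ threshold (75); "heavy" tier: score ≥ max(threshold+10, 85).
- Signal cooldown: after a trigger, the same symbol under the same strategy does not trigger again for 10 minutes.
- The CVD fast and slow lines must point the same way to produce a full directional signal; otherwise `no_direction=True` is set and nothing triggers.
**Trade-off**: The weight scaling logic is fairly complex (raw maximum scores differ across layers, so each is normalized before the weight is applied). When `market_indicators` data is missing, a default mid-range score is assigned so that the system keeps running on incomplete data.
**Source**: `signal_engine.py:410-651`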
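
The tier and cooldown rules listed above reduce to a few comparisons. The sketch below only restates those rules; the function names and the millisecond clock are assumptions, and the rule that selects the `light` tier is not given in this document, so it is left out.

```python
from typing import Optional

COOLDOWN_MS = 10 * 60 * 1000  # 10-minute cooldown per (symbol, strategy)


def pick_tier(score: float, threshold: float = 75) -> Optional[str]:
    """Map a 0~100 score to a tier, or None when the score is below the threshold."""
    if score >= max(threshold + 10, 85):
        return "heavy"
    if score >= threshold:
        return "standard"
    return None


def should_fire(score: float, threshold: float, now_ms: int,
                last_signal_ms: Optional[int], no_direction: bool = False) -> Optional[str]:
    """Apply the no_direction and cooldown gates before choosing a tier."""
    if no_direction:
        return None  # CVD fast and slow lines disagree
    if last_signal_ms is not None and now_ms - last_signal_ms < COOLDOWN_MS:
        return None  # still cooling down from the previous signal
    return pick_tier(score, threshold)
```

Note that with the default threshold of 75 the `heavy` cut-off is 85, while a strategy configured with a threshold of 80 would need 90.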
---
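### Decision 8: Frontend uses polling instead of WebSocket
**Decision**: The React frontend polls `/api/rates` every 2 seconds, polls slow data (stats/history/signals) every 120 seconds, and refreshes the K-line chart every 30 seconds.
**Reasons** (inferred):
- Simple to implement; there is no WebSocket connection state or reconnect logic to maintain.
- The update cadence (2 s / 30 s) suits polling; WebSocket's advantage is millisecond-level push.
- FastAPI already supports WebSocket, but SSE/WS push would need extra backend state management.
**Trade-off**: Polling `/api/rates` every 2 seconds puts a constant load on the server; as the user count grows, caching or a switch to WebSocket will be needed.
**Source**: `frontend/app/page.tsx:149-154`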
---
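### Decision 9: live_executor and risk_guard connect directly to Cloud SQL
**Decision**: `live_executor.py` and `risk_guard.py` default to `DB_HOST=10.106.0.3` (Cloud SQL) rather than the local PG.
**Reason** (inferred): These two processes run in a different environment from signal_engine (possibly another GCP VM or a container), and connecting directly to Cloud SQL avoids relaying through the local PG.
**Trade-off**: live_executor and signal_engine use different PG instances, so reads can in principle lag (dual-write synchronization delay).
**Source**: `live_executor.py:50-55`; `risk_guard.py:47-53`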
## Interfaces / Dependencies
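No additional interface dependencies; these are all internal architecture decisions.
## Unknowns & Risks
- [inference] All decisions are inferred from the code; there is no explicit ADR (Architecture Decision Record) document.
- [unknown] Whether strategy configuration supports hot reload (whether signal_engine re-reads the JSON on every loop) is unconfirmed.
- [risk] With Decision 4 (dual write) combined with Decision 9 (live_executor connects directly to Cloud SQL), if the local PG and Cloud SQL diverge, live_executor may read stale signals or execute duplicates.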
## Source Refs
- `backend/signal_engine.py:39`: LOOP_INTERVAL comment
- `backend/signal_engine.py:44-67`: load_strategy_configs
- `backend/signal_engine.py:410-651`: full evaluate_signal scoring logic
- `backend/db.py:23-29, 80-118`: Cloud SQL dual-write connection pool
- `backend/live_executor.py:50-55`: DB_HOST config
- `backend/auth.py:1-6`: hand-rolled JWT imports
- `frontend/app/page.tsx:149-154`: polling intervals
- `backend/strategies/v51_baseline.json`, `v52_8signals.json`

100
docs/ai/07-glossary.md Normal file

@ -0,0 +1,100 @@
---
generated_by: repo-insight
version: 1
created: 2026-03-03
last_updated: 2026-03-03
source_commit: 0d9dffa
coverage: deep
---
# 07 — Glossary
## Purpose
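Definitions of the technical, domain, and project-specific terms used in the project.
## TL;DR
- The project mixes quantitative trading, cryptocurrency, and engineering terms, in both Chinese and English.
- "R" is the risk unit: 1R = the amount risked per trade (PAPER_RISK_PER_TRADE × balance).
- CVD is the core indicator: cumulative delta = taker buy volume - taker sell volume, a measure of buy/sell pressure.
- ATR is used to compute take-profit and stop-loss distances dynamically (TP and SL are both expressed as ATR multiples).
- "tier" is the position-size tier (light/standard/heavy), each with its own size multiplier.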
## Canonical Facts
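### Trading and quantitative terms
| Term | Definition |
|------|------|
| **Funding Rate** | The rate that longs and shorts pay each other every 8 hours in a perpetual contract. Positive rate: longs pay shorts; negative rate: shorts pay longs. Stored as a decimal (e.g. `0.0001` = 0.01%). |
| **Perpetual / Perp** | A futures contract with no expiry, anchored to the spot price through the funding-rate mechanism. This project trades Binance USDC-M perpetual contracts. |
| **Arbitrage** | Hold a spot long plus a perpetual short. When the funding rate is positive, the short side collects funding every 8 hours, giving a steady return without directional risk. |
| **Annualized** | `average rate × 3 per day × 365 days × 100%`; converts a single funding rate into an annualized percentage. |
| **CVD (Cumulative Volume Delta)** | Cumulative volume delta = taker buy volume - taker sell volume. Positive means buyers dominate; negative means sellers dominate. This project computes three windows: CVD_fast (30 minutes), CVD_mid (4 hours), CVD_day (UTC intraday). |
| **aggTrade** | Binance aggregated trade data: multiple fills in the same direction, at the same price and the same moment, merged into one record. Includes the `is_buyer_maker` field (0 = taker buy, 1 = taker sell). |
| **is_buyer_maker** | `0`: the buyer is the taker (aggressive buy); `1`: the buyer is the maker (passive fill), i.e. an aggressive sell. For CVD: 0 → buy volume, 1 → sell volume. |
| **VWAP (Volume Weighted Average Price)** | Volume-weighted average price. Used to judge where the current price sits relative to the short-term average cost. |
| **ATR (Average True Range)** | Average true range, a measure of market volatility. This project computes it from 5-minute K-lines with a 14-period EMA. |
| **ATR Percentile** | The percentile (0~100) of the current ATR within the past 24 hours; indicates whether current volatility is high or low. |
| **P95 / P99** | The 95th/99th percentile of trade volume over the past 24 hours, used as the "large order threshold". Trades above P99 count as large orders and affect the signal score. |
| **Long/Short Ratio** | Number of long accounts / number of short accounts across the market. Reflects how crowded market sentiment is. |
| **Top Trader Position** | Share of long positions among large accounts, range 0~1. Above 0.55 is treated as long crowding; below 0.45 as short crowding. |
| **Open Interest (OI)** | Total notional value (USD) of all open contracts in the market. An OI increase of ≥3% is treated as a strong-environment signal. |
| **Coinbase Premium** | Premium of the Coinbase Pro BTC/USD spot price over Binance BTC/USDT, as a ratio. A positive premium (>0.05%) is treated as bullish (US institutional buying). Stored as a ratio (e.g. `0.0005` = 0.05%). |
| **Liquidation** | Forced-liquidation events. More short liquidations than long liquidations (within a short window) is treated as bullish (short squeeze). This project scores on the ratio of long to short liquidation USD within a 5-minute window. |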
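| **R (risk unit)** | The amount risked per trade. 1R = `initial balance × risk fraction` (default 2%, i.e. 200U). P&L is expressed in multiples of R. Consistent with the PnL_R formula in the next row, 0R is break-even, +1R is a profit equal to the amount risked, and -1R is a full loss of the amount risked. |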
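| **PnL_R** | P&L in units of R: `pnl_r = (exit_price - entry_price) / risk_distance × direction_sign`. |
| **TP1 / TP2 (Take Profit)** | Take-profit target prices. TP1 is the first target (half the position is closed when it triggers); TP2 is the second target (closes the remainder). |
| **SL (Stop Loss)** | Stop-loss price. The outcome when SL triggers depends on whether TP1 was already hit: not hit → loss of 1R; hit → break-even (sl_be status). |
| **Tier** | Position-size tiers. `light`=0.5×R, `standard`=1.0×R, `heavy`=1.5×R. Higher scores trigger heavier tiers: score ≥ max(threshold+10, 85) → heavy; score ≥ threshold → standard. |
| **Warmup** | The process in which signal_engine, at startup, reads historical `agg_trades` to fill its rolling windows; no signals are produced until it completes (`state.warmup=True`). |
| **Signal Cooldown** | After a signal triggers for a symbol under a strategy, no new signal triggers for that pair within 10 minutes, to prevent overtrading. |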
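
Three of the formulas in the table above (CVD, annualized funding rate, and PnL_R) are short enough to write out. The sketch below is only a restatement of those definitions; the function names and the `"LONG"`/`"SHORT"` direction labels are assumptions.

```python
from typing import Iterable, Tuple


def cvd(trades: Iterable[Tuple[float, int]]) -> float:
    """Sum signed volume over (qty, is_buyer_maker) pairs: 0 adds buy volume, 1 subtracts sell volume."""
    total = 0.0
    for qty, is_buyer_maker in trades:
        total += qty if is_buyer_maker == 0 else -qty
    return total


def annualized_pct(avg_funding_rate: float) -> float:
    """Average funding rate × 3 settlements per day × 365 days, as a percentage."""
    return avg_funding_rate * 3 * 365 * 100


def pnl_r(entry_price: float, exit_price: float, risk_distance: float, direction: str) -> float:
    """P&L in R units; risk_distance is the entry-to-stop distance in price terms."""
    direction_sign = 1 if direction == "LONG" else -1
    return (exit_price - entry_price) / risk_distance * direction_sign
```

For instance, an average funding rate of `0.0001` annualizes to roughly 10.95%, and a long that exits exactly at its entry price has a PnL_R of 0.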
### Strategy terms
| Term | Definition |
|------|------|
| **v51_baseline** | V5.1 baseline strategy. 6 signals (cvd, p99, accel, ls_ratio, oi, coinbase_premium). SL=1.4×ATR, TP1=1.05×ATR, TP2=2.1×ATR. |
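| **v52_8signals** | V5.2 extended strategy. 8 signals (v51 + funding_rate + liquidation). SL=2.1×ATR, TP1=1.4×ATR, TP2=3.15×ATR (wider stop-loss and farther take-profit targets in ATR terms; note that TP2/SL is 1.5 in both strategies). |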
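| **Score** | A composite score from 0 to 100, accumulated from multiple weighted layers; a threshold of 75 triggers a signal. |
| **Direction Layer** | First scoring layer, up to 45 points (v51) or 40 points (v52). Based on whether CVD_fast and CVD_mid agree in direction and on the direction of P99 large orders. |
| **Crowding Layer** | Market crowding score based on the long/short ratio and top trader positions. |
| **Environment Layer** | Market environment score based on the change in open interest (OI change). |
| **Confirmation Layer** | Confirmation that the CVD fast and slow lines agree: 15 points (met) or 0 points. |
| **Auxiliary Layer** | Coinbase Premium as auxiliary confirmation, 0~5 points. |
| **Accel Bonus** | Extra points when the slope of the CVD fast line is accelerating (v51: +5 points, v52: +0 points). |
| **Score Factors** | Per-layer score details, stored as JSONB in `paper_trades.score_factors` and `live_trades.score_factors`. |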
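
To make the ATR multipliers in the two strategy rows concrete, the sketch below turns an entry price and an ATR value into SL/TP1/TP2 prices. The multipliers come from the table; the function name, the dictionary layout, and the `"LONG"`/`"SHORT"` labels are assumptions, and the real strategy JSON keys may differ.

```python
from typing import Dict

# ATR multipliers as listed for each strategy.
ATR_MULTIPLIERS = {
    "v51_baseline": {"sl": 1.4, "tp1": 1.05, "tp2": 2.1},
    "v52_8signals": {"sl": 2.1, "tp1": 1.4, "tp2": 3.15},
}


def tp_sl_levels(strategy: str, entry: float, atr: float, direction: str) -> Dict[str, float]:
    """Return absolute SL/TP1/TP2 prices for a position opened at `entry`."""
    mult = ATR_MULTIPLIERS[strategy]
    sign = 1 if direction == "LONG" else -1
    return {
        "sl": entry - sign * mult["sl"] * atr,    # stop sits against the position
        "tp1": entry + sign * mult["tp1"] * atr,  # targets sit in its favor
        "tp2": entry + sign * mult["tp2"] * atr,
    }
```

With an entry of 100 and an ATR of 10, a v51 long gets a stop near 86 and targets near 110.5 and 121; the same inputs for a v52 short give a stop near 121 and targets near 86 and 68.5.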
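### Engineering terms
| Term | Definition |
|------|------|
| **Paper Trading** | Trades that are only simulated and recorded, with no real orders, used to validate strategies. Stored in the `paper_trades` table. |
| **Live Trading** | Trades executed with real orders through the Binance API. Stored in the `live_trades` table. |
| **Testnet** | Binance testnet (`https://testnet.binancefuture.com`), which uses virtual funds. `TRADE_ENV=testnet`. |
| **Production** | Binance production environment (`https://fapi.binance.com`), which uses real funds. `TRADE_ENV=production`. |
| **Circuit Break** | A protection mechanism triggered by risk_guard that blocks new positions and can force-close existing ones. live_executor is notified through flags in the `live_config` table. |
| **Dual Write** | The same data is written to both the local PG and Cloud SQL; a Cloud SQL write failure does not block the main flow. |
| **Partition** | Monthly child tables of `agg_trades` (e.g. `agg_trades_202603`), used to keep the large table performant. |
| **NOTIFY/LISTEN** | PostgreSQL's native asynchronous notification mechanism. signal_engine triggers with `NOTIFY new_signal`; live_executor receives with `LISTEN new_signal`. |
| **TradeWindow** | Rolling time-window class in signal_engine that maintains real-time rolling CVD and VWAP calculations. |
| **SymbolState** | Full state container for each trading pair: three TradeWindow instances, an ATRCalculator, a market_indicators cache, and signal cooldown records. |
| **Invite Code** | A one-time (or limited-use) code required at registration, generated by an administrator through `admin_cli.py`. |
| **Subscription Tier** | User subscription level (`free`, etc.), stored in the `subscriptions` table; used only to a limited extent in the current code. |
| **万分之** | Unit label used when the frontend displays funding rates (parts per ten thousand, i.e. basis points); the raw value × 10000 is shown. For example, `0.0001` is displayed as `1.0000 万分之`. |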
## Interfaces / Dependencies
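None.
## Unknowns & Risks
- [inference] The `Subscription Tier` feature is defined in the schema, but how much the business logic actually uses it is uncertain (it may be a reserved field).
- [inference] Handling of the "no_direction" state (CVD_fast and CVD_mid disagree): the direction is taken from CVD_fast, but it is flagged so that no signal triggers; it can still be used to decide on closing an opposing position.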
## Source Refs
- `backend/signal_engine.py:1-16`: CVD/ATR/VWAP/P95/P99 architecture comment
- `backend/signal_engine.py:69-81`: paper trading parameter definitions (R, tier multipliers)
- `backend/signal_engine.py:170-207`: TradeWindow class (CVD/VWAP definitions)
- `backend/signal_engine.py:209-257`: ATRCalculator class
- `backend/signal_engine.py:410-651`: evaluate_signal (per-layer scoring logic)
- `backend/strategies/v51_baseline.json`, `v52_8signals.json`: strategy parameters
- `backend/trade_config.py`: trading pair precision config
- `frontend/app/page.tsx:186`: "万分之" display comment


@ -0,0 +1,141 @@
---
generated_by: repo-insight
version: 1
created: 2026-03-03
last_updated: 2026-03-03
source_commit: 0d9dffa
coverage: deep
---
# 99 — Open Questions
## Purpose
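Records the unresolved questions, uncertainties, and potential risks found while generating the documentation.
## TL;DR
- `requirements.txt` is incomplete; the actual dependencies must be filled in by hand.
- The `users` table is defined in two places with inconsistent schemas (db.py vs auth.py).
- live_executor / risk_guard connect directly to Cloud SQL while signal_engine writes to the local PG, which creates a risk of data synchronization lag.
- Whether strategies support hot reload (re-reading the JSON on every loop) is unconfirmed.
- uvicorn listening on port 4332 is not explicitly confirmed in any startup script.
- No CI/CD and no automated tests.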
## Open Questions
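### High priority (affects correctness)
#### Q1: users table schema is defined twice, inconsistently
**Problem**: `SCHEMA_SQL` in `db.py` and `AUTH_SCHEMA` in `auth.py` both define the `users` table, with different columns:
- `db.py` version: `id, email, password_hash, role, created_at` (no `discord_id`, no `banned`)
- `auth.py` version: `id, email, password_hash, discord_id, role, banned, created_at`
`init_schema()` and `ensure_auth_tables()` are both called during FastAPI startup. Once the first `CREATE TABLE IF NOT EXISTS` succeeds, the second is silently skipped. **Which version is actually created?** It depends on call order (`init_schema` first, then `ensure_auth_tables`); if the local PG already has the older table, columns may be missing.
**Impact**: Auth features (discord_id linking, banned-status checks) may fail in environments where the schema has not been updated.
**Suggested action**: Standardize on the auth.py version, or add an `ALTER TABLE ... ADD COLUMN IF NOT EXISTS` migration.
**Source**: `db.py:269-276`; `auth.py:28-37`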
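
The silent-skip behavior is easy to reproduce. The sketch below uses an in-memory SQLite database as a stand-in for PostgreSQL, so the column types are placeholders rather than the real schema, and the helper `add_missing_columns` is hypothetical. SQLite has no `ADD COLUMN IF NOT EXISTS`, so the helper checks existing columns first; on PostgreSQL the `IF NOT EXISTS` form would replace that check.

```python
import sqlite3
from typing import Dict, List

# Stand-ins for the two definitions; column types are assumed for the demo.
DB_USERS = (
    "CREATE TABLE IF NOT EXISTS users ("
    "id INTEGER PRIMARY KEY, email TEXT, password_hash TEXT, role TEXT, created_at TEXT)"
)
AUTH_USERS = (
    "CREATE TABLE IF NOT EXISTS users ("
    "id INTEGER PRIMARY KEY, email TEXT, password_hash TEXT, discord_id TEXT, "
    "role TEXT, banned INTEGER DEFAULT 0, created_at TEXT)"
)


def user_columns(conn: sqlite3.Connection) -> List[str]:
    return [row[1] for row in conn.execute("PRAGMA table_info(users)")]


def add_missing_columns(conn: sqlite3.Connection, wanted: Dict[str, str]) -> None:
    """Add each wanted column that the existing table lacks."""
    existing = set(user_columns(conn))
    for name, ddl_type in wanted.items():
        if name not in existing:
            conn.execute(f"ALTER TABLE users ADD COLUMN {name} {ddl_type}")


conn = sqlite3.connect(":memory:")
conn.execute(DB_USERS)    # first definition wins
conn.execute(AUTH_USERS)  # silently skipped: the table already exists
before = user_columns(conn)
add_missing_columns(conn, {"discord_id": "TEXT", "banned": "INTEGER DEFAULT 0"})
after = user_columns(conn)
```

After the two `CREATE TABLE` calls, `before` lacks `discord_id` and `banned`; only the explicit migration step brings the table up to the fuller definition.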
---
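#### Q2: live_executor reads Cloud SQL while signal_engine writes the local PG; is the dual-write delay acceptable?
**Problem**: signal_engine writes to the local PG (`signal_indicators`) and dual-writes to Cloud SQL; live_executor reads signals directly from Cloud SQL. If a dual write fails or is delayed, live_executor may miss a signal or read inconsistent data.
**Impact**: Live signals are lost or executed late.
**Suggested action**: Confirm whether NOTIFY is also sent to Cloud SQL (that is, whether live_executor receives signals through LISTEN rather than relying on polling the table); or change live_executor to connect to the local PG.
**Source**: `live_executor.py:50-55`; `db.py:76-95`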
---
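#### Q3: requirements.txt is incomplete
**Problem**: `requirements.txt` lists only `fastapi, uvicorn, httpx, python-dotenv, psutil`, but the source also imports `asyncpg`, `psycopg2` (installed via psycopg2-binary), `aiohttp`, `websockets` (inferred), and others.
**Impact**: After installing in a fresh environment, the processes fail to start.
**Suggested action**: Run `pip freeze > requirements.txt` or fill in all dependencies by hand.
**Source**: `backend/requirements.txt:1-5`; `backend/db.py:9-11`; `backend/live_executor.py:28-29`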
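
One way to derive the missing dependencies from the source is to collect top-level imports with the standard `ast` module. This is a rough sketch under assumptions: the function name is hypothetical, `sys.stdlib_module_names` only exists on Python 3.10+, and project-local modules (such as `db` or `auth`) will show up in the result and need to be filtered out by hand.

```python
import ast
import sys
from typing import Set


def non_stdlib_imports(source: str) -> Set[str]:
    """Return top-level module names imported by `source` that are not in the standard library."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            found.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            # level == 0 skips relative imports such as "from . import db"
            found.add(node.module.split(".")[0])
    # On Python < 3.10 the attribute is missing and nothing is filtered out.
    stdlib = getattr(sys, "stdlib_module_names", frozenset())
    return {name for name in found if name not in stdlib}


sample = "import os\nimport asyncpg\nfrom psycopg2 import pool\nfrom . import db\n"
modules = non_stdlib_imports(sample)
```

Import names do not always match package names on PyPI (`psycopg2` is typically installed as `psycopg2-binary`, and `dotenv` as `python-dotenv`), so the output is a checklist rather than a ready-made `requirements.txt`.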
---
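### Medium priority (affects maintainability)
#### Q4: Do the strategy JSON files support hot reload?
**Problem**: `load_strategy_configs()` is called once at the start of `main()`. It is unclear whether signal_engine's main loop calls it again on every iteration.
**Impact**: If hot reload is not supported, signal_engine must be restarted after the strategy JSON is changed.
**Source**: `signal_engine.py:44-67, 964` (the structure of the main function still needs to be checked)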
---
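#### Q5: uvicorn port confirmation
**Problem**: From `frontend/next.config.ts` it is inferred that uvicorn runs on `127.0.0.1:4332`, but no backend startup script was found that explicitly sets this port.
**Suggested action**: Record the port explicitly in `ecosystem.dev.config.js` or a startup script.
**Source**: `frontend/next.config.ts:8`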
---
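#### Q6: market_indicators table schema is not defined in SCHEMA_SQL
**Problem**: signal_engine reads from the `market_indicators` table (indicator types: `long_short_ratio`, `top_trader_position`, `open_interest_hist`, `coinbase_premium`, `funding_rate`), but the table's CREATE TABLE statement is not in `SCHEMA_SQL` in `db.py`.
**Impact**: The table is created separately by `market_data_collector.py`. If that process has never run, the table does not exist and signal_engine will error out or return empty data.
**Suggested action**: Add the `market_indicators` definition to `SCHEMA_SQL` so that `init_schema()` covers the full schema.
**Source**: `signal_engine.py:123-158`; `db.py:166-357` (no market_indicators definition found)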
---
#### Q7: liquidations table schema unconfirmed
**Problem**: signal_engine queries the `liquidations` table (`SELECT FROM liquidations WHERE symbol=%s AND trade_time >= %s`), but its definition was not found in `SCHEMA_SQL` either. It may be created by `liquidation_collector.py` itself.
**Source**: `signal_engine.py:395-407`; `liquidation_collector.py:28` (the `ensure_table()` function)
---
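### Low priority (long-term health)
#### Q8: No CI/CD pipeline
**Problem**: The repository has no `.github/workflows/`, Dockerfile, docker-compose.yml, or other deployment automation. All deployment is manual (ssh + git pull + pm2 restart).
**Suggested action**: Add GitHub Actions for basic lint checks and dependency security scanning.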
---
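#### Q9: No automated tests
**Problem**: No test files were found (`test_*.py`, `*.test.ts`, etc.). Strategy validation relies entirely on manual backtesting and paper trading.
**Suggested action**: At minimum, add unit tests for `evaluate_signal()`, `TradeWindow`, and `ATRCalculator` to guard against refactoring regressions.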
---
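#### Q10: Risk of hard-coded passwords in production
**Problem**: `db.py`, `live_executor.py`, and `risk_guard.py` all hard-code the testnet default password `arb_engine_2026` in source (via `os.getenv(..., "arb_engine_2026")`).
**Impact**: If the code leaks, the testnet database can be accessed. In production, if the environment variable is not set, the code silently falls back to the wrong password (the resulting failure is fairly explicit, but the risk remains).
**Suggested action**: Remove the testnet default password or manage it through a separate `.env.testnet` file, rather than embedding it in source.
**Source**: `db.py:19`; `live_executor.py:44`; `risk_guard.py:42`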
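
A fail-fast lookup is one way to drop the embedded default. In the sketch below, the helper `require_env` and the variable name `DB_PASSWORD` are hypothetical; the real environment variable names in these files are not shown here.

```python
import os
from typing import Mapping, Optional


def require_env(name: str, environ: Optional[Mapping[str, str]] = None) -> str:
    """Return the value of a required environment variable, or raise if it is unset or empty."""
    env = os.environ if environ is None else environ
    value = env.get(name)
    if not value:
        # Failing at startup is louder than silently using a built-in default.
        raise RuntimeError(f"missing required environment variable: {name}")
    return value


# Hypothetical usage at module import time:
# DB_PASSWORD = require_env("DB_PASSWORD")
```

The optional `environ` argument exists only to make the function easy to exercise in tests without touching the process environment.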
---
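#### Q11: `signal_indicators` has a `strategy` column that the schema does not declare
**Problem**: The INSERT statement in `save_indicator()` includes a `strategy` column, but the `signal_indicators` definition in `SCHEMA_SQL` does not. It may be added at runtime through `ALTER TABLE ADD COLUMN IF NOT EXISTS`, or it was added in a later version without updating the schema.
**Source**: `signal_engine.py:690-699`; `db.py:205-224` (signal_indicators definition)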
## Source Refs
- `backend/db.py:269-276`: db.py version of the users table
- `backend/auth.py:28-37`: auth.py version of the users table (with discord_id, banned)
- `backend/requirements.txt`: incomplete dependency list
- `backend/live_executor.py:44, 50-55`: DB_HOST and default password
- `backend/risk_guard.py:42, 47-53`: DB_HOST and default password
- `backend/signal_engine.py:395-407`: liquidations table query
- `backend/signal_engine.py:690-699`: INSERT with the strategy column

52
docs/ai/INDEX.md Normal file

@ -0,0 +1,52 @@
---
generated_by: repo-insight
version: 1
created: 2026-03-03
last_updated: 2026-03-03
source_commit: 0d9dffa
coverage: deep
---
# Arbitrage Engine — AI Documentation Index
**Project**: `arbitrage-engine`
**Summary**: Full-stack crypto perpetual futures funding-rate arbitrage monitoring and V5.x CVD/ATR-based short-term trading signal engine. Python/FastAPI backend + Next.js frontend + PostgreSQL + Binance USDC-M Futures.
## Generated Documents
| File | Description |
|------|-------------|
| [00-system-overview.md](./00-system-overview.md) | Project purpose, tech stack, repo layout, entry points, environment variables |
| [01-architecture-map.md](./01-architecture-map.md) | Multi-process architecture, component diagram, signal pipeline data flow, risk guard rules, frontend polling |
| [02-module-cheatsheet.md](./02-module-cheatsheet.md) | Module-by-module index: role, public interfaces, dependencies for all 20 backend + 15 frontend files |
| [03-api-contracts.md](./03-api-contracts.md) | All REST endpoints, auth flows, request/response shapes, error conventions |
| [04-data-model.md](./04-data-model.md) | All PostgreSQL tables, columns, partitioning strategy, storage design decisions |
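| [05-build-run-test.md](./05-build-run-test.md) | Build/run/deploy commands, environment variables, PM2 configuration, backtest and paper trading operations |
| [06-decision-log.md](./06-decision-log.md) | 9 key technical decisions: PG message bus, loop interval, dual write, partitioning, hand-rolled JWT, and more |
| [07-glossary.md](./07-glossary.md) | Trading terms (CVD/ATR/R/tier) and engineering terms (paper trading/warmup/circuit break) |
| [99-open-questions.md](./99-open-questions.md) | 11 open questions: conflicting users table definitions, incomplete dependencies, hard-coded passwords, no tests, and more |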
## Recommended Reading Order
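1. **Start here**: `00-system-overview.md`, for the project's purpose and structure.
2. **Architecture**: `01-architecture-map.md`, to understand how the 7+ processes interact.
3. **Data**: `04-data-model.md`, required reading for any DB work; note the inconsistent timestamp formats.
4. **API**: `03-api-contracts.md`, a reference for frontend development or API integration.
5. **Module detail**: `02-module-cheatsheet.md`, a reference before modifying a specific file.
6. **Ops**: `05-build-run-test.md`, for deployment and operations.
7. **Concepts**: `07-glossary.md`, to look up unfamiliar quantitative terms.
8. **Risks**: `99-open-questions.md`, required reading before starting development, to learn the known risks.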
## Coverage Tier
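**Deep**: includes complete module signature reads, in-depth reading of the core business modules (full signal_engine, the evaluate_signal scoring logic, backtest.py), the build/run guide, the decision log, the glossary, and the open questions.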
## Key Facts for AI Agents
- **Signal engine is the core**: `backend/signal_engine.py` — change with care; affects all trading modes.
- **Strategy tuning via JSON**: modify `backend/strategies/v51_baseline.json` or `v52_8signals.json` to change signal weights/thresholds without code changes.
- **No ORM**: raw SQL via `asyncpg`/`psycopg2`; schema in `db.py:SCHEMA_SQL`.
- **Auth is custom JWT**: no third-party auth library; hand-rolled HMAC-SHA256 in `auth.py`.
- **`TRADE_ENV=testnet` default**: production use requires explicit env override + strong JWT_SECRET.
- **Dual timestamp formats**: `ts` = Unix seconds, `time_ms`/`entry_ts`/`timestamp_ms` = Unix milliseconds — do not confuse.
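
Because seconds and milliseconds are both plain integers, mixing them up fails silently. The helpers below are a small sketch for keeping the two apart; the function names and the magnitude heuristic are assumptions, not code from the repository.

```python
from datetime import datetime, timezone


def from_seconds(ts: float) -> datetime:
    """Convert a Unix-seconds value (the `ts` convention) to an aware UTC datetime."""
    return datetime.fromtimestamp(ts, tz=timezone.utc)


def from_millis(time_ms: int) -> datetime:
    """Convert a Unix-milliseconds value (`time_ms`, `entry_ts`, `timestamp_ms`) to UTC."""
    return datetime.fromtimestamp(time_ms / 1000, tz=timezone.utc)


def looks_like_millis(value: float) -> bool:
    # Heuristic only: 10**11 seconds is thousands of years in the future,
    # while 10**11 milliseconds falls in 1973, so present-day values separate cleanly.
    return abs(value) >= 10**11
```

A guard such as `assert looks_like_millis(entry_ts)` at a module boundary is a cheap way to catch a seconds value passed where milliseconds are expected.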
## Generation Timestamp
2026-03-03T00:00:00 (UTC)