Sports Betting — Cloudflare Report
Runbook
How to run the truth-layer backfills safely, generate coverage reports, and scale to larger date ranges.
Recommended workflow (current)
- Start small (1–2 days) to validate environment and DB writes.
- Run 14-day and 30-day backfills to validate stability and missingness.
- Scale month-by-month with --resume and periodic coverage reports (see the driver sketch after this list).
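For month-by-month scaling, a minimal driver sketch, assuming the truth_poc.py and coverage_report.py flags shown in the example commands below; the month ranges are placeholders:

import subprocess

# Example month ranges; replace with the months you need to backfill.
months = [("2025-10-01", "2025-10-31"), ("2025-11-01", "2025-11-30")]
for start, end in months:
    # --resume makes the re-run safe if a previous attempt was interrupted.
    subprocess.run(
        ["python", "truth_poc.py",
         "--start-date", start, "--end-date", end,
         "--db", "season.sqlite", "--pbp-mode", "nba_api",
         "--sleep", "1.8", "--resume"],
        check=True,
    )
    # Periodic coverage report after each month, per the workflow above.
    subprocess.run(
        ["python", "coverage_report.py",
         "--db", "season.sqlite", "--out", "coverage_report.json"],
        check=True,
    )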
Example commands
python truth_poc.py --start-date 2025-12-02 --end-date 2025-12-15 --max-games 20 --db season.sqlite --pbp-mode nba_api --sleep 1.8
python coverage_report.py --db season.sqlite --out coverage_report.json
If you interrupt a long run, re-run the same command with --resume.
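To sanity-check the generated coverage_report.json without assuming its exact schema, a quick peek at the top-level structure:

import json

with open("coverage_report.json") as f:
    report = json.load(f)
# Print the top-level keys (or the value's type) without assuming a schema.
print(list(report) if isinstance(report, dict) else type(report).__name__)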
Interpreting “quiet” long runs
Low CPU does not necessarily mean a hang. The process may be waiting on network responses and sleeping between calls for rate limiting.
- To cancel safely: Ctrl+C (then re-run with --resume).
- If you suspect a true stall: check for an active python process and whether its CPU time is increasing over minutes; a sketch follows.
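One way to run that check, a minimal sketch assuming psutil is installed (pip install psutil) and that you have found the PID via Task Manager, ps, or similar:

import time
import psutil

PID = 12345  # hypothetical: replace with the PID of the suspect python process
proc = psutil.Process(PID)
before = proc.cpu_times()
time.sleep(60)  # sample a minute apart
after = proc.cpu_times()
delta = (after.user - before.user) + (after.system - before.system)
print(f"CPU time accrued over 60s: {delta:.2f}s")
# A positive delta means the process is doing work between rate-limit sleeps;
# zero growth over several minutes is more consistent with a true stall.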
Past releases
Release 2025-12-16 (bundle v2)
How to run the fetcher locally, inspect SQLite output, and avoid common re-run pitfalls.
Quickstart (any developer machine)
This assumes you already have Python 3.10+ installed.
python -m venv .venv
# Activate:
# Windows: .venv\Scripts\activate
# macOS/Linux: source .venv/bin/activate
pip install --upgrade pip
pip install nba_api pbpstats pandas requests
Then run:
python truth_poc.py --date 2025-12-15 --max-games 1 --db pilot.sqlite --pbp-mode nba_api --sleep 1.5
Inspect the output
Check row counts (the multi-line form below works in POSIX shells such as macOS/Linux; on Windows, join the command onto a single line):
python -c "import sqlite3; con=sqlite3.connect('pilot.sqlite');
print('schedule', con.execute('select count(*) from canonical_schedule_rest').fetchone()[0]);
print('box', con.execute('select count(*) from canonical_box_score').fetchone()[0]);
print('pbp', con.execute('select count(*) from canonical_pbp').fetchone()[0]);
con.close()"
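If the multi-line shell quoting is awkward on your platform, the same check as a small script (hypothetical filename inspect_db.py; run as python inspect_db.py pilot.sqlite):

import sqlite3
import sys

db_path = sys.argv[1] if len(sys.argv) > 1 else "pilot.sqlite"
con = sqlite3.connect(db_path)
for table in ("canonical_schedule_rest", "canonical_box_score", "canonical_pbp"):
    # Same row-count check as the one-liner above, one table at a time.
    print(table, con.execute(f"select count(*) from {table}").fetchone()[0])
con.close()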
Common gotcha: re-running into the same DB
If you rerun the same date/game into the same SQLite file, you may hit UNIQUE constraint errors depending on your capture timestamp strategy. Two easy ways around this:
- Use a new DB filename per test run, or
- Drop and recreate the target table(s) before re-ingesting (a sketch follows).
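For the second option, a minimal sketch, assuming the ingester recreates missing tables on its next run (verify that against truth_poc.py before relying on it):

import sqlite3

con = sqlite3.connect("pilot.sqlite")
for table in ("canonical_schedule_rest", "canonical_box_score", "canonical_pbp"):
    # Dropping the canonical tables clears prior captures for a clean re-ingest.
    con.execute(f"drop table if exists {table}")
con.commit()
con.close()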