Agent Architecture
A three-layer system where deterministic checks do the heavy lifting and LLMs are called only when they add value.
Operational Phases
Research & Strategy Building
Hypothesis-driven exploration with budget constraints and quality gates on every action. Haiku plans cheaply, Sonnet makes high-stakes decisions. A state machine governs the full research lifecycle.
Live Deployment Validation
Intensive monitoring during the first hour after go-live. All deterministic checks run in parallel, with LLM analysts on standby. Automatic rollback on critical findings.
Steady-State Monitoring
Deterministic-first monitoring at 30-minute intervals, with no LLM cost. LLM analysts invoked only on degradation or anomalies. 24/7 autonomous operation with crash recovery.
Deterministic vs LLM
The key insight: most monitoring cycles find nothing wrong. Running an LLM to confirm "everything is fine" is wasteful. Deterministic checks are instant, free, and catch 95%+ of actionable issues.
Deterministic Checks
Always free, instant
bug_check
Code changes, runtime errors, stack traces
health_check
API connectivity, memory, process heartbeats
signal_check
Signal generation consistency, threshold drift
trade_check
Fill quality, slippage, position reconciliation
backtest_check
Live vs backtest divergence, outcome alignment
drift_check
Performance degradation, regime shift detection
LLM Analysts
On-demand, only when flagged
trade_analyst
Deep analysis of win/loss patterns and exit timing
investigation_analyst
Root cause analysis when checks flag anomalies
strategy_analyst
Strategy health assessment and adaptation recommendations
Router
Rule-based state machine, no LLM
Orchestrates the full monitoring cycle. Dispatches deterministic checks first, evaluates results, and only escalates to LLM analysts when anomalies are detected. No AI cost during normal operation.
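The dispatch order described above can be sketched as a small rule-based router. This is an illustration under stated assumptions, not the project's implementation: `CheckResult`, `run_cycle`, `call_analyst`, and the mapping from a failing check to an analyst are hypothetical names chosen for the example; only the check and analyst names come from the lists above.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class CheckResult:
    name: str
    ok: bool
    detail: str = ""


# Hypothetical mapping from a failing check to the analyst that handles it.
ESCALATION = {
    "trade_check": "trade_analyst",
    "drift_check": "strategy_analyst",
}
DEFAULT_ANALYST = "investigation_analyst"


def run_cycle(
    checks: Dict[str, Callable[[], CheckResult]],
    call_analyst: Callable[[str, CheckResult], str],
) -> List[str]:
    """Run every deterministic check, then escalate only the failures."""
    results = [check() for check in checks.values()]
    failures = [r for r in results if not r.ok]
    # Normal operation: nothing failed, so no analyst (and no LLM) is called.
    return [
        call_analyst(ESCALATION.get(r.name, DEFAULT_ANALYST), r)
        for r in failures
    ]
```

Because `call_analyst` is injected, the router itself contains no model calls and can be tested with a stub.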
Research Agent
An autonomous agent that tests trading hypotheses through iterative experimentation. Budget-constrained, goal-directed, with quality gates on every action. Two modes: hypothesis (test a specific idea) and explorer (discover new opportunities).
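A budget-constrained loop with a quality gate on every action might look like the following. This is a minimal sketch: `Action`, `run_research`, the cost units, and the gate signature are all assumptions made for illustration, since the agent's actual interfaces are not shown here.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class Action:
    description: str
    cost: float  # estimated spend for this action, in budget units (assumed)


def run_research(
    proposals: Iterable[Action],
    budget: float,
    quality_gate: Callable[[Action], bool],
    execute: Callable[[Action], None],
) -> List[str]:
    """Execute proposed actions in order until one no longer fits the budget."""
    log = []
    remaining = budget
    for action in proposals:
        if action.cost > remaining:
            log.append(f"stop: budget exhausted before '{action.description}'")
            break
        if not quality_gate(action):
            # Rejected actions are skipped and consume no budget.
            log.append(f"skip: '{action.description}' failed quality gate")
            continue
        execute(action)
        remaining -= action.cost
        log.append(f"done: '{action.description}'")
    return log
```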
Strategy System
Strategies are assembled from pluggable components defined in YAML. New indicators and filters are added by subclassing base types — no core modifications needed. The explorer agent can generate, validate, and integrate new components autonomously.
strategy:
  indicators:
    primary: { type: zscore, source: hl2 }
    momentum: { type: momentum, source: primary }
    volume: { type: zscore, source: volume }
    atr: { type: atr, method: ewm }
  entry_conditions:
    - { indicator: primary, operator: ">=", mirror: true }
  exit_conditions:
    - { indicator: primary, operator: crosses_below }
  risk:
    stop_loss: { type: atr }
    take_profit: { type: atr }
    sizing: { type: risk_pct }
# Parameter values are loaded at runtime and never committed to config
Extensible Components
New indicators (subclass BaseIndicator), new filters (subclass BaseFilter), new exit rules — all plug-and-play with zero core changes.
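One way to get plug-and-play behavior from subclassing is a self-registering base class, sketched below. `BaseIndicator` is named in the text, but its real interface is not shown; the `compute` method, the `type_name` attribute, the `REGISTRY` dict, the `build` helper, and the `window` parameter are assumptions made for this example.

```python
import math
from abc import ABC, abstractmethod
from typing import Dict, List, Sequence, Type

REGISTRY: Dict[str, Type["BaseIndicator"]] = {}


class BaseIndicator(ABC):
    """Assumed base type; the real interface is not shown in the source."""

    type_name: str = ""

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        if cls.type_name:
            REGISTRY[cls.type_name] = cls  # self-registration, no core edits

    @abstractmethod
    def compute(self, values: Sequence[float]) -> List[float]:
        ...


class RollingZScore(BaseIndicator):
    type_name = "zscore"

    def __init__(self, window: int = 3):
        self.window = window

    def compute(self, values: Sequence[float]) -> List[float]:
        out = []
        for i in range(len(values)):
            win = values[max(0, i - self.window + 1): i + 1]
            mean = sum(win) / len(win)
            std = math.sqrt(sum((v - mean) ** 2 for v in win) / len(win))
            # Emit 0.0 instead of NaN when the window has no variance.
            out.append((values[i] - mean) / std if std > 0 else 0.0)
        return out


def build(spec: dict) -> BaseIndicator:
    """Instantiate an indicator from a spec such as {"type": "zscore"}."""
    # "source" is assumed to be wired up elsewhere, so it is not passed on.
    params = {k: v for k, v in spec.items() if k not in ("type", "source")}
    return REGISTRY[spec["type"]](**params)
```

With this pattern, defining a new subclass with a `type_name` is enough for `build` to find it; nothing in the lookup code changes.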
Autonomous Creation
Explorer agent can generate, validate, and integrate new strategy components without human intervention.
Validation Pipeline
1. Structural
Correct inheritance, required methods, type signatures
2. Integration
Component loads, connects to data feeds, no import errors
3. Functional
Generates valid signals, handles edge cases, no NaN outputs
4. Quality
Backtest meets minimum thresholds before integration
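The staged pipeline above can be sketched as an ordered list of checks that stops at the first failure. Only the structural and functional stages are illustrated; integration and quality stages depend on data feeds and backtests not shown here and would be appended to the same list. The function names and the assumed `compute` method are hypothetical.

```python
import inspect
import math
from typing import Callable, List, Optional, Tuple

Stage = Tuple[str, Callable[[object], bool]]


def structural(component) -> bool:
    # Required method exists and takes exactly one argument besides self.
    fn = getattr(component, "compute", None)
    return callable(fn) and len(inspect.signature(fn).parameters) == 1


def functional(component) -> bool:
    # Output has one value per input and contains no NaN.
    out = component.compute([1.0, 2.0, 3.0, 4.0])
    return len(out) == 4 and not any(math.isnan(x) for x in out)


def validate(component, stages: List[Stage]) -> Tuple[bool, Optional[str]]:
    """Run stages in order; return (passed, name_of_first_failed_stage)."""
    for name, stage in stages:
        try:
            if not stage(component):
                return False, name
        except Exception:
            # A stage that raises counts as a failure of that stage.
            return False, name
    return True, None
```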
Analysis Dashboard
A private FastAPI + Plotly.js dashboard for deep analysis. Backtests are scanned from the filesystem, indexed, and made searchable. Every optimization run gets distribution charts, parameter sensitivity analysis, and constraint failure breakdowns.
$ helios dashboard --summary
Backtest Explorer    54,000+ runs indexed · sortable · filterable
Optimization View    66 sweeps · heatmaps · parameter sensitivity
Chart Suite          7 interactive types · equity · drawdown · scatter
Comparison Tool      Side-by-side metrics · parameter diffs · equity overlay
Walk-Forward Validation
Strategies are validated using time-series-aware methodology. No shuffling, no future data leakage — the same constraints a live trader faces.
In-Sample / Out-of-Sample
Chronological fold splitting preserves time-series integrity. Optimize on IS data, validate on OOS data per fold.
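Chronological fold splitting can be sketched as follows. The text does not say whether the in-sample window is rolling or anchored; this example assumes an anchored (expanding) window, and `walk_forward_folds` and its parameters are hypothetical names.

```python
from typing import List, Tuple


def walk_forward_folds(
    n_samples: int, n_folds: int, oos_size: int
) -> List[Tuple[range, range]]:
    """Return (in_sample, out_of_sample) index ranges in chronological order.

    Each in-sample range starts at index 0 and ends exactly where its
    out-of-sample range begins, so no future data reaches the optimizer.
    """
    first_oos_start = n_samples - n_folds * oos_size
    if first_oos_start <= 0:
        raise ValueError("not enough samples for the requested folds")
    folds = []
    for k in range(n_folds):
        oos_start = first_oos_start + k * oos_size
        folds.append((range(0, oos_start), range(oos_start, oos_start + oos_size)))
    return folds
```

For 10 samples, 3 folds, and 2 out-of-sample points per fold, the in-sample ranges end at indices 4, 6, and 8, and each out-of-sample range covers the next 2 indices.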
Monte Carlo Bootstrap
1,000 resamples generate confidence intervals for Sharpe, drawdown, and win rate. Flags: low_confidence, sequence_dependent.
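A percentile bootstrap for the Sharpe ratio might be sketched as below. It assumes simple i.i.d. resampling with replacement of per-period returns and an unannualized Sharpe with a zero risk-free rate; the project's actual resampling scheme and the logic behind its flags are not shown here, and `sharpe` and `bootstrap_ci` are hypothetical names.

```python
import random
import statistics
from typing import Callable, List, Tuple


def sharpe(returns: List[float]) -> float:
    """Per-period Sharpe ratio; 0.0 when the returns have no variance."""
    sd = statistics.pstdev(returns)
    return statistics.mean(returns) / sd if sd > 0 else 0.0


def bootstrap_ci(
    returns: List[float],
    stat: Callable[[List[float]], float] = sharpe,
    n_resamples: int = 1000,
    alpha: float = 0.05,
    seed: int = 0,
) -> Tuple[float, float]:
    """Percentile confidence interval for `stat` from resampled returns."""
    rng = random.Random(seed)  # seeded so repeated runs give the same interval
    n = len(returns)
    stats = sorted(
        stat([returns[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_resamples)
    )
    lo = stats[round((alpha / 2) * n_resamples)]
    hi = stats[round((1 - alpha / 2) * n_resamples) - 1]
    return lo, hi
```

The same function can be reused for drawdown or win rate by passing a different `stat` callable.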
Cost Optimization
The deterministic-first architecture dramatically reduces LLM costs. Most monitoring cycles complete without a single API call.
| Scenario | LLM-Only | Deterministic-First | Savings |
|---|---|---|---|
| Routine cycle (no issues) | $0.05–0.10 | $0 | 100% |
| Deployment validation (1 hr) | $1.50–3.00 | $0–0.30 | ~80% |
| Steady-state (24 hrs) | $12–30 | $0–2.00 | 93% |
Deterministic checks handle routine operations at zero cost. LLM analysts are invoked only when checks flag anomalies that require reasoning — typically <5% of monitoring cycles.
Code Architecture
Core
/core
Strategy
/strategy
Execution
/execution
Research
/research
Monitoring
/monitoring
Validation
/validation