CRANE-X
Research — Next Generation

Cluster-Reactive Adaptive News Ensemble — a three-signal, multi-asset sentiment engine that combines pre-baked LLM polarity, statistical clustering, and deep content analysis at zero GPU cost.

Assets Tracked: 5
Ensemble Signals: 3

What is CRANE-X?

CRANE-X is a CPU-native, three-signal sentiment ensemble that ingests financial news and outputs per-asset sentiment scores for five major instruments: S&P 500 E-mini Futures (ES), Nasdaq 100 E-mini Futures (NQ), WTI Crude Oil Futures (CL), Bitcoin (BTC), and Ethereum (ETH).

Unlike monolithic sentiment models that apply one-size-fits-all scoring, CRANE-X treats each asset as an independent optimization problem. The same news article can (and should) affect different assets differently — a rate cut is bullish for equities but neutral-to-negative for crypto. CRANE-X captures this.

Three independent signals: VADER-style polarity, statistical clusters (245 adaptive TF-IDF groups), and zero-shot LLM scoring of full article content
Per-asset calibration: each of the 5 assets has its own ensemble weights, re-optimized every 6 hours via Spearman rank correlation against realized 24-hour forward returns
Volatility-normalized compositing: the aggregate signal weights each asset inversely to its realized volatility (1/σ), which keeps high-noise instruments from dominating
CPU-only, ~$5/month: the full pipeline runs on a single CPU server. The extLLM signal scores ~700-word articles at ~$0.0006 per 4-article batch
Live gauge: real-time multi-asset sentiment dashboard embedded below

Live Sentiment Gauge

Real-time multi-asset sentiment dashboard — ES, NQ, CL, BTC, ETH with volatility-normalized composite and per-asset regime indicators.

Volatility-Normalized Composite Sentiment

Composite Sentiment
The score ranges from -1.0 (strongly bearish) to +1.0 (strongly bullish). Scores above +0.3 read as bullish, scores below -0.3 as bearish, and anything in between as neutral.
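As a minimal sketch of the thresholds described above, the mapping from score to regime label could look like the following. The function name and the handling of the exact boundary values (treated here as neutral) are assumptions for illustration, not the gauge's actual code.

```python
def classify_sentiment(score: float, threshold: float = 0.3) -> str:
    """Map a composite score in [-1.0, +1.0] to a regime label.

    Hypothetical helper: scores exactly at +/-threshold are treated
    as neutral, which is an assumption about the boundary behaviour.
    """
    if not -1.0 <= score <= 1.0:
        raise ValueError(f"score {score} is outside [-1.0, +1.0]")
    if score > threshold:
        return "bullish"
    if score < -threshold:
        return "bearish"
    return "neutral"


if __name__ == "__main__":
    for s in (0.45, 0.10, -0.62):
        print(f"{s:+.2f} -> {classify_sentiment(s)}")
```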

Ensemble Weights

How much each signal contributes — independently calibrated per asset

Three-Signal Ensemble

Each article is scored by three independent signals, weighted per-asset, and combined into a volatility-normalized composite.
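The per-asset weighting step can be sketched as a normalized weighted average of the three signal scores. The weight values below are hypothetical placeholders, and the signal keys are illustrative names; the real weights come from the calibration job.

```python
SIGNALS = ("tfllm", "statcluster", "extllm")


def ensemble_score(signal_scores: dict, weights: dict) -> float:
    """Combine three signal scores into one score using per-asset weights.

    Weights are renormalized so they need not sum to exactly 1.
    """
    total = sum(weights[name] for name in SIGNALS)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted = sum(weights[name] * signal_scores[name] for name in SIGNALS)
    return weighted / total


if __name__ == "__main__":
    # Hypothetical weight sets: one per asset, as the text describes.
    weights_by_asset = {
        "ES": {"tfllm": 0.2, "statcluster": 0.3, "extllm": 0.5},
        "BTC": {"tfllm": 0.4, "statcluster": 0.4, "extllm": 0.2},
    }
    article = {"tfllm": 0.2, "statcluster": -0.1, "extllm": 0.6}
    for asset, weights in weights_by_asset.items():
        print(asset, round(ensemble_score(article, weights), 3))
```

The same article yields a different score for each asset because only the weights change, which is the behaviour the per-asset design is after.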

Signal 1 — TFLLM

Pre-computed, VADER-style polarity delivered with each EODHD API response. It adds no latency or cost, and its output has comparable dimensionality to FinBERT's.

Signal 2 — StatCluster

245 active semantic clusters built from 3,400+ articles. Each cluster stores the historical 24h forward price reactions per asset, and new articles are matched to clusters by cosine similarity.
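A rough sketch of the matching step, assuming sparse TF-IDF vectors stored as dictionaries and one centroid per cluster. The vocabulary, IDF values, cluster names, and the `min_similarity` cutoff are all made up for illustration; the source does not specify them.

```python
import math
from collections import Counter


def tfidf_vector(tokens, idf):
    """Build a sparse TF-IDF vector, ignoring out-of-vocabulary tokens."""
    counts = Counter(t for t in tokens if t in idf)
    return {term: count * idf[term] for term, count in counts.items()}


def cosine_similarity(a, b):
    """Cosine similarity between two sparse vectors (dicts)."""
    dot = sum(w * b.get(term, 0.0) for term, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_cluster(article_vec, centroids, min_similarity=0.2):
    """Return (cluster_id, similarity), or (None, similarity) if too weak."""
    best_id, best_sim = None, 0.0
    for cluster_id, centroid in centroids.items():
        sim = cosine_similarity(article_vec, centroid)
        if sim > best_sim:
            best_id, best_sim = cluster_id, sim
    if best_sim < min_similarity:
        return None, best_sim
    return best_id, best_sim


if __name__ == "__main__":
    idf = {"fed": 1.2, "rate": 1.0, "cut": 1.5, "bitcoin": 1.8, "etf": 1.6}
    centroids = {
        "fed_policy": {"fed": 1.2, "rate": 1.0, "cut": 1.5},
        "crypto_etf": {"bitcoin": 1.8, "etf": 1.6},
    }
    vec = tfidf_vector(["fed", "signals", "rate", "cut"], idf)
    print(best_cluster(vec, centroids))
```

Once an article is matched, the cluster's stored per-asset reactions would supply the StatCluster score for each instrument.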

Signal 3 — extLLM

Zero-shot DeepSeek V4 Flash scoring on full article content (~700 words). Cost: ~$0.0006 per 4-article batch. Temperature 0.1.

Per-Asset Calibration

Five independent weight sets, one per asset, re-optimized every 6 hours via Spearman rank correlation against realized 24-hour forward returns. A diversity floor keeps any signal's weight from collapsing to zero.
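One plausible way to turn Spearman correlations into floored weights is sketched below. The source does not describe the optimizer, so the rule used here (clip negative correlations to zero, reserve a fixed floor per signal, split the remainder in proportion to correlation) and the 0.10 floor are assumptions.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # a constant series carries no rank information
    return cov / math.sqrt(vx * vy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def calibrate_weights(signal_history, forward_returns, floor=0.10):
    """Weights for one asset: floor per signal plus a correlation-based share."""
    n = len(signal_history)
    if floor * n > 1.0:
        raise ValueError("floor is too large for the number of signals")
    raw = {name: max(spearman(scores, forward_returns), 0.0)
           for name, scores in signal_history.items()}
    total = sum(raw.values())
    if total == 0:
        return {name: 1.0 / n for name in raw}  # no signal helps: equal weights
    free = 1.0 - floor * n
    return {name: floor + free * value / total for name, value in raw.items()}
```

Running this once per asset against that asset's own forward returns would produce the five independent weight sets, and the floor guarantees every signal keeps a nonzero weight.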

Volatility-Normalized Composite

Composite = Σ(sₖ/σₖ) / Σ(1/σₖ). Lower-vol assets (ES σ≈15%) contribute ~3.7× more than high-vol assets (BTC σ≈55%).
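The composite formula translates directly into code. In this sketch the ES and BTC volatilities are the figures quoted above, while the sentiment scores are invented for the example.

```python
def vol_normalized_composite(scores: dict, vols: dict) -> float:
    """Composite = sum(s_k / sigma_k) / sum(1 / sigma_k) over assets k."""
    if any(vols[asset] <= 0 for asset in scores):
        raise ValueError("volatilities must be positive")
    inverse_vol = {asset: 1.0 / vols[asset] for asset in scores}
    weighted = sum(scores[asset] * inverse_vol[asset] for asset in scores)
    return weighted / sum(inverse_vol.values())


if __name__ == "__main__":
    vols = {"ES": 0.15, "BTC": 0.55}    # annualized sigma from the text
    scores = {"ES": 1.0, "BTC": -1.0}   # made-up opposing sentiment
    print(round(vol_normalized_composite(scores, vols), 3))  # 0.571
    print(round(vols["BTC"] / vols["ES"], 2))                # 3.67
```

With fully opposed scores the composite still lands at about +0.57, because the ES weight 1/0.15 is roughly 3.7 times the BTC weight 1/0.55.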

Real-Time Gauge

Live at tradeflags.com/cranex-gauge.html. Shows per-asset sentiment, price regimes, and the vol-normalized composite with CSS tooltips.

Pipeline Flow

From raw news ingestion to live gauge, fully automated via cron and a systemd daemon.

End-to-End Pipeline

EODHD News → Freshness Filter (≤15m) → MySQL eodhd_news → Signals (TFLLM, StatCluster, extLLM) → Ensemble Score → Composite Gauge

Data Pipeline Schedule

Ingest: */15 min
Scorer: */30 min
LLM Scorer: */30 min
Calibrate: */6 h

16 Topics, 19 Active

EODHD topic tags: technology, earnings, oil, fed, markets, crypto, economy, inflation, bonds, commodities, stocks, mergers, energy, recession, regulation + TFLLM.

Freshness-Gated

Only articles at most 15 minutes old are ingested, so that price snapshots from the WSJ Dylan API line up in time with the news event.
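A freshness gate of this kind can be sketched in a few lines. The function name is hypothetical, timestamps are assumed to be timezone-aware UTC, and rejecting future-dated articles is an added assumption rather than something the source states.

```python
from datetime import datetime, timedelta, timezone

MAX_AGE = timedelta(minutes=15)


def is_fresh(published_at, now=None, max_age=MAX_AGE):
    """Return True if the article is at most max_age old.

    Future-dated timestamps are rejected (assumption: they indicate
    a clock or feed error rather than a genuinely new article).
    """
    if now is None:
        now = datetime.now(timezone.utc)
    age = now - published_at
    return timedelta(0) <= age <= max_age


if __name__ == "__main__":
    now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
    print(is_fresh(now - timedelta(minutes=10), now))  # True
    print(is_fresh(now - timedelta(minutes=16), now))  # False
```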

Full Content Analysis

Unlike the original CRANE (headline-only), extLLM scores the full article body — 700+ words per article.

CRANE vs CRANE-X

Key architectural improvements that distinguish the next-generation engine.

Component | Original CRANE | CRANE-X
News Source | TradeFlags NewsFeed API | EODHD topic tags (NEW)
Price Source | Bundled with news | WSJ Dylan API (separate) (NEW)
Sentiment | FinBERT lexicon (248 terms) | Pre-baked VADER polarity (NEW)
Cluster Vocab | By IDF (rare terms) | By doc frequency (common terms) (NEW)
Tokenization | Headline only (15 words) | Title + tags + weighted content
Assets | ES, NQ, CL, BTC | + ETH (NEW)
Ensemble Weights | Single set (ES only) | 5 independent, per-asset (NEW)
Composite | None | Volatility-normalized (1/σ) (NEW)
LLM Input | Never ran reliably | Full article (700+ words) (NEW)
Freshness | None | 15-min max age (NEW)
Daemon | N/A | systemd, 30s check loop (NEW)

Cost Economics

Designed for sustainable operation — CPU-only, no GPU, minimal API spend.

Signal 1 — TFLLM

$0.00 — polarity arrives pre-computed with every EODHD API response.

Signal 2 — StatCluster

$0.00 — pure CPU, TF-IDF cosine similarity on the local MySQL instance.

Signal 3 — extLLM

~$0.0006 per 4-article batch. Full 500-article corpus costs ~$0.10. DeepSeek V4 Flash at $0.14/M input tokens.
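The quoted figures can be cross-checked with simple arithmetic. The per-batch cost, batch size, and input price below come from the text; the 1.3 tokens-per-word ratio is a rough assumption for English prose, and prompt overhead and output tokens are not modelled.

```python
import math

COST_PER_BATCH_USD = 0.0006       # from the text
BATCH_SIZE = 4                    # articles per batch, from the text
INPUT_PRICE_PER_M_TOKENS = 0.14   # USD per million input tokens, from the text
TOKENS_PER_WORD = 1.3             # assumption, not from the source


def corpus_cost(n_articles, batch_size=BATCH_SIZE,
                cost_per_batch=COST_PER_BATCH_USD):
    """Total extLLM cost for a corpus, billing whole batches."""
    return math.ceil(n_articles / batch_size) * cost_per_batch


def batch_input_cost(words_per_article=700, batch_size=BATCH_SIZE):
    """Estimated input-token cost of one batch under the assumed token ratio."""
    tokens = words_per_article * batch_size * TOKENS_PER_WORD
    return tokens * INPUT_PRICE_PER_M_TOKENS / 1_000_000


if __name__ == "__main__":
    print(f"500 articles: ${corpus_cost(500):.3f}")
    print(f"input cost per batch: ${batch_input_cost():.5f}")
```

Under these assumptions 500 articles is 125 batches, or about $0.075, which is in line with the rounded ~$0.10 figure. Article text alone accounts for roughly $0.0005 of each $0.0006 batch.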