From factor-researcher
Validates, resamples, and ingests OHLCV market data (CSV/Parquet/HDF5) for factor mining. Pulls live data from MCP connectors like FactSet, Daloopa, Morningstar.
npx claudepluginhub minihellboy/factorminer --plugin factor-researcher
Slash command
/factor-researcher:factor-data
Market data is the input contract for every FactorMiner workflow. This skill makes sure a dataset is schema-valid and split-covered *before* a mining run burns iterations on a broken file.
FactorMiner expects an OHLCV panel with one row per (asset, timestamp):
| Column | Meaning | Notes |
|---|---|---|
| datetime | Bar timestamp | Parseable date/datetime |
| asset_id | Instrument id | Aliases: code, ticker, symbol |
| open, high, low, close | Prices | |
| volume | Share/contract volume | |
| amount | Dollar/turnover volume | vwap derived as amount / volume when missing |
returns and vwap are derived automatically when absent. Column aliasing is handled by the loader, so near-canonical files pass.
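To make the aliasing and the vwap rule from the table above concrete, here is a minimal Python sketch. It is not FactorMiner's loader: `normalize_row` and `ASSET_ID_ALIASES` are hypothetical names, and skipping the derivation when volume is zero is an assumption made here to avoid a division by zero.

```python
from typing import Any, Dict

# Alias -> canonical column name, as listed in the schema table.
ASSET_ID_ALIASES = {"code": "asset_id", "ticker": "asset_id", "symbol": "asset_id"}


def normalize_row(row: Dict[str, Any]) -> Dict[str, Any]:
    """Rename aliased columns and derive vwap when it is absent (illustrative only)."""
    out = {ASSET_ID_ALIASES.get(key, key): value for key, value in row.items()}

    # vwap = amount / volume, only when vwap is missing.
    # Assumption: zero or missing volume leaves vwap undefined.
    if out.get("vwap") is None:
        volume = out.get("volume")
        amount = out.get("amount")
        if volume and amount is not None:
            out["vwap"] = amount / volume
    return out


if __name__ == "__main__":
    bar = {"ticker": "AAPL", "datetime": "2024-01-02",
           "volume": 1000.0, "amount": 185500.0}
    print(normalize_row(bar)["vwap"])  # prints 185.5
```

A row that already carries vwap is left untouched, which mirrors the "derived when absent" wording above.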
Always validate first:
factorminer validate-data path/to/market_data.csv --json
Read the report. It lists detected columns, applied aliases, derived fields, and train/test split coverage. If either split has zero rows, stop — fix the file or the config's data.train_period / data.test_period before mining. Use --strict to treat warnings as failures in CI.
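The zero-row check can be pictured with a short Python sketch. This only illustrates what "split coverage" means and is not how validate-data computes it: the function names are hypothetical, and representing each period as an inclusive (start, end) pair of dates is an assumption about the config format.

```python
from datetime import date
from typing import Dict, Iterable, List, Tuple

Period = Tuple[date, date]  # assumed inclusive (start, end)


def split_coverage(timestamps: Iterable[date],
                   splits: Dict[str, Period]) -> Dict[str, int]:
    """Count how many bar timestamps fall inside each named split."""
    counts = {name: 0 for name in splits}
    for ts in timestamps:
        for name, (start, end) in splits.items():
            if start <= ts <= end:
                counts[name] += 1
    return counts


def empty_splits(counts: Dict[str, int]) -> List[str]:
    """Return the splits with zero rows; any entry here means stop before mining."""
    return [name for name, n in counts.items() if n == 0]


if __name__ == "__main__":
    bars = [date(2022, 1, 3), date(2022, 6, 1), date(2023, 2, 1)]
    splits = {
        "train": (date(2022, 1, 1), date(2022, 12, 31)),
        "test": (date(2024, 1, 1), date(2024, 12, 31)),
    }
    counts = split_coverage(bars, splits)
    print(counts)                # prints {'train': 2, 'test': 0}
    print(empty_splits(counts))  # prints ['test']
```

In this example the file ends before the test period begins, so either the data or the test period needs to change.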
If the bars are finer than the research horizon (e.g. 5-minute bars for a daily study), resample:
factorminer resample-data raw_5m.csv bars_1h.parquet --rule 1h
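For intuition about what downsampling OHLCV bars involves, here is a sketch of the conventional aggregation in plain Python. It is an assumption that resample-data follows these rules (first open, max high, min low, last close, summed volume and amount), and labelling each bucket by the start of its hour is also an assumption. `resample_hourly` is a hypothetical helper that handles a single asset; a panel would be grouped by asset_id first.

```python
from datetime import datetime
from typing import Dict, List


def resample_hourly(bars: List[dict]) -> List[dict]:
    """Aggregate one asset's intraday bars into hourly bars (illustrative only)."""
    buckets: Dict[datetime, dict] = {}
    for bar in sorted(bars, key=lambda b: b["datetime"]):
        # Truncate the timestamp to the start of its hour to pick the bucket.
        hour = bar["datetime"].replace(minute=0, second=0, microsecond=0)
        agg = buckets.get(hour)
        if agg is None:
            # First bar in the bucket supplies the open.
            buckets[hour] = {
                "datetime": hour,
                "open": bar["open"], "high": bar["high"],
                "low": bar["low"], "close": bar["close"],
                "volume": bar["volume"], "amount": bar["amount"],
            }
        else:
            agg["high"] = max(agg["high"], bar["high"])
            agg["low"] = min(agg["low"], bar["low"])
            agg["close"] = bar["close"]      # last bar in the bucket wins
            agg["volume"] += bar["volume"]   # additive fields are summed
            agg["amount"] += bar["amount"]
    return [buckets[key] for key in sorted(buckets)]
```

Summing amount alongside volume keeps amount / volume meaningful on the coarser bars, so a vwap derived after resampling is still volume-weighted.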
To pull data from a financial-data MCP connector instead of a local file, write a small MCP-source config and run fetch-data. The config maps the connector's tool and field names onto the canonical loader-required schema, including volume and amount:
factorminer mcp-connectors
# factset_source.yaml
transport: http
url: https://mcp.factset.com/mcp
headers:
  Authorization: "Bearer ${FACTSET_TOKEN}"
tool: get_prices
arguments:
  ids: ["AAPL-US", "MSFT-US"]
  start: "2022-01-01"
  end: "2024-12-31"
  frequency: "1d"
records_path: data.prices
field_mapping:
  datetime: date
  asset_id: fsym_id
  open: price_open
  high: price_high
  low: price_low
  close: price_close
  volume: volume
  amount: turnover
factorminer fetch-data --mcp-config factset_source.yaml --output universe.parquet
factorminer validate-data universe.parquet
${ENV} placeholders keep credentials out of the file. The same pattern works for Daloopa, Morningstar, LSEG, S&P Global, Moody's, Aiera, PitchBook, Chronograph, MT Newswires, Egnyte, or any connector that returns tabular price data — only the tool name and field_mapping change. If the endpoint does not return liquidity fields, switch endpoints or enrich the file before mining rather than fabricating turnover.
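The three moving parts of the config (placeholder expansion, records_path, and field_mapping) can be sketched in a few lines of Python. This is a hedged illustration rather than fetch-data's implementation: the helper names are hypothetical, the dotted-path interpretation of records_path is an assumption, and the response payload shape in the usage note is invented for the example.

```python
import os
import re
from typing import Any, Dict, List, Mapping, Optional

_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_env(text: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Replace ${NAME} with the environment value; fail loudly if it is unset."""
    env = os.environ if env is None else env

    def _lookup(match):
        name = match.group(1)
        if name not in env:
            raise KeyError(f"environment variable {name} is not set")
        return env[name]

    return _PLACEHOLDER.sub(_lookup, text)


def extract_records(payload: Dict[str, Any], records_path: str) -> List[dict]:
    """Walk a dotted path such as 'data.prices' down to the list of records."""
    node: Any = payload
    for key in records_path.split("."):
        node = node[key]
    return node


def map_fields(records: List[dict], field_mapping: Dict[str, str]) -> List[dict]:
    """Rename source fields to canonical names (mapping is canonical -> source).

    A missing source field raises KeyError instead of being filled with a
    made-up value, in line with the advice not to fabricate turnover.
    """
    return [{canon: rec[src] for canon, src in field_mapping.items()}
            for rec in records]
```

With a hypothetical payload `{"data": {"prices": [...]}}`, calling `map_fields(extract_records(payload, "data.prices"), field_mapping)` yields rows keyed by datetime, asset_id, and the other canonical names, ready to be written out and validated.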
Once the dataset validates, hand it off to mine or helix.