# Scoring and Benchmarks

> How beta score, CapitalScore, ContestScore, and official evaluation relate.

AlphaEngine scores strategies on risk-adjusted, benchmark-relative evidence, not raw APY.

<Tabs>
  <Tab title="Beta score">
    Public beta ranking uses `score`, a benchmark-adjusted display number
    returned by the evaluation endpoint.
  </Tab>

  <Tab title="CapitalScore">
    CapitalScore is a stricter robustness diagnostic for guarded capital review.
  </Tab>

  <Tab title="ContestScore">
    ContestScore scores contest participation in a duplicate-aware way and
    stays separate from allocation ranking.
  </Tab>

  <Tab title="Private evaluation">
    Official private evaluation can combine hidden real-sister, perturbation,
    and stress scenario scores.
  </Tab>
</Tabs>

## Utility

Per-scenario utility combines return, drawdown, tail risk, turnover, and explicit
execution cost:

$$
U = w_r \mu - w_{dd}\mathrm{MDD} - w_{tail}\mathrm{CVaR} - w_{to}\mathrm{TO} - w_{cost}\mathrm{Cost}
$$

Default MVP weights:

| Weight   | Value | Meaning               |
| -------- | ----: | --------------------- |
| `w_r`    | `1.0` | return contribution   |
| `w_dd`   | `0.6` | drawdown penalty      |
| `w_tail` | `2.0` | tail-risk penalty     |
| `w_to`   | `0.3` | turnover penalty      |
| `w_cost` | `1.0` | explicit cost penalty |

Source: `ranking-evaluation/docs/p0-evaluation-design-v1.md`.
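
The utility formula and default weights above can be sketched in a few lines of Python. This is a minimal illustration, not the backend implementation: the `ScenarioMetrics` container and its field names are hypothetical, and the sketch assumes drawdown and CVaR are supplied as positive magnitudes so that each one acts as a penalty.

```python
from dataclasses import dataclass

# Default MVP weights, copied from the table above.
DEFAULT_WEIGHTS = {
    "w_r": 1.0,
    "w_dd": 0.6,
    "w_tail": 2.0,
    "w_to": 0.3,
    "w_cost": 1.0,
}


@dataclass(frozen=True)
class ScenarioMetrics:
    """Hypothetical container; field names are illustrative, not the API schema."""

    mean_return: float   # mu
    max_drawdown: float  # MDD, assumed to be a positive fraction
    cvar: float          # CVaR, assumed to be a positive loss magnitude
    turnover: float      # TO
    cost: float          # explicit execution cost


def scenario_utility(m: ScenarioMetrics, w: dict = DEFAULT_WEIGHTS) -> float:
    """Per-scenario utility: weighted return minus the four weighted penalties."""
    return (
        w["w_r"] * m.mean_return
        - w["w_dd"] * m.max_drawdown
        - w["w_tail"] * m.cvar
        - w["w_to"] * m.turnover
        - w["w_cost"] * m.cost
    )


example = ScenarioMetrics(
    mean_return=0.20, max_drawdown=0.10, cvar=0.02, turnover=0.10, cost=0.01
)
utility = scenario_utility(example)
```

With these illustrative inputs the terms are 0.20 - 0.06 - 0.04 - 0.03 - 0.01, so the utility is about 0.06. Note how the tail-risk weight of `2.0` makes a 2% CVaR cost as much as a 40% drop in the turnover term would.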

## Beta score

The public beta evaluation endpoint returns `score`:

$$
\mathrm{score} = 100 \times \frac{\mathrm{strategyUtility}}{\max(\mathrm{benchmarkUtility}, 0.0001)}
$$

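This is the current UI sort key. It compares the submitted simulation's utility
to the benchmark utility for the resolved dataset, so a score of 100 means the
strategy matches the benchmark. The `0.0001` floor keeps the denominator
positive when the benchmark utility is zero or negative.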

The response also includes diagnostics such as annualized return, max drawdown,
CVaR, turnover, execution cost, benchmark utility, eligibility, and CapitalScore.
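
The `score` formula above can be sketched as follows. This is a minimal illustration of the arithmetic only: the function and constant names are hypothetical and not taken from the endpoint.

```python
# Floor on the denominator, as in the score formula above.
BENCHMARK_FLOOR = 0.0001


def beta_score(strategy_utility: float, benchmark_utility: float) -> float:
    """Benchmark-adjusted display score: 100 means parity with the benchmark."""
    return 100.0 * strategy_utility / max(benchmark_utility, BENCHMARK_FLOOR)


# Strategy utility 0.06 against benchmark utility 0.04: about 150.
above_benchmark = beta_score(0.06, 0.04)

# A non-positive benchmark utility falls back to the floor, which
# inflates the score: 100 * 0.06 / 0.0001 is about 60000.
floored = beta_score(0.06, -0.5)
```

The second call shows why the score should be read alongside the returned benchmark utility: when the benchmark sits at or below the floor, the ratio stops being a meaningful comparison.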

## Benchmark Hold

Benchmark Hold is the comparison baseline for the same dataset and scored
window. It anchors the beta score so a strategy is measured against simple
principal-token carry rather than an isolated return number.

## CapitalScore

CapitalScore is the robust allocation diagnostic:

$$
\mathrm{CapitalScore} = \operatorname{mean}(U_s) - \kappa \cdot \operatorname{std}(U_s) - \frac{c}{\sqrt{T_{\mathrm{eff}}}}
$$

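It penalizes fragile strategies through the standard-deviation term, which
grows when utility varies widely across scenarios, and small samples through
the final term, which shrinks as the effective sample size grows. It should be
shown as a secondary diagnostic in beta unless product explicitly adopts it as
the main leaderboard sort key.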

Source: `ranking-evaluation/docs/p0-evaluation-design-v1.md`.
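
A minimal sketch of the CapitalScore formula above, under stated assumptions: this page does not give values for `kappa`, `c`, or the effective sample size, so they are required arguments here, and the sketch uses the population standard deviation, which may differ from the backend's choice. The function name is hypothetical.

```python
import math
import statistics


def capital_score(utilities, kappa, c, t_eff):
    """mean(U_s) - kappa * std(U_s) - c / sqrt(T_eff).

    Assumption: std is the population standard deviation (pstdev).
    """
    if not utilities:
        raise ValueError("at least one scenario utility is required")
    if t_eff <= 0:
        raise ValueError("t_eff must be positive")
    mean_u = statistics.fmean(utilities)
    std_u = statistics.pstdev(utilities)
    return mean_u - kappa * std_u - c / math.sqrt(t_eff)


# Illustrative values only; kappa, c, and t_eff are not documented defaults.
scenario_utilities = [0.04, 0.06, 0.08]
large_sample = capital_score(scenario_utilities, kappa=1.0, c=0.1, t_eff=100)
small_sample = capital_score(scenario_utilities, kappa=1.0, c=0.1, t_eff=4)
```

Both calls share the same mean and dispersion, so `small_sample` is lower than `large_sample` purely because of the sample-size term.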

## ContestScore

ContestScore is separate from allocation logic. It supports contest rewards and
duplicate-aware participation scoring. It must not influence CapitalScore.

Source: `ranking-evaluation/docs/development-log.md`.

## Official private evaluation

The internal official-market score combines hidden scenario scores:

$$
\mathrm{officialMarketScore}
= 0.65 \cdot \mathrm{RealSisterScore}
+ 0.20 \cdot p_{25}(\mathrm{PerturbationScores})
+ 0.15 \cdot p_{25}(\mathrm{StressScores})
$$

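The p25 terms take the 25th percentile of each score set, so weak scenarios
count for more than the average. The current implementation expects one
real-sister score, five perturbation scores, and five stress scores.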

Source: `ranking-evaluation/docs/development-log.md`.
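
The weighted combination above can be sketched as follows. The percentile method is an assumption: this sketch uses linear interpolation between sorted values, under which the 25th percentile of five scores is exactly the second-lowest score. The backend may use a different percentile definition, and the function names here are hypothetical.

```python
import math


def percentile(values, q):
    """Linear-interpolation percentile for q in [0, 1] (assumed method)."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    rank = (len(ordered) - 1) * q
    lo = math.floor(rank)
    hi = math.ceil(rank)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (rank - lo)


def official_market_score(real_sister, perturbation_scores, stress_scores):
    """0.65 * real-sister + 0.20 * p25(perturbation) + 0.15 * p25(stress)."""
    if len(perturbation_scores) != 5 or len(stress_scores) != 5:
        raise ValueError("expected five perturbation and five stress scores")
    return (
        0.65 * real_sister
        + 0.20 * percentile(perturbation_scores, 0.25)
        + 0.15 * percentile(stress_scores, 0.25)
    )


# Illustrative scores only.
official = official_market_score(
    real_sister=100.0,
    perturbation_scores=[120.0, 80.0, 100.0, 90.0, 110.0],
    stress_scores=[70.0, 40.0, 90.0, 60.0, 80.0],
)
```

Here the perturbation p25 is 90 and the stress p25 is 60, giving 65 + 18 + 9, or about 92.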

## Eligibility gates

Eligibility is separate from ranking. Current backend gates include:

* positive-utility fraction at least `0.70`,
* max drawdown no more than `0.35`,
* mean turnover no more than `0.25`,
* p95 runtime no more than `10_000` ms,
* budget and config validity,
* finite numeric values.

Eligibility failures should be explicit and fixable.
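
A minimal sketch of how the gates listed above could be checked while keeping each failure explicit. The `GateInputs` container, its field names, and the message format are hypothetical; only the thresholds come from the list. Budget and config validity is reduced to a single boolean here because this page does not describe how it is determined.

```python
import math
from dataclasses import dataclass

# Thresholds from the gate list above.
MIN_POSITIVE_UTILITY_FRACTION = 0.70
MAX_DRAWDOWN = 0.35
MAX_MEAN_TURNOVER = 0.25
MAX_P95_RUNTIME_MS = 10_000


@dataclass(frozen=True)
class GateInputs:
    """Hypothetical inputs; field names are illustrative, not the API schema."""

    positive_utility_fraction: float
    max_drawdown: float
    mean_turnover: float
    p95_runtime_ms: float
    budget_and_config_valid: bool


def eligibility_failures(x: GateInputs) -> list:
    """Return one message per failed gate; an empty list means eligible."""
    checks = [
        ("positive_utility_fraction", x.positive_utility_fraction,
         lambda v: v >= MIN_POSITIVE_UTILITY_FRACTION, "must be at least 0.70"),
        ("max_drawdown", x.max_drawdown,
         lambda v: v <= MAX_DRAWDOWN, "must be no more than 0.35"),
        ("mean_turnover", x.mean_turnover,
         lambda v: v <= MAX_MEAN_TURNOVER, "must be no more than 0.25"),
        ("p95_runtime_ms", x.p95_runtime_ms,
         lambda v: v <= MAX_P95_RUNTIME_MS, "must be no more than 10000 ms"),
    ]
    failures = []
    for name, value, passes, rule in checks:
        if not math.isfinite(value):
            # NaN and infinities fail the finite-values gate outright.
            failures.append(f"{name}: not a finite number")
        elif not passes(value):
            failures.append(f"{name}: {value} {rule}")
    if not x.budget_and_config_valid:
        failures.append("budget_and_config: invalid")
    return failures


eligible = eligibility_failures(GateInputs(0.80, 0.20, 0.10, 2_500, True))
rejected = eligibility_failures(GateInputs(0.80, 0.40, float("nan"), 2_500, True))
```

Returning every failed gate, rather than stopping at the first one, lets a strategy author fix all issues in one pass. In this example `eligible` is empty and `rejected` reports the drawdown breach and the non-finite turnover.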

## Related pages

<CardGroup cols={3}>
  <Card title="Leaderboard and evidence" icon="table" href="/arena/leaderboard-and-evidence">
    See how rows should present score, diagnostics, artifacts, and eligibility.
  </Card>

  <Card title="Eligibility, not allocation" icon="shield" href="/capital/eligibility-not-allocation">
    Keep capital eligibility separate from allocation promises.
  </Card>

  <Card title="Evaluations API" icon="file-check" href="/api/evaluations">
    Inspect the implemented evaluation endpoint behavior.
  </Card>
</CardGroup>
