After Action Report: Game-Share Fold-In to TOSS

Date: 2026-04-26 (computed and shipped)
Effective: 2026-04-26 (primary Power Index formula updated for 2026+ season-long rankings)
Branch: claude/marist-quality-win-analysis-AUd3L
Author: Engineering, after iterative review with site owner.

What we did

The primary Power Index formula on the 2026 season rankings now reads:

PI = 0.40 × APR + 0.40 × FQI + 0.20 × oGS

Where oGS is a new third component: the opponent-APR-weighted aggregate share of *games* (not flights) a team won across the season. Both FQI and oGS use the same per-match opponent multiplier — opp_APR / median_APR — so the schedule-aware scaling that was already in FQI applies identically to game share.
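A minimal sketch of the three-component formula, for illustration only. The weight constant names match those listed in the Rollback section; the function names and the example inputs are hypothetical and are not taken from generate_site.py or the 2026 data.

```python
# Weights as stated in the formula above.
TOSS_APR_WEIGHT = 0.40
TOSS_FQI_WEIGHT = 0.40
TOSS_OGS_WEIGHT = 0.20


def opponent_multiplier(opp_apr: float, median_apr: float) -> float:
    """Per-match schedule scaling shared by FQI and oGS: opp_APR / median_APR."""
    if median_apr <= 0:
        raise ValueError("median_apr must be positive")
    return opp_apr / median_apr


def power_index(apr: float, fqi: float, ogs: float) -> float:
    """PI = 0.40 * APR + 0.40 * FQI + 0.20 * oGS."""
    return TOSS_APR_WEIGHT * apr + TOSS_FQI_WEIGHT * fqi + TOSS_OGS_WEIGHT * ogs


# Hypothetical component values for a single team.
example_pi = power_index(apr=0.60, fqi=0.70, ogs=0.55)
example_multiplier = opponent_multiplier(opp_apr=0.35, median_apr=0.50)
```

Keeping the multiplier in one function reflects the point made above: FQI and oGS are scaled by the same per-match factor, so any change to the scaling applies to both.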

Historical seasons (2021–2025) keep the original 50/50 APR + FWS formula and are unchanged. The three already-published Saturday weekly snapshots (2026-04-04, 2026-04-11, 2026-04-18) keep their original published values and are unchanged. The change applies forward only, starting with the 2026-04-26 season-long rankings; weekly snapshots from Week 4 (2026-04-26 Sunday cadence) onward use the new formula.

Why we did it

One week after TOSS (APR + FQI, 50/50) was promoted to the primary Power Index on 2026-04-26, a residual distortion remained visible: a small number of undefeated teams in shallow conferences continued to sit above demonstrably stronger teams from deeper conferences, despite the FQI multiplier already penalizing weak-opponent flight wins. The mechanism was structural: a team that wins 8–0 in flights every match has a raw FWS near 1.0, and the multiplier-adjusted score of 1.0 × 0.7 = 0.7 (0.7 being the typical multiplier for thin-conference opponents) still sits well above the field median. The multiplier is correct in direction but too weak in magnitude to reorder the top of the rankings against the kind of dominant flight outcomes that arise from running the table on a shallow schedule.

Two failure modes were visible to readers:

The site owner's explicit constraint was that the fix must not introduce a class-aware multiplier. No region or classification of the state should be hand-weighted up or down. The fix had to come from a signal already present in the data.

What we considered

Two design directions surfaced:

1. Steeper opponent multiplier on FQI. Replace the linear opp_APR / median_APR with a non-linear curve (squared, or piecewise more aggressive below 1.0). Single-knob change, easy to roll back, but amplifies an existing parameter rather than adding new information. Risks tuning sensitivity issues if the median APR shifts season to season.

2. Add a third component sourced from set/game data. Tennis exposes per-set, per-game outcomes that are absent from most team-sport rating systems. Folding in opponent-weighted aggregate game share gives the index a second, independent signal: not just "did you win the flight?" but "by how much?". That distinguishes a 6–2 dual-match win whose flights came in straight 6–0 sets from a 6–2 win whose flights went the distance.
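For reference, the candidate curves named in direction 1 can be sketched as follows. The function names and the exponent of 2 for the piecewise variant are assumptions made for illustration; the report does not specify which exact curves were dry-run.

```python
def linear_multiplier(opp_apr: float, median_apr: float) -> float:
    """The existing FQI multiplier: opp_APR / median_APR."""
    return opp_apr / median_apr


def squared_multiplier(opp_apr: float, median_apr: float) -> float:
    """Non-linear variant: squares the ratio, so weak opponents count for less."""
    return (opp_apr / median_apr) ** 2


def piecewise_multiplier(opp_apr: float, median_apr: float,
                         exponent: float = 2.0) -> float:
    """Steeper only below 1.0; at or above the median the curve stays linear.

    The exponent is a hypothetical tuning knob.
    """
    ratio = opp_apr / median_apr
    return ratio ** exponent if ratio < 1.0 else ratio


# A thin-conference opponent at 0.7x the median APR (hypothetical values).
weak_linear = linear_multiplier(0.35, 0.50)
weak_squared = squared_multiplier(0.35, 0.50)
weak_piecewise = piecewise_multiplier(0.35, 0.50)
strong_piecewise = piecewise_multiplier(0.60, 0.50)
```

All three variants depend on median_apr alone, which is why this direction reshapes an existing parameter without adding a new signal.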

Both directions were dry-run against the current data. The single-knob approach moved teams in roughly the same direction but added no information the existing parameters didn't already encode. The fold-in approach correlated 0.86–0.90 with the existing PI but disagreed in *interpretable* ways — the divergent cases looked like the cases the owner was complaining about, not noise. We chose direction 2.

Weight calibration

Three weight splits were dry-run side by side:

| Split      | APR  | FQI  | oGS  | Effect |
|------------|------|------|------|--------|
| Mild       | 0.45 | 0.45 | 0.10 | Directional but soft; some over-rewarded teams still sat too high |
| Moderate   | 0.40 | 0.40 | 0.20 | Sweet spot: top-of-rankings inflation resolved without overcorrecting |
| Aggressive | 0.35 | 0.35 | 0.30 | Punitive on undefeated thin-conference teams; began diluting FQI's anti-stacking purpose |

Moderate landed on the right balance for the season's data. It moved the teams the eye-test flagged as over-rewarded into reasonable bands while keeping the FQI signal dominant enough to preserve its anti-stacking role.
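A side-by-side dry run of the three splits might look like the sketch below. The split weights come from the calibration above; the team's component values are invented for illustration and the helper name is hypothetical.

```python
# (APR weight, FQI weight, oGS weight) for each candidate split.
SPLITS = {
    "Mild": (0.45, 0.45, 0.10),
    "Moderate": (0.40, 0.40, 0.20),
    "Aggressive": (0.35, 0.35, 0.30),
}


def pi_under_split(apr: float, fqi: float, ogs: float,
                   weights: tuple[float, float, float]) -> float:
    """Weighted sum of the three components under one candidate split."""
    w_apr, w_fqi, w_ogs = weights
    return w_apr * apr + w_fqi * fqi + w_ogs * ogs


# Hypothetical team whose oGS trails its APR and FQI.
team = {"apr": 0.80, "fqi": 0.70, "ogs": 0.50}

results = {name: pi_under_split(**team, weights=w) for name, w in SPLITS.items()}
for name, value in results.items():
    print(f"{name:<10} PI = {value:.3f}")
```

For a team like this one, the PI falls as the oGS weight rises, which is the direction of movement the calibration was probing.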

Game-share computation

Per dual match, the team's game_share is total games won divided by total games played, with both totals summed across all flights, and with three format-specific rules:

Flights without set-level data fall back to a binary one-game outcome (one to the winner, zero to the loser) so coverage stays high. In the 2026 dataset, set-level data is present on 98.1% of contested flights.
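The per-match aggregation and the binary fallback can be sketched as below. This is an illustration under stated assumptions: the Flight type and its field names are hypothetical, the fallback is read as one game played with one game credited to the winner, and the format-specific rules are not modeled here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Flight:
    won: bool                          # did the team win this flight?
    games_won: Optional[int] = None    # None when set-level data is missing
    games_lost: Optional[int] = None   # None when set-level data is missing


def match_game_share(flights: list[Flight]) -> float:
    """Games won / games played, summed across all flights of one dual match."""
    won_total = 0
    played_total = 0
    for flight in flights:
        if flight.games_won is None or flight.games_lost is None:
            # Binary fallback: treat the flight as a single game,
            # one to the winner and zero to the loser.
            won_total += 1 if flight.won else 0
            played_total += 1
        else:
            won_total += flight.games_won
            played_total += flight.games_won + flight.games_lost
    if played_total == 0:
        raise ValueError("no contested flights in this match")
    return won_total / played_total


# Hypothetical match: two flights with set-level data, one without.
flights = [Flight(True, 12, 3), Flight(False, 5, 12), Flight(True)]
share = match_game_share(flights)
```

Summing before dividing means a flight contributes in proportion to the games it contained, so a fallback flight carries far less weight than a flight with full set-level data.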

The team's season-long oGS is the arithmetic mean of (match_game_share × opp_APR / median_APR) across all dual matches — the same formula shape as FQI, just with game share replacing flight score as the per-match input.
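As a sketch, the season-long aggregation is a plain mean over the per-match products. The function name and the input shape, a list of (match_game_share, opp_APR) pairs, are assumptions for illustration; the values are invented.

```python
from statistics import mean


def season_ogs(matches: list[tuple[float, float]], median_apr: float) -> float:
    """Arithmetic mean of match_game_share * opp_APR / median_APR.

    `matches` holds one (match_game_share, opp_apr) pair per dual match.
    """
    if median_apr <= 0:
        raise ValueError("median_apr must be positive")
    return mean(share * opp_apr / median_apr for share, opp_apr in matches)


# Hypothetical season: three dual matches against a weak, a median,
# and a strong opponent, with a median APR of 0.50.
matches = [(0.80, 0.35), (0.60, 0.50), (0.50, 0.65)]
ogs = season_ogs(matches, median_apr=0.50)
```

In this example the lopsided win over the weak opponent (0.80 × 0.7 = 0.56) contributes less than the even split against the strong one (0.50 × 1.3 = 0.65).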

What shipped

Code changes (generate_site.py)

JSON schema additions (processed_rankings.json)

For every 2026 entry:

Methodology page

A new section, Opponent-Weighted Game Share (oGS), sits between the FQI and H2H sections and explains:

The Overview & Formula section, the TOSS formula box, and the table of contents were updated to reflect the three-component structure. The QWS and Legacy formulas are unchanged.

Weekly rankings

The weekly rankings generator (scripts/generate_weekly_rankings.py) reads power_index from the team rankings, so it picks up the new formula automatically for any week generated on or after 2026-04-26. Earlier weekly snapshots are not regenerated; they remain canonical at their published values.

Outcomes

The Pearson correlation between the previous (50/50 APR + FQI) primary PI and the new (40/40/20) primary PI is roughly 0.96 across both genders for 2026 — most teams move zero or one rank slot. The reorderings cluster at the top of the table and in the mid-pack, exactly where the residual distortions sat:

No class-aware multiplier was introduced, no school is named in the formula, and no league is treated differently from any other. The signal comes entirely from per-match game data the upstream feed already provides.
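The correlation and rank-movement comparison reported in this section could be reproduced along these lines. This is a sketch only: the helper names are hypothetical, the three-team data is invented, and the actual analysis may have been run differently.

```python
from math import sqrt


def pearson(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


def rank_moves(old_pi: dict[str, float], new_pi: dict[str, float]) -> dict[str, int]:
    """Rank change per team; positive means the team moved up."""
    def ranks(pi: dict[str, float]) -> dict[str, int]:
        ordered = sorted(pi, key=pi.get, reverse=True)
        return {team: position for position, team in enumerate(ordered, start=1)}

    old_rank, new_rank = ranks(old_pi), ranks(new_pi)
    return {team: old_rank[team] - new_rank[team] for team in old_pi}


# Invented values for three placeholder teams.
old = {"A": 0.80, "B": 0.75, "C": 0.60}
new = {"A": 0.74, "B": 0.76, "C": 0.61}
teams = sorted(old)
r = pearson([old[t] for t in teams], [new[t] for t in teams])
moves = rank_moves(old, new)
```

A high correlation alongside a handful of non-zero entries in moves is the pattern described above: most teams hold their position while a few swap places.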

What we didn't do

Risks and follow-ups

Rollback

If the formula needs to be reverted to 50/50 APR + FQI, set the constants in generate_site.py:

TOSS_APR_WEIGHT = 0.50
TOSS_FQI_WEIGHT = 0.50
TOSS_OGS_WEIGHT = 0.0

and regenerate. No data migration is required because oGS and game_share remain valid informational fields regardless of the weight applied to them in the PI formula.
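When the constants are edited by hand, a small guard can catch a split that no longer sums to 1. The helper below is hypothetical and not part of generate_site.py; it assumes the three weights should always be non-negative and sum to 1.0, as both the 40/40/20 and the 50/50/0 splits do.

```python
import math


def validate_weights(apr_w: float, fqi_w: float, ogs_w: float) -> None:
    """Raise ValueError if the weights are negative or do not sum to 1.0."""
    weights = (apr_w, fqi_w, ogs_w)
    if any(w < 0 for w in weights):
        raise ValueError(f"weights must be non-negative, got {weights}")
    if not math.isclose(sum(weights), 1.0, abs_tol=1e-9):
        raise ValueError(f"weights must sum to 1.0, got {sum(weights)}")


# The rollback split passes: oGS is still computed, but carries zero weight.
validate_weights(0.50, 0.50, 0.0)
```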