Changelog

2026-04-26

Changed: TOSS Power Index weights rebalanced to 65/25/10

What: The TOSS formula is now 0.65 × APR + 0.25 × FQI + 0.10 × oGS (previously 0.40 / 0.40 / 0.20). Record-based APR is now the primary signal; FQI and oGS are secondary indicators of flight-level and game-level dominance rather than co-equal terms.
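
As a rough illustration of the rebalance, the sketch below evaluates the same two teams under both weightings. The weights come from this entry; the function name toss_power_index and the component values are hypothetical and are not taken from generate_site.py.

```python
# Hypothetical sketch: weights are from this entry, everything else is illustrative.
OLD_WEIGHTS = (0.40, 0.40, 0.20)  # APR, FQI, oGS (previous split)
NEW_WEIGHTS = (0.65, 0.25, 0.10)  # APR, FQI, oGS (current split)


def toss_power_index(apr, fqi, ogs, weights=NEW_WEIGHTS):
    """Weighted sum of the three TOSS components, each on a 0-1 scale."""
    w_apr, w_fqi, w_ogs = weights
    return w_apr * apr + w_fqi * fqi + w_ogs * ogs


# Made-up component values: team A has the better record (APR),
# team B has the more dominant flight and game scores.
team_a = {"apr": 0.62, "fqi": 0.55, "ogs": 0.54}
team_b = {"apr": 0.56, "fqi": 0.64, "ogs": 0.60}

old_a = toss_power_index(**team_a, weights=OLD_WEIGHTS)
old_b = toss_power_index(**team_b, weights=OLD_WEIGHTS)
new_a = toss_power_index(**team_a)
new_b = toss_power_index(**team_b)

# Under 40/40/20 team B edges ahead; under 65/25/10 team A does.
print(old_a < old_b, new_a > new_b)
```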

Why: The 40/40/20 split allowed a strong-flight, weaker-record team to outrank a stronger-record team: an 8-5-2 team with 8-0 / 7-1 wins could land ahead of an 8-2 team that won 5-3 / 6-2. FQI was never meant to carry that large a share of the formula; rebalancing keeps its anti-stacking purpose intact while letting win-loss record do the heavy lifting.

Impact: Reorders the 2026 table moderately. Biggest movers: an 8-2-0 6A team rising 17 spots into the top 10, an 11-3-0 team dropping out of the top 5, and several teams with dominant flight scores in shallow leagues sliding back. Historical seasons (2021–2025) use the legacy RPI formula and are unchanged.

Fixed: Weekly composite rankings broke ties arbitrarily

Problem: When two or more teams had the same composite rank in the weekly computer rankings, their relative order was effectively dict insertion order — there was no defined tiebreaker, so a team with median rank 23 could land above a team at median 18 with the same composite. Affected every week's composite table since launch.

Fix: The weekly composite sort now applies tiebreakers in order: composite rank → median rank (ascending — more consistent placement wins) → main-site Power Index (descending) → school_id (deterministic fallback). Re-running Week 4 reorders 12 tied groups; the largest case is the three-way 20.2 tie that now resolves to median 18 → 21 → 23 instead of the previous reverse order.
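
The tiebreaker chain above maps naturally onto a tuple sort key. The sketch below is a minimal illustration; the dictionary field names (composite, median_rank, power_index) and the sample rows are assumptions, not the actual structures in the weekly script.

```python
def composite_sort_key(team):
    """Composite rank, then median rank (ascending), then Power Index
    (descending, hence the negation), then school_id as a fixed fallback."""
    return (
        team["composite"],
        team["median_rank"],
        -team["power_index"],
        team["school_id"],
    )


# Made-up rows, including a three-way tie on composite 20.2.
teams = [
    {"school_id": 301, "composite": 20.2, "median_rank": 23, "power_index": 0.58},
    {"school_id": 102, "composite": 20.2, "median_rank": 18, "power_index": 0.55},
    {"school_id": 205, "composite": 20.2, "median_rank": 21, "power_index": 0.57},
    {"school_id": 410, "composite": 19.0, "median_rank": 20, "power_index": 0.60},
]

ordered = sorted(teams, key=composite_sort_key)
order_ids = [t["school_id"] for t in ordered]
```

Because every level of the key is explicit, the result no longer depends on the order in which teams were inserted into the input.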

Added: Teams with fewer than 3 dual matches now display as "NR"

What: A team must have played at least 3 dual matches to receive a numeric state rank, class rank, or league rank. Teams below the threshold are emitted with all rank-style fields (rank, class_rank, league_rank, rank_toss, rank_qws, rank_legacy, and the matching class_rank_* variants) set to null, and the rankings table renders them as NR. They're also excluded from head-to-head swap eligibility, the class-average FQI baseline, and the playoff simulator's eligible field.

Why: A 1-match résumé is not enough signal to support a rank. Before the fix, a team with a single early-season win could appear at state rank #2 simply because their per-match average sat above everyone else's. The empirical-Bayes shrinkage on FQI/oGS already softened this for teams with 4–5 matches, but truly tiny samples (1–2 matches) still don't belong on a ranked list at all.

Impact: 2026 currently has 0 boys teams and 1 girls team marked NR. Across the 2021–2026 archive, 30 historical entries flip from numeric to NR, all with 0–2 dual matches. Numeric metrics (APR, FQI, PI, record, etc.) are still computed and emitted on NR entries, so a team reappears with a rank as soon as match #3 is played. Threshold is MIN_RANKED_MATCHES = 3 in generate_site.py.
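
A minimal sketch of the threshold follows. MIN_RANKED_MATCHES and the rank field names come from this entry; the helper names apply_nr_threshold and display_rank, and the prefix match used for the class_rank_* variants, are assumptions made for illustration.

```python
MIN_RANKED_MATCHES = 3  # constant name and value from this entry

# Base rank-style fields named in this entry; class_rank_* variants are
# matched by prefix below (an assumption about how they are named).
RANK_FIELDS = {"rank", "class_rank", "league_rank", "rank_toss", "rank_qws", "rank_legacy"}


def apply_nr_threshold(entry, dual_matches_played):
    """Return a copy of entry with rank-style fields nulled when below the threshold."""
    out = dict(entry)
    if dual_matches_played < MIN_RANKED_MATCHES:
        for key in out:
            if key in RANK_FIELDS or key.startswith("class_rank"):
                out[key] = None
    return out


def display_rank(entry):
    """Render a null rank as NR; numeric metrics on the entry are left alone."""
    return "NR" if entry["rank"] is None else str(entry["rank"])
```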

Changed: TOSS FQI and oGS now apply empirical-Bayes shrinkage

Problem: Both opponent-weighted components in the TOSS Power Index — FQI (flight quality) and oGS (game share) — were a straight per-match arithmetic mean with no sample-size adjustment. A team that played five lopsided wins early in the season could land above a team with twelve matches and a couple of competitive losses, because the small-sample team's per-match average had nothing pulling it back toward a league baseline. The reported case had a 4-1 team ranking just ahead of a 10-3 league rival on raw PI even though the 10-3 team had won the head-to-head; the H2H swap corrected the state rank but the underlying number was the wrong way around.

Fix: Added 5 phantom matches at the neutral 0.5 baseline (multiplier 1.0) to both FQI and oGS calculations. A team with N actual matches is now pulled 5 / (N + 5) of the way toward 0.5: a 5-match team is shrunk 50%, a 60-flight (~10 dual-match) team 33%, a 13-match team 28%, and a full-season (~15-match) team 25%. APR is unchanged because the RPI-style OWP/OOWP terms already shrink it via the schedule graph.

Constants: TOSS_PRIOR_MATCHES = 5, TOSS_PRIOR_VALUE = 0.5 in generate_site.py. Same prior is applied to FQI and oGS so the regression strength is consistent across the two opp-weighted components.
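
The arithmetic can be sketched as below. The two constants are from this entry; the function names and the idea of passing in already opponent-weighted per-match values are assumptions for illustration, not the code in generate_site.py.

```python
TOSS_PRIOR_MATCHES = 5    # from this entry
TOSS_PRIOR_VALUE = 0.5    # from this entry


def shrunk_mean(per_match_values):
    """Mean of the real per-match values plus TOSS_PRIOR_MATCHES phantom
    matches at TOSS_PRIOR_VALUE."""
    n = len(per_match_values)
    total = sum(per_match_values) + TOSS_PRIOR_MATCHES * TOSS_PRIOR_VALUE
    return total / (n + TOSS_PRIOR_MATCHES)


def shrink_fraction(n_matches):
    """Share of the distance toward the prior that a team with n_matches is pulled."""
    return TOSS_PRIOR_MATCHES / (n_matches + TOSS_PRIOR_MATCHES)


# Five lopsided wins at 0.9 each: the raw mean is 0.9, the shrunk mean is 0.7,
# i.e. halfway between 0.9 and the 0.5 baseline.
five_match_team = shrunk_mean([0.9] * 5)
```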

Impact: 2026 only (TOSS is the primary model for 2026 alone; 2021–2025 retain their RPI-based legacy formula and are unchanged). The biggest movers are tiny-sample season-opener teams sliding back toward the field (the most dramatic was a 1-match team that had been at state rank #2 dropping to #25) and a few mid-pack teams with deep schedules moving up. Head-to-head results among the affected leagues now match intuition without relying on a swap to rescue them.

Fixed: League rank stale after H2H tiebreaker swaps

Problem: League standings could show a team at league rank #1 while the same team sat *below* a league-mate in the overall state ranking, because the head-to-head tiebreaker pass had already moved the league-mate ahead in the state list. Affected 466 teams across 139 (year, gender, league) groups going back to 2021, including 88 teams in 2026.

Root cause: school_league_rank was built once from the initial Power Index sort (used as a *condition* for the H2H swap pass), then never recomputed after the swap pass reordered teams. The output entry's league_rank field was reading the stale pre-swap value while rank was reading the post-swap order.

Fix: After all H2H swap phases finish, rebuild school_league_rank from the post-swap ranked order. State rank and league rank are now monotonic within every league.
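
One way to picture the rebuild: walk the final post-swap order once and number teams within each league as they appear. The function name and the team fields below are hypothetical, and the input is assumed to hold a single year and gender.

```python
from collections import defaultdict


def rebuild_league_ranks(ranked_teams):
    """ranked_teams is the final post-swap state order, best first.
    Returns school_id -> league rank derived from that order."""
    next_rank = defaultdict(int)
    school_league_rank = {}
    for team in ranked_teams:
        next_rank[team["league"]] += 1
        school_league_rank[team["school_id"]] = next_rank[team["league"]]
    return school_league_rank
```

Deriving league rank from the same list that produces state rank is what keeps the two monotonic within a league.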

Changed: TOSS Power Index folds in opponent-weighted Game Share (40/40/20 split)

What: The primary Power Index formula is now 0.40 × APR + 0.40 × FQI + 0.20 × oGS. A new third component, oGS (opponent-weighted Game Share), is the season-aggregate share of *games* (not just flights) a team won, scaled by the same opp_APR / median_APR multiplier that FQI uses. Set-type-aware: best-of-3 sets and 8-game pro sets contribute raw game totals, regular-set tiebreakers count as one deciding game, and 10-point match tiebreakers count as a single decision (not 17 games). Coverage is 98.1% on 2026 flight matches; flights without set data fall back to a binary one-game outcome.
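
The set-type-aware counting can be sketched as follows. The tuple layout (my games, opponent games, kind) and the kind labels are assumptions about how set data might be represented; the opponent multiplier and the no-set-data fallback are left out.

```python
def game_totals(sets):
    """Sum games won and lost across one flight's sets.

    Regular sets and pro sets contribute their raw game counts (a 7-6 set
    counts 7 and 6, so its tiebreaker is one deciding game). A 10-point
    match tiebreaker counts as a single decision, not as raw points."""
    won = lost = 0
    for mine, theirs, kind in sets:
        if kind == "match_tiebreak":
            won += 1 if mine > theirs else 0
            lost += 1 if theirs > mine else 0
        else:
            won += mine
            lost += theirs
    return won, lost


def game_share(won, lost):
    """Fraction of games won, or None when no games were recorded."""
    total = won + lost
    return won / total if total else None


# Split sets decided by a 10-7 match tiebreaker: 13 games won, 11 lost.
example_won, example_lost = game_totals(
    [(6, 4, "set"), (6, 7, "set"), (10, 7, "match_tiebreak")]
)
```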

Why: One week of live data on the 50/50 APR + FQI version surfaced a residual distortion: undefeated teams in shallow conferences could carry a maxed-out raw flight score even after the opponent multiplier discounted it, because 1.0 × 0.7 still sits above field median. Folding in oGS — which natively distinguishes a 6–2 flight win that came in straight 6–0 sets from a 6–2 flight win that went the distance — pulls those teams into more honest bands without introducing any class-aware logic. FQI's anti-stacking purpose is preserved (its weight stays equal to APR; the new term is additive, not a replacement).

Impact: Historical seasons (2021–2025) and the three already-published Saturday weekly snapshots (2026-04-04, 2026-04-11, 2026-04-18) are unchanged. The 2026 season-long rankings reorder modestly — Pearson correlation between the previous and new primary PI is ~0.96, most teams move zero or one slot. Biggest movement is at the top of the table (undefeated thin-conference teams move down) and in the mid-pack (deeper-conference teams with competitive losses rise). New JSON fields on every 2026 entry: games_won, games_played, game_share (informational aggregates), and ogs (the opp-weighted variant that feeds the formula).

Detail: Game-Share Fold-In AAR and methodology page.

Changed: TOSS is now the primary Power Index

What: Starting with Week 4 (2026-04-26, the first Sunday-cadence snapshot), the main rankings table, class ranks, head-to-head tiebreakers, league standings, and the playoff simulator all use TOSS as the primary Power Index. The pre-2026-04-26 RPI-based formula is retained as Legacy in the Model dropdown for comparison, and QWS continues as the experimental B of the ongoing A/B test. The Model selector above the rankings table switches the State Rank / Class Rank / Power Index columns between the three models (TOSS primary is the default).

Why: One week of live A/B data (weeks 1-3 Saturday snapshots + the 2026-04-20 baseline run) made the problem with the RPI-based model concrete. Teams dominating thin leagues were ranked above teams with comparable records in tougher leagues, because the old FWS component had no opponent-strength awareness. TOSS fixes this with a per-match multiplier keyed to opponent APR while keeping the OSAA-compatible RPI APR unchanged. QWS is a more aggressive structural fix (replaces APR with quality-weighted wins) and stays in parallel for 2027 evaluation; the flat 50-point loss penalty in QWS is the reason it isn't the primary yet.

Impact: Historical seasons (2021-2025) are unchanged. The three already-published Saturday weekly snapshots (2026-04-04/11/18) are unchanged. Every 2026 team in processed_rankings.json now carries the primary TOSS rank plus rank_legacy, rank_qws, class_rank_legacy, class_rank_qws, and the corresponding PI values for side-by-side comparison. Biggest reorderings happen around thin-league undefeated teams (down) and strong-league mid-pack teams (up) — see the AAR for the full list.

Detail: Power Index A/B Test AAR and methodology page.

Changed: Flight-quality component named FQI, clamping removed

What: The opponent-weighted flight metric inside the TOSS formula is now named FQI (Flight Quality Index). The TOSS formula reads as 0.5 × APR + 0.5 × FQI. On the main rankings table, the prior FWS% column is replaced by FQI, shown in the same 0–1 range as APR with 4-decimal precision. The prior hover tooltip FWS+ becomes FQI+ (same classification-relative index, 100 = classification average, now computed from FQI instead of raw flight-win percentage).

The 0.75–1.25 per-match multiplier clamp that initially shipped with TOSS has been removed. FQI's opponent-weight multiplier is now opp_apr / median_apr uncapped, with unknown opponents defaulting to the median (multiplier 1.0).
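
To make the difference concrete, the sketch below computes the multiplier with and without the old clamp. The function name and the optional clamp parameter are hypothetical; only the opp_apr / median_apr ratio, the 0.75–1.25 bounds, and the 1.0 default for unknown opponents come from this entry.

```python
def opponent_multiplier(opp_apr, median_apr, clamp=None):
    """Per-match opponent weight: opp_apr / median_apr.

    Unknown opponents (opp_apr is None) default to 1.0. Passing
    clamp=(0.75, 1.25) reproduces the bounded behavior that was removed."""
    if opp_apr is None:
        return 1.0
    mult = opp_apr / median_apr
    if clamp is not None:
        low, high = clamp
        mult = max(low, min(high, mult))
    return mult


# Made-up APRs: a strong opponent at 0.70 against a field median of 0.50.
unclamped = opponent_multiplier(0.70, 0.50)                   # about 1.4
clamped = opponent_multiplier(0.70, 0.50, clamp=(0.75, 1.25))  # capped at 1.25
```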

Why rename: "FWS" described what the metric used to do (flights-weighted by position only). The metric is no longer that — it weights flights by position *and* by opponent strength. A new name makes clear this is a different measurement than the FWS% column it replaces. FQI is the Oregon label; the metric is documented portably under the generic name oFWS (opponent-weighted Flight-Weighted Score) in docs/oFWS-PRD.md so other states or sports with different flight structures can adopt the same approach.

Why remove clamping: The clamp bounded per-match impact to ±25% against the median opponent, which dampens the very signal the metric is designed to produce. A flight-level win against a top-APR opponent is meaningfully more informative than a flight-level win against a bottom-APR opponent; clamping hides that difference rather than exposing it. If APR is trustworthy enough to serve as the multiplier, it's trustworthy enough unclamped; if it isn't, clamping treats a symptom of APR instability instead of the cause. Rollout velocity is managed by the existing TOSS_PRIMARY_DATE gate, not by neutering the math.

Constraint preserved: FWS's original role — discouraging singles-heavy stacking by weighting top flights (S1/D1 at 1.00 down to S4/D4 at 0.10) — is untouched. FQI layers opponent-weighting on top of the existing flight-weight structure.

Backcompat: Legacy JSON field names (normalized_fws, fws_plus) remain as aliases in processed_rankings.json so existing consumers don't break. normalized_fws_raw is preserved as the opponent-blind baseline for debugging. New field names are fqi and fqi_plus.

2026-04-24

Added: Power Index A/B test (TOSS + QWS) and Sunday publish cadence

What: The main rankings table now has a Model dropdown (Current / TOSS / QWS) that swaps the displayed Rank, Class Rank, and Power Index between three formulas computed in parallel from 2026-04-20 forward. The Current model is the default and continues to drive the playoff simulator, head-to-head, and league standings. Weekly publishing also shifts from Saturdays to Sundays starting 2026-04-26 so Saturday match results are included in that week's snapshot. Historical seasons (2021-2025) and the three already-published Saturday weekly snapshots are unchanged.

Why: The current Power Index over-rewards dominant flight scores against weak-league opponents because FWS has no opponent-strength awareness. Two design approaches surfaced — TOSS (opponent-APR-weighted FWS) and QWS (ITA-style quality-weighted APR) — and rather than pick one blind, we're running both alongside the unchanged baseline through end of season to compare on live data.

Detail: See the Power Index A/B Test AAR for the full design, math, validation results, and rollout plan. The methodology page has user-facing explanations of all three formulas.

Fixed: Duplicate dual matches inflated team records and rankings

Problem: Some teams showed inflated win or loss totals because both coaches posted the same dual match to tennisreporting.com, producing two distinct meet entries. The reported trigger case was Valley Catholic girls at 7-2-0 vs. Molalla girls at 6-4-0 on 2026-04-07; both teams had the April 7 VC-vs-Molalla match counted twice. An audit across 2021-2026 found 51 duplicate match pairs affecting 92 school files.

Root cause: Every meet-iteration site (get_dual_match_record, get_league_record, get_head_to_head, get_head_to_head_detailed, process_school_data, FWS calculation, and the weekly match graph) consumed the raw data['meets'] array without checking for duplicate (date, team_a, team_b) entries. scripts/generate_weekly_rankings.py::extract_matches had a partial guard, but it only gated match_list; match_graph, team_records, team_top_flight, and team_match_log were all written before the dedup check.

Fix: Added a dedupe_meets() helper to generate_site.py, scripts/build_rankings.py, and scripts/generate_weekly_rankings.py that collapses any pair of dual meets with the same date and same two school IDs, keeping the entry with the most completed flight match data (ties broken by lowest meet id). The dedup runs once at load time, so every downstream consumer sees the cleaned meets list. Non-dual meets (tournaments, state championships) are untouched.
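
A minimal sketch of the collapse rule is shown below. The helper name dedupe_meets is from this entry, but the body and the meet fields (id, date, school_ids, is_dual, completed_flights) are assumptions made for illustration.

```python
def _preference(meet):
    # Lower tuple wins: more completed flights first, then the lower meet id.
    return (-meet["completed_flights"], meet["id"])


def dedupe_meets(meets):
    """Keep one dual meet per (date, pair of schools); leave non-dual meets alone."""
    best = {}
    for meet in meets:
        if not meet["is_dual"]:
            continue
        # frozenset makes the key independent of which coach posted the meet.
        key = (meet["date"], frozenset(meet["school_ids"]))
        current = best.get(key)
        if current is None or _preference(meet) < _preference(current):
            best[key] = meet
    kept_ids = {m["id"] for m in best.values()}
    return [m for m in meets if not m["is_dual"] or m["id"] in kept_ids]
```

Running this once at load time, as the entry describes, means no downstream consumer needs its own duplicate check.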

Impact: Valley Catholic girls 2026: 7-2-0 → 6-2-0. Molalla girls 2026: 6-4-0 → 6-3-0. 106 duplicate meet entries removed across the 2021-2026 history, correcting records, league standings, head-to-head tables, FWS, OWP, and weekly composite rankings.

2026-04-23

Fixed: Weekly rankings counted tiebreaker losses as ties

Problem: La Salle Prep girls' record on the weekly rankings page read 3-2-5 while OSAA and the main site both reported 4-5-2. Three of the five "ties" were 4-4 meets the Oregon tiebreaker awarded to the opponent — Summit, Hood River Valley, and Crescent Valley — and one missing win was simply a later-dated match. Every team's weekly record that included a tiebreaker outcome was off by the same pattern.

Root cause: scripts/generate_weekly_rankings.py::extract_matches inferred a team's result purely from flight scores (won = my_score > opp_score) and treated any 4-4 as a tie, ignoring the winnerSchoolId field that tennisreporting.com sets based on the Oregon tiebreaker (sets won, then games won). The canonical get_meet_result in generate_site.py already handled this correctly — the weekly pipeline had a divergent copy.

Fix: Added a tiebreaker-aware get_meet_result helper to the weekly script mirroring the site version, switched extract_matches to tri-state won (True/False/None, where None marks a true tie with no winnerSchoolId), and updated scripts/computer_rankings.py so Elo, Colley, PageRank, and Win-Score handle ties explicitly (0.5 credit each way) instead of implicitly via the margin==0 sentinel that could no longer distinguish a tiebreaker loss from a true draw. Regenerated all three 2026 weekly snapshots; a full cross-check confirms every team's weekly W-L-T now matches processed_rankings.json exactly.
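
The tri-state result can be sketched like this. The helper name get_meet_result and the winnerSchoolId fallback are from this entry; the flat argument list and the result_credit helper are assumptions made for illustration.

```python
def get_meet_result(my_score, opp_score, winner_school_id, school_id):
    """Return True for a win, False for a loss, None for a true tie.

    Flight scores decide the result unless they are level; then the
    feed's winnerSchoolId (passed as winner_school_id) decides it."""
    if my_score != opp_score:
        return my_score > opp_score
    if winner_school_id is None:
        return None  # level on flights and no tiebreaker winner recorded
    return winner_school_id == school_id


def result_credit(won):
    """Credit for a rating system: 1.0 for a win, 0.0 for a loss, 0.5 for a true tie."""
    if won is None:
        return 0.5
    return 1.0 if won else 0.0
```

Keeping None distinct from False is what lets the rating systems tell a tiebreaker loss apart from a true draw.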

2026-04-22

Fixed: Tournament-format dual matches dropped from team records

Problem: Lincoln (Portland, OR) boys showed 4-0 on the site while tennisreporting.com reported 8-0-0. The four "missing" wins were all titled *Caldera Tournament*. Follow-up audit found nine other schools similarly under-counted, including Bend boys (displayed 4-3-1 vs. API 8-3-1), Roseburg boys (displayed 6-3-2 vs. API 10-3-2), and South Medford boys (displayed 4-4 vs. API 8-4).

Root cause: is_dual_match() in generate_site.py dropped any meet whose title contained the word "Tournament". In the 2026 data, every "Tournament" meet is actually a single head-to-head dual match (one winner, one loser, full per-flight scores) — events like the Caldera Tournament, Roseburg Invitational Tournament, and RHS Tournament are scheduled as clusters of dual matches played in one location, and tennisreporting.com's own overallRecord counts each one. The blanket title filter was a blunt instrument that excluded them from record totals, league standings, head-to-head tables, RPI opponent sets, and FWS.

Fix: Removed the 'Tournament' in title check. The existing structural guard at the bottom of is_dual_match() (one winner, one loser) still filters out genuinely multi-team events — e.g., the OES Invitational (1 winner, 3 losers) remains correctly excluded as a points-based event.
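
In spirit, the remaining check looks only at the shape of the result and never at the title. The sketch below assumes a hypothetical meet structure with a per-team result field; it is not the actual body of is_dual_match(), and it does not address how tied duals are represented.

```python
def is_dual_match(meet):
    """Structural check: exactly one winner and one loser, whatever the title says."""
    results = [team["result"] for team in meet["teams"]]
    return results.count("W") == 1 and results.count("L") == 1


# Made-up records shaped like the two cases described in this entry.
caldera = {
    "title": "Caldera Tournament",
    "teams": [{"school_id": 1, "result": "W"}, {"school_id": 2, "result": "L"}],
}
oes = {
    "title": "OES Invitational",
    "teams": [
        {"school_id": 1, "result": "W"},
        {"school_id": 2, "result": "L"},
        {"school_id": 3, "result": "L"},
        {"school_id": 4, "result": "L"},
    ],
}
```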

Impact: Ten 2026 boys teams and their opponents now reconcile exactly with tennisreporting.com's overallRecord. A full pass across all 264 schools × 2 genders shows 0 remaining record mismatches, ties included. Lincoln boys, whose rank was previously uncertain, are now rank #1 at 8-0-0. Upcoming events this season (Jesuit Tournament, Central Oregon Invite, Roseburg Tournament, Bigfoot Invitational) are all handled by this same change, because each appears in the feed as a set of 1v1 dual matches.

Fixed: Massey column silently ranked teams by school_id when match graph was disconnected

Problem: The Massey column on the weekly rankings was nonsense whenever the match graph had more than one connected component, which has been the case for every 2026 girls week published so far. On the 2026-04-18 snapshot, Jesuit girls (9-0, consensus #1 everywhere else) appeared at Massey #117 out of 127, while Vale, Stayton, Century, Hillsboro, and Glencoe occupied Massey #1-5. The ordering was not tennis at all — it was school_id ascending.

Root cause: massey_rankings in scripts/computer_rankings.py builds an n×n Laplacian-style matrix M whose row sums are zero, then replaces the last row with a sum-to-zero constraint to pin the solution. That constraint lifts the rank by one — fine for a connected graph, but the 2026 girls graph has two components (one small-school cluster never played outside itself), so post-constraint M is still rank n-1, not n. np.linalg.solve raised LinAlgError, the except branch set every rating to 0.0, and then ratings_to_ranks did a stable descending sort. Stable sort on all-equal keys preserves insertion order, which comes from teams = sorted(match_graph.keys()) — i.e., school_id. Jesuit's school_id 124879 happens to sit 117th in that order. The same failure mode will have silently hit any prior gender/year whose match graph ever fragmented.
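
The last step of that chain is easy to reproduce in isolation. The sketch below is a reconstruction, not the real ratings_to_ranks; the two smaller school ids are made up, and only 124879 comes from this entry.

```python
def ratings_to_ranks(teams, ratings):
    """Stable descending sort by rating; teams with equal ratings keep input order."""
    ordered = sorted(teams, key=lambda t: ratings[t], reverse=True)
    return {team: position + 1 for position, team in enumerate(ordered)}


# Keys stand in for school ids.
match_graph = {124879: [], 100234: [], 118002: []}
teams = sorted(match_graph.keys())

# What the except branch produced: every rating set to 0.0.
ratings = {team: 0.0 for team in teams}

# With all keys equal, the "ranking" is just school_id ascending.
ranks = ratings_to_ranks(teams, ratings)
```

Python's sorted() stays stable even with reverse=True, so an all-equal rating vector always falls back to the input order.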

Fix: Swapped np.linalg.solve(M, p) for np.linalg.lstsq(M, p, rcond=None), which returns the minimum-norm least-squares solution even when M is singular. Each connected component gets its own self-consistent Massey ratings (centered near zero by the sum-to-zero row) rather than collapsing everything to 0. Regenerated all three 2026 weekly snapshots; Jesuit girls is now Massey #1 in weeks 2 and 3 (matching Elo/Colley/PageRank/Win-Score) and also Massey #1 in week 1, before the other systems had enough match history to converge. Tradeoff: ratings across disconnected components are not strictly comparable, but that is already true of Massey on disconnected data in theory, and the new behavior is vastly closer to truth than "sort by school_id."
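
A toy reproduction of the singular case, assuming NumPy is available: four teams in two disconnected pairs, with made-up point differentials. This only illustrates the solve versus lstsq behavior; it is not the matrix construction in scripts/computer_rankings.py.

```python
import numpy as np

# Teams 0 and 1 played each other once; teams 2 and 3 played each other once.
# Rows sum to zero, as in a Laplacian-style Massey matrix.
M = np.array([
    [1.0, -1.0, 0.0, 0.0],
    [-1.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, -1.0],
    [0.0, 0.0, -1.0, 1.0],
])
p = np.array([3.0, -3.0, 1.0, -1.0])  # made-up point differentials

# Replace the last row with the sum-to-zero constraint.
M[-1, :] = 1.0
p[-1] = 0.0

# With two components the constrained matrix is still singular.
try:
    np.linalg.solve(M, p)
    solve_failed = False
except np.linalg.LinAlgError:
    solve_failed = True

# lstsq returns (solution, residuals, rank, singular_values).
ratings, _residuals, matrix_rank, _singular_values = np.linalg.lstsq(M, p, rcond=None)
```

Within each pair the winner ends up rated above the loser, which is the per-component consistency the fix relies on.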

Fixed: Playoff simulator could seed the same school twice

Problem: On the Playoff Simulator page, the home-game-guarantee step could place the same league champion at two different seeds. Example from 4A/3A/2A/1A Girls 2026 with an 8-team bracket: Marist Catholic appeared as AUTO at both seed 4 and seed 6, while another team was silently dropped from the field. The "First 4 Out" list and first-round matchups both reflected the corrupted field.

Root cause: The loop that moves league champions up into home-game seeds (generate_site.py, inside generatePlayoffFieldFromSelection) captured each mover's array index before any mutations, then used those stale indices to splice(fromIdx, 1). After the first insertion at lastHomeGameSeed - 1 shifted every element to the right of it, the next splice removed the wrong team and re-inserted a champion that was still in the array, producing the duplicate.

Fix: Movers are now pulled out of the field by school_id and reinserted as a contiguous block at the last home-game seed, and each champion's reported "moved from X to Y" uses its actual final seed instead of a fixed target. No duplicates can appear in the qualifying field, and no unrelated team is dropped.
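
The simulator code itself is JavaScript; the Python sketch below only illustrates the remove-by-id, reinsert-as-a-block idea. The function name, the list-of-ids field representation, and the exact block placement are assumptions.

```python
def move_champions_up(field, mover_ids, last_home_game_seed):
    """field is a list of school ids in seed order (index 0 is seed 1).

    Movers are removed by id, never by a previously captured index, then
    reinserted together starting at last_home_game_seed."""
    mover_set = set(mover_ids)
    movers = [sid for sid in field if sid in mover_set]  # keeps relative order
    rest = [sid for sid in field if sid not in mover_set]
    insert_at = last_home_game_seed - 1
    new_field = rest[:insert_at] + movers + rest[insert_at:]
    # Report the seed each mover actually ended up at.
    final_seed = {sid: new_field.index(sid) + 1 for sid in movers}
    return new_field, final_seed


# Made-up 8-team field: the teams at seeds 6 and 8 are league champions owed a home game.
new_field, final_seed = move_champions_up(
    ["A", "B", "C", "D", "E", "F", "G", "H"], ["F", "H"], 4
)
```

Because the new field is assembled from two disjoint lists, it cannot contain a duplicate or lose a team.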

Changed: Power Index now weights opponent strength by league depth (two-pass APR)

Problem: APR's OWP term (strength of schedule) averaged opponents' raw win percentages with no league context. That let teams dominating weaker leagues carry OWP scores indistinguishable from teams beating comparably-ranked opponents in deeper leagues — so Power Index rewarded weak-schedule wins and penalized top-bracket teams forced to play each other.

Fix: APR is now computed in two passes. Pass 1 uses the existing RPI formula to produce a first-cut APR, and those APRs feed a per-league depth score (top-4 APR average, already computed for the existing league-quality display). Pass 2 recomputes OWP using a league-depth-weighted opponent strength — each opponent's contribution is scaled by opp_league_depth / median_league_depth for that year+gender. The depth calculation uses leave-one-out: both the opponent and the team being evaluated are excluded, so a team can't inflate its own strength-of-schedule by being in its own league. OOWP, APR, and Power Index are then recomputed from the pass-2 OWPs. FWS is untouched, so the anti-stacking half of PI is unchanged.
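
The pass-2 weighting can be sketched as below. The function names, the dictionary input, and the fallback for a league with no remaining teams are assumptions; only the top-4 average, the leave-one-out exclusion, and the opp_league_depth / median_league_depth ratio come from this entry.

```python
def league_depth(league_aprs, exclude_ids, top_n=4):
    """Average of the top_n pass-1 APRs in a league, leaving out exclude_ids.

    league_aprs maps school_id -> pass-1 APR for one league. exclude_ids holds
    the opponent and the team being evaluated (leave-one-out)."""
    values = sorted(
        (apr for sid, apr in league_aprs.items() if sid not in exclude_ids),
        reverse=True,
    )[:top_n]
    return sum(values) / len(values) if values else None


def weighted_opponent_strength(opp_win_pct, opp_depth, median_depth):
    """Scale an opponent's OWP contribution by opp_depth / median_depth."""
    if opp_depth is None:
        return opp_win_pct  # assumption: no depth available, leave unweighted
    return opp_win_pct * opp_depth / median_depth


# Made-up league: excluding school 1, the top four APRs average 0.45.
aprs = {1: 0.7, 2: 0.6, 3: 0.5, 4: 0.4, 5: 0.3, 6: 0.2}
depth = league_depth(aprs, exclude_ids={1})
```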

Impact on 2026: Biggest Boys movement is Cascade (SD-2) #26→#32 and Marist Catholic (SD-2) #27→#33 dropping, with PIL teams (Wells, Franklin) and Three Rivers teams (Lakeridge) rising. Girls top-10 barely reshuffles (Wells moves up one, McMinnville down one), consistent with the top already being dominated by teams playing strong slates. Vale is unchanged at Girls #5 and moves from Boys #42 to #46, which reflects their actual non-league slate rather than SD-5 alone.

Fixed: Quality Wins calculation used drifting in-memory ranks instead of published weekly ranks

Problem: generate_weekly_rankings.py recomputed every prior week's rankings from scratch on each run and chained them through memory as the prev_rankings input for quality-win calculations. Because the underlying match data keeps evolving (new match ingests, tiebreaker corrections, schema updates), the retroactive "previous week" rankings drifted from what was actually published. The result was quality-win totals that didn't match the visible prior-week rankings — Marist Catholic's 6-2 win over Catlin Gabel (4A-1A #2 at the time) showed 0 quality wins instead of 1, and 20+ girls teams and 30+ boys teams had similar mismatches in the Week 3 snapshot. Earlier-week snapshots also still carried the pre-rename top25_wins field.

Fix: Weekly generation now treats the published public/data/weekly/<date>.json snapshots as the canonical source of prior-week ranks. When the loop starts a week and no prior ranks are held in memory (single-week runs, partial reruns), it loads them from disk via the new load_published_week helper. Full --all runs still chain in memory because each freshly-written snapshot matches what would be read back from disk. Regenerated all three 2026 weeks end-to-end to bring existing snapshots into agreement with the canonical rule; Marist Catholic now correctly shows 1 quality win in Week 3.
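
The fallback rule can be sketched as follows. The helper name load_published_week and the public/data/weekly/<date>.json location are from this entry; the signature, the resolve_prev_rankings wrapper, and the missing-file behavior are assumptions.

```python
import json
from pathlib import Path

WEEKLY_DIR = Path("public/data/weekly")  # snapshot location named in this entry


def load_published_week(date_str, weekly_dir=WEEKLY_DIR):
    """Load a published weekly snapshot, or return None if it does not exist."""
    path = Path(weekly_dir) / f"{date_str}.json"
    if not path.exists():
        return None
    with path.open(encoding="utf-8") as handle:
        return json.load(handle)


def resolve_prev_rankings(in_memory_prev, prev_date, weekly_dir=WEEKLY_DIR):
    """Prefer ranks chained in memory; otherwise read the published snapshot."""
    if in_memory_prev is not None:
        return in_memory_prev
    return load_published_week(prev_date, weekly_dir)
```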

Changed: Unified "Quality Wins" column on weekly rankings

Problem: The "Top 25 W" column on the weekly rankings page only counted wins against opponents ranked top-25 overall. Since the overall rankings skew toward 6A teams (who play each other more), teams in smaller classifications had no way to get credit for beating strong opponents within their own classification.

Fix: Replaced "Top 25 W" with a unified "Quality Wins" column that counts wins against opponents who were either top-25 overall *or* top-10 within their own classification at the time the match was played. A single win against an opponent who qualifies for both still counts as one quality win (no double-counting).
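
The either/or rule without double-counting reduces to a single boolean per win. The function and field names below are hypothetical; the ranks passed in are assumed to be the opponent's ranks at the time the match was played.

```python
def is_quality_win(opp_overall_rank, opp_class_rank):
    """True when the beaten opponent was top-25 overall or top-10 in its class."""
    in_top25 = opp_overall_rank is not None and opp_overall_rank <= 25
    in_class_top10 = opp_class_rank is not None and opp_class_rank <= 10
    return in_top25 or in_class_top10


def count_quality_wins(wins):
    """Each win counts at most once, even if the opponent met both criteria."""
    return sum(
        1 for w in wins if is_quality_win(w["opp_overall_rank"], w["opp_class_rank"])
    )


# Made-up wins: both criteria, class-only, neither.
sample_wins = [
    {"opp_overall_rank": 12, "opp_class_rank": 2},
    {"opp_overall_rank": 60, "opp_class_rank": 7},
    {"opp_overall_rank": 80, "opp_class_rank": 30},
]
quality_wins = count_quality_wins(sample_wins)
```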

Fixed: Tiebreaker wins/losses incorrectly reported as ties

Problem: When a dual match ended with equal flight scores (e.g., 4-4), Oregon uses a tiebreaker system (sets won, then games won) to determine a winner. The upstream data feed records these tiebreaker outcomes in a winnerSchoolId field, but oregontennis.org was ignoring it and only comparing flight scores — so every 4-4 match was reported as a tie, even when one team officially won the tiebreaker.

Impact: 126 team records corrected across 2024-2026 seasons. Examples from 2026:

Westview Girls: 4-1-1 → 5-1-0
Crescent Valley Girls: 4-0-1 → 5-0-0
Ridgeview Boys: 3-0-3 → 5-1-0
Nelson Boys: 5-1-1 → 6-1-0
Valley Catholic Girls: 4-1-1 → 5-1-0

Some teams moved significantly in rankings as a result (e.g., Ridgeview Boys +21 spots, Sam Barlow Boys +15).

Fix: All four match-result functions (get_dual_match_record, get_league_record, get_head_to_head, get_head_to_head_detailed) now fall back to winnerSchoolId when flight scores are tied, matching the official tiebreaker outcome.

Fixed: H2H boost not applying in overall state rankings

Problem: The head-to-head tiebreaker system had two phases: Phase 1 handled same-league teams (bubble swaps), Phase 2 checked adjacent pairs in the overall ranking. But when two same-classification teams had teams from other classifications ranked between them, they were never adjacent, so the H2H boost was never evaluated for the overall state ranking.

Fix: Added Phase 3, classification-level H2H enforcement. It groups teams by classification and, using bubble swaps, checks H2H for pairs that are close in class rank or Power Index. This catches 3-6 additional swaps per year/gender that were previously missed.

Changed: Default table sort order

Rankings table now defaults to sorting by Rank (ascending) instead of Power Index (descending). This ensures the display order reflects H2H tiebreaker adjustments. A "Rank" button was added to the sort toggle alongside Power Index and APR.