
    zazu-22/ff-statistical-methods
    Data & Analytics
    About

    Expert guidance on statistical analysis methodologies and Monte Carlo simulation for fantasy football...

    SKILL.md

    Statistical Analysis & Simulation for Fantasy Football

    Overview

    Provide expert guidance on statistical methodologies and simulation techniques for fantasy football analytics. Apply appropriate regression methods, design Monte Carlo simulations, perform variance analysis, and conduct hypothesis tests using research-backed approaches.

    When to Use This Skill

    Trigger this skill for queries involving:

    • Regression selection: "Should I use OLS or Lasso?" "When to use GAMs?" "Ridge vs Elastic Net?"
    • Monte Carlo simulation: "How do I simulate rest-of-season?" "Calculate championship probability?" "Quantify trade impact?"
    • Variance analysis: "Identify regression-to-mean candidates?" "Calculate confidence intervals?" "Prediction intervals?"
    • Statistical testing: "Hypothesis test for performance trends?" "Is this improvement significant?"
    • Aging curves: "Model non-linear age effects?" "GAMs for position-specific curves?"
    • Uncertainty quantification: "Error bars on projections?" "Probability of outcomes?"

    Note: For ML model selection and feature engineering, use ff-ml-modeling. For dynasty strategy domain knowledge, use ff-dynasty-strategy.

    Core Capabilities

    1. Regression Methods

    Decision Framework:

    Linear Regression (OLS): Baseline, interpretability, small samples

    Ridge (L2): Multicollinearity, keep all features, shrink coefficients

    Lasso (L1): High-dimensional data, automatic feature selection, sparse models

    Elastic Net: Best default for fantasy (combines Ridge + Lasso)

    GAMs: Non-linear relationships (aging curves), interpretable smooth functions

    Reference: references/regression_methods.md for detailed comparisons and Python code.

    2. Monte Carlo Simulation

    Applications:

    • Rest-of-season projections with uncertainty
    • Championship probability estimation
    • Trade scenario impact analysis
    • Lineup optimization under uncertainty

    Core Approach:

    import numpy as np

    # Simulate player week: Normal(projection, std_dev)
    simulated_points = np.random.normal(projection, std_dev, size=10000)
    simulated_points = np.maximum(simulated_points, 0)  # Floor at zero
    

    Key Considerations:

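    • Iterations: 10,000 default (SE ≈ 0.5 percentage points on an estimated probability, worst case at p = 0.5), 100,000 for critical decisions
    • Error correlation: QB and WR scores are correlated, so model teammate errors jointly rather than independently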
    • Path dependence: Update team ratings within simulations (FiveThirtyEight approach)
    • Flaw of averages: Analyze full distribution, not just mean

    Reference: references/simulation_design.md for frameworks, templates, and best practices.

    Asset: assets/monte_carlo_template.py - Python templates for common simulations.
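
    The correlation and iteration-count points above can be sketched with NumPy alone. This is a minimal illustration, not the contents of assets/monte_carlo_template.py: the function names, the projections, and the standard deviations are all hypothetical, and the bivariate normal is an assumed error model.

```python
import numpy as np

def simulate_correlated_pair(proj_a, sd_a, proj_b, sd_b, rho, n_sims=10_000, seed=0):
    """Draw correlated weekly scores for two teammates (e.g. a QB and a WR)."""
    rng = np.random.default_rng(seed)
    # Covariance matrix built from the two standard deviations and the correlation
    cov = [[sd_a**2, rho * sd_a * sd_b],
           [rho * sd_a * sd_b, sd_b**2]]
    draws = rng.multivariate_normal([proj_a, proj_b], cov, size=n_sims)
    return np.maximum(draws, 0.0)  # floor at zero, as in the core approach

def probability_with_se(hits):
    """Monte Carlo estimate of a probability and its standard error."""
    p = float(np.mean(hits))
    se = float(np.sqrt(p * (1 - p) / len(hits)))
    return p, se

# Hypothetical inputs: QB projected 18 +/- 6, WR projected 14 +/- 7, rho = 0.6
scores = simulate_correlated_pair(18.0, 6.0, 14.0, 7.0, rho=0.6)
combined = scores.sum(axis=1)
p_over_40, se = probability_with_se(combined > 40)
observed_rho = float(np.corrcoef(scores[:, 0], scores[:, 1])[0, 1])
print(f"P(combined > 40) = {p_over_40:.3f} +/- {se:.3f}, sample rho = {observed_rho:.2f}")
```

    The standard error shrinks with the square root of the iteration count, which is why moving from 10,000 to 100,000 simulations only tightens the estimate by about a factor of three.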

    3. Regression to the Mean

    Concept: Extreme values tend toward average in subsequent measurements

    Fantasy Application:

    • +TDOE (Touchdowns Over Expected): Declines 86% next year
    • -TDOE: Improves 93% next year
    • High TD rates regress downward, low TD rates improve

    Position-Specific Sample Sizes (50% regression):

    • QB: 21 games
    • RB: 29-30 games
    • WR: 13-14 games
    • TE: ~20 games

    Implementation:

    # n_50[position]: games at which the observed stat and the position mean get equal weight
    regression_factor = sample_size / (sample_size + n_50[position])
    regressed_estimate = (regression_factor * current_stat) + ((1 - regression_factor) * position_mean)
    

    Reference: references/regression_methods.md section on regression to the mean.
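
    As a worked example of the formula above, the sketch below wraps it in a function. The N_50 values are the thresholds listed in this section, with ranges rounded to one number; the player and the position mean are hypothetical.

```python
# Games at which observed performance and the position mean get equal weight.
# Taken from the thresholds listed above (ranges rounded to a single value).
N_50 = {"QB": 21, "RB": 30, "WR": 14, "TE": 20}

def regress_to_mean(current_stat, sample_size, position, position_mean, n_50=N_50):
    """Shrink an observed per-game stat toward the position mean."""
    factor = sample_size / (sample_size + n_50[position])
    return factor * current_stat + (1 - factor) * position_mean

# Hypothetical WR: 0.80 TDs per game over 14 games, position mean of 0.40
estimate = regress_to_mean(0.80, 14, "WR", 0.40)
print(round(estimate, 3))  # halfway between 0.80 and 0.40, since 14 games equals n_50 for WR
```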

    4. Confidence vs Prediction Intervals

    Confidence Interval: Uncertainty in estimated mean (narrow)

    Prediction Interval: Uncertainty for new observation (wider - use this for player projections!)

    Why it matters: Individual player performance has more variability than average performance

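    # Prediction interval accounts for both parameter uncertainty AND individual variance
    # se_fit: standard error of the fitted mean at this input (parameter uncertainty)
    prediction_se = np.sqrt(residual_standard_error**2 + se_fit**2)
    margin = t_score * prediction_se
    lower, upper = prediction - margin, prediction + margin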
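
    To make the difference between the two intervals concrete, here is a minimal sketch for a one-predictor linear fit using NumPy and SciPy. The function name intervals_at and the synthetic targets/points data are assumptions for illustration only; a multi-feature model would need the matrix form of the same formulas.

```python
import numpy as np
from scipy import stats

def intervals_at(x, y, x0, level=0.95):
    """Return (prediction, ci_margin, pi_margin) for a simple linear fit at x0."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    n = len(x)
    slope, intercept = np.polyfit(x, y, 1)
    resid = y - (intercept + slope * x)
    s = np.sqrt(np.sum(resid**2) / (n - 2))             # residual standard error
    sxx = np.sum((x - x.mean())**2)
    se_fit = s * np.sqrt(1 / n + (x0 - x.mean())**2 / sxx)  # uncertainty in the fitted mean
    se_pred = np.sqrt(s**2 + se_fit**2)                 # adds individual-game variance
    t_score = stats.t.ppf(0.5 + level / 2, df=n - 2)
    return intercept + slope * x0, t_score * se_fit, t_score * se_pred

# Synthetic data: fantasy points as a noisy linear function of weekly targets
rng = np.random.default_rng(0)
targets = rng.uniform(4, 12, size=40)
points = 1.5 * targets + rng.normal(0, 4, size=40)

prediction, ci_margin, pi_margin = intervals_at(targets, points, x0=8.0)
print(f"prediction {prediction:.1f}, CI +/- {ci_margin:.1f}, PI +/- {pi_margin:.1f}")
```

    The confidence margin shrinks as more games are observed, while the prediction margin can never fall below the spread of a single game around the fitted line.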
    

    5. Generalized Additive Models (GAMs)

    When to use: Non-linear relationships like aging curves

    How it works: Fit smooth spline for each feature: y = β₀ + f₁(age) + f₂(experience) + ...

    Fantasy use cases:

    • Aging curves (inverted-U shapes for position-specific performance)
    • Experience effects on production
    • Visualize smooth trends

    Research finding: GAMs reveal QB peaks at 28-33, RB declines post-27

    Python:

    from pygam import LinearGAM, s, f

    # s() = smooth (non-linear), f() = factor (categorical, integer-coded)
    gam = LinearGAM(s(0) + s(1) + f(2))  # columns: age, experience, position
    gam.fit(X_train, y_train)

    # Evaluate the smooth age curve on a grid, ready for plotting
    age_grid = gam.generate_X_grid(term=0)
    age_effect = gam.partial_dependence(term=0, X=age_grid)
    

    Reference: references/regression_methods.md section on GAMs with Python and R code.

    Workflows

    Choosing a Regression Method

    Step 1: Define Goal

    • Interpretation needed? → OLS or GAMs
    • Prediction focus? → Consider regularization

    Step 2: Check Data Characteristics

    • Small sample (<100)? → OLS or Ridge
    • High-dimensional (many features)? → Lasso or Elastic Net
    • Multicollinearity (VIF > 5)? → Ridge or Elastic Net
    • Non-linear patterns? → GAMs

    Step 3: Baseline

    • Always start with OLS to establish a baseline that more complex models must beat

    Step 4: Regularization

    • If overfitting, try Ridge/Lasso/Elastic Net
    • Use cross-validation to select regularization strength

    Step 5: Non-linearity

    • If residuals show patterns, consider GAMs
    • Particularly for aging curves
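
    Steps 3 and 4 can be sketched with scikit-learn as follows. The data is synthetic (with one deliberately collinear feature pair), and the alpha grid, l1_ratio values, and 5-fold setup are illustrative assumptions rather than recommended settings.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, RidgeCV, LassoCV, ElasticNetCV
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for a player feature matrix
rng = np.random.default_rng(0)
n, p = 120, 20
X = rng.normal(size=(n, p))
X[:, 1] = X[:, 0] + 0.1 * rng.normal(size=n)   # deliberately collinear pair
y = 3.0 * X[:, 0] + 2.0 * X[:, 2] + rng.normal(scale=2.0, size=n)

# OLS baseline first, then regularized models that tune their own strength by CV
models = {
    "ols": LinearRegression(),
    "ridge": RidgeCV(alphas=np.logspace(-3, 3, 13)),
    "lasso": LassoCV(cv=5, random_state=0),
    "elastic_net": ElasticNetCV(l1_ratio=[0.2, 0.5, 0.8], cv=5, random_state=0),
}

cv_mae = {}
for name, model in models.items():
    # Standardize inside the pipeline so the penalty treats features equally
    pipe = make_pipeline(StandardScaler(), model)
    scores = cross_val_score(pipe, X, y, cv=5, scoring="neg_mean_absolute_error")
    cv_mae[name] = -scores.mean()

for name, mae in sorted(cv_mae.items(), key=lambda kv: kv[1]):
    print(f"{name:12s} CV MAE = {mae:.3f}")
```

    A regularized model is only worth the added complexity if its cross-validated error is clearly below the OLS baseline.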

    Designing a Monte Carlo Simulation

    Step 1: Define Scenario

    • What are you simulating? (Rest-of-season, trade impact, championship probability)
    • What's the time horizon? (Weeks remaining)

    Step 2: Gather Inputs

    • Player projections (expected values)
    • Standard deviations (from historical performance or model residuals)
    • Correlations (QB-WR pairs, teammates)

    Step 3: Build Simulation

    • Use assets/monte_carlo_template.py as starting point
    • Implement correlated errors for teammates
    • Consider path dependence if multi-week

    Step 4: Run Simulations

    • 10,000 iterations default
    • 100,000 for final decisions

    Step 5: Analyze Distribution

    • Don't just report mean!
    • Show percentiles (10th, 50th, 90th)
    • Probability of exceeding thresholds
    • Visualize histograms and CDFs
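
    Steps 4 and 5 might look like the sketch below. For brevity it treats every player-week as independent, so it omits the correlated errors and path dependence from Step 3. The roster projections, standard deviations, and the 330-point threshold are hypothetical.

```python
import numpy as np

def simulate_rest_of_season(projections, std_devs, weeks_remaining, n_sims=10_000, seed=0):
    """Simulate season-total points for a roster (independent player-weeks)."""
    rng = np.random.default_rng(seed)
    proj = np.asarray(projections, float)
    sd = np.asarray(std_devs, float)
    # One draw per simulation, week, and player: shape (n_sims, weeks, players)
    draws = rng.normal(proj, sd, size=(n_sims, weeks_remaining, len(proj)))
    return np.maximum(draws, 0.0).sum(axis=(1, 2))

def summarize(totals, threshold):
    """Report percentiles and a threshold probability, not just the mean."""
    p10, p50, p90 = np.percentile(totals, [10, 50, 90])
    return {
        "mean": float(totals.mean()),
        "p10": float(p10),
        "p50": float(p50),
        "p90": float(p90),
        "prob_over_threshold": float(np.mean(totals > threshold)),
    }

# Hypothetical four-player core with six weeks remaining
totals = simulate_rest_of_season([18, 14, 12, 9], [6, 7, 6, 5], weeks_remaining=6)
summary = summarize(totals, threshold=330)
for key, value in summary.items():
    print(f"{key}: {value:.2f}")
```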

    Analyzing Regression to the Mean

    Step 1: Identify Extreme Performers

    • Find players with unusually high/low TD rates
    • Calculate TDOE (Touchdowns Over Expected)

    Step 2: Check Sample Size

    • How many games/opportunities?
    • Compare to position-specific threshold (QB: 21, RB: 29-30, WR: 13-14, TE: ~20)

    Step 3: Apply Regression Formula

    • regression_factor = n / (n + n_50)
    • regressed = (factor * current) + ((1 - factor) * mean)

    Step 4: Identify Buy-Low / Sell-High

    • Positive TDOE → Likely to regress down (sell high)
    • Negative TDOE → Likely to improve (buy low)
    • Volume matters more than TDs!
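
    Steps 1 and 4 can be sketched as below. This skill does not define how expected touchdowns are computed, so the sketch assumes the simplest possible model (opportunities times a league-wide TD rate). The PlayerSeason class, the 0.05 rate, the min_gap cutoff, and the players are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class PlayerSeason:
    name: str
    opportunities: int   # e.g. targets or carries
    touchdowns: int

def tdoe(player, league_td_rate):
    """Touchdowns over expected, assuming expected = opportunities * league TD rate."""
    return player.touchdowns - player.opportunities * league_td_rate

def label(player, league_td_rate, min_gap=2.0):
    """Flag regression candidates whose TDOE exceeds a hypothetical cutoff."""
    gap = tdoe(player, league_td_rate)
    if gap >= min_gap:
        return "sell-high candidate"
    if gap <= -min_gap:
        return "buy-low candidate"
    return "hold"

players = [
    PlayerSeason("Player A", opportunities=100, touchdowns=10),
    PlayerSeason("Player B", opportunities=120, touchdowns=2),
    PlayerSeason("Player C", opportunities=80, touchdowns=4),
]
labels = {p.name: label(p, league_td_rate=0.05) for p in players}
for name, verdict in labels.items():
    print(name, "->", verdict)
```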

    Identifying Data Requirements

    For Regression Analysis:

    • Dependent variable (fantasy points, production metrics)
    • Independent variables (age, usage, efficiency stats)
    • Historical data (3+ years for robust estimates)
    • Position labels (for position-specific models)

    For Monte Carlo Simulation:

    • Player projections (weekly expected values)
    • Historical variance (to estimate std dev)
    • Roster compositions (for team simulations)
    • Correlation structures (teammate relationships)

    For Variance Analysis:

    • Historical performance distributions
    • Sample sizes (games played, opportunities)
    • Position-specific baselines (for regression to mean)

    Integrating with Other Skills

    Complement with ff-ml-modeling when:

    • Choosing between regression types and tree-based models
    • Feature engineering informs what to include in regression
    • Validation strategies apply to statistical models too

    Complement with ff-dynasty-strategy when:

    • Identifying TD regression candidates (TDOE analysis)
    • Applying aging curves to trade decisions
    • Understanding domain context for statistical findings

    Best Practices

    Start Simple - OLS baseline before complex methods

    Regularize for High Dimensions - Use Lasso/Elastic Net when features > samples

    Use GAMs for Clear Non-Linearity - Aging curves, experience effects

    Model Correlations in Simulations - QB and WRs are correlated (ρ ≈ 0.6)

    Sufficient Iterations - 10,000 as the default for stable estimates, 100,000 for critical decisions

    Analyze Full Distribution - Percentiles and probabilities, not just means

    Validate Assumptions - Plot residuals, check for patterns

    Regression to Mean is Powerful - TDs regress, volume is king

    Common Pitfalls

    Ignoring non-linearity - Age curves aren't linear, use GAMs

    Too few simulations - <1,000 gives unstable estimates

    Independence assumptions - Teammates are correlated

    Flaw of averages - Non-linear outcomes make "average" misleading

    Over-interpreting small samples - NFL has only 17 games/season

    Forgetting regression to mean - Extreme TDs will regress
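
    The flaw-of-averages pitfall is easy to demonstrate. In this minimal sketch the team total, its spread, and the opponent score are hypothetical, and a normal distribution is assumed for the weekly total.

```python
import numpy as np

rng = np.random.default_rng(0)
my_score = rng.normal(100.0, 20.0, size=10_000)   # hypothetical weekly team total
opponent_score = 105.0                            # hypothetical fixed target

# Plugging in the average says the matchup is simply lost
win_if_average = float(my_score.mean() > opponent_score)

# The full distribution shows a substantial chance of winning
win_probability = float(np.mean(my_score > opponent_score))

print(f"win using the mean only: {win_if_average}")
print(f"win probability from the distribution: {win_probability:.2f}")
```

    Because a win is a yes/no outcome, evaluating it at the mean score throws away exactly the variance that decides close matchups.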

    Python Libraries

    # Regression
    from sklearn.linear_model import Ridge, Lasso, ElasticNet, LinearRegression
    import statsmodels.api as sm  # For statistical inference
    from pygam import LinearGAM, s, f  # GAMs
    
    # Simulation
    import numpy as np
    import scipy.stats
    
    # Analysis
    from sklearn.metrics import mean_absolute_error, mean_squared_error
    import matplotlib.pyplot as plt
    

    References

    • references/regression_methods.md - OLS, Ridge, Lasso, Elastic Net, GAMs, regression to mean, confidence/prediction intervals
    • references/simulation_design.md - Monte Carlo frameworks, championship probability, trade impact, path dependence, error correlation

    Assets

    • assets/monte_carlo_template.py - Python templates for rest-of-season simulation, championship probability, and trade impact analysis
    Repository
    zazu-22/ff_data_analytics