fantasyfootball.data

Module Contents

Classes

FantasyData

Loads historical fantasy football data.

Attributes

logger

fantasyfootball.data.logger
class fantasyfootball.data.FantasyData(season_year_start: int, season_year_end: int)[source]

Loads historical fantasy football data.

Parameters:
  • season_year_start (int) – The first year of the season.

  • season_year_end (int) – The last year of the season.

property data: pandas.DataFrame

Returns the dataframe of the historical NFL Fantasy data.

Returns:

Historical NFL Fantasy data.

Return type:

pd.DataFrame

_validate_season_year_range() bool[source]

Ensures that the season year range is valid.

Raises:
  • ValueError – If the season year is less than the minimum year.

  • ValueError – If the season year is greater than the maximum year.

Returns:

True if the season year range is valid.

Return type:

bool
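
A minimal sketch of this kind of range check, not the package's implementation. The function name and the bounds `MIN_SEASON_YEAR` and `MAX_SEASON_YEAR` are hypothetical, and comparing the start year against the minimum and the end year against the maximum is an assumption.

```python
# Hypothetical bounds, chosen only for illustration.
MIN_SEASON_YEAR = 2015
MAX_SEASON_YEAR = 2021


def validate_season_year_range_sketch(start, end,
                                      min_year=MIN_SEASON_YEAR,
                                      max_year=MAX_SEASON_YEAR):
    """Return True when [start, end] lies inside [min_year, max_year]."""
    if start < min_year:
        raise ValueError(f"season_year_start {start} is before {min_year}")
    if end > max_year:
        raise ValueError(f"season_year_end {end} is after {max_year}")
    return True


is_valid = validate_season_year_range_sketch(2019, 2021)
```

Raising on the first failed bound keeps the error message specific to the argument that needs fixing.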

static _refresh_data(ff_data_dir: pathlib.PosixPath, data_sources: dict) bool[source]

Uses the datasets specified in config.py to identify whether a dataset is missing from the installed version of the package. When a missing dataset is identified, the most recent version is downloaded from Git.

Parameters:
  • ff_data_dir (PosixPath) – The directory containing the seasonal data.

  • data_sources (dict) – A dictionary indicating the names of the data sources used in the fantasyfootball package.

Returns:

True if the data in the current package is up to date or the data was successfully downloaded from the remote repo.

Return type:

bool
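
The "identify a missing dataset" step could look roughly like the sketch below. It only covers the detection half and leaves out the download. The function name and the idea that each data source maps to one file name are assumptions, not details taken from the package.

```python
import tempfile
from pathlib import Path


def find_missing_datasets_sketch(data_dir, expected_files):
    """Return the expected file names that are absent from data_dir."""
    present = {path.name for path in data_dir.iterdir() if path.is_file()}
    return [name for name in expected_files if name not in present]


# Build a throwaway directory holding one of the two expected files.
with tempfile.TemporaryDirectory() as tmp:
    season_dir = Path(tmp)
    (season_dir / "stats.csv").write_text("placeholder\n")
    missing = find_missing_datasets_sketch(
        season_dir, ["stats.csv", "weather.csv"]
    )
```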

static _load_data(ff_data_dir: pathlib.PosixPath, data_sources: dict, *exclude: str) pandas.DataFrame[source]

Helper method to load all other data, excluding the season calendar and roster of active players for a season.

Parameters:
  • ff_data_dir (PosixPath) – The directory containing the season data.

  • data_sources (dict) – A dictionary indicating the names of the data sources used in the fantasyfootball package.

  • exclude (str) – The names of the files to exclude from the data load.

Raises:
  • ValueError – If the exclude file name is a required file.

  • ValueError – If the columns used to join the data are not found.

Returns:

The dataframe containing all historical fantasy football data for a single season.

Return type:

pd.DataFrame
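
A hedged sketch of how several sources might be joined while honouring `exclude` and the two documented `ValueError` cases. The source names, the join keys `player` and `week`, and the choice of a left join are all hypothetical.

```python
import pandas as pd

REQUIRED_SOURCES = {"stats"}   # hypothetical: sources that cannot be excluded
JOIN_KEYS = ["player", "week"]  # hypothetical join columns


def join_sources_sketch(frames, *exclude):
    """Left-join every non-excluded frame onto the required 'stats' frame."""
    blocked = REQUIRED_SOURCES.intersection(exclude)
    if blocked:
        raise ValueError(f"cannot exclude required sources: {sorted(blocked)}")
    merged = frames["stats"]
    for name, frame in frames.items():
        if name == "stats" or name in exclude:
            continue
        absent = [key for key in JOIN_KEYS if key not in frame.columns]
        if absent:
            raise ValueError(f"{name} is missing join columns: {absent}")
        merged = merged.merge(frame, on=JOIN_KEYS, how="left")
    return merged


frames = {
    "stats": pd.DataFrame({"player": ["a", "b"], "week": [1, 1],
                           "rushing_yds": [80, 20]}),
    "snaps": pd.DataFrame({"player": ["a", "b"], "week": [1, 1],
                           "snap_count": [50, 12]}),
    "weather": pd.DataFrame({"player": ["a", "b"], "temp": [60, 60]}),
}
# 'weather' lacks the 'week' key, so it is excluded here.
merged = join_sources_sketch(frames, "weather")
```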

_filter_to_most_recent_complete_week(df: pandas.DataFrame) pandas.DataFrame[source]

Filters the dataframe to the most recent week that has complete data in-season.

Parameters:
  • df (pd.DataFrame) – The dataframe containing all historical fantasy football data.

Raises:
  • ValueError – If the most recent week in a season has fewer than 100 observations. Typically, values should be around 400. Fewer than 100 indicates an incomplete week.

Returns:

If the most recent week is in season, the dataframe is filtered to the most recent week.

Return type:

pd.DataFrame
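
The completeness check described under Raises can be sketched as follows. Only the 100-row threshold comes from the text above. The `week` column name and the function name are assumptions, and the sketch covers the row-count check only, not the filtering.

```python
import pandas as pd

MIN_ROWS_PER_WEEK = 100  # threshold quoted in the Raises section


def check_latest_week_complete_sketch(df, min_rows=MIN_ROWS_PER_WEEK):
    """Return the latest week number, or raise if that week looks partial."""
    latest_week = int(df["week"].max())
    n_rows = int((df["week"] == latest_week).sum())
    if n_rows < min_rows:
        raise ValueError(
            f"week {latest_week} has {n_rows} rows, expected at least {min_rows}"
        )
    return latest_week


complete = pd.DataFrame({"week": [1] * 400 + [2] * 380})
partial = pd.DataFrame({"week": [1] * 400 + [2] * 35})

latest = check_latest_week_complete_sketch(complete)
try:
    check_latest_week_complete_sketch(partial)
    partial_error = None
except ValueError as err:
    partial_error = str(err)
```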

load_data(data_sources: dict = data_sources, filter_final_season_week: bool = True) FantasyData[source]

Loads all historical fantasy football data from the season year range provided. Each season year is loaded separately and then concatenated together.

Parameters:
  • data_sources (dict) – A dictionary indicating the names of the data sources used in the fantasyfootball package.

  • filter_final_season_week (bool) – If True, the final week of each season is filtered out. These weeks are filtered because many players are ‘active’ but have minimal participation in the game to avoid injury when their team has secured a playoff spot. Excluding these weeks allows for more accurate predictions. Default is True.

Returns:

The dataframe containing all historical fantasy football data for the specified season year range.

Return type:

FantasyData
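
The "load each season separately, then concatenate" pattern can be illustrated with a small pandas sketch. `load_season_sketch` is a hypothetical stand-in that fabricates two placeholder rows per season, and treating the end year as inclusive is an assumption.

```python
import pandas as pd


def load_season_sketch(year):
    """Hypothetical per-season loader returning placeholder rows."""
    return pd.DataFrame({"season_year": [year, year], "week": [1, 2]})


def load_range_sketch(start, end):
    """Load each season on its own, then stack the frames row-wise."""
    frames = [load_season_sketch(year) for year in range(start, end + 1)]
    return pd.concat(frames, ignore_index=True)


combined = load_range_sketch(2019, 2021)
```

Passing `ignore_index=True` gives the combined frame a fresh 0..n-1 index instead of repeating each season's own index.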

static _validate_scoring_source_rules(source_rules: dict, ff_df_columns: list) None[source]

Validates the scoring source rules provided.

Parameters:
  • source_rules (dict) – The scoring source rules to validate.

  • ff_df_columns (list) – The list of columns in the dataframe.

Raises:
  • ValueError – If the scoring source rules are not valid.

  • TypeError – If the scoring source rules are not a dictionary.

  • KeyError – If the scoring source keys do not include ‘scoring_columns’ or ‘multiplier’.

  • KeyError – If scoring columns are not a subset of the dataframe columns.

  • TypeError – If the scoring multiplier is not a dictionary.

  • KeyError – If the scoring multiplier keys do not include ‘threshold’ and ‘points’.

  • KeyError – If the multiplier values are not a subset of the dataframe columns.
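
The checks listed above could be ordered roughly as in this sketch, which validates the rules for a single source. It is an illustration under assumptions, not the package's code: the function name is hypothetical, and the rule layout is inferred from the documented `add_scoring_source` example.

```python
def validate_rules_sketch(rules, df_columns):
    """Raise TypeError/KeyError when rules do not match the expected layout."""
    if not isinstance(rules, dict):
        raise TypeError("scoring source rules must be a dictionary")
    missing_keys = {"scoring_columns", "multiplier"} - set(rules)
    if missing_keys:
        raise KeyError(f"missing required keys: {sorted(missing_keys)}")
    unknown = set(rules["scoring_columns"]) - set(df_columns)
    if unknown:
        raise KeyError(f"unknown scoring columns: {sorted(unknown)}")
    multiplier = rules["multiplier"]
    if not isinstance(multiplier, dict):
        raise TypeError("multiplier must be a dictionary")
    for column, bonus in multiplier.items():
        if column not in df_columns:
            raise KeyError(f"unknown multiplier column: {column}")
        if not {"threshold", "points"} <= set(bonus):
            raise KeyError(f"{column} needs 'threshold' and 'points'")


def outcome(rules, df_columns):
    """Return the name of the raised exception, or None if rules pass."""
    try:
        validate_rules_sketch(rules, df_columns)
    except (TypeError, KeyError) as err:
        return type(err).__name__
    return None


columns = ["rushing_yds", "rushing_td"]
good_rules = {
    "scoring_columns": {"rushing_yds": 0.1, "rushing_td": 6},
    "multiplier": {"rushing_yds": {"threshold": 100, "points": 5}},
}
good_outcome = outcome(good_rules, columns)
```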

add_scoring_source(scoring_source_rules: dict) FantasyData[source]

Updates the scoring source rules.

Parameters:

scoring_source_rules (dict) – Scoring source rules. Required keys are the source name (e.g., ‘custom’), ‘scoring_columns’, and ‘multiplier’.

Returns:

An updated FantasyData object with the new scoring source rules.

Return type:

FantasyData

Example

>>> from fantasyfootball.data import FantasyData
>>> fantasy_data = FantasyData(season_year_start=2019, season_year_end=2021)
>>> new_scoring_source = {
...     "my league": {
...         "scoring_columns": {
...             "passing_td": 4,
...             "passing_yds": 0.04,
...             "passing_int": -3,
...             "rushing_td": 6,
...             "rushing_yds": 0.1,
...             "receiving_rec": 0.5,
...             "receiving_td": 4,
...             "receiving_yds": 0.1,
...             "fumbles_fmb": -3,
...             "scoring_2pm": 4,
...             "punt_returns_td": 6,
...         },
...         "multiplier": {
...             "rushing_yds": {"threshold": 100, "points": 5},
...             "passing_yds": {"threshold": 300, "points": 3},
...             "receiving_yds": {"threshold": 100, "points": 3},
...         },
...     }
... }
>>> fantasy_data.add_scoring_source(new_scoring_source)

static score_player(player_df: pandas.DataFrame, scoring_columns: set, scoring_source_rules: dict) numpy.array[source]

Calculates the total number of points scored for a single week.

Parameters:
  • player_df (pd.DataFrame) – Weekly stats for a single player for the season.

  • scoring_columns (set) – Columns to use for scoring.

  • scoring_source_rules (dict) – Rules for scoring.

Returns:

The total number of points scored for a single week.

Return type:

np.array
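
One plausible reading of the scoring rules is sketched below: each stat is multiplied by its per-unit points, and a `multiplier` entry adds its `points` once when the stat reaches the `threshold`. Both the "greater than or equal" comparison and the function name are assumptions, so this is an illustration rather than the package's scoring logic.

```python
import numpy as np
import pandas as pd


def score_weeks_sketch(player_df, rules):
    """Return one points total per row (week) of player_df."""
    points = np.zeros(len(player_df), dtype=float)
    # Base points: stat value times points per unit.
    for column, per_unit in rules["scoring_columns"].items():
        points += player_df[column].to_numpy(dtype=float) * per_unit
    # Bonus points: awarded once when the stat reaches the threshold (assumed >=).
    for column, bonus in rules["multiplier"].items():
        reached = player_df[column].to_numpy(dtype=float) >= bonus["threshold"]
        points += np.where(reached, bonus["points"], 0.0)
    return points


rules = {
    "scoring_columns": {"rushing_td": 6, "rushing_yds": 0.1},
    "multiplier": {"rushing_yds": {"threshold": 100, "points": 5}},
}
weeks = pd.DataFrame({"rushing_td": [1, 0], "rushing_yds": [120, 40]})
weekly_points = score_weeks_sketch(weeks, rules)
```

With these inputs the first week earns 6 + 12 + 5 points and the second earns 4, since only the first week reaches the 100-yard threshold.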

create_fantasy_points_column(scoring_source: str) FantasyData[source]

Creates a fantasy points column for the scoring source provided.

Parameters:

scoring_source (str) – Name of the scoring source to use (e.g., ‘draft kings’, ‘yahoo’, ‘custom’).

Returns:

An updated FantasyData object with the new fantasy points column.

Return type:

FantasyData

show_scoring_sources() List[str][source]

__str__() str[source]

Returns a string representation of the FantasyData object.

Returns:

Top 5 rows of the FantasyData object.

Return type:

str