
GMM training script.

Each GMM is fitted to 4 feature dimensions. The 4 features are, in order:

  • 'arrival_time': minute of day, normalized to [0, 1)

  • 'departure_time': minute of day, normalized to [0, 1)

  • 'estimated_departure_time': minute of day, normalized to [0, 1)

  • 'requested_energy': energy requested; multiply by 100 to get kWh
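
The feature scaling listed above can be sketched as follows. This is only an illustration of the arithmetic, not the module's own code: the helper names `minute_of_day_fraction` and `to_feature_vector` are hypothetical, and dropping seconds from the timestamp is an assumption.

```python
from datetime import datetime

MINUTES_PER_DAY = 24 * 60
# Per the feature list: normalized requested_energy * 100 = kWh.
ENERGY_SCALE_KWH = 100


def minute_of_day_fraction(ts: datetime) -> float:
    """Map a timestamp to its minute of day, scaled into [0, 1)."""
    return (ts.hour * 60 + ts.minute) / MINUTES_PER_DAY


def to_feature_vector(arrival: datetime, departure: datetime,
                      estimated_departure: datetime,
                      requested_kwh: float) -> list[float]:
    """Build the 4 features in the documented order (hypothetical helper)."""
    return [
        minute_of_day_fraction(arrival),
        minute_of_day_fraction(departure),
        minute_of_day_fraction(estimated_departure),
        requested_kwh / ENERGY_SCALE_KWH,
    ]


features = to_feature_vector(
    arrival=datetime(2019, 5, 1, 9, 0),
    departure=datetime(2019, 5, 1, 18, 0),
    estimated_departure=datetime(2019, 5, 1, 17, 30),
    requested_kwh=12.5,
)
print(features)
```

Multiplying the last entry by 100 recovers the requested energy in kWh, as stated in the feature list.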

Example command line usage:

python -m sustaingym.envs.evcharging.train_gmm_model --site caltech --gmm_n_components 30 --date_ranges 2019-05-01 2019-08-31 2019-09-01 2019-12-31 2020-02-01 2020-05-31 2021-05-01 2021-08-31
python -m sustaingym.envs.evcharging.train_gmm_model --site jpl --gmm_n_components 30 --date_ranges 2019-05-01 2019-08-31 2019-09-01 2019-12-31 2020-02-01 2020-05-31 2021-05-01 2021-08-31


usage: [-h] [--site SITE] [--gmm_n_components GMM_N_COMPONENTS]
                        [--date_ranges DATE_RANGES [DATE_RANGES ...]]

optional arguments:
  -h, --help            show this help message and exit
  --site SITE           Name of site: 'caltech' or 'jpl'
  --gmm_n_components GMM_N_COMPONENTS
  --date_ranges DATE_RANGES [DATE_RANGES ...]
                      Date ranges for GMM models to be trained on. Number
                      of dates must be divisible by 2, with the second
                      later than the first. Dates should be formatted as
                      YYYY-MM-DD. Supported ranges in between 2018-11-01
                      and 2021-08-31.
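
An argument parser consistent with the help text above might look like the sketch below. It is reconstructed from the usage message only. The `int` type for `--gmm_n_components` and the absence of default values are assumptions, and `build_parser` is a hypothetical name.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser matching the documented flags (sketch, not the real one)."""
    parser = argparse.ArgumentParser(description="GMM training script")
    parser.add_argument("--site", help="Name of site: 'caltech' or 'jpl'")
    # Assumption: the component count is parsed as an integer.
    parser.add_argument("--gmm_n_components", type=int)
    # nargs="+" collects one or more dates into a flat list of strings.
    parser.add_argument(
        "--date_ranges", nargs="+",
        help="Date ranges for GMM models to be trained on, as YYYY-MM-DD")
    return parser


args = build_parser().parse_args([
    "--site", "caltech",
    "--gmm_n_components", "30",
    "--date_ranges", "2019-05-01", "2019-08-31", "2019-09-01", "2019-12-31",
])
print(args.site, args.gmm_n_components, args.date_ranges)
```

The flat list in `args.date_ranges` still has to be grouped into begin and end pairs before training.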

Module Contents


preprocess(→ pandas.DataFrame)

Preprocessing script for real event sessions before GMM modeling.

station_id_cnts(→ numpy.ndarray)

Returns the usage counts for a network's charging station ids.


parse_string_date_list(→ [tuple[datetime.datetime, datetime.datetime]])

Converts a sequence of string date ranges to datetimes.

create_gmm(→ None)

Creates a custom GMM and saves it in the gmms folder.

create_gmms(→ None)

Creates multiple GMMs and saves them in the gmms folder.


sustaingym.envs.evcharging.train_gmm_model.preprocess(df: pandas.DataFrame, filter: bool = True) → pandas.DataFrame

Preprocessing script for real event sessions before GMM modeling.

When filter is True, drops sessions whose departure or estimated departure falls on a different date than the arrival. The arrival, departure, and estimated departure times are normalized to [0, 1) as a fraction of the day, and the requested energy is normalized by a scaling factor (multiply by 100 to recover kWh).

Parameters:

  • df (pandas.DataFrame) – DataFrame of charging events, expected to come from utils.get_real_events()

  • filter (bool) – whether to filter out cars staying overnight

Returns:

df – filtered copy of DataFrame with normalized parameters.

Return type: pandas.DataFrame

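The overnight filter described for preprocess can be illustrated without pandas. In this sketch, sessions are plain dicts with hypothetical keys `arrival`, `departure`, and `estimated_departure`. The real function operates on a DataFrame whose column names are not shown on this page, and `drop_overnight` is a hypothetical name.

```python
from datetime import datetime


def same_day(session: dict) -> bool:
    """True if departure and estimated departure share the arrival's date."""
    arrival_date = session["arrival"].date()
    return (session["departure"].date() == arrival_date
            and session["estimated_departure"].date() == arrival_date)


def drop_overnight(sessions: list[dict], enabled: bool = True) -> list[dict]:
    """Return a filtered copy; the input list is left untouched."""
    if not enabled:
        return list(sessions)
    return [s for s in sessions if same_day(s)]


sessions = [
    {   # leaves the same day: kept
        "arrival": datetime(2019, 5, 1, 9, 0),
        "departure": datetime(2019, 5, 1, 18, 0),
        "estimated_departure": datetime(2019, 5, 1, 17, 30),
    },
    {   # stays overnight: dropped
        "arrival": datetime(2019, 5, 1, 20, 0),
        "departure": datetime(2019, 5, 2, 7, 0),
        "estimated_departure": datetime(2019, 5, 2, 8, 0),
    },
]
kept = drop_overnight(sessions)
print(len(kept))
```

Dropping overnight sessions keeps each time feature inside a single day, so the minute-of-day values stay in [0, 1) without wrapping.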

sustaingym.envs.evcharging.train_gmm_model.station_id_cnts(df: pandas.DataFrame, n2i: dict[str, int]) → numpy.ndarray

Returns the usage counts for a network’s charging station ids.

Parameters:

  • df (pandas.DataFrame) – DataFrame of session observations.

  • n2i (dict[str, int]) – dict mapping charging station id to position in numpy array.

Returns:

cnts – number of sessions associated with each station id.

Return type: numpy.ndarray

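The counting that station_id_cnts describes can be sketched with plain Python lists. This is an illustration only: the real function takes a DataFrame and returns a numpy.ndarray, `count_station_usage` and the station ids are hypothetical, and skipping ids missing from n2i is an assumption.

```python
def count_station_usage(station_ids: list[str], n2i: dict[str, int]) -> list[int]:
    """Count sessions per station, indexed by each station's position in n2i."""
    cnts = [0] * len(n2i)
    for station_id in station_ids:
        # Assumption: ids that are not part of the network are ignored.
        if station_id in n2i:
            cnts[n2i[station_id]] += 1
    return cnts


# Hypothetical station ids and session log.
n2i = {"CA-001": 0, "CA-002": 1, "CA-003": 2}
session_station_ids = ["CA-002", "CA-001", "CA-002"]
cnts = count_station_usage(session_station_ids, n2i)
print(cnts)
```

Passing the result to `numpy.asarray` would give an array of the same shape as the documented return value.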

sustaingym.envs.evcharging.train_gmm_model.parse_string_date_list(date_range: [str]) → [tuple[datetime.datetime, datetime.datetime]]

Converts a sequence of string date ranges to datetimes.


Parameters:

date_range ([str]) – an even-length sequence of string dates in the format ‘YYYY-MM-DD’. Each consecutive pair describes a date range and should fall between 2018-11-01 and 2021-08-31.

Returns:

A sequence of 2-tuples containing a begin and end datetime.

Raises:

  • ValueError – length of date_range is odd

  • ValueError – begin date of pair is not before end date of pair

  • ValueError – begin and end date not in data’s range

Return type: [tuple[datetime.datetime, datetime.datetime]]
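
The pairing and the three documented ValueError conditions can be sketched as below. This is a minimal reimplementation under stated assumptions, not the library's code: `parse_date_pairs` is a hypothetical name, the datetimes are timezone-naive, and the data-range bounds are treated as inclusive.

```python
from datetime import datetime

# Supported data range, taken from the documentation above.
DATA_START = datetime(2018, 11, 1)
DATA_END = datetime(2021, 8, 31)


def parse_date_pairs(date_range: list[str]) -> list[tuple[datetime, datetime]]:
    """Group a flat list of 'YYYY-MM-DD' strings into (begin, end) pairs."""
    if len(date_range) % 2 != 0:
        raise ValueError("expected an even number of dates")
    pairs = []
    for i in range(0, len(date_range), 2):
        begin = datetime.strptime(date_range[i], "%Y-%m-%d")
        end = datetime.strptime(date_range[i + 1], "%Y-%m-%d")
        if begin >= end:
            raise ValueError(f"begin {begin:%Y-%m-%d} is not before end {end:%Y-%m-%d}")
        # Assumption: both bounds of the supported range are inclusive.
        if begin < DATA_START or end > DATA_END:
            raise ValueError("date range falls outside 2018-11-01 to 2021-08-31")
        pairs.append((begin, end))
    return pairs


pairs = parse_date_pairs(["2019-05-01", "2019-08-31", "2019-09-01", "2019-12-31"])
print(pairs)
```

Validating every pair before any training starts means a malformed command line fails immediately rather than partway through a batch of models.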

sustaingym.envs.evcharging.train_gmm_model.create_gmm(site: sustaingym.envs.evcharging.utils.SiteStr, n_components: int, date_range: tuple[datetime.datetime, datetime.datetime]) → None

Creates a custom GMM and saves it in the gmms folder.

Parameters:

  • site (sustaingym.envs.evcharging.utils.SiteStr) – either ‘caltech’ or ‘jpl’

  • n_components (int) – number of components of Gaussian mixture model

  • date_range (tuple[datetime.datetime, datetime.datetime]) – a date range that falls between 2018-11-01 and 2021-08-31.

Return type: None


sustaingym.envs.evcharging.train_gmm_model.create_gmms(site: sustaingym.envs.evcharging.utils.SiteStr, n_components: int, date_ranges: [tuple[str, str]] = DEFAULT_DATE_RANGES) → None

Creates multiple GMMs and saves them in the gmms folder.

Parameters:

  • site (sustaingym.envs.evcharging.utils.SiteStr) – either ‘caltech’ or ‘jpl’

  • n_components (int) – number of components of Gaussian mixture model

  • date_ranges ([tuple[str, str]]) – a sequence of 2-tuples of string dates in the format ‘YYYY-MM-DD’. Each tuple describes a date range and should fall between 2018-11-01 and 2021-08-31.

Return type: None

