sustaingym.data.load_moer#

Methods for handling Marginal Operating Emissions Rate (MOER) data from the California Self-Generation Incentive Program. See http://sgipsignal.com/api-documentation for more information.

By default, saves MOER files to sustaingym/data/moer/{ba}_{year}-{month}.csv.gz where {ba} is the balancing authority. The default balancing authorities are 'SGIP_CAISO_PGE' and 'SGIP_CAISO_SCE'.
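As an illustration of the naming scheme above, the following minimal sketch builds a file path from the documented FNAME_FORMAT_STR value. The helper name moer_path is hypothetical and not part of the module; the default directory is taken from the path pattern quoted above.

```python
from pathlib import PurePosixPath

# Value copied from the FNAME_FORMAT_STR attribute documented on this page.
FNAME_FORMAT_STR = '{ba}_{year}-{month:02}.csv.gz'


def moer_path(ba: str, year: int, month: int,
              save_dir: str = 'sustaingym/data/moer') -> str:
    """Hypothetical helper: returns the path of one monthly MOER file."""
    fname = FNAME_FORMAT_STR.format(ba=ba, year=year, month=month)
    return str(PurePosixPath(save_dir) / fname)


path = moer_path('SGIP_CAISO_PGE', 2021, 2)
print(path)
```

The `{month:02}` format specifier zero-pads single-digit months, so file names sort chronologically within a balancing authority.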

Module Contents#

Classes#

MOERLoader

Class for loading emission rates data for gyms.

Functions#

get_data_sgip(→ pandas.DataFrame)

Retrieves data from the SGIP Signal API.

get_historical_and_forecasts(→ pandas.DataFrame)

Retrieves historical and forecast MOER data.

save_monthly_moer(→ None)

Saves 1 month of historical and forecasted MOER data, with 1 day of padding on either end.

save_moer(→ None)

Saves all full-months data between a date range.

save_moer_default_ranges(→ None)

Saves all monthly data for default date ranges.

load_monthly_moer(→ pandas.DataFrame)

Loads pandas DataFrame from file.

load_moer(→ pandas.DataFrame)

Returns data for all months that overlap with interval.

Attributes#

sustaingym.data.load_moer.DEFAULT_DATE_RANGES = [('2019-05', '2021-08')][source]#
sustaingym.data.load_moer.DATE_FORMAT = '%Y-%m'[source]#
sustaingym.data.load_moer.USERNAME[source]#
sustaingym.data.load_moer.PASSWORD[source]#
sustaingym.data.load_moer.USERNAME = 'caltech'[source]#
sustaingym.data.load_moer.LOGIN_URL = 'https://sgipsignal.com/login/'[source]#
sustaingym.data.load_moer.DATA_URLS[source]#
sustaingym.data.load_moer.DATA_VERSIONS[source]#
sustaingym.data.load_moer.TIME_COLUMN[source]#
sustaingym.data.load_moer.SGIP_DT_FORMAT = '%Y-%m-%dT%H:%M:%S%z'[source]#
sustaingym.data.load_moer.FNAME_FORMAT_STR = '{ba}_{year}-{month:02}.csv.gz'[source]#
sustaingym.data.load_moer.DEFAULT_SAVE_DIR[source]#
sustaingym.data.load_moer.COMPRESSION = 'gzip'[source]#
sustaingym.data.load_moer.INDEX_NAME = 'time'[source]#
sustaingym.data.load_moer.BALANCING_AUTHORITIES = ['SGIP_CAISO_PGE', 'SGIP_CAISO_SCE'][source]#
sustaingym.data.load_moer.FIVEMINS[source]#
sustaingym.data.load_moer.ONEDAY[source]#
sustaingym.data.load_moer.get_data_sgip(starttime: str, endtime: str, ba: str, req_type: Literal[historical, forecasted], forecast_timesteps: int = 36) pandas.DataFrame[source]#

Retrieves data from the SGIP Signal API.

Authenticates the user, performs the API request, and returns the data as a DataFrame. If req_type is 'historical', returns the historical MOER. If req_type is 'forecasted', returns the forecast for the emissions rate at the next 5-minute mark. See https://sgipsignal.com/api-documentation

Parameters:
  • starttime (str) – start time. Format ISO 8601 timestamp. See https://en.wikipedia.org/wiki/ISO_8601#Combined_date_and_time_representations

  • endtime (str) – end time for data, inclusive. See starttime. Historical queries are limited to 31 days. Forecast queries are limited to 1 day.

  • ba (str) – balancing authority, responsible for region grid operation.

  • req_type (Literal[historical, forecasted]) – either 'historical' or 'forecasted'

  • forecast_timesteps (int) – number of forecast timesteps to grab (in 5 min increments), default is 36 timesteps (=3 hours)

Returns:

df – DataFrame containing either historical or forecasted rates with a DatetimeIndex named “time”. The index dtype is datetime64[ns, UTC].

If forecast:

f1                        float64
f2                        float64
...
f{forecast_timesteps}     float64

If historical:

moer                      float64
Return type:

pandas.DataFrame

Example:

starttimestr = '2021-02-20T00:00:00+0000'
endtimestr = '2021-02-20T23:10:00+0000'
ba = 'SGIP_CAISO_PGE'
df = get_data_sgip(starttimestr, endtimestr, ba, 'forecasted')
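The timestamp strings in the example above match the documented SGIP_DT_FORMAT. As a small sketch, a timezone-aware datetime can be converted to that string form with strftime; whether the module performs the conversion exactly this way is an assumption.

```python
from datetime import datetime, timezone

# Value copied from the SGIP_DT_FORMAT attribute documented on this page.
SGIP_DT_FORMAT = '%Y-%m-%dT%H:%M:%S%z'

# A timezone-aware datetime in UTC.
start = datetime(2021, 2, 20, 0, 0, 0, tzinfo=timezone.utc)

# Format it as an ISO 8601 style string with a numeric UTC offset.
starttimestr = start.strftime(SGIP_DT_FORMAT)
print(starttimestr)

# Parsing with the same format string recovers the original datetime.
roundtrip = datetime.strptime(starttimestr, SGIP_DT_FORMAT)
```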
sustaingym.data.load_moer.get_historical_and_forecasts(starttime: datetime.datetime, endtime: datetime.datetime, ba: str) pandas.DataFrame[source]#

Retrieves historical and forecast MOER data.

May request forecasted data repeatedly due to API constraints. See the endtime parameter of get_data_sgip() for the query limits.

Parameters:
  • starttime (datetime.datetime) – start time. A timezone-aware datetime object. If not timezone-aware, assumes UTC time.

  • endtime (datetime.datetime) – timezone-aware datetime object. See starttime.

  • ba (str) – balancing authority, responsible for region grid operation.

Returns:

combined_df – DataFrame of both historical and forecasted MOER values

time (index)    datetime64[ns, UTC]
moer            float64               historical MOER at given time
f1              float64               forecast for time+5min, generated at given time
...
f36             float64               forecast for time+3h, generated at given time
Return type:

pandas.DataFrame
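Because forecast queries are limited to 1 day, a longer interval has to be covered by several requests. The sketch below shows one way to split an interval into windows of at most one day. The function name day_windows is hypothetical, and the module's actual chunking logic may differ.

```python
from datetime import datetime, timedelta, timezone


def day_windows(start: datetime, end: datetime,
                step: timedelta = timedelta(days=1)):
    """Hypothetical helper: splits [start, end] into windows no longer than step."""
    windows = []
    cur = start
    while cur < end:
        nxt = min(cur + step, end)
        windows.append((cur, nxt))
        cur = nxt
    return windows


start = datetime(2021, 2, 20, tzinfo=timezone.utc)
end = datetime(2021, 2, 22, 12, tzinfo=timezone.utc)
windows = day_windows(start, end)
for w_start, w_end in windows:
    print(w_start.isoformat(), '->', w_end.isoformat())
```

Adjacent windows share a boundary timestamp. Since endtime is documented as inclusive, a caller combining the per-window results would likely need to drop duplicated index values.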

sustaingym.data.load_moer.save_monthly_moer(year: int, month: int, ba: str, save_dir: str) None[source]#

Saves 1 month of historical and forecasted MOER data, with 1 day of padding on either end.

May request forecasted data repeatedly due to API constraints. See the endtime parameter of get_data_sgip() for the query limits. NaNs in the data are imputed with the previous non-NaN value.

Parameters:
  • year (int) – year of requested month

  • month (int) – requested month

  • ba (str) – balancing authority, responsible for region grid operation.

  • save_dir (str) – directory to save compressed csv to.

Return type:

None
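The NaN imputation described above corresponds to a forward fill. The following sketch demonstrates the behavior on a small synthetic DataFrame; the values are made up for illustration, and the exact pandas call used by the module is an assumption.

```python
import numpy as np
import pandas as pd

# Synthetic 5-minute series with two missing values (illustrative data only).
idx = pd.date_range('2021-02-01', periods=4, freq='5min', tz='UTC', name='time')
df = pd.DataFrame({'moer': [0.4, np.nan, np.nan, 0.5]}, index=idx)

# Forward fill: each NaN takes the most recent non-NaN value above it.
filled = df.ffill()
print(filled)
```

A NaN in the very first row would remain NaN under a forward fill, since there is no earlier value to copy.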

sustaingym.data.load_moer.save_moer(starttime: datetime.datetime, endtime: datetime.datetime, ba: str) None[source]#

Saves all full-months data between a date range.

Saves data separated by months as separate compressed csv files, which contain historical and forecasted marginal emission rates for the days spanning the month.

Parameters:
  • starttime (datetime.datetime) – start time for data. Only year and month are used. Timezone information is ignored.

  • endtime (datetime.datetime) – end time for data. See starttime.

  • ba (str) – balancing authority, responsible for region grid operation.

Return type:

None
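Since only the year and month of starttime and endtime are used, the date range reduces to a list of (year, month) pairs. The sketch below enumerates them. The helper name months_between is hypothetical, and treating both endpoints as inclusive is an assumption.

```python
from datetime import datetime


def months_between(start: datetime, end: datetime):
    """Hypothetical helper: lists (year, month) pairs from start to end, inclusive."""
    months = []
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        months.append((year, month))
        month += 1
        if month > 12:  # roll over into January of the next year
            year, month = year + 1, 1
    return months


months = months_between(datetime(2020, 11, 15), datetime(2021, 2, 1))
print(months)
```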

sustaingym.data.load_moer.save_moer_default_ranges() None[source]#

Saves all monthly data for default date ranges.

Repeatedly calls save_moer() for all months spanned by the default ranges. Saves for both balancing authorities: ‘SGIP_CAISO_PGE’, ‘SGIP_CAISO_SCE’.

Return type:

None
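As a rough sketch of the loop described above, the default ranges can be parsed with DATE_FORMAT and paired with each balancing authority. The function default_jobs is hypothetical; it only builds the argument tuples that would be passed to save_moer() and makes no API calls.

```python
from datetime import datetime

# Values copied from the attributes documented on this page.
DEFAULT_DATE_RANGES = [('2019-05', '2021-08')]
DATE_FORMAT = '%Y-%m'
BALANCING_AUTHORITIES = ['SGIP_CAISO_PGE', 'SGIP_CAISO_SCE']


def default_jobs():
    """Hypothetical helper: returns (starttime, endtime, ba) tuples for save_moer()."""
    jobs = []
    for ba in BALANCING_AUTHORITIES:
        for start_str, end_str in DEFAULT_DATE_RANGES:
            start = datetime.strptime(start_str, DATE_FORMAT)
            end = datetime.strptime(end_str, DATE_FORMAT)
            jobs.append((start, end, ba))
    return jobs


jobs = default_jobs()
for job in jobs:
    print(job)
```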

sustaingym.data.load_moer.load_monthly_moer(year: int, month: int, ba: str, save_dir: str | None = None) pandas.DataFrame[source]#

Loads pandas DataFrame from file.

Parameters:
  • year (int) – year of requested month

  • month (int) – requested month

  • ba (str) – balancing authority, responsible for region grid operation

  • save_dir (str | None) – directory to load compressed csv from

Returns:

df – DataFrame of the emission rates for the month, with index sorted chronologically. See get_historical_and_forecasts() for more info.

Return type:

pandas.DataFrame

sustaingym.data.load_moer.load_moer(starttime: datetime.datetime, endtime: datetime.datetime, ba: str, save_dir: str | None = None) pandas.DataFrame[source]#

Returns data for all months that overlap with interval.

Parameters:
  • starttime (datetime.datetime) – start time for data. Only year and month are used.

  • endtime (datetime.datetime) – end time for data. See starttime.

  • ba (str) – balancing authority, responsible for region grid operation

  • save_dir (str | None) – directory to load compressed csvs from

Returns:

df – DataFrame of historical emissions and forecasts for all months that overlap the (starttime, endtime) interval. Index is sorted chronologically. See get_historical_and_forecasts() for more info.

Return type:

pandas.DataFrame

Example:

starttime, endtime = datetime(2021, 2, 1), datetime(2021, 5, 31)
ba = 'SGIP_CAISO_PGE'
df = load_moer(starttime, endtime, ba, 'sustaingym/data/moer')
class sustaingym.data.load_moer.MOERLoader(starttime: datetime.datetime, endtime: datetime.datetime, ba: str, save_dir: str | None = None)[source]#

Class for loading emission rates data for gyms.

Parameters:
  • starttime (datetime.datetime) – start time for data. Only year and month are used

  • endtime (datetime.datetime) – end time for data. See starttime

  • ba (str) – balancing authority, responsible for region grid operation

  • save_dir (str | None) – directory to load compressed csv from

df#

DataFrame of historical emissions and forecasts for all months that overlap the (starttime, endtime) interval. Index is sorted chronologically. See get_historical_and_forecasts() for more info.

retrieve(dt: datetime.datetime) numpy.ndarray[source]#

Retrieves MOER data starting at given datetime for next 24 hours.

Parameters:

dt (datetime.datetime) – a timezone-aware datetime object

Returns:

data – array of shape (289, 37). The first column is the historical MOER. The remaining columns are forecasts for the next 36 five-minute time steps. Units are kg CO2 per kWh. Rows are sorted chronologically.

Return type:

numpy.ndarray
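The documented shape follows from the sampling interval: 24 hours at 5-minute steps gives 288 intervals, so a slice that includes both endpoints has 289 rows, and 1 historical column plus 36 forecast columns gives 37. The sketch below reproduces this on a synthetic, zero-filled DataFrame. The function retrieve_sketch is hypothetical and is not the class's actual implementation.

```python
from datetime import datetime, timedelta, timezone

import numpy as np
import pandas as pd

# Synthetic frame with the documented column layout: moer, f1, ..., f36.
idx = pd.date_range('2021-02-01', '2021-02-03', freq='5min', tz='UTC', name='time')
cols = ['moer'] + [f'f{i}' for i in range(1, 37)]
df = pd.DataFrame(np.zeros((len(idx), len(cols))), index=idx, columns=cols)


def retrieve_sketch(df: pd.DataFrame, dt: datetime) -> np.ndarray:
    """Hypothetical stand-in: rows from dt through dt + 24 hours, inclusive."""
    end = dt + timedelta(days=1)
    # Label-based slicing on a DatetimeIndex includes both endpoints.
    return df.loc[dt:end].to_numpy()


dt = datetime(2021, 2, 1, 8, 0, tzinfo=timezone.utc)
data = retrieve_sketch(df, dt)
print(data.shape)
```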