sustaingym.envs.building

Submodules

Package Contents

Classes

BuildingEnv

BuildingEnv class.

MultiAgentBuildingEnv

Multi-agent building environment.

Functions

ParameterGenerator(building, weather, location, ...)

Generates parameters from the selected building and temperature file for the env.

class sustaingym.envs.building.BuildingEnv(parameters: dict[str, Any])

Bases: gymnasium.Env

BuildingEnv class.

This class simulates the zonal temperature of a building controlled by a user-selected agent. It constructs a physics-based building simulation model based on the RC model with a nonlinear residual model. The simulation is based on the EPW weather file provided by the Building Energy Codes Program.

This environment’s API is known to be compatible with Gymnasium v0.28 and v0.29.

In what follows:

  • n = number of zones (rooms) in the building

  • k = number of steps for the MOER CO2 forecast

  • T = length of time-series data

Actions:

Type: Box(n)
Action                                           Shape       Min         Max
HVAC power consumption (cool in -, heat in +)    n           -1          1
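As a rough illustration of the sign convention in the table above, the sketch below maps a normalized action in [-1, 1] to a signed power in watts. The helper name `action_to_watts` and the elementwise scaling by `max_power` and `ac_map` are assumptions made for illustration; the environment's internal conversion is not described on this page and may differ.

```python
def action_to_watts(action, max_power, ac_map):
    """Hypothetical helper: scale normalized actions to signed HVAC power.

    Negative values mean cooling, positive values mean heating.
    """
    watts = []
    for a, has_ac in zip(action, ac_map):
        a = max(-1.0, min(1.0, a))  # clip to the Box bounds [-1, 1]
        watts.append(a * max_power * (1 if has_ac else 0))
    return watts


# Three zones; the third has no AC unit, so its power is always zero.
print(action_to_watts([-0.5, 1.2, 0.8], max_power=8000, ac_map=[1, 1, 0]))
```

The out-of-range value 1.2 is clipped to 1 before scaling, mirroring the Min/Max bounds of the action space.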

Observations:

TODO: fix min/max for occupower

Type: Box(n+4)
Observation                                      Shape       Min         Max
Temperature of zones (celsius)                   n           temp_min    temp_max
Temperature of outdoor (celsius)                 1           temp_min    temp_max
Temperature of ground (celsius)                  1           temp_min    temp_max
Global Horizontal Irradiance (W)                 1           0           heat_max
Occupancy power (W)                              1           0           heat_max

Parameters:

parameters (dict[str, Any]) –

dict of parameters for the environment

  • ’n’ (int): number of rooms

  • ’zones’ (list[Zone]): list of length n, information about each zone

  • ’target’ (np.ndarray): shape (n,), target temperature of each room

  • ’out_temp’ (np.ndarray): shape (T,), outdoor temperature

  • ’ground_temp’ (np.ndarray): shape (T,), ground temperature

  • ’ghi’ (np.ndarray): shape (T,), global horizontal irradiance, normalized to [0, 1]

  • ’metabolism’ (np.ndarray): shape (T,), total metabolic rate of occupants (in W)

  • ’reward_beta’ (float): temperature error penalty, for the reward function

  • ’reward_pnorm’ (float): p to use for norm in reward function

  • ’ac_map’ (np.ndarray): boolean array of shape (n,) specifying presence (1) or absence (0) of AC in individual rooms

  • ’max_power’ (float): max power output of a single HVAC unit (in W)

  • ’temp_range’ (tuple[float, float]): tuple of (min temp, max temp) in Celsius, defining the possible temperature in the building

  • ’is_continuous_action’ (bool): determines action space (Box vs. MultiDiscrete).

  • ’time_resolution’ (int): time resolution of the simulation (in seconds)

  • ’A’ (np.ndarray): A matrix, shape (n, n+1)

  • ’B’ (np.ndarray): B matrix of shape (n, n+3)

  • ’D’ (np.ndarray): D vector of shape (n,)
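The documented shapes can be checked before constructing the environment. The following is a minimal sketch, not part of the package: `check_parameter_shapes` and the `fake` stand-in are hypothetical names, and the check only covers the keys whose shapes are listed above. It accepts NumPy arrays or any object exposing a `shape` tuple.

```python
from types import SimpleNamespace


def check_parameter_shapes(parameters):
    """Hypothetical sanity check for the documented array shapes.

    Returns a list of human-readable problems (empty if none found).
    """
    n = parameters["n"]
    expected = {
        "target": (n,),
        "ac_map": (n,),
        "A": (n, n + 1),
        "B": (n, n + 3),
        "D": (n,),
    }
    problems = []
    for key, shape in expected.items():
        actual = tuple(parameters[key].shape)
        if actual != shape:
            problems.append(f"{key}: expected {shape}, got {actual}")
    # The time series must all share the same length T.
    lengths = {key: parameters[key].shape[0]
               for key in ("out_temp", "ground_temp", "ghi", "metabolism")}
    if len(set(lengths.values())) != 1:
        problems.append(f"time series lengths differ: {lengths}")
    return problems


def fake(*shape):
    """Stand-in for an array: only carries a shape tuple."""
    return SimpleNamespace(shape=shape)


n, T = 3, 288
params = {
    "n": n,
    "target": fake(n), "ac_map": fake(n),
    "A": fake(n, n + 1), "B": fake(n, n + 3), "D": fake(n),
    "out_temp": fake(T), "ground_temp": fake(T),
    "ghi": fake(T), "metabolism": fake(T),
}
print(check_parameter_shapes(params))
```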

parameters

Dictionary containing the parameters for the environment.

Type:

dict

observation_space

structure of observations returned by environment

timestep

current timestep in episode, from 0 to 288

action_space

structure of actions expected by environment

OCCU_COEF = [6.461927, 0.946892, 2.55737e-05, 0.0627909, 5.89172e-05, 0.19855, 0.000940018, 1.49532e-06]
OCCU_COEF_LINEAR = 7.139322
DISCRETE_LENGTH = 100
SCALING_FACTOR = 24
state: numpy.ndarray
step(action: numpy.ndarray) → tuple[numpy.ndarray, float, bool, bool, dict[str, Any]]

Steps the environment.

Updates the state of the environment based on the given action and calculates the reward, done, and info values for the current timestep.

Parameters:

action (numpy.ndarray) – Action to be taken in the environment.

Returns:
  • state – array of shape (n+4,), updated state of the environment. Contains:

    • ‘X_new’: shape [n], new temperatures of the rooms.

    • ‘out_temp’: scalar, outdoor temperature (°C) at the current timestep

    • ‘ground_temp’: scalar, ground temperature (°C) at current timestep

    • ‘ghi’: scalar, global horizontal irradiance at the current timestep.

    • ‘Occupower’: scalar, occupancy power at the current timestep.

  • reward – Reward for the current timestep.

  • terminated – Whether the episode is terminated.

  • truncated – Whether the episode has reached a time limit.

  • info – Dictionary containing auxiliary information.

    • ‘statelist’: List of states in the environment.

    • ‘actionlist’: List of actions taken in the environment.

    • ‘epochs’: Counter for the number of epochs (int).

Return type:

tuple[numpy.ndarray, float, bool, bool, dict[str, Any]]
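Because the state is a flat array of shape (n+4,), it can be convenient to name its components. The sketch below follows the ordering listed above (zone temperatures first, then outdoor temperature, ground temperature, GHI, and occupancy power). The helper `split_state` and its dictionary keys are hypothetical and are not provided by the package.

```python
def split_state(state, n):
    """Hypothetical helper: name the components of an (n+4,) state vector."""
    if len(state) != n + 4:
        raise ValueError(f"expected length {n + 4}, got {len(state)}")
    return {
        "zone_temps": list(state[:n]),   # one temperature per zone
        "out_temp": state[n],            # outdoor temperature
        "ground_temp": state[n + 1],     # ground temperature
        "ghi": state[n + 2],             # global horizontal irradiance
        "occupower": state[n + 3],       # occupancy power
    }


# Example with n = 3 zones; the values are made up for illustration.
parts = split_state([21.5, 22.0, 23.1, 30.2, 18.0, 0.4, 120.0], n=3)
print(parts["zone_temps"], parts["out_temp"])
```

The helper uses only indexing and slicing, so it works on a plain list or on a NumPy array.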

reset(*, seed: int | None = None, options: dict | None = None) → tuple[numpy.ndarray, dict[str, Any]]

Resets the environment.

Prepares the environment for the next episode by setting the initial temperatures, average temperature, occupancy, and occupower. The initial state is constructed by concatenating these variables.

Parameters:
  • seed (int | None) – seed for resetting the environment. The seed determines which episode to start at. Increment the seed sequentially to experience episodes in chronological order. Set seed to None for a random episode. An episode is entirely reproducible no matter the generator used.

  • options (dict | None) –

    optional resetting options

    • ’T_initial’: np.ndarray, shape [n], initial temperature of each zone

Returns:
  • state – the initial state of the environment. See step()

  • info – information dictionary. See step()

Return type:

tuple[numpy.ndarray, dict[str, Any]]
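The reset() and step() signatures above follow the usual Gymnasium loop. The sketch below shows one way to roll out a single episode against any object with those two methods. `run_episode` and `StubEnv` are hypothetical: the stub only mimics the documented signatures and return shapes so the example is self-contained, and it does not model the building physics or the real reward.

```python
def run_episode(env, policy, seed=None):
    """Roll out one episode and return the total reward and step count."""
    obs, info = env.reset(seed=seed)
    total_reward, steps = 0.0, 0
    done = False
    while not done:
        action = policy(obs)
        obs, reward, terminated, truncated, info = env.step(action)
        total_reward += reward
        steps += 1
        done = terminated or truncated
    return total_reward, steps


class StubEnv:
    """Hypothetical stand-in with the same reset/step signatures."""

    def __init__(self, n=2, episode_len=5):
        self.n, self.episode_len, self.t = n, episode_len, 0

    def reset(self, *, seed=None, options=None):
        self.t = 0
        return [22.0] * (self.n + 4), {}

    def step(self, action):
        self.t += 1
        obs = [22.0] * (self.n + 4)
        reward = -sum(abs(a) for a in action)  # placeholder reward
        truncated = self.t >= self.episode_len
        return obs, reward, False, truncated, {}


total, steps = run_episode(StubEnv(), policy=lambda obs: [0.5, -0.5], seed=0)
print(total, steps)
```

The loop stops on either `terminated` or `truncated`, so it does not depend on which of the two flags the environment uses to end an episode.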

train(states: numpy.ndarray, actions: numpy.ndarray) → None

Trains the linear regression model using the given states and actions.

The model is trained to predict the next state based on the current state and action. The trained coefficients are stored in the environment for later use.

Parameters:
  • states (numpy.ndarray) – a list of states.

  • actions (numpy.ndarray) – a list of actions corresponding to each state.

Return type:

None

class sustaingym.envs.building.MultiAgentBuildingEnv(parameters: dict[str, Any])

Bases: pettingzoo.ParallelEnv

Multi-agent building environment.

Each agent controls the AC unit in a single zone. Agent IDs are integers.

This environment’s API is known to be compatible with PettingZoo v1.24.1.

Parameters:
  • parameters (dict[str, Any]) – dict of parameters for the environment (see BuildingEnv)

  • global_obs – whether each agent observes the global state or only the temperature of its own zone

# attributes required by pettingzoo.ParallelEnv
agents

list[int], agent IDs, indices of zones with AC units

possible_agents

list[int], same as agents

observation_spaces

dict[int, spaces.Box], observation space for each agent

action_spaces

dict[int, spaces.Box], action space for each agent

# attributes specific to MultiAgentBuildingEnv
single_env

BuildingEnv

periods_delay

int, time periods of delay for inter-agent communication

step(actions: collections.abc.Mapping[int, numpy.ndarray]) → tuple[dict[int, numpy.ndarray], dict[int, float], dict[int, bool], dict[int, bool], dict[int, dict[str, Any]]]

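Steps the environment with one action per agent. Returns a tuple (obss, rewards, terminateds, truncateds, infos), where each element is a dict keyed by agent ID.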

Parameters:

actions (collections.abc.Mapping[int, numpy.ndarray]) –

Return type:

tuple[dict[int, numpy.ndarray], dict[int, float], dict[int, bool], dict[int, bool], dict[int, dict[str, Any]]]
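In the multi-agent API, actions go in and results come out as mappings keyed by integer agent ID. The sketch below builds such a mapping and unpacks the five returned dicts. `step_all_agents` and `StubParallelEnv` are hypothetical names: the stub only reproduces the documented call shape and is not the real environment.

```python
def step_all_agents(env, policy):
    """Build the per-agent action mapping, step once, and report who is done."""
    actions = {agent: policy(agent) for agent in env.agents}
    obss, rewards, terminateds, truncateds, infos = env.step(actions)
    finished = [a for a in rewards if terminateds[a] or truncateds[a]]
    return obss, rewards, finished


class StubParallelEnv:
    """Hypothetical stand-in: zones 0 and 2 have AC units, zone 1 does not."""

    agents = [0, 2]

    def step(self, actions):
        obss = {a: [22.0] for a in actions}
        rewards = {a: -abs(actions[a][0]) for a in actions}  # placeholder
        flags = {a: False for a in actions}
        infos = {a: {} for a in actions}
        return obss, rewards, dict(flags), dict(flags), infos


obss, rewards, finished = step_all_agents(StubParallelEnv(), lambda agent: [0.5])
print(sorted(rewards), finished)
```

Note that agent IDs are zone indices, so they need not be contiguous when some zones lack an AC unit.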

reset(seed: int | None = None, options: dict | None = None) → tuple[dict[int, numpy.ndarray], dict[int, dict[str, Any]]]

Resets the environment.

Parameters:
  • seed (int | None) –

  • options (dict | None) –

Return type:

tuple[dict[int, numpy.ndarray], dict[int, dict[str, Any]]]

render() → None

Render environment.

Return type:

None

close() → None

Close the environment.

Return type:

None

state() → numpy.ndarray
Return type:

numpy.ndarray

observation_space(agent: int) → gymnasium.spaces.Space
Parameters:

agent (int) –

Return type:

gymnasium.spaces.Space

action_space(agent: int) → gymnasium.spaces.Box | gymnasium.spaces.Discrete
Parameters:

agent (int) –

Return type:

gymnasium.spaces.Box | gymnasium.spaces.Discrete

sustaingym.envs.building.ParameterGenerator(building: str, weather: str, location: str, U_Wall: Ufactor = (0,) * 7, ground_temp: collections.abc.Sequence[float] = (0,) * 12, shgc: float = 0.252, shgc_weight: float = 0.01, ground_weight: float = 0.5, full_occ: numpy.ndarray | float = 0, max_power: numpy.ndarray | int = 8000, ac_map: numpy.ndarray | int = 1, time_res: int = 300, reward_beta: float = 0.999, reward_pnorm: float = 2, target: numpy.ndarray | float = 22, activity_sch: numpy.ndarray | float = 120, temp_range: tuple[float, float] = (-40, 40), is_continuous_action: bool = True, root: str = '', stochastic_summer_percentage: float | None = None, episode_len: int = 288, block_size: int = None) → dict[str, Any]

Generates parameters from the selected building and temperature file for the env.

Parameters:
  • building (str) – name of a building in BUILDINGS, or path (relative to root) to a htm file for building idf

  • weather (str) – name of a weather condition in WEATHER, or path to an epw file.

  • location (str) – name of a location in GROUND_TEMP

  • U_Wall (Ufactor) – list of 7 U-values (thermal transmittance) for different surfaces in the building in the order [intwall, floor, outwall, roof, ceiling, groundfloor, window]. Only used if building cannot be found in BUILDINGS

  • ground_temp (collections.abc.Sequence[float]) – monthly ground temperature (in Celsius) when location is not in GROUND_TEMP

  • shgc (float) – Solar Heat Gain Coefficient for windows (unitless)

  • shgc_weight (float) – Weight factor for extra loss of solar irradiance (ghi)

  • ground_weight (float) – Weight factor for extra loss of heat from ground

  • full_occ (numpy.ndarray | float) – max number of people that can occupy each room, either an array of shape (n,) specifying maximum for each room, or a scalar maximum that applies to all rooms

  • max_power (numpy.ndarray | int) – max power output of a single HVAC unit (in W)

  • ac_map (numpy.ndarray | int) – binary indicator of presence (1) or absence (0) of AC, either a boolean array of shape (n,) to specify AC presence in individual rooms, or a scalar indicating AC presence in all rooms

  • time_res (int) – length of 1 timestep in seconds. Default is 300 (5 min).

  • reward_beta (float) – temperature error penalty weight for reward function

  • reward_pnorm (float) – p to use for norm in reward function

  • target (numpy.ndarray | float) – target temperature setpoints (in Celsius), either an array specifying individual setpoints for each zone, or a scalar setpoint for all zones

  • activity_sch (numpy.ndarray | float) – metabolic rate of people in the building (in W), either an array of shape (T,) to specify metabolic rate at every time step, or a scalar rate for all time steps

  • temp_range (tuple[float, float]) – (min temperature, max temperature) in Celsius, defining the possible temperature in the building

  • is_continuous_action (bool) – whether to use continuous action space (as opposed to MultiDiscrete)

  • root (str) – root directory for building and weather data files, only used when building and weather do not correspond to keys in BUILDINGS and WEATHER

  • stochastic_summer_percentage (float | None) – fraction (between 0 and 1) of the generated observations that should be weighted toward those from the summer distribution. None if not using stochastic features

  • episode_len (int) – number of time steps in each episode. Default is 288, which at the default 5-minute time_res corresponds to 1 day.

  • block_size (int) – size (in hours) of blocks of data to fit distributions to (e.g., block_size=24 will sample daily blocks of 24 hourly observations to fit distributions to)

Returns:

parameters – Contains all parameters needed for environment initialization.

Return type:

dict[str, Any]
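The relationship between episode_len and time_res is simple arithmetic: the simulated duration of an episode is the number of steps times the seconds per step. The helper below is a hypothetical illustration of that calculation, using the documented defaults, and is not part of the package.

```python
def episode_duration_hours(episode_len=288, time_res=300):
    """Simulated episode length in hours: steps * seconds per step / 3600."""
    return episode_len * time_res / 3600


# Defaults: 288 steps of 300 s each.
print(episode_duration_hours())
# A coarser 15-minute resolution needs only 96 steps to cover the same span.
print(episode_duration_hours(episode_len=96, time_res=900))
```

Both calls cover 86,400 simulated seconds, so changing time_res without adjusting episode_len changes how much simulated time one episode spans.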