sustaingym.envs.building.env#
This module implements the BuildingEnv class.
Module Contents#
Classes#
BuildingEnv | BuildingEnv class.
- class sustaingym.envs.building.env.BuildingEnv(parameters: dict[str, Any])[source]#
Bases:
gymnasium.Env
BuildingEnv class.
This class simulates the zonal temperatures of a building controlled by a user-selected agent. It constructs a physics-based building simulation model based on an RC (resistance-capacitance) model with a nonlinear residual model. The simulation is driven by an EPW weather file provided by the Building Energy Codes Program.
This environment’s API is known to be compatible with Gymnasium v0.28, v0.29.
In what follows:
- n = number of zones (rooms) in the building
- k = number of steps for the MOER CO2 forecast
- T = length of time-series data
Actions:
Type: Box(n)
- HVAC power consumption (cool in -, heat in +): shape n, min -1, max 1
Observations:
TODO: fix min/max for occupower
Type: Box(n+4)
- Temperature of zones (Celsius): shape n, min temp_min, max temp_max
- Temperature of outdoor air (Celsius): shape 1, min temp_min, max temp_max
- Temperature of ground (Celsius): shape 1, min temp_min, max temp_max
- Global Horizontal Irradiance (W): shape 1, min 0, max heat_max
- Occupancy power (W): shape 1, min 0, max heat_max
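The observation layout above can be illustrated with a small numpy sketch. All concrete values below are hypothetical; only the ordering and the (n+4,) shape come from the table:

```python
import numpy as np

n = 4  # hypothetical number of zones
zone_temps = np.array([21.0, 22.5, 20.8, 23.1])  # shape (n,), zone temperatures in Celsius
out_temp = 15.0     # outdoor air temperature (Celsius)
ground_temp = 12.0  # ground temperature (Celsius)
ghi = 300.0         # global horizontal irradiance (W)
occupower = 120.0   # occupancy power (W)

# Observation = zone temperatures followed by the four scalar signals.
obs = np.concatenate([zone_temps, [out_temp, ground_temp, ghi, occupower]])
assert obs.shape == (n + 4,)
```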
- Parameters:
parameters (dict[str, Any]) –
dict of parameters for the environment
’n’ (int): number of rooms
’zones’ (list[Zone]): list of length n, information about each zone
’target’ (np.ndarray): shape (n,), target temperature of each room
’out_temp’ (np.ndarray): shape (T,), outdoor temperature
’ground_temp’ (np.ndarray): shape (T,), ground temperature
’ghi’ (np.ndarray): shape (T,), global horizontal irradiance, normalized to [0, 1]
’metabolism’ (np.ndarray): shape (T,), total metabolic rate of occupants (in W)
’reward_beta’ (float): temperature error penalty, for the reward function
’reward_pnorm’ (float): p to use for norm in reward function
’ac_map’ (np.ndarray): boolean array of shape (n,) specifying presence (1) or absence (0) of AC in individual rooms
’max_power’ (float): max power output of a single HVAC unit (in W)
’temp_range’ (tuple[float, float]): tuple of (min temp, max temp) in Celsius, defining the possible temperature in the building
’is_continuous_action’ (bool): determines action space (Box vs. MultiDiscrete).
’time_resolution’ (int): time resolution of the simulation (in seconds)
’A’ (np.ndarray): A matrix, shape (n, n+1)
’B’ (np.ndarray): B matrix of shape (n, n+3)
’D’ (np.ndarray): D vector of shape (n,)
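The shapes of A (n, n+1), B (n, n+3), and D (n,) suggest a discrete linear temperature update in which the state is augmented with the outdoor temperature and the inputs bundle the HVAC actions with the exogenous signals. The exact form and input ordering used internally are not shown here; the following is only a plausible numpy sketch under those assumptions:

```python
import numpy as np

n = 2  # hypothetical number of zones
rng = np.random.default_rng(0)
# Hypothetical dynamics matrices with the documented shapes:
# near-identity A so temperatures evolve slowly, small B and D.
A = 0.01 * rng.normal(size=(n, n + 1)) + np.hstack([np.eye(n), np.zeros((n, 1))])
B = 0.01 * rng.normal(size=(n, n + 3))
D = 0.01 * rng.normal(size=n)

X = np.array([21.0, 22.0])      # current zone temperatures (Celsius)
out_temp = 15.0
x_aug = np.append(X, out_temp)  # shape (n+1,): zone temps + outdoor temp

# shape (n+3,): HVAC actions per zone, then ground temp, GHI,
# and occupancy power -- this ordering is an assumption.
u = np.array([0.5, -0.2, 12.0, 300.0, 120.0])

X_new = A @ x_aug + B @ u + D   # next zone temperatures
assert X_new.shape == (n,)
```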
- parameters#
Dictionary containing the parameters for the environment.
- Type:
dict
- observation_space#
structure of observations returned by environment
- timestep#
current timestep in episode, from 0 to 288
- action_space#
structure of actions expected by environment
- OCCU_COEF = [6.461927, 0.946892, 2.55737e-05, 0.0627909, 5.89172e-05, 0.19855, 0.000940018, 1.49532e-06][source]#
- step(action: numpy.ndarray) tuple[numpy.ndarray, float, bool, bool, dict[str, Any]] [source]#
Steps the environment.
Updates the state of the environment based on the given action and calculates the reward, done, and info values for the current timestep.
- Parameters:
action (numpy.ndarray) – Action to be taken in the environment.
- Returns:
state – array of shape (n+4,), updated state of the environment. Contains:
‘X_new’: shape (n,), new temperatures of the rooms
‘out_temp’: scalar, outdoor temperature (°C) at the current timestep
‘ground_temp’: scalar, ground temperature (°C) at the current timestep
‘ghi’: scalar, global horizontal irradiance at the current timestep
‘Occupower’: scalar, occupancy power at the current timestep
reward – Reward for the current timestep.
terminated – Whether the episode is terminated.
truncated – Whether the episode has reached a time limit.
info – Dictionary containing auxiliary information.
‘statelist’: List of states in the environment.
‘actionlist’: List of actions taken in the environment.
‘epochs’: Counter for the number of epochs (int).
- Return type:
tuple[numpy.ndarray, float, bool, bool, dict[str, Any]]
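The reward_beta and reward_pnorm parameters suggest that the reward trades off HVAC power use against a p-norm temperature tracking error. The formula below is a hedged sketch of that idea, not necessarily the exact expression the environment computes:

```python
import numpy as np

def reward_sketch(action, zone_temps, target, beta=0.5, p=2):
    """Hypothetical reward: negative HVAC power minus a beta-weighted
    p-norm of the temperature tracking error."""
    power_cost = np.abs(action).sum()  # actions are normalized to [-1, 1] per zone
    temp_error = np.linalg.norm(zone_temps - target, ord=p)
    return -power_cost - beta * temp_error

r = reward_sketch(
    action=np.array([0.2, -0.1]),
    zone_temps=np.array([21.0, 23.0]),
    target=np.array([22.0, 22.0]),
)
assert r < 0  # any power use or temperature error is penalized
```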
- reset(*, seed: int | None = None, options: dict | None = None) tuple[numpy.ndarray, dict[str, Any]] [source]#
Resets the environment.
Prepares the environment for the next episode by setting the initial temperatures, average temperature, occupancy, and occupower. The initial state is constructed by concatenating these variables.
- Parameters:
seed (int | None) – seed for resetting the environment. The seed determines which episode to start at; increment the seed sequentially to experience episodes in chronological order, or set seed to None for a random episode. A seeded episode is entirely reproducible regardless of the random generator used.
options (dict | None) –
optional resetting options
’T_initial’: np.ndarray, shape [n], initial temperature of each zone
- Returns:
obs – array of shape (n+4,), initial observation for the episode.
info – dictionary of auxiliary information.
- Return type:
tuple[numpy.ndarray, dict[str, Any]]
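The seeding convention described above (an integer seed deterministically selects an episode, None selects a random one) can be illustrated with a standalone helper. This is purely illustrative, not the actual implementation; NUM_EPISODES is a made-up value:

```python
import numpy as np

NUM_EPISODES = 30  # hypothetical number of episodes available in the weather data

def select_episode(seed=None):
    """Illustrates the seeding convention: an integer seed always maps to
    the same episode index; None picks a random episode."""
    if seed is None:
        return int(np.random.default_rng().integers(NUM_EPISODES))
    return seed % NUM_EPISODES

assert select_episode(3) == 3                    # deterministic
assert select_episode(3) == select_episode(3)    # reproducible
assert 0 <= select_episode() < NUM_EPISODES      # random but in range
```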
- train(states: numpy.ndarray, actions: numpy.ndarray) None [source]#
Trains the linear regression model using the given states and actions.
The model is trained to predict the next state based on the current state and action. The trained coefficients are stored in the environment for later use.
- Parameters:
states (numpy.ndarray) – array of states, one per timestep.
actions (numpy.ndarray) – array of actions, one corresponding to each state.
- Return type:
None
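The idea behind train() can be sketched as an ordinary least-squares fit that predicts the next state from the current state and action. The feature construction and regressor below are assumptions for illustration; the environment's actual model may differ:

```python
import numpy as np

rng = np.random.default_rng(0)
T, n = 200, 3                     # hypothetical: 200 transitions, 3 zones
states = rng.normal(size=(T, n))  # states[t] is the state at step t
actions = rng.normal(size=(T, n)) # actions[t] is the action taken at step t

# Features: current state and action; targets: next state.
X = np.hstack([states[:-1], actions[:-1]])  # shape (T-1, 2n)
Y = states[1:]                              # shape (T-1, n)

# Least-squares fit of a linear next-state model.
coef, *_ = np.linalg.lstsq(X, Y, rcond=None)  # shape (2n, n)
pred = X @ coef                                # one-step predictions
assert coef.shape == (2 * n, n)
```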