Environment
- class building_energy_storage_simulation.Environment(building_simulation: BuildingSimulation, max_timesteps: int = 2000, num_forecasting_steps: int = 4, randomize_start_time_step: bool = False, randomize_forecasts_in_observation: bool = False)
Wraps the simulation as a gymnasium environment so that it can be used easily for reinforcement learning.
- Parameters:
max_timesteps (int) – The number of steps after which the environment terminates.
num_forecasting_steps (int) – The number of timesteps into the future included in the forecast. Note that the forecast is perfect.
building_simulation (BuildingSimulation) – Instance of BuildingSimulation to be wrapped as gymnasium environment.
randomize_start_time_step (bool) – Randomizes the start_index in the BuildingSimulation. This should help prevent the agent from overfitting to the data profile during training; otherwise it always sees the same time series from the same starting point.
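As a rough illustration of what randomizing the start index involves, the sketch below draws a start position such that a full episode plus its forecast window still fits inside the data profile. The helper `pick_start_index` and its `data_length` argument are hypothetical names introduced here; the library may select its start index differently.

```python
import random


def pick_start_index(data_length, max_timesteps, num_forecasting_steps, rng):
    """Draw a start index so an episode and its forecast window fit in the data.

    Hypothetical helper for illustration only, not part of the library.
    """
    last_valid_start = data_length - max_timesteps - num_forecasting_steps
    if last_valid_start < 0:
        raise ValueError("data profile is too short for the requested episode")
    # randint is inclusive on both ends, so last_valid_start itself is allowed.
    return rng.randint(0, last_valid_start)


# Assumed example sizes: one year of hourly data, default episode settings.
rng = random.Random(42)
starts = [pick_start_index(8760, 2000, 4, rng) for _ in range(5)]
print(starts)
```

Seeding the random generator, as done here, keeps the sequence of start points reproducible across runs.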
- reset(seed=None, options=None) → Tuple[ObsType, dict]
Resets the state of the simulation by calling the reset() method of the simulation class.
- Returns:
- Tuple of:
An observation
An empty dict: {}
- Return type:
(observation, dict)
- step(action: ActType) → Tuple[ObsType, float, bool, bool, dict]
Performs one step, which consists of:
Calling simulate_one_step() once
Calculating the reward
Retrieving the observation
- Parameters:
action (float) – Fraction of max_battery_charge_per_timestep to be used to charge or discharge the battery. The action lies in [-1, 1]. A value of 1 charges the battery with the maximum amount of energy possible per time step; negative values retrieve energy from the battery.
- Returns:
- Tuple of:
observation
reward
terminated. If True, the episode is over.
truncated. Always False, as truncation is not implemented yet.
info. Additional information about the electricity_consumption and the electricity_price of the current time step.
- Return type:
(observation, float, bool, bool, dict)
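The interaction pattern described above (reset, then repeated step calls until the episode terminates) can be sketched with a small stand-in class. `StubEnvironment` is hypothetical and only mimics the documented signatures and return tuples. It ignores the battery's state of charge, and its reward (negative electricity cost) is an assumption made for illustration rather than the library's reward definition.

```python
from typing import Dict, List, Tuple


class StubEnvironment:
    """Hypothetical stand-in with the documented reset()/step() shape."""

    def __init__(self, load: List[float], price: List[float],
                 max_timesteps: int = 4,
                 max_battery_charge_per_timestep: float = 20.0) -> None:
        self.load = load
        self.price = price
        self.max_timesteps = max_timesteps
        self.max_charge = max_battery_charge_per_timestep
        self.t = 0

    def _observation(self) -> List[float]:
        i = min(self.t, len(self.load) - 1)
        return [self.load[i], self.price[i]]

    def reset(self, seed=None, options=None) -> Tuple[List[float], dict]:
        self.t = 0
        return self._observation(), {}

    def step(self, action: float) -> Tuple[List[float], float, bool, bool, Dict[str, float]]:
        # Keep the action inside [-1, 1], then scale it to an energy amount.
        action = max(-1.0, min(1.0, action))
        charge = action * self.max_charge
        # Simplification: no battery capacity limit, no feed-in to the grid.
        consumption = max(0.0, self.load[self.t] + charge)
        price = self.price[self.t]
        reward = -consumption * price  # assumed reward: negative cost
        self.t += 1
        terminated = self.t >= self.max_timesteps
        truncated = False  # mirrors the documented behaviour
        info = {"electricity_consumption": consumption,
                "electricity_price": price}
        return self._observation(), reward, terminated, truncated, info


env = StubEnvironment(load=[10.0, 30.0, 25.0, 5.0], price=[1.0, 2.0, 2.0, 1.0])
observation, info = env.reset()
total_reward = 0.0
terminated = truncated = False
while not (terminated or truncated):
    # Toy policy: charge fully when the price is low, discharge otherwise.
    action = 1.0 if observation[1] <= 1.0 else -1.0
    observation, reward, terminated, truncated, info = env.step(action)
    total_reward += reward
print(total_reward, info)
```

With the real Environment, the same loop applies once a BuildingSimulation instance has been passed to the constructor; only the observation layout and the policy would differ.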