Environment
- class building_energy_storage_simulation.Environment(max_timesteps: int = 2000, num_forecasting_steps: int = 4, battery_capacity: float = 100, solar_power_installed: float = 240, max_battery_charge_per_timestep: float = 20)
Wraps the simulation as a Gymnasium environment so that it can be used directly for reinforcement learning.
- Parameters
max_timesteps (int) – The number of steps after which the environment terminates.
num_forecasting_steps (int) – The number of timesteps into the future included in the forecast. Note that the forecast is perfect.
solar_power_installed (float) – The installed peak photovoltaic power in kWp.
battery_capacity (float) – The capacity of the battery in kWh.
max_battery_charge_per_timestep (float) – Maximum amount of energy (kWh) which can be obtained from the battery or which can be used to charge the battery in one time step.
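As an illustration of the constructor parameters above, the following is a minimal sketch that collects the documented defaults in one place and builds an environment from them. The names DEFAULT_ENV_KWARGS and make_environment are hypothetical helpers introduced only for this example; they are not part of the package.

```python
# Documented defaults of building_energy_storage_simulation.Environment.
# DEFAULT_ENV_KWARGS and make_environment are hypothetical helper names.
DEFAULT_ENV_KWARGS = {
    "max_timesteps": 2000,                  # episode length in steps
    "num_forecasting_steps": 4,             # perfect forecast horizon in steps
    "battery_capacity": 100,                # kWh
    "solar_power_installed": 240,           # kWp
    "max_battery_charge_per_timestep": 20,  # kWh per time step
}


def make_environment(**overrides):
    """Build an Environment from the documented defaults plus overrides."""
    unknown = set(overrides) - set(DEFAULT_ENV_KWARGS)
    if unknown:
        raise TypeError(f"unknown Environment parameters: {sorted(unknown)}")
    # Imported lazily so the defaults can be inspected without the package.
    from building_energy_storage_simulation import Environment

    return Environment(**{**DEFAULT_ENV_KWARGS, **overrides})
```

With these defaults, charging an empty battery to full capacity takes at least 100 / 20 = 5 time steps.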
- reset() → Tuple[ObsType, dict]
Resets the state of the simulation by calling the reset() method of the simulation class.
- Returns
- Tuple of:
An observation
An empty dictionary: {}
- Return type
(observation, dict)
- step(action: ActType) → Tuple[ObsType, float, bool, bool, dict]
Performs one step, which consists of:
Calling simulate_one_step()
Calculating the reward
Retrieving the observation
- Parameters
action (float) – Fraction of max_battery_charge_per_timestep to use for charging or discharging the battery. The action lies in [-1, 1]. A value of 1 represents the maximum possible amount of energy which can be used to charge the battery per time step.
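The relationship between the action and the requested energy can be sketched as follows. This is an illustration of the description above, not the package's implementation: action_to_energy is a hypothetical helper, the clipping of out-of-range actions is an assumption, and treating negative values as discharging is inferred from the statement that 1 means maximum charging.

```python
def action_to_energy(action: float,
                     max_battery_charge_per_timestep: float = 20.0) -> float:
    """Convert an action in [-1, 1] to an energy amount in kWh.

    Positive results mean charging, negative results mean discharging
    (sign convention inferred from the documentation, not confirmed).
    """
    # Assumption: actions outside [-1, 1] are clipped to the valid range.
    clipped = max(-1.0, min(1.0, action))
    return clipped * max_battery_charge_per_timestep
```

For example, with the default limit of 20 kWh per time step, an action of -0.5 would request 10 kWh from the battery under these assumptions.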
- Returns
- Tuple of:
observation
reward
terminated. If true, the episode is over.
truncated. Always False; truncation is not implemented yet.
Additional information about the electricity_consumption and the excess_energy of the current time step.
- Return type
(observation, float, bool, bool, dict)
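The reset() and step() signatures above suggest the usual Gymnasium interaction loop. The following is a minimal sketch of one episode; run_episode and policy are hypothetical names, and the function works with any object that follows the return conventions documented here.

```python
from typing import Any, Callable, Tuple


def run_episode(env: Any, policy: Callable[[Any], Any]) -> Tuple[float, int, dict]:
    """Run one episode and return (total reward, step count, last info dict)."""
    observation, info = env.reset()  # info is documented as an empty dict
    total_reward = 0.0
    steps = 0
    terminated = truncated = False
    # truncated is documented as always False, but checking it keeps the
    # loop valid if truncation is implemented later.
    while not (terminated or truncated):
        action = policy(observation)
        observation, reward, terminated, truncated, info = env.step(action)
        total_reward += reward
        steps += 1
    return total_reward, steps, info
```

With the real environment, a policy that ignores the observation and returns env.action_space.sample() gives a random baseline. Whether step() expects a bare float or a one-element array may depend on the installed version, so sampling from the action space is the safer choice.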