Environment

class building_energy_storage_simulation.Environment(max_timesteps: int = 2000, num_forecasting_steps: int = 4, battery_capacity: float = 100, solar_power_installed: float = 240, max_battery_charge_per_timestep: float = 20)

Wraps the simulation as a Gymnasium environment so that it can be used for reinforcement learning.

Parameters
  • max_timesteps (int) – The number of steps after which the environment terminates.

  • num_forecasting_steps (int) – The number of timesteps into the future included in the forecast. Note that the forecast is perfect.

  • solar_power_installed (float) – The installed peak photovoltaic power in kWp.

  • battery_capacity (float) – The capacity of the battery in kWh.

  • max_battery_charge_per_timestep (float) – Maximum amount of energy (kWh) which can be obtained from the battery or which can be used to charge the battery in one time step.
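As a rough illustration of how battery_capacity and max_battery_charge_per_timestep relate, the sketch below computes the minimum number of time steps needed to fill the battery. It assumes lossless charging at the maximum rate, which the documentation does not state; the helper min_steps_to_full and its state_of_charge argument are hypothetical names, not part of the package.

```python
import math


def min_steps_to_full(
    battery_capacity: float = 100.0,
    max_battery_charge_per_timestep: float = 20.0,
    state_of_charge: float = 0.0,
) -> int:
    """Return the fewest time steps needed to reach full charge.

    Assumption (not from the package docs): charging is lossless and always
    runs at the maximum per-step rate. All energies are in kWh.
    """
    if max_battery_charge_per_timestep <= 0:
        raise ValueError("max_battery_charge_per_timestep must be positive")
    remaining = max(0.0, battery_capacity - state_of_charge)
    return math.ceil(remaining / max_battery_charge_per_timestep)


# With the default parameters from the class signature (100 kWh, 20 kWh per step):
print(min_steps_to_full())  # 5
```

Under these assumptions, the default configuration needs at least five consecutive full-rate charging steps to go from empty to full.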

reset() → Tuple[ObsType, dict]

Resets the state of the simulation by calling the reset() method of the simulation class.

Returns

Tuple of:
  1. An observation

  2. An empty dict: {}

Return type

(observation, dict)

step(action: ActType) → Tuple[ObsType, float, bool, bool, dict]

Perform one step, which is done by:

  1. Performing one simulate_one_step()

  2. Calculating the reward

  3. Retrieving the observation

Parameters

action (float) –

Fraction of max_battery_charge_per_timestep to be stored in or retrieved from the battery. The action lies in [-1, 1]: 1 charges the battery with the maximum possible amount of energy per time step, and -1 discharges it by the same amount.
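The mapping from an action to an energy amount can be sketched as follows. This is an illustration rather than the package's code: the helper name action_to_energy is hypothetical, and clipping out-of-range actions is an assumption, since the documentation only states the valid range.

```python
def action_to_energy(
    action: float, max_battery_charge_per_timestep: float = 20.0
) -> float:
    """Convert an action in [-1, 1] to a signed energy amount in kWh.

    Positive results mean charging, negative results mean discharging.
    """
    # Assumption: actions outside [-1, 1] are clipped to the valid range.
    clipped = max(-1.0, min(1.0, action))
    return clipped * max_battery_charge_per_timestep


print(action_to_energy(1.0))   # 20.0 kWh charged (the per-step maximum)
print(action_to_energy(-0.5))  # -10.0, i.e. 10 kWh discharged
```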

Returns

Tuple of:
  1. observation

  2. reward

  3. terminated. If true, the episode is over.

  4. truncated. Always false; truncation is not implemented yet.

  5. A dict with additional information about the electricity_consumption and the excess_energy of the current time step.

Return type

(observation, float, bool, bool, dict)
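Because reset() and step() return the tuples described above, one episode can be rolled out with a loop like the one sketched here. To keep the example self-contained it runs against StubEnv, a hypothetical stand-in that only mimics the documented return shapes; its observation, reward, and empty info dict are made up and are not the real simulation.

```python
from typing import Any, Callable, Tuple


def run_episode(env: Any, policy: Callable[[Any], float]) -> Tuple[float, int]:
    """Roll out one episode and return (total_reward, number_of_steps)."""
    observation, info = env.reset()
    total_reward, steps = 0.0, 0
    terminated = truncated = False
    while not (terminated or truncated):
        action = policy(observation)  # expected to lie in [-1, 1]
        observation, reward, terminated, truncated, info = env.step(action)
        total_reward += reward
        steps += 1
    return total_reward, steps


class StubEnv:
    """Hypothetical stand-in with the documented reset()/step() shapes.

    The observation is just the step counter and the reward is a made-up
    constant of -1 per step; the info dict is left empty for brevity.
    """

    def __init__(self, max_timesteps: int = 3) -> None:
        self.max_timesteps = max_timesteps
        self.t = 0

    def reset(self) -> Tuple[int, dict]:
        self.t = 0
        return self.t, {}

    def step(self, action: float) -> Tuple[int, float, bool, bool, dict]:
        self.t += 1
        terminated = self.t >= self.max_timesteps
        return self.t, -1.0, terminated, False, {}


# A policy that never charges or discharges the battery.
total_reward, steps = run_episode(StubEnv(max_timesteps=3), policy=lambda obs: 0.0)
print(total_reward, steps)  # -3.0 3
```

With the package installed, the same run_episode function should accept an instance of building_energy_storage_simulation.Environment in place of the stub, since it relies only on the reset() and step() signatures documented above.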