resource_management medium

custom-env

Gymnasium-compatible continuous resource-management environment with three interdependent resources (A, B, C).

Observation space: Box(low=0, high=100, shape=(15,), dtype=float32), ordered as [storage_A, storage_B, storage_C, demand_A, demand_B, demand_C, demand_derivative_A, demand_derivative_B, demand_derivative_C, coupling_AB, coupling_BC, coupling_CA, time_since_shock, rolling_efficiency_score, normalized_step].

Action space: Box(low=0, high=10, shape=(6,), dtype=float32), ordered as [produce_A, produce_B, produce_C, convert_A_to_B, convert_B_to_C, convert_C_to_A].

Dynamics: storage_{t+1} = storage_t + production + conversion_in - conversion_out - demand_t - waste. Demand follows the non-stationary process d_t = d_base + α*sin(ω*t), where ω = ω_base*(1+e) scales with the efficiency e ∈ [0,1], defined as the rolling ratio satisfied_demand/total_demand over the last 100 steps. Shock events occur with probability p = 0.01 + 0.2*max(0, e-0.7). Coupling coefficients C_ij (resource i requires resource j) evolve as C_ij = C_base * e, so interdependencies strengthen as efficiency rises. A higher e therefore increases both production complexity and demand non-stationarity.

Reward: r_t = -sum(|demand_t - satisfied_t|) - 0.5*sum(waste) - 0.01*||action||^2.

Episode length: 1000 steps. reset() initializes storage at 50 units, sets the coupling matrix from performance history (which persists across episodes), and samples new demand phase parameters.
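The efficiency-dependent terms in the description above can be sketched in plain Python. This is an illustration under stated assumptions, not the environment's actual code: the function and class names are hypothetical, the rolling efficiency is taken to be the ratio of summed satisfied demand to summed total demand over the window, and the value returned before any demand has been seen (0.0) is an arbitrary choice.

```python
import math
from collections import deque


def demand(t, d_base, alpha, omega_base, e):
    """d_t = d_base + alpha * sin(omega * t), with omega = omega_base * (1 + e)."""
    omega = omega_base * (1.0 + e)
    return d_base + alpha * math.sin(omega * t)


def shock_probability(e):
    """p = 0.01 + 0.2 * max(0, e - 0.7); flat at 0.01 until e exceeds 0.7."""
    return 0.01 + 0.2 * max(0.0, e - 0.7)


def coupling(c_base, e):
    """C_ij = C_base * e; zero coupling at e = 0, full coupling at e = 1."""
    return c_base * e


class EfficiencyTracker:
    """Rolling satisfied/total demand over the last `window` steps (hypothetical helper)."""

    def __init__(self, window=100):
        self._satisfied = deque(maxlen=window)
        self._total = deque(maxlen=window)

    def update(self, satisfied, total):
        # Record one step; the deques drop entries older than `window` steps.
        self._satisfied.append(satisfied)
        self._total.append(total)

    def value(self):
        total = sum(self._total)
        if total <= 0:
            return 0.0  # assumption: no demand observed yet means e = 0
        # Clamp to [0, 1] to match the stated range of e.
        return min(1.0, max(0.0, sum(self._satisfied) / total))
```

The formulas make the feedback loop visible: an agent that pushes e above 0.7 raises its own shock probability, and any increase in e speeds up the demand oscillation and strengthens the coupling.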

Observation Space

Box(low=0, high=100, shape=(15,), dtype=float32)
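The field order listed in the description can be captured as a name-to-index map, which helps avoid off-by-one mistakes when reading the 15-element observation. The sketch below uses plain Python lists instead of a Gymnasium space; `OBS_FIELDS`, `OBS_INDEX`, and `build_observation` are hypothetical names, and clipping to [0, 100] is an assumption, since the description does not say how out-of-range values are handled.

```python
# Field order as given in the environment description.
OBS_FIELDS = (
    "storage_A", "storage_B", "storage_C",
    "demand_A", "demand_B", "demand_C",
    "demand_derivative_A", "demand_derivative_B", "demand_derivative_C",
    "coupling_AB", "coupling_BC", "coupling_CA",
    "time_since_shock", "rolling_efficiency_score", "normalized_step",
)
OBS_INDEX = {name: i for i, name in enumerate(OBS_FIELDS)}

OBS_LOW, OBS_HIGH = 0.0, 100.0


def build_observation(values):
    """Build the 15-element observation from a name -> value dict.

    Missing fields default to 0.0. Values are clipped to the Box bounds,
    which is an assumption of this sketch.
    """
    return [
        min(OBS_HIGH, max(OBS_LOW, float(values.get(name, 0.0))))
        for name in OBS_FIELDS
    ]


obs = build_observation({"storage_A": 50.0, "demand_B": 12.5})
demand_b = obs[OBS_INDEX["demand_B"]]
```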

Action Space

Box(low=0, high=10, shape=(6,), dtype=float32)
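To show how the six action components feed the storage update from the description, here is a minimal sketch. It assumes a 1:1 conversion ratio (the description gives no conversion rate), clips actions to [0, 10], and takes demand and waste as inputs because their computation is not specified here. `split_action` and `next_storage` are hypothetical names.

```python
def split_action(action):
    """Split [produce_A, produce_B, produce_C, A->B, B->C, C->A] into
    per-resource production, conversion inflow, and conversion outflow."""
    if len(action) != 6:
        raise ValueError("action must have 6 components")
    a = [min(10.0, max(0.0, float(x))) for x in action]  # clip to Box bounds
    production = a[:3]
    conv_ab, conv_bc, conv_ca = a[3:]
    conversion_out = [conv_ab, conv_bc, conv_ca]  # amount leaving A, B, C
    conversion_in = [conv_ca, conv_ab, conv_bc]   # amount arriving at A, B, C
    return production, conversion_in, conversion_out


def next_storage(storage, action, demand, waste):
    """storage_{t+1} = storage_t + production + conversion_in
    - conversion_out - demand_t - waste, applied per resource."""
    production, conv_in, conv_out = split_action(action)
    return [
        s + p + ci - co - d - w
        for s, p, ci, co, d, w in zip(
            storage, production, conv_in, conv_out, demand, waste
        )
    ]
```

For example, with storage [50, 50, 50], action [1, 2, 3, 4, 5, 6], demand 10 per resource, and no waste, resource A ends at 50 + 1 + 6 - 4 - 10 = 43.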

Reward

r_t = -sum(|demand_t - satisfied_t|) - 0.5*sum(waste) - 0.01*||action||^2
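The reward formula from the description, r_t = -sum(|demand_t - satisfied_t|) - 0.5*sum(waste) - 0.01*||action||^2, can be written out as a short function. This is a sketch that reads ||action||^2 as the squared Euclidean norm; the function name `reward` and its argument layout are hypothetical.

```python
def reward(demand, satisfied, waste, action):
    """Per-step reward: unmet-demand penalty, waste penalty, action cost."""
    unmet_penalty = sum(abs(d - s) for d, s in zip(demand, satisfied))
    waste_penalty = 0.5 * sum(waste)
    action_penalty = 0.01 * sum(a * a for a in action)  # squared L2 norm
    return -unmet_penalty - waste_penalty - action_penalty


# Two units of demand for B go unmet, two units are wasted in total,
# and the action has squared norm 1 + 4 = 5.
r = reward(
    demand=[10.0, 10.0, 10.0],
    satisfied=[10.0, 8.0, 10.0],
    waste=[1.0, 0.0, 1.0],
    action=[1.0, 2.0, 0.0, 0.0, 0.0, 0.0],
)
```

Every term is non-positive, so the best achievable reward in any step is 0, reached only when demand is met exactly with no waste and a zero action.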
