non-stationary-cartpole
A CartPole environment whose physical parameters (pole length and mass) drift over time, designed to test adaptation to non-stationary dynamics. The parameters evolve via configurable schedules: sinusoidal, random walk, or abrupt steps. The observation space optionally includes time features (sine and cosine of the phase) to help the agent anticipate parameter changes.
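The three schedules and the time features can be sketched as follows. This is a minimal illustration, not the environment's actual implementation: the sinusoidal formula follows the class docstring, while the function names, the mean-reverting update rule, and the `sigma`, `reversion`, and `levels` parameters are assumptions introduced here.

```python
import numpy as np

BASE_POLE_LENGTH = 0.5  # meters, same base value as in the environment code


def sinusoidal_length(t, drift_rate=0.01, amp=0.2, base=BASE_POLE_LENGTH):
    """Docstring formula: length = base + amp * sin(2*pi*drift_rate*t)."""
    return base + amp * np.sin(2.0 * np.pi * drift_rate * t)


def random_walk_length(prev, rng, base=BASE_POLE_LENGTH, sigma=0.01, reversion=0.05):
    """One step of a random walk pulled back toward `base`.

    Assumed form (hypothetical): a discrete Ornstein-Uhlenbeck-style update.
    """
    return prev + reversion * (base - prev) + sigma * rng.standard_normal()


def step_length(t, drift_rate=0.01, levels=(0.5, 0.7, 0.3)):
    """Jump to the next entry of `levels` every 1/drift_rate steps (levels are hypothetical)."""
    period = max(1, int(round(1.0 / drift_rate)))
    return levels[(t // period) % len(levels)]


def time_features(t, drift_rate=0.01):
    """Sine and cosine of the schedule phase, as appended to the observation."""
    phase = 2.0 * np.pi * drift_rate * t
    return np.array([np.sin(phase), np.cos(phase)], dtype=np.float32)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    length = BASE_POLE_LENGTH
    for t in range(0, 300, 50):
        length = random_walk_length(length, rng)
        print(t, sinusoidal_length(t), step_length(t), length, time_features(t))
```

With `drift_rate=0.01` the sinusoidal schedule completes one cycle every 100 steps, so the sin/cos pair tells the agent where in that cycle it currently is; for the random-walk schedule the phase carries no such information.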
Domain: classic_control
Difficulty: medium
Observation: Box(shape=[6])
Action: Discrete(shape=[1])
Reward: dense
Max Steps: 1000
Version: v1
Tests (1/8)
syntax, import, reset, step, obs_space, action_space, reward_sanity, determinism
Use via API
import kualia
env = kualia.make("non-stationary-cartpole")
obs, info = env.reset()
Environment Code
import gymnasium as gym
import numpy as np
from typing import Optional, Dict, Any


class NonStationaryCartPoleEnv(gym.Env):
    """
    A non-stationary CartPole environment where pole length and mass drift over time.

    Observation Space:
        Box(6,) if include_time_features=True:
            [cart_pos_norm, cart_vel_norm, pole_angle_norm, pole_vel_norm, sin_phase, cos_phase]
        Box(4,) if include_time_features=False:
            [cart_pos_norm, cart_vel_norm, pole_angle_norm, pole_vel_norm]
        All values normalized to [-1, 1].

    Action Space:
        Discrete(2): 0 = push left (force = -FORCE_MAG), 1 = push right (force = +FORCE_MAG)

    Reward:
        +1.0 for each step survived (dense), clipped to [-10, 10].

    Dynamics:
        Physical parameters evolve according to drift_mode and drift_schedule:
        - gradual + sinusoidal: length = base + amp * sin(2*pi*drift_rate*t)
        - gradual + random_walk: Brownian motion with mean reversion
        - sudden + step: discrete jumps every (1/drift_rate) steps
    """

    # Physics constants
    GRAVITY: float = 9.8
    CART_MASS: float = 1.0
    FORCE_MAG: float = 10.0
    TAU: float = 0.02  # Time step duration (seconds)

    # Termination thresholds
    THETA_THRESHOLD_RADIANS: float = 12 * 2 * np.pi / 360  # ~0.2094 rad
    X_THRESHOLD: float = 2.4

    # Base physical parameters
    BASE_POLE_LENGTH: float = 0.5  # meters
    BASE_POLE_