custom-env
Gymnasium-compatible continuous 2D navigation environment (10 m x 10 m arena).

Observation space: Box(low=-10, high=10, shape=(14,), dtype=float32) containing [agent_x, agent_y, agent_vx, agent_vy, goal_relative_x, goal_relative_y, 8 LIDAR ray-cast distances]. Action space: Box(low=-1, high=1, shape=(2,), dtype=float32) representing [force_x, force_y]. Physics uses Euler integration with velocity damping 0.9.

Adaptation mechanism: the environment maintains a competency buffer tracking success/failure over the last 20 episodes to compute a score c ∈ [0, 1]. Obstacle count N = 5 + floor(15*c). The placement policy evolves with c: (1) random uniform when c < 0.33; (2) corridor-blocking when 0.33 ≤ c < 0.66, using k-means clustering on recent agent trajectories to identify high-traffic zones and placing obstacles to minimize passage width; (3) adversarial placement when c ≥ 0.66, using trajectory distribution analysis to maximize expected path length to the goal. Obstacles are static circles with radii 0.3-0.6 m.

Reward: r_t = -0.1*||pos - goal||_2 - 0.01*||action||^2 + 10*success_flag - 5*collision_flag.

Episode terminates on goal reach (distance < 0.5 m) or collision, and truncates at 500 steps. reset() randomizes start/goal positions (min separation 8 m) and regenerates obstacles via the current placement policy based on c.
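A minimal Python sketch of the adaptation and reward logic described above. The formulas (c from the last 20 episode outcomes, N = 5 + floor(15*c), the tier thresholds, the reward terms, and damping 0.9) come from the spec; the helper names, the timestep `dt`, and the order of damping relative to integration are assumptions.

```python
import math
from collections import deque

class AdaptationState:
    """Competency tracking and obstacle schedule (hypothetical helper)."""

    def __init__(self, window: int = 20):
        self.outcomes = deque(maxlen=window)  # 1 = success, 0 = failure

    def record(self, success: bool) -> None:
        self.outcomes.append(1 if success else 0)

    @property
    def competency(self) -> float:
        # c in [0, 1]: success rate over the buffer (0 when empty).
        return sum(self.outcomes) / len(self.outcomes) if self.outcomes else 0.0

    @property
    def obstacle_count(self) -> int:
        # N = 5 + floor(15 * c): 5 obstacles for a novice, 20 at c = 1.
        return 5 + math.floor(15 * self.competency)

    @property
    def placement_tier(self) -> str:
        c = self.competency
        if c < 0.33:
            return "random_uniform"
        if c < 0.66:
            return "corridor_blocking"
        return "adversarial"


def euler_step(pos, vel, force, dt=0.1, damping=0.9):
    """One physics step: Euler integration with velocity damping 0.9.

    dt = 0.1 is an assumption; the listing does not state the timestep.
    """
    vx = (vel[0] + force[0] * dt) * damping
    vy = (vel[1] + force[1] * dt) * damping
    return (pos[0] + vx * dt, pos[1] + vy * dt), (vx, vy)


def reward(pos, goal, action, success: bool, collision: bool) -> float:
    # r_t = -0.1*||pos - goal||_2 - 0.01*||a||^2 + 10*success - 5*collision
    dist = math.hypot(pos[0] - goal[0], pos[1] - goal[1])
    act_sq = action[0] ** 2 + action[1] ** 2
    return -0.1 * dist - 0.01 * act_sq + 10.0 * success - 5.0 * collision
```

For example, an agent that succeeded in all 20 buffered episodes has c = 1.0 and faces 20 adversarially placed obstacles, while a fresh buffer yields c = 0 and 5 uniformly random obstacles.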
Domain
navigation
Difficulty
medium
Observation
Box(low=-10, high=10, shape=(14,))
Action
Box(low=-1, high=1, shape=(2,))
Reward
dense: -0.1*distance - 0.01*||action||^2 + 10*success - 5*collision
Max Steps
500
Version
v1
Tests (0/8)
Use via API
import kualia
env = kualia.make("custom-env-1774053027")
obs, info = env.reset()

Environment Code
===ENV_SPEC===
{
"name": "adaptive-2d-nav-v0",
"domain": "navigation",
"description": "Continuous 2D navigation in a 10m x 10m arena with adaptive obstacle placement based on agent competency. The environment maintains a competency buffer (last 20 episodes) to compute score c ∈ [0,1]. Obstacle count and placement strategy evolve with c: random uniform (c < 0.33), corridor-blocking via k-means on trajectories (0.33 ≤ c < 0.66), or adversarial wall placement (c ≥ 0.66). The agent observes its position, velocity, relative goal position, and 8 LIDAR rangefinder readings.",
"observation_space": {
"type": "Box",
"shape": [14],
"low": -10.0,
"high": 10.0,
"components": [
"agent_x (0 to 10)",
"agent_y (0 to 10)",
"agent_vx",
"agent_vy",
"goal_relative_x",
"goal_relative_y",
"lidar_ray_0",
"lidar_ray_1",
"lidar_ray_2",
"lidar_ray_3",
"lidar_ray_4",
"lidar_ray_5",
"lidar_ray_6",
"lidar_ray_7"
]
},
"action_space": {
"type": "Box",
"shape": [2],
"low": -1.0,
"high": 1.0,
"components": [
"force_x",
"force_y"
]
},
"reward_function": {
"type": "dense",
"components": {
"distance_penalty": "-0.1 * L2_distance_to_goal",
"action_penalty": "-0.01 * ||action||^2",
"success_bonus": "10.0 if distance < 0.5m",
"collision_penalty": "-5.0 if collision with obstacle"
},
"range": [-10.0, 10.0]
},
"episode": {
"max_steps": 500,
"termination_conditions": [
"Goal reached (distance < 0.5m)",
"Collision with obstacle"
],
"truncation_conditions": [
"Step limit reached (500 steps)"
]
},
"parameters": {
"arena_size": 10.0,
"max_steps": 500