Projects · September 26, 2025

Reinforcement Learning Agents

Implemented reinforcement learning agents that learn to act optimally in unknown environments. The agents were tested in Gridworld, a simulated robot crawler, and Pacman, learning behavior from interaction rather than explicit supervision.
Key techniques:
  • Markov Decision Processes (MDPs): Modeled environments as states, actions, transitions, and rewards.
  • Value iteration: Computed optimal value functions and policies using dynamic programming.
  • Q-learning: Implemented model-free learning that estimates action values directly from experience.
  • Exploration vs. exploitation: Tuned ε-greedy exploration strategies for effective learning.
  • Approximate Q-learning: Used feature-based representations to generalize across the large state space of Pacman.
Results:
  • The Gridworld agent converges to an optimal or near-optimal policy, depending on the reward structure.
  • The crawler agent learns a stable walking gait through trial and error.
  • The Pacman agent learns to collect food and avoid ghosts without an explicit model of the environment.
What I learned:
  • How reward design and discount factors shape agent behavior.
  • The trade-offs between model-based methods (value iteration) and model-free methods (Q-learning).
  • How function approximation enables RL in larger, more complex state spaces.
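To make the value-iteration bullet concrete, here is a minimal sketch of the Bellman optimality backup on a toy four-state chain MDP. The chain, rewards, and parameters are illustrative assumptions, not the project's actual Gridworld:

```python
def value_iteration(num_states=4, terminal=3, gamma=0.9, iterations=100):
    """Dynamic-programming sketch: repeatedly back up V(s) = max_a [R + gamma * V(s')].

    Toy chain MDP: states 0..3, actions move left/right deterministically,
    and stepping into the terminal state 3 yields reward +1.
    """
    V = [0.0] * num_states
    for _ in range(iterations):
        new_V = V[:]
        for s in range(num_states):
            if s == terminal:
                new_V[s] = 0.0  # no value accrues after the terminal state
                continue
            q_values = []
            for move in (-1, +1):  # "left", "right"
                s2 = max(0, min(num_states - 1, s + move))
                reward = 1.0 if s2 == terminal else 0.0
                q_values.append(reward + gamma * V[s2])
            new_V[s] = max(q_values)  # Bellman optimality backup
        V = new_V
    return V
```

With γ = 0.9 the values decay geometrically with distance from the goal: V = [0.81, 0.9, 1.0, 0.0], and the greedy policy always moves right.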
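The Q-learning and ε-greedy bullets can be combined into one sketch on the same toy chain MDP: the agent never uses the transition model directly, only sampled (s, a, r, s') experience. Parameter values here are assumptions for illustration:

```python
import random

random.seed(0)  # deterministic run for this sketch

ACTIONS = (-1, +1)  # left, right

def step(s, a, num_states=4, terminal=3):
    """Sample the environment: deterministic move, +1 for reaching state 3."""
    s2 = max(0, min(num_states - 1, s + a))
    reward = 1.0 if s2 == terminal else 0.0
    return s2, reward, s2 == terminal

def q_learning(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1):
    Q = {(s, a): 0.0 for s in range(4) for a in ACTIONS}
    for _ in range(episodes):
        s, done = 0, False
        while not done:
            # epsilon-greedy: explore with probability epsilon, else exploit
            if random.random() < epsilon:
                a = random.choice(ACTIONS)
            else:
                a = max(ACTIONS, key=lambda act: Q[(s, act)])
            s2, r, done = step(s, a)
            # sample-based update toward the one-step TD target
            target = r if done else r + gamma * max(Q[(s2, b)] for b in ACTIONS)
            Q[(s, a)] += alpha * (target - Q[(s, a)])
            s = s2
    return Q
```

After training, the greedy policy recovered from Q matches the value-iteration policy (always move right), even though the agent never saw the transition or reward model.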
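The approximate Q-learning bullet replaces the table with a linear function Q(s, a) = w · f(s, a) and updates the weights with the same TD error. The feature names and toy featurizer below are hypothetical, chosen only to show the update rule:

```python
def featurize(s, a):
    """Hypothetical features for the chain MDP: a bias term plus how far
    the resulting state is along the chain (normalized to [0, 1])."""
    s2 = max(0, min(3, s + a))
    return {"bias": 1.0, "progress": s2 / 3.0}

def q_value(w, s, a):
    # linear value estimate: dot product of weights and features
    return sum(w.get(f, 0.0) * v for f, v in featurize(s, a).items())

def update(w, s, a, r, s2, done, alpha=0.1, gamma=0.9):
    """One approximate Q-learning step: w_f += alpha * TD_error * f(s, a)."""
    best_next = 0.0 if done else max(q_value(w, s2, b) for b in (-1, +1))
    diff = (r + gamma * best_next) - q_value(w, s, a)  # TD error
    for f, v in featurize(s, a).items():
        w[f] = w.get(f, 0.0) + alpha * diff * v
    return w
```

Because experience updates shared weights rather than one table cell, every state with similar features benefits from each sample, which is what makes learning tractable in Pacman's large state space.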

Related projects

Game AI: Search and Adversarial Agents

Built intelligent Pacman agents using classic search algorithms and adversarial planning (minimax, expectimax, alpha–beta pruning).