Code Flow Analysis
This section provides a comprehensive analysis of the code execution flow in Board Game Arena, helping developers understand how the framework orchestrates game simulations and agent interactions.
Overview
The Game Reasoning Arena framework follows a structured execution flow that handles multiple game types, agent configurations, and backend systems. The code flow analysis covers three main perspectives:
High-level Architecture Flow - Overall system organization
Method Call Hierarchy - Detailed function call sequences
Data Flow Analysis - Information transformation through the system
Method Call Flow for Matrix Games
The following diagram illustrates the complete method call flow, with special focus on matrix games (simultaneous-move games like Rock-Paper-Scissors):
Note
For interactive Mermaid diagrams, see the source files method_call_flow.md or flow_visualization.html in the project root.
High-Level Flow Overview:
┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐
│ runner.py │───▶│ simulate_game │───▶│ compute_actions │
│ ::main() │ │ │ │ │
└─────────────────┘ └─────────────────┘ └─────────────────┘
│ │ │
▼ ▼ ▼
┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐
│ parse_config │ │ initialize_ │ │ env.step │
│ run_simulation │ │ policies │ │ apply_actions │
└─────────────────┘ └─────────────────┘ └─────────────────┘
Detailed Method Call Flow:
runner.py::main()
├── parse_config()
├── run_simulation()
│ ├── initialize_ray() [if enabled]
│ └── simulate_game()
│ ├── initialize_llm_registry()
│ ├── initialize_policies()
│ │ ├── AGENT_REGISTRY[agent_type]
│ │ ├── LLMAgent.__init__(model_name, game_name)
│ │ └── RandomAgent.__init__(seed)
│ ├── registry.make_env()
│ │ ├── registry.get_game_loader()
│ │ └── MatrixGameEnv.__init__()
│ │ └── OpenSpielEnv.__init__()
│ │
│ └── Episode Loop:
│ ├── env.reset(seed)
│ │ ├── game.new_initial_state()
│ │ ├── _solve_chance_nodes()
│ │ └── _state_to_observation()
│ │ └── _generate_prompt(agent_id)
│ │
│ └── while not terminated:
│ ├── compute_actions()
│ │ ├── is_simultaneous_node() → True [matrix games]
│ │ ├── LLMAgent.compute_action()
│ │ │ └── generate_response()
│ │ └── RandomAgent.compute_action()
│ │ └── random.choice(legal_actions)
│ │
│ ├── env.step(action_dict)
│ │ ├── state.apply_actions([action_0, action_1])
│ │ ├── _compute_reward()
│ │ └── state.is_terminal()
│ │
│ └── Logging:
│ ├── log_llm_action()
│ ├── SQLiteLogger.log_move()
│ └── TensorBoard metrics
Key Decision Points:
Simultaneous vs Turn-based:
is_simultaneous_node()determines whether all players act (matrix games) or only current player (turn-based games)LLM vs Local:
generate_response()uses Ray batch processing for remote LLMs or direct calls for local modelsGame Termination:
state.is_terminal()controls episode completion
Initialization Chain
The system startup follows this hierarchical initialization pattern:
runner.py::main()
├── parse_config()
├── run_simulation()
├── initialize_ray() [if enabled]
└── simulate_game()
├── initialize_llm_registry()
├── initialize_policies()
│ ├── registry.get_game_loader().num_players()
│ ├── AGENT_REGISTRY[agent_type]
│ ├── LLMAgent.__init__(model_name, game_name)
│ └── RandomAgent.__init__(seed)
└── registry.make_env()
├── registry.get_game_loader()
├── loader_class.load() [OpenSpiel game]
└── MatrixGameEnv.__init__()
└── OpenSpielEnv.__init__()
Episode Execution Chain
Each game episode follows this execution pattern:
simulate_game()
└── for episode in range(num_episodes):
├── env.reset(seed)
│ ├── game.new_initial_state()
│ ├── _solve_chance_nodes()
│ └── _state_to_observation()
│ └── _generate_prompt(agent_id)
│ ├── state.legal_actions(agent_id)
│ └── state.action_to_string(agent_id, action)
│
└── while not (terminated or truncated):
├── compute_actions()
│ ├── env.state.is_simultaneous_node() [True for matrix games]
│ └── for each player:
│ └── player_to_agent[player](observation)
│ ├── LLMAgent.compute_action()
│ │ ├── generate_response() or Ray call
│ │ └── extract action from LLM response
│ └── RandomAgent.compute_action()
│ └── random.choice(legal_actions)
│
├── env.step(action_dict)
│ ├── state.apply_actions([action_0, action_1, ...])
│ ├── _compute_reward()
│ ├── state.is_terminal()
│ └── _state_to_observation() [if not terminal]
│
└── Logging:
├── log_llm_action()
├── SQLiteLogger.log_move()
└── TensorBoard metrics
Matrix Game Specific Flow
For matrix games like Rock-Paper-Scissors between an LLM and Random agent:
- 1. State to Observation
For Player 0 (LLM): legal_actions: [0, 1, 2] (Rock, Paper, Scissors), prompt: “You are Player 0 in matrix_rps…”
For Player 1 (Random): legal_actions: [0, 1, 2], prompt: “You are Player 1 in matrix_rps…”
- 2. Action Computation
is_simultaneous_node() → True
Player 0: LLMAgent.compute_action() → Send prompt to LLM → Parse response for action number → Return {“action”: 1, “reasoning”: “…”}
Player 1: RandomAgent.compute_action() → Return random.choice([0, 1, 2])
- 3. Environment Step
env.step({0: 1, 1: 2}) # Player 0: Paper, Player 1: Scissors
state.apply_actions([1, 2])
OpenSpiel calculates: Player 1 wins (Scissors cuts Paper)
rewards: {0: -1, 1: +1}
Check if terminal (single round) or continue
Key Class Interactions
The main class relationships and interactions follow this hierarchy:
┌─────────────────────────────────────────────────────────────────┐
│ RUNNER & SIMULATION │
│ │
│ ┌─────────────────┐ ┌─────────────────────────────────┐│
│ │ runner.py │────────▶│ simulate.py ││
│ │ │ │ ││
│ │ + main() │ │ + simulate_game() ││
│ │ + run_simulation│ │ + compute_actions() ││
│ │ + initialize_ray│ │ + log_llm_action() ││
│ └─────────────────┘ └─────────────────────────────────┘│
└─────────────────────────────────────────────────────────────────┘
│ │
▼ ▼
┌─────────────────────────────────────────────────────────────────┐
│ REGISTRY & POLICY MANAGEMENT │
│ │
│ ┌─────────────────┐ ┌─────────────────────────────────┐│
│ │ GameRegistry │ │ PolicyManager ││
│ │ │ │ ││
│ │ + register() │ │ + initialize_policies() ││
│ │ + get_game_ │ │ + policy_mapping_fn() ││
│ │ loader() │ │ ││
│ │ + make_env() │ │ ││
│ └─────────────────┘ └─────────────────────────────────┘│
└─────────────────────────────────────────────────────────────────┘
│ │
▼ ▼
┌─────────────────────────────────────────────────────────────────┐
│ ENVIRONMENTS & AGENTS │
│ │
│ ┌─────────────────┐ ┌─────────────────────────────────┐│
│ │ MatrixGameEnv │ │ AGENTS ││
│ │ │ │ ││
│ │ + __init__() │ │ ┌─────────────┐ ┌─────────────┐││
│ │ + _state_to_ │ │ │ LLMAgent │ │ RandomAgent │││
│ │ observation() │◄────────┤ │ │ │ │││
│ │ + _generate_ │ │ │+ compute_ │ │+ compute_ │││
│ │ prompt() │ │ │ action() │ │ action() │││
│ │ │ │ │+ __call__() │ │+ __call__() │││
│ │ ▲ │ │ └─────────────┘ └─────────────┘││
│ │ │ │ │ ││
│ │ ┌─────────────────┐ │ ││
│ │ │ OpenSpielEnv │ │ ││
│ │ │ │ │ ││
│ │ │ + reset() │ │ ││
│ │ │ + step() │ │ ││
│ │ │ + _solve_chance_│ │ ││
│ │ │ nodes() │ │ ││
│ │ │ + _compute_ │ │ ││
│ │ │ reward() │ │ ││
│ │ └─────────────────┘ │ ││
│ └─────────────────────────────────────────────────────────────┘│
└─────────────────────────────────────────────────────────────────┘
Class Responsibilities:
runner.py: Main entry point, configuration parsing, simulation orchestration
simulate.py: Core game loop, action coordination, logging management
GameRegistry: Game discovery, environment factory, loader management
PolicyManager: Agent instantiation, policy mapping, multi-agent coordination
MatrixGameEnv: Game-specific observation generation, prompt formatting
OpenSpielEnv: Base environment interface, state management, reward computation
LLMAgent: Language model interaction, response parsing, reasoning extraction
RandomAgent: Baseline random policy, action sampling
Data Flow Overview
The complete data transformation flow follows this pattern:
Configuration → Game Setup → Episode Loop → Action Computation → Environment Step → Logging
Configuration Example:
{
"env_configs": [{"game_name": "matrix_rps"}],
"agents": {
"player_0": {"type": "llm", "model": "gpt-4"},
"player_1": {"type": "random"}
}
}
Game Setup:
policies_dict = {
"policy_0": LLMAgent("gpt-4", "matrix_rps"),
"policy_1": RandomAgent(seed=42)
}
env = MatrixGameEnv(openspiel_game, "matrix_rps", ...)
Episode Loop:
observations = {
0: {"legal_actions": [0,1,2], "prompt": "..."},
1: {"legal_actions": [0,1,2], "prompt": "..."}
}
Action Computation:
action_dict = {
0: 1, # LLM chooses Paper
1: 2 # Random chooses Scissors
}
Environment Step:
rewards = {0: -1, 1: +1} # Player 1 wins
terminated = True # Single round game
Turn-based vs. Simultaneous Games
The framework handles both game types differently:
- Matrix Games (Simultaneous)
All players act at the same time
is_simultaneous_node()returnsTrueActions collected from all agents before environment step
Examples: Rock-Paper-Scissors, Prisoner’s Dilemma
- Turn-based Games
Only current player acts each step
state.current_player()identifies active playerSingle action passed to environment step
Examples: Tic-Tac-Toe, Connect Four
This comprehensive flow analysis shows how matrix games are handled as simultaneous-move games where all players act at once, contrasting with turn-based games where only the current player acts each step.
Note
For more detailed technical implementation, see the source files:
method_call_flow.md- Complete method call documentationcode_flow_analysis.md- Detailed architectural analysisflow_visualization.html- Interactive flow visualization