Prompting System
Game Reasoning Arena uses a scalable prompting system.
Understanding the Observation Flow
Before diving into prompt creation, it’s important to understand how information flows from the game environment to the LLM and back. The complete observation flow works as follows:
Environment Creates Observation
Every game environment generates observations for each player through the _state_to_observation() method. This creates a structured dictionary containing everything an agent needs to make a decision:
state_string: A human-readable representation of the current game state
legal_actions: A list of valid moves the player can make
prompt: A carefully crafted natural language prompt generated by _generate_prompt()
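As a rough illustration, the observation for one player can be pictured as a plain dictionary. The helper `make_observation` below is hypothetical and exists only for this sketch; only the three keys come from the description above, and real observations may carry additional fields.

```python
from typing import Any, Dict, List


def make_observation(
    state_string: str, legal_actions: List[int], prompt: str
) -> Dict[str, Any]:
    """Bundle the three documented fields into one observation dictionary.

    Hypothetical helper: the real environment builds this inside
    _state_to_observation(), whose exact code is not shown here.
    """
    return {
        "state_string": state_string,
        "legal_actions": list(legal_actions),
        "prompt": prompt,
    }


observation = make_observation(
    state_string="X | O |",
    legal_actions=[2, 3, 4, 5, 6, 7, 8],
    prompt="You are playing the game: tic_tac_toe ...",
)
```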
Agent Receives Prompt
When it’s time for an LLM agent to act, it extracts the prompt field from its observation. This prompt contains all the context and instructions the LLM needs to understand the game situation and make an informed decision.
Backend Processing
The prompt then goes through backend-specific formatting. For example, chat-based models might have the prompt wrapped in conversation templates with system and user roles, while base models receive the raw text prompt.
Response Parsing
Finally, the agent extracts the chosen action and reasoning from the LLM’s structured JSON response, completing the decision-making cycle.
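The four steps above can be sketched as a single function. This is a simplified illustration, not the project's agent code: `choose_action` and the `generate` callable are hypothetical stand-ins for the agent and the backend, and the legal-action check at the end is an added assumption.

```python
import re
from typing import Any, Callable, Dict, Optional


def choose_action(
    observation: Dict[str, Any], generate: Callable[[str], str]
) -> Optional[int]:
    """Run one decision cycle: prompt -> backend -> parsed action."""
    # Step 2: the agent reads the prompt field from its observation.
    prompt = observation["prompt"]
    # Step 3: the backend (here just a callable) produces the raw response text.
    response_text = generate(prompt)
    # Step 4: extract the action from the structured response.
    match = re.search(r"'action'\s*:\s*(\d+)", response_text)
    if match is None:
        return None
    action = int(match.group(1))
    # Assumption: reject actions that are not legal in the current state.
    return action if action in observation["legal_actions"] else None
```

A fake backend such as `lambda prompt: "{'reasoning': 'take the center', 'action': 4}"` is enough to exercise this function without loading a model.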
The Hierarchical Prompt Creation System
Game Reasoning Arena’s prompt creation follows a layered architecture that allows for both consistency and customization. Let’s explore each layer:
Environment-Level Prompt Generation
At the foundation of the system, each game environment implements its own _generate_prompt() method. This allows every game to create prompts that are perfectly tailored to its specific mechanics and information requirements.
Base OpenSpiel Environment
The base environment class OpenSpielEnv (in open_spiel_env.py) provides a generic template that works well for most traditional board games. It creates prompts containing:
The name of the game being played
The player’s symbol or identifier (like ‘X’ or ‘O’ in tic-tac-toe)
The current move number to provide temporal context
A visual representation of the board state
A clear list of available legal actions
Here’s what a base prompt might look like for a tic-tac-toe game:
You are playing the game: tic_tac_toe
and you are playing with the: X
the current move number is: 3
Board state:
X | O |
-----------
| |
-----------
| |
Available actions: [2, 3, 4, 5, 6, 7, 8]
What action do you choose? Reply only with the available action number.
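A minimal sketch of how such a prompt could be assembled from the five ingredients listed above. The function name `build_base_prompt` and its parameters are assumptions made for illustration; the actual OpenSpielEnv method reads these values from the game state rather than taking them as arguments.

```python
from typing import List


def build_base_prompt(
    game_name: str,
    symbol: str,
    move_number: int,
    board: str,
    legal_actions: List[int],
) -> str:
    """Assemble a generic board-game prompt in the layout shown above."""
    return (
        f"You are playing the game: {game_name}\n"
        f"and you are playing with the: {symbol}\n"
        f"the current move number is: {move_number}\n"
        f"Board state:\n{board}\n"
        f"Available actions: {legal_actions}\n"
        "What action do you choose? Reply only with the available action number."
    )


example_prompt = build_base_prompt(
    "tic_tac_toe", "X", 3, "X | O |", [2, 3, 4, 5, 6, 7, 8]
)
```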
Specialized Game Environments
Games with unique mechanics often need more sophisticated prompts. The Kuhn Poker environment class KuhnPokerEnv (in kuhn_poker_env.py) demonstrates this by including game-specific information.
For Kuhn Poker, the prompt includes:
The player’s private card (the most critical piece of hidden information)
Complete betting history to understand what has happened so far
Current pot size and each player’s contribution
Context-aware action labels that change based on the game situation
For example, if no one has bet yet, the actions are labeled as “Check” and “Bet”. But if an opponent has already bet, they become “Fold” and “Call” - much more intuitive for the LLM to understand.
Here’s what a real Kuhn Poker prompt looks like in practice:
You are Player 0 in the game Kuhn Poker.
Your private card: Jack
This is move number: 2
Betting history: ['Check']
Total pot size: 2 chips
Your contribution: 1 chips
Available actions:
0: Check (stay in the game without betting)
1: Bet (add a chip to the pot)
What action do you choose? Reply only with '0' or '1'.
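The context-aware labels can be sketched as a small lookup that depends on whether a bet is outstanding. The function `label_actions` is hypothetical, and the descriptions for Fold and Call are assumed wording; only the Check and Bet descriptions are taken from the prompt above.

```python
from typing import Dict, List


def label_actions(betting_history: List[str]) -> Dict[int, str]:
    """Map Kuhn Poker action ids to labels that fit the current situation."""
    # In Kuhn Poker a bet ends the round after one reply, so any "Bet" in the
    # history means the current player is facing that bet.
    facing_bet = "Bet" in betting_history
    if facing_bet:
        return {
            0: "Fold (give up the pot)",      # assumed wording
            1: "Call (match the opponent's bet)",  # assumed wording
        }
    return {
        0: "Check (stay in the game without betting)",
        1: "Bet (add a chip to the pot)",
    }


def render_actions(betting_history: List[str]) -> str:
    """Format the labels as the 'Available actions' block of the prompt."""
    labels = label_actions(betting_history)
    return "\n".join(f"{action}: {text}" for action, text in labels.items())
```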
The Prompt Formatting Layer
Once each environment creates its game-specific prompt, the system applies standardized formatting through the format_prompt() function (in llm_utils.py). This layer appends two elements that make LLM responses more consistent and easier to parse.
Reasoning Request
The system encourages the LLM to think before acting by adding this instruction:
First, think through the game strategy and explain your reasoning.
Only after that, decide on the best action to take.
This “thinking out loud” approach often leads to better decisions and provides valuable insights for analysis and debugging.
JSON Output Format
To make responses easier to parse reliably, the system requests a structured output format:
Reply only in the following JSON format:
{
'reasoning': <str>,
'action': <int>
}
This structure allows the system to extract both the LLM’s strategic reasoning and its final action choice, enabling rich analysis of decision-making patterns.
Here’s how our Kuhn Poker prompt looks after formatting:
You are Player 0 in the game Kuhn Poker.
Your private card: Jack
This is move number: 2
Betting history: ['Check']
Total pot size: 2 chips
Your contribution: 1 chips
Available actions:
0: Check (stay in the game without betting)
1: Bet (add a chip to the pot)
What action do you choose? Reply only with '0' or '1'.
First, think through the game strategy and explain your reasoning.
Only after that, decide on the best action to take.
Reply only in the following JSON format:
{
'reasoning': <str>,
'action': <int>
}
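The formatting step amounts to appending two fixed text blocks to the game prompt. The sketch below reproduces that behavior under the name `format_prompt_sketch`; it is an approximation based on the output shown above, not the actual format_prompt() source, which may take extra parameters.

```python
REASONING_REQUEST = (
    "First, think through the game strategy and explain your reasoning.\n"
    "Only after that, decide on the best action to take."
)

JSON_FORMAT = (
    "Reply only in the following JSON format:\n"
    "{\n"
    "'reasoning': <str>,\n"
    "'action': <int>\n"
    "}"
)


def format_prompt_sketch(game_prompt: str) -> str:
    """Append the reasoning request and the output format to a game prompt."""
    return "\n".join([game_prompt.rstrip(), REASONING_REQUEST, JSON_FORMAT])


formatted = format_prompt_sketch(
    "What action do you choose? Reply only with '0' or '1'."
)
```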
Backend-Specific Chat Templates
Modern LLMs often work best with conversational formats rather than raw text prompts. The vLLM backend class VLLMBackend (in vllm_backend.py) handles this automatically by applying chat templates when appropriate.
Chat Template Detection
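The system first checks if the model has a built-in chat template by examining the tokenizer. Open-weight chat models such as Llama-2-Chat ship their preferred conversation format with the tokenizer, whereas hosted models like ChatGPT or Claude are reached through an API and handle conversation formatting on the provider side.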
Automatic Chat Formatting
For chat-based models, the system wraps the prompt in a conversation structure:
[
{
"role": "user",
"content": "You are Player 0 in the game Kuhn Poker..."
}
]
Fallback Templates
If a model appears to be instruction-tuned but lacks a built-in template, the system applies a generic chat format that works well across different model families.
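A minimal sketch of template detection, assuming a tokenizer that follows the Hugging Face interface (a `chat_template` attribute and an `apply_chat_template` method). Whether VLLMBackend performs the check in exactly this way is an assumption, the instruction-tuned fallback is not covered, and the two stub tokenizers are stand-ins so the example runs without downloading a model.

```python
from typing import Any, Dict, List


def to_messages(prompt: str) -> List[Dict[str, str]]:
    """Wrap a plain prompt in the single-turn conversation structure."""
    return [{"role": "user", "content": prompt}]


def render_for_model(tokenizer: Any, prompt: str) -> str:
    """Apply the tokenizer's chat template if present, else keep raw text."""
    if getattr(tokenizer, "chat_template", None):
        return tokenizer.apply_chat_template(
            to_messages(prompt), tokenize=False, add_generation_prompt=True
        )
    return prompt  # base models receive the raw text prompt


class StubChatTokenizer:
    """Stand-in for a chat model tokenizer (demonstration only)."""

    chat_template = "stub"

    def apply_chat_template(self, messages, tokenize=False, add_generation_prompt=True):
        text = "".join(f"<|{m['role']}|>\n{m['content']}\n" for m in messages)
        return text + ("<|assistant|>\n" if add_generation_prompt else "")


class StubBaseTokenizer:
    """Stand-in for a base model tokenizer without a chat template."""

    chat_template = None
```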
Role-based Structure
This conversion from plain text to conversation format helps models understand that they’re being asked to respond as a game-playing assistant, which often improves response quality and consistency.
Agent Integration and Response Processing
The final piece of the puzzle is how LLM agents (LLMAgent class in llm_agent.py) coordinate the entire process and handle the responses.
Receiving the Formatted Prompt
The LLM agent receives the fully formatted prompt from the environment and passes it directly to the backend system. This separation of concerns means agents don’t need to worry about game-specific formatting - they just handle the communication with the LLM.
Backend Communication
The agent sends the prompt to the appropriate backend (LiteLLM for API-based models, vLLM for local models), which handles all the technical details of model communication, chat template application, and generation parameters.
Response Parsing
When the LLM responds, the agent uses regular expressions to extract the action and reasoning from the JSON response:
# Extract action: looks for 'action': 1
action_match = re.search(r"'action'\s*:\s*(\d+)", response_text)
# Extract reasoning: looks for 'reasoning': 'text here'
reasoning_match = re.search(r"'reasoning'\s*:\s*'(.*?)'", response_text, re.DOTALL)
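These patterns match the single-quoted format requested in the prompt and tolerate extra whitespace around the colon, so the action can be recovered even when the response is not strictly valid JSON. They do expect single quotes: a response that uses standard double-quoted keys, or a reasoning string that contains an apostrophe, would not be captured correctly by these exact patterns.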
Example Complete Response
Here’s what a complete LLM response might look like for our Kuhn Poker example:
{
'reasoning': 'I have a Jack, which is the lowest card in Kuhn Poker. My opponent checked, which could mean they have a weak card or are trying to trap me. I cannot win a showdown with a Jack, so the only way to take the pot is to bet as a bluff and hope a Queen folds.',
'action': 1
}
The system extracts action: 1 (meaning “Bet”) and stores the reasoning for later analysis.
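For comparison, here is a slightly more tolerant variant of the extraction step. It is a sketch, not the project's parser: `parse_response` is a hypothetical name, and the patterns are extended to accept either quote style and to allow apostrophes inside the reasoning text.

```python
import re
from typing import Optional, Tuple


def parse_response(response_text: str) -> Tuple[Optional[int], Optional[str]]:
    """Extract (action, reasoning) from a JSON-like LLM response.

    Returns None for any field that cannot be found.
    """
    # Accept 'action' or "action", followed by an integer.
    action_match = re.search(r"""['"]action['"]\s*:\s*(\d+)""", response_text)
    # Capture the opening quote, then read lazily until the same quote is
    # followed by a comma or closing brace, so inner apostrophes are kept.
    reasoning_match = re.search(
        r"""['"]reasoning['"]\s*:\s*(['"])(.*?)\1\s*[,}]""",
        response_text,
        re.DOTALL,
    )
    action = int(action_match.group(1)) if action_match else None
    reasoning = reasoning_match.group(2) if reasoning_match else None
    return action, reasoning
```

Callers would still need to check that the returned action is in the list of legal actions and decide what to do when either field is None.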
Customizing Prompts for New Games
When adding a new game to Game Reasoning Arena, you’ll likely want to customize the prompting to fit your game’s unique characteristics. Here’s how to do it effectively:
Override the _generate_prompt Method
Create your own implementation in your game environment:
def _generate_prompt(self, agent_id: int) -> str:
    # Get game-specific information
    special_info = self.get_special_game_info(agent_id)
    # Create your custom prompt
    prompt = f"""You are playing {self.game_name}.
Special game information: {special_info}
Current situation: {self.describe_current_situation()}
Your options: {self.describe_actions_with_context(agent_id)}
Choose your action wisely."""
    # Always use format_prompt to add reasoning and JSON formatting
    return format_prompt(prompt)