Experiments
===========

This section covers how to design and run experiments with Game Reasoning Arena, including distributed execution capabilities.

Ray Integration for Parallel Execution
---------------------------------------

Game Reasoning Arena supports **Ray** for distributed and parallel execution, allowing you to:

- **Run multiple games in parallel** across different cores/machines
- **Parallelize episodes within games** for faster data collection
- **Distribute LLM inference** for batch processing
- **Scale experiments** on SLURM clusters or multi-GPU setups

Configuration Options
~~~~~~~~~~~~~~~~~~~~~

**Option 1: Combined Configuration File (YAML)**

.. code-block:: yaml

   # Combined config with all settings in one file
   env_config:
     game_name: tic_tac_toe
   num_episodes: 5
   agents:
     player_0:
       type: llm
       model: litellm_groq/llama3-8b-8192
     player_1:
       type: random
   use_ray: true
   parallel_episodes: true
   ray_config:
     num_cpus: 8
     include_dashboard: false

**Option 2: Separate Ray Configuration (Recommended)**

.. code-block:: bash

   # Use any existing config + separate Ray settings
   python3 scripts/runner.py \
     --base-config src/game_reasoning_arena/configs/multi_game_base.yaml \
     --ray-config src/game_reasoning_arena/configs/ray_config.yaml \
     --override num_episodes=10 \
     --override agents.player_0.model=litellm_groq/llama3-70b-8192

**Option 3: Command-Line Override**

.. code-block:: bash

   # Enable Ray with any existing configuration
   python3 scripts/runner.py --config src/game_reasoning_arena/configs/example_config.yaml \
     --override use_ray=true parallel_episodes=true

**Option 4: Maximum Parallelization (Multi-Model Ray)**

.. code-block:: bash

   # Run multiple models in parallel with full Ray integration
   # Parallelizes: Models + Games + Episodes simultaneously
   python3 scripts/run_ray_multi_model.py \
     --config src/game_reasoning_arena/configs/ray_multi_model.yaml \
     --override use_ray=true

Ray Configuration Parameters
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The ``ray_config.yaml`` file contains Ray-specific settings:

.. list-table:: Ray Configuration Options
   :widths: 25 50 25
   :header-rows: 1

   * - Parameter
     - Description
     - Default
   * - ``use_ray``
     - Enable/disable Ray
     - ``false``
   * - ``parallel_episodes``
     - Parallelize episodes within games
     - ``false``
   * - ``ray_config.num_cpus``
     - Number of CPUs for Ray
     - Auto-detect
   * - ``ray_config.num_gpus``
     - Number of GPUs for Ray
     - Auto-detect
   * - ``ray_config.include_dashboard``
     - Enable Ray dashboard
     - ``false``
   * - ``ray_config.dashboard_port``
     - Dashboard port
     - ``8265``
   * - ``ray_config.object_store_memory``
     - Object store memory limit
     - Auto

Performance Comparison
~~~~~~~~~~~~~~~~~~~~~~

.. list-table:: Execution Modes Performance
   :widths: 30 25 25 20
   :header-rows: 1

   * - Execution Mode
     - Parallelization Level
     - Best For
     - Expected Speedup
   * - ``scripts/runner.py`` (standard)
     - Episodes only
     - Single model, single game
     - ~N_episodes
   * - ``scripts/runner.py`` (Ray enabled)
     - Games + Episodes
     - Single model, multiple games
     - ~N_games × N_episodes
   * - ``scripts/run_ray_multi_model.py``
     - Models + Games + Episodes
     - Multiple models, multiple games
     - ~N_models × N_games × N_episodes

**Recommendation**: Use ``run_ray_multi_model.py`` for multi-model experiments to achieve maximum speedup.

Configuration Merging Order
~~~~~~~~~~~~~~~~~~~~~~~~~~~

The system merges configurations in this order (later overrides earlier):

1. Default configuration
2. Base config (``--base-config``)
3. Main config (``--config``)
4. Ray config (``--ray-config``)
5. CLI overrides (``--override``)

SLURM Integration
~~~~~~~~~~~~~~~~~

For cluster environments, Ray automatically detects SLURM allocation:

.. code-block:: bash

   # SLURM job with Ray
   sbatch --nodes=2 --cpus-per-task=48 --gres=gpu:4 slurm_jobs/run_simulation.sh

The SLURM script (``slurm_jobs/run_simulation.sh``) handles:

- Multi-node Ray cluster setup
- Head node and worker initialization
- GPU allocation across nodes
- Environment variable configuration

Debug Commands
~~~~~~~~~~~~~~

.. code-block:: bash

   # Check Ray status
   ray status

   # Monitor Ray dashboard (if enabled)
   # Navigate to: http://localhost:8265

Experiment Design
-----------------

Configuration Management
~~~~~~~~~~~~~~~~~~~~~~~~~

Use YAML configuration files to define experiments:

.. code-block:: yaml

   experiment:
     name: "llm_comparison_study"
     description: "Compare different LLM models on strategic games"

   games:
     - name: "connect_four"
       num_episodes: 100
     - name: "kuhn_poker"
       num_episodes: 200

   agents:
     - type: "llm"
       model: "gpt-4"
       name: "GPT4_Player"
     - type: "llm"
       model: "claude-3-sonnet"
       name: "Claude_Player"

Running Experiments
-------------------

Single Experiments
~~~~~~~~~~~~~~~~~~

.. code-block:: bash

   python scripts/simulate.py --config experiments/my_experiment.yaml

Batch Experiments
~~~~~~~~~~~~~~~~~

For large-scale studies:

.. code-block:: bash

   # Using SLURM for cluster computing
   sbatch slurm_jobs/run_simulation.sh

   # Or parallel execution
   python scripts/runner.py --parallel --jobs 8


Distributed Computing
~~~~~~~~~~~~~~~~~~~~~

Use Ray for distributed execution:

.. code-block:: yaml

   execution:
     backend: "ray"
     num_workers: 8
     resources_per_worker:
       cpu: 2
       memory: "4GB"

Statistical Analysis
--------------------

Significance Testing
~~~~~~~~~~~~~~~~~~~~

.. code-block:: python

   from game_reasoning_arena.analysis import statistical_tests

   # Compare win rates between agents
   p_value = statistical_tests.binomial_test(
       wins_a=75, games_a=100,
       wins_b=65, games_b=100
   )