dspy.MIPROv2

MIPROv2 (Multiprompt Instruction PRoposal Optimizer Version 2) is a prompt optimizer capable of optimizing both instructions and few-shot examples jointly. It does this by bootstrapping few-shot example candidates, proposing instructions grounded in different dynamics of the task, and finding an optimized combination of these options using Bayesian Optimization. It can optimize few-shot examples and instructions jointly, or instructions alone for 0-shot optimization.
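
For example, a 0-shot (instructions-only) run just sets both demo budgets to zero. A minimal sketch, assuming a placeholder metric named my_metric:

import dspy

# 0-shot optimization: with both demo budgets at 0, MIPROv2 searches over
# instructions only. `my_metric` is a placeholder for your own metric.
optimizer = dspy.MIPROv2(
    metric=my_metric,
    auto="light",
    max_bootstrapped_demos=0,
    max_labeled_demos=0,
)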

dspy.MIPROv2(metric: Callable, prompt_model: Optional[Any] = None, task_model: Optional[Any] = None, teacher_settings: Optional[dict] = None, max_bootstrapped_demos: int = 4, max_labeled_demos: int = 4, auto: Optional[Literal['light', 'medium', 'heavy']] = 'light', num_candidates: Optional[int] = None, num_threads: Optional[int] = None, max_errors: int = 10, seed: int = 9, init_temperature: float = 0.5, verbose: bool = False, track_stats: bool = True, log_dir: Optional[str] = None, metric_threshold: Optional[float] = None)

Bases: Teleprompter

Source code in dspy/teleprompt/mipro_optimizer_v2.py
def __init__(
    self,
    metric: Callable,
    prompt_model: Optional[Any] = None,
    task_model: Optional[Any] = None,
    teacher_settings: Optional[dict] = None,
    max_bootstrapped_demos: int = 4,
    max_labeled_demos: int = 4,
    auto: Optional[Literal["light", "medium", "heavy"]] = "light",
    num_candidates: Optional[int] = None,
    num_threads: Optional[int] = None,
    max_errors: int = 10,
    seed: int = 9,
    init_temperature: float = 0.5,
    verbose: bool = False,
    track_stats: bool = True,
    log_dir: Optional[str] = None,
    metric_threshold: Optional[float] = None,
):
    # Validate 'auto' parameter
    allowed_modes = {None, "light", "medium", "heavy"}
    if auto not in allowed_modes:
        raise ValueError(f"Invalid value for auto: {auto}. Must be one of {allowed_modes}.")
    self.auto = auto
    self.num_fewshot_candidates = num_candidates
    self.num_instruct_candidates = num_candidates
    self.num_candidates = num_candidates
    self.metric = metric
    self.init_temperature = init_temperature
    self.task_model = task_model if task_model else dspy.settings.lm
    self.prompt_model = prompt_model if prompt_model else dspy.settings.lm
    self.max_bootstrapped_demos = max_bootstrapped_demos
    self.max_labeled_demos = max_labeled_demos
    self.verbose = verbose
    self.track_stats = track_stats
    self.log_dir = log_dir
    self.teacher_settings = teacher_settings or {}
    self.prompt_model_total_calls = 0
    self.total_calls = 0
    self.num_threads = num_threads
    self.max_errors = max_errors
    self.metric_threshold = metric_threshold
    self.seed = seed
    self.rng = None
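
A common configuration is to use a stronger model for writing instructions than for running the task itself. A minimal sketch, assuming a placeholder metric my_metric (any dspy.LM works for either role):

import dspy

# The task model runs the program being optimized; the (often stronger)
# prompt model writes instruction candidates. `my_metric` is a placeholder
# with the usual DSPy signature: (example, prediction, trace=None) -> bool | float.
optimizer = dspy.MIPROv2(
    metric=my_metric,
    task_model=dspy.LM('openai/gpt-4o-mini'),
    prompt_model=dspy.LM('openai/gpt-4o'),
    max_bootstrapped_demos=4,
    max_labeled_demos=4,
    auto="medium",
)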

Functions

compile(student: Any, *, trainset: List, teacher: Any = None, valset: Optional[List] = None, num_trials: Optional[int] = None, max_bootstrapped_demos: Optional[int] = None, max_labeled_demos: Optional[int] = None, seed: Optional[int] = None, minibatch: bool = True, minibatch_size: int = 35, minibatch_full_eval_steps: int = 5, program_aware_proposer: bool = True, data_aware_proposer: bool = True, view_data_batch_size: int = 10, tip_aware_proposer: bool = True, fewshot_aware_proposer: bool = True, requires_permission_to_run: bool = True, provide_traceback: Optional[bool] = None) -> Any

Source code in dspy/teleprompt/mipro_optimizer_v2.py
def compile(
    self,
    student: Any,
    *,
    trainset: List,
    teacher: Any = None,
    valset: Optional[List] = None,
    num_trials: Optional[int] = None,
    max_bootstrapped_demos: Optional[int] = None,
    max_labeled_demos: Optional[int] = None,
    seed: Optional[int] = None,
    minibatch: bool = True,
    minibatch_size: int = 35,
    minibatch_full_eval_steps: int = 5,
    program_aware_proposer: bool = True,
    data_aware_proposer: bool = True,
    view_data_batch_size: int = 10,
    tip_aware_proposer: bool = True,
    fewshot_aware_proposer: bool = True,
    requires_permission_to_run: bool = True,
    provide_traceback: Optional[bool] = None,
) -> Any:

    zeroshot_opt = (self.max_bootstrapped_demos == 0) and (self.max_labeled_demos == 0)

    # If auto is None, and num_trials is not provided (but num_candidates is), raise an error that suggests a good num_trials value
    if self.auto is None and (self.num_candidates is not None and num_trials is None):
        raise ValueError(f"If auto is None, num_trials must also be provided. Given num_candidates={self.num_candidates}, we'd recommend setting num_trials to ~{self._set_num_trials_from_num_candidates(student, zeroshot_opt, self.num_candidates)}.")

    # If auto is None, and num_candidates or num_trials is None, raise an error
    if self.auto is None and (self.num_candidates is None or num_trials is None):
        raise ValueError("If auto is None, num_candidates must also be provided.")

    # If auto is provided, and either num_candidates or num_trials is not None, raise an error
    if self.auto is not None and (self.num_candidates is not None or num_trials is not None):
        raise ValueError("If auto is not None, num_candidates and num_trials cannot be set, since they would be overrided by the auto settings. Please either set auto to None, or do not specify num_candidates and num_trials.")

    # Set random seeds
    seed = seed or self.seed
    self._set_random_seeds(seed)

    # Update max demos if specified
    if max_bootstrapped_demos is not None:
        self.max_bootstrapped_demos = max_bootstrapped_demos
    if max_labeled_demos is not None:
        self.max_labeled_demos = max_labeled_demos

    # Set training & validation sets
    trainset, valset = self._set_and_validate_datasets(trainset, valset)

    # Set hyperparameters based on run mode (if set)
    num_trials, valset, minibatch = self._set_hyperparams_from_run_mode(
        student, num_trials, minibatch, zeroshot_opt, valset
    )

    if self.auto:
        self._print_auto_run_settings(num_trials, minibatch, valset)

    if minibatch and minibatch_size > len(valset):
        raise ValueError(f"Minibatch size cannot exceed the size of the valset. Valset size: {len(valset)}.")

    # Estimate LM calls and get user confirmation
    if requires_permission_to_run:
        if not self._get_user_confirmation(
            student,
            num_trials,
            minibatch,
            minibatch_size,
            minibatch_full_eval_steps,
            valset,
            program_aware_proposer,
        ):
            logger.info("Compilation aborted by the user.")
            return student  # Return the original student program

    # Initialize program and evaluator
    program = student.deepcopy()
    evaluate = Evaluate(
        devset=valset,
        metric=self.metric,
        num_threads=self.num_threads,
        max_errors=self.max_errors,
        display_table=False,
        display_progress=True,
        provide_traceback=provide_traceback,
    )

    # Step 1: Bootstrap few-shot examples
    demo_candidates = self._bootstrap_fewshot_examples(program, trainset, seed, teacher)

    # Step 2: Propose instruction candidates
    instruction_candidates = self._propose_instructions(
        program,
        trainset,
        demo_candidates,
        view_data_batch_size,
        program_aware_proposer,
        data_aware_proposer,
        tip_aware_proposer,
        fewshot_aware_proposer,
    )

    # If zero-shot, discard demos
    if zeroshot_opt:
        demo_candidates = None

    # Step 3: Find optimal prompt parameters
    best_program = self._optimize_prompt_parameters(
        program,
        instruction_candidates,
        demo_candidates,
        evaluate,
        valset,
        num_trials,
        minibatch,
        minibatch_size,
        minibatch_full_eval_steps,
        seed,
    )

    return best_program
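
As the validation logic above shows, setting auto=None requires supplying num_candidates on the optimizer and num_trials on compile. A sketch of a manual run, with my_metric and my_trainset as placeholders:

import dspy

# Manual mode: auto=None, so num_candidates and num_trials must be set explicitly.
optimizer = dspy.MIPROv2(metric=my_metric, auto=None, num_candidates=7)

optimized = optimizer.compile(
    dspy.ChainOfThought("question -> answer"),
    trainset=my_trainset,
    num_trials=20,                     # required when auto=None
    minibatch_size=25,                 # must not exceed len(valset)
    requires_permission_to_run=False,  # skip the confirmation prompt
)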

get_params() -> dict[str, Any]

Get the parameters of the teleprompter.

Returns:

dict[str, Any]: The parameters of the teleprompter.

Source code in dspy/teleprompt/teleprompt.py
def get_params(self) -> dict[str, Any]:
    """
    Get the parameters of the teleprompter.

    Returns:
        The parameters of the teleprompter.
    """
    return self.__dict__
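
Since this simply returns the instance's __dict__, it is a quick way to inspect an optimizer's configuration (my_metric is a placeholder):

optimizer = dspy.MIPROv2(metric=my_metric, auto="light")
params = optimizer.get_params()
print(params["max_bootstrapped_demos"], params["auto"])  # 4 light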

Example Usage

The program below shows how to optimize a math program with MIPROv2:

import dspy
from dspy.datasets.gsm8k import GSM8K, gsm8k_metric

# Import the optimizer
from dspy.teleprompt import MIPROv2

# Initialize the LM
lm = dspy.LM('openai/gpt-4o-mini', api_key='YOUR_OPENAI_API_KEY')
dspy.configure(lm=lm)

# Load the GSM8K dataset (provides the `gsm8k.train` split used below)
gsm8k = GSM8K()

# Initialize optimizer
teleprompter = MIPROv2(
    metric=gsm8k_metric,
    auto="medium", # Can choose between light, medium, and heavy optimization runs
)

# Optimize program
print(f"Optimizing program with MIPROv2...")
optimized_program = teleprompter.compile(
    dspy.ChainOfThought("question -> answer"),
    trainset=gsm8k.train,
    requires_permission_to_run=False,
)

# Save optimized program for future use
optimized_program.save("optimized.json")

How MIPROv2 works

At a high level, MIPROv2 works by creating both few-shot examples and new instructions for each predictor in your LM program, and then searching over these options with Bayesian Optimization to find the best combination for your program. If you want a visual explanation, check out this Twitter thread.

These steps are broken down in more detail below:

1) Bootstrap Few-Shot Examples: MIPROv2 randomly samples examples from your training set and runs them through your LM program. If the program's output for an example is correct, it is kept as a valid few-shot example candidate; otherwise, another example is tried until the specified number of few-shot example candidates has been curated. This step creates num_candidates sets of max_bootstrapped_demos bootstrapped examples and max_labeled_demos basic examples sampled from the training set, as in the sketch below.
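
A simplified sketch of this bootstrapping loop, illustrative only and not the library's internal implementation; it assumes the usual DSPy metric signature metric(example, prediction):

import random

def bootstrap_demo_set(program, trainset, metric, max_bootstrapped_demos, rng=random):
    """Collect training examples whose program outputs pass the metric (sketch)."""
    demos = []
    for example in rng.sample(trainset, len(trainset)):  # one shuffled pass over the data
        if len(demos) >= max_bootstrapped_demos:
            break
        prediction = program(**example.inputs())  # run the program on this example's inputs
        if metric(example, prediction):           # keep only examples the program gets right
            demos.append(example)
    return demos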

2) Propose Instruction Candidates: The instruction proposer grounds its generations in (1) a generated summary of properties of the training dataset, (2) a generated summary of your LM program's code and the specific predictor that an instruction is being written for, (3) the previously bootstrapped few-shot examples, which show reference inputs and outputs for a given predictor, and (4) a randomly sampled tip for generation (e.g., "be creative", "be concise") to help explore the space of potential instructions. This context is provided to a prompt_model, which writes high-quality instruction candidates.
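
Each of these grounding sources corresponds to a flag on compile(...), so any of them can be switched off. Here, optimizer, program, and trainset are placeholders for your own objects:

optimized = optimizer.compile(
    program,
    trainset=trainset,
    data_aware_proposer=True,     # (1) dataset summary
    program_aware_proposer=True,  # (2) program code summary
    fewshot_aware_proposer=True,  # (3) bootstrapped demos as reference inputs/outputs
    tip_aware_proposer=True,      # (4) randomly sampled generation tip
    requires_permission_to_run=False,
)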

3) Find an Optimized Combination of Few-Shot Examples & Instructions: Finally, MIPROv2 uses Bayesian Optimization to choose which combination of instructions and demonstrations works best for each predictor in the program. It runs a series of num_trials trials, evaluating a new set of prompts at each trial: when minibatch=True, each trial scores the prompts on a minibatch of size minibatch_size drawn from the validation set, and the set of prompts with the best average minibatch score is evaluated on the full validation set every minibatch_full_eval_steps trials. At the end of the optimization process, the LM program with the set of prompts that performed best on the full validation set is returned.
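
To make this schedule concrete, here is a tiny self-contained analogue that scores random (instruction, demo-set) combinations on minibatches and periodically re-scores the current leader on the full validation set. It swaps Bayesian Optimization for random search purely for illustration; toy_search and score_fn are hypothetical:

import random

def toy_search(instructions, demo_sets, score_fn, valset,
               num_trials=18, minibatch_size=35, full_eval_steps=5):
    """Random-search stand-in for MIPROv2's Bayesian loop (illustration only)."""
    mb_scores = {}  # (instruction_idx, demo_idx) -> latest minibatch score
    best_cfg, best_score = None, float("-inf")
    for trial in range(1, num_trials + 1):
        cfg = (random.randrange(len(instructions)), random.randrange(len(demo_sets)))
        batch = random.sample(valset, min(minibatch_size, len(valset)))
        mb_scores[cfg] = sum(score_fn(cfg, ex) for ex in batch) / len(batch)
        if trial % full_eval_steps == 0:  # periodically full-eval the minibatch leader
            leader = max(mb_scores, key=mb_scores.get)
            full = sum(score_fn(leader, ex) for ex in valset) / len(valset)
            if full > best_score:
                best_cfg, best_score = leader, full
    return best_cfg, best_score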

For those interested in the details, a fuller description of MIPROv2, along with a study comparing it to other DSPy optimizers, can be found in this paper.