
dspy.CodeAct

dspy.CodeAct(signature: Union[str, Type[Signature]], tools: list[Callable], max_iters: int = 5)

Bases: ReAct, ProgramOfThought

CodeAct is a module that uses the Python code interpreter together with predefined tools to solve a given problem.

Initializes the CodeAct module with the specified signature, tools, and maximum number of iterations.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `signature` | `Union[str, Type[Signature]]` | The signature of the module. | *required* |
| `tools` | `list[Callable]` | The tool callables to be used. CodeAct only accepts functions, not callable objects. | *required* |
| `max_iters` | `int` | The maximum number of iterations to generate the answer. | `5` |
Example
from dspy.predict import CodeAct
def factorial(n):
    if n == 1:
        return 1
    return n * factorial(n-1)

act = CodeAct("n->factorial", tools=[factorial])
act(n=5) # 120
Source code in dspy/predict/code_act.py
def __init__(self, signature: Union[str, Type[Signature]], tools: list[Callable], max_iters: int = 5):
    """
    Initializes the CodeAct module with the specified signature, tools, and maximum number of iterations.

    Args:
        signature (Union[str, Type[Signature]]): The signature of the module.
        tools (list[Callable]): The tool callables to be used. CodeAct only accepts functions and not callable objects.
        max_iters (int): The maximum number of iterations to generate the answer.

    Example:
        ```python
        from dspy.predict import CodeAct
        def factorial(n):
            if n == 1:
                return 1
            return n * factorial(n-1)

        act = CodeAct("n->factorial", tools=[factorial])
        act(n=5) # 120
        ```
    """
    self.signature = ensure_signature(signature)
    self.max_iters = max_iters
    self.history = []

    tools = [t if isinstance(t, Tool) else Tool(t) for t in tools]
    if any(
        not inspect.isfunction(tool.func) for tool in tools
    ):
        raise ValueError("CodeAct only accepts functions and not callable objects.")
    tools = {tool.name: tool for tool in tools}

    instructions = self._build_instructions(self.signature, tools)

    codeact_signature = (
        dspy.Signature({**self.signature.input_fields}, "\n".join(instructions))
        .append("trajectory", dspy.InputField(), type_=str)
        .append("generated_code", dspy.OutputField(desc="Python code that when executed, produces output relevant to answering the question"), type_=str)
        .append("finished", dspy.OutputField(desc="a boolean flag to determine if the process is done"), type_=bool)
    )

    extract_signature = dspy.Signature(
        {**self.signature.input_fields, **self.signature.output_fields},
        self.signature.instructions,
    ).append("trajectory", dspy.InputField(), type_=str)

    self.tools: dict[str, Tool] = tools
    self.codeact = dspy.Predict(codeact_signature)
    self.extractor = dspy.ChainOfThought(extract_signature)
    # This raises an exception if dspy cannot find an available Deno instance.
    self.interpreter = PythonInterpreter()

Functions

__call__(*args, **kwargs)

Source code in dspy/primitives/program.py
@with_callbacks
def __call__(self, *args, **kwargs):
    caller_modules = settings.caller_modules or []
    caller_modules = list(caller_modules)
    caller_modules.append(self)

    with settings.context(caller_modules=caller_modules):
        if settings.track_usage and settings.usage_tracker is None:
            with track_usage() as usage_tracker:
                output = self.forward(*args, **kwargs)
            output.set_lm_usage(usage_tracker.get_total_tokens())
            return output

        return self.forward(*args, **kwargs)
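Calling a module instance directly goes through `__call__`, which wraps `forward` and, when usage tracking is enabled, attaches token usage to the returned prediction. A minimal sketch, assuming an LM is configured (the model name below is a placeholder):

```python
import dspy

# Hypothetical model name; any configured LM works.
dspy.configure(lm=dspy.LM("openai/gpt-4o-mini"), track_usage=True)

qa = dspy.ChainOfThought("question -> answer")
pred = qa(question="What is 2 + 2?")  # invokes __call__, which runs forward()
print(pred.answer)
print(pred.get_lm_usage())  # usage collected by the tracker inside __call__
```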

batch(examples, num_threads: Optional[int] = None, max_errors: Optional[int] = None, return_failed_examples: bool = False, provide_traceback: Optional[bool] = None, disable_progress_bar: bool = False)

Processes a list of dspy.Example instances in parallel using the Parallel module.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `examples` | | List of `dspy.Example` instances to process. | *required* |
| `num_threads` | `Optional[int]` | Number of threads to use for parallel processing. | `None` |
| `max_errors` | `Optional[int]` | Maximum number of errors allowed before stopping execution. If `None`, inherits from `dspy.settings.max_errors`. | `None` |
| `return_failed_examples` | `bool` | Whether to return failed examples and exceptions. | `False` |
| `provide_traceback` | `Optional[bool]` | Whether to include traceback information in error logs. | `None` |
| `disable_progress_bar` | `bool` | Whether to disable the progress bar. | `False` |

Returns:

List of results, and optionally failed examples and exceptions.

Source code in dspy/primitives/program.py
def batch(
    self,
    examples,
    num_threads: Optional[int] = None,
    max_errors: Optional[int] = None,
    return_failed_examples: bool = False,
    provide_traceback: Optional[bool] = None,
    disable_progress_bar: bool = False,
):
    """
    Processes a list of dspy.Example instances in parallel using the Parallel module.

    Args:
        examples: List of dspy.Example instances to process.
        num_threads: Number of threads to use for parallel processing.
        max_errors: Maximum number of errors allowed before stopping execution.
            If ``None``, inherits from ``dspy.settings.max_errors``.
        return_failed_examples: Whether to return failed examples and exceptions.
        provide_traceback: Whether to include traceback information in error logs.
        disable_progress_bar: Whether to disable the progress bar.

    Returns:
        List of results, and optionally failed examples and exceptions.
    """
    # Create a list of execution pairs (self, example)
    exec_pairs = [(self, example.inputs()) for example in examples]

    # Create an instance of Parallel
    parallel_executor = Parallel(
        num_threads=num_threads,
        max_errors=max_errors,
        return_failed_examples=return_failed_examples,
        provide_traceback=provide_traceback,
        disable_progress_bar=disable_progress_bar,
    )

    # Execute the forward method of Parallel
    if return_failed_examples:
        results, failed_examples, exceptions = parallel_executor.forward(exec_pairs)
        return results, failed_examples, exceptions
    else:
        results = parallel_executor.forward(exec_pairs)
        return results
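A minimal sketch of `batch` usage, assuming an LM is configured:

```python
import dspy

qa = dspy.Predict("question -> answer")

examples = [
    dspy.Example(question="What is the capital of France?").with_inputs("question"),
    dspy.Example(question="What is 2 + 2?").with_inputs("question"),
]

# Each example's inputs are run through qa.forward in parallel threads.
results = qa.batch(examples, num_threads=2)
```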

deepcopy()

Deep copy the module.

This is a tweak to the default Python deepcopy: only self.parameters() is deep copied, while other attributes are shallow copied.

Source code in dspy/primitives/module.py
def deepcopy(self):
    """Deep copy the module.

    This is a tweak to the default Python deepcopy: only `self.parameters()` is deep copied, while
    other attributes are shallow copied.
    """
    try:
        # If the instance itself is copyable, we can just deep copy it.
        # Otherwise we will have to create a new instance and copy over the attributes one by one.
        return copy.deepcopy(self)
    except Exception:
        pass

    # Create an empty instance.
    new_instance = self.__class__.__new__(self.__class__)
    # Set attributes of the copied instance.
    for attr, value in self.__dict__.items():
        if isinstance(value, BaseModule):
            setattr(new_instance, attr, value.deepcopy())
        else:
            try:
                # Try to deep copy the attribute
                setattr(new_instance, attr, copy.deepcopy(value))
            except Exception:
                logging.warning(
                    f"Failed to deep copy attribute '{attr}' of {self.__class__.__name__}, "
                    "falling back to shallow copy or reference copy."
                )
                try:
                    # Fallback to shallow copy if deep copy fails
                    setattr(new_instance, attr, copy.copy(value))
                except Exception:
                    # If even the shallow copy fails, we just copy over the reference.
                    setattr(new_instance, attr, value)

    return new_instance
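For illustration, a short sketch contrasting `deepcopy` with `reset_copy` (documented below), which additionally resets all parameters:

```python
import dspy

program = dspy.ChainOfThought("question -> answer")

copied = program.deepcopy()   # parameters deep-copied; other attributes shallow-copied
fresh = program.reset_copy()  # deep copy whose parameters are reset to initial state

assert copied is not program
```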

dump_state()

Source code in dspy/primitives/module.py
def dump_state(self):
    return {name: param.dump_state() for name, param in self.named_parameters()}

get_lm()

Source code in dspy/primitives/program.py
def get_lm(self):
    all_used_lms = [param.lm for _, param in self.named_predictors()]

    if len(set(all_used_lms)) == 1:
        return all_used_lms[0]

    raise ValueError("Multiple LMs are being used in the module. There's no unique LM to return.")

inspect_history(n: int = 1)

Source code in dspy/primitives/program.py
def inspect_history(self, n: int = 1):
    return pretty_print_history(self.history, n)

load(path)

Load the saved module. You may also want to check out dspy.load if you want to load an entire program, not just the state of an existing program.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `path` | `str` | Path to the saved state file, which should be a `.json` or a `.pkl` file. | *required* |
Source code in dspy/primitives/module.py
def load(self, path):
    """Load the saved module. You may also want to check out dspy.load, if you want to
    load an entire program, not just the state for an existing program.

    Args:
        path (str): Path to the saved state file, which should be a .json or a .pkl file
    """
    path = Path(path)

    if path.suffix == ".json":
        with open(path) as f:
            state = ujson.loads(f.read())
    elif path.suffix == ".pkl":
        with open(path, "rb") as f:
            state = cloudpickle.load(f)
    else:
        raise ValueError(f"`path` must end with `.json` or `.pkl`, but received: {path}")

    dependency_versions = get_dependency_versions()
    saved_dependency_versions = state["metadata"]["dependency_versions"]
    for key, saved_version in saved_dependency_versions.items():
        if dependency_versions[key] != saved_version:
            logger.warning(
                f"There is a mismatch of {key} version between saved model and current environment. "
                f"You saved with `{key}=={saved_version}`, but now you have "
                f"`{key}=={dependency_versions[key]}`. This might cause errors or performance downgrade "
                "on the loaded model, please consider loading the model in the same environment as the "
                "saving environment."
            )
    self.load_state(state)
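A sketch of a state-only save/load round trip (the file name is arbitrary):

```python
import dspy

program = dspy.ChainOfThought("question -> answer")
program.save("program_state.json")  # state-only save (save_program=False by default)

fresh = dspy.ChainOfThought("question -> answer")
fresh.load("program_state.json")    # restores the state into the same architecture
```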

load_state(state)

Source code in dspy/primitives/module.py
def load_state(self, state):
    for name, param in self.named_parameters():
        param.load_state(state[name])
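`dump_state` and `load_state` allow the same round trip in memory, sketched here:

```python
import dspy

a = dspy.Predict("question -> answer")
state = a.dump_state()  # {parameter_name: parameter_state, ...}

b = dspy.Predict("question -> answer")
b.load_state(state)     # b now carries a's per-parameter state
```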

map_named_predictors(func)

Applies a function to all named predictors.

Source code in dspy/primitives/program.py
def map_named_predictors(self, func):
    """Applies a function to all named predictors."""
    for name, predictor in self.named_predictors():
        set_attribute_by_name(self, name, func(predictor))
    return self
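For example, a sketch that swaps the LM on every predictor (the model name is a placeholder; `set_lm` below achieves the same more directly):

```python
import dspy

def with_small_lm(predictor):
    predictor.lm = dspy.LM("openai/gpt-4o-mini")  # hypothetical model name
    return predictor

program = dspy.ChainOfThought("question -> answer")
program.map_named_predictors(with_small_lm)
```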

named_parameters()

Unlike PyTorch, handles (non-recursive) lists of parameters too.

Source code in dspy/primitives/module.py
def named_parameters(self):
    """
    Unlike PyTorch, handles (non-recursive) lists of parameters too.
    """

    import dspy
    from dspy.predict.parameter import Parameter

    visited = set()
    named_parameters = []

    def add_parameter(param_name, param_value):
        if isinstance(param_value, Parameter):
            if id(param_value) not in visited:
                visited.add(id(param_value))
                param_name = postprocess_parameter_name(param_name, param_value)
                named_parameters.append((param_name, param_value))

        elif isinstance(param_value, dspy.Module):
            # When a sub-module is pre-compiled, keep it frozen.
            if not getattr(param_value, "_compiled", False):
                for sub_name, param in param_value.named_parameters():
                    add_parameter(f"{param_name}.{sub_name}", param)

    if isinstance(self, Parameter):
        add_parameter("self", self)

    for name, value in self.__dict__.items():
        if isinstance(value, Parameter):
            add_parameter(name, value)

        elif isinstance(value, dspy.Module):
            # When a sub-module is pre-compiled, keep it frozen.
            if not getattr(value, "_compiled", False):
                for sub_name, param in value.named_parameters():
                    add_parameter(f"{name}.{sub_name}", param)

        elif isinstance(value, (list, tuple)):
            for idx, item in enumerate(value):
                add_parameter(f"{name}[{idx}]", item)

        elif isinstance(value, dict):
            for key, item in value.items():
                add_parameter(f"{name}['{key}']", item)

    return named_parameters
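A small sketch of iterating named parameters; the exact names depend on each sub-module's internal attribute structure:

```python
import dspy

class QA(dspy.Module):
    def __init__(self):
        super().__init__()
        self.cot = dspy.ChainOfThought("question -> answer")

for name, param in QA().named_parameters():
    print(name)  # e.g. "cot.predict", depending on internal attribute names
```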

named_predictors()

Source code in dspy/primitives/program.py
def named_predictors(self):
    from dspy.predict.predict import Predict

    return [(name, param) for name, param in self.named_parameters() if isinstance(param, Predict)]

named_sub_modules(type_=None, skip_compiled=False) -> Generator[tuple[str, BaseModule], None, None]

Find all sub-modules in the module, as well as their names.

Say self.children[4]['key'].sub_module is a sub-module. Then the name will be 'children[4][key].sub_module'. But if the sub-module is accessible at different paths, only one of the paths will be returned.

Source code in dspy/primitives/module.py
def named_sub_modules(self, type_=None, skip_compiled=False) -> Generator[tuple[str, "BaseModule"], None, None]:
    """Find all sub-modules in the module, as well as their names.

    Say self.children[4]['key'].sub_module is a sub-module. Then the name will be
    'children[4][key].sub_module'. But if the sub-module is accessible at different
    paths, only one of the paths will be returned.
    """
    if type_ is None:
        type_ = BaseModule

    queue = deque([("self", self)])
    seen = {id(self)}

    def add_to_queue(name, item):
        name = postprocess_parameter_name(name, item)

        if id(item) not in seen:
            seen.add(id(item))
            queue.append((name, item))

    while queue:
        name, item = queue.popleft()

        if isinstance(item, type_):
            yield name, item

        if isinstance(item, BaseModule):
            if skip_compiled and getattr(item, "_compiled", False):
                continue
            for sub_name, sub_item in item.__dict__.items():
                add_to_queue(f"{name}.{sub_name}", sub_item)

        elif isinstance(item, (list, tuple)):
            for i, sub_item in enumerate(item):
                add_to_queue(f"{name}[{i}]", sub_item)

        elif isinstance(item, dict):
            for key, sub_item in item.items():
                add_to_queue(f"{name}[{key}]", sub_item)

parameters()

Source code in dspy/primitives/module.py
def parameters(self):
    return [param for _, param in self.named_parameters()]

predictors()

Source code in dspy/primitives/program.py
def predictors(self):
    return [param for _, param in self.named_predictors()]

reset_copy()

Deep copy the module and reset all parameters.

Source code in dspy/primitives/module.py
def reset_copy(self):
    """Deep copy the module and reset all parameters."""
    new_instance = self.deepcopy()

    for param in new_instance.parameters():
        param.reset()

    return new_instance

save(path, save_program=False, modules_to_serialize=None)

Save the module.

Save the module to a directory or a file. There are two modes:

- `save_program=False`: Save only the state of the module to a JSON or pickle file, depending on the file extension.
- `save_program=True`: Save the whole module to a directory via cloudpickle, which contains both the state and the architecture of the model.

If save_program=True and modules_to_serialize are provided, it will register those modules for serialization with cloudpickle's register_pickle_by_value. This causes cloudpickle to serialize the module by value rather than by reference, ensuring the module is fully preserved along with the saved program. This is useful when you have custom modules that need to be serialized alongside your program. If None, then no modules will be registered for serialization.

We also save the dependency versions, so that the loaded model can check if there is a version mismatch on critical dependencies or DSPy version.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `path` | `str` | Path to the saved state file: a `.json` or `.pkl` file when `save_program=False`, a directory when `save_program=True`. | *required* |
| `save_program` | `bool` | If True, save the whole module to a directory via cloudpickle; otherwise only save the state. | `False` |
| `modules_to_serialize` | `list` | A list of modules to serialize with cloudpickle's `register_pickle_by_value`. If None, no modules will be registered for serialization. | `None` |
Source code in dspy/primitives/module.py
def save(self, path, save_program=False, modules_to_serialize=None):
    """Save the module.

    Save the module to a directory or a file. There are two modes:
    - `save_program=False`: Save only the state of the module to a json or pickle file, based on the value of
        the file extension.
    - `save_program=True`: Save the whole module to a directory via cloudpickle, which contains both the state and
        architecture of the model.

    If `save_program=True` and `modules_to_serialize` are provided, it will register those modules for serialization 
    with cloudpickle's `register_pickle_by_value`. This causes cloudpickle to serialize the module by value rather 
    than by reference, ensuring the module is fully preserved along with the saved program. This is useful 
    when you have custom modules that need to be serialized alongside your program. If None, then no modules 
    will be registered for serialization.

    We also save the dependency versions, so that the loaded model can check if there is a version mismatch on
    critical dependencies or DSPy version.

    Args:
        path (str): Path to the saved state file, which should be a .json or .pkl file when `save_program=False`,
            and a directory when `save_program=True`.
        save_program (bool): If True, save the whole module to a directory via cloudpickle, otherwise only save
            the state.
        modules_to_serialize (list): A list of modules to serialize with cloudpickle's `register_pickle_by_value`.
            If None, then no modules will be registered for serialization.

    """
    metadata = {}
    metadata["dependency_versions"] = get_dependency_versions()
    path = Path(path)

    if save_program:
        if path.suffix:
            raise ValueError(
                f"`path` must point to a directory without a suffix when `save_program=True`, but received: {path}"
            )
        if path.exists() and not path.is_dir():
            raise NotADirectoryError(f"The path '{path}' exists but is not a directory.")

        if not path.exists():
            # Create the directory (and any parent directories)
            path.mkdir(parents=True)

        try:
            modules_to_serialize = modules_to_serialize or []
            for module in modules_to_serialize:
                cloudpickle.register_pickle_by_value(module)

            with open(path / "program.pkl", "wb") as f:
                cloudpickle.dump(self, f)
        except Exception as e:
            raise RuntimeError(
                f"Saving failed with error: {e}. Please remove the non-picklable attributes from your DSPy program, "
                "or consider using state-only saving by setting `save_program=False`."
            )
        with open(path / "metadata.json", "w", encoding="utf-8") as f:
            ujson.dump(metadata, f, indent=2, ensure_ascii=False)

        return

    state = self.dump_state()
    state["metadata"] = metadata
    if path.suffix == ".json":
        try:
            with open(path, "w", encoding="utf-8") as f:
                f.write(ujson.dumps(state, indent=2 , ensure_ascii=False))
        except Exception as e:
            raise RuntimeError(
                f"Failed to save state to {path} with error: {e}. Your DSPy program may contain non "
                "json-serializable objects, please consider saving the state in .pkl by using `path` ending "
                "with `.pkl`, or saving the whole program by setting `save_program=True`."
            )
    elif path.suffix == ".pkl":
        with open(path, "wb") as f:
            cloudpickle.dump(state, f)
    else:
        raise ValueError(f"`path` must end with `.json` or `.pkl` when `save_program=False`, but received: {path}")

set_lm(lm)

Source code in dspy/primitives/program.py
def set_lm(self, lm):
    for _, param in self.named_predictors():
        param.lm = lm
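A short sketch pairing `set_lm` with `get_lm` (the model name is a placeholder):

```python
import dspy

program = dspy.ChainOfThought("question -> answer")
program.set_lm(dspy.LM("openai/gpt-4o-mini"))  # hypothetical model name
assert program.get_lm() is not None            # one LM now shared by all predictors
```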

CodeAct

CodeAct is a DSPy module that combines code generation with tool execution to solve problems. It generates Python code snippets that use provided tools and the Python standard library to accomplish tasks.

Basic Usage

Here's a simple example of using CodeAct:

import dspy
from dspy.predict import CodeAct

# Define a simple tool function
def factorial(n: int) -> int:
    """Calculate the factorial of a number."""
    if n == 1:
        return 1
    return n * factorial(n-1)

# Create a CodeAct instance
act = CodeAct("n->factorial_result", tools=[factorial])

# Use the CodeAct instance
result = act(n=5)
print(result) # Will calculate factorial(5) = 120

How It Works

CodeAct operates in an iterative manner (see the sketch after this list):

  1. Takes input parameters and available tools
  2. Generates Python code snippets that use these tools
  3. Executes the code in a Python sandbox
  4. Collects the output and determines if the task is complete
  5. Answers the original question based on the collected information
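
The sketch below illustrates this loop with two cooperating tools. The tool bodies and figures are hypothetical stand-ins; the generated code may call the tools across several iterations before setting `finished`:

```python
import dspy

def lookup_population(city: str) -> int:
    """Toy lookup standing in for a real data source (hypothetical values)."""
    return {"Tokyo": 13_960_000, "Osaka": 2_750_000}.get(city, 0)

def difference(a: int, b: int) -> int:
    """Return the absolute difference of two integers."""
    return abs(a - b)

act = dspy.CodeAct("question -> answer", tools=[lookup_population, difference])
result = act(question="How many more people live in Tokyo than in Osaka?")
print(result.answer)
```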

⚠️ Limitations

Only accepts pure functions as tools (no callable objects)

The following example does not work because a callable object is passed as a tool.

# ❌ Does not work
class Add:
    def __call__(self, a: int, b: int):
        return a + b

dspy.CodeAct("question -> answer", tools=[Add()])

External libraries cannot be used

The following example does not work because the tool relies on the external library numpy.

# ❌ Does not work
import numpy as np

def exp(i: int):
    return np.exp(i)

dspy.CodeAct("question -> answer", tools=[exp])

All dependent functions need to be passed to CodeAct

Functions that depend on other functions or classes not passed to CodeAct cannot be used. The following example does not work because the tool functions depend on names that are not passed to CodeAct, namely Profile and parent_function.

# ❌ Does not work
from pydantic import BaseModel

class Profile(BaseModel):
    name: str
    age: int

def age(profile: Profile) -> int:
    return profile.age

def parent_function():
    print("Hi!")

def child_function():
    parent_function()

dspy.CodeAct("question -> answer", tools=[age, child_function])

Instead, the following example works since all necessary tool functions are passed to CodeAct:

# ✅ Works

def parent_function():
    print("Hi!")

def child_function():
    parent_function()

dspy.CodeAct("question -> answer", tools=[parent_function, child_function])