dspy.RLM¶
RLM (Recursive Language Model) is a DSPy module that lets LLMs programmatically explore large contexts through a sandboxed Python REPL. Instead of feeding huge contexts directly into the prompt, RLM treats context as external data that the LLM examines via code execution and recursive sub-LLM calls.
This implements the approach described in "Recursive Language Models" (Zhang, Kraska, Khattab, 2025).
When to Use RLM¶
As contexts grow, LLM performance degrades — a phenomenon known as context rot. RLMs address this by separating the variable space (information stored in the REPL) from the token space (what the LLM actually processes). The LLM dynamically loads only the context it needs, when it needs it.
Use RLM when:
- Your context is too large to fit in the LLM's context window effectively
- The task benefits from programmatic exploration (searching, filtering, aggregating, chunking)
- You need the LLM to decide how to decompose the problem, not you
Basic Usage¶
import dspy
dspy.configure(lm=dspy.LM("openai/gpt-5"))
# Create an RLM module
rlm = dspy.RLM("context, query -> answer")
# Call it like any other module
result = rlm(
    context="...very long document or data...",
    query="What is the total revenue mentioned?"
)
print(result.answer)
Deno Installation¶
RLM relies on Deno and Pyodide to create a local WASM sandbox in which the generated Python code runs.
You can install Deno on macOS and Linux with `curl -fsSL https://deno.land/install.sh | sh`. See the Deno Installation Docs for more details, and make sure to accept the prompt when it asks to add Deno to your shell profile.
After you have installed Deno, restart your shell. Then you can run dspy.RLM.
Users have reported issues with the Deno cache not being found by DSPy. We are actively investigating these issues, and your feedback is greatly appreciated.
You can also work with an external sandbox provider. We are still working on creating an example of using external sandbox providers.
How It Works¶
RLM operates in an iterative REPL loop:
- The LLM receives metadata about the context (type, length, preview) but not the full context
- The LLM writes Python code to explore the data (print samples, search, filter)
- Code executes in a sandboxed interpreter, and the LLM sees the output
- The LLM can call `llm_query(prompt)` to run sub-LLM calls for semantic analysis on snippets
- When done, the LLM calls `SUBMIT(output)` to return the final answer
What the LLM sees (step-by-step trace):¶
Step 1: Initial Metadata (no direct access to full context)¶
Output shown to the LLM: metadata about the context variable (its type, length, and a short preview) rather than the full text.
Step 2: Write Code to Explore Context¶
# Step 2: Search for relevant sections
import re
matches = re.findall(r'revenue.*?\$[\d,]+', context, re.IGNORECASE)
print(matches)
Step 3: Trigger Sub-LLM Calls¶
# Step 3: Use sub-LLM for semantic extraction
result = llm_query(f"Extract the total revenue from: {matches[0]}")
print(result)
Step 4: Submit Final Answer¶
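A minimal sketch of this step, continuing the revenue example above (the exact arguments accepted by SUBMIT depend on your signature's output fields):
# Step 4: Return the extracted value as the final answer and end execution
# `result` holds the sub-LLM extraction from Step 3
SUBMIT(result)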
Output shown to the user: a Prediction carrying the output fields defined by the signature (e.g., result.answer).
Constructor Parameters¶
| Parameter | Type | Default | Description |
|---|---|---|---|
| `signature` | `str \| Signature` | required | Defines inputs and outputs (e.g., `"context, query -> answer"`) |
| `max_iterations` | `int` | `20` | Maximum REPL interaction loops before fallback extraction |
| `max_llm_calls` | `int` | `50` | Maximum `llm_query`/`llm_query_batched` calls per execution |
| `max_output_chars` | `int` | `100_000` | Maximum characters to include from REPL output |
| `verbose` | `bool` | `False` | Log detailed execution info |
| `tools` | `list[Union[Callable, dspy.Tool]]` | `None` | Additional tool functions callable from interpreter code |
| `sub_lm` | `dspy.LM` | `None` | LM for sub-queries. Defaults to `dspy.settings.lm`. Use a cheaper model here. |
| `interpreter` | `CodeInterpreter` | `None` | Custom interpreter. Defaults to `PythonInterpreter` (Deno/Pyodide WASM). |
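The signature can also be passed as a Signature class instead of a string. A minimal sketch using standard DSPy field declarations:
import dspy

class RevenueQA(dspy.Signature):
    """Answer questions about a long financial document."""
    context: str = dspy.InputField()
    query: str = dspy.InputField()
    answer: str = dspy.OutputField()

# Equivalent to dspy.RLM("context, query -> answer"), with a couple of knobs tuned
rlm = dspy.RLM(RevenueQA, max_iterations=10, verbose=True)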
Built-in Tools¶
Inside the REPL, the LLM has access to:
| Tool | Description |
|---|---|
| `llm_query(prompt)` | Query a sub-LLM for semantic analysis (~500K char capacity) |
| `llm_query_batched(prompts)` | Query multiple prompts concurrently (faster for batch operations) |
| `print()` | Print output (required to see results) |
| `SUBMIT(...)` | Submit final output and end execution |
| Standard library | `re`, `json`, `collections`, `math`, etc. |
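For example, code the LLM writes inside the REPL might fan chunk-level questions out with llm_query_batched. A sketch, assuming it returns one response string per prompt in the same order:
# Split the context into fixed-size chunks and query the sub-LLM about each concurrently
chunks = [context[i:i + 50_000] for i in range(0, len(context), 50_000)]
prompts = [f"List any revenue figures mentioned in this excerpt:\n{chunk}" for chunk in chunks]
responses = llm_query_batched(prompts)  # assumed: one response per prompt, same order
for i, response in enumerate(responses):
    print(f"Chunk {i}: {response}")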
Examples¶
Long Document Q&A¶
import dspy
dspy.configure(lm=dspy.LM("openai/gpt-5"))
rlm = dspy.RLM("document, question -> answer", max_iterations=10)
with open("large_report.txt") as f:
    document = f.read()  # 500K+ characters

result = rlm(
    document=document,
    question="What were the key findings from Q3?"
)
print(result.answer)
Using a Cheaper Sub-LM¶
import dspy
main_lm = dspy.LM("openai/gpt-5")
cheap_lm = dspy.LM("openai/gpt-5-nano")
dspy.configure(lm=main_lm)
# Root LM (gpt-5) decides strategy; sub-LM (gpt-5-nano) handles extraction
rlm = dspy.RLM("data, query -> summary", sub_lm=cheap_lm)
Multiple Typed Outputs¶
rlm = dspy.RLM("logs -> error_count: int, critical_errors: list[str]")
result = rlm(logs=server_logs)
print(f"Found {result.error_count} errors")
print(f"Critical: {result.critical_errors}")
Custom Tools¶
def fetch_metadata(doc_id: str) -> str:
    """Fetch metadata for a document ID."""
    return database.get_metadata(doc_id)  # `database` is assumed to be defined elsewhere

rlm = dspy.RLM(
    "documents, query -> answer",
    tools=[fetch_metadata]
)
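Inside the REPL, the LLM can then call the tool like any other function. Illustrative code it might write (the document ID is hypothetical):
# Code the LLM might execute in the sandbox: look up metadata for an ID it found in `documents`
meta = fetch_metadata("doc-123")  # hypothetical document ID
print(meta)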
Async Execution¶
import asyncio
rlm = dspy.RLM("context, query -> answer")
async def process():
    result = await rlm.aforward(context=data, query="Summarize this")
    return result.answer
answer = asyncio.run(process())
Inspecting the Trajectory¶
result = rlm(context=data, query="Find the magic number")
# See what code the LLM executed
for step in result.trajectory:
    print(f"Code:\n{step['code']}")
    print(f"Output:\n{step['output']}\n")
Output¶
RLM returns a Prediction with:
- Output fields from your signature (e.g., `result.answer`)
- `trajectory`: List of dicts with `reasoning`, `code`, and `output` for each step
- `final_reasoning`: The LLM's reasoning on the final step
Notes¶
Experimental
RLM is marked as experimental. The API may change in future releases.
Thread Safety
RLM instances are not thread-safe when using a custom interpreter. Create separate instances for concurrent use, or use the default PythonInterpreter which creates a fresh instance per forward() call.
Interpreter Requirements
The default PythonInterpreter requires Deno to be installed for the Pyodide WASM sandbox.
API Reference¶
dspy.RLM(signature: type[Signature] | str, max_iterations: int = 20, max_llm_calls: int = 50, max_output_chars: int = 100000, verbose: bool = False, tools: list[Callable] | None = None, sub_lm: dspy.LM | None = None, interpreter: CodeInterpreter | None = None)
¶
Bases: Module
Recursive Language Model module.
Uses a sandboxed REPL to let the LLM programmatically explore large contexts through code execution. The LLM writes Python code to examine data, call sub-LLMs for semantic analysis, and build up answers iteratively.
The default interpreter is PythonInterpreter (Deno/Pyodide/WASM), but you can provide any CodeInterpreter implementation (e.g., MockInterpreter, or write a custom one using E2B or Modal).
Note: RLM instances are not thread-safe when using a custom interpreter. Create separate RLM instances for concurrent use, or use the default PythonInterpreter which creates a fresh instance per forward() call.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `signature` | `type[Signature] \| str` | Defines inputs and outputs. String like `"context, query -> answer"` or a `Signature` class. | required |
| `max_iterations` | `int` | Maximum REPL interaction iterations. | `20` |
| `max_llm_calls` | `int` | Maximum sub-LLM calls (`llm_query`/`llm_query_batched`) per execution. | `50` |
| `max_output_chars` | `int` | Maximum characters to include from REPL output. | `100000` |
| `verbose` | `bool` | Whether to log detailed execution info. | `False` |
| `tools` | `list[Callable] \| None` | List of tool functions or `dspy.Tool` objects callable from interpreter code. Built-in tools: `llm_query(prompt)`, `llm_query_batched(prompts)`. | `None` |
| `sub_lm` | `LM \| None` | LM for `llm_query`/`llm_query_batched`. Defaults to `dspy.settings.lm`. Allows using a different (e.g., cheaper) model for sub-queries. | `None` |
| `interpreter` | `CodeInterpreter \| None` | `CodeInterpreter` implementation to use. Defaults to `PythonInterpreter`. | `None` |
Attributes¶
tools: dict[str, Tool] property
¶
User-provided tools (excludes internal llm_query/llm_query_batched).
Functions¶
__call__(*args, **kwargs) -> Prediction
¶
forward(**input_args) -> Prediction
¶
Execute RLM to produce outputs from the given inputs.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `**input_args` | | Input values matching the signature's input fields | `{}` |
Returns:
| Type | Description |
|---|---|
| `Prediction` | Prediction with output field(s) from the signature and 'trajectory' for debugging |
Raises:
| Type | Description |
|---|---|
| `ValueError` | If required input fields are missing |
aforward(**input_args) -> Prediction async
¶
Async version of forward(). Execute RLM to produce outputs.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `**input_args` | | Input values matching the signature's input fields | `{}` |
Returns:
| Type | Description |
|---|---|
| `Prediction` | Prediction with output field(s) from the signature and 'trajectory' for debugging |
Raises:
| Type | Description |
|---|---|
| `ValueError` | If required input fields are missing |
batch(examples: list[Example], num_threads: int | None = None, max_errors: int | None = None, return_failed_examples: bool = False, provide_traceback: bool | None = None, disable_progress_bar: bool = False, timeout: int = 120, straggler_limit: int = 3) -> list[Example] | tuple[list[Example], list[Example], list[Exception]]
¶
Processes a list of dspy.Example instances in parallel using the Parallel module.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `examples` | `list[Example]` | List of `dspy.Example` instances to process. | required |
| `num_threads` | `int \| None` | Number of threads to use for parallel processing. | `None` |
| `max_errors` | `int \| None` | Maximum number of errors allowed before stopping execution. | `None` |
| `return_failed_examples` | `bool` | Whether to return failed examples and exceptions. | `False` |
| `provide_traceback` | `bool \| None` | Whether to include traceback information in error logs. | `None` |
| `disable_progress_bar` | `bool` | Whether to display the progress bar. | `False` |
| `timeout` | `int` | Seconds before a straggler task is resubmitted. Set to 0 to disable. | `120` |
| `straggler_limit` | `int` | Only check for stragglers when this many or fewer tasks remain. | `3` |
Returns:
| Type | Description |
|---|---|
| `list[Example] \| tuple[list[Example], list[Example], list[Exception]]` | List of results, and optionally failed examples and exceptions. |
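A usage sketch, assuming rlm was built with the "context, query -> answer" signature and documents is a list of strings defined elsewhere:
import dspy

examples = [
    dspy.Example(context=doc, query="Summarize the key risks.").with_inputs("context", "query")
    for doc in documents
]
results = rlm.batch(examples, num_threads=4)
for result in results:
    print(result.answer)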
deepcopy()
¶
Deep copy the module.
This is a tweak to the default Python deepcopy: only self.parameters() is deep-copied, while other attributes are shallow-copied.
dump_state(json_mode=True)
¶
get_lm()
¶
load(path, allow_pickle=False)
¶
Load the saved module. You may also want to check out dspy.load, if you want to load an entire program, not just the state for an existing program.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `path` | `str` | Path to the saved state file, which should be a .json or a .pkl file | required |
| `allow_pickle` | `bool` | If True, allow loading .pkl files, which can run arbitrary code. This is dangerous and should only be used if you are sure about the source of the file and in a trusted environment. | `False` |
load_state(state)
¶
named_parameters()
¶
Unlike PyTorch, handles (non-recursive) lists of parameters too.
named_predictors()
¶
parameters()
¶
predictors()
¶
reset_copy()
¶
Deep copy the module and reset all parameters.
save(path, save_program=False, modules_to_serialize=None)
¶
Save the module.
Save the module to a directory or a file. There are two modes:
- save_program=False: Save only the state of the module to a JSON or pickle file, based on the file extension.
- save_program=True: Save the whole module to a directory via cloudpickle, which contains both the state and the architecture of the model.
If save_program=True and modules_to_serialize are provided, it will register those modules for serialization
with cloudpickle's register_pickle_by_value. This causes cloudpickle to serialize the module by value rather
than by reference, ensuring the module is fully preserved along with the saved program. This is useful
when you have custom modules that need to be serialized alongside your program. If None, then no modules
will be registered for serialization.
We also save the dependency versions, so that the loaded model can check if there is a version mismatch on critical dependencies or DSPy version.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `path` | `str` | Path to the saved state file, which should be a .json or .pkl file when `save_program=False`, or a directory when `save_program=True`. | required |
| `save_program` | `bool` | If True, save the whole module to a directory via cloudpickle, otherwise only save the state. | `False` |
| `modules_to_serialize` | `list` | A list of modules to serialize with cloudpickle's `register_pickle_by_value`. | `None` |