# Custom agents

Test agents built with raw LLM SDK calls (OpenAI, Anthropic, Google, etc.).
> **When to use this guide**
>
> Use this guide if you're calling LLM APIs directly (e.g., `openai.chat.completions.create()`). If you're using a framework like LangChain or CrewAI, see the framework guides instead.
## What you'll use

| Decorator | Purpose |
|---|---|
| `@link_tool` | Your custom tools (search, database, APIs) |
| `@link_llm` | Your direct LLM calls (optional, for cleaner verification) |
| `@link_agent` | Entry point for tracing |
## Why `@link_llm` is optional

Tenro intercepts LLM calls at the HTTP level. This means `llm.simulate()` works whether or not you use `@link_llm`:

```python
from tenro import Provider
from tenro.simulate import llm

# This works WITHOUT @link_llm - Tenro intercepts HTTP directly
llm.simulate(Provider.OPENAI, response="Hello!")

# Your OpenAI SDK call is intercepted at the network level
response = openai.chat.completions.create(...)  # Intercepted!
```
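To build intuition for how transport-level interception can work without touching application code, here is a minimal, framework-free sketch using only the standard library. It is illustrative only, not Tenro's implementation: the `sdk_chat_completion` function and the endpoint URL are made up, and `unittest.mock` stands in for the real interception layer.

```python
"""Minimal sketch of HTTP-level interception (illustrative only; not Tenro's code)."""
from unittest import mock
import urllib.request


def sdk_chat_completion(prompt: str) -> str:
    """Stand-in for an LLM SDK call that goes over the network."""
    req = urllib.request.Request("https://api.example.com/v1/chat", data=prompt.encode())
    with urllib.request.urlopen(req) as resp:  # the network hop that gets intercepted
        return resp.read().decode()


# Patch at the transport layer: the "SDK" code above is untouched,
# yet the call never reaches the network - the same idea behind llm.simulate().
canned = mock.MagicMock()
canned.__enter__.return_value.read.return_value = b"Hello!"

with mock.patch("urllib.request.urlopen", return_value=canned):
    result = sdk_chat_completion("Hi")

print(result)  # -> Hello!
```

Because the patch sits below the SDK, the calling code needs no changes and no decorators, which is why `@link_llm` stays optional for httpx-based SDKs.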
What `@link_llm` does:

- Creates an `LLMScope` to trace which function made each LLM call
- Enables targeted simulation with `target=` when you have multiple LLM functions
- Provides better error messages showing which function failed

Benefits of using `@link_llm`:

- Route simulations to specific functions with `target=`
- Explicit tracing of LLM call origins for debugging
- Works when HTTP interception isn't available (non-httpx SDKs)
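The routing idea behind `target=` can be sketched in plain Python: a registry maps each linked function to a canned response, and a decorator records which linked function is currently on the call stack so the interception layer can dispatch. Everything below (`link_llm`, `simulate`, `intercepted_llm_call`, `summarize`, `classify`) is a toy stand-in, not Tenro's internals.

```python
"""Conceptual sketch of target= routing (illustrative only; not Tenro's internals)."""
from functools import wraps

_simulations: dict = {}   # linked function -> canned response
_current_target = None    # which linked function is currently executing


def link_llm(fn):
    """Toy stand-in for @link_llm: records the active function as the target."""
    @wraps(fn)
    def wrapper(*args, **kwargs):
        global _current_target
        _current_target = wrapper  # remember who is making the LLM call
        try:
            return fn(*args, **kwargs)
        finally:
            _current_target = None
    return wrapper


def simulate(target, response):
    """Register a canned response for one specific linked function."""
    _simulations[target] = response


def intercepted_llm_call(prompt: str) -> str:
    """Stand-in for the interception layer: routes by the active target."""
    return _simulations.get(_current_target, "default response")


@link_llm
def summarize(text: str) -> str:
    return intercepted_llm_call(f"Summarize: {text}")


@link_llm
def classify(text: str) -> str:
    return intercepted_llm_call(f"Classify: {text}")


simulate(summarize, "A short summary.")
simulate(classify, "positive")

print(summarize("long doc"))  # -> A short summary.
print(classify("great!"))     # -> positive
```

With two linked functions, each simulation reaches only its intended caller, which is the problem `target=` solves when an agent makes several different kinds of LLM calls.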
## Customer support example

A customer support agent with direct OpenAI SDK calls.

```python
"""Customer Support: Testing knowledge base retrieval with custom OpenAI agents."""

from __future__ import annotations

from examples.myapp import CustomerSupportAgent, generate_response, search_knowledge_base
from tenro import Provider
from tenro.simulate import llm, tool
from tenro.testing import tenro


@tenro
def test_customer_support_answers_question() -> None:
    """Test customer support agent uses knowledge base and LLM."""
    tool.simulate(
        search_knowledge_base,
        result=[{"title": "Refund Policy", "content": "Full refunds within 30 days."}],
    )
    llm.simulate(
        Provider.OPENAI,
        target=generate_response,
        response="You can get a full refund within 30 days of purchase.",
    )

    CustomerSupportAgent().run("How do I get a refund?")

    tool.verify_many(search_knowledge_base, count=1)
    llm.verify(Provider.OPENAI)
    llm.verify(output_contains="refund")
```
## RAG pipeline example

A retrieval-augmented generation pipeline with direct LLM calls.

```python
"""RAG Pipeline: Testing document retrieval with custom OpenAI agents."""

from __future__ import annotations

from examples.myapp import RAGPipeline, fetch_documents, generate_response
from tenro import Provider
from tenro.simulate import llm, tool
from tenro.testing import tenro


@tenro
def test_rag_pipeline_synthesizes_answer() -> None:
    """Test RAG pipeline fetches documents and generates answer."""
    tool.simulate(
        fetch_documents,
        result=[
            {"id": "doc1", "text": "Machine learning uses algorithms to learn."},
            {"id": "doc2", "text": "Deep learning is a subset of ML."},
        ],
    )
    llm.simulate(
        Provider.OPENAI,
        target=generate_response,
        response="Machine learning is a field where algorithms learn patterns from data.",
    )

    RAGPipeline().run("What is machine learning?", "AI")

    tool.verify_many(fetch_documents, count=1)
    llm.verify(Provider.OPENAI)
    llm.verify(output_contains="Machine learning")
```
## Multi-turn conversation example

An agent handling multi-turn conversations.

```python
"""Multi-Turn Conversation: Testing sequential LLM calls with custom OpenAI agents."""

from __future__ import annotations

from examples.myapp import ConversationAgent
from tenro import Provider
from tenro.simulate import llm
from tenro.testing import tenro


@tenro
def test_multi_turn_conversation() -> None:
    """Test agent handles multi-turn conversation with context."""
    llm.simulate(
        Provider.OPENAI,
        responses=[
            "A list in Python is created with square brackets: my_list = [1, 2, 3]",
            "To add items, use append(): my_list.append(4)",
        ],
    )

    responses = ConversationAgent().run(
        ["How do I create a list in Python?", "How do I add items to it?"]
    )

    assert len(responses) == 2
    llm.verify_many(Provider.OPENAI, count=2)
    llm.verify(output_contains="list", call_index=0)
    llm.verify(output_contains="append", call_index=1)
```
## More examples

### Research assistant

```python
"""Basic example: Research assistant agent.

Tests an agent that searches the web and summarizes findings.
"""

from __future__ import annotations

from tenro import Provider, link_agent, link_llm, link_tool
from tenro.simulate import llm, tool
from tenro.testing import tenro

# APPLICATION CODE


@link_tool("web_search")
def web_search(query: str, num_results: int = 5) -> list[dict]:
    """Search the web for information."""
    return [{"title": "Result 1", "snippet": "...", "url": "https://..."}]


@link_llm(Provider.OPENAI)
def synthesize_findings(query: str, search_results: list[dict]) -> str:
    """Synthesize search results into a coherent summary."""
    import openai

    response = openai.chat.completions.create(
        model="gpt-4",
        messages=[
            {
                "role": "user",
                "content": f"Query: {query}\nResults: {search_results}\n\nSynthesize.",
            }
        ],
    )
    return response.choices[0].message.content


@link_agent("ResearchAssistantAgent")
class ResearchAssistantAgent:
    """Agent that researches a topic and provides a summary with sources."""

    def run(self, question: str) -> dict:
        """Run the research assistant agent."""
        results = web_search(question, num_results=5)
        summary = synthesize_findings(question, results)
        return {
            "summary": summary,
            "sources": [r["url"] for r in results],
        }


# TESTS


@tenro
def test_research_agent_finds_and_summarizes():
    """Test that agent searches and synthesizes results."""
    # Control what tools and LLMs return
    tool.simulate(
        web_search,
        result=[
            {
                "title": "AI Trends 2025",
                "snippet": "AI agents are...",
                "url": "https://example.com/1",
            },
            {"title": "Future of AI", "snippet": "Agentic AI...", "url": "https://example.com/2"},
        ],
    )
    llm.simulate(
        Provider.OPENAI,
        response="AI agents are becoming the dominant paradigm in 2025.",
    )

    # Run the agent
    ResearchAssistantAgent().run("What are the AI trends in 2025?")

    # Verify behavior
    tool.verify_many(web_search, count=1)
    llm.verify_many(Provider.OPENAI, at_least=1)


@tenro
def test_research_agent_handles_no_results():
    """Test agent behavior when search returns nothing."""
    # Simulate empty search results
    tool.simulate(web_search, result=[])
    llm.simulate(
        Provider.OPENAI,
        response="I couldn't find relevant information on this topic.",
    )

    # Run the agent
    ResearchAssistantAgent().run("Very obscure topic")

    # Verify graceful fallback
    tool.verify_many(web_search, count=1)
    llm.verify(output_contains="couldn't find")
```
### Code review agent

```python
"""Intermediate example: Code review agent.

Tests an agent that reviews pull requests and suggests improvements.
"""

from __future__ import annotations

from tenro import Provider, link_agent, link_llm, link_tool
from tenro.simulate import llm, tool
from tenro.testing import tenro

# APPLICATION CODE


@link_tool("fetch_pr_diff")
def fetch_pr_diff(pr_url: str) -> str:
    """Fetch the diff from a pull request."""
    return "diff --git a/main.py..."


@link_tool("post_review_comment")
def post_review_comment(pr_url: str, comment: str) -> bool:
    """Post a review comment on the PR."""
    return True


@link_llm(Provider.OPENAI)
def analyze_code(diff: str) -> dict:
    """Analyze code changes for issues and improvements."""
    import openai

    response = openai.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": f"Review this code diff:\n{diff}"}],
    )
    return {"review": response.choices[0].message.content}


@link_agent("CodeReviewAgent")
class CodeReviewAgent:
    """Agent that reviews PRs and posts feedback."""

    def run(self, pr_url: str) -> dict:
        """Run the code review agent."""
        diff = fetch_pr_diff(pr_url)
        analysis = analyze_code(diff)
        post_review_comment(pr_url, analysis["review"])
        return analysis


# TESTS


@tenro
def test_code_review_agent_finds_security_issue():
    """Test that agent identifies code issues."""
    # Control what tools and LLMs return
    tool.simulate(
        fetch_pr_diff,
        result="+ def process(data):\n+ eval(data) # dangerous!",
    )
    tool.simulate(post_review_comment, result=True)

    # Simulate at HTTP level - response becomes the message content
    llm.simulate(
        Provider.OPENAI,
        response="Security issue: Using eval() is dangerous. Use ast.literal_eval().",
    )

    # Run the agent
    CodeReviewAgent().run("https://github.com/org/repo/pull/123")

    # Verify issue was detected
    tool.verify_many(fetch_pr_diff, count=1)
    tool.verify_many(post_review_comment, count=1)
    llm.verify(output_contains="Security issue")


@tenro
def test_code_review_agent_approves_clean_code():
    """Test agent with well-formatted code."""
    # Simulate well-formatted PR
    tool.simulate(
        fetch_pr_diff,
        result="+ def add(a: int, b: int) -> int:\n+ return a + b",
    )
    tool.simulate(post_review_comment, result=True)
    llm.simulate(
        Provider.OPENAI,
        response="LGTM! Good implementation with type hints.",
    )

    # Run the agent
    CodeReviewAgent().run("https://github.com/org/repo/pull/456")

    # Verify approval given
    tool.verify_many(fetch_pr_diff, count=1)
    llm.verify(output_contains="LGTM")
```
## Key patterns

### Agentic loop (LLM calls tool)

When the LLM decides to call a tool, then responds with the result:

```python
from tenro import Provider, ToolCall
from tenro.simulate import llm, tool

# Assuming web_search is defined with @link_tool("web_search")

# 1. Set up tool result (use function reference)
tool.simulate(web_search, result={"title": "AI Trends", "content": "..."})

# 2. Set up LLM responses: first triggers tool, second is final response
llm.simulate(Provider.OPENAI, responses=[
    ToolCall(web_search, query="AI trends 2025"),
    "Based on my research, AI trends in 2025 include...",
])
```
### Using `@link_llm` for tracing

```python
from tenro import link_llm, Provider


@link_llm(Provider.OPENAI)
def call_gpt(prompt: str) -> str:
    """Direct LLM call - you control this."""
    import openai

    response = openai.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content
```
### Targeted simulation

```python
from tenro import Provider
from tenro.simulate import llm

# Route simulation to a specific @link_llm function
llm.simulate(
    Provider.OPENAI,
    target=call_gpt,
    response="Expected answer",
)
```
### When to use `@link_llm`
| Scenario | Use @link_llm? |
|---|---|
| Single LLM function, httpx SDK | Optional (HTTP interception works) |
| Multiple LLM functions | Recommended (for targeted simulation) |
| Non-httpx SDK | Required (HTTP interception unavailable) |
| Complex agent with many LLM calls | Recommended (for tracing) |
## See also

- Testing patterns - Common testing recipes
- How Tenro works - HTTP interception explained
- API reference: linking - Decorator documentation