Your agent can talk, but it cannot act until you give it tools. The quickstart showed how to wire a simple read_file tool into the agent loop. This guide goes deeper: we will build a database query tool from scratch, making every design decision explicit along the way. By the end, you will know how to define a schema the model can reason about, choose a concurrency class that keeps dispatch safe, set behavioral flags for cross-cutting concerns, and register the finished tool so the dispatcher can find it.
The running example is a query_database tool that accepts a SQL query and returns rows. We chose this because it hits every interesting design surface: the schema needs careful descriptions, the concurrency class depends on whether the query mutates state, the tool needs permission checks, and the result can be large enough to require size limits.
Define the Schema
The schema is the interface between the model and your tool. The model never sees your implementation code. It sees the schema (the name, description, and typed parameters) and uses that to decide whether to call the tool and what arguments to pass. This makes schema design the single highest-leverage activity in tool building.
A schema has three parts: a name the model can reference, a description that explains what the tool does and when to use it, and a set of typed parameters with their own descriptions.
The following defines the schema for our database query tool:
tool_schema = {
    name: "query_database",
    description: "Execute a read-only SQL query against the application database and return matching rows. Use this when you need to look up data, check record counts, or verify database state. Do not use for INSERT, UPDATE, or DELETE operations. Use write_database for mutations.",
    parameters: {
        query: {
            type: "string",
            description: "A read-only SQL SELECT statement. Must not contain INSERT, UPDATE, DELETE, DROP, or ALTER."
        },
        max_rows: {
            type: "integer",
            description: "Maximum number of rows to return. Defaults to 100. Use a lower value when you only need to check existence.",
            default: 100
        }
    }
}

Three things to notice about this schema:
The description says when to use it and when not to. "Use this when you need to look up data" gives the model positive guidance. "Do not use for INSERT, UPDATE, or DELETE" gives it a clear boundary. Models make better tool selection decisions when the description includes both cases.
Parameter descriptions carry constraints. The query parameter description says "Must not contain INSERT, UPDATE, DELETE, DROP, or ALTER." This is not enforced by the schema itself. It is guidance for the model. The actual enforcement happens in validation (next section). But stating the constraint in the description means the model is less likely to generate a violating input in the first place.
The max_rows parameter has a default. This means the model can call query_database(query="SELECT * FROM users") without specifying max_rows, and the system fills in 100. Defaults reduce the number of decisions the model has to make per call, which reduces error rates.
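How a default gets applied is worth making concrete. The following is a minimal sketch of the fill-in step, assuming the schema parameters are held as plain dicts; `apply_defaults` is a hypothetical helper, not part of the guide's API:

```python
def apply_defaults(args: dict, parameters: dict) -> dict:
    """Fill in schema defaults for any parameter the model omitted."""
    filled = dict(args)
    for name, spec in parameters.items():
        if name not in filled and "default" in spec:
            filled[name] = spec["default"]
    return filled

# The max_rows default from the schema above, expressed as plain dicts
parameters = {
    "query": {"type": "string"},
    "max_rows": {"type": "integer", "default": 100},
}
filled = apply_defaults({"query": "SELECT * FROM users"}, parameters)
# filled now contains max_rows=100 even though the model never sent it
```

If the model does pass a value, the helper leaves it untouched; defaults only fill gaps.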
Tip: Schema descriptions matter more than names. The model reads the description to decide when to call your tool, so invest your design effort there. A tool named qdb with a great description outperforms a tool named query_database with a vague one.
Choose a Concurrency Class
The concurrency class tells the dispatcher whether it is safe to run your tool in parallel with other tools. When the model requests multiple tool calls in a single response, the dispatcher uses concurrency classes to decide which calls can run simultaneously and which must be serialized.
There are three classes:
- READ_ONLY: The tool only reads. Multiple instances can run concurrently without interference. File searches, API lookups, and database SELECT queries are read-only.
- WRITE_EXCLUSIVE: The tool writes to shared state. It must run serially, meaning no other tool runs at the same time. File writes, database mutations, and email sends are write-exclusive.
- UNSAFE: The tool has side effects that are hard to bound. It runs in isolation, often in a subprocess or sandbox. Arbitrary shell execution is the canonical example.
For our query_database tool, the class is READ_ONLY because it executes SELECT statements and does not modify state.
But here is a subtlety: concurrency class can depend on the input. A more general execute_sql tool that accepts any SQL statement cannot be statically classified. A SELECT is read-only, an UPDATE is write-exclusive, and a DROP TABLE is unsafe. In that case, the tool implements a runtime check:
function is_concurrency_safe(parsed_input: QueryInput) -> bool:
    query_upper = parsed_input.query.strip().upper()
    if query_upper.startswith("SELECT"):
        return True
    return False  # conservative: anything non-SELECT serializes

The dispatcher calls this function with the parsed input before deciding on parallel execution. If the function throws (because the input failed to parse, for instance), the dispatcher defaults to False. It never optimistically assumes concurrent dispatch is safe.
For our focused query_database tool, we hardcode READ_ONLY because the schema and validation already constrain it to SELECT queries. The runtime check is for tools with broader input surfaces.
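To see how the class is consumed, here is a minimal sketch of a dispatcher deciding what may run in parallel. `dispatch_batch` and its argument shapes are illustrative, not the guide's actual API; a real dispatcher would also wrap this in the fail-safe runtime check described above:

```python
import asyncio

async def dispatch_batch(calls, impls, classes):
    """Run READ_ONLY calls concurrently, then everything else serially.

    calls:   list of (tool_name, args) pairs from one model response
    impls:   tool_name -> async implementation
    classes: tool_name -> concurrency class string
    """
    results = [None] * len(calls)
    read_only = [(i, n, a) for i, (n, a) in enumerate(calls) if classes[n] == "READ_ONLY"]
    other = [(i, n, a) for i, (n, a) in enumerate(calls) if classes[n] != "READ_ONLY"]

    # Concurrent phase: all read-only calls at once
    outputs = await asyncio.gather(*(impls[n](a) for _, n, a in read_only))
    for (i, _, _), out in zip(read_only, outputs):
        results[i] = out

    # Serial phase: write-exclusive and unsafe calls, one at a time
    for i, n, a in other:
        results[i] = await impls[n](a)
    return results
```

Results come back in the order the model requested the calls, regardless of which phase executed them.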
Set Behavioral Flags
Behavioral flags declare cross-cutting concerns as data rather than embedding them in the function body. They tell the rest of the system (the permission layer, the dispatcher, the logging system) how to handle this tool without reading its implementation.
The following attaches behavioral flags to our tool:
tool_config = {
    schema: tool_schema,
    concurrency_class: "READ_ONLY",
    behavioral_flags: {
        is_destructive: False,
        requires_permission: True,
        interrupt_behavior: "block",
        max_result_size_chars: 50_000,
        timeout_seconds: 30,
        retry_on_failure: True,
        max_retries: 2
    }
}

Each flag serves a specific downstream consumer:
- is_destructive: Tells the permission system whether this tool modifies state that cannot be undone. Our query tool is not destructive. A delete_records tool would be.
- requires_permission: Tells the permission cascade to check before execution. Even read-only tools might need permission if they access sensitive data. Database queries can expose PII, so we set this to True.
- interrupt_behavior: Tells the dispatcher what to do if the user cancels mid-execution. "block" means wait for the current call to finish before stopping. "abort" would cancel immediately. Database queries are safe to let finish.
- max_result_size_chars: Caps the result size. A SELECT without a LIMIT on a large table can return megabytes. Capping at 50,000 characters prevents a single tool result from consuming the entire context window.
- timeout_seconds: Prevents a slow query from blocking the agent indefinitely. 30 seconds is generous for a database query, so adjust based on your workload.
- retry_on_failure and max_retries: Transient database errors (connection reset, lock timeout) are worth retrying. Two retries before giving up is a reasonable default.
Tip: Default to requires_permission: True for any tool you have not explicitly classified. Fail-closed is the only safe default, and you can always relax it later. See Safety and Permissions for how the permission cascade evaluates these flags.
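One way to make fail-closed the path of least resistance is to bake it into the flag container's defaults. A sketch, assuming a dataclass holds the flags; the specific default values beyond requires_permission are illustrative assumptions, not prescribed by this guide:

```python
from dataclasses import dataclass

@dataclass
class BehavioralFlags:
    """Illustrative flag container with deliberately fail-closed defaults:
    an unclassified tool gets the most restrictive treatment until someone
    reviews and relaxes each flag."""
    is_destructive: bool = True        # assume the worst until reviewed
    requires_permission: bool = True   # always ask unless explicitly relaxed
    interrupt_behavior: str = "abort"  # cancel immediately by default
    max_result_size_chars: int = 10_000
    timeout_seconds: int = 30
    retry_on_failure: bool = False
    max_retries: int = 0

# Our reviewed query tool relaxes only what it has justified:
query_flags = BehavioralFlags(
    is_destructive=False,
    interrupt_behavior="block",
    max_result_size_chars=50_000,
    retry_on_failure=True,
    max_retries=2,
)
```

With this shape, forgetting to classify a new tool costs you an extra permission prompt, not a silent security hole.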
Implement the Tool
The implementation is the function that does the actual work. It receives parsed, validated input and returns a result. The schema and behavioral flags handle everything upstream. By the time your function runs, the input has passed schema validation, semantic validation, and permission checks.
The following implements the query tool with input validation:
async function query_database_impl(input: QueryInput, context: ToolContext) -> ToolResult:
    # Semantic validation: reject mutations even if they got past schema parsing
    forbidden = ["INSERT", "UPDATE", "DELETE", "DROP", "ALTER", "TRUNCATE"]
    query_upper = input.query.strip().upper()
    for keyword in forbidden:
        if keyword in query_upper:
            return error_result(f"Rejected: query contains forbidden keyword '{keyword}'")

    # Execute the query with timeout protection
    try:
        rows = await context.database.execute(
            input.query,
            max_rows=input.max_rows,
            timeout=context.tool_config.timeout_seconds
        )
    except TimeoutError:
        return error_result(f"Query timed out after {context.tool_config.timeout_seconds} seconds")
    except ConnectionError:
        return error_result("Database connection failed")

    # Format and size-cap the result
    formatted = format_rows_as_table(rows)
    if len(formatted) > context.tool_config.max_result_size_chars:
        formatted = formatted[:context.tool_config.max_result_size_chars]
        formatted += f"\n[Truncated - {len(rows)} rows total, showing first portion]"
    return success_result(formatted)

Two implementation patterns are worth calling out:
Double validation. The semantic validation (checking for forbidden keywords) runs even though the schema description already tells the model not to send mutations. Defense in depth: the model might ignore the description, or a future schema change might weaken the constraint. Never rely on the model following instructions as your only enforcement layer.
Error results, not exceptions. The function returns error_result(...) instead of throwing. This is critical: the agent loop needs a tool_result message for every tool_call in the conversation history. If your tool throws an exception, the dispatcher catches it and converts it to an error result anyway. But returning error results explicitly gives you control over the error message the model sees.
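The error_result and success_result helpers can be as simple as constructors for a small result type. The ToolResult shape here is an assumption, not the guide's definition; the point is that both paths return a value the loop can turn into a tool_result message:

```python
from dataclasses import dataclass

@dataclass
class ToolResult:
    content: str        # the text the model sees in the tool_result message
    is_error: bool = False

def success_result(content: str) -> ToolResult:
    return ToolResult(content=content)

def error_result(message: str) -> ToolResult:
    # Errors are values, not exceptions: the loop always gets a result back
    return ToolResult(content=message, is_error=True)
```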
Wire Into Dispatch
The final step is registering the tool so the dispatcher can find it. The registry pattern maps tool names to their configurations and implementations:
# Registry: maps tool names to configs and implementations
tool_registry = ToolRegistry()
tool_registry.register(
    name="query_database",
    config=tool_config,
    implementation=query_database_impl
)

# Dispatch: the agent loop calls this for every tool call
async function dispatch_tool(name: str, args: dict, context: ToolContext) -> ToolResult:
    tool = tool_registry.get(name)
    if tool is None:
        return error_result(f"Unknown tool: {name}")

    # Schema validation (phase 1)
    parsed = tool.config.schema.parse(args)
    if not parsed.success:
        return error_result(f"Invalid arguments: {parsed.error}")

    # Permission check (uses behavioral flags)
    if tool.config.behavioral_flags.requires_permission:
        permission = await check_permission(name, parsed.data, context)
        if permission.denied:
            return error_result(f"Permission denied: {permission.reason}")

    # Execute
    return await tool.implementation(parsed.data, context)

The dispatcher does not know anything about databases, SQL, or query formatting. It knows how to validate schemas, check permissions, and call implementations. Adding a new tool means adding a registry entry. The dispatch logic stays untouched.
In a production system, the registry also handles tool listing for the LLM (extracting schemas into the format the model API expects), dynamic tool sets (tools that appear or disappear based on context), and tool aliases for backward compatibility. See Tool System for the full registry pattern.
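As an illustration of the tool-listing step, here is a sketch that flattens plain-dict schemas into a JSON-Schema-style list. The exact wire format varies by model API, and to_api_tools is a hypothetical helper; treating parameters without a default as required is an assumption of this sketch:

```python
def to_api_tools(schemas: list[dict]) -> list[dict]:
    """Convert plain-dict schemas into a JSON-Schema-style tool list.
    Parameters without a default are treated as required."""
    tools = []
    for schema in schemas:
        params = schema["parameters"]
        tools.append({
            "name": schema["name"],
            "description": schema["description"],
            "input_schema": {
                "type": "object",
                "properties": {
                    name: {k: v for k, v in spec.items() if k != "default"}
                    for name, spec in params.items()
                },
                "required": [n for n, s in params.items() if "default" not in s],
            },
        })
    return tools

schemas = [{
    "name": "query_database",
    "description": "Execute a read-only SQL SELECT query and return rows.",
    "parameters": {
        "query": {"type": "string", "description": "A read-only SELECT statement."},
        "max_rows": {"type": "integer", "description": "Max rows.", "default": 100},
    },
}]
tools = to_api_tools(schemas)
# query is required; max_rows is optional because it has a default
```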
Putting It All Together
Here is the complete tool, from schema to registration, in one view:
# 1. Schema - what the model sees
schema = {
    name: "query_database",
    description: "Execute a read-only SQL SELECT query and return rows.",
    parameters: {
        query: { type: "string", description: "A read-only SELECT statement." },
        max_rows: { type: "integer", description: "Max rows to return.", default: 100 }
    }
}

# 2. Config - what the system sees
config = ToolConfig(
    schema=schema,
    concurrency_class="READ_ONLY",
    behavioral_flags=Flags(
        is_destructive=False,
        requires_permission=True,
        max_result_size_chars=50_000,
        timeout_seconds=30
    )
)

# 3. Implementation - what actually runs
async function query_database(input, context) -> ToolResult:
    rows = await context.database.execute(input.query, max_rows=input.max_rows)
    return success_result(format_rows_as_table(rows))

# 4. Registration - wire it in
registry.register("query_database", config, query_database)

Four decisions, four pieces of code, one tool. The schema is the interface. The config is the metadata. The implementation is the plumbing. The registration is the wiring. Every production tool follows this pattern.
Related
- Tool System. The full tool architecture: concurrency partitioning, the dispatch algorithm, two-phase validation, dynamic tool sets, and schema flattening for LLM APIs.
- Safety and Permissions. How the permission cascade evaluates requires_permission and is_destructive flags before your tool runs.
- Agent Loop. The loop that calls dispatch, and how tool results feed back into the conversation history.