Base Dialect Module

The Base Dialect module defines the core interfaces and classes for the dialect system. It provides the foundation for implementing different prompt formats for various language models.

Overview

The dialect system allows Azad to work with different prompt formats through a common interface. The key components are:

  • Dialect: An abstract base class for implementing different prompt dialects
  • DialectParser: An abstract base class for parsing responses from language models
  • PromptData: A class for representing prompt data

This module is critical for the agent's ability to communicate with different language models using their preferred formats.

Key Concepts

Dialect Interface

The Dialect abstract base class defines the interface for prompt dialects. It includes methods for:

  • Formatting system prompts with rules and tool documentation
  • Formatting message history for the language model
  • Formatting tool calls and results
  • Creating parsers for processing responses

Parser Interface

The DialectParser abstract base class defines the interface for response parsers. It includes methods for:

  • Processing streaming responses
  • Extracting tool calls and parameters
  • Emitting standardized events

Prompt Data

The PromptData class represents the data needed to format a prompt. It includes:

  • The message history
  • The task configuration
  • The tool metadata
  • The current assistant ID
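
A hedged construction sketch follows (only messages, task_config, and tool_metadata are confirmed by the source excerpts in the API reference below; any other field name would be an assumption):

# PromptData is a Pydantic BaseModel, so fields are passed as keyword arguments.
# history, config, and tools are placeholders for values built elsewhere.
prompt_data = PromptData(
    messages=history,      # List[Message]: the conversation so far
    task_config=config,    # static task configuration
    tool_metadata=tools,   # List[ToolMetadata] for the available tools
)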

Implementation Details

The Dialect class provides several default implementations that can be overridden by subclasses:

  • format_messages(): Builds the complete message list sent to the model (system prompt plus formatted history)
  • format_system_prompt(): Formats the system prompt
  • format_history(): Formats the message history
  • format_tool_docs(): Formats documentation for available tools
  • format_user_prompt(): Formats the user's custom instructions
  • format_user_agent_prompt(): Formats the user agent prompt

Subclasses must implement:

  • format_dialect_rules(): Returns a string describing the dialect's formatting rules
  • format_example(): Formats an example tool call
  • format_tool_call(): Formats a tool call
  • format_tool_result(): Formats a tool result
  • format_history_item(): Converts a message into a LiteLLM-compatible message
  • create_parser(): Creates a parser for the dialect
  • is_native_toolcalling(): Returns whether the dialect supports native tool calling

Usage Example

Here's a simplified example of how to implement a custom dialect:

class MyDialect(Dialect):
    """A simple dialect for AI agent communication."""

    def format_dialect_rules(self, prompt_data: PromptData) -> str:
        """Return the dialect rules."""
        return """
        # My Dialect Rules

        When you want to use a tool, format your request like this:

        USE_TOOL: tool_name
        PARAM1: value1
        PARAM2: value2
        END_TOOL
        """

    def format_example(self, tool_name: str, parameters: dict) -> str:
        """Format an example tool call."""
        params = "\n".join([f"{k.upper()}: {v}" for k, v in parameters.items()])
        return f"USE_TOOL: {tool_name}\n{params}\nEND_TOOL"

    def format_tool_call(self, tool_call: ToolCallPart) -> str:
        """Format a tool call."""
        params = "\n".join([f"{k.upper()}: {v}" for k, v in tool_call.args.items()])
        return f"USE_TOOL: {tool_call.tool_name}\n{params}\nEND_TOOL"

    def format_tool_result(self, tool_result: ToolResultPart) -> str:
        """Format a tool result."""
        return f"TOOL_RESULT: {tool_result.tool_name}\n{tool_result.result}\nEND_RESULT"

    def format_history_item(self, item: Message) -> Optional[dict]:
        """Convert a Message to a LiteLLM message."""
        # Implementation details...
        ...

    def create_parser(self, prompt_data: PromptData) -> DialectParser:
        """Create a parser for this dialect."""
        from .parser import MyDialectParser
        return MyDialectParser(self.config)

    def is_native_toolcalling(self) -> bool:
        """Return whether this dialect supports native tool calling."""
        return False
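
Once a dialect is implemented, it is typically used to format the outgoing prompt and to parse the model's streamed reply. Here is a minimal usage sketch (the Dialect constructor signature is not shown in this reference, so the no-argument call is an assumption):

dialect = MyDialect()

# Build the LiteLLM-ready message list (system prompt plus formatted history).
messages = dialect.format_messages(prompt_data, rules_path=None)

# Feed the model's streamed reply through the dialect's parser.
parser = dialect.create_parser(prompt_data)
for chunk in response_bytes:           # placeholder: an iterable of bytes chunks
    for event in parser.feed(chunk):   # yields standardized AINetworkEventUnion events
        handle(event)                  # placeholder event handler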

Important Considerations

When working with the Base Dialect module:

  1. Dialect Implementation: Implement all required methods when creating a new dialect.

  2. Parser Implementation: Implement a parser that can handle the dialect's format.

  3. Tool Formatting: Ensure that format_tool_call() and format_tool_result() produce output the target language model can reliably parse.

  4. Message Formatting: Ensure that format_history_item() produces LiteLLM-compatible messages.

API Reference

azad.prompts.base_dialect

Classes

PromptData

Bases: BaseModel

Functions
override_messages
override_messages(messages: List[Message])
Source code in azad/prompts/base_dialect.py
def override_messages(self, messages: List[Message]):
    deepclone = [msg.model_copy() for msg in messages]
    self.messages = deepclone
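
A short usage note: override_messages stores model_copy() clones of the given messages, so the stored list and its Message objects are distinct from the caller's:

# history is a list of Message objects built by the caller.
prompt_data.override_messages(history)
assert prompt_data.messages[0] is not history[0]  # each message was cloned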

DialectParser

Bases: ABC

Functions
feed abstractmethod
feed(data: bytes) -> list[AINetworkEventUnion]
Source code in azad/prompts/base_dialect.py
@abstractmethod
# returns iterable of messages
def feed(self, data:bytes) -> list[AINetworkEventUnion]:
    pass
feed_tool_call_delta
feed_tool_call_delta(tool_call: ChatCompletionDeltaToolCall) -> list[AINetworkEventUnion]
Source code in azad/prompts/base_dialect.py
def feed_tool_call_delta(self, tool_call: litellm.types.utils.ChatCompletionDeltaToolCall) -> list[AINetworkEventUnion]:
    return NotImplemented
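
A concrete parser only needs to implement feed. Below is a minimal sketch that buffers incoming bytes and would scan for the USE_TOOL blocks defined by the MyDialect example above; the concrete event classes inside AINetworkEventUnion are not shown in this reference, so event construction is left as a comment:

class MyDialectParser(DialectParser):
    """Sketch: accumulate bytes and emit events for complete USE_TOOL blocks."""

    def __init__(self, config):
        self.config = config
        self.buffer = b""

    def feed(self, data: bytes) -> list[AINetworkEventUnion]:
        self.buffer += data
        events: list[AINetworkEventUnion] = []
        # Scan self.buffer for complete "USE_TOOL ... END_TOOL" blocks and
        # append the corresponding standardized events here.
        return events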

DialectConfig

Bases: BaseModel

Dialect

Bases: ABC

Functions
create_parser abstractmethod
create_parser(prompt_data: PromptData) -> DialectParser
Source code in azad/prompts/base_dialect.py
@abstractmethod
def create_parser(self, prompt_data:PromptData) -> DialectParser:
    return NotImplemented
format_tool_schema
format_tool_schema(tool: ToolMetadata) -> dict

Format a tool's metadata into a JSON schema format.

Parameters:

  • tool (ToolMetadata) –

    The tool metadata to format

Returns:

  • dict

    A dictionary representing the formatted tool schema

Source code in azad/prompts/base_dialect.py
def format_tool_schema(self, tool: ToolMetadata) -> dict:
    """Format a tool's metadata into a JSON schema format.

    Args:
        tool: The tool metadata to format

    Returns:
        A dictionary representing the formatted tool schema
    """
    properties = {}
    for param_name, param_info in tool.parameters.items():
        # Extract parameter information
        if isinstance(param_info, ParameterMetadata):
            description = param_info.description
            # Determine the parameter type - we don't have explicit type info in ParameterMetadata
            # so we'll default to string
            param_type = "string"
        else:
            description = str(param_info)
            param_type = "string"

        properties[param_name] = {
            "type": param_type,
            "description": description
        }

    return {
        "name": tool.name,
        "description": tool.description,
        "input_schema": {
            "type": "object",
            "properties": properties,
            "required": tool.required_parameters,
        }
    }
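
To make the output shape concrete: for a hypothetical tool with a single required parameter (the ParameterMetadata construction in the comment is an assumption), format_tool_schema returns:

# Given a ToolMetadata roughly like:
#   name="read_file", description="Read a file from disk",
#   parameters={"path": ParameterMetadata(description="Path to the file")},
#   required_parameters=["path"]
{
    "name": "read_file",
    "description": "Read a file from disk",
    "input_schema": {
        "type": "object",
        "properties": {
            "path": {"type": "string", "description": "Path to the file"}
        },
        "required": ["path"],
    },
}
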
format_tools_schema
format_tools_schema(tools: List[ToolMetadata]) -> List[dict]

Format multiple tools' metadata into JSON schema format.

Parameters:

  • tools (List[ToolMetadata]) –

    A list of tool metadata objects to format

Returns:

  • List[dict]

    A list of dictionaries representing the formatted tool schemas

Source code in azad/prompts/base_dialect.py
def format_tools_schema(self, tools: List[ToolMetadata]) -> List[dict]:
    """Format multiple tools' metadata into JSON schema format.

    Args:
        tools: A list of tool metadata objects to format

    Returns:
        A list of dictionaries representing the formatted tool schemas
    """
    return [self.format_tool_schema(tool) for tool in tools]
inject_prompt_cache
inject_prompt_cache(messages: list[dict], prompt_data: PromptData)

Modifies the formatted messages list IN-PLACE to add cache control flags based on the model type and specific caching rules.

Source code in azad/prompts/base_dialect.py
def inject_prompt_cache(self, messages: list[dict], prompt_data: PromptData):
    """
    Modifies the formatted messages list IN-PLACE to add cache control flags
    based on the model type and specific caching rules.
    """
    # Check if caching is explicitly enabled via dynamic config first
    cache_dialect_config = getattr(prompt_data.dyanmic_task_config, 'cache_dialect', None)

    # --- Anthropic Caching Logic ---
    # Apply if cache_dialect_config is ANTHROPIC OR if it's None and model is Anthropic
    is_anthropic_model = "anthropic" in prompt_data.task_config.model_name # Or use a more specific check if needed
    should_apply_anthropic = is_anthropic_model and (cache_dialect_config == CacheDialect.ANTHROPIC or cache_dialect_config is None)

    if should_apply_anthropic:
        # This uses the original logic provided in the snippet, which applies
        # cache control to system, second-last user, and last user messages.
        print("Applying Anthropic caching rules (System, Last User, Second-Last User)...")

        last_system_index = -1
        last_user_index = -1
        second_last_user_index = -1

        for i, msg in enumerate(messages):
            if msg.get('role') == 'system': last_system_index = i
            elif msg.get('role') == 'user': second_last_user_index = last_user_index; last_user_index = i

        def apply_anthropic_cache(index, is_last_user=False):
            if index != -1 and index < len(messages):
                msg = messages[index]
                content = msg.get('content')
                if isinstance(content, str): msg['content'] = [{"type": "text", "text": content}]; content = msg['content']
                if not isinstance(content, list): print(f"    Warning: Anthropic - content not list: {type(content)}"); return
                if not content: print(f"    Warning: Anthropic - content list empty."); return

                last_part = content[-1]
                if isinstance(last_part, dict): last_part['cache_control'] = {'type': 'ephemeral'}
                else: print(f"    Warning: Anthropic - last part not dict: {type(last_part)}")

                if is_last_user and len(content) >= 2:
                     second_last_part = content[-2]
                     if isinstance(second_last_part, dict): second_last_part['cache_control'] = {'type': 'ephemeral'}
                     else: print(f"    Warning: Anthropic - second-last part not dict: {type(second_last_part)}")

        apply_anthropic_cache(last_system_index)
        apply_anthropic_cache(second_last_user_index)
        apply_anthropic_cache(last_user_index, is_last_user=True)
        return

    # --- Gemini Caching Logic ---
    is_gemini_model = "gemini" in prompt_data.task_config.model_name
    should_apply_gemini = is_gemini_model and (cache_dialect_config == CacheDialect.GEMINI or cache_dialect_config is None)

    if should_apply_gemini and prompt_data.dyanmic_task_config.enable_explicit_caching:
        print("Checking Gemini caching rules...")

        # Calculate total characters as a proxy for tokens
        total_chars = 0
        for msg in messages:
            content = msg.get('content')
            if isinstance(content, list):
                for part in content:
                    if isinstance(part, dict) and part.get('type') == 'text':
                        total_chars += len(part.get('text', ''))
            elif isinstance(content, str):
                total_chars += len(content)

        # Minimum character count approximation (4096 tokens * ~4 chars/token)
        min_chars_for_cache = 16000 # Adjusted slightly lower
        print(f"  Total characters calculated: {total_chars}")

        if total_chars < min_chars_for_cache:
            print(f"  Skipping Gemini caching: Character count ({total_chars}) is below threshold ({min_chars_for_cache}).")
            return # Not enough content to warrant caching

        # check if includes at least one assistant message
        has_assistant_message = any(msg.get('role') == 'assistant' for msg in messages)

        if not has_assistant_message:
            print("  Skipping Gemini caching: No assistant message found.")
            return

        print("Applying Anthropic caching rules for Gemini model (System, Last User, Second-Last User)...")

        last_system_index = -1
        last_user_index = -1
        second_last_user_index = -1

        for i, msg in enumerate(messages):
            if msg.get('role') == 'system': last_system_index = i
            elif msg.get('role') == 'user': second_last_user_index = last_user_index; last_user_index = i

        def apply_gemini_cache(index, is_last_user=False):
            if index != -1 and index < len(messages):
                msg = messages[index]
                content = msg.get('content')
                if isinstance(content, str): msg['content'] = [{"type": "text", "text": content}]; content = msg['content']
                if not isinstance(content, list): print(f"    Warning: Gemini - content not list: {type(content)}"); return
                if not content: print(f"    Warning: Gemini - content list empty."); return

                last_part = content[-1]
                if isinstance(last_part, dict): last_part['cache_control'] = {'type': 'ephemeral'}
                else: print(f"    Warning: Gemini - last part not dict: {type(last_part)}")

                if is_last_user and len(content) >= 2:
                     second_last_part = content[-2]
                     if isinstance(second_last_part, dict): second_last_part['cache_control'] = {'type': 'ephemeral'}
                     else: print(f"    Warning: Gemini - second-last part not dict: {type(second_last_part)}")

        apply_gemini_cache(last_system_index)
        apply_gemini_cache(second_last_user_index)
        apply_gemini_cache(last_user_index, is_last_user=True)

        return

    # No explicit return needed
    return
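
Concretely, when the Anthropic rules above apply, string content is first wrapped into a text part and the last part of each targeted message gains an ephemeral cache flag:

# A targeted user message before injection:
{"role": "user", "content": "Summarize the repo"}

# ...and after inject_prompt_cache:
{
    "role": "user",
    "content": [
        {
            "type": "text",
            "text": "Summarize the repo",
            "cache_control": {"type": "ephemeral"},
        }
    ],
}
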
format_messages
format_messages(prompt_data: PromptData, rules_path: Optional[str]) -> list[dict]
Source code in azad/prompts/base_dialect.py
def format_messages(self, prompt_data:PromptData, rules_path: Optional[str]) -> list[dict]:
    system_prompt = self.format_system_prompt(prompt_data, rules_path)
    history = self.format_history(prompt_data)

    # delete any messages with empty content or empty tool_calls
    history = [msg for msg in history if msg.get('content') and len(msg['content']) > 0
               or (msg.get('tool_calls') and len(msg['tool_calls']) > 0)]


    formated = [system_prompt, *history]
    formated = self._interleave_consecutive_messages(formated)

    # Check if it's a kodu provider
    is_kodu_provider = "kodu" in prompt_data.task_config.model_name

    if is_kodu_provider:
        from ..env_settings import settings
        from ..ainetwork.errors import AIInsufficientCreditsError, AIUserNotFoundError

        if settings.FLY_PROCESS_GROUP is not None and settings.DATABASE_URL is not None and settings.TURSO_AUTH_TOKEN is not None:
            user_api_key = prompt_data.dyanmic_task_config.model_api_key
            user_credits = 0
            credit_threshold = settings.CREDIT_THRESHOLD
            try:
                from ..db_models import db

                user_credits = db.get_user_credits(user_api_key)
            except Exception as e:
                print(f"Error in Kodu provider check: {str(e)}")

            if user_credits is not None:
                if user_credits < credit_threshold:
                    raise AIInsufficientCreditsError()
            else:
                raise AIUserNotFoundError()


    self.inject_prompt_cache(formated, prompt_data)


    # if we have dynamic environment details, add them to the last user message if it exists
    if prompt_data.dyanmic_task_config.dynamic_environment_details_block:
        last_user_index = -1
        for i, msg in enumerate(formated):
            if msg.get('role') == 'user':
                last_user_index = i
        if last_user_index != -1:
            # validate that the content field is array
            if isinstance(formated[last_user_index]['content'], list):
                formated[last_user_index]['content'].append({
                    "type":"text",
                    "text":prompt_data.dyanmic_task_config.dynamic_environment_details_block
                })
            else:
                # edge case this should not happen
                formated[last_user_index]['content'] = [
                    {
                        "type":"text",
                        "text":formated[last_user_index]['content']
                    },
                    {
                        "type":"text",
                        "text":prompt_data.dyanmic_task_config.dynamic_environment_details_block
                    }
                ]

    return formated
format_system_prompt
format_system_prompt(prompt_data: PromptData, rules_path: Optional[str]) -> dict

Format the system prompt for the dialect. This method replaces placeholders in the system prompt with the actual values from the prompt data. If no placeholders are present, the system prompt is assembled in the following order:

TOOL_USE_INSTRUCTION
AVAILABLE_TOOLS
USER AGENT PROMPT
SYSTEM GENERAL INSTRUCTIONS
USER CUSTOM INSTRUCTIONS

Source code in azad/prompts/base_dialect.py
def format_system_prompt(self, prompt_data:PromptData, rules_path: Optional[str]) -> dict:
    """Format the system prompt for the dialect.
    This method handles the replacement of placeholders in the system prompt
    with the actual values from the prompt data.
    If there are no placeholders, it will format the system prompt in the following way:
    TOOL_USE_INSTRUCTION
    AVAILABLE_TOOLS
    USER AGENT PROMPT
    SYSTEM GENERAL INSTRUCTIONS
    USER CUSTOM INSTRUCTIONS
    """
    system = f"""{self.format_user_agent_prompt(prompt_data)}"""
    # check if {{AVAILABLE_TOOLS}} is in the system prompt
    def is_key_in_system_prompt(key: str) -> bool:
        return key in system
    def format_key(key: str) -> str:
        return f"{{{key}}}"
    system_key = format_key(PromptTemplateType.SYSTEM_INSTRUCTION)
    available_tools_key = format_key(PromptTemplateType.AVAILABLE_TOOLS)
    tool_use_instruction_key = format_key(PromptTemplateType.TOOL_USE_INSTRUCTION)
    user_prompt_key = format_key(PromptTemplateType.USER_PROMPT)
    if is_key_in_system_prompt(system_key):
        system = system.replace(system_key, self.format_system_rules(prompt_data))
    else:
        system = f"{system}\n{self.format_system_rules(prompt_data)}"
    if is_key_in_system_prompt(available_tools_key):
        system = system.replace(available_tools_key, self.format_tool_docs(prompt_data))
    else:
        system = f"{self.format_tool_docs(prompt_data)}\n{system}"
    if is_key_in_system_prompt(tool_use_instruction_key):
        system = system.replace(tool_use_instruction_key, self.format_dialect_rules(prompt_data, rules_path))
    else:
        system = f"{self.format_dialect_rules(prompt_data, rules_path)}\n{system}"
    if is_key_in_system_prompt(user_prompt_key):
        system = system.replace(user_prompt_key, self.format_user_prompt(prompt_data))
    else:
        system = f"{system}\n{self.format_user_prompt(prompt_data)}"
    return dict(content=[{"type":"text","text": system}],role="system")
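
Assuming the PromptTemplateType values render as their member names (this reference does not show the enum's string values), a user agent prompt can position the generated sections with explicit placeholders; any placeholder that is absent falls back to the prepend/append order in the code above:

# Hypothetical user_agent_prompt template with explicit placeholders.
user_agent_prompt = """
{TOOL_USE_INSTRUCTION}
{AVAILABLE_TOOLS}
You are a focused coding assistant.
{SYSTEM_INSTRUCTION}
{USER_PROMPT}
"""
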
format_system_rules
format_system_rules(prompt_data)
Source code in azad/prompts/base_dialect.py
def format_system_rules(self, prompt_data):
    return """# You have special messages called Informational Messages that are generated by the local environment and are not part of the user's input. These messages may be visible or invisible to the user; you should observe the informational messages and take them into account when responding to the user in accordance with the task.
# Pay extra attention to your mistakes and try to self-improve by learning from them and acting better in the future.
# You should always try to be direct and clear, generating the best possible output for the user task.
# When communicating with the user about informational messages, you should always be clear and understand that informational messages are generated by the local environment and the user may not be aware of them.
# This means when talking about any details from the informational messages, you should say the environment instead of the user, so when talking about the informational content always say "the environment" instead of "the user".
"""
format_dialect_rules abstractmethod
format_dialect_rules(prompt_data: PromptData, rules_file_name: Optional[str]) -> str

Return a string describing the dialect's formatting rules.

Source code in azad/prompts/base_dialect.py
@abstractmethod
def format_dialect_rules(self, prompt_data: PromptData, rules_file_name: Optional[str]) -> str:
    """Return a string describing the dialect's formatting rules."""
    pass
format_example abstractmethod
format_example(tool_name: str, parameters: dict) -> str

Format an example tool call for the dialect.

Source code in azad/prompts/base_dialect.py
@abstractmethod
def format_example(self, tool_name: str, parameters: dict) -> str:
    """Format an example tool call for the dialect."""
    pass
format_tool_call abstractmethod
format_tool_call(tool_call: ToolCallPart) -> str

Format a tool call into the dialect's representation.

Source code in azad/prompts/base_dialect.py
@abstractmethod
def format_tool_call(self, tool_call: ToolCallPart) -> str:
    """Format a tool call into the dialect's representation."""
    pass
format_tool_result abstractmethod
format_tool_result(tool_result: ToolResultPart) -> str

Format a tool result into the dialect's representation.

Source code in azad/prompts/base_dialect.py
@abstractmethod
def format_tool_result(self, tool_result: ToolResultPart) -> str:
    """Format a tool result into the dialect's representation."""
    pass
format_history_item abstractmethod
format_history_item(item: Message) -> Optional[dict]

Convert a message into a LiteLLM-compatible message.

Source code in azad/prompts/base_dialect.py
@abstractmethod
def format_history_item(self, item: Message) -> Optional[dict]:
    """Convert a message into a LiteLLM-compatible message."""
    pass
is_native_toolcalling abstractmethod
is_native_toolcalling() -> bool | None

Return True if the dialect supports native tool calling.

Source code in azad/prompts/base_dialect.py
@abstractmethod
def is_native_toolcalling(self) -> bool | None:
    """Return True if the dialect supports native tool calling."""
    pass
format_history
format_history(prompt_data: PromptData) -> List[dict]

Format the entire message history from the mindmap.

Source code in azad/prompts/base_dialect.py
def format_history(self, prompt_data: PromptData) -> List[dict]:
    """Format the entire message history from the mindmap."""
    messages = prompt_data.messages
    return [msg for msg in (self.format_history_item(item) for item in messages) if msg is not None]
format_user_prompt
format_user_prompt(prompt_data: PromptData) -> str
Source code in azad/prompts/base_dialect.py
def format_user_prompt(self, prompt_data:PromptData) -> str:
    if not prompt_data.task_config.user_prompt:
        return ""

    return inspect.cleandoc(f"""====
        The user has provided additional instructions or details for you to use. Please understand that the user may or may not have knowledge of the overall system instructions, and this is their attempt to configure your behavior to match their needs.
        Here are the user's custom instructions:
        {prompt_data.task_config.user_prompt}
        ====
        """)
format_user_agent_prompt
format_user_agent_prompt(prompt_data: PromptData) -> str
Source code in azad/prompts/base_dialect.py
def format_user_agent_prompt(self, prompt_data:PromptData) -> str:
    if not prompt_data.task_config.user_agent_prompt:
        return ""
    # return inspect.cleandoc(f"""<agent_prompt>{prompt_data.task_config.user_agent_prompt}</agent_prompt>""")
    return prompt_data.task_config.user_agent_prompt
format_tool_docs
format_tool_docs(prompt_data: PromptData) -> str

Format documentation for available tools.

This base implementation handles the structure, while specific dialects format the individual examples.

Parameters:

  • prompt_data (PromptData) –

    The prompt data containing the tool metadata to document

Returns:

  • str

    Formatted tool documentation string

Source code in azad/prompts/base_dialect.py
def format_tool_docs(self, prompt_data:PromptData) -> str:
    """Format documentation for available tools.

    This base implementation handles the structure, while specific dialects
    format the individual examples.

    Args:
        prompt_data: The prompt data containing the tool metadata to document

    Returns:
        Formatted tool documentation string
    """

        tools: list[ToolMetadata] = prompt_data.tool_metadata
        docs = []
        for tool in tools:
            # Handle both Tool and ToolSignature objects
            params = []
            for name, param_info in tool.parameters.items():
                required = "(required)" if name in tool.required_parameters else "(optional)"
                params.append(f"- {name}: {required} {self._get_parameter_description(param_info)}") # type: ignore

            examples = []
            for i, example in enumerate(tool.examples, 1):
                formatted_example = self.format_example(tool.name, example.parameters) # type: ignore
            examples.append(f"""<tool_call_example tool_name="{tool.name}">### {example.explanation}\n> Azad Output :\n{formatted_example}\n</tool_call_example>""")

            doc = f"""<tool_doc tool_name="{tool.name}">
Tool name: {tool.name}
Tool Description: {tool.description}
Tool Input Parameters:
{"".join(params)}
Tool Usage examples:
<tool_call_examples tool_name="{tool.name}">
Here is a list of examples of how to use the tool, please read them carefully to understand how to use the tool effectively, if no instructions are provided, please use the tool as you see fit.
{"".join(examples)}
</tool_call_examples>
"""
            docs.append(doc)

        return f"""## Tool Documentation
Here are all the tools available for you to use in the task, please read the documentation carefully to understand how to use them effectively.
{"\n\n".join(docs).strip()}
"""