azad.compression.strategies Package ¶

azad.compression.strategies ¶

Compression strategies package.

This package contains implementations of different compression strategies. Each strategy is responsible for a specific method of compressing task messages.

Classes ¶

TruncationStrategy ¶

TruncationStrategy()

Bases: CompressionStrategy

Strategy that removes a subset of messages to reduce context size.

This strategy preserves the first human message and the most recent message pairs, while removing others to fit within limits. It is deterministic and replayable, producing the same output given the same input messages and configuration.

This strategy assumes that once a conversation has grown into a large context, its first half is likely irrelevant to the user's current task. Because truncated messages are dropped from the context entirely, invoke it only when absolutely necessary to fit within context limits, not as a continuous process.

This strategy respects task boundaries and only operates on messages within the current task level.
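The selection rule described above can be sketched in isolation. This is a minimal illustration, not the library's implementation: `Msg` and `select_kept_ids` are hypothetical names, and the sketch assumes that a "pair" is an assistant message immediately followed by a tool message.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass
class Msg:
    """Hypothetical stand-in for a task message (not the real Message class)."""
    id: str
    role: str  # "user", "assistant", or "tool"


def select_kept_ids(messages: List[Msg], preserve_recent_pairs_count: int) -> Set[str]:
    """Return IDs to keep: the first user message plus the N most recent pairs."""
    kept: Set[str] = set()

    # Preserve the first human message, if there is one.
    first_user = next((m for m in messages if m.role == "user"), None)
    if first_user is not None:
        kept.add(first_user.id)

    # Assumption: a pair is an assistant message directly followed by a tool message.
    pairs = [
        (a, b)
        for a, b in zip(messages, messages[1:])
        if a.role == "assistant" and b.role == "tool"
    ]

    # Keep only the most recent pairs; a count of 0 keeps none.
    count = min(preserve_recent_pairs_count, len(pairs))
    for pair in (pairs[-count:] if count > 0 else []):
        kept.update(m.id for m in pair)
    return kept


history = [
    Msg("m1", "user"),
    Msg("m2", "assistant"), Msg("m3", "tool"),
    Msg("m4", "assistant"), Msg("m5", "tool"),
    Msg("m6", "assistant"), Msg("m7", "tool"),
]
kept_ids = select_kept_ids(history, preserve_recent_pairs_count=2)
truncated_ids = [m.id for m in history if m.id not in kept_ids]
```

With two pairs preserved, only the oldest assistant-tool pair is truncated. The same input and configuration always select the same IDs, which is what makes the strategy replayable.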

Initialize the truncation strategy.

Source code in azad/compression/strategies/truncation.py

def __init__(self):
    """Initialize the truncation strategy."""
    self.logger = logging.getLogger(__name__)

Attributes ¶

logger `instance-attribute` ¶

logger = getLogger(__name__)

strategy_type `property` ¶

strategy_type: CompressionStrategyType

Get the type of this compression strategy.

Functions ¶

compress ¶

compress(task: Task, new_checkpoint: Optional[CompressionCheckpoint], config: CompressionConfig) -> List[Message]

Compress messages using the truncation strategy.

This method identifies which messages to keep and which to compress based on the truncation configuration. It preserves the first human message and the most recent message pairs based on the configuration.

This method respects task boundaries and only operates on messages within the current task level.

Parameters:

task (Task) – The task containing messages to compress.
new_checkpoint (Optional[CompressionCheckpoint]) – The checkpoint to update (None for transform only).
config (CompressionConfig) – Configuration for compression.

Returns:

List[Message] – The compressed messages list for the current task level.
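When `new_checkpoint` is None, the method replays previously recorded checkpoints instead of choosing new messages to drop. A rough sketch of that replay, using plain dictionaries in place of the real checkpoint objects (`replay_checkpoints` is a hypothetical helper; only the `compressed_message_ids` key is taken from the source listing):

```python
from typing import Dict, Iterable, List, Set


def replay_checkpoints(
    message_ids: List[str],
    checkpoints: Iterable[Dict[str, List[str]]],
) -> List[str]:
    """Return the message IDs still kept after applying recorded checkpoints."""
    compressed: Set[str] = set()
    for checkpoint in checkpoints:
        # Each checkpoint dict is assumed to carry a "compressed_message_ids" list.
        compressed.update(checkpoint.get("compressed_message_ids", []))
    # Preserve the original message order; drop anything a checkpoint compressed.
    return [mid for mid in message_ids if mid not in compressed]


ids = ["m1", "m2", "m3", "m4", "m5"]
recorded = [{"compressed_message_ids": ["m2", "m3"]}]

first = replay_checkpoints(ids, recorded)
second = replay_checkpoints(ids, recorded)
```

Replaying the same checkpoints twice yields the same list, and with no checkpoints every message is kept.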

Source code in azad/compression/strategies/truncation.py

def compress(self, task: Task, new_checkpoint: Optional[CompressionCheckpoint], config: CompressionConfig) -> List[Message]:
    """Compress messages using the truncation strategy.

    This method identifies which messages to keep and which to compress
    based on the truncation configuration. It preserves the first human message
    and the most recent message pairs based on the configuration.

    This method respects task boundaries and only operates on messages within
    the current task level.

    Args:
        task: The task containing messages to compress
        new_checkpoint: The checkpoint to update (None for transform only)
        config: Configuration for compression

    Returns:
        The compressed messages list for the current task level
    """
    # Cast config to TruncationConfig
    trunc_config = cast(TruncationConfig, config)

    # Get all messages for the current task level only
    all_messages = task.current_task_messages()

    # If no messages, nothing to do
    if not all_messages:
        return []

    # Initialize sets for tracking message IDs
    kept_message_ids = set()
    compressed_message_ids = set()

    # Check if we're creating a new checkpoint or using existing ones
    if new_checkpoint is None:
        # Transformation mode: Use existing checkpoints 
        # Look for any existing compression checkpoints in the current task level
        compression_messages = [msg for msg in all_messages if msg.role == MessageRole.compression]

        if compression_messages:
            # Use the existing checkpoints to determine which messages are compressed
            for msg in all_messages:
                # Skip compression messages
                if msg.role == MessageRole.compression:
                    continue

                # Check if this message was compressed in any checkpoint
                message_compressed = False
                for comp_msg in compression_messages:
                    for part in comp_msg.content:
                        metadata = getattr(part, "metadata", {}) or {}
                        if "checkpoint" in metadata:
                            checkpoint_data = metadata["checkpoint"]
                            if msg.id in checkpoint_data.get("compressed_message_ids", []):
                                message_compressed = True
                                compressed_message_ids.add(msg.id)
                                break
                    if message_compressed:
                        break

                # If not compressed, keep it
                if not message_compressed:
                    kept_message_ids.add(msg.id)
        else:
            # No checkpoints, keep all messages
            for msg in all_messages:
                if msg.role != MessageRole.compression:
                    kept_message_ids.add(msg.id)
    else:
        # Creating a new checkpoint: Apply the truncation strategy

        # STEP 1: Find the first human message to preserve
        first_user_message = self._find_first_user_message(all_messages)
        if first_user_message:
            kept_message_ids.add(first_user_message.id)

        # STEP 2: Find all message pairs (assistant-tool)
        all_pairs = self._find_message_pairs(all_messages)

        # STEP 3: Keep the most recent message pairs based on config
        pairs_to_keep_count = min(trunc_config.preserve_recent_pairs_count, len(all_pairs))
        recent_pairs = all_pairs[-pairs_to_keep_count:] if pairs_to_keep_count > 0 else []

        # Add message IDs from recent pairs to kept list
        for pair in recent_pairs:
            for msg in pair:
                kept_message_ids.add(msg.id)

        # STEP 4: Mark all other non-compression messages as compressed
        for msg in all_messages:
            if msg.role == MessageRole.compression:
                continue  # Skip compression messages

            # Keep any TaskEntry or TaskExit messages to preserve task boundaries
            if msg.role in (MessageRole.taskentry, MessageRole.taskexit):
                kept_message_ids.add(msg.id)
                continue

            if msg.id not in kept_message_ids:
                compressed_message_ids.add(msg.id)
                if new_checkpoint.metadata is not None:
                    new_checkpoint.metadata[msg.id] = {"reason": "truncated"}

        # STEP 5: Update checkpoint with kept and compressed message IDs
        new_checkpoint.kept_message_ids.extend(kept_message_ids)
        new_checkpoint.compressed_message_ids.extend(compressed_message_ids)

    # Log compression statistics
    total_count = len(all_messages) - sum(1 for msg in all_messages if msg.role == MessageRole.compression)
    kept_count = len(kept_message_ids)
    compressed_count = len(compressed_message_ids)

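    # Compute the ratio outside the f-string so an empty task cannot divide by zero
    kept_ratio = kept_count / total_count if total_count else 0.0
    self.logger.info(f"Truncation strategy kept {kept_count}/{total_count} messages "
                     f"({kept_ratio:.1%}) "
                     f"and compressed {compressed_count} messages")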

    # Return list of messages to keep (excluding compression messages)
    return [msg for msg in all_messages if msg.id in kept_message_ids and msg.role != MessageRole.compression]

CompactSummarizationStrategy ¶

CompactSummarizationStrategy()

Bases: CompressionStrategy

Source code in azad/compression/strategies/compact.py

def __init__(self):
    self.logger = logging.getLogger(__name__)
    self.ai_network = None

Attributes ¶

logger `instance-attribute` ¶

logger = getLogger(__name__)

ai_network `instance-attribute` ¶

ai_network = None

strategy_type `property` ¶

strategy_type: CompressionStrategyType

Functions ¶

compress `async` ¶

compress(task: Task, new_checkpoint: Optional[CompressionCheckpoint], config: Any) -> List[Message]

Called either to create/update a checkpoint (new_checkpoint != None) or to do a "transform-only" pass (new_checkpoint == None).

When new_checkpoint is None, we skip re-compression and simply reuse previously-compressed state. Otherwise, we apply normal compression, generate a summary, and create a new checkpoint. Handles context limit errors during summarization by moving messages to the kept set.
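The context-limit handling can be illustrated with a toy summarizer. This is a sketch under stated assumptions: `ContextTooLarge`, `fake_summarize`, and `summarize_with_rescue` are hypothetical names, and the toy summarizer simply fails when given more than three messages. The real method catches the provider's context-window error instead.

```python
import asyncio
from typing import List, Optional, Tuple


class ContextTooLarge(Exception):
    """Hypothetical stand-in for a provider's context-window error."""


async def fake_summarize(batch: List[str], limit: int = 3) -> str:
    """Toy summarizer that fails when the batch exceeds `limit` messages."""
    if len(batch) > limit:
        raise ContextTooLarge(f"{len(batch)} messages exceed limit {limit}")
    return f"summary of {len(batch)} messages"


async def summarize_with_rescue(
    compressible: List[str],
) -> Tuple[Optional[str], List[str]]:
    """Shrink the batch from the end until summarization succeeds."""
    batch = list(compressible)  # work on a copy
    rescued: List[str] = []
    while batch:
        try:
            return await fake_summarize(batch), rescued
        except ContextTooLarge:
            # The most recent message leaves the batch and will be kept as-is.
            rescued.append(batch.pop())
    return None, rescued  # nothing left to summarize


summary, rescued = asyncio.run(summarize_with_rescue(["a", "b", "c", "d", "e"]))
```

Here the two most recent messages are rescued before the remaining three fit, so they end up in the kept set rather than in the summary.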

Source code in azad/compression/strategies/compact.py

async def compress(self,
                   task: Task,
                   new_checkpoint: Optional[CompressionCheckpoint],
                   config: Any
                  ) -> List[Message]:
    """
    Called either to create/update a checkpoint (new_checkpoint != None)
    or to do a "transform-only" pass (new_checkpoint == None).

    When new_checkpoint is None, we skip re-compression and simply reuse
    previously-compressed state. Otherwise, we apply normal compression,
    generate a summary, and create a new checkpoint. Handles context
    limit errors during summarization by moving messages to the kept set.
    """
    # If no new checkpoint, do transform-only (reuse existing compression)
    if new_checkpoint is None:
        self.logger.info("Performing transform-only operation.")
        return self.transform_only(task)

    # Otherwise, this is an actual compression event:
    self.logger.info("Starting compression event.")
    compact_config = config
    all_messages = task.current_task_messages()
    if not all_messages:
        self.logger.info("No messages in task to compress.")
        return []

    # 1) Identify initial critical vs. compressible messages
    critical_messages, compressible_messages = self._get_critical_messages(
        all_messages, compact_config
    )

    # Keep track of messages moved due to context limits
    rescued_messages: List[Message] = []

    # 2) Try to generate a summary, handling context limits iteratively
    summary = None
    if compressible_messages:
        messages_to_summarize = list(compressible_messages) # Work on a copy
        while messages_to_summarize:
            try:
                self.logger.info(f"Attempting summary with {len(messages_to_summarize)} compressible messages.")
                summary = await self._generate_summary(messages_to_summarize, compact_config, task)
                self.logger.debug("Summary generation successful.")
                # Successfully generated summary, break the loop
                break
            except litellm.exceptions.ContextWindowExceededError as e:
                self.logger.warning(f"Context window exceeded during summary generation: {e}")
                # Move the *last* message from the current summarization list
                # to the rescued list (it will be kept)
                rescued_message = messages_to_summarize.pop()
                rescued_messages.append(rescued_message)
                self.logger.debug(f"Removed message {rescued_message.id} from summarization batch due to context limit. It will be kept. Retrying summary...")
                if not messages_to_summarize:
                    self.logger.warning("No compressible messages left after attempting to handle context limits. No summary will be generated.")
                    summary = None # Ensure summary is None if we exhausted all messages
                    break # Exit loop
            except Exception as e:
                # Handle other potential errors during summarization
                self.logger.error(f"Unexpected error generating summary: {e}", exc_info=True)
                summary = None # Failed to generate summary
                new_checkpoint.metadata["summary_error"] = str(e)
                break # Exit loop
        else:
             # This else block executes if the while loop finished *without* break
             # i.e., messages_to_summarize became empty due to context limits
             self.logger.warning("Exhausted all compressible messages trying to fit context window. No summary generated.")
             summary = None


    # 3) Update critical and compressible lists *after* potential rescues
    # Add rescued messages to the critical set
    critical_messages.extend(rescued_messages)
    # The remaining messages in messages_to_summarize are the ones actually summarized (if any)
    # But compressible_messages should reflect the final state *before* summarization attempt started
    # So, we update the final compressed_message_ids based on which ones were *not* rescued.
    original_compressible_ids = {m.id for m in compressible_messages}
    rescued_message_ids = {m.id for m in rescued_messages}
    final_compressed_message_ids = original_compressible_ids - rescued_message_ids

    # Update the final list of compressible messages (those *actually* compressed)
    final_compressible_messages = [m for m in compressible_messages if m.id in final_compressed_message_ids]

    # Get final kept message IDs
    final_kept_message_ids = {m.id for m in critical_messages}


    # 4) Update checkpoint metadata with final state
    new_checkpoint.kept_message_ids = list(final_kept_message_ids)
    new_checkpoint.compressed_message_ids = list(final_compressed_message_ids)
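    # Merge into any existing metadata so an earlier "summary_error" entry survives
    new_checkpoint.metadata = {
        **(new_checkpoint.metadata or {}),
        "compressed_count": len(final_compressed_message_ids),
        "kept_count": len(final_kept_message_ids),
        "rescued_due_to_context_limit": len(rescued_message_ids),
    }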

    # 5) Add compression message to Task if a summary was generated
    if summary and final_compressed_message_ids:
        self.logger.debug("Summary generated, adding compression message to task.")
        new_checkpoint.metadata["summary_text"] = summary
        new_checkpoint.metadata["summary_length"] = len(summary)

        try:
            # Determine start/end indices for the *original* range of compressed messages
            message_id_to_index = {msg.id: i for i, msg in enumerate(all_messages)}
            # Use original compressible_messages to define the span, even if some were rescued
            indices = [message_id_to_index.get(msg.id, -1) for msg in compressible_messages]
            valid_indices = [idx for idx in indices if idx >= 0]

            if valid_indices:
                start_idx = min(valid_indices)
                end_idx = max(valid_indices)

                task.add_compression_message(
                    start_idx=start_idx,
                    end_idx=end_idx,
                    reason="Compressed messages to maintain context",
                    strategies=[self.strategy_type.value],
                    # Report IDs actually compressed and kept
                    compressed_message_ids=list(final_compressed_message_ids),
                    kept_message_ids=list(final_kept_message_ids),
                    metadata={
                        "summary_text": summary,
                        "summary_length": len(summary),
                        # Report counts based on final state
                        "compressed_count": len(final_compressed_message_ids),
                        "initially_compressible_count": len(compressible_messages),
                        "rescued_count": len(rescued_message_ids),
                    },
                )
            else:
                self.logger.warning("Could not determine valid indices for compression message.")

        except Exception as e:
            self.logger.error(f"Error creating or adding compression message: {e}", exc_info=True)
            # Update checkpoint metadata even if task modification fails
            if "summary_error" not in new_checkpoint.metadata: # Don't overwrite previous error
                 new_checkpoint.metadata["compression_message_error"] = str(e)
    elif not summary and final_compressed_message_ids:
         # We intended to compress but failed to generate a summary (e.g. due to errors or context limits)
         self.logger.warning("Messages were identified for compression, but no summary was generated.")
         new_checkpoint.metadata["summary_text"] = f"[Failed to summarize {len(final_compressed_message_ids)} messages]"
         # Optionally add a compression message indicating failure? Depends on desired behavior.
         # For now, we don't add a compression message if summary generation failed.

    # 6) Log final stats
    # Recalculate total_count accurately based on non-compression messages in the original list
    original_non_compression_msgs = [m for m in all_messages if m.role != MessageRole.compression]
    total_count = len(original_non_compression_msgs)

    self.logger.info(
        f"Compact summarization finished: Kept {len(final_kept_message_ids)}/{total_count} original messages. "
        f"Successfully compressed {len(final_compressed_message_ids)} messages. "
        f"Rescued {len(rescued_message_ids)} messages due to context limits."
    )

    # 7) Return the final "kept" messages (original critical + rescued, excluding compression role messages)
    final_kept_messages: List[Message] = [
        msg
        for msg in all_messages # Iterate through original to preserve order somewhat
        if msg.id in final_kept_message_ids and msg.role != MessageRole.compression
    ]

    # Ensure all messages intended to be kept are included, even if order isn't perfect
    final_kept_ids_set = set(m.id for m in final_kept_messages)
    missing_kept = [m for m in critical_messages if m.id not in final_kept_ids_set]
    if missing_kept:
         self.logger.warning(f"Adding {len(missing_kept)} kept messages that were missed in primary list construction.")
         # A simple append might mess up order, but ensures they are returned
         final_kept_messages.extend(missing_kept) 

    return final_kept_messages
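The bookkeeping in steps 3 to 5 of the listing above reduces to a set difference and an index span. A small sketch with plain string IDs (`checkpoint_bookkeeping` is a hypothetical helper, not part of the package):

```python
from typing import List, Optional, Set, Tuple


def checkpoint_bookkeeping(
    all_ids: List[str],
    compressible: List[str],
    rescued: List[str],
) -> Tuple[Set[str], Optional[Tuple[int, int]]]:
    """Return the IDs actually compressed and the index span they were drawn from."""
    # Rescued messages are kept, so they are removed from the compressed set.
    final_compressed = set(compressible) - set(rescued)

    # The span covers the original compressible range, even if some were rescued.
    index_of = {mid: i for i, mid in enumerate(all_ids)}
    indices = [index_of[mid] for mid in compressible if mid in index_of]
    span = (min(indices), max(indices)) if indices else None
    return final_compressed, span


all_ids = ["m1", "m2", "m3", "m4", "m5", "m6"]
final_compressed, span = checkpoint_bookkeeping(
    all_ids, compressible=["m2", "m3", "m4"], rescued=["m4"]
)
```

The span still ends at index 3 because it is computed from the original compressible list, while the compressed set excludes the rescued message.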

transform_only ¶

transform_only(task: Task) -> List[Message]

When no new checkpoint is provided, we simply reuse whatever compression checkpoints already exist. We do NOT create or update any compression messages. We just figure out which messages are kept vs. compressed based on previously recorded checkpoints, and then return them in the order:

1) All kept messages up to the checkpoint
2) An informational summary (if one exists)
3) The remaining kept messages (after the checkpoint).
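The ordering described in this docstring can be sketched with `(id, started_ts)` tuples standing in for messages. `order_with_summary` is a hypothetical helper, and the summary is represented by a plain string marker rather than an informational message:

```python
from typing import List, Optional, Set, Tuple

# (message_id, started_ts) tuples stand in for real messages.
Entry = Tuple[str, int]


def order_with_summary(
    messages: List[Entry],
    compressed_ids: Set[str],
    summary_text: Optional[str],
) -> List[str]:
    """Lay out kept messages around a summary marker."""
    kept = sorted(
        (m for m in messages if m[0] not in compressed_ids),
        key=lambda m: m[1],
    )
    if not summary_text:
        return [mid for mid, _ in kept]

    # The checkpoint boundary is the newest compressed message's timestamp.
    compressed_ts = [ts for mid, ts in messages if mid in compressed_ids]
    boundary = max(compressed_ts) if compressed_ts else 0

    before = [mid for mid, ts in kept if ts <= boundary]
    after = [mid for mid, ts in kept if ts > boundary]
    return before + [f"<summary: {summary_text}>"] + after


timeline = [("m1", 10), ("m2", 20), ("m3", 30), ("m4", 40), ("m5", 50)]
layout = order_with_summary(timeline, {"m2", "m3"}, "earlier tool calls")
```

The summary marker sits where the compressed messages used to be, between the kept messages that precede the checkpoint and those that follow it.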

Source code in azad/compression/strategies/compact.py

def transform_only(self, task: Task) -> List[Message]:
    """
    When no new checkpoint is provided, we simply reuse whatever compression
    checkpoints already exist. We do NOT create or update any compression
    messages. We just figure out which messages are kept vs. compressed based
    on previously recorded checkpoints, and then return them in the order:

        1) All kept messages up to the checkpoint
        2) An informational summary (if one exists)
        3) The remaining kept messages (after the checkpoint).
    """
    all_messages = task.current_task_messages()
    if not all_messages:
        return []

    # Separate compression messages from normal messages.
    compression_msgs: List[Message] = [m for m in all_messages if m.role == MessageRole.compression]
    normal_msgs: List[Message] = [m for m in all_messages if m.role != MessageRole.compression]

    # If there are no compression checkpoints, keep all normal messages as-is.
    if not compression_msgs:
        self.logger.debug("[transform_only] No existing checkpoints. Keeping all messages.")
        return normal_msgs

    # Identify which messages are compressed vs. kept
    compressed_message_ids = set()
    kept_message_ids = set()

    for msg in normal_msgs:
        previously_compressed = False
        for comp_msg in compression_msgs:
            for part in comp_msg.content:
                if (
                    isinstance(part, CompressionPart) 
                    and part.compressed_message_ids
                    and msg.id in part.compressed_message_ids
                ):
                    compressed_message_ids.add(msg.id)
                    previously_compressed = True
                    break
            if previously_compressed:
                break

        if not previously_compressed:
            kept_message_ids.add(msg.id)

    total_count = len(normal_msgs)
    kept_count = len(kept_message_ids)
    compressed_count = len(compressed_message_ids)

    self.logger.debug(
        f"[transform_only] Reusing existing checkpoints. "
        f"Kept {kept_count}/{total_count} messages, compressed {compressed_count}."
    )

    # Find any summary text from the LAST compression message that includes it
    summary_text = None
    for comp_msg in reversed(compression_msgs):
        for part in comp_msg.content:
            if isinstance(part, CompressionPart) and part.metadata and isinstance(part.metadata, dict):
                st = (
                    part.metadata.get('checkpoint', {})
                                .get('metadata', {})
                                .get('summary_text')
                )
                if st:
                    summary_text = st
                    break
        if summary_text:
            break

    # Filter down to only kept messages (in chronological order)
    kept_messages = [m for m in normal_msgs if m.id in kept_message_ids]
    kept_messages.sort(key=lambda x: x.started_ts)  # or use x.id if that's guaranteed chronological

    # If there is no summary, just return the kept messages in order.
    if not summary_text:
        return kept_messages

    # Otherwise, we split at the largest (most recent) compressed message's timestamp or ID.
    # We'll use started_ts to determine "before vs. after the checkpoint."
    compressed_msg_map = {m.id: m for m in normal_msgs if m.id in compressed_message_ids}
    if compressed_msg_map:
        # Grab the message with the largest started_ts among the compressed
        latest_ts = max(msg.started_ts for msg in compressed_msg_map.values())
    else:
        # If for some reason no compressed messages had valid timestamps, treat all as post-summary
        latest_ts = 0

    pre_summary: List[Message] = []
    post_summary: List[Message] = []
    for m in kept_messages:
        if m.started_ts <= latest_ts:
            pre_summary.append(m)
        else:
            post_summary.append(m)

    # Create an informational message for the summary
    informational_part = InformationalPart(
        is_visible_ai=True,
        is_visible_ui=False,
        informational_type="summary",
        details=summary_text,
        additional_data=None
    )

    ts = int(time.time())
    new_message = InformationalMessage(
        task_id=task.id,
        started_ts=ts,
        finished_ts=ts, # Informational is instantaneous
        content=[informational_part]
    )
    # Return them in the documented order:
    # [pre-summary kept] -> [informational summary] -> [post-summary kept]
    return pre_summary + [new_message] + post_summary

Modules ¶

compact ¶

Compact Summarization compression strategy.

This module provides an AI-assisted compression strategy that preserves critical messages (like the first task messages and recent exchanges), while creating a compact summary of the rest to maintain context.
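As a rough picture of the split between critical and compressible messages, the sketch below keeps a fixed number of messages from each end of the history. The helper name `partition_messages` and the parameters `keep_first` and `keep_recent` are assumptions made for illustration. The actual selection logic lives in `_get_critical_messages`, which is not shown on this page and may differ.

```python
from typing import List, Tuple


def partition_messages(
    message_ids: List[str],
    keep_first: int,
    keep_recent: int,
) -> Tuple[List[str], List[str]]:
    """Split IDs into (critical, compressible), preserving order in both."""
    n = len(message_ids)
    # Indices of the leading and trailing messages that must be kept.
    critical_idx = set(range(min(keep_first, n)))
    critical_idx.update(range(max(n - keep_recent, 0), n))

    critical = [m for i, m in enumerate(message_ids) if i in critical_idx]
    compressible = [m for i, m in enumerate(message_ids) if i not in critical_idx]
    return critical, compressible


ids = [f"m{i}" for i in range(1, 9)]  # m1 .. m8
critical, compressible = partition_messages(ids, keep_first=2, keep_recent=3)
```

Only the middle of the conversation is handed to the summarizer. Short histories where the two ranges overlap leave nothing to compress.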

Classes ¶

CompactSummarizationStrategy ¶

CompactSummarizationStrategy()

Bases: CompressionStrategy

Source code in azad/compression/strategies/compact.py

def __init__(self):
    self.logger = logging.getLogger(__name__)
    self.ai_network = None

Attributes ¶

logger `instance-attribute` ¶

logger = getLogger(__name__)

ai_network `instance-attribute` ¶

ai_network = None

strategy_type `property` ¶

strategy_type: CompressionStrategyType

Functions ¶

compress `async` ¶

compress(task: Task, new_checkpoint: Optional[CompressionCheckpoint], config: Any) -> List[Message]

Called either to create/update a checkpoint (new_checkpoint != None) or to do a "transform-only" pass (new_checkpoint == None).

When new_checkpoint is None, we skip re-compression and simply reuse previously-compressed state. Otherwise, we apply normal compression, generate a summary, and create a new checkpoint. Handles context limit errors during summarization by moving messages to the kept set.

Source code in azad/compression/strategies/compact.py

async def compress(self,
                   task: Task,
                   new_checkpoint: Optional[CompressionCheckpoint],
                   config: Any
                  ) -> List[Message]:
    """
    Called either to create/update a checkpoint (new_checkpoint != None)
    or to do a "transform-only" pass (new_checkpoint == None).

    When new_checkpoint is None, we skip re-compression and simply reuse
    previously-compressed state. Otherwise, we apply normal compression,
    generate a summary, and create a new checkpoint. Handles context
    limit errors during summarization by moving messages to the kept set.
    """
    # If no new checkpoint, do transform-only (reuse existing compression)
    if new_checkpoint is None:
        self.logger.info("Performing transform-only operation.")
        return self.transform_only(task)

    # Otherwise, this is an actual compression event:
    self.logger.info("Starting compression event.")
    compact_config = config
    all_messages = task.current_task_messages()
    if not all_messages:
        self.logger.info("No messages in task to compress.")
        return []

    # 1) Identify initial critical vs. compressible messages
    critical_messages, compressible_messages = self._get_critical_messages(
        all_messages, compact_config
    )

    # Keep track of messages moved due to context limits
    rescued_messages: List[Message] = []

    # 2) Try to generate a summary, handling context limits iteratively
    summary = None
    if compressible_messages:
        messages_to_summarize = list(compressible_messages) # Work on a copy
        while messages_to_summarize:
            try:
                self.logger.info(f"Attempting summary with {len(messages_to_summarize)} compressible messages.")
                summary = await self._generate_summary(messages_to_summarize, compact_config, task)
                self.logger.debug("Summary generation successful.")
                # Successfully generated summary, break the loop
                break
            except litellm.exceptions.ContextWindowExceededError as e:
                self.logger.warning(f"Context window exceeded during summary generation: {e}")
                # Move the *last* message from the current summarization list
                # to the rescued list (it will be kept)
                rescued_message = messages_to_summarize.pop()
                rescued_messages.append(rescued_message)
                self.logger.debug(f"Removed message {rescued_message.id} from summarization batch due to context limit. It will be kept. Retrying summary...")
                if not messages_to_summarize:
                    self.logger.warning("No compressible messages left after attempting to handle context limits. No summary will be generated.")
                    summary = None # Ensure summary is None if we exhausted all messages
                    break # Exit loop
            except Exception as e:
                # Handle other potential errors during summarization
                self.logger.error(f"Unexpected error generating summary: {e}", exc_info=True)
                summary = None # Failed to generate summary
                new_checkpoint.metadata["summary_error"] = str(e)
                break # Exit loop
        else:
             # This else block executes if the while loop finished *without* break
             # i.e., messages_to_summarize became empty due to context limits
             self.logger.warning("Exhausted all compressible messages trying to fit context window. No summary generated.")
             summary = None


    # 3) Update critical and compressible lists *after* potential rescues
    # Add rescued messages to the critical set
    critical_messages.extend(rescued_messages)
    # The remaining messages in messages_to_summarize are the ones actually summarized (if any)
    # But compressible_messages should reflect the final state *before* summarization attempt started
    # So, we update the final compressed_message_ids based on which ones were *not* rescued.
    original_compressible_ids = {m.id for m in compressible_messages}
    rescued_message_ids = {m.id for m in rescued_messages}
    final_compressed_message_ids = original_compressible_ids - rescued_message_ids

    # Update the final list of compressible messages (those *actually* compressed)
    final_compressible_messages = [m for m in compressible_messages if m.id in final_compressed_message_ids]

    # Get final kept message IDs
    final_kept_message_ids = {m.id for m in critical_messages}


    # 4) Update checkpoint metadata with final state
    new_checkpoint.kept_message_ids = list(final_kept_message_ids)
    new_checkpoint.compressed_message_ids = list(final_compressed_message_ids)
    new_checkpoint.metadata = {
        "compressed_count": len(final_compressed_message_ids),
        "kept_count": len(final_kept_message_ids),
        "rescued_due_to_context_limit": len(rescued_message_ids),
    }

    # 5) Add compression message to Task if a summary was generated
    if summary and final_compressed_message_ids:
        self.logger.debug("Summary generated, adding compression message to task.")
        new_checkpoint.metadata["summary_text"] = summary
        new_checkpoint.metadata["summary_length"] = len(summary)

        try:
            # Determine start/end indices for the *original* range of compressed messages
            message_id_to_index = {msg.id: i for i, msg in enumerate(all_messages)}
            # Use original compressible_messages to define the span, even if some were rescued
            indices = [message_id_to_index.get(msg.id, -1) for msg in compressible_messages]
            valid_indices = [idx for idx in indices if idx >= 0]

            if valid_indices:
                start_idx = min(valid_indices)
                end_idx = max(valid_indices)

                task.add_compression_message(
                    start_idx=start_idx,
                    end_idx=end_idx,
                    reason="Compressed messages to maintain context",
                    strategies=[self.strategy_type.value],
                    # Report IDs actually compressed and kept
                    compressed_message_ids=list(final_compressed_message_ids),
                    kept_message_ids=list(final_kept_message_ids),
                    metadata={
                        "summary_text": summary,
                        "summary_length": len(summary),
                        # Report counts based on final state
                        "compressed_count": len(final_compressed_message_ids),
                        "initially_compressible_count": len(compressible_messages),
                        "rescued_count": len(rescued_message_ids),
                    },
                )
            else:
                self.logger.warning("Could not determine valid indices for compression message.")

        except Exception as e:
            self.logger.error(f"Error creating or adding compression message: {e}", exc_info=True)
            # Update checkpoint metadata even if task modification fails
            if "summary_error" not in new_checkpoint.metadata:  # Don't overwrite previous error
                new_checkpoint.metadata["compression_message_error"] = str(e)
    elif not summary and final_compressed_message_ids:
        # We intended to compress but failed to generate a summary (e.g. due to errors or context limits)
        self.logger.warning("Messages were identified for compression, but no summary was generated.")
        new_checkpoint.metadata["summary_text"] = f"[Failed to summarize {len(final_compressed_message_ids)} messages]"
        # Optionally add a compression message indicating failure? Depends on desired behavior.
        # For now, we don't add a compression message if summary generation failed.

    # 6) Log final stats
    # Recalculate total_count accurately based on non-compression messages in the original list
    original_non_compression_msgs = [m for m in all_messages if m.role != MessageRole.compression]
    total_count = len(original_non_compression_msgs)

    self.logger.info(
        f"Compact summarization finished: Kept {len(final_kept_message_ids)}/{total_count} original messages. "
        f"Successfully compressed {len(final_compressed_message_ids)} messages. "
        f"Rescued {len(rescued_message_ids)} messages due to context limits."
    )

    # 7) Return the final "kept" messages (original critical + rescued, excluding compression role messages)
    final_kept_messages: List[Message] = [
        msg
        for msg in all_messages # Iterate through original to preserve order somewhat
        if msg.id in final_kept_message_ids and msg.role != MessageRole.compression
    ]

    # Ensure all messages intended to be kept are included, even if order isn't perfect
    final_kept_ids_set = {m.id for m in final_kept_messages}
    missing_kept = [m for m in critical_messages if m.id not in final_kept_ids_set]
    if missing_kept:
        self.logger.warning(f"Adding {len(missing_kept)} kept messages that were missed in primary list construction.")
        # A simple append might mess up order, but ensures they are returned
        final_kept_messages.extend(missing_kept)

    return final_kept_messages

transform_only ¶

transform_only(task: Task) -> List[Message]

When no new checkpoint is provided, we simply reuse whatever compression checkpoints already exist. We do NOT create or update any compression messages. We just figure out which messages are kept vs. compressed based on previously recorded checkpoints, and then return them in the order:

1) All kept messages up to the checkpoint
2) An informational summary (if one exists)
3) The remaining kept messages (after the checkpoint).
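The ordering described above can be sketched with plain stand-in objects. This is a minimal illustration, not the azad implementation: `Msg` and `order_with_summary` are hypothetical names, and the checkpoint boundary is assumed to be the newest compressed message's timestamp.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class Msg:
    """Hypothetical stand-in for a message; not the azad Message class."""
    id: str
    started_ts: int
    text: str = ""


def order_with_summary(
    messages: List[Msg],
    compressed_ids: Set[str],
    summary_text: Optional[str],
) -> List[Msg]:
    # Kept messages are everything that no checkpoint marked as compressed.
    kept = sorted(
        (m for m in messages if m.id not in compressed_ids),
        key=lambda m: m.started_ts,
    )
    if not summary_text:
        return kept

    # Assumption: the boundary is the newest compressed message's timestamp.
    compressed_ts = [m.started_ts for m in messages if m.id in compressed_ids]
    boundary = max(compressed_ts) if compressed_ts else 0

    before = [m for m in kept if m.started_ts <= boundary]
    after = [m for m in kept if m.started_ts > boundary]
    summary = Msg(id="summary", started_ts=boundary, text=summary_text)
    return before + [summary] + after


history = [Msg("m1", 1), Msg("m2", 2), Msg("m3", 3), Msg("m4", 4)]
ordered = order_with_summary(history, {"m2", "m3"}, "m2 and m3 summarized")
print([m.id for m in ordered])  # ['m1', 'summary', 'm4']
```

With no summary text, the function simply returns the kept messages in chronological order.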

Source code in azad/compression/strategies/compact.py

def transform_only(self, task: Task) -> List[Message]:
    """
    When no new checkpoint is provided, we simply reuse whatever compression
    checkpoints already exist. We do NOT create or update any compression
    messages. We just figure out which messages are kept vs. compressed based
    on previously recorded checkpoints, and then return them in the order:

        1) All kept messages up to the checkpoint
        2) An informational summary (if one exists)
        3) The remaining kept messages (after the checkpoint).
    """
    all_messages = task.current_task_messages()
    if not all_messages:
        return []

    # Separate compression messages from normal messages.
    compression_msgs: List[Message] = [m for m in all_messages if m.role == MessageRole.compression]
    normal_msgs: List[Message] = [m for m in all_messages if m.role != MessageRole.compression]

    # If there are no compression checkpoints, keep all normal messages as-is.
    if not compression_msgs:
        self.logger.debug("[transform_only] No existing checkpoints. Keeping all messages.")
        return normal_msgs

    # Identify which messages are compressed vs. kept
    compressed_message_ids = set()
    kept_message_ids = set()

    for msg in normal_msgs:
        previously_compressed = False
        for comp_msg in compression_msgs:
            for part in comp_msg.content:
                if (
                    isinstance(part, CompressionPart) 
                    and part.compressed_message_ids
                    and msg.id in part.compressed_message_ids
                ):
                    compressed_message_ids.add(msg.id)
                    previously_compressed = True
                    break
            if previously_compressed:
                break

        if not previously_compressed:
            kept_message_ids.add(msg.id)

    total_count = len(normal_msgs)
    kept_count = len(kept_message_ids)
    compressed_count = len(compressed_message_ids)

    self.logger.debug(
        f"[transform_only] Reusing existing checkpoints. "
        f"Kept {kept_count}/{total_count} messages, compressed {compressed_count}."
    )

    # Find any summary text from the LAST compression message that includes it
    summary_text = None
    for comp_msg in reversed(compression_msgs):
        for part in comp_msg.content:
            if isinstance(part, CompressionPart) and part.metadata and isinstance(part.metadata, dict):
                st = (
                    part.metadata.get('checkpoint', {})
                                .get('metadata', {})
                                .get('summary_text')
                )
                if st:
                    summary_text = st
                    break
        if summary_text:
            break

    # Filter down to only kept messages (in chronological order)
    kept_messages = [m for m in normal_msgs if m.id in kept_message_ids]
    kept_messages.sort(key=lambda x: x.started_ts)  # or use x.id if that's guaranteed chronological

    # If there is no summary, just return the kept messages in order.
    if not summary_text:
        return kept_messages

    # Otherwise, we split at the largest (most recent) compressed message's timestamp or ID.
    # We'll use started_ts to determine "before vs. after the checkpoint."
    compressed_msg_map = {m.id: m for m in normal_msgs if m.id in compressed_message_ids}
    if compressed_msg_map:
        # Grab the message with the largest started_ts among the compressed
        latest_ts = max(msg.started_ts for msg in compressed_msg_map.values())
    else:
        # If for some reason no compressed messages had valid timestamps, treat all as post-summary
        latest_ts = 0

    pre_summary: List[Message] = []
    post_summary: List[Message] = []
    for m in kept_messages:
        if m.started_ts <= latest_ts:
            pre_summary.append(m)
        else:
            post_summary.append(m)

    # Create an informational message for the summary
    informational_part = InformationalPart(
        is_visible_ai=True,
        is_visible_ui=False,
        informational_type="summary",
        details=summary_text,
        additional_data=None
    )

    ts = int(time.time())
    new_message = InformationalMessage(
        task_id=task.id,
        started_ts=ts,
        finished_ts=ts, # Informational is instantaneous
        content=[informational_part]
    )
    # Return them in the documented order:
    # [pre-summary kept] -> [informational summary] -> [post-summary kept]
    return pre_summary + [new_message] + post_summary

truncation ¶

Truncation compression strategy.

This module provides a deterministic compression strategy that preserves the first human message and the most recent message pairs, removing older messages to fit within context limits.

We can't implement a dynamically updating sliding window because it would invalidate the prompt cache on every turn. To keep the benefits of caching, the conversation history must stay static, so this operation should be performed as infrequently as possible.
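To illustrate the caching argument, here is a small sketch that assumes a cache keyed on the exact leading portion of the prompt (an assumption about the provider, not something this module defines). `shared_prefix_len` and `sliding_window` are hypothetical helpers used only for the comparison.

```python
from typing import List


def shared_prefix_len(a: List[str], b: List[str]) -> int:
    """Count leading items that are identical in both prompts."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def sliding_window(history: List[str], size: int) -> List[str]:
    # Drops the oldest message on every turn once the window is full.
    return history[-size:]


turn_5 = [f"msg{i}" for i in range(5)]
turn_6 = turn_5 + ["msg5"]

# Sliding window: the first item changes each turn, so no prefix is shared.
print(shared_prefix_len(sliding_window(turn_5, 4), sliding_window(turn_6, 4)))  # 0

# Static history: the earlier prompt is a prefix of the next one.
print(shared_prefix_len(turn_5, turn_6))  # 5
```

Under that assumption, a one-off truncation pays the cache-miss cost once, whereas a sliding window pays it on every request.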

Classes ¶

TruncationStrategy ¶

TruncationStrategy()

Bases: CompressionStrategy

Strategy that removes a subset of messages to reduce context size.

This strategy preserves the first human message and the most recent message pairs, while removing others to fit within limits. It is deterministic and replayable, producing the same output given the same input messages and configuration.

When a user reaches a large context, we assume that the first half is likely irrelevant to their current task, which is why older messages are the ones removed. Even so, this strategy should only be invoked when absolutely necessary to fit within context limits, not as a continuous process.

This strategy respects task boundaries and only operates on messages within the current task level.
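The selection rule can be sketched as follows. This is an illustration under stated assumptions, not the library's code: messages are modeled as plain dicts, `select_kept_ids` is a hypothetical helper, and a "pair" is assumed to be an assistant message immediately followed by a tool message.

```python
from typing import Dict, List, Set

Message = Dict[str, str]  # hypothetical shape: {"id": ..., "role": ...}


def select_kept_ids(messages: List[Message], recent_pairs: int) -> Set[str]:
    kept: Set[str] = set()

    # Keep the first user message.
    for m in messages:
        if m["role"] == "user":
            kept.add(m["id"])
            break

    # Assumed pairing rule: an assistant message directly followed by a tool message.
    pairs = [
        (a, b)
        for a, b in zip(messages, messages[1:])
        if a["role"] == "assistant" and b["role"] == "tool"
    ]

    # Guard against recent_pairs == 0, because pairs[-0:] would return every pair.
    for a, b in (pairs[-recent_pairs:] if recent_pairs > 0 else []):
        kept.update((a["id"], b["id"]))
    return kept


roles = ["user", "assistant", "tool", "assistant", "tool", "assistant", "tool"]
msgs = [{"id": f"m{i}", "role": r} for i, r in enumerate(roles)]
kept = select_kept_ids(msgs, recent_pairs=1)
print(sorted(kept))  # ['m0', 'm5', 'm6']
```

Because the result depends only on the input messages and the pair count, repeated calls with the same input select the same IDs.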

Initialize the truncation strategy.

Source code in azad/compression/strategies/truncation.py

def __init__(self):
    """Initialize the truncation strategy."""
    self.logger = logging.getLogger(__name__)

Attributes ¶

logger `instance-attribute` ¶

logger = getLogger(__name__)

strategy_type `property` ¶

strategy_type: CompressionStrategyType

Get the type of this compression strategy.

Functions ¶

compress ¶

compress(task: Task, new_checkpoint: Optional[CompressionCheckpoint], config: CompressionConfig) -> List[Message]

Compress messages using the truncation strategy.

This method identifies which messages to keep and which to compress based on the truncation configuration. It preserves the first human message and the most recent message pairs based on the configuration.

This method respects task boundaries and only operates on messages within the current task level.

Parameters:

task (Task) – The task containing messages to compress

new_checkpoint (Optional[CompressionCheckpoint]) – The checkpoint to update (None for transform only)

config (CompressionConfig) – Configuration for compression

Returns:

List[Message] – The compressed messages list for the current task level

Source code in azad/compression/strategies/truncation.py

def compress(self, task: Task, new_checkpoint: Optional[CompressionCheckpoint], config: CompressionConfig) -> List[Message]:
    """Compress messages using the truncation strategy.

    This method identifies which messages to keep and which to compress
    based on the truncation configuration. It preserves the first human message
    and the most recent message pairs based on the configuration.

    This method respects task boundaries and only operates on messages within
    the current task level.

    Args:
        task: The task containing messages to compress
        new_checkpoint: The checkpoint to update (None for transform only)
        config: Configuration for compression

    Returns:
        The compressed messages list for the current task level
    """
    # Cast config to TruncationConfig
    trunc_config = cast(TruncationConfig, config)

    # Get all messages for the current task level only
    all_messages = task.current_task_messages()

    # If no messages, nothing to do
    if not all_messages:
        return []

    # Initialize sets for tracking message IDs
    kept_message_ids = set()
    compressed_message_ids = set()

    # Check if we're creating a new checkpoint or using existing ones
    if new_checkpoint is None:
        # Transformation mode: Use existing checkpoints 
        # Look for any existing compression checkpoints in the current task level
        compression_messages = [msg for msg in all_messages if msg.role == MessageRole.compression]

        if compression_messages:
            # Check every existing checkpoint to determine which messages are compressed
            for msg in all_messages:
                # Skip compression messages
                if msg.role == MessageRole.compression:
                    continue

                # Check if this message was compressed in any checkpoint
                message_compressed = False
                for comp_msg in compression_messages:
                    for part in comp_msg.content:
                        metadata = getattr(part, "metadata", {}) or {}
                        if "checkpoint" in metadata:
                            checkpoint_data = metadata["checkpoint"]
                            if msg.id in checkpoint_data.get("compressed_message_ids", []):
                                message_compressed = True
                                compressed_message_ids.add(msg.id)
                                break
                    if message_compressed:
                        break

                # If not compressed, keep it
                if not message_compressed:
                    kept_message_ids.add(msg.id)
        else:
            # No checkpoints, keep all messages
            for msg in all_messages:
                if msg.role != MessageRole.compression:
                    kept_message_ids.add(msg.id)
    else:
        # Creating a new checkpoint: Apply the truncation strategy

        # STEP 1: Find the first human message to preserve
        first_user_message = self._find_first_user_message(all_messages)
        if first_user_message:
            kept_message_ids.add(first_user_message.id)

        # STEP 2: Find all message pairs (assistant-tool)
        all_pairs = self._find_message_pairs(all_messages)

        # STEP 3: Keep the most recent message pairs based on config
        pairs_to_keep_count = min(trunc_config.preserve_recent_pairs_count, len(all_pairs))
        recent_pairs = all_pairs[-pairs_to_keep_count:] if pairs_to_keep_count > 0 else []

        # Add message IDs from recent pairs to kept list
        for pair in recent_pairs:
            for msg in pair:
                kept_message_ids.add(msg.id)

        # STEP 4: Mark all other non-compression messages as compressed
        for msg in all_messages:
            if msg.role == MessageRole.compression:
                continue  # Skip compression messages

            # Keep any TaskEntry or TaskExit messages to preserve task boundaries
            if msg.role in (MessageRole.taskentry, MessageRole.taskexit):
                kept_message_ids.add(msg.id)
                continue

            if msg.id not in kept_message_ids:
                compressed_message_ids.add(msg.id)
                if new_checkpoint.metadata is not None:
                    new_checkpoint.metadata[msg.id] = {"reason": "truncated"}

        # STEP 5: Update checkpoint with kept and compressed message IDs
        new_checkpoint.kept_message_ids.extend(kept_message_ids)
        new_checkpoint.compressed_message_ids.extend(compressed_message_ids)

    # Log compression statistics
    total_count = len(all_messages) - sum(1 for msg in all_messages if msg.role == MessageRole.compression)
    kept_count = len(kept_message_ids)
    compressed_count = len(compressed_message_ids)

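    # Compute the ratio up front so an empty message list cannot divide by zero
    kept_ratio = kept_count / total_count if total_count else 0.0
    self.logger.info(f"Truncation strategy kept {kept_count}/{total_count} messages "
                     f"({kept_ratio:.1%}) "
                     f"and compressed {compressed_count} messages")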

    # Return list of messages to keep (excluding compression messages)
    return [msg for msg in all_messages if msg.id in kept_message_ids and msg.role != MessageRole.compression]
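The transformation-mode branch in the listing above rescans every compression message for each normal message. As a possible alternative, the compressed IDs could be gathered into a set once and then used as a filter. The sketch below assumes checkpoint metadata shaped like the dicts read in that branch; `collect_compressed_ids` and `split_kept` are hypothetical helpers, not part of azad.

```python
from typing import Any, Dict, Iterable, List, Set


def collect_compressed_ids(part_metadatas: Iterable[Dict[str, Any]]) -> Set[str]:
    """Union the compressed IDs recorded across all checkpoint metadata dicts."""
    ids: Set[str] = set()
    for metadata in part_metadatas:
        checkpoint = (metadata or {}).get("checkpoint") or {}
        ids.update(checkpoint.get("compressed_message_ids", []))
    return ids


def split_kept(message_ids: List[str], compressed: Set[str]) -> List[str]:
    # Preserve the original order of the messages that were not compressed.
    return [mid for mid in message_ids if mid not in compressed]


parts = [
    {"checkpoint": {"compressed_message_ids": ["m2", "m3"]}},
    {"checkpoint": {"compressed_message_ids": ["m4"]}},
    {},  # a part without checkpoint data is ignored
]
compressed = collect_compressed_ids(parts)
print(split_kept(["m1", "m2", "m3", "m4", "m5"], compressed))  # ['m1', 'm5']
```

This selects the same messages as the nested loops while doing a single pass over the checkpoints.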