Porting CTIgor to AutoGen
In my earlier blog “CTIgor: An Agentic CTI Assistant”, I demonstrated building a basic agent for consuming CTI reporting and generating summaries of it. That implementation was built on the Semantic Kernel framework. However, Microsoft R&D also maintains another framework, named AutoGen, which is centered around rapid development of agents and multi-agent systems and also supports a broad set of tool extensions. In this post, I will discuss my experience porting the earlier project over to AutoGen, and compare and contrast the two frameworks.
Introduction
If you haven’t already read my earlier post, it is recommended you do so, or you’ll lack necessary context for this entry:
- CTIgor: An Agentic CTI Assistant
- GitHub project: https://github.com/ckane/ctigor
The AutoGen framework is another GenAI framework, designed at a higher level of abstraction than Semantic Kernel. I decided to port CTIgor over to AutoGen for multiple reasons. One was simply to learn another framework for GenAI Agent development: as I had already become familiar with Semantic Kernel, refactoring the code to perform the same functionality using AutoGen’s API would be an informative way to learn a new framework. Additionally, AutoGen had been recommended to me because it offers a more concise API for rapid development, along with a large library of abstractions that can help in building systems with functional integrations.
Some of the extensions offered by AutoGen include:
- Model clients for a variety of back ends, such as `AzureOpenAIChatCompletionClient` and `OllamaChatCompletionClient` in `autogen_ext.models`
- Tool abstractions, such as `FunctionTool` in `autogen_core.tools`
- User-interface helpers, such as the `Console` interface in `autogen_agentchat.ui`
Observations
I am maintaining (and plan to continue maintaining) a fork of the project that works on Semantic Kernel, in addition to the new AutoGen-targeted implementation. Semantic Kernel has been adding functionality that implements some of the features AutoGen used to uniquely offer, such as abstract Agent object implementations. In addition, Semantic Kernel is considered by Microsoft to be a production-ready framework (version 1.27.1 at the time of this writing), whereas AutoGen (version 0.5.1) is still considered a pre-release research effort, with an API and feature set that are under active development and subject to change. AutoGen has a lot of extensions and powerful features that I’d like to take advantage of for this open-source R&D project, but these may present challenges when building a production, customer-facing product, where some level of stable behavior and support commitment is desirable.
As well, the historic design principles behind Semantic Kernel appear to have been geared toward incorporating LLM-utilizing features into larger applications, whereas AutoGen appears more geared toward making GenAI agents the product or application themselves. That said, some newer features in Semantic Kernel implement a lot of Agent-focused capabilities, so it may develop into a larger framework in which today’s AutoGen functionality is also implemented completely within Semantic Kernel. For example, the `semantic-kernel.agents` package now implements a lot of the features offered by the `AssistantAgent` and related classes from AutoGen.
For this reason, as stated earlier, I do plan to continue to maintain both forks in parallel (where possible), but cutting-edge features do seem to largely end up in AutoGen first, so that will be my R&D focus.
Refactoring the Tool Functions
A big difference between the two frameworks is how Tool Functions are implemented and connected to the system. While both frameworks make use of `typing.Annotated` to help explain the purpose of function inputs and outputs to the LLM, Semantic Kernel employs class objects to declare and organize related tool functions, and function decorators to provide additional metadata to the LLM explaining the purpose and name of each function.
For example:
```python
from random import randint
from typing import Annotated

from semantic_kernel.functions import kernel_function

class RandomNumberPlugin:
    """Generates a Random Number"""

    @kernel_function(
        name='gen_random',
        description="Generate a random number given a low and high bound"
    )
    async def gen_random(
        self,
        low: Annotated[int, "Lower bound of the random number"],
        high: Annotated[int, "Upper bound of the random number"]
    ) -> Annotated[int, "Generated random number within the bounds"]:
        print(f"Running gen_random with low={low}, high={high}")
        return randint(low, high)
```
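For completeness, registering a plugin class like this with the Semantic Kernel `Kernel` looks roughly like the following sketch (the `plugin_name` value here is just an illustrative choice):

```python
from semantic_kernel import Kernel

kernel = Kernel()

# Register the plugin; the @kernel_function decorators expose its methods as tools
kernel.add_plugin(RandomNumberPlugin(), plugin_name="random")
```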
AutoGen, on the other hand, allows Python functions to be provided directly to the framework, and will utilize the function docstring, a Python-native feature, to inform the LLM about the new tool. As mentioned, the `typing.Annotated` markups will also continue to be leveraged where they’re used (which I strongly recommend):
```python
from random import randint
from typing import Annotated

async def gen_random(
    low: Annotated[int, "Lower bound of the random number"],
    high: Annotated[int, "Upper bound of the random number"]
) -> Annotated[int, "Generated random number within the bounds"]:
    """Generate a random number given a low and high bound"""
    print(f"Running gen_random with low={low}, high={high}")
    return randint(low, high)
```
This is a much more concise implementation for the functions. Furthermore, registering the functions with an agent developed in AutoGen looks like this:
```python
# Instantiate the CTI Agent
self.agent = AssistantAgent(
    name="ctigor",
    model_client=self.chat_service,
    # Register the tools to use (will auto-parse docstrings and Annotated metadata)
    tools=[gen_random, load_from_web, load_text_file],
    reflect_on_tool_use=True,
)
```
We now have a replacement implementation for `gen_random` above. The other two tool functions defined in the prior blog post can be refactored as follows; providing them as tools available to the LLM Agent is already included in the registration example above.
```python
import aiohttp
import html2text

async def load_from_web(
    url: Annotated[str, "URL to read from the web into markdown content"]
) -> Annotated[str, "The contents from the site, formatted as Markdown"]:
    """Given a URL, convert the page to markdown text and return it as a string"""
    async with aiohttp.ClientSession() as session:
        resphtml = await session.get(url)
        async with resphtml:
            # Convert the fetched HTML into Markdown text
            return html2text.html2text(await resphtml.text())
```
```python
async def load_text_file(
    file_name: Annotated[str, "The name and path of the file on disk to return the text contents of"]
) -> Annotated[bytes, "The contents from the file"]:
    """Load a file from disk, given a filename. Returns a bytestring of the file contents."""
    with open(file_name, "rb") as txtfile:
        return txtfile.read()
```
One benefit to this approach is that the implementation of these functions no longer depends upon any symbols imported from the framework. This enables broader reuse of these utility functions across a larger set of applications, and it allows tool functions to be imported from external packages, so long as they employ the native Python code-documentation tools to annotate the code. It cuts down the verbosity attributable to the Agent framework considerably, as the sketch below illustrates.
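As a minimal sketch of that reuse (assuming the tool functions live in the repository’s `ctiagent_functions.py`; the file name `report.txt` is just an illustrative placeholder), the same functions can be called from a plain `asyncio` script with no agent framework imported at all:

```python
import asyncio

# No AutoGen or Semantic Kernel imports are required to reuse the tools
from ctiagent_functions import gen_random, load_text_file

async def main():
    # Call the tool functions directly, outside of any agent
    number = await gen_random(1, 100)
    contents = await load_text_file("report.txt")  # illustrative file name
    print(f"Random number: {number}, file length: {len(contents)} bytes")

asyncio.run(main())
```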
AutoGen has us import the tool functions into `ctiagent.py` and explicitly attach them to the AutoGen Agent instance during construction, rather than this being an operation that occurs behind the scenes as a side-effect of the decorators in Semantic Kernel.
Adapting the CTIgor Code
The following imports replace all of the `semantic_kernel.*` imports in the `ctiagent.py` code, bringing in the functionality that will be used in the refactor:
```python
from autogen_ext.models.openai import AzureOpenAIChatCompletionClient
from autogen_ext.models.ollama import OllamaChatCompletionClient
from autogen_agentchat.agents import AssistantAgent
from autogen_agentchat.messages import TextMessage
from autogen_agentchat.ui import Console as AgentConsole
from autogen_core.tools import FunctionTool
from autogen_core import CancellationToken
```
The AutoGen framework provides an `AzureOpenAIChatCompletionClient`, which fills a role similar to the `AzureOpenAIChatCompletion` from Semantic Kernel, and has a similar, yet not identical, API and constructor. The agent’s `chat_service` is constructed from this:
```python
self.chat_service = AzureOpenAIChatCompletionClient(
    azure_deployment=local_settings.deployment,
    api_key=local_settings.azure_api_key,
    azure_endpoint=local_settings.endpoint,
    api_version="2024-10-21",
    model='gpt-4o-mini',
)
```
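Since the imports above also pull in `OllamaChatCompletionClient`, a locally hosted model could be swapped in with a similar constructor. This is a minimal sketch; the model name `llama3.1` is just an illustrative placeholder:

```python
# Hypothetical alternative: a locally hosted Ollama model instead of Azure OpenAI
self.chat_service = OllamaChatCompletionClient(
    model="llama3.1",  # illustrative model name
)
```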
A significant difference is that, in the Semantic Kernel version, the `CTIgor` instance maintained, and had to explicitly update, a `ChatHistory` object throughout the conversation with the analyst user. In AutoGen, this behavior is built into the `AssistantAgent`, and all of the code that was implemented in `CTIgor` to manage updates to the `ChatHistory` can be removed. Instantiating the `AssistantAgent` looks like this (repeated from above) in the `__init__` method of `CTIgor`:
```python
# Instantiate the CTI Agent
self.agent = AssistantAgent(
    name="ctigor",
    model_client=self.chat_service,
    # Register the tools to use (will auto-parse docstrings and Annotated metadata)
    tools=[gen_random, load_from_web, load_text_file],
    reflect_on_tool_use=True,
)
```
Then, the `CTIgor.prompt` code can be simplified, as the code managing updates to the `ChatHistory` now lives within the `AssistantAgent.on_messages` method:
```python
async def prompt(self, input_prompt: str):
    # Prompt the model with the given input + state, waiting for response
    response = await self.agent.on_messages(
        [TextMessage(content=input_prompt, source="user")],
        CancellationToken()
    )

    # Ensure response isn't None
    assert response is not None

    # Strip the ending TERMINATE message that's part of AutoGen's internals
    text_response = response.chat_message.content
    if text_response.endswith("TERMINATE"):
        text_response = text_response[:-len("TERMINATE")]
    return text_response
```
The `on_messages` call now passes the `input_prompt` to the LLM, saves it in the history, and waits (with some caveats explained below) for the result(s) from the LLM, which are also saved to the history before being returned to the caller.
One key item to keep in mind here is that the `on_messages` method is designed to be “steppable”, meaning that it may return to the caller after intermediate steps during the course of the GenAI operations. Due to this, AutoGen implements a feature called a “Termination Message”: a string that is tacked onto the end of a message to indicate that it contains the final message of an exchange with the LLM. In the above, the default `TERMINATE` is used.
Additionally, the `prompt` method above always returns the result (even intermediate ones) to the caller. Thus, an intermediate message will be displayed, and the end user can simply hit the ENTER key in order to step through them. An alternative would be to implement a loop here that reissues the `on_messages(...)` call until the termination message is encountered, displaying any intermediate returned messages to the user, as sketched below.
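A minimal sketch of that looping variant follows. This is not the repository’s implementation: the `max_steps` bound and the “continue” follow-up message are hypothetical choices for illustration.

```python
async def prompt_stepped(self, input_prompt: str, max_steps: int = 10):
    """Sketch: reissue on_messages until the TERMINATE marker appears."""
    message = TextMessage(content=input_prompt, source="user")
    text_response = ""
    for _ in range(max_steps):  # hypothetical safety bound on iterations
        response = await self.agent.on_messages([message], CancellationToken())
        text_response = response.chat_message.content
        if text_response.endswith("TERMINATE"):
            # Final message of the exchange: strip the marker and return
            return text_response[:-len("TERMINATE")]
        # Display the intermediate message, then ask the agent to continue
        print(text_response)
        message = TextMessage(content="continue", source="user")  # hypothetical follow-up
    return text_response
```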
Additionally, AutoGen allows creation of `CancellationToken()` instances that can be used in asynchronous code to prematurely halt and exit from pending operations. In the above example, a new `CancellationToken` is generated with each call to `prompt`, but this value can also be reused across multiple parallel tasks, such that all of them can be cancelled if the calling system or end user wishes. Using tokens this way isn’t implemented in this initial proof of concept, but it would allow us to provide a UI where the user can continue making requests to the system while having an opportunity to cancel pending LLM operations, as well.
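As a rough sketch of that pattern, the following assumes a hypothetical variant of `CTIgor.prompt` that accepts a token argument; the prompts and the `run_with_cancellation` helper are likewise illustrative:

```python
import asyncio

from autogen_core import CancellationToken

async def run_with_cancellation(ctigor):
    # One token shared across several in-flight prompts (hypothetical usage)
    shared_token = CancellationToken()

    tasks = [
        asyncio.create_task(ctigor.prompt("Summarize report A", shared_token)),
        asyncio.create_task(ctigor.prompt("Summarize report B", shared_token)),
    ]

    # Later, e.g. in response to a UI event, cancel everything at once
    shared_token.cancel()

    # Cancelled operations surface as exceptions; gather them for inspection
    return await asyncio.gather(*tasks, return_exceptions=True)
```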
Summarizer Tool
The `ctigor_summarizer.py` script can be preserved identically across both versions, as all of the refactoring has been done completely under the hood in the `ctiagent.py` and `ctiagent_functions.py` code. This fairly clean separation opens the door, in the future, to a consolidated `CTIgor` implementation which could employ either of the two frameworks via a run-time choice.
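Such a run-time choice might look like the following sketch. This is entirely hypothetical: it assumes the two implementations are packaged side-by-side as `ctiagent_sk` and `ctiagent_autogen` modules, which is not how the repository is currently organized.

```python
def make_ctigor(framework: str = "autogen"):
    """Hypothetical factory selecting the agent framework at run time."""
    if framework == "autogen":
        from ctiagent_autogen import CTIgor
    elif framework == "semantic-kernel":
        from ctiagent_sk import CTIgor
    else:
        raise ValueError(f"Unknown framework: {framework}")
    return CTIgor()
```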
Code
The code is now on the `main` branch of the CTIgor repository (https://github.com/ckane/ctigor), and is also tracked on the `autogen-port` branch.
Permanent Link: https://blog.malware.re/2025/04/06/ctigor-autogen/index.html