# Getting Started

## Overview

`aegis-agent-core` is a production-grade FastAPI server for running:

- Individual agents
- Teams of collaborating agents (a team can also be a single LLM call, which is ideal for use with OpenAI's Batch API)
- Graph-based agent workflows (Aegis contributed this functionality to Autogen)

It provides a ready-to-use, extensible API layer for managing agents and workflows over HTTP, making it easy to integrate Autogen-based automation into production systems.
## Key Features

- **FastAPI Server**: Host agent teams and tasks as HTTP APIs for easy integration.
- **Structured Messaging**: Enforce type-safe, schema-driven message exchanges between agents.
- **Batch Task Processing**: Support for batched submissions and asynchronous background task execution (optional Celery support).
- **Production Ready**: Built with FastAPI, SQLModel, async SQLAlchemy, and Alembic migrations for database management.
- **Lightweight Dependency Management**: Uses `uv` for faster dependency installation based on `pyproject.toml`.
- **Containerized Deployment**: Provides Docker and docker-compose configurations for local development and production deployment.
## Core Concepts

`aegis-agent-core` revolves around four core concepts:

- **Tasks**: Definitions of what needs to be accomplished. Each task represents an abstract objective, such as "Automark student responses."
- **Teams**: Logical groupings of one or more agents configured to collaborate and complete a task. Teams can consist of a single LLM call (e.g., for batch scoring tasks) or complex multi-agent collaborations.
- **Sessions**: Containers that link a task and a team together. Sessions ensure contextual continuity across multiple runs and maintain the association between task, team, and execution history.
- **Runs**: Individual executions where a specific team attempts to solve a task. Each run can represent a single request or a batch of requests.

This structure allows you to manage workflows at scale, tracking which teams are solving which tasks across sessions and runs.
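The relationships between these four concepts can be sketched with plain dataclasses. Note that these are *not* the server's actual SQLModel tables; the class and field names below are illustrative only, meant to show the Task/Team/Session/Run hierarchy described above:

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical sketch of the four core concepts and how they relate.
# A session binds exactly one task to one team and accumulates runs.

@dataclass
class Task:
    name: str                      # abstract objective, e.g. "Automark student responses"

@dataclass
class Team:
    name: str
    agents: List[str]              # one agent == a single LLM call; more == multi-agent

@dataclass
class Run:
    status: str = "pending"        # one execution attempt (single request or a batch)

@dataclass
class Session:
    task: Task                     # the task being solved...
    team: Team                     # ...by this team...
    runs: List[Run] = field(default_factory=list)  # ...with its execution history

session = Session(
    task=Task("Automark student responses"),
    team=Team("automarker", agents=["single-llm-call"]),
)
session.runs.append(Run())         # record one execution attempt
```

Keeping the session as the container for both the task/team pairing and the run history is what makes it possible to track which teams solved which tasks over time.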
## Getting Started

### Prerequisites

- Python 3.12+
- `uv`
- Docker and docker-compose
- Git

### 1. Clone the Repository

```shell
git clone https://github.com/your-org/aegis-agent-core.git
cd aegis-agent-core
```
### 2. Set Up a Virtual Environment (Recommended)

Use `uv` to create and manage a virtual environment:

```shell
uv venv .venv
```

Activate the virtual environment:

- On Linux/macOS:

  ```shell
  source .venv/bin/activate
  ```

- On Windows:

  ```shell
  .venv\Scripts\activate
  ```

### 3. Install Dependencies

Use `uv` to sync dependencies from `pyproject.toml`:

```shell
uv sync
```

To install development dependencies as well:

```shell
uv sync --dev
```
### 4. Start the FastAPI Server Locally

Bring up Postgres and Redis:

```shell
docker-compose -f docker/docker-compose.infra.yaml up --build
```

Run the app:

```shell
uvicorn src.aegis.agents.main:app --reload --port 8000
```

The interactive API docs will be available at http://localhost:8000/docs.
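Before wiring the server into other systems, it can be handy to verify it is actually reachable. Here is a minimal readiness check using only the Python standard library; it assumes the host and port from the `uvicorn` command above, and relies on FastAPI serving its Swagger UI at `/docs` by default:

```python
import time
import urllib.error
import urllib.request

def wait_for_server(url: str, timeout: float = 30.0) -> bool:
    """Poll `url` until it responds with HTTP 200 or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=2) as resp:
                if resp.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            time.sleep(0.5)  # server not up yet; retry shortly
    return False

if __name__ == "__main__":
    print("server up:", wait_for_server("http://localhost:8000/docs", timeout=10))
```

This is useful in CI or smoke-test scripts where the server is started in the background and requests must not be issued before it is ready.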
### 5. (Optional) Run with Docker Compose

```shell
make deploy-local-docker
```

This starts the database services and the FastAPI application in containers.
## Recommended Development and Deployment Pattern

Unlike traditional applications where workflows are hardcoded, `aegis-agent-core` promotes a configuration-driven approach.

You can develop and test agents, teams, and graph-based workflows interactively using the provided `notebooks/tutorial.ipynb`. Once a workflow is validated, you create a `config.json` file that specifies:

- Team composition (agents, tools, workflows)
- Input and output message formats
- Optional parameters such as batch settings or streaming behavior

This configuration can then be uploaded to the running API server to dynamically create and manage agent workflows without redeploying or modifying server code.
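As a sketch of what such a `config.json` might contain, the snippet below builds one in Python and writes it out. The field names here are hypothetical; the authoritative schema is whatever the tutorial notebook and the API server define:

```python
import json

# Hypothetical config for a single-LLM scoring team. Field names are
# illustrative only -- consult notebooks/tutorial.ipynb for the real schema.
config = {
    "team": {
        "name": "automarker",
        "agents": [{"name": "scorer", "model": "gpt-4o-mini"}],
        "workflow": "single_llm_call",
    },
    "input_format": {"response": "str", "rubric": "str"},   # incoming message shape
    "output_format": {"score": "int", "feedback": "str"},   # structured result shape
    "batch": {"enabled": True},                             # optional batch settings
    "streaming": False,                                     # optional streaming behavior
}

with open("config.json", "w") as f:
    json.dump(config, f, indent=2)
```

The resulting file is then uploaded to the running server rather than baked into the server's source code.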
**Why This Matters:**

- Traditional software developers may not be familiar with agent-based AI design patterns.
- Separating configuration from code makes it easier to iterate on and deploy new workflows.
- It reduces downtime by avoiding code redeploys for each new workflow.

This method enables a clean separation between development (agent/team/graph design) and deployment (runtime configuration management), offering flexibility and scalability in AI workflow automation.
## Explore the Tutorial

Once the server is running, follow the tutorial to learn how to:

- Create tasks, teams, sessions, and runs
- Run structured Autogen agents
- Retrieve and manage batch runs
- Simulate errors for observability testing

The tutorial walks through complete examples to help you get started quickly. Also see `system_tests.py` for a detailed Python example of creating and running agents via the API.
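To give a flavor of what driving the API from Python looks like, here is a hedged sketch of the Task → Team → Session → Run flow using only the standard library. The endpoint paths and field names are assumptions made for illustration; the real request shapes are in `system_tests.py` and the interactive docs at `/docs`:

```python
import json
import urllib.request

BASE = "http://localhost:8000"  # assumes the local server from the steps above

def post(path: str, payload: dict) -> dict:
    """POST a JSON payload to the API and return the decoded JSON response."""
    req = urllib.request.Request(
        BASE + path,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

# Hypothetical payloads following the core-concepts hierarchy.
task_payload = {"name": "Automark student responses"}
team_payload = {"name": "automarker", "config": {"workflow": "single_llm_call"}}

if __name__ == "__main__":
    task = post("/tasks", task_payload)          # define the objective
    team = post("/teams", team_payload)          # register a team for it
    session = post("/sessions", {"task_id": task["id"], "team_id": team["id"]})
    run = post("/runs", {"session_id": session["id"]})  # execute one attempt
    print("run started:", run)
```

The `post` helper keeps the example dependency-free; in practice you would likely use `requests` or `httpx` as `system_tests.py` does.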
## Deployment

You can deploy `aegis-agent-core` using a combination of Docker, AWS CloudFormation, and ECS. The system is structured to support both local and production deployments using a consistent Makefile interface.

### ⚙️ Prerequisites

- AWS CLI configured with appropriate credentials
- An S3 bucket for storing CloudFormation templates (`$(S3_BUCKET)`)
- An ECR repository for hosting Docker images (`$(ECR_REPO)`)
- VPC, subnet, and security group configuration already provisioned
- A secret (`aegis/openai-api-key`) created in AWS Secrets Manager
### 🧪 Local Development

To start the API server and infrastructure locally:

```shell
make prepare-local
make deploy-local
```

This does the following:

- Sets up a local Python virtual environment
- Syncs dependencies using `uv`
- Brings up Postgres and Redis using Docker Compose
- Runs the FastAPI server on `localhost:8000`
### 🚀 Production Deployment (AWS ECS Fargate)

To deploy to production:

```shell
make deploy-prod
```

This command:

- Builds the production image using the `Dockerfile` multi-stage `production` target
- Pushes the image to your configured ECR repository
- Uploads all CloudFormation templates in the `cloudformation/` folder to S3
- Deploys the ECS stack using `cloudformation/app.yaml`
- Updates your OpenAI secret in AWS Secrets Manager
- Forces a new ECS service deployment with the latest task definition
### 📊 Updating the Application

To apply code changes in production, follow this process:

Note: After pushing a new Docker image, you must run `make force-deploy` to trigger an ECS service update. This is necessary even if you're reusing the same image tag (e.g. `:prod`), because ECS does not automatically redeploy services unless the task definition changes or a deployment is explicitly triggered.

1. Make your changes in the source code.
2. Rebuild the Docker image:

   ```shell
   make build-prod
   ```

3. Push the updated image to ECR:

   ```shell
   make push-prod
   ```

4. Trigger a new ECS deployment with the updated image:

   ```shell
   make force-deploy
   ```

If you've updated any CloudFormation templates (e.g. changed task memory, ports, etc.):

1. Upload the latest templates:

   ```shell
   make upload-cf-templates
   ```

2. Deploy the CloudFormation stack:

   ```shell
   make deploy-cf-app
   ```

To update just the OpenAI API key:

```shell
make update-secret OPENAI_API_KEY=your-new-key
```
### Customizing the Deployment

You can modify the following in the `Makefile` or pass them via the CLI:

- `ECR_REPO` for naming ECS resources
- `REGION` and `PROFILE` for AWS environment selection
- `SECRET_NAME` for pointing to a different OpenAI API key
## Enterprise Extensions

For organizations with advanced requirements, we maintain optional enterprise extensions for `aegis-agent-core`. These include capabilities such as:

- **Evaluations and Monitoring**: Track agent performance, workflow accuracy, and system health over time.
- **Comprehensive Support for Enterprise Data RAG**: Advanced search and retrieval capabilities for context rendering.
- **AAA (Authentication, Authorization, Accounting)**: Enterprise-grade security and fine-grained access control.
- **LLM Proxy for Per-Tenant/Per-User Cost Tracking and Rate Limiting**: Monitor and control API usage with detailed analytics.

Enterprise modules are developed separately to ensure the open-source core remains clean, modular, and lightweight.

If you're interested in learning more, please contact us or open an issue.