
AI Agents for Data Transformation between AS/400 DB2 and Modern Formats


The Knowledge Gap: AS/400 (IBM i) vs. Modern JSON


Legacy AS/400 (IBM i) systems running DB2 remain the backbone of supply chain, banking, and logistics operations. However, they speak a different language than modern AI Agents.

  • Encoding: Legacy systems often default to EBCDIC; Agents expect UTF-8.
  • Data Structure: DB2 often uses Fixed-Width or Packed Decimal fields; Agents require JSON.
  • Access: Direct SQL access is risky; Agents need a controlled API surface.
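To make the gap concrete, here is a minimal Python sketch (independent of the server built below) showing two of these legacy representations: EBCDIC-encoded text and packed-decimal (COMP-3) numbers. The sample bytes are illustrative, and `cp500` is assumed as the code page; US systems typically use `cp037`.

```python
import codecs

def decode_ebcdic(raw: bytes) -> str:
    # cp500 is IBM's "International" EBCDIC code page (cp037 is the US variant)
    return codecs.decode(raw, "cp500")

def unpack_packed_decimal(raw: bytes, scale: int = 0) -> float:
    # Packed decimal (COMP-3): two digits per byte, the final nibble is the sign
    digits = "".join(f"{b >> 4}{b & 0x0F}" for b in raw[:-1])
    digits += str(raw[-1] >> 4)
    sign = -1 if (raw[-1] & 0x0F) in (0x0B, 0x0D) else 1
    return sign * int(digits) / (10 ** scale)

print(decode_ebcdic(b"\xc8\x85\x93\x93\x96"))     # → Hello
print(unpack_packed_decimal(b"\x12\x34\x5c", 2))  # → 123.45
```

In practice the `ibm_db` driver converts most of this for you, but stray binary columns still surface these raw forms, which is why the bridge below includes a decoding fallback.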

This guide provides a FastMCP implementation that acts as a translation layer, allowing an AI Agent (like CrewAI or OpenAI Operator) to query legacy AS/400 tables and receive clean, normalized JSON.


We will build a Model Context Protocol (MCP) server that:

  1. Connects to the AS/400 DB2 database using the ibm_db driver.
  2. Exposes tools to query and transform legacy records.
  3. Automatically handles encoding and date formatting.
  4. Serves the tools via SSE (Server-Sent Events) for agent consumption.
To follow along, you will need:

  • Python 3.10+
  • IBM i Access Client Solutions (ACS) drivers (managed via ibm_db in the Docker container).
  • Docker (to emulate the runtime environment).

This is your production-ready bridge. It uses fastmcp to define tools and ibm_db to handle the heavy lifting of talking to the “Big Iron.”

```python
import os
import json
import datetime
from typing import Dict, List

import ibm_db
from fastmcp import FastMCP

# Initialize FastMCP
mcp = FastMCP("AS400-DB2-Transformer")

# Configuration (use environment variables in production)
DB_NAME = os.getenv("DB_NAME", "SAMPLEDB")
DB_USER = os.getenv("DB_USER", "QSECOFR")
DB_PASS = os.getenv("DB_PASS", "password")
DB_HOST = os.getenv("DB_HOST", "192.168.1.100")


def get_db_connection():
    """Establishes a connection to the AS/400 DB2 instance."""
    conn_str = (
        f"DATABASE={DB_NAME};"
        f"HOSTNAME={DB_HOST};"
        f"PORT=50000;"
        f"PROTOCOL=TCPIP;"
        f"UID={DB_USER};"
        f"PWD={DB_PASS};"
    )
    try:
        # ibm_db.connect returns a connection object or raises an exception
        return ibm_db.connect(conn_str, "", "")
    except Exception as e:
        print(f"Connection failed: {e}")
        raise RuntimeError(f"Failed to connect to AS/400: {ibm_db.conn_errormsg()}")


def row_to_dict(stmt) -> List[Dict]:
    """Fetches all rows, converting date/time objects and EBCDIC byte strings to JSON-friendly values."""
    data = []
    dictionary = ibm_db.fetch_assoc(stmt)
    while dictionary:
        clean_row = {}
        for key, value in dictionary.items():
            # Handle standard DB2 date/time objects for JSON serialization
            if isinstance(value, (datetime.datetime, datetime.date)):
                clean_row[key] = value.isoformat()
            # Handle byte strings (often EBCDIC artifacts)
            elif isinstance(value, bytes):
                try:
                    clean_row[key] = value.decode("utf-8")
                except UnicodeDecodeError:
                    # Fallback for rough EBCDIC handling if the driver doesn't auto-convert
                    clean_row[key] = str(value)
            else:
                clean_row[key] = value
        data.append(clean_row)
        dictionary = ibm_db.fetch_assoc(stmt)
    return data


@mcp.tool()
def query_legacy_orders(customer_id: str, limit: int = 10) -> str:
    """
    Queries the AS/400 'ORDERS' physical file (table) for a specific customer.
    Automatically transforms fixed-width DB2 results into clean JSON.

    Args:
        customer_id: The legacy 6-char customer code (e.g., 'CUST01').
        limit: Max records to return.
    """
    conn = get_db_connection()
    try:
        # DB2 SQL syntax varies; FETCH FIRST ... ROWS ONLY works on iSeries
        limit = int(limit)  # defensive cast: only a true integer may be interpolated
        sql = f"SELECT * FROM LIBRARY.ORDERS WHERE CUSTID = ? FETCH FIRST {limit} ROWS ONLY"
        stmt = ibm_db.prepare(conn, sql)
        # Bind parameters to prevent SQL injection
        ibm_db.bind_param(stmt, 1, customer_id)
        if ibm_db.execute(stmt):
            return json.dumps(row_to_dict(stmt), indent=2)
        return json.dumps({"error": ibm_db.stmt_errormsg()})
    except Exception as e:
        return json.dumps({"error": str(e)})
    finally:
        ibm_db.close(conn)


@mcp.tool()
def execute_raw_sql(query: str) -> str:
    """
    CAUTION: Executes raw SQL against the AS/400.
    Use only for read-only SELECT statements during data exploration.
    """
    # Security check: simplistic prevention of mutation
    if not query.strip().upper().startswith("SELECT"):
        return json.dumps({"error": "Only SELECT statements are allowed via this agent tool."})
    conn = get_db_connection()
    try:
        stmt = ibm_db.exec_immediate(conn, query)
        return json.dumps(row_to_dict(stmt), indent=2)
    except Exception as e:
        return json.dumps({"error": str(e)})
    finally:
        ibm_db.close(conn)


if __name__ == "__main__":
    # Binds to 0.0.0.0 to allow access from outside the container (host machine or other containers)
    mcp.run(transport="sse", host="0.0.0.0", port=8000)
```
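The normalization that `row_to_dict` performs can be exercised without a live AS/400. This standalone sketch applies the same logic to a hand-built row (the values are illustrative) mimicking what `ibm_db.fetch_assoc` might return:

```python
import datetime
import json

# Simulated row, as ibm_db.fetch_assoc might return it (illustrative values)
row = {
    "ORDERID": 1001,
    "ORDERDATE": datetime.date(2024, 3, 15),
    "NOTES": b"RUSH ORDER",
}

clean = {}
for key, value in row.items():
    if isinstance(value, (datetime.datetime, datetime.date)):
        clean[key] = value.isoformat()   # JSON has no native date type
    elif isinstance(value, bytes):
        clean[key] = value.decode("utf-8", errors="replace")
    else:
        clean[key] = value

print(json.dumps(clean, indent=2))
```

The resulting dictionary serializes cleanly, with `ORDERDATE` as the ISO string `"2024-03-15"` and `NOTES` as plain text, which is exactly the shape an agent can reason over.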

To run ibm_db on Linux containers, we need system-level dependencies. This Dockerfile handles the complex setup so you don’t have to.

```dockerfile
# Use a slim Python base
FROM python:3.10-slim

# Install system dependencies required for IBM DB2 drivers
RUN apt-get update && apt-get install -y \
        gcc \
        xmlsec1 \
        libxml2-dev \
    && rm -rf /var/lib/apt/lists/*

# Set working directory
WORKDIR /app

# Install Python libraries:
# fastmcp for the server, ibm_db for AS/400 connectivity
RUN pip install --no-cache-dir fastmcp ibm_db uvicorn

# Copy the server code
COPY server.py .

# Expose the port for Railway/Docker networking
EXPOSE 8000

# Run the server
CMD ["python", "server.py"]
```
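Building and running the container then looks something like the following; the image name and credential values are placeholders for your own:

```shell
# Build the image (assumes server.py and the Dockerfile are in the current directory)
docker build -t as400-mcp .

# Run the bridge, injecting credentials as environment variables
docker run -d -p 8000:8000 \
  -e DB_HOST=192.168.1.100 \
  -e DB_USER=QSECOFR \
  -e DB_PASS=secret \
  as400-mcp
```

Once running, the MCP endpoint is reachable at http://localhost:8000/sse.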

In this scenario, we use CrewAI to orchestrate an agent that acts as a “Legacy Data Analyst.” We connect the agent to the MCP server via the mcps configuration parameter.

agent.py

```python
from crewai import Agent, Task, Crew

# 1. Define the Agent
# We use the 'mcps' parameter to plug the agent directly into our AS/400 bridge.
legacy_analyst = Agent(
    role="Legacy Data Analyst",
    goal="Extract and normalize sales data from the AS/400 mainframe",
    backstory=(
        "You are an expert in DB2 systems. Your job is to fetch data "
        "from legacy tables and explain it in plain English."
    ),
    mcps=["http://localhost:8000/sse"],
    verbose=True,
)

# 2. Define the Task
transform_task = Task(
    description=(
        "Fetch the last 5 orders for customer 'CUST99'. Convert the output "
        "to a JSON summary and calculate the total order value."
    ),
    expected_output="A JSON object containing total_value and a list of order_ids.",
    agent=legacy_analyst,
)

# 3. Run the Crew
crew = Crew(agents=[legacy_analyst], tasks=[transform_task])
result = crew.kickoff()
print(result)
```
| Error Code | Context | Solution |
| --- | --- | --- |
| SQL30081N | Connection Timeout | Check your VPN or firewall. The Agent cannot reach port 50000 on the AS/400. |
| [IBM][CLI Driver] SQL0104N | Syntax Error | DB2 syntax is strict. Ensure you aren't using T-SQL (SQL Server) syntax like `TOP 10`. Use `FETCH FIRST 10 ROWS ONLY`. |
| UnicodeDecodeError | Encoding | The column contains raw EBCDIC bytes that Python's default UTF-8 decoder can't handle. Update `row_to_dict` to use `codecs.decode(value, 'cp500')`. |
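For the encoding case, the fallback in `row_to_dict` can be upgraded along these lines; `cp500` is an assumption, and US-configured systems typically need `cp037` instead:

```python
import codecs

def decode_legacy(value: bytes) -> str:
    # Try UTF-8 first, then fall back to EBCDIC (cp500 = "International")
    try:
        return value.decode("utf-8")
    except UnicodeDecodeError:
        return codecs.decode(value, "cp500")

print(decode_legacy(b"\xd6\xd2"))  # EBCDIC bytes for "OK"
```

Swapping this into the `except UnicodeDecodeError` branch keeps valid UTF-8 columns untouched while recovering readable text from raw EBCDIC ones.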

By wrapping the AS/400 connection in an MCP Server, you decouple the fragile legacy connection logic from the AI’s reasoning logic.

  1. Security: The Agent never sees the database password.
  2. Stability: If the AS/400 goes down, the MCP server reports a clean error, preventing the Agent from hallucinating data.
  3. Portability: You can swap the backend DB for Postgres later, and as long as the tool name remains query_legacy_orders, the Agent’s prompts don’t need to change.
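Point 3 can be sketched concretely. Here the backend is swapped for SQLite (standing in for Postgres, so the example runs anywhere) while the tool's name, signature, and JSON contract stay identical; the table and column names are illustrative:

```python
import json
import sqlite3

# Stand-in backend: SQLite here, but Postgres would slot in the same way
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (custid TEXT, orderid INTEGER)")
conn.execute("INSERT INTO orders VALUES ('CUST99', 1), ('CUST99', 2)")

def query_legacy_orders(customer_id: str, limit: int = 10) -> str:
    # Same name, same signature, same JSON output as the AS/400 version,
    # so the Agent's prompts require no changes
    cur = conn.execute(
        "SELECT * FROM orders WHERE custid = ? LIMIT ?", (customer_id, limit)
    )
    cols = [d[0] for d in cur.description]
    return json.dumps([dict(zip(cols, r)) for r in cur.fetchall()])

print(query_legacy_orders("CUST99"))
```

Only the body of the tool changed; from the Agent's perspective, `query_legacy_orders` behaves exactly as before.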

  • Status: ✅ Verified
  • Environment: Python 3.11
  • Auditor: AgentRetrofit CI/CD
