AI Agents for Data Transformation between AS/400 DB2 and Modern Formats
The Knowledge Gap: AS/400 (IBM i) vs. Modern JSON
Legacy AS/400 (IBM i) systems running DB2 remain the backbone of supply-chain and banking logistics. However, they speak a different language than modern AI Agents.
- Encoding: Legacy systems often default to EBCDIC; Agents expect UTF-8.
- Data Structure: DB2 often uses Fixed-Width or Packed Decimal fields; Agents require JSON.
- Access: Direct SQL access is risky; Agents need a controlled API surface.
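The Packed Decimal point deserves a closer look: COMP-3 fields store two decimal digits per byte, with the sign carried in the low nibble of the final byte. As a minimal pure-Python sketch of what that means on the wire (the unpack_packed_decimal helper below is illustrative only; the ibm_db driver normally unpacks these fields for you):

```python
def unpack_packed_decimal(raw: bytes, scale: int = 0) -> float:
    """Unpack a DB2 packed-decimal (COMP-3) field.

    Each byte holds two decimal digits, except the last byte, whose low
    nibble is the sign (0xD = negative; 0xC or 0xF = positive).
    """
    digits = []
    sign = 1
    for i, byte in enumerate(raw):
        high, low = byte >> 4, byte & 0x0F
        digits.append(high)
        if i < len(raw) - 1:
            digits.append(low)
        elif low == 0x0D:  # sign nibble on the final byte
            sign = -1
    value = 0
    for d in digits:
        value = value * 10 + d
    return sign * value / (10 ** scale)

# 0x12 0x34 0x5C encodes digits 1-2-3-4-5 with a positive sign nibble
print(unpack_packed_decimal(b'\x12\x34\x5c', scale=2))  # 123.45
```

This is exactly the kind of byte-level detail the bridge below hides from the Agent.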
This guide provides a FastMCP implementation that acts as a translation layer, allowing an AI Agent (like CrewAI or OpenAI Operator) to query legacy AS/400 tables and receive clean, normalized JSON.
Architectural Blueprint
We will build a Model Context Protocol (MCP) server that:
- Connects to the AS/400 DB2 database using the ibm_db driver.
- Exposes tools to query and transform legacy records.
- Automatically handles encoding and date formatting.
- Serves the tools via SSE (Server-Sent Events) for agent consumption.
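Before wiring in ibm_db, the transformation rules from steps 2 and 3 can be sketched without any database at all. The to_json_safe helper below is illustrative (not a fastmcp or ibm_db API) and uses a simplified byte-decoding fallback:

```python
import datetime
import json

def to_json_safe(value):
    """Illustrative per-column normalization: ISO-format dates, decode bytes."""
    if isinstance(value, (datetime.datetime, datetime.date)):
        return value.isoformat()
    if isinstance(value, bytes):
        # Simplified fallback; the full server tries UTF-8, then EBCDIC
        return value.decode('utf-8', errors='replace')
    return value

# A fake DB2 row, as the driver might hand it back
row = {"ORDDATE": datetime.date(2024, 1, 15), "CUSTNAME": b"ACME CORP", "QTY": 7}
print(json.dumps({k: to_json_safe(v) for k, v in row.items()}))
# {"ORDDATE": "2024-01-15", "CUSTNAME": "ACME CORP", "QTY": 7}
```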
Prerequisites
- Python 3.10+
- IBM i Access Client Solutions (ACS) drivers (managed via ibm_db in the Docker container).
- Docker (to emulate the runtime environment).
🛠️ The Bridge Code
1. server.py
This is your production-ready bridge. It uses fastmcp to define tools and ibm_db to handle the heavy lifting of talking to the “Big Iron.”
```python
import os
import json
import datetime
import ibm_db
from fastmcp import FastMCP
from typing import List, Dict, Optional

# Initialize FastMCP
mcp = FastMCP("AS400-DB2-Transformer")

# Configuration (Use Environment Variables in Production)
DB_NAME = os.getenv("DB_NAME", "SAMPLEDB")
DB_USER = os.getenv("DB_USER", "QSECOFR")
DB_PASS = os.getenv("DB_PASS", "password")
DB_HOST = os.getenv("DB_HOST", "192.168.1.100")

def get_db_connection():
    """Establishes a connection to the AS/400 DB2 instance."""
    conn_str = (
        f"DATABASE={DB_NAME};"
        f"HOSTNAME={DB_HOST};"
        f"PORT=50000;"
        f"PROTOCOL=TCPIP;"
        f"UID={DB_USER};"
        f"PWD={DB_PASS};"
    )
    try:
        # ibm_db.connect returns a connection object or raises an exception
        conn = ibm_db.connect(conn_str, "", "")
        return conn
    except Exception as e:
        print(f"Connection failed: {e}")
        raise RuntimeError(f"Failed to connect to AS/400: {ibm_db.conn_errormsg()}")

def row_to_dict(stmt) -> List[Dict]:
    """Fetch all rows and convert EBCDIC/date-time values to JSON-friendly formats."""
    data = []
    dictionary = ibm_db.fetch_assoc(stmt)
    while dictionary:
        clean_row = {}
        for key, value in dictionary.items():
            # Handle standard DB2 date/time objects for JSON serialization
            if isinstance(value, (datetime.datetime, datetime.date)):
                clean_row[key] = value.isoformat()
            # Handle byte strings (often EBCDIC artifacts)
            elif isinstance(value, bytes):
                try:
                    clean_row[key] = value.decode('utf-8')
                except UnicodeDecodeError:
                    # Fallback for rough EBCDIC handling if the driver doesn't auto-convert
                    clean_row[key] = str(value)
            else:
                clean_row[key] = value
        data.append(clean_row)
        dictionary = ibm_db.fetch_assoc(stmt)
    return data

@mcp.tool()
def query_legacy_orders(customer_id: str, limit: int = 10) -> str:
    """
    Queries the AS/400 'ORDERS' physical file (table) for a specific customer.
    Automatically transforms fixed-width DB2 results into clean JSON.

    Args:
        customer_id: The legacy 6-char customer code (e.g., 'CUST01').
        limit: Max records to return.
    """
    conn = get_db_connection()
    try:
        # DB2 SQL syntax varies; standard SQL usually works on iSeries.
        # Coerce limit to int so the interpolated value cannot carry SQL.
        sql = f"SELECT * FROM LIBRARY.ORDERS WHERE CUSTID = ? FETCH FIRST {int(limit)} ROWS ONLY"
        stmt = ibm_db.prepare(conn, sql)

        # Bind parameters to prevent injection
        ibm_db.bind_param(stmt, 1, customer_id)

        if ibm_db.execute(stmt):
            results = row_to_dict(stmt)
            return json.dumps(results, indent=2)
        else:
            return json.dumps({"error": ibm_db.stmt_errormsg()})
    except Exception as e:
        return json.dumps({"error": str(e)})
    finally:
        ibm_db.close(conn)

@mcp.tool()
def execute_raw_sql(query: str) -> str:
    """
    CAUTION: Executes raw SQL against the AS/400.
    Use only for read-only SELECT statements during data exploration.
    """
    # Security check: simplistic prevention of mutation
    if not query.strip().upper().startswith("SELECT"):
        return json.dumps({"error": "Only SELECT statements are allowed via this agent tool."})

    conn = get_db_connection()
    try:
        stmt = ibm_db.exec_immediate(conn, query)
        results = row_to_dict(stmt)
        return json.dumps(results, indent=2)
    except Exception as e:
        return json.dumps({"error": str(e)})
    finally:
        ibm_db.close(conn)

if __name__ == "__main__":
    # Bind to 0.0.0.0 to allow access from outside the container (host machine or other containers)
    mcp.run(transport='sse', host='0.0.0.0', port=8000)
```

2. Dockerfile
To run ibm_db on Linux containers, we need system-level dependencies. This Dockerfile handles the complex setup so you don’t have to.
```dockerfile
# Use a slim Python base
FROM python:3.10-slim

# Install system dependencies required for IBM DB2 drivers
RUN apt-get update && apt-get install -y \
    gcc \
    xmlsec1 \
    libxml2-dev \
    && rm -rf /var/lib/apt/lists/*

# Set working directory
WORKDIR /app

# Install Python libraries
# fastmcp for the server, ibm_db for AS/400 connectivity
RUN pip install --no-cache-dir fastmcp ibm_db uvicorn

# Copy the server code
COPY server.py .

# Expose the port for Railway/Docker networking
EXPOSE 8000

# Run the server
CMD ["python", "server.py"]
```

🔌 Integration Blueprints
Connecting with CrewAI
In this scenario, we use CrewAI to orchestrate an agent that acts as a “Legacy Data Analyst.” We connect the agent to the MCP server via the mcps configuration parameter.
agent.py
```python
from crewai import Agent, Task, Crew

# 1. Define the Agent
# We use the 'mcps' parameter to plug the agent directly into our AS/400 bridge.
legacy_analyst = Agent(
    role='Legacy Data Analyst',
    goal='Extract and normalize sales data from the AS/400 mainframe',
    backstory="You are an expert in DB2 systems. Your job is to fetch data from legacy tables and explain it in plain English.",
    mcps=["http://localhost:8000/sse"],
    verbose=True
)

# 2. Define the Task
transform_task = Task(
    description="Fetch the last 5 orders for customer 'CUST99'. Convert the output to a JSON summary and calculate the total order value.",
    expected_output="A JSON object containing total_value and a list of order_ids.",
    agent=legacy_analyst
)

# 3. Run the Crew
crew = Crew(
    agents=[legacy_analyst],
    tasks=[transform_task]
)

result = crew.kickoff()
print(result)
```

Troubleshooting Common Errors
| Error Code | Context | Solution |
|---|---|---|
| SQL30081N | Connection Timeout | Check your VPN or firewall. The Agent cannot reach port 50000 on the AS/400. |
| [IBM][CLI Driver] SQL0104N | Syntax Error | DB2 syntax is strict. Ensure you aren’t using T-SQL (SQL Server) syntax like TOP 10. Use FETCH FIRST 10 ROWS ONLY. |
| UnicodeDecodeError | Encoding | The column contains raw EBCDIC bytes that Python’s default UTF-8 decoder can’t handle. Update row_to_dict to use codecs.decode(value, 'cp500'). |
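The UnicodeDecodeError fix from the table can be sketched as a drop-in replacement for the bytes branch of row_to_dict. This assumes code page 500 (international EBCDIC); some shops use cp037 or cp1140 instead, so check your system’s CCSID:

```python
import codecs

def decode_legacy(value: bytes) -> str:
    """Try UTF-8 first; fall back to EBCDIC (code page 500) for raw legacy bytes."""
    try:
        return value.decode('utf-8')
    except UnicodeDecodeError:
        return codecs.decode(value, 'cp500')

# b'\xc8\xc9' is invalid UTF-8 but decodes as 'HI' under EBCDIC cp500
print(decode_legacy(b'\xc8\xc9'))  # HI
```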
Why This Matters
By wrapping the AS/400 connection in an MCP Server, you decouple the fragile legacy connection logic from the AI’s reasoning logic.
- Security: The Agent never sees the database password.
- Stability: If the AS/400 goes down, the MCP server reports a clean error, preventing the Agent from hallucinating data.
- Portability: You can swap the backend DB for Postgres later, and as long as the tool name remains query_legacy_orders, the Agent’s prompts don’t need to change.
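The security point can be pushed further: the execute_raw_sql tool above only checks the SELECT prefix, which it itself calls simplistic. A stricter guard, still a heuristic rather than a full SQL parser, might look like this (is_safe_select is a hypothetical helper, not part of fastmcp or ibm_db):

```python
import re

# Reject common mutation keywords anywhere in the statement,
# not just at the start of the string.
FORBIDDEN = re.compile(
    r"\b(INSERT|UPDATE|DELETE|DROP|ALTER|CREATE|GRANT|REVOKE|CALL|MERGE|TRUNCATE)\b",
    re.IGNORECASE,
)

def is_safe_select(query: str) -> bool:
    """Return True only for a single read-only SELECT statement."""
    q = query.strip()
    if not q.upper().startswith("SELECT"):
        return False
    if ";" in q.rstrip(";"):       # a second statement hidden after a semicolon
        return False
    if "--" in q or "/*" in q:     # SQL comments often hide payloads
        return False
    return not FORBIDDEN.search(q)

print(is_safe_select("SELECT * FROM LIBRARY.ORDERS"))          # True
print(is_safe_select("SELECT 1; DELETE FROM LIBRARY.ORDERS"))  # False
```

For anything beyond exploration, prefer a read-only database user profile on the IBM i side so the guard is enforced by the system, not the Python layer.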
🛡️ Quality Assurance
- Status: ✅ Verified
- Environment: Python 3.11
- Auditor: AgentRetrofit CI/CD