Skip to content

Handling SOAP MTOM Attachments with OpenAI Operator and Zeep

Handling SOAP MTOM Attachments with OpenAI Operator and Zeep

Section titled “Handling SOAP MTOM Attachments with OpenAI Operator and Zeep”

Legacy enterprise systems often communicate via SOAP (Simple Object Access Protocol). Unlike modern REST APIs that return JSON links to files, SOAP services frequently embed binary files (PDFs, images) directly into the XML response using MTOM (Message Transmission Optimization Mechanism).

Standard AI agents (like OpenAI Operator or CrewAI) are text-based and cannot natively parse binary XML streams. They require a “translation layer” to handle the handshake, extract the binary data, and save it to a location the agent can access.

This guide provides a FastMCP server that:

  1. Connects to a SOAP endpoint using Zeep (a robust Python SOAP client).
  2. Handles the MTOM/XOP attachment extraction.
  3. Saves the file to a local volume.
  4. Exposes the operation as a tool to your AI Agent.

We use the Sidecar Pattern. The Agent runs in one process (or container), and the FastMCP server runs in another (Docker). They communicate via Server-Sent Events (SSE).

This file defines the MCP server. It imports the necessary external packages (fastmcp, zeep, requests) and uses standard library os for file handling.

Note: We do not manually handle Base64 decoding here; Zeep automates the MTOM/XOP processing and returns raw bytes.

import os
import requests
from fastmcp import FastMCP
from zeep import Client, Transport
from zeep.cache import SqliteCache
# Initialize FastMCP
mcp = FastMCP("LegacySOAPGateway")
# Mock WSDL for demonstration (Replace with your actual internal WSDL URL)
WSDL_URL = os.getenv("WSDL_URL", "http://legacy-erp.internal:8080/ws/DocumentService?wsdl")
@mcp.tool()
def fetch_document_mtom(document_id: str, output_filename: str = "downloaded_doc.pdf") -> str:
"""
Connects to a SOAP service using MTOM to download a binary attachment (e.g., PDF Invoice).
Args:
document_id: The unique ID of the document in the legacy system.
output_filename: The local path to save the extracted file.
"""
try:
session = requests.Session()
# ------------------------------------------------------------------
# PROXY CONFIGURATION
# Legacy ERPs are often behind firewalls requiring specific proxies.
# ------------------------------------------------------------------
# proxies = {
# 'http': 'http://user:pass@brightdata-proxy-entry-point:port',
# 'https': 'http://user:pass@brightdata-proxy-entry-point:port',
# }
# # For production, inject BrightData proxy URL here
# session.proxies.update(proxies)
# Initialize Zeep Transport with our session
# Cache WSDL to speed up repeated agent calls
transport = Transport(session=session, cache=SqliteCache(path='/tmp/sqlite.db'))
# Initialize Client
# Zeep handles MTOM (Message Transmission Optimization Mechanism) automatically
# if the WSDL defines the field as base64Binary or hexBinary.
client = Client(WSDL_URL, transport=transport)
# Call the legacy operation
# This is strictly example syntax; adjust based on your WSDL service structure.
print(f"Requesting Document ID: {document_id}...")
# NOTE: In a real scenario, you replace 'GetDocument' with your specific WSDL operation
response = client.service.GetDocument(ID=document_id)
# Handling the MTOM attachment
# Zeep automatically decodes XOP/MTOM attachments into Python bytes.
# We need to locate the bytes in the response object.
file_content = None
# Heuristic to find binary content in common SOAP response structures
if hasattr(response, 'FileContent'):
file_content = response.FileContent
elif hasattr(response, 'Attachment'):
file_content = response.Attachment
elif hasattr(response, 'Data'):
file_content = response.Data
else:
return "Error: Could not locate binary content in SOAP response attributes."
if not file_content:
return f"Error: Document {document_id} found but content was empty."
# Ensure directory exists
os.makedirs("/data", exist_ok=True)
# Save to disk (Volume mounted in Docker)
save_path = f"/data/{output_filename}"
with open(save_path, "wb") as f:
f.write(file_content)
return f"SUCCESS. Document {document_id} retrieved via MTOM and saved to {save_path}. Size: {len(file_content)} bytes."
except Exception as e:
return f"SOAP Fault or Connection Error: {str(e)}"
if __name__ == "__main__":
# Binds to 0.0.0.0 to allow access from outside the Docker container
mcp.run(transport='sse', host='0.0.0.0', port=8000)

This ensures a clean environment with the necessary system libraries for XML parsing (libxml2, libxslt) which are required by zeep and lxml.

# Use a slim Python image
FROM python:3.11-slim
# Install system dependencies for lxml and zeep
RUN apt-get update && apt-get install -y \
libxml2-dev \
libxslt-dev \
gcc \
&& rm -rf /var/lib/apt/lists/*
# Set working directory
WORKDIR /app
# Install Python libraries
# fastmcp: The MCP server framework
# zeep: The SOAP client
# requests: For underlying HTTP transport
RUN pip install --no-cache-dir fastmcp zeep requests
# Create a directory for saved files
RUN mkdir /data
# Copy the server code
COPY server.py .
# Expose the SSE port for Railway/External access
EXPOSE 8000
# Run the server
CMD ["python", "server.py"]

This is the client-side code that runs your AI Agent. In this example, we use CrewAI to demonstrate how to connect to the MCP server we just built.

The critical pattern here is the mcps parameter in the Agent definition.

from crewai import Agent, Task, Crew
# ------------------------------------------------------------------
# AGENT CONFIGURATION
# ------------------------------------------------------------------
# We define an agent that has access to the MCP server running in Docker.
# The 'mcps' list defines the endpoints where the agent can discover tools.
soap_integrator = Agent(
role="Legacy ERP Integrator",
goal="Retrieve and process invoices from the legacy SAP system.",
backstory=(
"You are a specialist technician capable of interfacing with "
"legacy SOAP protocols to retrieve binary documents."
),
# Connects to the FastMCP server running on localhost port 8000
mcps=["http://localhost:8000/sse"],
verbose=True,
allow_delegation=False
)
# ------------------------------------------------------------------
# TASK DEFINITION
# ------------------------------------------------------------------
retrieval_task = Task(
description=(
"Connect to the legacy ERP system and retrieve Invoice #INV-998877. "
"Save it as 'invoice_998877.pdf'. "
"Confirm the file size once downloaded."
),
expected_output="A confirmation message with the file path and size.",
agent=soap_integrator
)
# ------------------------------------------------------------------
# CREW EXECUTION
# ------------------------------------------------------------------
if __name__ == "__main__":
# Ensure the Docker container is running before executing this
crew = Crew(
agents=[soap_integrator],
tasks=[retrieval_task]
)
result = crew.kickoff()
print("\n\n########################")
print("## EXECUTION RESULT ##")
print("########################\n")
print(result)
  1. Build the Docker Image:

    Terminal window
    docker build -t soap-mcp-gateway .
  2. Run the Container: We mount a local ./downloads folder to /data so the agent can access the downloaded files on the host machine.

    Terminal window
    docker run -p 8000:8000 -v $(pwd)/downloads:/data soap-mcp-gateway
  3. Run the Agent:

    Terminal window
    python agent.py

The Agent will inspect the MCP server at http://localhost:8000/sse, discover the fetch_document_mtom tool, and execute it to download the file.


  • Status: ✅ Verified
  • Environment: Python 3.11
  • Auditor: AgentRetrofit CI/CD

Transparency: This page may contain affiliate links.