Connecting CrewAI to legacy SOAP APIs with Python `zeep`
In the “Retrofit Era,” one of the most common blockers for AI adoption is the XML wall. While modern agents speak JSON and REST, critical enterprise data (inventory levels in SAP ECC, customer records in Oracle Siebel, or transaction histories in banking mainframes) is often locked behind SOAP (Simple Object Access Protocol) interfaces.
SOAP is verbose and rigid, and its WSDL definitions are often large enough to overwhelm an LLM's context window. You cannot simply paste a 5,000-line XML schema into GPT-4 and hope for the best.
This guide provides a production-ready architecture to bridge this gap. We will build a FastMCP server that acts as a translation layer, using Python’s zeep library to handle the SOAP handshake and exposing clean JSON tools to your CrewAI agents.
🏗️ Architectural Pattern
The goal is to abstract the complexity of XML serialization away from the AI. The Agent simply asks to “Get Customer Details,” and the MCP server handles the WSDL parsing, XML construction, and response serialization.
graph LR
A[CrewAI Agent] -- "JSON (MCP Protocol)" --> B[FastMCP Server]
B -- "SOAP/XML" --> C[Legacy Enterprise System]
C -- "SOAP Response" --> B
B -- "Clean JSON" --> A
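To make the translation step in the diagram concrete, here is a minimal standard-library sketch of what the bridge does in each direction: wrap a dictionary in a SOAP 1.1 envelope, then flatten a SOAP response into JSON. This is an illustration only, not how `zeep` is implemented. The `http://example.com/inventory` namespace and the canned response are assumptions made up for the example; `GetStockLevel` and the SKU value are borrowed from the task later in this guide.

```python
import json
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
APP_NS = "http://example.com/inventory"  # hypothetical service namespace


def build_envelope(operation: str, arguments: dict) -> bytes:
    """Wrap a flat dict of arguments in a SOAP 1.1 envelope."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    request = ET.SubElement(body, f"{{{APP_NS}}}{operation}")
    for name, value in arguments.items():
        ET.SubElement(request, f"{{{APP_NS}}}{name}").text = str(value)
    return ET.tostring(envelope, encoding="utf-8")


def response_to_json(xml_bytes: bytes) -> str:
    """Flatten the first element inside the SOAP Body into a JSON object."""
    root = ET.fromstring(xml_bytes)
    payload = root.find(f"{{{SOAP_NS}}}Body")[0]
    # Strip the '{namespace}' prefix ElementTree puts on every tag name.
    result = {child.tag.split("}")[-1]: child.text for child in payload}
    return json.dumps(result)


# Canned response standing in for the legacy system (hypothetical shape).
SAMPLE_RESPONSE = (
    b'<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/" '
    b'xmlns:inv="http://example.com/inventory"><soap:Body>'
    b"<inv:GetStockLevelResponse><inv:SKU>AG-2025-RETRO</inv:SKU>"
    b"<inv:Quantity>42</inv:Quantity></inv:GetStockLevelResponse>"
    b"</soap:Body></soap:Envelope>"
)

if __name__ == "__main__":
    print(build_envelope("GetStockLevel", {"SKU": "AG-2025-RETRO"}).decode("utf-8"))
    print(response_to_json(SAMPLE_RESPONSE))
```

Real services add nested complex types, attributes, and headers, which is exactly the work the bridge delegates to a SOAP library instead of hand-rolling it like this.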
Why zeep?
We use zeep because it parses the WSDL (Web Services Description Language) file automatically. This means your MCP server “knows” the API structure dynamically, allowing us to build generic tools rather than hard-coding every single SOAP envelope.
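As a rough picture of what "parsing the WSDL" yields, the sketch below uses only the standard library to list the operations declared in a tiny inline WSDL. It is a simplified stand-in, not zeep's own parser: the sample document, port name, and operation names are assumptions for illustration, and a real WSDL also carries types, messages, and bindings that zeep resolves for you.

```python
import xml.etree.ElementTree as ET

WSDL_NS = "http://schemas.xmlsoap.org/wsdl/"  # WSDL 1.1 namespace

# Hypothetical, heavily trimmed WSDL: only the portType section is shown.
SAMPLE_WSDL = """<?xml version="1.0"?>
<definitions xmlns="http://schemas.xmlsoap.org/wsdl/" name="Inventory">
  <portType name="InventoryPort">
    <operation name="GetStockLevel"/>
    <operation name="GetInventory"/>
  </portType>
</definitions>"""


def list_operations(wsdl_text: str) -> list[str]:
    """Return the sorted operation names declared in the WSDL's portType elements."""
    root = ET.fromstring(wsdl_text)
    operations = root.findall(f"{{{WSDL_NS}}}portType/{{{WSDL_NS}}}operation")
    return sorted(op.get("name") for op in operations)


if __name__ == "__main__":
    print(list_operations(SAMPLE_WSDL))
```

With zeep installed, running `python -mzeep <wsdl-url>` against a real endpoint prints a similar (and much more complete) summary of services, operations, and types.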
🛠️ The Bridge Code
1. The Server (server.py)
This FastMCP server exposes a tool capable of calling any method defined in a SOAP WSDL. It handles the conversion of Python dictionaries to XML and back to JSON.
Dependencies:
pip install fastmcp zeep lxml

server.py

import json

import requests
from fastmcp import FastMCP
from zeep import Client, Settings
from zeep.exceptions import Fault
from zeep.helpers import serialize_object
from zeep.transports import Transport

# Initialize the MCP Server
mcp = FastMCP("SOAP-Gateway")


@mcp.tool()
def call_soap_service(wsdl_url: str, method_name: str, arguments: dict | None = None) -> str:
    """
    Connects to a legacy SOAP API using a WSDL and executes a specific method.

    Args:
        wsdl_url: The full URL to the ?wsdl definition.
        method_name: The specific SOAP operation to perform (e.g., 'GetInventory').
        arguments: A dictionary of arguments matching the SOAP request body.
    """
    try:
        # Networking Configuration
        # In enterprise environments, legacy servers are often behind firewalls needing specific proxies.
        # proxies = {
        #     'http': 'http://user:[email protected]:3128',
        #     'https': 'http://user:[email protected]:3128',
        # }
        # For production, inject BrightData proxy URL here

        # zeep's Client has no `proxies` argument; proxies are set on the
        # requests.Session that backs the transport.
        session = requests.Session()
        # session.proxies = proxies  # uncomment together with the dict above
        transport = Transport(session=session)

        # Zeep Settings: Allow huge trees for massive enterprise XML responses
        settings = Settings(strict=False, xml_huge_tree=True)

        # Initialize Client
        client = Client(wsdl=wsdl_url, settings=settings, transport=transport)

        # Validate Method Exists
        service = client.service
        if not hasattr(service, method_name):
            return json.dumps({
                "error": f"Method '{method_name}' not found in WSDL.",
                # Note: _binding._operations is a private zeep attribute
                "available_methods": list(service._binding._operations.keys()),
            })

        # Dynamic Method Call
        soap_method = getattr(service, method_name)

        # Execute
        if arguments:
            response = soap_method(**arguments)
        else:
            response = soap_method()

        # Serialize Zeep object (which contains complex types) to standard Python dicts/lists
        # then dump to JSON for the Agent. default=str covers Decimal, datetime, etc.
        clean_response = serialize_object(response)
        return json.dumps(clean_response, default=str)

    except Fault as fault:
        # Handle SOAP Faults (the equivalent of HTTP 500s)
        return json.dumps({"status": "error", "soap_fault": fault.message, "code": fault.code})
    except Exception as e:
        return json.dumps({"status": "error", "details": str(e)})


if __name__ == "__main__":
    # HOST MUST BE 0.0.0.0 to be accessible outside the container
    mcp.run(transport='sse', host='0.0.0.0', port=8000)

2. Containerization (Dockerfile)
To deploy this on platforms like Railway, AWS ECS, or internal Kubernetes clusters, we need a Dockerfile. We explicitly expose port 8000 for the SSE transport.
Dockerfile
# Use a slim Python image to keep the footprint small
FROM python:3.11-slim

# Set working directory
WORKDIR /app

# Install system dependencies required for lxml (xml parsing)
RUN apt-get update && apt-get install -y \
    libxml2-dev \
    libxslt-dev \
    gcc \
    && rm -rf /var/lib/apt/lists/*

# Install Python libraries
RUN pip install --no-cache-dir fastmcp zeep lxml

# Copy the server code
COPY server.py .

# Expose the FastMCP port
EXPOSE 8000

# Run the server
CMD ["python", "server.py"]

🔌 Connecting the Agent
Now that the “Retrofit” layer is running, we connect the AI Brain. In this example, we configure a CrewAI agent to utilize the SOAP Gateway.
The agent interacts via SSE (Server-Sent Events), ensuring real-time tool execution without polling.
main.py (CrewAI Configuration)
from crewai import Agent, Task, Crew, Process

# Define the Agent with access to the MCP tools
soap_specialist = Agent(
    role='Legacy Systems Integration Specialist',
    goal='Retrieve inventory data from the legacy SAP system via SOAP',
    backstory="""You are an expert in legacy enterprise protocols.
    You know how to talk to the 'Big Iron' servers using the SOAP Gateway tool provided.""",
    verbose=True,
    allow_delegation=False,
    # CONNECTIVITY: Pointing the agent to our FastMCP Docker container
    mcps=["http://localhost:8000/sse"]
)

# Define a task that requires SOAP interaction
inventory_task = Task(
    description="""
    1. Connect to the WSDL at 'http://legacy-erp.internal/soap?wsdl'.
    2. Call the 'GetStockLevel' method.
    3. Pass the argument SKU='AG-2025-RETRO'.
    4. Return the raw stock count.
    """,
    expected_output="The current stock level for SKU AG-2025-RETRO.",
    agent=soap_specialist
)

# Assemble the Crew
crew = Crew(
    agents=[soap_specialist],
    tasks=[inventory_task],
    process=Process.sequential
)

# Kickoff
if __name__ == "__main__":
    result = crew.kickoff()
    print("######################")
    print(result)

🚦 Troubleshooting & Common Errors
When retrofitting AI to 20-year-old software, expect friction. Here are the most common issues found in the AgentRetrofit Error Database:
1. zeep.exceptions.XMLSyntaxError
- Cause: The legacy server is returning malformed XML, often due to encoding issues (e.g., ISO-8859-1 vs UTF-8).
- Fix: In `server.py`, modify the transport session to force the correct encoding before `zeep` parses it.
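The byte-level part of that fix can be sketched with the standard library alone. The helper below is hypothetical (the name `normalize_to_utf8` and the ISO-8859-1 fallback are assumptions, not part of zeep): it decodes the raw response as UTF-8 when possible, falls back to the legacy encoding otherwise, and rewrites the XML declaration so the parser is not misled. Where to hook it in depends on how your transport is set up.

```python
import re

# Matches the encoding value inside the leading <?xml ... ?> declaration.
_DECLARATION = re.compile(r'(<\?xml[^>]*encoding=["\'])[^"\']+(["\'])')


def normalize_to_utf8(raw: bytes, fallback: str = "iso-8859-1") -> bytes:
    """Return the XML payload re-encoded as UTF-8 with a matching declaration."""
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        # Assumption: anything that is not valid UTF-8 is in the fallback encoding.
        text = raw.decode(fallback)
    # Rewrite only the first declaration so it agrees with the new byte encoding.
    text = _DECLARATION.sub(r"\g<1>utf-8\g<2>", text, count=1)
    return text.encode("utf-8")


if __name__ == "__main__":
    legacy = '<?xml version="1.0" encoding="ISO-8859-1"?><name>Müller</name>'
    print(normalize_to_utf8(legacy.encode("iso-8859-1")).decode("utf-8"))
```

The fallback is a guess by design: if the server mixes encodings or mislabels them in other ways, the right value has to come from inspecting real responses.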
2. “Max recursion depth exceeded”
- Cause: Complex WSDLs with circular references (common in SAP).
- Fix: We enabled `strict=False` and `xml_huge_tree=True` in the `Settings` object within `server.py` to mitigate this.
3. Connection Timeouts / Firewall Blocks
- Cause: Legacy servers often IP-whitelist requests. Your cloud Agent cannot reach the on-prem server.
- Fix: Use a residential proxy or a static IP proxy. Uncomment the `proxies` dictionary in `server.py` and inject your BrightData or Oxylabs credentials.
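Rather than hard-coding proxy credentials in `server.py`, one option is to assemble the `proxies` dictionary from environment variables. The sketch below assumes hypothetical variable names (`SOAP_PROXY_HOST`, `SOAP_PROXY_PORT`, `SOAP_PROXY_USER`, `SOAP_PROXY_PASSWORD`); they are not defined by zeep or by any proxy provider.

```python
import os
from collections.abc import Mapping
from urllib.parse import quote


def build_proxies(env: Mapping[str, str] = os.environ) -> dict:
    """Build a requests-style proxies dict from (hypothetical) environment variables."""
    host = env.get("SOAP_PROXY_HOST")
    if not host:
        return {}  # no proxy configured: connect directly
    port = env.get("SOAP_PROXY_PORT", "3128")
    user = env.get("SOAP_PROXY_USER")
    password = env.get("SOAP_PROXY_PASSWORD")
    auth = ""
    if user and password:
        # Percent-encode credentials so characters like '@' or ':' stay unambiguous.
        auth = f"{quote(user, safe='')}:{quote(password, safe='')}@"
    url = f"http://{auth}{host}:{port}"
    return {"http": url, "https": url}


if __name__ == "__main__":
    print(build_proxies({"SOAP_PROXY_HOST": "proxy.example.com"}))
```

The resulting dictionary can be assigned to the `proxies` attribute of a `requests.Session`, and that session handed to zeep through `zeep.transports.Transport(session=session)` and `Client(wsdl, transport=transport)`.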
🚀 Next Steps
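- Security: Add Basic Auth headers to the `zeep` Client if your SOAP endpoint requires a username/password. With zeep this is usually done by setting `auth` on the `requests.Session` that is passed to the transport.
- Caching: Wrap the `call_soap_service` function with `@lru_cache` if you are hitting read-heavy endpoints, to save on legacy compute cycles. Because `lru_cache` requires hashable arguments, the `arguments` dict has to be converted to a hashable key (for example a sorted JSON string) first.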
🛡️ Quality Assurance
- Status: ✅ Verified
- Environment: Python 3.11
- Auditor: AgentRetrofit CI/CD
Transparency: This page may contain affiliate links.