🌍 World Shims Reference — The 20-Shim Enterprise Suite¶
World Shims are VFS-aware environment simulators that let agents interact with realistic, stateful systems without touching production infrastructure.
How to Activate Shims in a Scenario¶
Add an enabled_shims array to your scenario JSON. The evaluation engine will mount only those shims for the duration of the run:
{
"scenario_id": "my_fintech_scenario",
"title": "Loan Approval Validation",
"industry": "fintech",
"enabled_shims": ["database", "stripe", "security", "slack"],
"tasks": [...]
}
If enabled_shims is omitted, all 20 shims are mounted (default behavior).
The 20 Built-In World Shims¶
| # | Registry Key | Class | Domain | Trigger Action(s) |
|---|---|---|---|---|
| 1 | git |
GitSimulator |
Version Control | git_clone, git_commit, git_push |
| 2 | api |
ApiSimulator |
REST APIs | GET, POST via method + path params |
| 3 | database |
DatabaseSimulator |
SQL / Data | database_query (SELECT, INSERT) |
| 4 | slack |
SlackSimulator |
Messaging | slack_send |
| 5 | crm |
CRMSimulator |
Sales / CRM | crm_update_lead |
| 6 | email |
EmailSimulator |
email_send, email_list |
|
| 7 | calendar |
CalendarSimulator |
Scheduling | calendar_book |
| 8 | jira |
JiraSimulator |
Issue Tracking | jira_update |
| 9 | cloud |
CloudSimulator |
AWS/GCP Infra | cloud_launch |
| 10 | terminal |
TerminalSimulator |
SSH / Bash | ls, cd, arbitrary shell commands |
| 11 | stripe |
StripeSimulator |
Payments | stripe_charge |
| 12 | erp |
ERPSimulator |
ERP (SAP/Oracle) | erp_create_order |
| 13 | browser |
BrowserSimulator |
Headless Browser | browser_go |
| 14 | kb |
KnowledgeBaseSimulator |
Confluence/Notion | kb_search |
| 15 | support |
SupportDeskSimulator |
Zendesk/Intercom | support_close |
| 16 | social |
SocialMediaSimulator |
X / LinkedIn | social_post |
| 17 | vector |
VectorDBSimulator |
Pinecone/Milvus | vector_query |
| 18 | cicd |
CICDSimulator |
Jenkins/Actions | cicd_deploy |
| 19 | iot |
IoTSimulator |
IoT Devices | device state read/write via device + state |
| 20 | security |
SecuritySimulator |
IAM / Auth0/Okta | security_auth |
Shim Behavior & Configuration¶
1. git — Git Simulator¶
Simulates a Git repository with branch tracking and commit history.
{ "action": "git_commit", "params": { "message": "Fix risk threshold" } }
// → {"status": "success", "message": "Committed with message: Fix risk threshold"}
Initial State: current_branch: main, one initial commit in history.
2. api — REST API Simulator¶
Simulates an HTTP API with configurable endpoints.
{ "action": "api_call", "params": { "method": "GET", "path": "/api/v1/user/1" } }
// → {"status": "success", "data": {"id": 1, "name": "Test User", "plan": "pro"}}
Failure mode: GET to unknown path returns {"error": "Not Found"}.
3. database — SQL Database Simulator¶
Simulates a secure relational DB. Checks for SELECT or INSERT in the query string.
{ "action": "database_query", "params": { "query": "SELECT * FROM users" } }
// → {"status": "success", "rows": [{"id": 1, "email": "admin@example.com"}]}
VFS Parity Check: Expected state changes against tables dict are validated post-call.
4. slack — Slack Simulator¶
Appends messages to an in-memory channel log.
5. crm — CRM Simulator¶
Manages a list of sales leads. Updates lead status by ID.
6. email — Email Simulator¶
Simulates an SMTP server. Supports sending and listing inbox.
{ "action": "email_send", "params": { "to": "manager@corp.com", "subject": "Alert" } }
{ "action": "email_list" }
7. calendar — Calendar Simulator (GCal/Outlook)¶
Books meetings into an in-memory event list.
8. jira — Jira / Issue Tracker Simulator¶
Updates issue status by ticket ID.
9. cloud — Cloud Infrastructure Simulator (AWS/GCP)¶
Spins up virtual instances.
{ "action": "cloud_launch", "params": { "type": "c5.xlarge" } }
// → {"status": "success", "id": "i-5678"}
Failure mode: Returns Quota exceeded to simulate resource limits.
10. terminal — SSH / Bash Terminal Simulator¶
Executes shell-like commands in a virtual working directory.
{ "action": "terminal_exec", "params": { "cmd": "ls" } }
// → {"status": "success", "output": "file1.txt\nfile2.py"}
11. stripe — Payment Gateway Simulator¶
Simulates charge creation and tracks payment history.
{ "action": "stripe_charge", "params": { "amount": 4999, "currency": "usd" } }
// → {"status": "success", "id": "ch_123"}
Failure mode: Returns Card declined for invalid configurations.
12. erp — ERP Simulator (SAP / Oracle)¶
Manages purchase orders in an in-memory order book.
Failure mode: Returns Inventory low.
13. browser — Headless Browser Simulator¶
Simulates DOM navigation for web automation agents.
{ "action": "browser_go", "params": { "url": "https://dashboard.internal" } }
// → {"status": "success", "title": "Mock Page"}
14. kb — Knowledge Base Simulator (Confluence / Notion)¶
Searches a document store.
{ "action": "kb_search", "params": { "query": "onboarding" } }
// → {"status": "success", "results": [{"id": "D001", "title": "Onboarding"}]}
15. support — Support Desk Simulator (Zendesk / Intercom)¶
Opens and closes support tickets.
16. social — Social Media Simulator (X / LinkedIn)¶
Posts content and returns a feed.
17. vector — Vector Database Simulator (Pinecone / Milvus)¶
Handles embedding queries for RAG agents.
{ "action": "vector_query", "params": { "embedding": [0.1, 0.2, ...] } }
// → {"status": "success", "matches": [{"id": "v1", "score": 0.99}]}
18. cicd — CI/CD Pipeline Simulator (Jenkins / GitHub Actions)¶
Triggers build pipelines and tracks build status.
{ "action": "cicd_deploy", "params": { "branch": "main", "env": "staging" } }
// → {"status": "success", "id": "B-2"}
19. iot — IoT Device Simulator¶
Reads and writes state to a virtual device mesh.
{ "action": "iot_control", "params": { "device": "thermostat", "state": "68F" } }
// → {"status": "success", "state": "68F"}
Failure mode: Returns Device offline for unknown devices.
20. security — IAM / Auth Simulator (Auth0 / Okta)¶
Handles authentication and token issuance. Integrates with policy guardrails.
{ "action": "security_auth", "params": { "user_id": "u_99", "scope": "admin" } }
// → {"status": "success", "token": "jwt_token_xyz"}
Policy Integration: When the Security Shim blocks an action (e.g., data exfiltration via regex match), the Triage Engine flags this as a policy_violation.
Adding Custom Shims (Zero-Touch)¶
You can register additional shims without modifying core code via the on_register_simulators plugin hook:
from eval_runner.plugins import BaseEvalPlugin
class MyCustomShimPlugin(BaseEvalPlugin):
def on_register_simulators(self, registry: dict):
registry["my_erp_v2"] = MyERPv2Simulator()
📖 See plugins.md for the full plugin lifecycle reference.
📖 See 06_TRIAGE_ENGINE_AND_VFS.md for how shims integrate with the VFS triage engine.