An evaluation suite for agentic models in real MCP tool environments (Notion / GitHub / Filesystem / Postgres / Playwright). MCPMark provides a reproducible, extensible benchmark for researchers and ...
AGPL-3.0: because research infrastructure deserves the same freedoms as the software it runs on.

.env.d/
├── entry.src        # Single entry point
├── 00_scitex.env    # Base settings (SCITEX_DIR)
├── ...