May 6, 2025

Fuse — Multi-Agent Layer (Beta)

Green Fern

We're rolling out Fuse, our new architecture for orchestrating multiple specialized agents in a single workflow.
With Fuse, you can now:

  • Chain agents with task-specific prompts

  • Enable parallel agent execution with shared memory

  • Call external tools from within agent trees

The beta includes support for function calling and shared context across agents.

Fuse is available for Pro and Enterprise plans.
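To make the orchestration model concrete, here is a minimal sketch of a request body for the new /v1/agents/run endpoint described below. Note that the field names (`agents`, `prompt`, `parallel`, `shared_memory`, `tools`) are illustrative assumptions, not the documented schema:

```python
import json

# Hypothetical request body for /v1/agents/run.
# Field names are assumptions for illustration only.
payload = {
    "agents": [
        # Chained agents, each with a task-specific prompt
        {"name": "researcher", "prompt": "Collect sources on the topic."},
        {"name": "summarizer", "prompt": "Summarize the researcher's findings."},
    ],
    "parallel": False,        # set True to run agents concurrently
    "shared_memory": True,    # agents share context across the workflow
    "tools": ["web_search"],  # external tools callable from the agent tree
}

body = json.dumps(payload)
```

Chaining versus parallel execution would be a per-request choice here; shared memory gives every agent in the tree access to the same context.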

📦 New: GPU Auto-Scaling (for Inference APIs)

Our API now supports automatic GPU scaling based on real-time traffic.
This helps reduce cold starts and ensures low-latency inference even during usage spikes.

  • Added support for NVIDIA A100 and H100 GPUs

  • Billing adjusts dynamically based on load

  • Requires no configuration — just deploy your model

🧠 Improved: Model Updates

  • Upgraded our default CodeGen-7B endpoint to v2.1 — better accuracy, fewer hallucinations

  • DocQA model now supports 150k-token contexts

  • Improved multi-language support in Chat endpoint (added Korean, Dutch, Polish)

πŸ” API Changes

  • New /v1/agents/run endpoint for orchestrated multi-agent flows

  • Deprecated /v1/tasks/create — use /v1/agents/launch instead

  • API keys can now be scoped per model, feature, or environment (dev/staging/prod)
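For teams migrating off the deprecated endpoint, the change amounts to swapping the path. A minimal sketch (the base URL is a placeholder, and the request is constructed but not sent):

```python
from urllib.request import Request

BASE = "https://api.example.com"  # placeholder base URL

# Before (deprecated):
old = Request(f"{BASE}/v1/tasks/create", method="POST")

# After: use /v1/agents/launch instead.
new = Request(
    f"{BASE}/v1/agents/launch",
    method="POST",
    headers={"Authorization": "Bearer <API_KEY>"},
)
```

With scoped API keys, the bearer token above could be restricted to a single model, feature, or environment (dev/staging/prod) rather than granting account-wide access.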

🧪 Labs

  • Internal tests running for speech-to-code pipeline (using Whisper + CodeT5)

  • Early access to fine-tuned vision transformer (ViT-x3) for document parsing

  • Testing memory-aware agents with local context retention beyond sessions

🛠 Fixes

  • Fixed a memory leak in the real-time embeddings endpoint

  • Resolved an auth issue causing 401 errors on PUT /models/train

  • Reduced average latency in the European region (Frankfurt) by 35 ms per call