Mastra adds AI Gateway tools and sturdier agent memory
For engineers, designers & product people. Stay up to date with free daily digest.
TLDR: Mastra tightens the agent loop, AWS ships evals and AI A/B patterns, and the tooling around agentic coding keeps getting sharper.
Mastra 1.14.0 adds AI Gateway tools and stronger memory
The @mastra/core 1.14.0 release adds native support for AI Gateway tools, like gateway.tools.perplexitySearch(), directly in the Mastra agentic loop. The runtime now infers providerExecuted, merges streamed provider results back into the originating tool call, and avoids re-running tools locally when the gateway already produced an answer. The release also improves observational memory stability through dated message boundaries, which should reduce cache weirdness and retrieval drift.
For anyone wiring agents to hosted tool ecosystems or search APIs, this makes Mastra feel more like a first-class orchestration layer instead of glue code. The memory tweaks are subtle but important if you rely on long-lived agents. As of 2026-03-19 there are no public benchmarks, so you will want to watch real traces after upgrading.
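The skip-local-re-execution behavior described above can be sketched in plain code. This is an illustrative model, not Mastra's actual implementation: the names (`ToolCall`, `merge_stream_result`, `run_pending_tools`) are hypothetical, standing in for the runtime's handling of `providerExecuted` results.

```python
# Illustrative sketch (not Mastra's actual code) of the dedup behavior:
# a tool call only runs locally when the gateway has not already streamed
# back a provider-executed result.
from dataclasses import dataclass


@dataclass
class ToolCall:
    call_id: str
    name: str
    args: dict
    provider_executed: bool = False  # inferred from the gateway stream
    result: object = None


def merge_stream_result(calls: dict, call_id: str, result: object) -> None:
    """Merge a streamed provider result back into the originating call."""
    call = calls[call_id]
    call.result = result
    call.provider_executed = True


def run_pending_tools(calls: dict, local_tools: dict) -> None:
    """Execute only the calls the gateway did not already answer."""
    for call in calls.values():
        if call.provider_executed:
            continue  # gateway produced an answer; skip local re-execution
        call.result = local_tools[call.name](**call.args)


# Example: one gateway-executed search, one locally executed tool.
calls = {
    "c1": ToolCall("c1", "perplexitySearch", {"query": "mastra 1.14"}),
    "c2": ToolCall("c2", "add", {"a": 2, "b": 3}),
}
merge_stream_result(calls, "c1", {"answer": "release notes..."})
run_pending_tools(calls, {"add": lambda a, b: a + b})
```

The point of the pattern is that the provider result and the local result share one call record, so downstream steps in the loop see a single answer per tool call regardless of where it was executed.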
AWS details Strands Evals for production AI agents
In a new AWS Machine Learning Blog post, Amazon Web Services walks through Strands Evals, a framework for systematically evaluating AI agents before and after production deployment. The guide covers built-in evaluators, multi-turn simulations, and patterns for integrating evals into CI or live monitoring, and it focuses on concrete workflows instead of abstract metrics.
If you are responsible for shipping agents into regulated or high-volume surfaces, this is worth a close read. The examples show how to design task-specific evals, run scenario simulations, and feed results back into model or prompt updates. As of 2026-03-19 this is still an AWS-centric stack, so it fits best if you are already on Amazon Bedrock or broader AWS infrastructure.
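The eval-then-gate workflow described above can be sketched generically. This is a hypothetical harness, not the Strands Evals API: `run_evals`, the toy agent, and the evaluator names are all assumptions made for illustration.

```python
# Hypothetical eval harness (not the Strands Evals API): run task-specific
# evaluators over a set of scenarios and compute per-evaluator pass rates
# that a CI job could gate on.
from typing import Callable


def run_evals(agent: Callable[[str], str],
              scenarios: list,
              evaluators: dict) -> dict:
    """Run each scenario through the agent; score output with every evaluator."""
    passed = {name: 0 for name in evaluators}
    for scenario in scenarios:
        output = agent(scenario["input"])
        for name, check in evaluators.items():
            if check(output, scenario):
                passed[name] += 1
    n = len(scenarios)
    return {name: count / n for name, count in passed.items()}


# Toy agent and evaluators standing in for a real deployment.
agent = lambda prompt: f"REFUND approved for {prompt}"
scenarios = [
    {"input": "order 1", "must_contain": "REFUND"},
    {"input": "order 2", "must_contain": "REFUND"},
]
evaluators = {
    "mentions_refund": lambda out, sc: sc["must_contain"] in out,
    "no_pii_leak": lambda out, sc: "SSN" not in out,
}
scores = run_evals(agent, scenarios, evaluators)
assert all(rate >= 0.95 for rate in scores.values())  # the CI gate
```

In a real pipeline the agent call would hit your deployed model, the scenarios would come from recorded traffic or authored simulations, and failing the gate would block the deploy.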
AWS shows AI-powered A/B testing with Amazon Bedrock
A separate AWS Machine Learning Blog post walks through building an AI-powered A/B testing engine using Amazon Bedrock, Amazon Elastic Container Service, Amazon DynamoDB, and the Model Context Protocol. Instead of static bucketing, the system uses user context to assign variants dynamically during experiments, while still tracking metrics per treatment.
Product teams that already lean on large language models for personalization will recognize the pattern: treat the model as a policy that chooses which experience to show. The writeup is useful because it addresses practical pieces such as state storage, latency, and experiment integrity. As of 2026-03-19 this is a reference architecture rather than a turnkey service, so plan on real engineering work to adapt it to your stack.
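The core pattern, a policy choosing the variant from context, sticky assignments for experiment integrity, and per-treatment metrics, can be sketched without any AWS services. Everything below is illustrative: `ABEngine` and its members are hypothetical stand-ins, with the in-memory dict playing the role of the DynamoDB assignment store and the toy policy playing the role of the Bedrock model call.

```python
# Illustrative sketch of the pattern (names are hypothetical, not the AWS
# reference architecture): a policy picks a variant from user context,
# assignments are stored so users stay in their bucket, and exposure and
# conversion metrics are tracked per treatment.
class ABEngine:
    def __init__(self, policy, variants):
        self.policy = policy          # stands in for an LLM-backed policy call
        self.variants = variants
        self.assignments = {}         # stands in for a DynamoDB table
        self.metrics = {v: {"exposures": 0, "conversions": 0} for v in variants}

    def assign(self, user_id: str, context: dict) -> str:
        # Sticky assignment: once a user is bucketed, keep them there so
        # per-treatment metrics stay interpretable even as context changes.
        if user_id not in self.assignments:
            self.assignments[user_id] = self.policy(context, self.variants)
        variant = self.assignments[user_id]
        self.metrics[variant]["exposures"] += 1
        return variant

    def record_conversion(self, user_id: str) -> None:
        self.metrics[self.assignments[user_id]]["conversions"] += 1


# Toy policy: new users see the onboarding-heavy variant B.
policy = lambda ctx, variants: "B" if ctx.get("is_new_user") else "A"
engine = ABEngine(policy, ["A", "B"])
first = engine.assign("u1", {"is_new_user": True})
engine.record_conversion("u1")
```

The sticky-assignment check is the experiment-integrity piece the post emphasizes: the model influences which bucket a user lands in, but the bucket itself is fixed for the life of the experiment.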
Quick Hits
- Show HN: Tmux-IDE, OSS agent-first terminal IDE. Declarative, scriptable tmux-based IDE focused on agentic engineering, so you can spin up always-on, SSH-accessible multi-agent coding setups tied to your terminal.
- Show HN: Reprompt – Score your AI coding prompts with NLP papers. Open source tool that scores coding prompts against findings from natural language processing research, useful if you want a more principled way to iterate on agent prompts.
- NVIDIA GTC 2026: Agentic AI inflection hits healthcare and life sciences. NVIDIA highlighted Cosmos H synthetic data models, GR00T H vision-language-action models, and Rheo hospital digital twin blueprints, all aimed at building simulation-heavy healthcare agents. As of 2026-03-19 most artifacts target partners and early adopters.
- Cook: A simple CLI for orchestrating Claude Code. Minimal command-line interface that wraps Claude Code into reproducible workflows, so you can script agentic coding sessions instead of driving everything from the UI.
- Introducing Nova Forge SDK for customizing Nova models. New Amazon Nova Forge SDK streamlines fine-tuning Nova models with opinionated recipes and dependency management, targeting enterprise teams that want customization without maintaining training infra.
- AI Agents For Retail: How retail AI agents work. Shopify outlines agentic commerce patterns such as in-chat checkout protocols and inventory agents for demand forecasting, useful background if you are building retail-facing agents.
- Meta’s Manus AI agent arrives on your desktop. Meta is preparing a Manus desktop agent that connects its Avocado models to OpenClaw-compatible workflows, signaling that open source agent frameworks are now integration targets for big consumer players. Also covered by: The Next Web
- VoltAgent/awesome-codex-subagents. Curated list of 130+ specialized Codex subagents for common development tasks, currently at 971 stars, handy inspiration if you are designing modular tool-using agents.
- openai-python v2.29.0. Latest OpenAI Python SDK adds 5.4 nano and mini model slugs, a /v1/videos endpoint in batches, and a defer_loading field for ToolFunction, which can simplify complex tool setups.
- Autoresearching Apple’s “LLM in a Flash” to run Qwen 397B locally. Dan Woods demonstrates Qwen3.5 397B A17B running at 5.5+ tokens per second on a 48GB M3 Max MacBook using Apple’s LLM in a Flash tricks, a strong signal for near-future local giant-model agents.
- State of Open Source on Hugging Face: Spring 2026. Hugging Face analyzes open source AI trends across regions, model types, and communities, useful context for deciding where to bet your agent stack.
- Polly is generally available everywhere you work in LangSmith. LangChain’s Polly assistant is now GA inside LangSmith, giving you an AI helper that can traverse deep traces and prompts to debug misbehaving agents more quickly.