The Agentic Digest

Tiny Kitten TTS models hit SOTA expressivity on-device

6 min read · ai-agents · tts · rag · devtools

For engineers, designers & product people. Stay up to date with our free daily digest.

TLDR: Tiny open TTS models get scary good, AWS tries RAG for video, and Thomson Reuters explains why agentic workflows are eating legal AI.

Kitten TTS releases three tiny expressive speech models

On 2026-03-20, Kitten TTS, an open source text-to-speech project, released three new on-device models at 80M, 40M, and 14M parameters, the smallest weighing in under 25 MB. The 80M model targets maximum quality, while the 14M model is claimed to reach state-of-the-art expressivity among similarly sized models, which is unusual at this footprint.

For agent builders, this is a concrete step toward shipping fully offline, voice-native agents on phones, consumer devices, and embedded hardware. No paid API, no network call, and small enough that you can realistically bundle multiple voices or languages inside a single app. Quality is still below frontier server TTS and you should expect tradeoffs in edge cases, but the scale here is what makes it interesting.

If you are prototyping voice agents, it is worth A/B testing Kitten TTS against your current cloud TTS and seeing where users notice the gap versus where the latency and privacy wins dominate.
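A minimal sketch of such an A/B latency harness, with hypothetical stand-in backends (the `synthesize_local` and `synthesize_cloud` functions below simulate inference with sleeps; in a real test you would swap in actual Kitten TTS and cloud TTS calls):

```python
import statistics
import time

# Hypothetical stand-ins for the two backends under test: replace these
# with real calls to Kitten TTS (on-device) and your cloud TTS client.
def synthesize_local(text: str) -> bytes:
    time.sleep(0.01)  # simulated on-device inference time
    return b"\x00" * len(text)

def synthesize_cloud(text: str) -> bytes:
    time.sleep(0.05)  # simulated network round trip plus inference
    return b"\x00" * len(text)

def benchmark(synth, utterances, runs=3):
    """Return the median end-to-end latency in seconds per utterance."""
    samples = []
    for _ in range(runs):
        for text in utterances:
            start = time.perf_counter()
            synth(text)
            samples.append(time.perf_counter() - start)
    return statistics.median(samples)

utterances = ["Your order has shipped.", "Turn left in 200 meters."]
local_ms = benchmark(synthesize_local, utterances) * 1000
cloud_ms = benchmark(synthesize_cloud, utterances) * 1000
print(f"local median: {local_ms:.1f} ms, cloud median: {cloud_ms:.1f} ms")
```

Pairing a harness like this with blind listening tests gives you both halves of the comparison: where latency and privacy favor on-device, and where users actually hear the quality gap.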

Read more →


Thomson Reuters: Legal AI is a workflow orchestration game

On 2026-03-20, Thomson Reuters Legal Solutions published a deep dive on why its CoCounsel Legal product is thriving while many legal AI startups fade, arguing that legal AI in 2026 is defined by workflow orchestration. CoCounsel Legal runs multi-stage, agentic workflows over trusted, expert-curated content instead of just offering a chat box on top of generic large language models.

The core claim: legal teams buy measurable outcomes such as hours saved, better margins, and lower compliance risk, not model cleverness. That aligns with what many of you see in other verticals: the value is in encoding real-world procedures, routing, and guardrails, not just retrieval-augmented generation. The piece also emphasizes evaluation, since an apparently smart agent that fails bar-adjacent tasks is worse than useless.

If you are building agents for regulated domains, the article reads like a playbook for how to wrap models in domain-aware workflows that procurement, risk, and partners will actually approve.
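The staged-workflow idea can be sketched in a few lines. Everything here is illustrative: the `Matter` type, the curated corpus, and the stages are hypothetical, and the model call is stubbed out. The point is the shape, with explicit, auditable stages and a guardrail that can veto the output:

```python
from dataclasses import dataclass, field

@dataclass
class Matter:
    question: str
    citations: list = field(default_factory=list)
    draft: str = ""
    approved: bool = False

# Stand-in for an expert-curated content store keyed by topic.
CURATED_CORPUS = {
    "limitation period": "Statute X s.12: claims must be filed within 2 years.",
}

def retrieve(matter: Matter) -> Matter:
    # Pull only from the trusted corpus, never from the open web.
    for key, passage in CURATED_CORPUS.items():
        if key in matter.question.lower():
            matter.citations.append(passage)
    return matter

def draft(matter: Matter) -> Matter:
    # Replace with a model call; here the draft just echoes retrieved text.
    matter.draft = " ".join(matter.citations) or "No authority found."
    return matter

def guardrail(matter: Matter) -> Matter:
    # Refuse to approve any draft that cites nothing from the corpus.
    matter.approved = bool(matter.citations)
    return matter

def run_workflow(question: str) -> Matter:
    matter = Matter(question)
    for stage in (retrieve, draft, guardrail):
        matter = stage(matter)
    return matter

result = run_workflow("What is the limitation period for filing?")
print(result.approved, result.draft)
```

Encoding the procedure as explicit stages is what makes the evaluation story tractable: each stage can be tested and audited on its own, which is exactly what procurement and risk teams ask for.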

Read more →


AWS introduces V-RAG for grounded AI video generation

On 2026-03-20, the AWS Machine Learning Blog introduced Video Retrieval-Augmented Generation (V-RAG), a pattern that applies retrieval-augmented generation to video models to produce more accurate and controllable output. V-RAG pulls relevant assets or metadata, then conditions a video model on that retrieved context instead of generating from a bare prompt.

This matters if you are trying to build agents that output video in enterprise or media settings where brand consistency and factual grounding matter. With this approach, you can drive video generation from structured product data, scripts, or knowledge bases stored in Amazon Bedrock or other retrieval layers, which reduces hallucinated visuals and off-brand scenes. The blog focuses on production workflows and reliability rather than raw model novelty.
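The retrieve-then-condition step can be sketched as below. This is a toy illustration, not the AWS implementation: the asset store, tag-overlap ranking, and request shape are all hypothetical, standing in for whatever retrieval layer and video-model API you actually use:

```python
# Hypothetical brand-asset store; in practice this would be a vector or
# metadata index over approved creative assets.
ASSET_STORE = [
    {"tags": {"logo", "blue"}, "uri": "s3://brand/logo_blue.png",
     "note": "Primary logo, use on light backgrounds only."},
    {"tags": {"product", "headset"}, "uri": "s3://brand/headset_hero.png",
     "note": "Current-generation headset, matte finish."},
]

def retrieve_assets(prompt: str, store=ASSET_STORE, k=2):
    """Rank assets by tag overlap with the prompt's words (toy retriever)."""
    words = set(prompt.lower().split())
    scored = [(len(a["tags"] & words), a) for a in store]
    return [a for score, a in sorted(scored, key=lambda s: -s[0]) if score][:k]

def build_video_request(prompt: str) -> dict:
    # Condition the video model on retrieved context, not the bare prompt.
    assets = retrieve_assets(prompt)
    grounding = "; ".join(f"{a['uri']} ({a['note']})" for a in assets)
    return {
        "prompt": f"{prompt}. Ground visuals in: {grounding}" if assets else prompt,
        "reference_images": [a["uri"] for a in assets],
    }

request = build_video_request("30-second ad for the new headset product")
print(request["reference_images"])
```

The design choice is the same as in text RAG: the retrieval layer, not the prompt author, decides which grounding material reaches the model, so brand and factual constraints live in curated data rather than in per-request prompt discipline.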

AWS also published a companion walkthrough using Amazon Bedrock and Amazon Nova Reel that shows how to wire this into a fully automated pipeline so agents can turn text and images into templated videos.

Read more →


Quick Hits

  • Enhanced metrics for Amazon SageMaker AI endpoints: deeper visibility for better performance. Amazon SageMaker AI endpoints now expose enhanced, configurable metrics for production inference. Useful if you are running multi-model or agent backends on SageMaker and need more granular latency and error visibility.

  • Launch HN: Canary (YC W26) – AI QA that understands your code. Canary builds AI agents that read your codebase, diff pull requests, and then generate and execute tests for affected user flows. If you are trying to automate regression testing for fast-moving teams, this is directly in your lane.

  • How we monitor internal coding agents for misalignment. OpenAI describes how it uses chain-of-thought monitoring on internal coding agents to detect misalignment and risky behavior. Worth a read if your agents write or run code in semi-autonomous modes.

  • Multiverse Computing pushes its compressed AI models into the mainstream. TechCrunch profiles Multiverse Computing and the broader shift toward small, compressed models for edge and cost-sensitive deployments, mentioning competitors like Mistral Small 4. Good context if you are deciding between large language models and distilled variants.

  • Use RAG for video generation using Amazon Bedrock and Amazon Nova Reel. AWS walks through a practical V-RAG pipeline that turns structured prompts and images into grounded video assets through Nova Reel. This is the concrete implementation behind the higher-level V-RAG concept.

  • litellm v1.82.3.dev.2. The latest dev release for the LiteLLM router adds UI test coverage, improves aiohttp session recovery, and introduces richer org admin controls for team listing endpoints. Handy if you use LiteLLM as the abstraction layer for multi-provider agents.

  • langchain==1.2.13. LangChain 1.2.13 ships minor fixes plus better LangSmith integration metadata for create_agent and init_chat_model. If you rely on LangSmith to observe agents in production, this makes wiring metrics and traces slightly cleaner.

  • Warranty Void If Regenerated. A longform piece on building a fiction project with Claude using world bibles and style guides, basically treating storytelling as an agentic workflow. Interesting patterns if you orchestrate creative agents.

  • Securing Enterprise AI with Weaviate. Weaviate explains how to lock down enterprise deployments with OpenID Connect, role-based access control, and multi-tenant isolation. Relevant if your retrieval-augmented generation stack sits on Weaviate.

  • AI Is No Longer a Tool, It’s a Workforce Strategy. Opinion piece arguing that companies like Alibaba, Microsoft, and Anthropic are turning agents into an execution layer inside existing tools. High level but matches what many of you see with embedded workflow agents.

  • Introducing LangSmith Fleet. LangChain rebrands Agent Builder as LangSmith Fleet, positioned as a central place for teams to build, use, and manage agents across an enterprise. If you are standardizing on LangSmith, this is their multi-team control plane.

  • anthropic-sdk-python v0.86.0. The latest Python SDK adds filesystem memory tools plus manual API updates. That makes it easier to give agents persistent scratch space using a first-party abstraction.

More from the Digest


© 2026 The Agentic Digest