AI++ // why is everyone talking about loops?


When we last published AI++, Anthropic had just launched Fable 5 and everyone was very excited about another step change in model quality. That excitement didn’t last long, as the US government issued an export directive that effectively meant Anthropic had to withdraw access. If you missed out, Ethan Mollick wrote about what it was like to work with Fable 5.

I found two interesting looks at the world of AI this week. First, Anthropic published the results of 81,000 interviews with users of AI, asking what people want from it. It covers both benefits and concerns, so it is a fascinating look at how people are experiencing AI. Meanwhile, the folks at Build Club put together a bunch of research into how industries are using AI in the State of AI.

In this edition of AI++ we look at the new buzzword everyone is talking about: loops! We also investigate the promise of multi-agent systems, the world of data extraction, and see whether an AI model can beat Zork.

Phil Nash

Developer relations engineer for IBM

🛠️ Building with AI, Agents & MCP

Loops and durability

There has been a lot of talk lately about how you should no longer be prompting coding agents, but building loops that do the prompting. I think this should apply to building agents too, as the key to loops is giving an agent a verifiable goal and letting it iterate until it reaches that goal. PostHog wrote up why they are bullish on loops, which outlines the basics of how they work.

One of the emerging building blocks for loops is durable execution. Dan Farrelly from Inngest wrote about Agent Loop Architecture and Sunil Pai of Cloudflare wrote Never Waste a Token, bringing in resumable streaming to the problem.

On the governance side of the loop, both IBM and Amazon have been talking about how human-in-the-loop isn’t the silver bullet it sounds like for AI supervision. IBM warns about automation bias and argues for human accountability while Amazon warns about the normalization of deviance and how human identity and ownership should govern AI decisions.

Multi-LLM systems

Fable 5 may have been retracted, but there have been reports of better results coming not from a single model but from panels of models working together. OpenRouter’s Fusion has been benchmarked as beating individual models by forming a model network. Sakana released Fugu, a “multi-agent system as a model” that they also report beats Fable on certain benchmarks.

It’s great that lesser models can team up to beat a frontier level model, but it is worth considering time and cost. OpenRouter say that there will be N panel calls + 1 judge call, so costs for the default 3-model panel will be 4-5 times the cost of a single prompt. I don’t think this makes multi-model panels the future yet, but perhaps for the hardest problems they will be worth it.
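The arithmetic behind that estimate can be sketched in a few lines. This assumes every panel call costs about the same as a single prompt. The `judge_cost_ratio` parameter is a hypothetical knob, not something OpenRouter publishes, to account for a judge call that may cost more because it has to read every panel answer.

```python
def panel_calls(panel_size: int) -> int:
    """N panel calls plus 1 judge call."""
    return panel_size + 1


def relative_cost(panel_size: int, judge_cost_ratio: float = 1.0) -> float:
    """Cost relative to one single-model prompt.

    Assumes each panel call costs 1 unit. judge_cost_ratio is a made-up
    multiplier for how expensive the judge call is compared with that unit.
    """
    return panel_size + judge_cost_ratio


print(panel_calls(3))         # 4
print(relative_cost(3))       # 4.0
print(relative_cost(3, 2.0))  # 5.0
```

Under these assumptions the default 3-model panel costs 4 times a single prompt if the judge call is priced like any other, and 5 times if the judge call costs twice as much.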

Formatting data for AI

This week saw the release of Docling for IBM watsonx, a managed service for the open-source Docling document converter. Also Docling-related, the Linux Foundation formed a working group to develop DocLang as a standard for representing documents in a structured, AI-native manner.

Google also launched the Open Knowledge Format which is intended to formalize Andrej Karpathy’s LLM-wiki pattern into an interoperable format.

🧠 New models

Z.ai’s open-source GLM-5.2 has been the one model that everyone is talking about this week. It’s too big to run yourself, unless you have your own data center, but it’s significantly cheaper than other frontier models and is being compared favourably to Opus 4.8 and GPT-5.5.

🗞️ Other news

Fun with AI

Have you found yourself wondering whether LLMs are sentient? Well, you can stop now, since a Microsoft researcher built a goat-powered LLM in a game. If LLMs are sentient, then so is 1999’s Age of Empires II. Speaking of games, Raymond Camden pointed Chrome’s built-in prompt API at a game of Zork to see if it could win. I think the grues are winning so far.

Anthropic’s naming of model levels has been the most poetic, literally, and one data scientist wondered what would happen if you extrapolated their naming to enterprise-scale narrative objects.

🧑‍💻 Code & Libraries

🔦 Langflow Spotlight

Langflow Memory Bases provide AI agents with long-term, persistent memory across chat sessions using a vector-based storage layer. They use semantic search to retrieve relevant past context. Developers can filter memory by session, control ingestion timing, and use LLM preprocessing to filter out noise, ensuring agents only remember useful, relevant details over time. Check out this video walkthrough of Memory Bases.

🗓️ Events

On Friday 26th June I’ll be speaking at AgentCon Perth on building MCP Apps.

From June 29th until July 2nd the AI Engineer World’s Fair is on in San Francisco. You can catch Tejas Kumar from the team speaking about Evals in AI on the first day.

The AI Coding Summit will be in London and online on July 6th and 7th with talks and workshops on MCP, agentic systems, AI-driven testing & debugging, and real-world best practices.

Use the promo code AI++ for a 10% discount on tickets.

Enjoy this newsletter? Forward it to a friend.

