Issue #372 - The ML Engineer 🤖

Claude Code is Not Replacing Devs, Agentic Monitoring Release of KAOS, OpenAI’s In-House Data Agent, DeepMind's Take on World Models, MeiTuan Bi-Lingual Image Model + more 🚀

Feb 01, 2026

We are releasing Agentic Monitoring for the K8s Agent Orchestration System 🚀

Check out the demo for the latest release 🔥👇 Post coming next week!

If you want to support the momentum, please do reshare, open an issue, and/or give the repo a star ⭐

github.com/axsaucedo/kaos 🔥

This week in ML Engineering:

Claude Code: It’s Not Replacing Devs
Agentic Monitoring Release of KAOS
OpenAI’s In-House Data Agent
DeepMind’s Take on World Models
MeiTuan Bi-Lingual Image Model
Open Source ML Frameworks
Awesome AI Guidelines to check out this week
+ more 🚀

Claude Code: It’s Not Replacing Devs

Excited to finally share my new article: “Claude Code: It’s not replacing devs. It’s moving them to a higher altitude.” 🚀 In this post I reflect on where software engineering practice as a whole is heading as agentic coding tools transform it.

TL;DR: My biggest observation after building with agentic tooling is that the real unlock isn’t “faster coding”; it’s operating at a higher cognitive level of abstraction. This creates a new reality: when you can generate and integrate at a higher level, the differentiator becomes: 1) what you choose to build; 2) how precisely you specify it; and 3) how efficiently you verify it.

Depending on the day of the week you may encounter one of these: “AI is making developers 10x”, “AI is making developers less productive”, “AI is coming for our jobs”, “AI enables coders”. Ironically I explore this with a meme that may give us the closest answer: specs as the age-old code abstraction.

And if specs become the primary artifact, then the next bottleneck is the execution engine: namely, the runtime that continuously reconciles intent vs reality (plans, delegates, validates, observes, and iterates), with “code” as just one of its outputs. It is now on us as practitioners and leaders to navigate this “shift upwards” in the cognitive stack: individuals can now do what teams could, teams can do what departments could, and what follows should be able to invent the future.
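
To make the "reconcile intent vs reality" idea a bit more concrete, here is a minimal sketch of such a loop in Python. It is a toy under stated assumptions, not how any particular agent runtime works: the `reconcile` function, the dictionary-based spec, and the `observe`/`act` callbacks are all hypothetical names made up for illustration.

```python
from typing import Callable


def reconcile(
    spec: dict[str, str],
    observe: Callable[[], dict[str, str]],
    act: Callable[[str, str], None],
    max_rounds: int = 5,
) -> int:
    """Drive the observed state toward the spec; return the number of actions taken."""
    drift: dict[str, str] = {}
    for rounds_used in range(max_rounds + 1):
        actual = observe()  # observe: read the current reality
        # validate: every spec entry that reality does not yet satisfy
        drift = {key: want for key, want in spec.items() if actual.get(key) != want}
        if not drift:
            return rounds_used
        if rounds_used < max_rounds:
            key, want = next(iter(drift.items()))  # plan: pick one gap to close
            act(key, want)  # delegate: hand the work to a worker or agent
    raise RuntimeError(f"spec not satisfied after {max_rounds} rounds: {sorted(drift)}")


# Hypothetical usage: the "worker" simply writes the wanted value into the state.
state = {"tests": "failing"}
spec = {"tests": "passing", "docs": "written"}
rounds = reconcile(spec, observe=lambda: dict(state), act=state.__setitem__)
print(rounds, state)
```

The point of the sketch is only that the spec is the input and everything else, including any code produced by the worker, is a side effect of closing the gap.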

OpenAI’s In-House Data Agent

OpenAI shares how they tackled text-to-SQL at ChatGPT scale, and there are some learnings that most organisations can take away: It is interesting that OpenAI is exploring the same domains as many other organisations when it comes to leveraging agentic systems to automate analytics and insights gathering. This is basically the usual translation of a natural-language question into end-to-end analytics, including table discovery, SQL generation/execution, iterative self-correction, and synthesis. It seems that they tackle the same foundational basics, including schema/lineage and historical query patterns, with the ever-required human annotations. There are some interesting hints at Codex-derived code-level table semantics, which sounds like it is what helps bring more annotated capabilities. Their main advantage seems to be the continuous eval systems they have built together with the curation from human annotations, but like the broader industry they still seem to be figuring out what works and what doesn’t. This is going to be quite interesting once we hit the Claude Code moment for copilots in the space of analytics and BI, which shouldn’t be too far off at this stage!
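
As a rough illustration of the generic pipeline described above (table discovery, SQL generation/execution, iterative self-correction), here is a minimal sketch using Python's built-in `sqlite3`. It is not OpenAI's implementation: `answer_question`, `discover_tables`, and `stub_generate_sql` are hypothetical names, and the stub stands in for an LLM call by returning a deliberately wrong query first and a corrected one once it is shown the database error.

```python
import sqlite3
from typing import Callable, Optional

Schema = dict[str, list[str]]


def discover_tables(conn: sqlite3.Connection) -> Schema:
    """Table discovery: map each table name to its column names."""
    names = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
    # PRAGMA table_info returns (cid, name, type, notnull, dflt_value, pk) per column.
    return {t: [col[1] for col in conn.execute(f"PRAGMA table_info({t})")] for t in names}


def answer_question(
    conn: sqlite3.Connection,
    question: str,
    generate_sql: Callable[[str, Schema, Optional[str]], str],
    max_attempts: int = 3,
) -> dict:
    """Generate SQL, execute it, and feed any database error back for another attempt."""
    schema = discover_tables(conn)
    error: Optional[str] = None
    for attempt in range(1, max_attempts + 1):
        sql = generate_sql(question, schema, error)
        try:
            rows = conn.execute(sql).fetchall()
        except sqlite3.Error as exc:
            error = str(exc)  # self-correction signal for the next attempt
            continue
        return {"sql": sql, "rows": rows, "attempts": attempt}
    raise RuntimeError(f"no valid query after {max_attempts} attempts: {error}")


def stub_generate_sql(question: str, schema: Schema, error: Optional[str]) -> str:
    """Hypothetical stand-in for a model: wrong column first, fixed after seeing the error."""
    if error is None:
        return "SELECT SUM(total) FROM orders"
    return "SELECT SUM(amount) FROM orders"


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL)")
conn.executemany("INSERT INTO orders (amount) VALUES (?)", [(10.0,), (32.0,)])
result = answer_question(conn, "What is the total order amount?", stub_generate_sql)
print(result)
```

In a real system the synthesis step would turn `rows` into a written answer, and the schema passed to the generator would be enriched with lineage, historical queries, and human annotations; those parts are left out here.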

DeepMind’s Take on World Models

This past week it seems more people are becoming aware of foundation world models following DeepMind’s latest release of Genie: these are basically models trained to predict images (aka worlds) from movement inputs, similar to those of a videogame. We have been covering various world models over the last few years, and it is mind-blowing how fast these models are improving. One of the most exciting parts is the suggestion that these models may open a new way to capture more signal from the environment than other approaches (e.g. laws of physics, interactions, etc.). It’s great that some of these models are now being released as services, as we’ll be able to see what actual value is unlocked once they are put to the test in the real world.

MeiTuan Bi-Lingual Image Model

Another interesting large model from Chinese giant MeiTuan, this time covering a bilingual text-to-image architecture with impressive performance: LongCat-Image is an open-source bilingual diffusion foundation model that supports English and Mandarin. It is interesting to see the core model at only around 6B parameters, which is a reasonable size for balancing VRAM, latency, and serving cost. The authors attribute most gains to an industrial-grade data and training pipeline, which includes a 1.2B-sample corpus with heavy filtering/stratification plus explicit suppression of AI-generated contamination. It is also interesting to see that image models are now getting post-training alignment using techniques similar to those used for LLMs, such as multi-objective RL signals. It seems every week we get a new impressive model, but now it is not just about size but also about performance, as users want to be able to access it on commodity hardware.

Upcoming MLOps Events

The MLOps ecosystem continues to grow at breakneck speed, making it ever harder for us as practitioners to stay up to date with relevant developments. A fantastic way to keep on top of relevant resources is through the great community and events that the MLOps and Production ML ecosystem offers. This is the reason why we have started curating a list of upcoming events in the space, which are outlined below.

Events we are speaking at this year:

eTail Europe - March @ Berlin
World Summit AI Europe - September @ Amsterdam

Other relevant events:

KubeCon Europe - March @ Amsterdam
PyData Berlin - April @ Frankfurt
Databricks Summit - June @ San Francisco
World Developer Congress - July @ Berlin
EuroPython 2026 - July @ Prague
EuroSciPy 2026 - July @ Krakow
Code.Talks 2026 - Nov @ Hamburg
MLOps World 2026 - Nov @ Austin

In case you missed our talks, check our recordings below:

The State of AI in 2025 - WeAreDevelopers 2025
Prod Generative AI in 2024 - KubeCon AI Day 2025
The State of AI in 2024 - WeAreDevelopers 2024
Responsible AI Workshop Keynote - NeurIPS 2021
Practical Guide to ML Explainability - PyCon London
ML Monitoring: Outliers, Drift, XAI - PyCon Keynote
Metadata for E2E MLOps - Kubecon NA 2022
ML Performance Evaluation at Scale - KubeCon Eur 2021
Industry Strength LLMs - PyData Global 2022
ML Security Workshop Keynote - NeurIPS 2022

Open Source MLOps Tools

Check out the fast-growing ecosystem of production ML tools & frameworks at the GitHub repository, which has reached over 20,000 ⭐ GitHub stars. We are currently looking for more libraries to add - if you know of any that are not listed, please let us know or feel free to open a PR. Here are a few featured open source libraries that we maintain:

KAOS - K8s Agent Orchestration System for managing the KAOS in large-scale distributed agentic systems.
Kompute - Blazing fast, lightweight and mobile phone-enabled GPU compute framework optimized for advanced data processing usecases.
Production ML Tools - A curated list of tools to deploy, monitor and optimize machine learning systems at scale.
AI Policy List - A mature list that maps the ecosystem of artificial intelligence guidelines, principles, codes of ethics, standards, regulation and beyond.
Agentic Systems Tools - A new list that aims to map the emerging ecosystem of agentic systems with tools and frameworks for scaling this domain.

Please do support some of our open source projects by sharing, contributing or adding a star ⭐

About us

The Institute for Ethical AI & Machine Learning is a European research centre that carries out world-class research into responsible machine learning.

Check out our website

The Machine Learning Engineer
