A Cambrian Era for Science
From agentic citation verification to semi-autonomous frontier research
We made an open source agentic science colleague named Matilde and published novel frontier research in less than 24 hours.
We are entering an era where the engine of science churns with an unprecedented explosion of auditable and open agentic science.
This new epoch shouldn't be led by ivory-tower labs in wealthy countries and walled-garden publishing mafias, but by small home servers, collaborative transparency, local expertise and the spirit of curiosity.
What if the antiquated and elitist form of The Paper was replaced by interactive and customisable displays that anyone can critique, fork, or pick up where you left off? Public goods infrastructure like Wikipedia hands a torch to open science via apps like iNaturalist, where people can share and validate each other's findings. But what if science as a whole could move similarly? What if agents could take on the thankless, unpublishable, but nonetheless essential work of testing null hypotheses and replicating findings that our current system so desperately lacks? What if you could lend a negligible bit of compute from your laptop overnight to a field of inquiry you care about and have expertise in?
We built this world, right now.
Matilde started as a simple citation verification agent to push back against the LLM hallucinations currently undermining academic publishing. It verifies citations across four axes: existence, metadata matching, retraction status, and whether the URL is alive.
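As a rough illustration of what checking those four axes could look like, here is a minimal Python sketch. It is not Matilde's actual code: the `Citation` and `Record` shapes, the `verify_citation` function, and the injected `lookup` and `url_alive` callables are all hypothetical names chosen for this example, and a real agent would back them with a bibliographic index and an HTTP check.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Citation:
    """A citation as it appears in a manuscript (hypothetical shape)."""
    title: str
    year: int
    doi: str
    url: str


@dataclass
class Record:
    """What a bibliographic index returned for a DOI (hypothetical shape)."""
    title: str
    year: int
    retracted: bool


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial differences don't count."""
    return " ".join(text.lower().split())


def verify_citation(
    citation: Citation,
    lookup: Callable[[str], Optional[Record]],
    url_alive: Callable[[str], bool],
) -> dict:
    """Check one citation on the four axes named above.

    `lookup` and `url_alive` are injected so the check can run against
    any index or HTTP client; both are assumptions of this sketch.
    """
    record = lookup(citation.doi)
    exists = record is not None
    return {
        "exists": exists,
        "metadata_matches": exists
        and normalize(record.title) == normalize(citation.title)
        and record.year == citation.year,
        "not_retracted": exists and not record.retracted,
        "url_alive": url_alive(citation.url),
    }
```

With an in-memory dict standing in for the index, `verify_citation(citation, index.get, lambda url: True)` returns one boolean per axis, which makes it easy to report exactly which check a suspicious citation failed.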
From there we moved on to agentic findings validation from open datasets. We were able to semi-autonomously reproduce the famous M100 result found via MEG, a technique that measures magnetic fields from brain activity.
Our hunger for knowledge unsated, we proceeded to conduct novel pharmacological research using established methodologies and publicly available data, then shared our findings via an interactive dashboard one-shotted by the agent itself.
Thousands of people are taking experimental peptides while relying on research that is outdated, if it exists at all. However, they're also publicly logging what they took, how much, and what happened to them. We pointed the agentic science colleague at the peptide subreddits, following a methodology pioneered by Seghal et al. (2026). It read all ~3,000 posts. A Signal group chat invoked a local open model on a Mac Mini living in a closet to tag every post by what the peptide actually did and compile the results.
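The tag-and-compile step can be pictured roughly as follows. This is a hedged sketch, not the pipeline we ran: `compile_reports` and `keyword_tagger` are hypothetical names, and the keyword matcher is only a toy stand-in for the local model, which would replace it as the `tag_post` callable.

```python
from collections import Counter, defaultdict
from typing import Callable, Iterable


def compile_reports(
    posts: Iterable[str],
    tag_post: Callable[[str], list[tuple[str, str]]],
) -> dict[str, Counter]:
    """Tally (peptide, effect) tags across posts.

    `tag_post` is whatever does the tagging (a local model in the text
    above); each post counts at most once per (peptide, effect) pair.
    """
    tallies: dict[str, Counter] = defaultdict(Counter)
    for post in posts:
        for peptide, effect in set(tag_post(post)):
            tallies[peptide][effect] += 1
    return dict(tallies)


def keyword_tagger(post: str) -> list[tuple[str, str]]:
    """Toy stand-in for the model: naive keyword matching, illustration only."""
    text = post.lower()
    peptides = ["bpc-157", "selank", "semax"]
    effects = {"helped": "improvement", "nothing": "no effect", "worse": "adverse"}
    return [
        (peptide, label)
        for peptide in peptides
        if peptide in text
        for keyword, label in effects.items()
        if keyword in text
    ]
```

Keeping the tagger as a plain callable means the tallying logic can be tested without a model, and the same compile step works whichever model does the tagging.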
The agent then checked the claims against published research. BPC-157 and tendons, Selank and Semax and the Russian trials nobody cites. Do the self-reports match the literature? Yes and no! Both interesting.
When OpenAI staggered the mathematics community with a novel counterexample to a notoriously difficult Erdős problem, they also refused to publish the entire chain of thought. We believe in science, not enclosure, so we shared everything: the frank limitations of our study, the dataset, the encodings, the conversation, the tool calls, and even the reasoning itself.
To make this process even easier, we've made the Matilde base package a point-and-click option right in Hermes Swarm Map using the use-case templates from our last blog post.
Our findings and methods here are not meant to be conclusive or exhaustive. They're meant to be a baton and everything the next runner needs to succeed where we fell short.
Follow your curiosity. Show your work.