In the last three to four months, AI models have made an immense jump in exploitation capabilities. Several talks and blog posts highlight the "new" capabilities of frontier AI models: the agents have learned, from countless CTF writeups, research papers on exploitation techniques, and conference talks and demonstrations, how to automate diverse techniques.
In an agentic workflow, these models are incredibly skilled at advanced exploitation. The key finding is that they render tedious mitigations useless: the agent can incrementally improve the exploit on its own, thereby weaponizing proof-of-concept crashes into full exploit chains.
"Early" LLM systems (i.e., in 2025) were used as static analysis tools: essentially, telling the chat bot to find bugs in a given file and to write a vulnerability report. This sometimes worked very well but, most of the time, produced AI slop that was not instantiable. Bug bounty programs were flooded with this unvalidated slop, and many open-source developers spoke out against this type of contribution.
But with the rise of agentic workflows, LLMs can leverage feedback to improve. A simple workflow, similar to the one presented by Nicholas in his unprompted talk, breaks down exploit synthesis into several steps.
Seeding reports: The first step is seeding potential bug candidates: the first agent goes through the code base, one module at a time, one file at a time, looking for potential security bugs. One may have to tell it that this is for defensive purposes to sidestep some ethics mitigations, but the LLMs generally comply well with this task. For the average project, this results in hundreds of mostly incomplete vulnerability reports. Instead of submitting them to bug bounty programs and DoSing developers, the next agent improves the reports. This step is most like static analysis.
Reaching locations: The second agent instantiates the PoCs. Based on the vulnerability reports, it tries to infer a path that reaches the bug location. This yields a few vulnerability reports with reachable bug locations and filters out a set of false positives. This under-approximation already leverages a concrete execution environment in which the agent can validate its findings. This step is comparable to a poor man's symbolic execution engine.
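The validation half of this stage does not need an LLM at all: the agent only needs an oracle that tells it whether a candidate input actually reaches the reported location. A minimal sketch, using Python's tracing hook as a stand-in for whatever coverage instrumentation the real target uses, and bucketing by function name rather than exact line for simplicity:

```python
import sys

def reached_functions(target, data):
    """Run target(data) under a tracer and record every function entered."""
    hits = set()
    def tracer(frame, event, arg):
        if event == "call":
            hits.add(frame.f_code.co_name)
        return tracer
    sys.settrace(tracer)
    try:
        target(data)
    except Exception:
        pass  # crashing here is fine; we only care about reachability
    finally:
        sys.settrace(None)
    return hits

def filter_reachable(target, candidates, bug_function):
    """Keep only inputs whose execution actually reaches the reported function."""
    return [c for c in candidates if bug_function in reached_functions(target, c)]
```

For a native target one would swap the tracer for sanitizer or coverage instrumentation, but the feedback loop is the same: concrete execution either confirms the path or discards the report.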
Triggering the bug: The third agent tries to trigger the bug, starting from the probable bugs of the previous steps. The advantage is that the earlier agent has already validated that the location is reachable, so this agent can now concentrate on mutating the seed to trigger the bug. This step results in a few validated crashes and is akin to a fuzzer.
Exploitation: The fourth step takes the initial PoC crashes, analyzes them, and bypasses any deployed mitigations. Some of the frontier models try to hold back in this step, and one may have to convince the agent that it is playing a CTF.
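The crash-analysis half of this step is mostly classic triage: deduplicate the crashes so the agent works on one PoC per bug rather than hundreds of duplicates. A minimal sketch, assuming Python exceptions as the crash signal (a native pipeline would bucket on sanitizer or backtrace output instead):

```python
import hashlib
import traceback

def crash_signature(target, data):
    """Bucket a crashing input by the innermost frames of its traceback."""
    try:
        target(data)
        return None  # input no longer crashes
    except Exception as exc:
        frames = traceback.extract_tb(exc.__traceback__)[-3:]
        key = "|".join(f"{f.name}:{f.lineno}" for f in frames)
        return hashlib.sha1(key.encode()).hexdigest()[:12]

def triage(target, crashers):
    """Deduplicate crashing inputs so the exploitation agent sees one bucket per bug."""
    buckets = {}
    for data in crashers:
        sig = crash_signature(target, data)
        if sig is not None:
            buckets.setdefault(sig, []).append(data)
    return buckets
```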
The key advantage of LLMs and AI in this environment is automation. This simple four-stage pipeline combines code review (static analysis), path synthesis (symbolic execution), seed mutation (fuzzing), and exploitation (bypassing mitigations). All of these steps previously required significant manual analysis and are now promptable.
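At the top level, the four stages reduce to a short chain where each agent consumes the previous stage's survivors. The stage names and the `agents` dictionary below are illustrative, not a fixed interface:

```python
def run_pipeline(source_root, target, agents):
    """Chain the four agents; each stage consumes the previous stage's survivors."""
    reports   = agents["seed"](source_root)           # code review / static analysis
    reachable = agents["reach"](target, reports)      # path synthesis
    crashes   = agents["trigger"](target, reachable)  # seed mutation / fuzzing
    return agents["exploit"](target, crashes)         # exploitation / mitigation bypass
```

Because each stage only narrows the previous one's output, every stage can be retried or swapped independently, and a failure in one report never blocks progress on the others.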
In their current state, LLM agents favor attackers. On one hand, this will destroy the market for exploits as they become a cheap commodity. On the other hand, developers will be flooded with validated bug reports. Initially, this will be tough, but hopefully, as projects embrace LLM-based bug search, we will see an improvement in code quality. The area I'm most excited about is how to further improve defensive capabilities and, most importantly, how to automate patching of the discovered bugs. Let me know if you have ideas!