
Conversation

@dvyukov (Collaborator) commented Oct 27, 2025:

Add infrastructure for defining LLM agent workflows. The workflows can be executed using the tools/syz-agent tool from the command line, or programmatically.

@dvyukov (Collaborator Author) commented Oct 27, 2025:

@sirdarckcat fyi

@sirdarckcat (Member) commented:

Very cool to split this off to pkg/agent. I was wondering what was the convention for stuff like this!


```go
adk "google.golang.org/adk/agent"
"google.golang.org/adk/agent/llmagent"
"google.golang.org/adk/agent/workflowagents/sequentialagent"
```
@sirdarckcat (Member) commented:

Sequential looks cool! But I think we may need something more complicated. https://cloud.google.com/blog/products/ai-machine-learning/build-multi-agentic-systems-using-google-adk describes a more complex flow.

I was thinking it would be nice to have one agent (let's call him the Hausarzt) call specialists (oncologists, neurologists, etc.) and order lab tests (x-ray, CT scan, MRI, biopsies, surgeries, etc.), and have the coordinator interpret just the final results of these specialists.

I feel that a sequential agent like the one I built before is prone to errors that propagate and don't get corrected, while a main coordinator that is allowed to call out to other agents for help during its thinking is more likely to reach an informed opinion.

@dvyukov (Collaborator Author) commented:

I think we can do most of the things described in the doc.

Option 1: the plan is to add an AgentTool that other agents can invoke as a normal tool.

Option 2: the sequential agent as I implemented it is slightly different from the ADK sequential agent: it only defines execution order, not dataflow relations. So we can have a sequential agent consisting of sub-agents {A, B, C, D} where A/B/C each produce some results, but B does not use A's results, and C does not use A's or B's results; at the end, only D (the Hausarzt) uses A's, B's and C's results. From a dataflow perspective, A/B/C are effectively parallel.
We could easily add a real ParallelAgent, but I am not yet sure it's a good idea (harder to assess/debug, and harder to share a common kernel checkout).
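A minimal sketch of that execution-order-only sequencing, with made-up types (not the actual pkg/agent or ADK API): each sub-agent writes its result under an output key into shared state, and only the last one reads the others' results.

```go
package main

import "fmt"

// Agent is a made-up stand-in for a workflow sub-agent: it writes its
// result under OutputKey into a shared state map.
type Agent struct {
	OutputKey string
	Run       func(state map[string]string) string
}

// Sequential fixes execution order only; it says nothing about which
// sub-agent reads which earlier outputs.
func Sequential(state map[string]string, agents ...Agent) {
	for _, a := range agents {
		state[a.OutputKey] = a.Run(state)
	}
}

// Diagnose wires up {A, B, C, D}: A/B/C are dataflow-independent, and
// only D (the Hausarzt) consumes their results.
func Diagnose() string {
	state := map[string]string{}
	a := Agent{"subsystem", func(map[string]string) string { return "netfilter" }}
	b := Agent{"crash_type", func(map[string]string) string { return "null-ptr-deref" }}
	c := Agent{"history", func(map[string]string) string { return "3 similar fixes" }}
	d := Agent{"diagnosis", func(s map[string]string) string {
		return s["subsystem"] + "/" + s["crash_type"] + " (" + s["history"] + ")"
	}}
	Sequential(state, a, b, c, d)
	return state["diagnosis"]
}

func main() {
	fmt.Println(Diagnose()) // netfilter/null-ptr-deref (3 similar fixes)
}
```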

@dvyukov (Collaborator Author) commented:

Does that work for the workflows you have in mind?

@sirdarckcat (Member) commented Oct 28, 2025:

Here's what I assume would yield better results:

  1. We give the Hausarzt a prompt to extract some data from the crash (say, subsystems involved, crash type, etc.), make a diagnosis, and a mechanism to invoke other agents. For example (just making this up now), if the bug looks like a null-ptr deref, it needs to call two specialists: one that specializes in that subsystem and one that specializes in null-ptr derefs.
    1. The subsystem specialist has a prompt that asks it to understand the subsystem; it can query documentation, search for lkml emails and old commits/bugs in that subsystem and syzbot history, and should answer how similar bugs/crashes have been fixed before.
    2. The null-ptr-deref specialist has a prompt asking it to be an expert in NPDs and has a playbook and several good examples of what NPDs look like. It can read code and also has special tools for finding potential NPDs, and should answer some additional details about the NPD possibility, providing a set of hypotheses and tests to either diagnose a specific NPD subtype or to eliminate NPD as a whole.
  2. With the information from the subsystem specialist and the NPD specialist, the Hausarzt, still within the initial prompt (step 2 was a function call within the initial prompt), decides to try a few tests proposed by the NPD specialist.
    1. One test requires a ftrace on the crash
    2. One test requires a breakpoint
    3. One test requires a breakpoint + some logic
  3. Based on the results of the experiments, the Hausarzt decides to either repeat the experiment (if it's not very reliable), make a diagnosis, or give the results back to the specialist.
    ...
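The coordinator-plus-specialists shape above could be sketched like this, with invented types standing in for the AgentTool idea (specialist agents exposed to the coordinator as ordinary tools it can call mid-reasoning):

```go
package main

import "fmt"

// Tool is a made-up stand-in for an AgentTool: a specialist agent the
// coordinator can invoke like a normal tool.
type Tool struct {
	Call func(crash string) string
}

// Coordinate triages the crash, consults the selected specialists, and
// interprets only their final reports (the Hausarzt role). The triage
// list is hardcoded here; in reality the LLM would decide who to call.
func Coordinate(crash string, specialists map[string]Tool) []string {
	var reports []string
	for _, name := range []string{"subsystem", "npd"} {
		if t, ok := specialists[name]; ok {
			reports = append(reports, t.Call(crash))
		}
	}
	return reports
}

func main() {
	specialists := map[string]Tool{
		"subsystem": {func(string) string { return "similar bugs were fixed by adding locking" }},
		"npd":       {func(string) string { return "hypothesis: missing null check; test with a breakpoint" }},
	}
	reports := Coordinate("null-ptr-deref in netfilter", specialists)
	fmt.Println(len(reports), "specialist reports") // 2 specialist reports
}
```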

Right now, the agents each suggest different ideas about what could happen, and when they notice they are wrong, I kinda wish they restarted with the new hypothesis. So the idea I had was to have a Hausarzt whose only job is to evaluate reports from specialists, i.e. a "coordinator" (as the docs call it) which just calls other agents. And some of the results may benefit from being produced in parallel (like multiple tests/experiments), which are slow anyway (they may require triggering the bug).

The alternative was to add backtracking but that doesn't seem easy either.
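Running the slow experiments concurrently needs no extra agent machinery in Go; a sketch (the experiment names are illustrative only):

```go
package main

import (
	"fmt"
	"sync"
)

// RunExperiments runs each experiment in its own goroutine and collects
// the results under the experiment's name.
func RunExperiments(experiments map[string]func() string) map[string]string {
	var mu sync.Mutex
	var wg sync.WaitGroup
	results := make(map[string]string)
	for name, run := range experiments {
		wg.Add(1)
		go func(name string, run func() string) {
			defer wg.Done()
			r := run() // e.g. trigger the bug under ftrace or a breakpoint
			mu.Lock()
			results[name] = r
			mu.Unlock()
		}(name, run)
	}
	wg.Wait()
	return results
}

func main() {
	results := RunExperiments(map[string]func() string{
		"ftrace":     func() string { return "trace captured" },
		"breakpoint": func() string { return "hit at line 42" },
	})
	fmt.Println(len(results), "experiments done") // 2 experiments done
}
```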

```go
OutputKey:   "explanation",
Instruction: debuggingInstruction,
Prompt:      debuggingPrompt,
Tools:       []agent.Tool{codesearch.Tool},
```
@sirdarckcat (Member) commented:

Something I was thinking is that maybe (same as what you did here) the agents could be configured as JSON. Then we could configure the prompts and tools available to each "specialist" in a JSON file.

The reason a JSON file is slightly better than a struct in the code is that I could then run the binary anywhere the tools are available and iterate there.
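A sketch of what such a JSON-configured specialist could look like; the field names here are invented for illustration and are not the PR's actual schema:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// SpecialistConfig is a hypothetical JSON shape for one specialist:
// its prompt plus the names of the tools it is allowed to use.
type SpecialistConfig struct {
	Name   string   `json:"name"`
	Prompt string   `json:"prompt"`
	Tools  []string `json:"tools"`
}

// LoadConfig parses a list of specialist definitions.
func LoadConfig(data []byte) ([]SpecialistConfig, error) {
	var cfg []SpecialistConfig
	err := json.Unmarshal(data, &cfg)
	return cfg, err
}

func main() {
	data := []byte(`[
	  {"name": "npd", "prompt": "You are an expert in null-ptr derefs...", "tools": ["codesearch", "ftrace"]}
	]`)
	cfg, err := LoadConfig(data)
	if err != nil {
		panic(err)
	}
	fmt.Println(cfg[0].Name, len(cfg[0].Tools)) // npd 2
}
```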

@dvyukov (Collaborator Author) commented:

Rebuilding a Go program takes negligible time, so what does a JSON file give you? If you have a go run command in your command line, you can edit the definition here and re-run the same command: the same workflow you would have with a JSON file.
A JSON file has several downsides:

  • prompts will look extremely terrible since you would need to escape newlines and everything else
  • won't provide typed structs for programmatic use of the same pipeline
  • if you need to edit tools, you still need to edit code
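The escaping point can be seen directly: a multi-line prompt that reads naturally as a Go raw string literal turns into a single line full of \n escapes once JSON-encoded.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// A multi-line prompt as it would appear in Go source.
const prompt = `You are a kernel debugging assistant.
Analyze the crash report below.
Answer with a diagnosis.`

// JSONPrompt returns the same prompt as it would have to appear in a
// JSON file: one line, with every newline escaped.
func JSONPrompt() string {
	escaped, err := json.Marshal(prompt)
	if err != nil {
		panic(err)
	}
	return string(escaped)
}

func main() {
	fmt.Println(JSONPrompt())
}
```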

@sirdarckcat (Member) commented:

Mainly because I run the agent on a different machine from the one I develop on, and the one I'm running this on doesn't have a good dev environment.

But you are right about the formatting; I could suggest YAML haha 😂. Anyway, that's why I was using flags in my experiment: the flags were easy to edit on every run.

Mainly, the point is that trying different prompts and experimenting with different tools available to different agents helps to see how deep or confused the agents get.

```go
func main() {
	var (
		flagFlow  = flag.String("workflow", "", "workflow to execute")
		flagInput = flag.String("input", "", "input json file with workflow arguments")
```
@sirdarckcat (Member) commented:

I think one thing we need (which syzbot doesn't provide programmatically) is the kernel image and disk image to repro bugs.

What's on the JSON file anyway?

@dvyukov (Collaborator Author) commented:

I think, architecturally, the pipeline should build the artifacts it needs rather than rely on somebody else to magically have everything it may possibly need. Consider: before we have determined the subsystem of the bug, we don't even know the right kernel tree to develop a fix in, so it's not theoretically possible to provide a build for that. Also, the pipeline still needs to build and test the generated patches, so if it can do those builds, why should the initial build be done by somebody else?

@dvyukov (Collaborator Author) commented:

> What's on the JSON file anyway?

It's the inputs object for the specified pipeline. For the patching pipeline it's serialized patching.Inputs struct.
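For illustration only (the real patching.Inputs fields are defined in the PR; these names are made up), the -input file is just a JSON-serialized struct:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Inputs is a hypothetical stand-in for patching.Inputs; the actual
// field set lives in the PR.
type Inputs struct {
	CrashReport string `json:"crash_report"`
	ReproSyz    string `json:"repro_syz"`
}

// Parse decodes the contents of the file passed via -input.
func Parse(data []byte) (Inputs, error) {
	var in Inputs
	err := json.Unmarshal(data, &in)
	return in, err
}

func main() {
	data := []byte(`{"crash_report": "BUG: KASAN: ...", "repro_syz": "r0 = openat$ptmx(...)"}`)
	in, err := Parse(data)
	if err != nil {
		panic(err)
	}
	fmt.Println(in.CrashReport)
}
```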

@sirdarckcat (Member) commented:

Yeah, we can provide it, but it'll take a while to generate. It would be good to provide it as an option, at least when available: at least for syzbot, a lot of crashes come from the same commit, so we don't need to rebuild every time.

@sirdarckcat (Member) commented:

I'm mostly thinking about the evaluation experience. If every test is going to take a few hours to run, and the first step is going to be waiting for a build, then it will be impossible not to context switch and work on something else.

```go
var (
	flagFlow       = flag.String("workflow", "", "workflow to execute")
	flagInput      = flag.String("input", "", "input json file with workflow arguments")
	flagLargeModel = flag.Bool("large-model", false, "use large/expensive model")
```
@sirdarckcat (Member) commented:

I would be surprised if we have a use case for the smaller model; the large/expensive one already seems not good enough. Maybe let's default to the expensive one and downgrade once we know it works well?

@dvyukov (Collaborator Author) commented:

I don't mind changing the default.
