
feat(plugins): add on_agent_error_callback and on_run_error_callback lifecycle hooks #4974

Open
STHITAPRAJNAS wants to merge 3 commits into google:main from STHITAPRAJNAS:feat/lifecycle-error-callbacks-4774

Conversation

@STHITAPRAJNAS

Summary

Fixes #4774

Previously, when an unhandled exception propagated out of an agent's _run_async_impl / _run_live_impl, or out of the runner's execution loop, the existing after_agent_callback / after_run_callback were silently skipped. This made fatal failures invisible to observability plugins (e.g. BigQuery analytics), inflating success rates and losing failure events entirely.
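
The failure mode described above can be reproduced without ADK. The following is a simplified stand-in, not the actual base_agent.py code: `failing_impl`, `run_without_error_hook`, and the `log` list are hypothetical names used only for illustration.

```python
import asyncio
from typing import AsyncGenerator, List


async def failing_impl() -> AsyncGenerator[str, None]:
    """Stand-in for an agent's _run_async_impl that fails mid-stream."""
    yield "partial event"
    raise RuntimeError("model backend unavailable")


async def run_without_error_hook(log: List[str]) -> AsyncGenerator[str, None]:
    """Hypothetical wrapper with before/after hooks but no error hook."""
    log.append("before_agent_callback")
    async for event in failing_impl():
        yield event
    # Never reached when failing_impl raises, so an observer that only
    # listens for the "after" hook gets no record of how this run ended.
    log.append("after_agent_callback")


async def demo() -> List[str]:
    log: List[str] = []
    try:
        async for _ in run_without_error_hook(log):
            pass
    except RuntimeError:
        log.append("caller saw RuntimeError")
    return log


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

The log contains the "before" entry and the caller's exception, but no "after" entry, which is the gap the new error callbacks are meant to close.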

Changes

  • BasePlugin: add on_agent_error_callback(agent, callback_context, error) and on_run_error_callback(invocation_context, error) with safe no-op defaults.
  • PluginManager: add run_on_agent_error_callback / run_on_run_error_callback dispatch methods backed by a new _run_error_callbacks helper that fans out to all plugins (no early-exit) and logs — but does not propagate — individual plugin failures.
  • base_agent.py: wrap run_async / run_live generator loops in try/except; call run_on_agent_error_callback before re-raising.
  • runners.py: wrap the execute_fn generator loop in try/except; call run_on_run_error_callback before re-raising. after_run_callback is intentionally skipped on the error path so plugins can distinguish clean completions from fatal failures.
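
The fan-out and re-raise behaviour listed above could look roughly like the sketch below. It is self-contained and does not import ADK; `run_error_callbacks`, `with_error_hook`, and the `ErrorCallback` type are hypothetical stand-ins for the PR's `_run_error_callbacks` helper and the wrapped generator loops, and catching `Exception` (rather than `BaseException`) is an assumption.

```python
import asyncio
import logging
from typing import Any, AsyncGenerator, Awaitable, Callable, List

logger = logging.getLogger(__name__)

# Hypothetical callback shape: an async function that receives the error.
ErrorCallback = Callable[[BaseException], Awaitable[None]]


async def run_error_callbacks(
    callbacks: List[ErrorCallback], error: BaseException
) -> None:
    """Notify every callback; log, but never propagate, callback failures."""
    for callback in callbacks:
        try:
            await callback(error)
        except Exception:
            # A broken observer must not stop the remaining observers.
            logger.exception("Error callback %r failed; continuing", callback)


async def with_error_hook(
    impl: AsyncGenerator[Any, None], callbacks: List[ErrorCallback]
) -> AsyncGenerator[Any, None]:
    """Re-yield events from impl; on failure, notify observers and re-raise."""
    try:
        async for event in impl:
            yield event
    except Exception as error:
        await run_error_callbacks(callbacks, error)
        raise  # the original exception is never suppressed


async def demo() -> List[str]:
    seen: List[str] = []

    async def broken_plugin(error: BaseException) -> None:
        raise ValueError("plugin bug")

    async def recording_plugin(error: BaseException) -> None:
        seen.append(f"recorded {type(error).__name__}")

    async def failing_impl() -> AsyncGenerator[str, None]:
        yield "partial event"
        raise RuntimeError("boom")

    try:
        async for _ in with_error_hook(
            failing_impl(), [broken_plugin, recording_plugin]
        ):
            pass
    except RuntimeError:
        seen.append("original error re-raised")
    return seen


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

In `demo`, the first callback raises, the second still records the failure, and the caller still receives the original `RuntimeError`.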

Design decisions

| Decision | Rationale |
| --- | --- |
| Error callbacks are fire-and-forget (no early-exit, no return value) | They are pure observers; a broken plugin must not prevent others from recording the failure |
| after_run_callback is not called on error | Allows plugins to distinguish success from failure; analogous to how on_model_error_callback relates to after_model_callback |
| Individual plugin failures are logged, not re-raised | A broken observability plugin must never hide the original error |
| The original exception is always re-raised | The framework does not suppress errors; this is notification only |
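
To illustrate why skipping after_run_callback on the error path is useful, here is a minimal sketch of an observer that counts outcomes. `RunOutcomePlugin` and `run_with_hooks` are hypothetical; a real plugin would subclass ADK's BasePlugin, and the keyword-only callback signatures shown here are an assumption based on the parameter names in this PR.

```python
import asyncio
from typing import Any, AsyncGenerator, Optional


class RunOutcomePlugin:
    """Hypothetical observer; a real plugin would subclass BasePlugin."""

    def __init__(self) -> None:
        self.succeeded = 0
        self.failed = 0
        self.last_error: Optional[BaseException] = None

    async def after_run_callback(self, *, invocation_context: Any) -> None:
        # Only reached on clean completion, so this counts successes only.
        self.succeeded += 1

    async def on_run_error_callback(
        self, *, invocation_context: Any, error: Exception
    ) -> None:
        self.failed += 1
        self.last_error = error


async def run_with_hooks(
    plugin: RunOutcomePlugin,
    events: AsyncGenerator[Any, None],
    invocation_context: Any = None,
) -> AsyncGenerator[Any, None]:
    """Simplified stand-in for the runner loop; not the real runners.py."""
    try:
        async for event in events:
            yield event
    except Exception as error:
        await plugin.on_run_error_callback(
            invocation_context=invocation_context, error=error
        )
        raise
    # Skipped on the error path because the except block re-raises.
    await plugin.after_run_callback(invocation_context=invocation_context)


async def ok_run() -> AsyncGenerator[str, None]:
    yield "event"


async def bad_run() -> AsyncGenerator[str, None]:
    yield "event"
    raise RuntimeError("tool crashed")


async def demo() -> RunOutcomePlugin:
    plugin = RunOutcomePlugin()
    async for _ in run_with_hooks(plugin, ok_run()):
        pass
    try:
        async for _ in run_with_hooks(plugin, bad_run()):
            pass
    except RuntimeError:
        pass
    return plugin


if __name__ == "__main__":
    p = asyncio.run(demo())
    print(p.succeeded, p.failed, repr(p.last_error))
```

Because the two callbacks are mutually exclusive per run, the plugin needs no extra bookkeeping to keep success and failure counts apart.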

Testing plan

  • tests/unittests/plugins/test_lifecycle_error_callbacks.py — 12 tests covering BasePlugin defaults, PluginManager fan-out semantics, argument forwarding, and no-early-exit behaviour
  • tests/unittests/runners/test_runner_error_callbacks.py — 7 integration tests covering runner error/success paths with real InMemoryRunner
  • tests/unittests/agents/test_agent_error_callbacks.py — 11 tests covering run_async and run_live error paths
  • All 688 existing plugin + agent + runner tests pass unchanged

…lifecycle hooks

Fixes google#4774

When an unhandled exception propagates out of an agent's _run_async_impl /
_run_live_impl, or out of the runner's execution loop, the existing
after_agent_callback / after_run_callback were silently skipped.  This made
fatal failures invisible to observability plugins (e.g. BigQuery analytics),
inflating success rates and losing failure events entirely.

Changes:
- BasePlugin: add on_agent_error_callback(agent, callback_context, error)
  and on_run_error_callback(invocation_context, error) with safe no-op defaults.
- PluginManager: add run_on_agent_error_callback / run_on_run_error_callback
  dispatch methods backed by a new _run_error_callbacks helper that fans out
  to ALL plugins (no early-exit) and logs — but does not propagate —
  individual plugin failures.
- base_agent.py: wrap run_async / run_live generator loops in try/except;
  call run_on_agent_error_callback before re-raising.
- runners.py: wrap the execute_fn generator loop in try/except;
  call run_on_run_error_callback before re-raising.
  after_run_callback is intentionally skipped on the error path so plugins
  can distinguish clean completions from fatal failures.

Tests (30 new, all passing):
- tests/unittests/plugins/test_lifecycle_error_callbacks.py
- tests/unittests/runners/test_runner_error_callbacks.py
- tests/unittests/agents/test_agent_error_callbacks.py
adk-bot added the core [Component] label (This issue is related to the core interface and implementation) Mar 24, 2026
rohityan self-assigned this Mar 24, 2026
@rohityan
Collaborator

Hi @STHITAPRAJNAS, thank you for your contribution! We appreciate you taking the time to submit this pull request.
Can you please fix the failing formatting tests? You can use autoformat.sh.

rohityan requested a review from Jacksunwei March 24, 2026 20:01
@rohityan
Collaborator

Hi @Jacksunwei, can you please review this?

@STHITAPRAJNAS
Author

Formatting done

Labels

core [Component] This issue is related to the core interface and implementation

Development

Successfully merging this pull request may close these issues.

Add Lifecycle Error Callbacks (on_agent_error, on_run_error) to ADK Framework

3 participants