Apply DIFC integrity filtering to pre-agentic gh CLI and actions/github-script steps #22792

@lpcox

Summary

The MCP Gateway proxy (awmg proxy) now provides integrity filtering for GitHub API requests made by gh CLI (via GH_HOST) and actions/github-script (via base-url). This has been validated with 47 smoke tests covering REST, GraphQL, and search operations.

However, all current gh-aw compiled workflows enable the proxy only during the agent execution phase. Pre-agentic steps in the activation job, which fetch issue/PR data, add reactions, post comments, and gather metadata, make direct, unfiltered API calls to api.github.com.

This issue tracks the work needed to extend integrity filtering to cover these pre-agentic steps.

Current State

What Works Today (tested in smoke-proxy-github-script.md)

The proxy supports filtering for:

  • gh CLI: Set GH_HOST=localhost:<port> → proxy strips /api/v3 prefix, applies DIFC pipeline
  • actions/github-script: Set base-url: "http://localhost:<port>" → octokit routes through proxy
  • GraphQL: Query classification, field injection for integrity labels, viewer/search scope enforcement
  • 47 smoke tests covering: list/read issues, PRs, commits, branches, tags, releases, labels, search (code/issues/repos), actions, file contents, and out-of-scope blocking
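
As an illustration of the gh CLI bullet above, the prefix handling can be sketched as a small path rewrite. This is a hypothetical Python model, not the proxy's actual code: the function name and the exact matching rules are assumptions, and the only behavior taken from this issue is that a leading /api/v3 is removed before the DIFC pipeline sees the request.

```python
API_V3_PREFIX = "/api/v3"  # prefix named in this issue for gh CLI requests


def strip_api_v3_prefix(path: str) -> str:
    """Return `path` with a leading /api/v3 segment removed, if present.

    Hypothetical helper: only a whole leading segment is stripped, so a
    path such as /api/v3x/foo is left untouched.
    """
    if path == API_V3_PREFIX:
        return "/"
    if path.startswith(API_V3_PREFIX + "/"):
        return path[len(API_V3_PREFIX):]
    return path
```

With this rewrite, a gh CLI request and an octokit request for the same resource would reach route matching with the same path.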

What's Not Covered

Pre-agentic steps in all 24+ production workflows make direct API calls without proxy protection:

activation job (PRE-AGENTIC — NO PROXY)
  ├─ generate_aw_info.cjs        → direct API (workflow metadata)
  ├─ add_reaction.cjs            → direct API (emoji reactions)
  ├─ check_workflow_timestamp.cjs → direct API (timestamp checks)
  ├─ compute_text.cjs            → direct API (issue/PR body fetch)
  ├─ add_workflow_run_comment.cjs → direct API (post comments)
  ├─ interpolate_prompt.cjs      → direct API (template rendering)
  └─ (various other github-script steps)
       ↓
agent job (AGENTIC — PROXY ENABLED via --enable-api-proxy)
  └─ Copilot CLI execution with full DIFC filtering

Proposed Changes

1. Compiler: Start proxy in activation job

The gh aw compile step should inject a proxy startup step early in the activation job, before any actions/github-script or gh CLI calls:

# Injected by compiler into activation job
- name: Start DIFC proxy
  run: |
    docker run -d --name awmg-proxy --network host \
      ghcr.io/github/gh-aw-mcpg:$VERSION proxy \
        --policy "$GUARD_POLICY" \
        --listen 0.0.0.0:18443 \
        --guards-mode filter

2. Compiler: Route actions/github-script through proxy

All actions/github-script steps in the activation job should get base-url injected:

- uses: actions/github-script@v8
  with:
    base-url: "http://localhost:18443"  # ← injected by compiler
    script: |
      // existing script unchanged
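
The injection rule for this section might look like the following. It is a minimal sketch in Python that models a workflow step as a plain dict; the function name, the dict representation, and the choice to respect an existing base-url are assumptions for illustration, not the gh aw compiler's implementation.

```python
import copy

# Port taken from the examples in this issue; treat it as a placeholder.
PROXY_BASE_URL = "http://localhost:18443"


def inject_base_url(step: dict, base_url: str = PROXY_BASE_URL) -> dict:
    """Return a copy of `step` with `with.base-url` set for github-script steps.

    Steps that do not use actions/github-script are returned unchanged.
    A base-url already present on the step is kept (an assumption; the
    compiler could equally decide to overwrite it).
    """
    uses = step.get("uses", "")
    if not uses.startswith("actions/github-script@"):
        return step
    patched = copy.deepcopy(step)
    patched.setdefault("with", {}).setdefault("base-url", base_url)
    return patched
```

Returning a copy rather than mutating the input keeps the original step available for comparison or for emitting an unproxied variant.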

3. Compiler: Route gh CLI through proxy

Any gh CLI steps should get GH_HOST environment variable:

- name: Get release info
  env:
    GH_HOST: "localhost:18443"  # ← injected by compiler
  run: gh release view ...
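
One way the compiler could decide which run steps need GH_HOST is a simple text heuristic. The Python sketch below is hypothetical: the regex, the function name, and the dict model of a step are assumptions, and a real implementation would probably need stricter shell parsing.

```python
import re

# Host and port taken from the examples in this issue; treat as placeholders.
PROXY_HOST = "localhost:18443"

# Heuristic: "gh" at the start of a line or after whitespace or a shell
# separator, followed by whitespace. Words such as "high" do not match.
GH_COMMAND = re.compile(r"(?:^|[\s;&|(])gh\s", re.MULTILINE)


def inject_gh_host(step: dict, host: str = PROXY_HOST) -> dict:
    """Return a copy of `step` with GH_HOST set if its script invokes gh.

    An existing GH_HOST in the step's env is preserved.
    """
    run = step.get("run")
    if not run or not GH_COMMAND.search(run):
        return step
    patched = dict(step)
    patched["env"] = {"GH_HOST": host, **step.get("env", {})}
    return patched
```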

4. Pre-agentic script compatibility

The existing .cjs scripts (generate_aw_info.cjs, add_reaction.cjs, etc.) use the octokit instance provided by actions/github-script. Setting base-url on the action should be sufficient — no script changes needed.

However, any scripts that construct their own HTTP clients or hardcode api.github.com would need updating.
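
A quick audit for such scripts could be a text scan over the .cjs sources. The following Python sketch is illustrative only: the function name and the pattern are assumptions, and it would miss hosts assembled from string fragments at runtime.

```python
import re

# Matches an explicit http(s) URL pointing at api.github.com.
HARDCODED_API = re.compile(r"https?://api\.github\.com")


def find_hardcoded_api_hosts(source: str) -> list[int]:
    """Return the 1-based line numbers in `source` that hardcode api.github.com."""
    return [
        number
        for number, line in enumerate(source.splitlines(), start=1)
        if HARDCODED_API.search(line)
    ]
```

Running this over each script's text and reporting any non-empty result would flag the files that bypass the octokit instance configured through base-url.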

5. Write operation handling

Currently the proxy applies DIFC filtering only to read operations: GET requests and POST requests to /graphql. Write operations (POST to other endpoints, PUT, PATCH, DELETE) pass through unfiltered. This is acceptable for the initial rollout since:

  • Pre-agentic writes are mostly reactions and comments (low risk)
  • The guard doesn't have write-ahead labeling capability
  • Write filtering can be added later as a separate track
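
The read/write split described in this section can be written down as a small predicate. This Python sketch only models the rule as stated in this issue; the function name is hypothetical, and matching any path that ends in /graphql is an assumption about how the GraphQL endpoint is recognized.

```python
def is_difc_filtered(method: str, path: str) -> bool:
    """Return True if a request would go through DIFC filtering.

    Models the rule in this issue: GET requests and POSTs to the GraphQL
    endpoint are filtered; all other methods pass through unfiltered.
    """
    method = method.upper()
    if method == "GET":
        return True
    if method == "POST" and path.rstrip("/").endswith("/graphql"):
        return True
    return False
```

For example, posting a comment (POST to a REST path) returns False, while a GraphQL query (POST to /graphql) returns True.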

Design Considerations

Activation job proxy lifecycle

  • Proxy must start before any API-calling step and stay running through the job
  • The agent job already has its own proxy via --enable-api-proxy; the activation proxy is separate
  • Proxy container should be cleaned up at job end (or use --rm with health check)
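
For the "start before any API-calling step" requirement, a readiness check could poll the proxy port before later steps run. The Python sketch below is a generic TCP probe under assumed names and timeouts; it confirms only that something accepts connections on the port, not that the proxy has loaded its policy, and this issue does not specify a dedicated health endpoint.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 10.0,
                  interval: float = 0.2) -> bool:
    """Poll until a TCP connection to (host, port) succeeds or `timeout` elapses.

    Returns True once a connection is accepted, False if the deadline passes.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means the listener is up; close immediately.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            time.sleep(interval)
    return False
```

A step placed directly after the proxy startup could call `wait_for_port("127.0.0.1", 18443)` and fail the job when it returns False, so that later steps never fall back to unproxied calls silently.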

Policy derivation

  • The activation job needs the same policy as the agent job
  • Policy is already computed from workflow frontmatter — needs to be available as an env var or step output
  • Consider whether activation steps should use a more permissive policy (they're trusted code, not AI-generated)

TLS considerations

  • gh CLI works with GH_HOST pointing to a plain-HTTP (non-TLS) endpoint
  • actions/github-script base-url works with plain HTTP
  • For production, TLS may be desirable — the proxy supports --tls with auto-generated certs

Performance

  • Proxy adds ~2-5ms per request (local loopback)
  • Activation steps typically make 3-8 API calls total
  • Total overhead: <50ms (worst case 8 calls × 5ms = 40ms), which is negligible

Testing Done

The following has been validated in CI (run 23503983029):

Category                                        Tests  Status
actions/github-script REST (in/out scope)           6
gh CLI REST with /api/v3 prefix                     6
gh CLI GraphQL with field injection                 4
Single-resource reads (issues, PRs, commits)       12
List operations (branches, tags, releases)          8
Search operations (code, issues, repos)             3
GraphQL expansions (PRs, commits, search)           4
Global endpoint blocking (get_me, viewer)           2
Out-of-scope blocking                               2
Total                                              47

Proxy Route Coverage

The proxy currently maps 48 REST route patterns to 25 unique tool names, plus 13 GraphQL tool names. The Rust guard has integrity rules for 34+ tools. Full coverage details are in the gh-aw-mcpg proxy documentation.
