Feature/add dependabot and codeql security checks #4479

Merged
danielhanchen merged 4 commits into unslothai:main from pkloehn1:feature/add-dependabot-and-codeql-security-checks
Mar 25, 2026

@pkloehn1
Contributor

CI / SECURITY

Add automated security scanning and dependency update workflows.

CodeQL (.github/workflows/codeql.yml):

  • Scans Python and JavaScript/TypeScript on push and PR to main, plus weekly schedule
  • Uses build-mode: none — no compilation step required for either language
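
As a rough illustration of the CodeQL setup described above, a workflow along these lines would scan both languages without a build step. This is a sketch only: the cron time, job name, and action versions are assumptions and are not copied from the PR's codeql.yml.

```yaml
# Sketch of .github/workflows/codeql.yml.
# The cron time, job name, and action versions are assumptions.
name: CodeQL

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]
  schedule:
    - cron: "0 6 * * 1"   # weekly; the exact day and time are assumed

jobs:
  analyze:
    runs-on: ubuntu-latest
    permissions:
      security-events: write   # required to upload code scanning results
      contents: read
    strategy:
      fail-fast: false
      matrix:
        include:
          - language: python
            build-mode: none
          - language: javascript-typescript
            build-mode: none
    steps:
      - uses: actions/checkout@v4
      - uses: github/codeql-action/init@v3
        with:
          languages: ${{ matrix.language }}
          build-mode: ${{ matrix.build-mode }}
      - uses: github/codeql-action/analyze@v3
        with:
          category: "/language:${{ matrix.language }}"
```

With build-mode: none, CodeQL builds its database directly from source files, which is why no compile step appears between init and analyze.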

Dependabot (.github/dependabot.yml):

  • Weekly GitHub Actions updates, grouped into a single PR
  • Weekly pip updates for root, /studio/backend/plugins/data-designer-unstructured-seed, and /unsloth/kernels/moe
  • Weekly npm updates for /studio/frontend and /studio/backend/core/data_recipe/oxc-validator
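
For orientation, the grouped GitHub Actions entry described in the first bullet might look like the following. This is a minimal sketch: the group name `actions` is hypothetical, and the PR's actual file may differ in details.

```yaml
# Sketch of the top of .github/dependabot.yml.
# The group name "actions" is a hypothetical label.
version: 2
updates:
  - package-ecosystem: "github-actions"
    directory: "/"            # Dependabot looks under .github/workflows from here
    schedule:
      interval: "weekly"
    groups:
      actions:
        patterns: ["*"]       # match every action so updates land in one PR
```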

Configure Dependabot to check for updates in various ecosystems weekly.
@gemini-code-assist
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request enhances the project's security and maintainability by introducing automated dependency management. It configures Dependabot to regularly check for and update dependencies across various parts of the codebase, including GitHub Actions, Python (pip), and JavaScript (npm) packages, ensuring that the project stays current with the latest versions and security patches.

Highlights

  • Dependabot Configuration Added: A new dependabot.yml file was introduced to automate dependency updates across the repository.
  • GitHub Actions Updates: Dependabot was configured to perform weekly updates for GitHub Actions dependencies, grouped into a single pull request.
  • Python (pip) Dependency Updates: Weekly pip dependency updates were set up for the root directory, /studio/backend/plugins/data-designer-unstructured-seed, and /unsloth/kernels/moe.
  • JavaScript (npm) Dependency Updates: Weekly npm dependency updates were configured for /studio/frontend and /studio/backend/core/data_recipe/oxc-validator.
Ignored Files
  • Ignored by pattern: .github/workflows/** (1)
    • .github/workflows/codeql.yml

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request adds Dependabot for automated dependency management, which is a great step towards improving repository security and maintenance. The Dependabot configuration is well-structured. I have one suggestion to further improve it by grouping dependency updates, which will help in managing the pull requests it creates by reducing their volume.

Comment on lines +12 to +37
  - package-ecosystem: "pip"
    directory: "/"
    schedule:
      interval: "weekly"
    open-pull-requests-limit: 10

  - package-ecosystem: "pip"
    directory: "/studio/backend/plugins/data-designer-unstructured-seed"
    schedule:
      interval: "weekly"

  - package-ecosystem: "pip"
    directory: "/unsloth/kernels/moe"
    schedule:
      interval: "weekly"

  - package-ecosystem: "npm"
    directory: "/studio/frontend"
    schedule:
      interval: "weekly"
    open-pull-requests-limit: 10

  - package-ecosystem: "npm"
    directory: "/studio/backend/core/data_recipe/oxc-validator"
    schedule:
      interval: "weekly"

medium

To improve maintainability and reduce the number of pull requests generated by Dependabot, it's a good practice to group dependency updates. You've already done this for github-actions. I suggest applying the same grouping strategy for all pip and npm package ecosystems. This will create a single pull request per directory for all its dependency updates, making them easier to manage.

  - package-ecosystem: "pip"
    directory: "/"
    schedule:
      interval: "weekly"
    open-pull-requests-limit: 10
    groups:
      pip-root:
        patterns: ["*"]

  - package-ecosystem: "pip"
    directory: "/studio/backend/plugins/data-designer-unstructured-seed"
    schedule:
      interval: "weekly"
    groups:
      pip-data-designer-unstructured-seed:
        patterns: ["*"]

  - package-ecosystem: "pip"
    directory: "/unsloth/kernels/moe"
    schedule:
      interval: "weekly"
    groups:
      pip-moe-kernels:
        patterns: ["*"]

  - package-ecosystem: "npm"
    directory: "/studio/frontend"
    schedule:
      interval: "weekly"
    open-pull-requests-limit: 10
    groups:
      npm-frontend:
        patterns: ["*"]

  - package-ecosystem: "npm"
    directory: "/studio/backend/core/data_recipe/oxc-validator"
    schedule:
      interval: "weekly"
    groups:
      npm-oxc-validator:
        patterns: ["*"]

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: e043baae0a

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines +27 to +30
  - language: python
    build-mode: none
  - language: javascript-typescript
    build-mode: none

P2 Badge Add actions to the CodeQL language matrix

Because the matrix only includes python and javascript-typescript, CodeQL will never analyze the repo's workflow files under .github/workflows/ (for example stale.yml and this new codeql.yml). GitHub's CodeQL action supports scanning GitHub Actions workflows separately, so workflow-specific security issues in CI remain completely uncovered despite this PR's goal of adding automated security checks.
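
If the suggestion above were adopted, the matrix could gain a third entry for workflow files. The fragment below is a sketch of that change only; the surrounding `strategy` layout is assumed rather than taken from the PR, and the thread does not indicate that this change was made.

```yaml
# Sketch: CodeQL language matrix extended to cover GitHub Actions workflows.
# The surrounding strategy layout is an assumption.
strategy:
  fail-fast: false
  matrix:
    include:
      - language: actions                # analyzes files under .github/workflows/
        build-mode: none
      - language: python
        build-mode: none
      - language: javascript-typescript
        build-mode: none
```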

…thai#4479

1. studio/frontend uses bun.lock not package-lock.json, so change npm to bun
2. Add missing studio/backend/requirements/ pip entry (consumed by studio/setup.sh)
3. Add groups with patterns ["*"] to all pip/bun/npm entries to batch updates
   and avoid 30+ individual Dependabot PRs on the first run
@danielhanchen
Contributor

Pushed a fix commit to dependabot.yml addressing three issues:

1. studio/frontend should use bun, not npm
The directory has bun.lock and no package-lock.json. Using the npm ecosystem here would cause Dependabot to fail or produce incorrect results.

2. Missing studio/backend/requirements/ pip entry
This directory contains requirement .txt files consumed by studio/setup.sh (line 244). Without this entry, those dependencies would not be tracked.

3. Added groups with patterns: ["*"] to all pip/bun/npm entries
The github-actions entry already had grouping, but the rest did not. Without grouping, the first Dependabot run would create 30+ individual PRs (one per outdated dependency). With patterns: ["*"], updates are batched into one PR per ecosystem/directory.

No changes to codeql.yml -- that file looks good as-is.
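
The first and third fixes described above could be expressed roughly as follows for the frontend directory. This is a sketch under stated assumptions: the group name `bun-frontend` is hypothetical, and other keys in the real entry may differ.

```yaml
# Sketch of the /studio/frontend entry after switching from npm to bun.
# The group name "bun-frontend" is hypothetical.
version: 2
updates:
  - package-ecosystem: "bun"     # the directory has bun.lock, not package-lock.json
    directory: "/studio/frontend"
    schedule:
      interval: "weekly"
    groups:
      bun-frontend:
        patterns: ["*"]          # batch all updates for this directory into one PR
```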

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 0830768379

        patterns: ["*"]

  - package-ecosystem: "pip"
    directory: "/"

P1 Badge Remove overlapping pip directories from Dependabot config

The pip update block at directory: "/" overlaps with the later pip blocks for /studio/backend/plugins/data-designer-unstructured-seed, /studio/backend/requirements, and /unsloth/kernels/moe, and Dependabot requires per-ecosystem directory scopes to be non-overlapping. In this state, the config can be rejected (or pip updates skipped), which defeats the new security-update automation for Python dependencies.

GitHub Dependabot forbids multiple same-ecosystem entries with
overlapping directories on the same branch. The root "/" directory
overlapped the 3 nested pip dirs. Merge all 4 pip blocks into one
using the `directories:` (plural) key.

Also remove redundant open-pull-requests-limit from the bun block
since grouping with patterns: ["*"] already limits PR count.
@danielhanchen
Contributor

Follow-up commit pushed: consolidated the 4 separate pip blocks into a single block using directories: (plural key).

Problem: The root "/" pip directory overlapped the 3 nested pip directories (/studio/backend/plugins/..., /studio/backend/requirements, /unsloth/kernels/moe). GitHub Dependabot does not allow multiple same-ecosystem entries with overlapping directories on the same branch.

Fix: Merged all 4 pip entries into one block:

- package-ecosystem: "pip"
  directories:
    - "/"
    - "/studio/backend/plugins/data-designer-unstructured-seed"
    - "/studio/backend/requirements"
    - "/unsloth/kernels/moe"
  schedule:
    interval: "weekly"
  open-pull-requests-limit: 10
  groups:
    pip:
      patterns: ["*"]

Also removed redundant open-pull-requests-limit from the bun block since grouping with patterns: ["*"] already limits PR volume.
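
Putting the pieces from this thread together, the resulting dependabot.yml might look roughly like the sketch below. It is assembled from the comments above, not copied from the merged file: the group names `actions`, `bun-frontend`, and `npm-oxc-validator` are hypothetical, and entry order is assumed.

```yaml
# Sketch of the consolidated .github/dependabot.yml, assembled from the thread.
# Group names other than "pip" are hypothetical; entry order is assumed.
version: 2
updates:
  - package-ecosystem: "github-actions"
    directory: "/"
    schedule:
      interval: "weekly"
    groups:
      actions:
        patterns: ["*"]

  # One pip block with several directories avoids overlapping same-ecosystem entries.
  - package-ecosystem: "pip"
    directories:
      - "/"
      - "/studio/backend/plugins/data-designer-unstructured-seed"
      - "/studio/backend/requirements"
      - "/unsloth/kernels/moe"
    schedule:
      interval: "weekly"
    open-pull-requests-limit: 10
    groups:
      pip:
        patterns: ["*"]

  - package-ecosystem: "bun"
    directory: "/studio/frontend"
    schedule:
      interval: "weekly"
    groups:
      bun-frontend:
        patterns: ["*"]

  - package-ecosystem: "npm"
    directory: "/studio/backend/core/data_recipe/oxc-validator"
    schedule:
      interval: "weekly"
    groups:
      npm-oxc-validator:
        patterns: ["*"]
```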

@danielhanchen
Contributor

/gemini review

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces automated security scanning with Dependabot by adding a .github/dependabot.yml configuration file. The configuration is well-structured and covers GitHub Actions, pip, bun, and npm ecosystems. I've found a potential misconfiguration in the pip dependency paths that could cause Dependabot to miss some dependencies. My specific comment provides details on how to correct this.

Comment on lines +13 to +17
    directories:
      - "/"
      - "/studio/backend/plugins/data-designer-unstructured-seed"
      - "/studio/backend/requirements"
      - "/unsloth/kernels/moe"

high

There seem to be a couple of issues with the configured pip directories that will prevent Dependabot from finding some of your dependencies:

  1. / (root directory): Based on the files in this pull request, there doesn't appear to be a requirements.txt or pyproject.toml file in the root directory. If one doesn't exist, Dependabot will not find any dependencies to update here. Please either add a manifest file to the root or remove this entry if it's not needed.

  2. /studio/backend/requirements: Dependabot looks for specific file names like requirements.txt or pyproject.toml. The file studio/backend/requirements/base.txt will not be detected. To fix this, you could rename base.txt to requirements.txt. If base.txt is included from another requirements file (e.g., in /studio/backend), you should point to that directory instead.

These misconfigurations will cause Dependabot to silently fail to update dependencies in these locations.

@pkloehn1
Contributor Author

Thanks @danielhanchen - I didn't want to make assumptions about how the project wanted these types of tools to group findings.

@danielhanchen danielhanchen merged commit efedbe9 into unslothai:main Mar 25, 2026
1 check passed
@danielhanchen
Contributor

Appreciate it @pkloehn1 !


2 participants