Feature/add dependabot and codeql security checks #4479
Configure Dependabot to check for updates in various ecosystems weekly.
Summary of Changes (Gemini Code Assist)
This pull request enhances the project's security and maintainability by introducing automated dependency management. It configures Dependabot to regularly check for and update dependencies across various parts of the codebase, including GitHub Actions, Python (pip), and JavaScript (npm) packages, ensuring that the project stays current with the latest versions and security patches.
Code Review
This pull request adds Dependabot for automated dependency management, which is a great step towards improving repository security and maintenance. The Dependabot configuration is well-structured. I have one suggestion to further improve it by grouping dependency updates, which will help in managing the pull requests it creates by reducing their volume.
  - package-ecosystem: "pip"
    directory: "/"
    schedule:
      interval: "weekly"
    open-pull-requests-limit: 10

  - package-ecosystem: "pip"
    directory: "/studio/backend/plugins/data-designer-unstructured-seed"
    schedule:
      interval: "weekly"

  - package-ecosystem: "pip"
    directory: "/unsloth/kernels/moe"
    schedule:
      interval: "weekly"

  - package-ecosystem: "npm"
    directory: "/studio/frontend"
    schedule:
      interval: "weekly"
    open-pull-requests-limit: 10

  - package-ecosystem: "npm"
    directory: "/studio/backend/core/data_recipe/oxc-validator"
    schedule:
      interval: "weekly"
To improve maintainability and reduce the number of pull requests generated by Dependabot, it's a good practice to group dependency updates. You've already done this for github-actions. I suggest applying the same grouping strategy for all pip and npm package ecosystems. This will create a single pull request per directory for all its dependency updates, making them easier to manage.
  - package-ecosystem: "pip"
    directory: "/"
    schedule:
      interval: "weekly"
    open-pull-requests-limit: 10
    groups:
      pip-root:
        patterns: ["*"]
  - package-ecosystem: "pip"
    directory: "/studio/backend/plugins/data-designer-unstructured-seed"
    schedule:
      interval: "weekly"
    groups:
      pip-data-designer-unstructured-seed:
        patterns: ["*"]
  - package-ecosystem: "pip"
    directory: "/unsloth/kernels/moe"
    schedule:
      interval: "weekly"
    groups:
      pip-moe-kernels:
        patterns: ["*"]
  - package-ecosystem: "npm"
    directory: "/studio/frontend"
    schedule:
      interval: "weekly"
    open-pull-requests-limit: 10
    groups:
      npm-frontend:
        patterns: ["*"]
  - package-ecosystem: "npm"
    directory: "/studio/backend/core/data_recipe/oxc-validator"
    schedule:
      interval: "weekly"
    groups:
      npm-oxc-validator:
        patterns: ["*"]
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: e043baae0a
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
          - language: python
            build-mode: none
          - language: javascript-typescript
            build-mode: none
Add actions to the CodeQL language matrix
Because the matrix only includes python and javascript-typescript, CodeQL will never analyze the repo's workflow files under .github/workflows/ (for example stale.yml and this new codeql.yml). GitHub's CodeQL action supports scanning GitHub Actions workflows separately, so workflow-specific security issues in CI remain completely uncovered despite this PR's goal of adding automated security checks.
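A minimal sketch of what the suggested matrix could look like, assuming the workflow uses a `matrix.include` list as the quoted hunk suggests. The job name `analyze`, the runner, and `fail-fast: false` are illustrative assumptions, not taken from this PR; the only proposed change is the third entry.

```yaml
# Hypothetical excerpt of .github/workflows/codeql.yml (job name and runner are assumptions)
jobs:
  analyze:
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
      matrix:
        include:
          - language: python
            build-mode: none
          - language: javascript-typescript
            build-mode: none
          # Suggested addition: analyze workflow files under .github/workflows/
          - language: actions
            build-mode: none
```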
…thai#4479
1. studio/frontend uses bun.lock, not package-lock.json, so change npm to bun
2. Add missing studio/backend/requirements/ pip entry (consumed by studio/setup.sh)
3. Add groups with patterns ["*"] to all pip/bun/npm entries to batch updates and avoid 30+ individual Dependabot PRs on the first run
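As a rough illustration of items 1 to 3, the affected entries might look like the sketch below. The group names `bun-frontend` and `pip-backend-requirements` are hypothetical placeholders, and other options from the actual commit may be omitted here.

```yaml
# Sketch of .github/dependabot.yml entries after this commit (group names are assumptions)
version: 2
updates:
  # Item 1: the frontend is locked with bun.lock, so use the bun ecosystem instead of npm
  - package-ecosystem: "bun"
    directory: "/studio/frontend"
    schedule:
      interval: "weekly"
    groups:
      bun-frontend:
        patterns: ["*"]   # Item 3: batch all updates for this directory into one PR

  # Item 2: requirements consumed by studio/setup.sh
  - package-ecosystem: "pip"
    directory: "/studio/backend/requirements"
    schedule:
      interval: "weekly"
    groups:
      pip-backend-requirements:
        patterns: ["*"]
```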
Pushed a fix commit with the three changes listed in the commit message above.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 0830768379
.github/dependabot.yml
Outdated
        patterns: ["*"]

  - package-ecosystem: "pip"
    directory: "/"
Remove overlapping pip directories from Dependabot config
The pip update block at directory: "/" overlaps with the later pip blocks for /studio/backend/plugins/data-designer-unstructured-seed, /studio/backend/requirements, and /unsloth/kernels/moe, and Dependabot requires per-ecosystem directory scopes to be non-overlapping. In this state, the config can be rejected (or pip updates skipped), which defeats the new security-update automation for Python dependencies.
GitHub Dependabot forbids multiple same-ecosystem entries with overlapping directories on the same branch. The root "/" directory overlapped the 3 nested pip dirs. Merge all 4 pip blocks into one using the `directories:` (plural) key. Also remove redundant open-pull-requests-limit from the bun block since grouping with patterns: ["*"] already limits PR count.
Follow-up commit pushed: consolidated the 4 separate pip blocks into one.
Problem: The root "/" directory overlapped the 3 nested pip directories.
Fix: Merged all 4 pip entries into one block:
  - package-ecosystem: "pip"
    directories:
      - "/"
      - "/studio/backend/plugins/data-designer-unstructured-seed"
      - "/studio/backend/requirements"
      - "/unsloth/kernels/moe"
    schedule:
      interval: "weekly"
    open-pull-requests-limit: 10
    groups:
      pip:
        patterns: ["*"]
Also removed the redundant open-pull-requests-limit from the bun block.
/gemini review
Code Review
This pull request introduces automated security scanning with Dependabot by adding a .github/dependabot.yml configuration file. The configuration is well-structured and covers GitHub Actions, pip, bun, and npm ecosystems. I've found a potential misconfiguration in the pip dependency paths that could cause Dependabot to miss some dependencies. My specific comment provides details on how to correct this.
    directories:
      - "/"
      - "/studio/backend/plugins/data-designer-unstructured-seed"
      - "/studio/backend/requirements"
      - "/unsloth/kernels/moe"
There seem to be a couple of issues with the configured pip directories that will prevent Dependabot from finding some of your dependencies:
- / (root directory): Based on the files in this pull request, there doesn't appear to be a requirements.txt or pyproject.toml file in the root directory. If one doesn't exist, Dependabot will not find any dependencies to update here. Please either add a manifest file to the root or remove this entry if it's not needed.
- /studio/backend/requirements: Dependabot looks for specific file names like requirements.txt or pyproject.toml. The file studio/backend/requirements/base.txt will not be detected. To fix this, you could rename base.txt to requirements.txt. If base.txt is included from another requirements file (e.g., in /studio/backend), you should point to that directory instead.
These misconfigurations will cause Dependabot to silently fail to update dependencies in these locations.
Thanks @danielhanchen - I didn't want to make assumptions about how the project wanted these types of tools to group findings.
Appreciate it @pkloehn1!
CI / SECURITY
Add automated security scanning and dependency update workflows.
CodeQL (.github/workflows/codeql.yml):
- … main, plus weekly schedule
- build-mode: none (no compilation step required for either language)
Dependabot (.github/dependabot.yml):
- … /studio/backend/plugins/data-designer-unstructured-seed, and /unsloth/kernels/moe
- … /studio/frontend and /studio/backend/core/data_recipe/oxc-validator
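For illustration, the CodeQL triggers described above might be written roughly as follows. The description only preserves "main, plus weekly schedule", so the `push` and `pull_request` events and the cron expression are assumptions rather than values from the actual workflow.

```yaml
# Hypothetical trigger section of .github/workflows/codeql.yml
on:
  push:
    branches: [main]          # assumed event; only the branch name is given in the description
  pull_request:
    branches: [main]          # assumed event
  schedule:
    - cron: "0 6 * * 1"       # assumed weekly slot (Mondays 06:00 UTC)
```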