Skip to content

Fix #5847: Replace shell binary check with Kotlin script and tests#6168

Open
Sandesh282 wants to merge 1 commit intooppia:developfrom
Sandesh282:fix-binary-files-check-5847
Open

Fix #5847: Replace shell binary check with Kotlin script and tests#6168
Sandesh282 wants to merge 1 commit intooppia:developfrom
Sandesh282:fix-binary-files-check-5847

Conversation

@Sandesh282
Copy link
Collaborator

Fixes #5847

Explanation

Fixes #5847: The existing binary file check in pre-commit.sh uses file --mime | grep binary, which produces false positives on text files like .textproto, .json, .xml, and .md. This PR replaces that unreliable shell-based check with a robust Kotlin script (BinaryFileCheck.kt) that uses a two-layer detection approach:

  1. Extension allowlist: Files with known text extensions (.kt, .java, .xml, .json, .textproto, .proto, .md, .sh, .bazel, .pro, etc.) pass extension validation. Files with unknown extensions are flagged as binary (fail-safe).
  2. Content validation: Files with allowed extensions are scanned for binary content by checking that all characters are valid text characters (printable ASCII, Unicode whitespace, or any defined non-control Unicode character).

A filepath-based exemption list (binary_file_exemptions.textproto) is used to allow legitimate binary files (e.g., PNG images, GPG keys). Stale exemptions (paths that no longer exist) cause the check to fail, keeping the list up to date.

Changes to scripts/assets: Added binary_file_exemptions.textproto containing the 20 existing legitimate binary files in the repository (PNG images in drawable-nodpi/ and mipmap-*/, GPG key files in .gitsecret/, and the encrypted remote cache credentials config).

Essential Checklist

  • The PR title starts with "Fix #bugnum: " (If this PR fixes part of an issue, prefix the title with "Fix part of #bugnum: ...".)
  • The explanation section above starts with "Fixes #bugnum: " (If this PR fixes part of an issue, use instead: "Fixes part of #bugnum: ...".)
  • Any changes to scripts/assets files have their rationale included in the PR explanation.
  • The PR follows the style guide.
  • The PR does not contain any unnecessary code changes from Android Studio (reference).
  • The PR is made from a branch that's not called "develop" and is up-to-date with "develop".
  • The PR is assigned to the appropriate reviewers (reference).

For UI-specific PRs only

Not applicable. This PR only modifies scripts, build configs, and CI workflows with no UI changes.

@Sandesh282 Sandesh282 requested review from a team as code owners March 20, 2026 08:57
@github-actions
Copy link

Coverage Report

Results

Number of files assessed: 1
Overall Coverage: 90.38%
Coverage Analysis: PASS

Passing coverage

Files with passing code coverage
File Coverage Lines Hit Status Min Required
BinaryFileCheck.ktscripts/src/java/org/oppia/android/scripts/binary/BinaryFileCheck.kt
90.38% 94 / 104 70%

To learn more, visit the Oppia Android Code Coverage wiki page

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[BUG]: Binary Files Check Throws False Positives

1 participant