Add resumable multipart upload support for S3 #28506
Conversation
Accept `uploadId`, `partNumber`, and `previousParts` options on the S3 writer API to enable resuming interrupted multipart uploads.

When `uploadId` is provided:
- Skip CreateMultipartUpload (reuse the existing upload)
- Start part numbering from the given `partNumber`
- Include `previousParts` ETags in the CompleteMultipartUpload XML
- Never fall back to single PUT (stay in multipart mode)

Validation:
- `partNumber > 1` requires `uploadId`
- `previousParts` requires `uploadId`
- `partNumber` must be 1-10000
- Each `previousParts` entry must have `partNumber` and `etag`
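The validation rules above can be modeled in plain TypeScript. This is a hypothetical helper for illustration only — the real checks live in Zig (`src/s3/credentials.zig`), and the function and type names here are invented:

```typescript
// Hypothetical TypeScript model of the validation rules described above;
// the actual implementation is in src/s3/credentials.zig.
interface PreviousPart {
  partNumber: number;
  etag: string;
}

interface ResumeOptions {
  uploadId?: string;
  partNumber?: number;
  previousParts?: PreviousPart[];
}

function validateResumeOptions(opts: ResumeOptions): void {
  const partNumber = opts.partNumber ?? 1;
  // partNumber must be within S3's 1-10000 part limit
  if (partNumber < 1 || partNumber > 10000) {
    throw new RangeError("partNumber must be between 1 and 10000");
  }
  // resuming mid-upload only makes sense with an existing uploadId
  if (partNumber > 1 && !opts.uploadId) {
    throw new TypeError("partNumber > 1 requires uploadId");
  }
  if (opts.previousParts && !opts.uploadId) {
    throw new TypeError("previousParts requires uploadId");
  }
  // each previously uploaded part must carry its number and ETag
  for (const part of opts.previousParts ?? []) {
    if (typeof part.partNumber !== "number" || typeof part.etag !== "string") {
      throw new TypeError("each previousParts entry must have partNumber and etag");
    }
  }
}
```

A caller resuming at part 3 would pass `{ uploadId, partNumber: 3, previousParts: [...] }`; omitting `uploadId` with `partNumber: 3` fails the second check.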
Walkthrough: S3 multipart upload gained resumable support.
🚥 Pre-merge checks: 4 passed.
Actionable comments posted: 4
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/bun.js/webcore/Blob.zig`:
- Around line 1130-1132: The review points out that several S3.uploadStream(...)
call sites in Blob.zig (e.g., the calls near where aws_options is parsed) still
pass the hardcoded arguments null, 1, &.{} instead of propagating the parsed
aws_options.resume fields; update every S3.uploadStream(...) invocation that
already builds aws_options to pass aws_options.resume.uploadId,
aws_options.resume.partNumber, and aws_options.resume.previousParts (or the
equivalent resume tuple used in this file) in place of null, 1, &.{} so the
uploadId/partNumber/previousParts validated earlier are forwarded to Bun.write,
Request/Response body locking, and pipeReadableStreamToBlob (fix similar call
sites at the other ranges mentioned).
In `@src/s3/credentials.zig`:
- Around line 340-347: The current validation allows resuming at part_number > 1
without earlier-part ETags (new_credentials.part_number,
new_credentials.previous_parts), which can lead to building
CompleteMultipartUpload from only new parts; update the check so that when
new_credentials.part_number > 1 you also require
new_credentials.previous_parts.items.len > 0 (or, if upload_id is present and
previous_parts is empty, call ListParts to hydrate multipart_etags before
proceeding) so that multipart_etags in client.zig is always seeded with
earlier-part ETags prior to completion.
In `@test/js/bun/s3/s3-multipart-resume.test.ts`:
- Around line 5-404: Add a test that asserts proper error propagation when the
server rejects a resumed uploadId (e.g., respond 404 for PUT with partNumber or
POST complete). Update test/js/bun/s3/s3-multipart-resume.test.ts by adding a
new it(...) that creates a S3Client and calls client.file(...).writer({
uploadId, partNumber, partSize, ... }), have the Bun.serve handler return 404
for requests whose url.search includes "uploadId=" or "partNumber=", then
attempt to write/end and expect the writer operation to throw/reject; reference
the existing symbols S3Client.file(...).writer, writer.write, writer.end, and
the test's server fetch handler to locate where to add the case. Ensure the
assertion checks the thrown error (or rejected promise) to confirm the error
bubbles to the caller.
- Around line 242-255: Update the test in s3-multipart-resume.test.ts to assert
specific error messages instead of generic throws: when calling
S3Client(...).file("test").writer({ uploadId: "abc", partNumber: 0 }) and
...partNumber: 10001, replace expect(...).toThrow() with assertions that check
the exact validation message (or a stable regex) emitted by the
writer/validation logic (identify the error raised by the writer method in the
S3Client implementation) so failures indicate which boundary check failed;
reference the S3Client constructor and the writer(...) call to locate the test
and match the actual error text your code throws.
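The error-propagation test suggested above can be modeled in miniature without a live `Bun.serve` instance. This self-contained sketch uses invented names (`Handler`, `uploadPart`) to show the shape being asked for — the real test would use the `Bun.serve` fetch callback and assert on `writer.write()` / `writer.end()`:

```typescript
// Self-contained simulation: the "server" rejects any resumed-upload
// request (uploadId= or partNumber= in the query) with 404, and the
// error must surface to the caller rather than being swallowed.
type Handler = (url: string) => { status: number; body: string };

const handler: Handler = (url) =>
  url.includes("uploadId=") || url.includes("partNumber=")
    ? { status: 404, body: "<Error><Code>NoSuchUpload</Code></Error>" }
    : { status: 200, body: "" };

async function uploadPart(url: string, handle: Handler): Promise<void> {
  const res = handle(url);
  if (res.status !== 200) {
    // propagate the server's rejection to the caller
    throw new Error(`upload failed (${res.status}): ${res.body}`);
  }
}
```

Asserting on a stable identifier in the caught error (`"NoSuchUpload"` or the 404 status) is what makes the test pinpoint which check failed, per the review's second comment.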
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: fedfbfed-9cde-449e-a14f-b11a8ea9602d
📒 Files selected for processing (6)
- src/bun.js/webcore/Blob.zig
- src/bun.js/webcore/fetch.zig
- src/s3/client.zig
- src/s3/credentials.zig
- src/s3/multipart.zig
- test/js/bun/s3/s3-multipart-resume.test.ts
- Add overlap validation: previousParts[].partNumber must be < partNumber
- Add validation: partNumber > 1 requires previousParts
- Skip rollback (AbortMultipartUpload) for resumed uploads on failure
- Propagate resume fields through all uploadStream call sites
- Force multipart path for write() when uploadId is set
- Use small buffers in tests (resume bypasses partSize threshold)
- Add server rejection error propagation test
- Add specific error message assertions
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/bun.js/webcore/Blob.zig (1)
Lines 1109-1132: ⚠️ Potential issue | 🔴 Critical — Keep empty resumed writes on the multipart path.
This only fixes the non-empty `.bytes` branch. Line 1005 still routes empty payloads (`""`, `new Uint8Array()`, `new Blob([])`) into `writeFileWithEmptySourceToDestination()`, and the S3 arm there unconditionally calls `S3.upload("")`. With `uploadId` set, `Bun.write(...)` becomes a normal empty PUT instead of completing the existing multipart upload, which can overwrite the object and breaks the no-single-PUT guarantee.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/bun.js/webcore/Blob.zig` around lines 1109 - 1132, The multipart resume case currently only handles non-empty `bytes` and routes empty payloads into `writeFileWithEmptySourceToDestination()` where the S3 branch calls `S3.upload("")`, causing an empty PUT to overwrite an in-progress multipart upload when `aws_options.upload_id` is set. Change the logic so that any operation with a non-null `aws_options.upload_id` (regardless of `bytes.len`) follows the multipart-resume path (the same path that uses `uploadStream` / multipart completion) instead of the empty-source shortcut; specifically, update the condition around `aws_options.upload_id` and `bytes.len` and/or add a guard in `writeFileWithEmptySourceToDestination` so that when `aws_options.upload_id != null` you do the multipart finalize/complete flow (not `S3.upload("")`) and use the existing multipart helpers (e.g., `S3.uploadStream`, multipart complete API) to finish the upload.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/s3/credentials.zig`:
- Around line 293-323: The parser prevents resuming/upload-completion when the
last uploaded part is the cap (10000) because previousParts entries must be
strictly less than new_credentials.part_number; change that check to allow
previousParts.partNumber <= new_credentials.part_number so a previous part equal
to the cap is allowed (update the comparison at the validation that returns
throwInvalidArguments inside the JSArrayIterator loop), and ensure the range
check using throwRangeError still enforces 1..10000 for both partNumber and
previousParts[].partNumber (references: new_credentials.part_number,
previousParts parsing block, jsc.JSArrayIterator.init, getOptional,
throwRangeError, throwInvalidArguments).
- Around line 278-291: The code silently ignores an explicit empty uploadId
string causing resume attempts to fall back to a fresh upload; in
src/s3/credentials.zig update the uploadId handling so that when
opts.getTruthyComptime(globalObject, "uploadId") yields a JS value that is a
string but empty (js_value.isString() and bun.String.fromJS yields str with tag
.Empty or str.toUTF8 results in an empty slice) you throw an invalid-argument
error instead of treating it as "not provided." Modify the branch around
opts.getTruthyComptime/globalObject and the js_value.isString() handling to
detect an explicitly empty string and call
globalObject.throwInvalidArgumentTypeValue("uploadId", "non-empty string",
js_value) (or similar) while keeping the existing path that accepts and assigns
non-empty strings to new_credentials._uploadIdSlice and
new_credentials.upload_id.
In `@test/js/bun/s3/s3-multipart-resume.test.ts`:
- Around line 404-447: The test currently only asserts that an error was thrown
by using expect(e).toBeDefined(); update the catch to assert the specific
surfaced failure from the mocked 404 response so we lock in propagation of the
S3 error (e.g. check that the error contains "NoSuchUpload" or that its HTTP
status is 404 or its message includes the XML error); locate the try/catch
around writer.end() in the "should propagate server rejection of resumed upload"
test and replace the generic expect with an assertion that inspects the caught
error (variable e) for the stable identifier ("NoSuchUpload" or 404) coming from
the server response.
- Around line 154-208: The test currently doesn't verify that no zero-byte PUT
for partNumber=4 was sent; modify the server fetch handler to record incoming
PUT requests whose URL.search includes "partNumber=" (e.g., push part numbers
into an array like observedParts) when req.method === "PUT", and after await
writer.end() assert that observedParts does not contain "4" (or assert
completionBody does not include "<PartNumber>4</PartNumber>" and no ETag for
part 4). Use the existing identifiers: the Bun.serve fetch handler, uploadId,
completionBody, completionCalled, and the writer returned from
client.file("test_complete_only").writer to locate where to add the tracking and
the new assertion.
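The tracking the review suggests is straightforward to sketch in plain TypeScript — `recordPut` and `observedParts` are hypothetical names; in the real test this logic would live inside the `Bun.serve` fetch handler:

```typescript
// Sketch of the suggested tracking: record the partNumber of every PUT
// the mock server sees, so the test can later assert that no stray
// zero-byte PUT for part 4 was issued during a completion-only resume.
const observedParts: string[] = [];

function recordPut(method: string, url: string): void {
  if (method !== "PUT") return;
  const partNumber = new URL(url).searchParams.get("partNumber");
  if (partNumber !== null) observedParts.push(partNumber);
}

// Example traffic the handler might see:
recordPut("PUT", "https://bucket.example/key?uploadId=abc&partNumber=1");
recordPut("POST", "https://bucket.example/key?uploadId=abc"); // completion, not a part
// after `await writer.end()`, the test would assert:
// expect(observedParts).not.toContain("4");
```

Checking `completionBody` for `<PartNumber>4</PartNumber>`, as the review also mentions, is a complementary assertion on the completion XML rather than on the request stream.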
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: f364678b-0deb-45a3-8e38-7b82dd97a427
📒 Files selected for processing (6)
- src/bun.js/webcore/Blob.zig
- src/bun.js/webcore/fetch.zig
- src/s3/client.zig
- src/s3/credentials.zig
- src/s3/multipart.zig
- test/js/bun/s3/s3-multipart-resume.test.ts
- Validate uploadId for CRLF injection and length (max 1024) to prevent process crash from bufPrint overflow on user-supplied strings
- Add duplicate partNumber detection in previousParts
- Move previousParts-requires-uploadId check before the parsing loop for correct error messages
- Allow partNumber up to 10001 for completion-only case (all 10000 parts already uploaded)
- Add no-new-parts assertion in completion-only test
- Strengthen server rejection test assertion
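The uploadId sanity checks described in this commit can be sketched as a single predicate — a hypothetical TypeScript model of validation that actually happens in Zig, with the 1024-byte limit taken from the commit message above:

```typescript
// Hypothetical model of the uploadId checks: reject explicit empty
// strings, CRLF characters (request/XML injection), and over-long
// values (the real code caps length to avoid a bufPrint overflow).
const MAX_UPLOAD_ID_LENGTH = 1024; // per the commit message above

function isValidUploadId(id: string): boolean {
  if (id.length === 0) return false;                  // empty is an explicit error, not "absent"
  if (id.length > MAX_UPLOAD_ID_LENGTH) return false; // bounded to protect fixed buffers
  if (/[\r\n]/.test(id)) return false;                // no CRLF injection into signed requests
  return true;
}
```

The empty-string case also addresses the earlier review comment that an explicit `uploadId: ""` must throw rather than silently start a fresh upload.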
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@test/js/bun/s3/s3-multipart-resume.test.ts`:
- Around line 264-277: The test uses partNumber: 10001 which the current
S3Client.writer logic treats as valid; update the failing-case to use 10002 so
the test actually hits the upper-bound validation. In the test block that
constructs new S3Client(...) and calls client.file("test").writer({ uploadId:
"abc", partNumber: ... }), replace the second expect's partNumber value with
10002 (and keep the expect toThrow("partNumber")) so the writer() validation for
partNumber range fails as intended.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: c6e8619c-5fcd-436a-a5fe-f5a1dd82e7ff
📒 Files selected for processing (2)
- src/s3/credentials.zig
- test/js/bun/s3/s3-multipart-resume.test.ts
- Add uploadId, partNumber, previousParts to S3Options in bun-types
- Validate etag values reject XML special characters (<, >, &)
- Guard enqueuePart against partNumber > 10000 with clear error
- Fix test: use 10002 for range boundary (10001 is valid completion-only)
In test/js/bun/s3/s3-multipart-resume.test.ts:

```ts
import { describe, expect, it } from "bun:test";
import { randomUUID } from "node:crypto";

describe("s3 - Resumable multipart upload", () => {
```
🟡 The describe( block on line 5 of the new test file should be describe.concurrent( per test/CLAUDE.md: 6 of the 12 tests each spawn an independent Bun.serve({ port: 0 }) server with no shared mutable state. The fix is trivial: change describe( to describe.concurrent( on line 5.
Extended reasoning...
The project's test/CLAUDE.md explicitly states: "When multiple tests in the same file spawn processes or write files, make them concurrent with test.concurrent or describe.concurrent unless it is very difficult to make them concurrent."
The new test file test/js/bun/s3/s3-multipart-resume.test.ts uses a plain sequential describe(...) block on line 5. The file contains 12 it(...) tests, of which 6 call Bun.serve({ port: 0, ... }) to start real HTTP servers: "should skip CreateMultipartUpload when uploadId is provided", "should include previousParts in CompleteMultipartUpload XML", "should complete with only previousParts and no new data", "should not use single PUT when resuming with small data", "should use correct uploadId in part upload and completion requests", and "should propagate server rejection of resumed upload".
All 6 server-spawning tests are independently isolated: each uses using server = Bun.serve({ port: 0 }) (ephemeral port, auto-stopped via using), creates its own randomUUID() upload ID, uses a distinct filename string, and asserts only on locally scoped variables. The shared s3Options constant is read-only config. There is no shared mutable state between tests. Making them concurrent requires exactly one word change.
The adjacent s3.test.ts file uses describe.concurrent in 11+ places, confirming this is the established pattern for S3 tests that spawn servers in this directory. This violation was introduced by the PR and is not pre-existing.
Addressing the refutation: One verifier noted this is a style/convention issue only — the tests are functionally correct whether run sequentially or concurrently. That observation is accurate. However, it does not argue for abstaining: CLAUDE.md's guidance is an explicit project requirement, not merely advisory preference. The refutation itself concedes the issue is "at best nit", which is exactly the severity filed here. Filing a nit for a clear, trivially-fixable guideline violation is appropriate and does not produce a self-contradicting comment.
Step-by-step proof that the fix is trivial: (1) Line 5 reads describe("s3 - Resumable multipart upload", () => {. (2) Change to describe.concurrent("s3 - Resumable multipart upload", () => {. (3) All 6 server-spawning tests use port: 0 so no port collisions occur. (4) All assertions reference locally scoped variables. (5) No expect.assertions(), toMatchSnapshot(), or shared state that would prevent concurrency. The change is one word.
In src/s3/multipart.zig:

```diff
 this.state = .finished;
 try this.callback(.{ .failure = _err }, this.callback_context);

-if (old_state == .multipart_completed) {
+if (old_state == .multipart_completed and !this.is_resumed) {
     // we are a multipart upload so we need to rollback
     // will deref after rollback
     try this.rollbackMultiPartRequest();
```
🟡 The else branch comment in fail() (multipart.zig:447) still reads '// single file upload no need to rollback', but this PR introduced is_resumed so the else branch now also covers resumed multipart uploads where old_state == .multipart_completed and is_resumed == true. Update the comment to reflect both cases, e.g. '// single file upload, or resumed upload (we did not initiate the upload, so skip rollback)'.
Extended reasoning...
The fail() function in multipart.zig guards rollback with if (old_state == .multipart_completed and !this.is_resumed). The else branch therefore fires for two distinct cases: (1) old_state != .multipart_completed, meaning a single-file upload that never entered multipart mode, and (2) old_state == .multipart_completed and this.is_resumed == true, meaning a resumed multipart upload whose upload_id was user-supplied.
Before this PR, is_resumed did not exist and the else branch was only reachable for single-file uploads, so the comment '// single file upload no need to rollback' was accurate. This PR added is_resumed and the 'and !this.is_resumed' guard, expanding the else branch to include resumed uploads, but the comment was not updated alongside that change.
The comment now misleads readers into thinking the else branch implies no upload_id is present. In reality, when is_resumed == true, upload_id is non-empty (it holds the user-provided ID) and the upload IS multipart. A developer reading this comment could incorrectly assume the branch is only reachable in single-file contexts and make wrong assumptions about the state of upload_id or multipart_etags.
One verifier refuted this as not a bug since it has no runtime impact. That is technically correct — the code logic is sound and skipping rollback for resumed uploads is intentional. However, the misleading comment is a direct consequence of this PR's changes to the else guard condition; it represents a maintainability hazard introduced by this PR.
Proof: Before this PR, the guard was simply if (old_state == .multipart_completed), and the else case was exclusively single-file uploads. After this PR added and !this.is_resumed, the else case now includes resumed multipart uploads with non-empty upload_id. The comment at line 447 still describes only the pre-PR semantics. The fix is a one-line comment update: '// single file upload, or resumed upload (we did not initiate the upload, so skip rollback)'.
Closes #28503
Problem
Large S3 uploads can fail due to network drops, app restarts, or system sleep. When that happens, there's no way to resume safely — uploads restart from scratch, wasting time, bandwidth, and cost.
Solution
Accept `uploadId`, `partNumber`, and `previousParts` options on the S3 writer API to enable resuming interrupted multipart uploads.
Usage
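The page extraction lost the original usage snippet; what follows is a hedged sketch of what resuming could look like with these options. The identifiers `S3Client`, `.file(...).writer(...)`, `writer.write`, and `writer.end` come from the test suite referenced in the reviews; the values and the commented-out client calls are illustrative only:

```typescript
// Hypothetical resume call, based on the options described in this PR.
// The client calls are commented out because they need a live bucket.
const resume = {
  uploadId: "example-upload-id", // saved when the original upload was interrupted
  partNumber: 3,                 // next part to send
  previousParts: [               // ETags of parts 1 and 2, already uploaded
    { partNumber: 1, etag: '"etag-1"' },
    { partNumber: 2, etag: '"etag-2"' },
  ],
};

// const client = new S3Client({ /* credentials */ });
// const writer = client.file("big-file.bin").writer({ ...resume });
// writer.write(remainingChunk);
// await writer.end(); // completes the multipart upload with all ETags
```

Because `uploadId` is set, the writer skips CreateMultipartUpload, numbers new parts from 3, and merges the two prior ETags into the CompleteMultipartUpload request.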
Behavior
When `uploadId` is provided:
- CreateMultipartUpload is skipped and the existing upload is reused
- Part numbering starts from the given `partNumber`
- `previousParts` ETags are included in the CompleteMultipartUpload XML
- The writer never falls back to a single PUT
Validation
- `partNumber > 1` requires `uploadId`
- `previousParts` requires `uploadId`
- `partNumber` must be 1–10000
- Each `previousParts` entry must have `partNumber` (number) and `etag` (string)
Changes
- `src/s3/credentials.zig` — Parse `uploadId`, `partNumber`, `previousParts` from JS options with validation
- `src/s3/client.zig` — Accept resume params in `writableStream()` and `uploadStream()`, initialize `MultiPartUpload` with resume state
- `src/s3/multipart.zig` — Handle resume in `continueStream()` (skip to `multipart_completed` state when `upload_id` is set)
- `src/bun.js/webcore/Blob.zig` — Thread resume data from writer options to S3 upload functions
- `src/bun.js/webcore/fetch.zig` — Pass default (no-resume) params to updated `uploadStream` signature
Verification