From xberg-enterprise
Guides upload of large files (>50 MB) using presigned URLs: presign, PUT, confirm. Avoids base64 overhead for cloud extraction.
Slash command: /xberg-enterprise:presigned-uploads
For files larger than about 50 MB, skip the base64-in-JSON body of
POST /v1/extract and use the three-step presigned-upload flow instead.
The client uploads bytes directly to object storage, then tells the API to
start processing.
1. POST /v1/uploads/presign → batch_id + per-file presigned PUT URLs
2. PUT <upload_url> → upload each file's bytes directly
3. POST /v1/uploads/confirm → start extraction, returns job_ids
Step 1 returns one upload_url per document. Step 3 cannot run until
every PUT in step 2 succeeds.
curl -X POST https://api.xberg.io/v1/uploads/presign \
-H "Authorization: Bearer $KREUZBERG_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"documents": [
{"filename": "scan.pdf", "mime_type": "application/pdf"},
{"filename": "report.docx", "mime_type": "application/vnd.openxmlformats-officedocument.wordprocessingml.document"}
],
"config": {"output_format": "markdown"},
"webhook": {"url": "https://hooks.example.com/x"}
}'
{
"batch_id": "batch_550e8400-e29b-41d4-a716",
"uploads": [
{
"job_id": "550e8400-...",
"upload_url": "https://storage.googleapis.com/kreuzberg-dev-uploads/...",
"object_key": "projects/abc123/uploads/550e8400-...",
"method": "PUT",
"expires_in_secs": 3600
},
{
"job_id": "660e9400-...",
"upload_url": "https://storage.googleapis.com/kreuzberg-dev-uploads/...",
"object_key": "projects/abc123/uploads/660e9400-...",
"method": "PUT",
"expires_in_secs": 3600
}
]
}
Keep the batch_id — you need it for step 3. URLs expire in 3600 seconds
(1 hour); upload before then.
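To make the expiry window concrete, here is a minimal sketch of a freshness check that could run before each PUT. The function names and the 60-second default safety margin are hypothetical; only expires_in_secs comes from the presign response.

```shell
# Seconds remaining before a presigned URL expires.
# $1 = epoch time when presign returned, $2 = expires_in_secs from the response,
# $3 = current epoch time (optional; defaults to now).
seconds_left() {
  echo $(( $1 + $2 - ${3:-$(date +%s)} ))
}

# Succeeds when at least a safety margin of seconds remains.
# $3 = margin in seconds (hypothetical default: 60), $4 = current epoch time (optional).
url_is_fresh() {
  [ "$(seconds_left "$1" "$2" "${4:-$(date +%s)}")" -ge "${3:-60}" ]
}

# Usage sketch:
#   issued_at=$(date +%s)    # record right after the presign call
#   url_is_fresh "$issued_at" 3600 || echo "presign again before uploading" >&2
```

If the check fails, the simplest recovery is to request a fresh batch from presign rather than attempt the PUT.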
The presigned URL already carries a Google Cloud Storage signature, so
PUT directly to it without an Authorization header. Set Content-Type to
match the mime_type declared in step 1:
curl -X PUT "<upload_url>" \
-H "Content-Type: application/pdf" \
--data-binary @scan.pdf
A successful upload returns 200 OK with no body. Do this for every
entry in uploads before moving on.
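For a batch with several files, the per-entry PUT can be looped. The sketch below makes one assumption the text does not confirm: that uploads comes back in the same order as the documents in the presign request. The helper names mime_for and put_all are hypothetical, and mime_for only covers the two MIME types shown in the example request.

```shell
# Map a filename to the MIME type declared at presign time.
# Only the two types from the example request are covered.
mime_for() {
  case "$1" in
    *.pdf)  echo "application/pdf" ;;
    *.docx) echo "application/vnd.openxmlformats-officedocument.wordprocessingml.document" ;;
    *)      echo "unknown extension: $1" >&2; return 1 ;;
  esac
}

# put_all PRESIGN_RESPONSE_JSON FILE...
# ASSUMPTION: uploads[i] corresponds to the i-th document in the request.
put_all() {
  _resp=$1; shift
  _i=0
  for _file in "$@"; do
    _url=$(printf '%s' "$_resp" | jq -r --argjson i "$_i" '.uploads[$i].upload_url') || return 1
    # jq prints "null" when the response has fewer entries than files.
    [ "$_url" != "null" ] || { echo "no upload_url for $_file" >&2; return 1; }
    _mime=$(mime_for "$_file") || return 1
    # No Authorization header: the signature is part of the URL.
    curl -fsS -X PUT "$_url" -H "Content-Type: $_mime" --data-binary "@$_file" || return 1
    _i=$((_i + 1))
  done
}

# Usage sketch: put_all "$resp" scan.pdf report.docx
```

The function stops at the first failed PUT and returns non-zero, so a caller can skip confirm until every file is in storage.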
curl -X POST https://api.xberg.io/v1/uploads/confirm \
-H "Authorization: Bearer $KREUZBERG_API_KEY" \
-H "Content-Type: application/json" \
-d '{"batch_id": "batch_550e8400-e29b-41d4-a716"}'
{
"job_ids": ["550e8400-...", "660e9400-..."],
"status": "processing"
}
These are the same job_id values returned in step 1's uploads array.
From here, the flow is identical to offloading-extraction — poll
GET /v1/jobs/{id} or wait for the webhook.
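A polling loop for the job endpoint might look like the sketch below. The job response schema is not shown in this skill, so the top-level status field is an assumption, as is treating anything other than "processing" as a final state; check the offloading-extraction skill for the real shape. The function names, retry count, and delay are hypothetical.

```shell
API=${API:-https://api.xberg.io}

# ASSUMPTION: GET /v1/jobs/{id} returns JSON with a top-level "status" field.
fetch_job_status() {
  curl -fsS "$API/v1/jobs/$1" -H "Authorization: Bearer $KREUZBERG_API_KEY" | jq -r .status
}

# wait_for_job JOB_ID [MAX_TRIES] [DELAY_SECS]
# Prints the first status that is not "processing".
# Returns 1 on a fetch error or empty status, 2 if the job is still processing
# after MAX_TRIES polls.
wait_for_job() {
  _job=$1; _max=${2:-60}; _delay=${3:-5}
  _n=0
  while [ "$_n" -lt "$_max" ]; do
    _status=$(fetch_job_status "$_job") || return 1
    [ -n "$_status" ] || return 1
    if [ "$_status" != "processing" ]; then
      echo "$_status"
      return 0
    fi
    _n=$((_n + 1))
    sleep "$_delay"
  done
  return 2
}

# Usage sketch: wait_for_job "550e8400-..." 120 5
```

When a webhook is configured at presign time, polling is only needed as a fallback.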
#!/usr/bin/env bash
set -euo pipefail
API="https://api.xberg.io"
KEY="$KREUZBERG_API_KEY"
FILE="scan.pdf"
# 1. Presign
resp=$(curl -fsS -X POST "$API/v1/uploads/presign" \
-H "Authorization: Bearer $KEY" \
-H "Content-Type: application/json" \
-d "$(jq -n --arg f "$FILE" '{documents:[{filename:$f,mime_type:"application/pdf"}]}')")
batch_id=$(echo "$resp" | jq -r .batch_id)
upload_url=$(echo "$resp" | jq -r '.uploads[0].upload_url')
# 2. PUT
curl -fsS -X PUT "$upload_url" \
-H "Content-Type: application/pdf" \
--data-binary "@$FILE"
# 3. Confirm
curl -fsS -X POST "$API/v1/uploads/confirm" \
-H "Authorization: Bearer $KEY" \
-H "Content-Type: application/json" \
-d '{"batch_id":"'"$batch_id"'"}' | jq .
| Status | Where | Cause |
|---|---|---|
| 400 | presign | Empty documents, bad MIME, missing filename. |
| 403 | PUT | URL expired (>1h since presign) or Content-Type mismatch. |
| 400 | confirm | One or more uploads missing in storage. |
| 401 | presign/confirm | Bad Bearer token. |
If confirm returns 400 complaining about a missing upload, retry the
PUT for that specific object_key — confirmation requires every file to
be present in storage first.
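One way to handle a flaky PUT is a small retry wrapper, sketched below. The retry helper and its attempt count and delay are hypothetical and not part of the API. Retrying only helps with transient failures: a 403 from an expired URL needs a fresh presign instead.

```shell
# retry MAX_ATTEMPTS DELAY_SECS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or MAX_ATTEMPTS is reached.
retry() {
  _retry_max=$1; _retry_delay=$2; shift 2
  _retry_n=1
  while :; do
    if "$@"; then
      return 0
    fi
    if [ "$_retry_n" -ge "$_retry_max" ]; then
      return 1
    fi
    _retry_n=$((_retry_n + 1))
    sleep "$_retry_delay"
  done
}

# Usage sketch (same PUT as in step 2):
#   retry 3 2 curl -fsS -X PUT "$upload_url" \
#     -H "Content-Type: application/pdf" --data-binary @scan.pdf
```

Once the retried PUT succeeds, call confirm again with the same batch_id.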
For files under ~5 MB, the JSON data field is simpler and lower-latency
(one round trip instead of three). See the offloading-extraction skill.
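The size guidance can be encoded as a small chooser. This is only a sketch: the text gives approximate thresholds ("about 50 MB", "~5 MB"), so the exact byte cutoffs below (counting 1 MB as 1024 * 1024 bytes) and the "either" label for the range in between are assumptions, and upload_path_for_size is a hypothetical name.

```shell
# upload_path_for_size BYTES
# Prints "presigned", "inline", or "either".
upload_path_for_size() {
  _mb=$((1024 * 1024))   # ASSUMPTION: 1 MB counted as 1024 * 1024 bytes
  if [ "$1" -gt $((50 * _mb)) ]; then
    echo presigned        # three-step flow: presign, PUT, confirm
  elif [ "$1" -lt $((5 * _mb)) ]; then
    echo inline           # base64 in the JSON body of POST /v1/extract
  else
    echo either           # the guidance does not prescribe a path here
  fi
}

# Usage sketch: upload_path_for_size "$(( $(wc -c < scan.pdf) ))"
```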