Ever tried parsing a PDF that secretly holds five contracts or ten claim forms?
This isn’t unusual—insurance submissions, bank statements, and legal contracts often arrive bundled together in one file.
With Document Split (Beta) in Upstage Information Extract, you can automatically separate multiple documents in one PDF: no custom scripts, fewer errors, and faster processing at scale.
When is it useful?
- Insurance: Multiple claim forms in one submission
- Banking/Accounting: Several account statements scanned together
- Legal: Bundled contracts requiring individual processing
Example: An insurance PDF with multiple claim forms can now be automatically split and processed per document.
Request
extraction_response = client.chat.completions.create(
    model="information-extract",
    messages=[ ... ],
    response_format={ ... },
    extra_body={
        "doc_split": True  # Enable document splitting
    }
)
Response (simplified)
[
  { "bank_name": "First National Bank" },
  { "bank_name": "Global Trust Bank" },
  { "bank_name": "Metro Financial Bank" }
]
Note: The actual API response returns results under a choices array, with each document’s result returned in message.content as a JSON string. The above is simplified for readability.
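Since each document's result arrives as a JSON string, a small helper can turn the raw response into a list of dictionaries. The sketch below is illustrative only: it assumes each entry in choices corresponds to one split document, and parse_split_results and the mock response are hypothetical names and data made up for this example, not part of the Upstage API.

```python
import json
from types import SimpleNamespace


def parse_split_results(response):
    """Return one dict per split document from a chat-completions-style response.

    Assumption: each item in response.choices holds one document's result,
    with message.content containing a JSON string.
    """
    documents = []
    for choice in response.choices:
        # Decode the JSON string carried in message.content.
        documents.append(json.loads(choice.message.content))
    return documents


# Hypothetical stand-in for a real API response, mirroring the simplified output.
mock_response = SimpleNamespace(
    choices=[
        SimpleNamespace(message=SimpleNamespace(content='{"bank_name": "First National Bank"}')),
        SimpleNamespace(message=SimpleNamespace(content='{"bank_name": "Global Trust Bank"}')),
        SimpleNamespace(message=SimpleNamespace(content='{"bank_name": "Metro Financial Bank"}')),
    ]
)

results = parse_split_results(mock_response)
for index, doc in enumerate(results, start=1):
    print(f"Document {index}: {doc['bank_name']}")
```

With a real call, you would pass extraction_response to the helper instead of the mock object. Check the developer documentation for the exact response shape before relying on this layout.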
Why it matters
Document Split eliminates manual splitting, improves accuracy, and makes it easy to process complex, multi-document files.
Next in this series: Location Coordinates (Beta)—a feature that lets you trace every extracted value back to its exact spot in the document.
Document Split is now available in beta. Try it in the Upstage Console, or read more in the developer documentation.