/v1/api/documents/{id}/download, and a set of classification fields that a worker fills in automatically after upload. This guide covers uploading (both the inline JSON and the multipart variants), how classification works and how to re-trigger it, listing and searching, reading and updating metadata, downloading the raw bytes, and deleting.
All requests go to https://api.hq.zone and authenticate with a personal access token:
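A minimal sketch of attaching the token to a request. It assumes the token is sent as a Bearer credential in the Authorization header, which this guide does not spell out, so treat the scheme as an assumption. The helper name is hypothetical, and the request is only built, not sent.

```python
import urllib.request

BASE_URL = "https://api.hq.zone"


def build_request(path: str, token: str, method: str = "GET") -> urllib.request.Request:
    """Build (but do not send) an authenticated request against the API base URL."""
    req = urllib.request.Request(BASE_URL + path, method=method)
    # ASSUMPTION: the personal access token travels as a Bearer credential.
    req.add_header("Authorization", f"Bearer {token}")
    return req
```

The returned object can be passed to urllib.request.urlopen once the token and path are real.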
Read operations (list, search, get, download) require the documents:read scope. Write operations (upload, update, reclassify, delete) require documents:write.
Visibility
Every read enforces a visibility filter, so you only ever see documents you are allowed to see. There are three scopes:
- private: only the owner sees it.
- channel: members of the document’s channel_id see it.
- team: everyone in the workspace sees it.
The scope is set at upload (it defaults to private) and can be changed later with an update.
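The three scopes can be read as a single predicate. The sketch below is only an illustration of that rule, not the server's implementation: the Document class, its owner_id field, and the idea that an owner always sees their own document are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass
class Document:
    owner_id: str                      # hypothetical field name
    scope: str                         # "private", "channel", or "team"
    channel_id: Optional[str] = None   # only meaningful when scope == "channel"


def is_visible(doc: Document, user_id: str, user_channels: Set[str]) -> bool:
    """Return True if a workspace member passes the visibility filter."""
    if doc.owner_id == user_id:
        # ASSUMPTION: owners always see their own documents, whatever the scope.
        return True
    if doc.scope == "team":
        return True  # everyone in the workspace
    if doc.scope == "channel":
        return doc.channel_id in user_channels
    return False  # "private" (or an unknown scope) is hidden from non-owners
```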
Upload a document
There are two upload paths. They share the same dedup, storage, and classification pipeline; they differ only in how the bytes reach the server.
- Inline JSON + base64: POST /v1/api/documents. Bytes are base64-encoded in the JSON body. The JSON body is capped at 16 MiB, which lands the raw file ceiling at roughly 12 MiB after the base64 tax.
- Multipart form-data: POST /v1/api/documents/upload. Bytes ride a multipart/form-data file part, so they don’t pay the base64 tax. This path accepts larger files (up to 64 MiB) and is what browser drag-and-drop uses.
Both paths require the documents:write scope and both return the same response shape.
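The 12 MiB figure follows from base64 turning every 3 raw bytes into 4 encoded characters. The sketch below works through that arithmetic and picks an upload path by file size. The function names and the 1 KiB allowance for the other JSON fields are illustrative assumptions, not part of the API.

```python
MIB = 1024 * 1024
JSON_BODY_CAP = 16 * MIB   # inline JSON body cap stated above
MULTIPART_CAP = 64 * MIB   # multipart file cap stated above


def base64_len(raw_bytes: int) -> int:
    """Length of the padded base64 encoding of raw_bytes bytes."""
    return 4 * ((raw_bytes + 2) // 3)


def pick_upload_path(raw_bytes: int, json_overhead: int = 1024) -> str:
    """Choose "inline" or "multipart" for a file of the given size.

    json_overhead is a hypothetical allowance for filename, tags, and so on.
    """
    if base64_len(raw_bytes) + json_overhead <= JSON_BODY_CAP:
        return "inline"
    if raw_bytes <= MULTIPART_CAP:
        return "multipart"
    raise ValueError("file exceeds the 64 MiB multipart limit")
```

A file of exactly 12 MiB encodes to exactly 16 MiB of base64, which leaves no room for the surrounding JSON; that is why the inline ceiling is "roughly" 12 MiB rather than exactly 12 MiB.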
Variant A — inline JSON (base64)
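As a rough sketch of this variant, the helper below assembles the JSON body from a file's bytes using the field names given in this section. The function name and the client-side check on channel_id are illustrative; the server remains the authority on validation.

```python
import base64
import json
from typing import List, Optional


def build_inline_body(filename: str, content_type: str, data: bytes,
                      scope: str = "private",
                      channel_id: Optional[str] = None,
                      tags: Optional[List[str]] = None) -> str:
    """Return the JSON text for an inline (base64) upload."""
    # channel_id is required for scope "channel" and rejected otherwise.
    if scope == "channel" and channel_id is None:
        raise ValueError("channel_id is required when scope is 'channel'")
    if scope != "channel" and channel_id is not None:
        raise ValueError("channel_id is only accepted when scope is 'channel'")

    body = {
        "filename": filename,
        "content_type": content_type,
        "body_b64": base64.b64encode(data).decode("ascii"),
        "scope": scope,
    }
    if channel_id is not None:
        body["channel_id"] = channel_id
    if tags:
        body["tags"] = list(tags)  # a real JSON array on the inline path
    return json.dumps(body)
```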
POST it as JSON
Send Upload a document. The required fields are filename, content_type, and body_b64; scope defaults to private, and channel_id, category, tags, and caption are optional. channel_id is required when scope is channel, and rejected otherwise. A 200 returns the document id, download URL, hash, and size.
Variant B — multipart form-data
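For readers assembling the multipart request by hand, here is a minimal standard-library sketch. It puts the bytes in a part named file and sends the remaining metadata as plain text fields, with tags already joined by commas. The function name is hypothetical, and the sketch does not escape quotes in filenames.

```python
import uuid
from typing import Dict, Tuple


def build_multipart(filename: str, content_type: str, data: bytes,
                    fields: Dict[str, str]) -> Tuple[bytes, str]:
    """Return (body, Content-Type header value) for a multipart/form-data upload."""
    boundary = uuid.uuid4().hex
    chunks = []
    # Plain text fields such as scope, category, tags, caption.
    for name, value in fields.items():
        chunks.append((
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f"{value}\r\n"
        ).encode("utf-8"))
    # The file part carries the filename and content type.
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode("utf-8")
    chunks.append(head + data + b"\r\n")
    chunks.append(f"--{boundary}--\r\n".encode("ascii"))
    return b"".join(chunks), f"multipart/form-data; boundary={boundary}"
```

The second return value goes in the request's Content-Type header so the server can find the part boundaries.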
POST the file as multipart/form-data
Send Upload a document (multipart). The file bytes go in a part named file (the part name body is also accepted); the filename and content type are read from that part. Optional text fields are scope, channel_id, category, tags (comma-separated), and caption.
The response shape is identical to the inline path: document_id, download_url, sha256, size_bytes, deduplicated, and enriched.
The file part must carry a filename (via Content-Disposition: filename=...). If the part has no Content-Type, the server falls back to application/octet-stream rather than rejecting the upload. The tags field here is a single comma-separated string, whereas the inline JSON path takes a real JSON array.
How classification works
When a document is saved, a background classifier worker analyzes it and writes back auto-derived metadata. You don’t trigger it; it picks up newly created documents on its own. Its progress and output show up as fields on the document row:
- classification_status: one of pending, queued, processing, completed, failed, or skipped. A UI typically shows a pill until this reaches completed.
- auto_categorized: true once the worker has written back an auto-derived category, summary, and tags. (This is distinct from category being set, because users can also set a category manually.)
- category: the document category (nullable; may be set by the worker or by you).
- summary: an auto-generated summary (nullable).
- document_date: the date the document refers to (issue / publication / statement date), not when it was ingested (nullable).
- language: ISO 639-1 code such as en or sv (nullable).
- reference_numbers: extracted document-specific IDs (invoice / case / order numbers).
- entities: an LLM-extracted JSON blob of entities (e.g. person names, company names, identifiers, amounts, dates).
- analysis_strategy: which classifier pipeline produced the row (nullable on older rows).
- profile_version: the classifier schema version that wrote the row.
- classification_confidence: the model’s self-reported certainty in the range 0–1 (nullable). Confidence below 0.6 is typically flagged as “needs review”.
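A client that wants to wait for the worker can poll the document until its status stops changing. The sketch below assumes that completed, failed, and skipped are the states that no longer change (the guide only says a UI waits for completed), and it takes a caller-supplied fetch function so that no endpoint path is assumed. All names are illustrative.

```python
import time
from typing import Any, Callable, Dict

# ASSUMPTION: these three statuses are final; the others are still in flight.
SETTLED = {"completed", "failed", "skipped"}
REVIEW_THRESHOLD = 0.6  # "needs review" cutoff mentioned above


def needs_review(doc: Dict[str, Any]) -> bool:
    """True when a confidence is present and falls below the review threshold."""
    confidence = doc.get("classification_confidence")
    return confidence is not None and confidence < REVIEW_THRESHOLD


def wait_until_settled(fetch: Callable[[], Dict[str, Any]],
                       max_polls: int = 30,
                       interval: float = 2.0,
                       sleep: Callable[[float], None] = time.sleep) -> Dict[str, Any]:
    """Call fetch() until classification_status is settled, then return the row."""
    for _ in range(max_polls):
        doc = fetch()
        if doc.get("classification_status") in SETTLED:
            return doc
        sleep(interval)
    raise TimeoutError("classification did not settle in time")
```

Here fetch would wrap a Get a document call for one id; passing sleep as a parameter keeps the loop easy to test.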
Re-trigger classification
If the model returned a poor category, the document was uploaded before the worker was available, or classification failed terminally, you can force another pass with Reclassify a document. This resets classification_status to queued and re-queues the document for the worker. Owner-only.
A 403 means you are not the owner; a 404 means there is no such document in your workspace.
List and search
List documents
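A small sketch of building the query string for this endpoint from the filters it accepts. The helper name is hypothetical, and clamping limit on the client is a convenience rather than something the API requires. Tags are joined with commas, so a tag that itself contains a comma would not survive this encoding.

```python
from typing import Dict, Iterable, Optional
from urllib.parse import urlencode

MAX_LIMIT = 200  # server-side cap on limit


def build_list_query(scope: Optional[str] = None,
                     category: Optional[str] = None,
                     channel: Optional[str] = None,
                     tags: Optional[Iterable[str]] = None,
                     limit: Optional[int] = None) -> str:
    """Return a URL-encoded query string containing only the filters given."""
    params: Dict[str, str] = {}
    if scope is not None:
        params["scope"] = scope
    if category is not None:
        params["category"] = category
    if channel is not None:
        params["channel"] = channel
    if tags:
        params["tags"] = ",".join(tags)  # documents must carry all listed tags
    if limit is not None:
        params["limit"] = str(min(limit, MAX_LIMIT))
    return urlencode(params)
```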
List documents returns the documents visible to you, newest first. Optional query parameters narrow the set: scope, category, channel (a channel id, useful with scope=channel), and tags (comma-separated, matching documents that carry all listed tags). limit defaults to 50 and is capped at 200.
The response carries the documents array plus rollup categories and tags arrays: the distinct categories and tags across your whole visible set, for building filter sidebars.
Search documents
Search documents runs a case-insensitive substring search over filename, caption, summary, and tags. The q parameter is required; limit defaults to 50 and is capped at 200.
The response has the same shape as the list endpoint, but the categories and tags rollup arrays come back empty for search.
Get and update metadata
Get one document
Get a document returns the full metadata for a single document by id: filename, content_type, size_bytes, sha256, scope, owner fields, category, tags, caption, summary, the classification fields described above, download_url, and is_owner (whether you own it, which drives edit/delete affordances). A document not visible to you returns 404.
Update metadata
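Updates are partial, so a client has to distinguish "leave this field alone" from "clear this field". One way to sketch that in Python is a sentinel default: arguments that are not passed stay out of the body, while an explicit None is serialized as JSON null. The function and sentinel names are illustrative.

```python
import json
from typing import Any

_UNSET: Any = object()  # marks "argument not supplied", distinct from None


def build_update_body(scope: Any = _UNSET, channel_id: Any = _UNSET,
                      category: Any = _UNSET, tags: Any = _UNSET,
                      caption: Any = _UNSET) -> str:
    """Return JSON text containing only the fields the caller supplied."""
    candidate = {
        "scope": scope,
        "channel_id": channel_id,
        "category": category,
        "tags": tags,
        "caption": caption,
    }
    # Omitted fields are dropped; None survives and becomes null.
    body = {key: value for key, value in candidate.items() if value is not _UNSET}
    return json.dumps(body)
```

For example, build_update_body(caption=None) produces a body that clears the caption and touches nothing else.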
Update a document edits the mutable metadata: scope, channel_id, category, tags, and caption. Only fields present in the body change. For channel_id, category, and caption, an explicit null clears the value while omitting the field leaves it untouched. The response is the updated document row. Only the owner may update (others get 403); a missing document returns 404.
Download the bytes
A document’s metadata gives you download_url. Resolve it against the base URL and call Download a document. The response is raw binary, not JSON: the document bytes are streamed back as an attachment with the original Content-Type, the original filename in Content-Disposition, and the byte count in Content-Length. Save it to a file rather than printing it. Visibility is enforced the same way as on the read endpoints; a document you can’t see returns 404.
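A minimal sketch of the two client-side steps, resolving download_url and writing the body to disk in chunks instead of loading it all at once. The function names are hypothetical, and response stands for any object with a read(n) method, such as the one returned by urllib.request.urlopen.

```python
from urllib.parse import urljoin

BASE_URL = "https://api.hq.zone"


def resolve_download_url(download_url: str) -> str:
    """Resolve a (possibly relative) download_url against the API base URL."""
    return urljoin(BASE_URL, download_url)


def save_stream(response, dest_path: str, chunk_size: int = 64 * 1024) -> int:
    """Copy a binary response to dest_path in chunks; return the bytes written."""
    written = 0
    with open(dest_path, "wb") as out:
        while True:
            chunk = response.read(chunk_size)
            if not chunk:
                break
            out.write(chunk)
            written += len(chunk)
    return written
```

Comparing the returned count with the Content-Length header is a cheap way to notice a truncated download.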
The download is streamed, so large documents don’t have to fit in memory on the server. The response carries Cache-Control: private, max-age=300.
Delete a document
Delete a document removes a document. Only the owner may delete it (others get 403); an unknown document returns 404.
A successful delete returns 204 No Content with an empty body.
The stored bytes are removed only when no other document in the workspace still references the same content. Deduplicated copies are therefore preserved — deleting your copy of a shared file won’t pull the bytes out from under someone else’s row.
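The rule can be pictured with a toy in-memory store in which rows point at content by hash. This is only a sketch of the behaviour described here, not the service's storage layer; the class and method names are hypothetical, and keying the content by SHA-256 is an assumption based on the sha256 field in the responses.

```python
import hashlib
from typing import Dict


class DedupStore:
    """Toy model: many document rows may reference one stored blob."""

    def __init__(self) -> None:
        self.rows: Dict[str, str] = {}     # document_id -> sha256 of its content
        self.blobs: Dict[str, bytes] = {}  # sha256 -> stored bytes

    def upload(self, document_id: str, data: bytes) -> bool:
        """Add a row; return True when the bytes were already stored."""
        digest = hashlib.sha256(data).hexdigest()
        deduplicated = digest in self.blobs
        self.blobs.setdefault(digest, data)
        self.rows[document_id] = digest
        return deduplicated

    def delete(self, document_id: str) -> bool:
        """Remove a row; return True when the stored bytes were removed too."""
        digest = self.rows.pop(document_id)
        if digest in self.rows.values():
            return False  # another row still references the same content
        del self.blobs[digest]
        return True
```

Deleting the first of two identical uploads leaves the blob in place; only deleting the last remaining row removes it.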