Information Retrieval
This section provides the API endpoints and data structures for retrieving information about your Imports, Files, and Documents. For real-time updates on processing status, use webhooks instead of polling these endpoints.
When to use these APIs:
- Initial synchronization with current state
- Recovery from missed webhook events
- Bulk status checking for dashboards
- Retrieving detailed entity structures and metadata
Import API
Retrieve information about your import operations using our REST API.
List Imports: GET /v1/imports/
Get Import: GET /v1/imports/{id}
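As a rough illustration of calling the Get Import endpoint, the sketch below uses only the Python standard library. The base URL and the bearer-token Authorization header are assumptions made for the example, not values documented here; replace them with the host and authentication scheme from your account.

```python
import json
import urllib.request

# Assumption: placeholder host, not the real API base URL.
BASE_URL = "https://api.example.com"


def build_request(path: str, api_key: str) -> urllib.request.Request:
    """Build a GET request for an endpoint path such as /v1/imports/."""
    return urllib.request.Request(
        BASE_URL + path,
        headers={
            # Assumption: bearer-token auth; the real scheme may differ.
            "Authorization": f"Bearer {api_key}",
            "Accept": "application/json",
        },
        method="GET",
    )


def get_import(import_id: str, api_key: str) -> dict:
    """Fetch one import via GET /v1/imports/{id} and decode the JSON body."""
    request = build_request(f"/v1/imports/{import_id}", api_key)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

The same build_request helper can be pointed at /v1/imports/ to list imports.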
Import Structure
Notes:
- Import returns a list of all files, including ZIP files and their extracted contents
- The documents field provides a total count of all documents extracted across all files
- The channel field indicates how the import was created (API, web upload, or email)
- Each file in the files array follows the File structure documented below
- Compressed files (ZIP) have compressed: true and typically have an empty documentIds array
- Files extracted from ZIP archives include parentFileId referencing the original compressed file
- clientData contains custom metadata provided during upload (empty object if none)
- Rejected files include an error field containing the error code for programmatic handling
For Import state definitions and transition logic, see Import Processing.
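To make the relationship between a ZIP file and its extracted files concrete, here is a minimal sketch that groups extracted files under their parent archive using parentFileId. The field names come from the notes above; the sample payload, its IDs, and the channel value "api" are invented for illustration and may not match real responses.

```python
from collections import defaultdict


def group_by_archive(files: list[dict]) -> dict[str, list[str]]:
    """Map each compressed file's id to the ids of the files extracted from it."""
    children: dict[str, list[str]] = defaultdict(list)
    for f in files:
        parent = f.get("parentFileId")
        if parent is not None:
            children[parent].append(f["id"])
    # Only compressed files become keys; archives with no children map to [].
    return {f["id"]: children.get(f["id"], []) for f in files if f.get("compressed")}


# Hypothetical payload: ids and values are placeholders, not real API output.
sample_import = {
    "id": "imp_1",
    "channel": "api",
    "documents": 2,
    "clientData": {},
    "files": [
        {"id": "file_zip", "compressed": True, "documentIds": []},
        {"id": "file_a", "compressed": False, "parentFileId": "file_zip",
         "documentIds": ["doc_1"]},
        {"id": "file_b", "compressed": False, "parentFileId": "file_zip",
         "documentIds": ["doc_2"]},
    ],
}

archives = group_by_archive(sample_import["files"])
```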
File API
Access file processing information and status updates.
File Structure
Notes:
- Each file includes an array of documentIds (following naming convention: array of IDs uses singular + “Ids” suffix)
- The mimeType field indicates the file format (required field)
- Compressed files (ZIP) have compressed: true and empty documentIds array
- Files extracted from ZIP include parentFileId field referencing the original compressed file
- Rejected files (status: “rejected”) include an error field containing the error code for programmatic handling
- All responses use id field as primary identifier (not _id)
- clientData contains custom metadata provided during upload (empty object if none)
For File state definitions and transition logic, see File Processing.
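The File fields described above lend themselves to simple programmatic checks. The sketch below collects rejected files with their error codes and flattens documentIds across files. Only status "rejected" is taken from the notes; the other status value, the error code, and the MIME types in the sample are hypothetical placeholders.

```python
def rejected_files(files: list[dict]) -> list[tuple[str, str]]:
    """Return (file id, error code) pairs for files with status "rejected"."""
    return [
        (f["id"], f.get("error", ""))
        for f in files
        if f.get("status") == "rejected"
    ]


def all_document_ids(files: list[dict]) -> list[str]:
    """Flatten the documentIds arrays of all files, preserving order."""
    return [doc_id for f in files for doc_id in f.get("documentIds", [])]


# Hypothetical files: "processed" and "unsupported_format" are placeholders.
sample_files = [
    {"id": "file_1", "status": "processed", "mimeType": "application/pdf",
     "documentIds": ["doc_1", "doc_2"], "clientData": {}},
    {"id": "file_2", "status": "rejected", "mimeType": "text/plain",
     "documentIds": [], "error": "unsupported_format", "clientData": {}},
]

rejected = rejected_files(sample_files)
doc_ids = all_document_ids(sample_files)
```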
Documents API
Document processing can introduce latency when retrieving information by Document ID. Use webhooks for real-time updates.
List Documents: GET /v1/documents/
Get Document: GET /v1/documents/{id}
Document Structure
Full document structure is available in the API reference documentation.
For Document state definitions and transition logic, see Document Processing.
Technical Implementation
Best Practices
- Use webhooks first - API polling should supplement webhook notifications, not replace them
- Implement pagination - Large result sets are paginated for performance
- Handle rate limits - Implement exponential backoff for rate limit responses
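One way to apply exponential backoff to rate limit responses is sketched below. The send callable is a hypothetical stand-in for whatever function performs the HTTP request and returns a (status, body) pair. The default delays and attempt count are assumptions, not limits documented for this API.

```python
import time
from typing import Callable, Tuple


def call_with_backoff(
    send: Callable[[], Tuple[int, str]],
    max_attempts: int = 5,      # assumption: not a documented limit
    base_delay: float = 1.0,    # assumption: first wait in seconds
    max_delay: float = 60.0,    # assumption: cap on a single wait
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str]:
    """Call send(), retrying on HTTP 429 with a doubling delay between tries."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    delay = base_delay
    for attempt in range(1, max_attempts + 1):
        status, body = send()
        # Return anything other than 429, or the final 429 once attempts run out.
        if status != 429 or attempt == max_attempts:
            return status, body
        sleep(delay)
        delay = min(delay * 2, max_delay)
    raise AssertionError("unreachable")
```

Passing sleep as a parameter keeps the helper easy to test without real waiting.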
Error Handling
The API returns standard HTTP status codes:
- 200 - Success
- 400 - Bad request (validation errors)
- 401 - Unauthorized (check API key)
- 404 - Resource not found
- 429 - Rate limit exceeded
- 500 - Internal server error
Error responses include detailed information to help with debugging and resolution.
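A client might branch on these status codes roughly as follows. The descriptions mirror the list above. Treating 429 and 500 as retryable and the other error codes as non-retryable is an assumption for this sketch, not documented API behavior.

```python
STATUS_DESCRIPTIONS = {
    200: "Success",
    400: "Bad request (validation errors)",
    401: "Unauthorized (check API key)",
    404: "Resource not found",
    429: "Rate limit exceeded",
    500: "Internal server error",
}

# Assumption: only rate limiting and server errors are worth retrying.
RETRYABLE_STATUSES = frozenset({429, 500})


def describe_status(status: int) -> str:
    """Return the documented description, or a generic label for other codes."""
    return STATUS_DESCRIPTIONS.get(status, f"Unexpected status {status}")


def should_retry(status: int) -> bool:
    """Report whether a request with this status may succeed if sent again."""
    return status in RETRYABLE_STATUSES
```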
Polling Strategies
Polling is not recommended in production environments. Use webhooks for real-time updates instead.
When webhooks are not available, use these polling patterns:
- Initial load - Poll every 5-10 seconds for active processing
- Background sync - Poll every 30-60 seconds for status updates
- Exponential backoff - Gradually increase polling intervals (e.g., 5s → 10s → 20s → 40s → 80s → 160s → 300s) up to 5 minutes maximum, as document processing can be complex and time-consuming
- Batch requests - Use list endpoints to reduce API calls
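The exponential backoff schedule above (5s doubling up to a 5 minute cap) can be generated with a small helper. This is a minimal sketch; the function name and parameters are illustrative and not part of the API.

```python
from itertools import islice
from typing import Iterator


def polling_intervals(
    start: float = 5.0, factor: float = 2.0, cap: float = 300.0
) -> Iterator[float]:
    """Yield polling intervals in seconds, multiplying by factor up to cap."""
    interval = start
    while True:
        yield min(interval, cap)
        interval = min(interval * factor, cap)


# First eight waits: 5, 10, 20, 40, 80, 160, 300, 300
first_eight = list(islice(polling_intervals(), 8))
```

A polling loop would sleep for each yielded interval between calls to a list endpoint and stop once every tracked entity has reached a final state.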