Information Retrieval | Invofox

This section provides the API endpoints and data structures for retrieving information about your Imports, Files, and Documents. For real-time updates on processing status, use webhooks instead of polling these endpoints.

When to use these APIs:

Initial synchronization with current state
Recovery from missed webhook events
Bulk status checking for dashboards
Retrieving detailed entity structures and metadata

Import API

Retrieve information about your import operations using our REST API.

List Imports: GET /v1/ingest/imports/
Get Import: GET /v1/ingest/imports/{id}
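
A minimal sketch of calling the two endpoints above. The base URL `https://api.example.com`, the `x-api-key` header name, and the helper names `importUrl` and `getImport` are assumptions made for illustration; they are not taken from this page, so substitute the host and authentication scheme given in the API reference.

```typescript
// Hypothetical base URL: replace with the host from the API reference.
const BASE_URL = "https://api.example.com";

// Minimal shape of the fetch function this sketch depends on,
// so that a stub can be injected in tests.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string> }
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

// Build the URL for "List Imports" (no id) or "Get Import" (with id).
function importUrl(id?: string): string {
  const base = `${BASE_URL}/v1/ingest/imports/`;
  return id === undefined ? base : `${base}${encodeURIComponent(id)}`;
}

// Fetch a single Import and return the parsed JSON body.
async function getImport(
  id: string,
  apiKey: string,
  fetchFn: FetchLike
): Promise<unknown> {
  const res = await fetchFn(importUrl(id), {
    method: "GET",
    headers: { "x-api-key": apiKey }, // assumed header name
  });
  if (!res.ok) {
    throw new Error(`GET import ${id} failed with HTTP ${res.status}`);
  }
  return res.json();
}
```

In a runtime that provides a global `fetch`, it can be passed as `fetchFn`. Injecting it keeps the helper easy to test with a stub.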

Import Structure

type Import = {
  id: string;
  accountId: string;
  environmentId: string;
  channel: "api" | "upload" | "email";
  status: "processing" | "processed";
  createdAt: string; // ISO 8601 timestamp
  documents: number; // Total count of documents extracted from all files
  files: File[]; // Array of files (see File structure below)
  clientData?: object; // Custom metadata from upload
  providerInfo?: object; // Provider-specific information
};

Notes:

The Import response lists all files, including ZIP files and their extracted contents
The documents field provides a total count of all documents extracted across all files
The channel field indicates how the import was created (API, web upload, or email)
Each file in the files array follows the File structure documented below
Compressed files (ZIP) have compressed: true and typically have an empty documentIds array
Files extracted from ZIP archives include parentFileId referencing the original compressed file
clientData contains custom metadata provided during upload (empty object if none)
Rejected files include an error field containing the error code for programmatic handling

For Import state definitions and transition logic, see Import Processing.
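
As an illustration of the notes above, the following sketch summarizes the `files` array of an Import for a dashboard: files per status, the number of ZIP archives, and the error codes of rejected files. The names `summarizeFiles`, `FileSummaryInput`, and `ImportSummary` are hypothetical, and the `"UNKNOWN"` fallback is a choice of this sketch, not an API value.

```typescript
type FileStatus = "pending" | "processing" | "processed" | "rejected";

// Only the File fields this sketch reads; the full shape is the File structure.
interface FileSummaryInput {
  id: string;
  status: FileStatus;
  compressed: boolean;
  error?: string;
}

interface ImportSummary {
  byStatus: Record<FileStatus, number>;
  rejected: { id: string; error: string }[];
  archives: number; // number of files with compressed: true
}

function summarizeFiles(files: FileSummaryInput[]): ImportSummary {
  const summary: ImportSummary = {
    byStatus: { pending: 0, processing: 0, processed: 0, rejected: 0 },
    rejected: [],
    archives: 0,
  };
  for (const f of files) {
    summary.byStatus[f.status] += 1;
    if (f.compressed) summary.archives += 1;
    if (f.status === "rejected") {
      // "UNKNOWN" is a local fallback for a missing error code.
      summary.rejected.push({ id: f.id, error: f.error ?? "UNKNOWN" });
    }
  }
  return summary;
}
```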

File API

Access file processing information and status updates.

File Structure

type File = {
  id: string; // Unique identifier
  accountId: string;
  environmentId: string;
  companyId?: string;
  importId: string;
  filename: string;
  mimeType: string; // MIME type (e.g., "application/pdf", "image/jpeg")
  status: "pending" | "processing" | "processed" | "rejected";
  compressed: boolean; // true for ZIP files
  documentIds: string[]; // Array of document IDs extracted from this file
  parentFileId?: string; // Reference to parent ZIP file (if extracted from ZIP)
  error?: string; // Error code when status is "rejected" (e.g., "ERR_UNSUPPORTED_FILE_FORMAT")
  clientData?: object; // Custom metadata from upload
  createdAt: string; // ISO 8601 timestamp
  updatedAt: string; // ISO 8601 timestamp
};

Notes:

Each file includes a documentIds array (naming convention: a field holding an array of IDs uses the singular noun plus an "Ids" suffix)
The mimeType field indicates the file format (required field)
Compressed files (ZIP) have compressed: true and an empty documentIds array
Files extracted from ZIP include parentFileId field referencing the original compressed file
Rejected files (status: "rejected") include an error field containing the error code for programmatic handling
All responses use id field as primary identifier (not _id)
clientData contains custom metadata provided during upload (empty object if none)

For File state definitions and transition logic, see File Processing.
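
Because a ZIP file itself carries an empty `documentIds` array, the documents that came out of an archive have to be collected from the files whose `parentFileId` points at it. A minimal sketch of that grouping follows; the names `FileNode` and `documentIdsByArchive` are hypothetical, and only one level of nesting is handled.

```typescript
// Only the File fields this sketch reads.
interface FileNode {
  id: string;
  compressed: boolean;
  documentIds: string[];
  parentFileId?: string;
}

// Map each ZIP file id to the document IDs of the files extracted from it.
function documentIdsByArchive(files: FileNode[]): Map<string, string[]> {
  const result = new Map<string, string[]>();
  // Register every archive first so that empty archives still appear.
  for (const f of files) {
    if (f.compressed && !result.has(f.id)) result.set(f.id, []);
  }
  // Attach each extracted file's documents to its parent archive.
  for (const f of files) {
    if (f.parentFileId === undefined) continue;
    const docs = result.get(f.parentFileId) ?? [];
    docs.push(...f.documentIds);
    result.set(f.parentFileId, docs);
  }
  return result;
}
```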

Documents API

Document processing can introduce latency when retrieving information by Document ID. Use webhooks for real-time updates.

List Documents: GET /v1/documents/
Get Document: GET /v1/documents/{id}

Document Structure

Full document structure is available in the API reference documentation.

For Document state definitions and transition logic, see Document Processing.

Technical Implementation

Best Practices

Use webhooks first - API polling should supplement webhook notifications, not replace them
Implement pagination - Large result sets are paginated for performance
Handle rate limits - Implement exponential backoff for rate limit responses

Error Handling

The API returns standard HTTP status codes:

200 - Success
400 - Bad request (validation errors)
401 - Unauthorized (check API key)
404 - Resource not found
429 - Rate limit exceeded
500 - Internal server error

Error responses include detailed information to help with debugging and resolution.
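
One way to act on these status codes is to retry only the transient ones (429 and 5xx) with exponential backoff and to fail fast on the rest. The sketch below assumes a 1 second base delay, a 60 second cap, and five attempts; those numbers and the names `classifyStatus`, `retryDelayMs`, and `requestWithRetry` are illustrative choices, not values from the API.

```typescript
type Action = "success" | "retry" | "fail";

// 429 and 5xx are treated as transient; other codes need a change to the request.
function classifyStatus(status: number): Action {
  if (status >= 200 && status < 300) return "success";
  if (status === 429 || status >= 500) return "retry";
  return "fail";
}

// Delay before retry number `attempt` (0-based): baseMs * 2^attempt, capped at capMs.
function retryDelayMs(attempt: number, baseMs = 1000, capMs = 60000): number {
  return Math.min(capMs, baseMs * Math.pow(2, attempt));
}

// Call `send` until it succeeds, fails permanently, or attempts run out.
async function requestWithRetry<T>(
  send: () => Promise<{ status: number; body: T }>,
  maxAttempts = 5,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms))
): Promise<T> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const res = await send();
    const action = classifyStatus(res.status);
    if (action === "success") return res.body;
    if (action === "fail" || attempt === maxAttempts - 1) {
      throw new Error(`Request failed with HTTP ${res.status}`);
    }
    await sleep(retryDelayMs(attempt));
  }
  throw new Error("maxAttempts must be at least 1");
}
```

Retrying is safe here because the endpoints in this section are read-only GET requests.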

Polling Strategies

Polling is not recommended in production environments. Use webhooks for real-time updates instead.

When webhooks are not available, use these polling patterns:

Initial load - Poll every 5-10 seconds for active processing
Background sync - Poll every 30-60 seconds for status updates
Exponential backoff - Gradually increase polling intervals (e.g., 5s → 10s → 20s → 40s → 80s → 160s → 300s) up to 5 minutes maximum, as document processing can be complex and time-consuming
Batch requests - Use list endpoints to reduce API calls
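
The exponential backoff pattern above can be sketched as a schedule generator plus a small polling loop. The function names `pollingDelays` and `pollUntil` are hypothetical, and the caller supplies the actual status check (for example a Get Import request) and the sleep function.

```typescript
// Polling delays in seconds: start at startSec, double each step, clamp to maxSec.
function pollingDelays(startSec = 5, maxSec = 300, count = 7): number[] {
  const delays: number[] = [];
  let current = startSec;
  for (let i = 0; i < count; i++) {
    delays.push(Math.min(current, maxSec));
    current = Math.min(current * 2, maxSec);
  }
  return delays;
}

// Run `check` once per scheduled delay until `isDone` accepts its result.
// Returns undefined if the schedule is exhausted first.
async function pollUntil<T>(
  check: () => Promise<T>,
  isDone: (value: T) => boolean,
  sleep: (seconds: number) => Promise<void>,
  delays: number[] = pollingDelays()
): Promise<T | undefined> {
  for (const delay of delays) {
    const value = await check();
    if (isDone(value)) return value;
    await sleep(delay);
  }
  return undefined;
}
```

With the defaults, `pollingDelays()` yields 5, 10, 20, 40, 80, 160, 300, which matches the sequence in the list above. A larger `count` keeps polling at the 5 minute ceiling.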