Making Files LLM-Ready

Introduction

UiForm’s document processing pipeline automatically converts various file types into LLM-ready formats, eliminating the need for custom parsers. This guide explains how to process different document types and understand the resulting output format.

Supported File Types

UiForm supports a wide range of document formats:

Text Documents: PDF, DOC, DOCX, TXT
Spreadsheets: XLS, XLSX, CSV
Emails: EML, MSG
Images: JPG, PNG, TIFF
Presentations: PPT, PPTX
And more: HTML, XML, JSON

Basic Document Processing

Here’s how to convert a document into an LLM-ready format:

from uiform import UiForm

# Create the client, then convert the file into a chat-style message list
uiclient = UiForm()
doc_msg = uiclient.documents.create_messages(
    document="path/to/your/document.jpg"
)

The create_messages method returns a standardized message format:

{
    "id": "doc_dd003f95-81ce-4a55-9180-00c5a58d82ec",
    "object": "document.message",
    "messages": [
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "Document content here..."
                }
            ]
        }
    ],
    "created": 1736524416,
    "modality": "text",
    "document": {
        "id": "cf908729402d0796537bb91e63df5e339ce93b4cabdcac2f9a4f90592647e130",
        "name": "document.jpg",
        "mime_type": "image/jpeg"
    }
}
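
As a rough illustration of how the message format above can be consumed, the sketch below collects every text part from a response. It assumes the response is available as a plain Python dict shaped like the sample; the SDK may return an object instead, in which case it would need to be converted first. The helper name extract_text is hypothetical and not part of the UiForm SDK.

```python
def extract_text(doc_msg: dict) -> str:
    """Join every "text" content part found in the response messages."""
    parts = []
    for message in doc_msg.get("messages", []):
        for item in message.get("content", []):
            # Only text parts carry a "text" field; other part types are skipped.
            if item.get("type") == "text":
                parts.append(item["text"])
    return "\n".join(parts)


# Sample shaped like the response shown above (assumed to be a plain dict).
sample_response = {
    "object": "document.message",
    "messages": [
        {
            "role": "user",
            "content": [{"type": "text", "text": "Document content here..."}],
        }
    ],
    "modality": "text",
}

print(extract_text(sample_response))
```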

Document Processing Options

Image Settings

You can configure various image settings:

doc_msg = uiclient.documents.create_messages(
    document="document.jpg",
    prompting_context={
        "image_settings": {
            "correct_image_orientation": True,  # auto-rotate if needed
            "dpi": 72,  # integer DPI value
            "image_to_text": "ocr",  # or "llm_description"
            "browser_canvas": "A4"  # "A3", "A4", or "A5"
        }
    }
)

The image settings support these configurations:

correct_image_orientation: Automatically rotates images to the correct orientation if needed
dpi: Sets the image resolution in DPI
image_to_text: Choose text extraction method:
- ocr: Traditional OCR processing
- llm_description: AI-generated image description
browser_canvas: Set document canvas size:
- A3: 11.7in x 16.54in
- A4: 8.27in x 11.7in (default)
- A5: 5.83in x 8.27in
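
The options listed above can be checked locally before a request is sent. The following is a minimal sketch of such a check, not part of the UiForm SDK: the helper name build_image_settings is hypothetical, the default values for dpi and image_to_text are taken from the example rather than from documented defaults, and the requirement that dpi be a positive integer is an assumption.

```python
ALLOWED_IMAGE_TO_TEXT = {"ocr", "llm_description"}
ALLOWED_CANVAS = {"A3", "A4", "A5"}


def build_image_settings(
    correct_image_orientation: bool = True,
    dpi: int = 72,
    image_to_text: str = "ocr",
    browser_canvas: str = "A4",
) -> dict:
    """Return a prompting_context dict after validating the listed options."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(dpi, bool) or not isinstance(dpi, int) or dpi <= 0:
        raise ValueError(f"dpi must be a positive integer, got {dpi!r}")
    if image_to_text not in ALLOWED_IMAGE_TO_TEXT:
        raise ValueError(f"image_to_text must be one of {sorted(ALLOWED_IMAGE_TO_TEXT)}")
    if browser_canvas not in ALLOWED_CANVAS:
        raise ValueError(f"browser_canvas must be one of {sorted(ALLOWED_CANVAS)}")
    return {
        "image_settings": {
            "correct_image_orientation": bool(correct_image_orientation),
            "dpi": dpi,
            "image_to_text": image_to_text,
            "browser_canvas": browser_canvas,
        }
    }


context = build_image_settings(image_to_text="llm_description", browser_canvas="A3")
print(context)
```

The resulting dict has the same shape as the prompting_context argument in the example above, so a typo such as "a4" or "OCR" fails locally instead of reaching the API.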

Modality Control

You can specify the document processing modality using the modality parameter:

# `schema` is the JSON schema for the extraction, defined elsewhere
response = uiclient.documents.extract(
    document="document.pdf",
    json_schema=schema,
    modality="native"  # or "text", "image", "image+text"
)

The endpoints accept the following modalities:

native: Default processing mode
text: Text-only processing mode
image: Image-only processing mode
image+text: Native processing mode (text or image, depending on the document type) plus the text content

The chosen modality will be reflected in the response under the modality field.

Supported Models

You can use any of these supported models:

Claude-3 series (claude-3-5-sonnet-latest, claude-3-opus-20240229, etc.)
GPT-4o series (gpt-4o, gpt-4o-mini, etc.)
Gemini series (gemini-1.5-pro, gemini-1.5-flash, etc.)
Grok-2 series (grok-2-vision-1212, grok-2-1212)

List available models using:

uiclient = UiForm()
models = uiclient.models.list()

Rate Limits

The API has the following rate limits:

300 requests per 60-second window
Applies to document processing endpoints (create_messages and extractions)
Returns 429 status code when exceeded
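
One way to stay under the limit is to throttle requests on the client side. The sketch below is a generic sliding-window limiter built only on the Python standard library; it is not part of the UiForm SDK, and the class name SlidingWindowLimiter is hypothetical. The defaults mirror the 300 requests per 60 seconds stated above. How the SDK surfaces a 429 response is not covered here.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most max_requests calls within any window_seconds span."""

    def __init__(self, max_requests=300, window_seconds=60.0,
                 clock=time.monotonic, sleep=time.sleep):
        self._max = max_requests
        self._window = window_seconds
        self._clock = clock    # injectable for testing
        self._sleep = sleep    # injectable for testing
        self._sent = deque()   # timestamps of requests inside the window

    def acquire(self) -> float:
        """Block until a slot is free, record it, and return seconds waited."""
        waited = 0.0
        while True:
            now = self._clock()
            # Drop timestamps that have left the window.
            while self._sent and now - self._sent[0] >= self._window:
                self._sent.popleft()
            if len(self._sent) < self._max:
                self._sent.append(now)
                return waited
            # Wait until the oldest request leaves the window.
            delay = self._window - (now - self._sent[0])
            self._sleep(delay)
            waited += delay


limiter = SlidingWindowLimiter()
# Call limiter.acquire() immediately before each create_messages or
# extract request so that no more than 300 are sent per 60 seconds.
```

The limiter tracks only requests made by the current process; if several workers share one API key, their combined traffic can still exceed the limit.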
