Overview
Many LLMs can process only text and, in some cases, images and audio. This plugin extends native LLM capabilities by pre-processing documents into content that LLMs can understand. It automatically detects document URLs or data URIs in your messages and converts them into clean, structured markdown using Microsoft’s MarkItDown library. The resulting markdown is then sent to the LLM for processing.
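For orientation, the conversion step can be sketched by calling MarkItDown's Python API directly. This is a minimal sketch of the library call, not the plugin's actual code; the function name `document_to_markdown` is illustrative and the `markitdown` package is assumed to be installed.

```python
def document_to_markdown(source: str) -> str:
    """Convert a local file path or URL to markdown text with MarkItDown."""
    # Imported lazily so this module still loads where markitdown is absent.
    from markitdown import MarkItDown

    converter = MarkItDown()
    result = converter.convert(source)
    # text_content holds the markdown rendering of the document.
    return result.text_content
```

Calling `document_to_markdown("report.pdf")` (a placeholder path) would return the markdown that is then passed on to the LLM.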
Features
- Multiple Formats: Supports PDF, Word (DOCX, DOC), PowerPoint (PPTX, PPT), and Excel (XLSX, XLS) documents
- Markdown Conversion: Converts documents into clean, structured markdown with preserved formatting
  - Headings, lists, and tables are maintained
  - Proper paragraph breaks and text emphasis
  - All readable text content extracted
- Image Handling: Optional LLM-powered image descriptions for documents containing images or complex layouts
- Flexible Input: Download from URLs or process base64-encoded data URIs
- Automatic Format Detection: Intelligent content-type and file format detection
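Format detection of this kind is commonly done by checking a file's leading bytes and falling back on its extension. The sketch below illustrates that general technique under stated assumptions; it is not the plugin's actual detection logic, and `detect_format` is a hypothetical helper.

```python
import os
from typing import Optional

# Well-known file signatures ("magic bytes").
PDF_MAGIC = b"%PDF-"
ZIP_MAGIC = b"PK\x03\x04"  # DOCX, PPTX, and XLSX are ZIP containers
OLE_MAGIC = b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"  # legacy DOC, PPT, XLS

OOXML_EXTENSIONS = {".docx", ".pptx", ".xlsx"}
LEGACY_EXTENSIONS = {".doc", ".ppt", ".xls"}


def detect_format(data: bytes, filename: str = "") -> Optional[str]:
    """Return a short format name, or None if the bytes are not recognized."""
    ext = os.path.splitext(filename.lower())[1]
    if data.startswith(PDF_MAGIC):
        return "pdf"
    if data.startswith(ZIP_MAGIC):
        # The signature only says "ZIP"; the extension narrows it down.
        return ext.lstrip(".") if ext in OOXML_EXTENSIONS else "ooxml"
    if data.startswith(OLE_MAGIC):
        return ext.lstrip(".") if ext in LEGACY_EXTENSIONS else "ole"
    return None
```

Checking the bytes first means a mislabeled file (for example a PDF served with a `.docx` name) is still identified by its actual content.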
Installation
- Add the plugin to your Datawizz endpoint configuration
- Set the endpoint URL to: https://your-service-url/plugin/document
- Configure the Authorization header with your secret token:
  - Header name: Authorization
  - Header value: Bearer YOUR_SECRET_TOKEN
- Optionally configure default settings (see Configuration below)
Configuration
You can specify configurations to control how the document is processed:
Parameter | Type | Description | Default |
---|---|---|---|
url | string | The URL of the document to process | (required unless data is provided) |
data | string | The base64-encoded content of the document, given as a data URI (data:application/pdf;base64,...). If provided, it is used instead of url | None |
use_llm_image_description | boolean | Whether to use the LLM’s image description capabilities to generate descriptions for images found in documents. Useful for documents that contain images or complex layouts. Note: enabling this may increase processing time and incur additional LLM API costs | false |
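The `data` parameter expects a base64 data URI. As a minimal sketch, such a URI can be built from raw file bytes and decoded again as follows; the helper names `to_data_uri` and `from_data_uri` are illustrative and not part of the plugin.

```python
import base64
from typing import Tuple


def to_data_uri(raw: bytes, mime_type: str = "application/pdf") -> str:
    """Encode raw document bytes as a base64 data URI."""
    encoded = base64.b64encode(raw).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def from_data_uri(uri: str) -> Tuple[str, bytes]:
    """Split a base64 data URI into its MIME type and decoded bytes."""
    header, _, payload = uri.partition(",")
    if not header.startswith("data:") or not header.endswith(";base64"):
        raise ValueError("expected a base64 data URI")
    mime_type = header[len("data:"):-len(";base64")]
    return mime_type, base64.b64decode(payload)
```

In practice the bytes would come from reading the document in binary mode, for example `to_data_uri(open("report.pdf", "rb").read())` with a placeholder path.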
Usage
Send document attachments as part of a message to the LLM (similar to sending images):
Example 1: Document from URL
Input Message:
Example 2: Document from Data URI
Input Message:
Message Format Requirements
The plugin ONLY processes structured multimodal content with an explicit document type. Plain string URLs like "content": "https://example.com/doc.pdf" will NOT be processed.
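The rule above can be illustrated with a small check. This sketch assumes that structured content is a list of parts, each a dictionary with a `type` key (as with image parts); the other fields of a document part are not assumed here, and `has_document_part` is a hypothetical helper rather than plugin code.

```python
from typing import Any, Dict


def has_document_part(message: Dict[str, Any]) -> bool:
    """Return True only for structured content with a "document" part."""
    content = message.get("content")
    if not isinstance(content, list):
        # Plain string content, even if it is a URL, is never a document.
        return False
    return any(
        isinstance(part, dict) and part.get("type") == "document"
        for part in content
    )
```

A message whose content is the bare string "https://example.com/doc.pdf" fails this check, which matches the behavior described above.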
Documents must be in this format:
Supported Document Types
- PDF: .pdf
- Word: .doc, .docx
- PowerPoint: .ppt, .pptx
- Excel: .xls, .xlsx
Example Configuration
Performance Notes
- Processing time varies by document size and complexity
- Enabling use_llm_image_description requires an OpenAI API key configured on the server and may increase processing time and costs
- Large documents (100+ pages) may take longer to process
- Scanned PDFs (image-only) may not extract text without OCR capabilities
- The plugin gracefully handles errors - if processing fails, the original message is preserved
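The fallback behavior in the last note can be sketched as a wrapper that returns the original message whenever conversion raises. This is an illustration of the pattern only; `process_or_preserve` and the `convert` callable are hypothetical names, not the plugin's internals.

```python
import logging
from typing import Any, Callable, Dict

Message = Dict[str, Any]
logger = logging.getLogger(__name__)


def process_or_preserve(
    message: Message, convert: Callable[[Message], Message]
) -> Message:
    """Apply convert to the message; on any failure, return it unchanged."""
    try:
        return convert(message)
    except Exception:
        # Log the failure, then pass the original message through untouched.
        logger.exception("document processing failed; preserving message")
        return message
```

The request still reaches the LLM when a download or conversion fails, just without the extracted document text.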
Configuration Schema
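The rules implied by the parameter table in Configuration can be sketched as a small validation function. This is an assumption-based illustration derived from that table, not the plugin's actual schema; `normalize_config` is a hypothetical helper.

```python
from typing import Any, Dict


def normalize_config(config: Dict[str, Any]) -> Dict[str, Any]:
    """Validate a configuration dict and fill in the documented default."""
    url = config.get("url")
    data = config.get("data")
    # The table marks url as required unless data is supplied.
    if url is None and data is None:
        raise ValueError("either 'url' or 'data' is required")
    for name, value in (("url", url), ("data", data)):
        if value is not None and not isinstance(value, str):
            raise TypeError(f"'{name}' must be a string")
    use_llm = config.get("use_llm_image_description", False)
    if not isinstance(use_llm, bool):
        raise TypeError("'use_llm_image_description' must be a boolean")
    return {
        "url": url,
        "data": data,
        "use_llm_image_description": use_llm,
    }
```

For example, `normalize_config({"url": "https://example.com/doc.pdf"})` leaves `use_llm_image_description` at its documented default of false.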
Supported Phases
- Request Phase: Supports processing during the REQUEST phase