Overview

The Knowledge API lets you manage knowledge bases of documents that are processed asynchronously. Upload files or URLs, organize them into folders, digitize and rewrite content, and check embedding status to see when documents are ready for search and retrieval.

Typical Flow

  1. Create a knowledge base for your partner account
  2. Upload documents (files or URLs) into the knowledge base
  3. Documents are automatically digitized (OCR/parsing)
  4. Optionally rewrite documents using LLM processing
  5. Final rewritten content is embedded automatically
  6. Poll document status or knowledge-base embedding status until processing is complete
  7. Retrieve processed content or download rewritten bundles
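
The polling step in the flow above can be sketched as a small helper. This is a minimal sketch, not the API's client: `wait_until_ready`, the `fetch_status` callable, and the `"ready"` status value are assumptions made for illustration. Only `processing_error` appears in this documentation. In practice, `fetch_status` would wrap whatever request returns a document's or knowledge base's current status.

```python
import time
from typing import Callable, FrozenSet


def wait_until_ready(
    fetch_status: Callable[[], str],
    ready_states: FrozenSet[str] = frozenset({"ready"}),  # assumed status name
    failed_states: FrozenSet[str] = frozenset({"processing_error"}),
    interval_s: float = 2.0,
    timeout_s: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call fetch_status repeatedly until it reports a terminal state.

    Returns the ready status, raises RuntimeError on a failed status,
    and raises TimeoutError if the deadline passes first.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        status = fetch_status()
        if status in ready_states:
            return status
        if status in failed_states:
            raise RuntimeError(f"processing failed with status {status!r}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"still {status!r} after {timeout_s} seconds")
        sleep(interval_s)


# Usage with a stand-in status source (hypothetical intermediate states).
_states = iter(["digitizing", "embedding", "ready"])
print(wait_until_ready(lambda: next(_states), sleep=lambda _s: None))
```

Passing `sleep` as a parameter keeps the helper easy to test, and a fixed interval with a hard timeout avoids polling indefinitely when a document never reaches a terminal state.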

Features

  • File and URL upload: Upload documents directly or provide URLs for automatic fetching
  • Batch URL import: Submit up to 1,000 URLs in a single request
  • Automatic digitization: Supported uploads are extracted, OCRed, or transcribed into text
  • LLM rewriting: Documents are rewritten using configurable content-type templates (article, youtube, book)
  • Automatic embedding: Rewritten documents are queued for embedding automatically; clients do not need to start KB embedding manually
  • Folder organization: Organize documents into folders within knowledge bases
  • Embedding status: Track per-document and knowledge-base embedding progress for search and retrieval
  • Bulk operations: Rewrite all documents in a folder or knowledge base at once
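
Because batch URL import accepts at most 1,000 URLs per request, a longer list has to be split on the client side before submission. The following is a minimal sketch of that splitting; `chunk_urls` and `MAX_URLS_PER_REQUEST` are illustrative names, not part of the API, and the example URLs are placeholders.

```python
from typing import Iterator, List, Sequence

# Limit stated in the Features list above.
MAX_URLS_PER_REQUEST = 1000


def chunk_urls(
    urls: Sequence[str], size: int = MAX_URLS_PER_REQUEST
) -> Iterator[List[str]]:
    """Yield consecutive slices of urls, each with at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(urls), size):
        yield list(urls[start:start + size])


# Usage: 2,500 placeholder URLs become three batches.
urls = [f"https://example.com/page/{i}" for i in range(2500)]
batch_sizes = [len(batch) for batch in chunk_urls(urls)]
print(batch_sizes)  # [1000, 1000, 500]
```

Each yielded batch would then be sent as one batch-import request.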

Important Considerations

  • Supported uploads: PDF, DOC/DOCX, PPT/PPTX, XLS/XLSX, EPUB, MD/MARKDOWN, HTML/HTM, TXT/PY/JS/CSS, CSV/JSON/XML/YAML/YML, MP4/MOV/M4V/AVI, MP3/WAV/M4A, and JPG/JPEG/PNG/TIF/TIFF/GIF/WebP. Archives such as ZIP, TAR, GZ/TGZ, RAR, 7Z, BZ2, and XZ are blocked.
  • File size limit: 100 MB per upload by default
  • Async processing: Digitization, rewriting, and embedding run in the background. Poll document and KB status to track progress.
  • Embedding source: Only final rewritten content is embedded. Updating rewritten content automatically refreshes embeddings.
  • Audio/video access: Audio and video transcription can be disabled per partner. If disabled, audio/video documents fail digitization with processing_error.
  • Rate limits: Requests are rate-limited
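
The upload constraints above can be checked on the client before sending a file, which avoids a round trip for uploads that would be rejected. This is a minimal sketch under stated assumptions: `check_upload` is a hypothetical helper, the extension sets are transcribed from the list above, and 100 MB is interpreted as 100 × 1024 × 1024 bytes, which this documentation does not confirm. The server remains the authority on what it accepts.

```python
from pathlib import PurePath
from typing import Optional

# Transcribed from the "Supported uploads" item above.
SUPPORTED_EXTENSIONS = frozenset({
    "pdf", "doc", "docx", "ppt", "pptx", "xls", "xlsx", "epub",
    "md", "markdown", "html", "htm", "txt", "py", "js", "css",
    "csv", "json", "xml", "yaml", "yml",
    "mp4", "mov", "m4v", "avi", "mp3", "wav", "m4a",
    "jpg", "jpeg", "png", "tif", "tiff", "gif", "webp",
})
BLOCKED_ARCHIVE_EXTENSIONS = frozenset({
    "zip", "tar", "gz", "tgz", "rar", "7z", "bz2", "xz",
})
# Assumption: the default 100 MB limit is counted in binary megabytes.
MAX_UPLOAD_BYTES = 100 * 1024 * 1024


def check_upload(filename: str, size_bytes: int) -> Optional[str]:
    """Return None if the file looks uploadable, else a short reason."""
    # Only the final suffix is inspected, so "backup.tar.gz" is seen as "gz".
    ext = PurePath(filename).suffix.lower().lstrip(".")
    if ext in BLOCKED_ARCHIVE_EXTENSIONS:
        return f"archive format .{ext} is blocked"
    if ext not in SUPPORTED_EXTENSIONS:
        return f"unsupported file type: {ext or '(no extension)'}"
    if size_bytes > MAX_UPLOAD_BYTES:
        return f"file exceeds the {MAX_UPLOAD_BYTES}-byte limit"
    return None


# Usage
print(check_upload("report.PDF", 2_000_000))
print(check_upload("backup.tar.gz", 2_000_000))
```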