Supported File Formats

We support a wide range of file formats so that content can be extracted reliably from many kinds of documents. The list continues to grow and currently includes:

  • Word Documents: .doc, .docx, .docm, .dot, .dotm
  • Open Document Format: .odt, .ott
  • eBooks: .epub
  • Presentations: .ppt, .pptx
  • Spreadsheets: .xls, .xlsx
  • Miscellaneous: .rtf, .xps, .pcl, .md, .flatopc, .pdf, .txt
  • Email Formats: .pst, .msg, .eml, .emlx
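
As a rough illustration, the list above can be turned into a client-side check that runs before a file is uploaded. This is only a sketch: `SUPPORTED_FORMATS` and `format_category` are hypothetical names chosen for this example and are not part of any platform API. The extensions themselves are copied from the list above.

```python
from pathlib import Path

# Extension groups copied from the list above. The dictionary is
# illustrative only and is not part of any official client library.
SUPPORTED_FORMATS = {
    "Word Documents": {".doc", ".docx", ".docm", ".dot", ".dotm"},
    "Open Document Format": {".odt", ".ott"},
    "eBooks": {".epub"},
    "Presentations": {".ppt", ".pptx"},
    "Spreadsheets": {".xls", ".xlsx"},
    "Miscellaneous": {".rtf", ".xps", ".pcl", ".md", ".flatopc", ".pdf", ".txt"},
    "Email Formats": {".pst", ".msg", ".eml", ".emlx"},
}


def format_category(filename):
    """Return the category name for a supported file, or None otherwise.

    Only the final extension is checked, case-insensitively.
    """
    suffix = Path(filename).suffix.lower()
    for category, extensions in SUPPORTED_FORMATS.items():
        if suffix in extensions:
            return category
    return None
```

Calling `format_category("Report.DOCX")` returns the Word Documents category, while an unsupported file such as `diagram.png` returns `None`, so a caller can skip or flag it before upload.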

Our team is continually improving our parsers, with the goal of delivering parsing quality that surpasses open-source solutions, particularly for documents with complex structures and tables. Our current focus is on .docx (Microsoft Word) and .pdf (Adobe PDF) files, given their widespread use.

For optimal results, we recommend using the .docx format.

Special Focus on Tables

Tables are central to documentation, especially for our enterprise customers, yet most existing chatbot solutions struggle to parse them accurately. The Varex platform places special emphasis on table parsing so that queries about data held in tables receive accurate answers. Our goal is reliable, precise retrieval of information from structured data.
