Convert a Document

Change your document from one format to another. Choose what you want to create, pick your options, upload your file, and we handle the rest.

This page converts your documents between formats. You can extract plain text from almost any file, or create an accessible web page ready for publishing. Each step below walks you through the process.

Step 1: What do you want to create?

Which option should I choose?

Choose "Plain text" when you want to:

  • Pull the text out of a Word, Excel, PowerPoint, or PDF file
  • Get a simple version you can read, search, or copy-paste
  • Feed a document into an AI tool or text analysis pipeline
  • Create a starting point for writing accessible Markdown documentation

Accepts: Word (.docx), Excel (.xlsx, .xls), PowerPoint (.pptx), PDF (.pdf), HTML (.html), CSV (.csv), JSON (.json), XML (.xml), ePub (.epub), and ZIP (.zip) files.

You get back: a .md (Markdown) file -- plain text with simple formatting marks for headings, lists, and links. Opens in any text editor.

Powered by: MarkItDown (Microsoft). This tool is optimized for extracting readable text and preserving document structure (headings, lists, tables, links) as lightweight Markdown. It handles the widest range of input formats but produces plain text, not a finished web page.
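
To reproduce this step outside the web page, the following is a minimal sketch using the markitdown Python package. It assumes the package is installed (pip install markitdown). The helper names markdown_output_path and extract_markdown are hypothetical and are not part of this site.

```python
from pathlib import Path


def markdown_output_path(source: str) -> Path:
    """Return the source path with its extension swapped for .md."""
    return Path(source).with_suffix(".md")


def extract_markdown(source: str) -> Path:
    """Convert a document to Markdown and save it next to the source file."""
    # Imported here so the path helper above works without the package installed.
    from markitdown import MarkItDown

    result = MarkItDown().convert(source)
    target = markdown_output_path(source)
    target.write_text(result.text_content, encoding="utf-8")
    return target
```

Calling extract_markdown("minutes.docx") would write minutes.md beside the original, provided MarkItDown can read the input.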

Choose "Accessible web page" when you want to:

  • Turn a Markdown, Word, or other text document into a page people can read in a browser
  • Create an HTML version that already meets ACB Large Print and WCAG accessibility standards
  • Get a file you can upload to a website, email as an attachment, or paste into a content management system
  • Publish meeting minutes, agendas, reports, or other documents online in an accessible format

Accepts: Markdown (.md), Word (.docx), reStructuredText (.rst), OpenDocument (.odt), Rich Text (.rtf), and ePub (.epub) files.

You get back: a standalone .html web page with accessibility formatting already built in -- Arial font, large text, proper spacing, high contrast, and all the other ACB requirements. Just open it in a browser or upload it to your website.

Powered by: Pandoc (John MacFarlane, UC Berkeley). Pandoc is a widely used, well-regarded tool for document-to-HTML conversion. It understands heading hierarchy, semantic lists, tables, footnotes, and cross-references -- and produces clean, well-structured HTML. We then embed our ACB Large Print CSS so the output meets accessibility standards out of the box. Pandoc accepts fewer input formats than MarkItDown, but its HTML output is of much higher quality.
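
As a rough illustration of the Pandoc step, the sketch below builds a command line that writes one standalone HTML file with a stylesheet embedded. The flags are standard Pandoc options (--embed-resources needs Pandoc 2.19 or later). The function name, the file names, and the stylesheet path are hypothetical, and the options this site actually passes may differ.

```python
from typing import List, Optional


def pandoc_html_command(source: str, output: str, css: str,
                        title: Optional[str] = None) -> List[str]:
    """Build a Pandoc command that writes one standalone HTML file."""
    command = [
        "pandoc", source,
        "--standalone",        # emit a complete page, not a fragment
        "--embed-resources",   # inline the stylesheet into the page
        "--css", css,          # hypothetical ACB stylesheet path
        "-o", output,
    ]
    if title:
        # Sets the <title> element and the visible title block.
        command += ["--metadata", f"title={title}"]
    return command
```

The returned list can be passed to subprocess.run(command, check=True) on a machine where Pandoc is installed.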

Choose "Word document" when you want to:

  • Deliver an editable document to someone who uses Microsoft Word
  • Convert Markdown or other text formats into a Word file for collaboration or review
  • Create a starting point for a Word document that you will format with ACB styles later
  • Share content with people who cannot open HTML or EPUB files

Accepts: Markdown (.md), reStructuredText (.rst), OpenDocument (.odt), Rich Text (.rtf), and ePub (.epub) files. (Does not accept .docx input -- the file is already a Word document.)

You get back: a .docx Word document with headings, lists, tables, and links preserved. You can then open it in Word and apply ACB Large Print styles using our desktop tool or the Word add-in.

Powered by: Pandoc. Pandoc produces clean, well-structured Word documents that use Word's built-in heading styles (Heading 1, Heading 2, etc.), making them easy to format and audit for accessibility.

Choose "EPUB 3 e-book (via Pandoc)" when you want to:

  • Create an e-book from Markdown, Word, or other text documents
  • Read the content on an e-reader, phone, or tablet
  • Get an EPUB with ACB Large Print CSS already embedded
  • Quickly produce an EPUB without needing the DAISY Pipeline service running

Accepts: Markdown (.md), Word (.docx), reStructuredText (.rst), OpenDocument (.odt), and Rich Text (.rtf) files. (Does not accept .epub input -- the file is already an EPUB.)

You get back: an .epub file with ACB Large Print CSS embedded. The EPUB uses the same Arial font, large text, and accessibility formatting as the HTML output.

Powered by: Pandoc. Pandoc produces EPUB 3 files with proper navigation documents and chapter structure. This option is always available when Pandoc is installed -- unlike the Pipeline option below, it does not require a separate DAISY Pipeline service.

Choose "Accessible PDF" when you want to:

  • Create a fixed-layout document that looks the same on every screen and printer
  • Print meeting agendas, reports, or newsletters in ACB Large Print format
  • Share a document that cannot be easily edited (read-only distribution)
  • Produce a print-ready file with proper margins, binding gutters, and page breaks

Accepts: Markdown (.md), Word (.docx), reStructuredText (.rst), OpenDocument (.odt), Rich Text (.rtf), and ePub (.epub) files.

You get back: a .pdf file with ACB Board of Publications (BOP) print formatting -- Arial 18-point body text, 22-point headings, 20-point subheadings, 1.15 line spacing (ACB print standard), 1-inch margins, flush-left alignment, and underline-only emphasis. If you check the binding margin option, the left margin increases to 1.5 inches.

Powered by: Pandoc and WeasyPrint (CourtBouillon, BSD license). Pandoc converts your document to HTML, then WeasyPrint renders that HTML with our ACB print CSS into a PDF. WeasyPrint is a CSS-based PDF engine, so the same ACB BOP typography and spacing rules that work in the browser also produce consistent print output.
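
The two stages described above can be sketched as a pair of command lines, assuming the pandoc and weasyprint command-line tools are both installed. The function name and file names are hypothetical, and the real service may call these tools differently.

```python
from pathlib import Path
from typing import List


def pdf_pipeline_commands(source: str, print_css: str) -> List[List[str]]:
    """Return the two commands for a source -> HTML -> PDF conversion."""
    stem = Path(source).with_suffix("")
    html, pdf = f"{stem}.html", f"{stem}.pdf"
    return [
        # Stage 1: Pandoc turns the source document into standalone HTML.
        ["pandoc", source, "--standalone", "-o", html],
        # Stage 2: WeasyPrint renders that HTML to PDF using the print stylesheet.
        ["weasyprint", "--stylesheet", print_css, html, pdf],
    ]
```

Running the commands in order with subprocess.run(command, check=True) stops the pipeline if the first stage fails, so WeasyPrint never sees a missing HTML file.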

Choose "EPUB or DAISY format" when you want to:

  • Create an accessible EPUB 3 e-book from a Word document or HTML page
  • Distribute content to people who use e-readers, refreshable braille displays, or DAISY players
  • Convert an EPUB to DAISY 2.02 talking book format for playback on dedicated DAISY hardware
  • Convert an EPUB to DAISY 3 / DTBook format for structure-preserving archival

Accepts: depends on the specific conversion -- Word (.docx), HTML (.html), or ePub (.epub).

You get back: an .epub file (for EPUB conversions) or a .zip file containing the DAISY output folder (for DAISY conversions).

Powered by: DAISY Pipeline 2 (DAISY Consortium). Pipeline is purpose-built for accessible publishing. Unlike the other tools on this page, Pipeline produces packaged publications -- EPUB e-books and DAISY talking books -- designed for assistive technology. It adds proper EPUB accessibility metadata, navigation documents, and reading order. Use Pipeline when the end result needs to work on an e-reader, a refreshable braille display, or a DAISY player, rather than in a web browser.

How do I decide between them?

Think about who will read it and how they will read it:

  • Reading on screen or sharing online? Use "Accessible web page" -- you get an HTML file with ACB formatting that looks right in any browser.
  • Printing on paper? Use "Accessible PDF" -- you get a fixed-layout document with ACB BOP print formatting (18pt Arial, 1.15 line spacing, 1-inch margins) that prints identically on any printer.
  • Sharing an editable document? Use "Word document" -- you get a .docx file people can open and edit in Microsoft Word.
  • Reading on an e-reader or phone? Use "EPUB 3 e-book" -- the content reflows to fit small screens and works with e-reader accessibility features.
  • Reading on a DAISY player or braille display? Use "EPUB or DAISY format" (Pipeline) -- these are the formats those devices expect.
  • Editing, searching, or feeding to an AI tool? Use "Plain text" -- you get raw content in a simple format that works everywhere.
  • Pasting into WordPress or Drupal? Use Export to HTML instead -- it has a CMS Fragment mode that avoids conflicts with your site's theme.

Can I chain them together?

Yes, and this is often the best approach for complex workflows:

  1. Start with Plain text to extract content from a PDF or PowerPoint into Markdown.
  2. Edit the Markdown in any text editor to clean up headings, fix links, and improve structure.
  3. Then convert the Markdown to an accessible web page, Word document, EPUB, or PDF using one of the other options.

Converting in two stages, with an editing pass in between, often produces better results than a single direct conversion, because you can review and improve the content before the final conversion.
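
For readers who script their own conversions, the chained workflow above might look like the following plan. It is only a sketch: it assumes the markitdown and pandoc command-line tools are installed, and the function name and file names are hypothetical.

```python
from pathlib import Path
from typing import List, Optional, Tuple

Step = Tuple[str, Optional[List[str]]]


def chained_conversion_plan(source: str, final_output: str) -> List[Step]:
    """List the extract, edit, and convert steps for one source file."""
    draft = str(Path(source).with_suffix(".md"))
    return [
        # 1. Extract the content into Markdown.
        ("extract", ["markitdown", source, "-o", draft]),
        # 2. Manual step: clean up headings, links, and structure in an editor.
        ("edit", None),
        # 3. Convert the edited Markdown into the final format.
        ("convert", ["pandoc", draft, "--standalone", "-o", final_output]),
    ]
```

The middle step has no command on purpose: the human review between the two conversions is what improves the result.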

Another common workflow: convert one Word document to HTML for your website, to PDF for a print handout, and to EPUB for e-reader distribution -- all from the same source file.

What is the difference between "Export to HTML" and this page?

Export (in the main navigation) is specifically for Word documents and gives you extra options like CMS fragments for pasting into WordPress or Drupal. This page handles a wider range of input formats and always produces a complete, standalone web page with all the accessibility styling included.

Step 2: Formatting options

These options control how your output is formatted. The defaults follow the ACB Large Print Guidelines. Change them only if you have a specific reason.

What do these formatting options do?

ACB Large Print formatting

When checked (recommended), your output follows the American Council of the Blind Board of Publications guidelines:

  • Arial font throughout the document
  • 18-point minimum for body text, 22-point for headings, 20-point for subheadings
  • Flush left alignment (never justified or centered)
  • WCAG 2.2 AA line height (1.5 times the font size) for digital viewing
  • Letter spacing of 0.12em and word spacing of 0.16em for readability
  • High contrast (near-black text on white background)
  • Underline for emphasis (never italic, never bold in body text)
  • 1-inch margins on all sides

When unchecked, the document uses the browser's default styling. This is useful when you plan to add your own CSS later.
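
To make the checked-state rules concrete, here is a minimal sketch that renders them as CSS. The values come from the list above. The selector mapping (h1 for headings, h2 for subheadings), the sans-serif fallback, and the exact near-black color are assumptions, and the stylesheet this site embeds may be organized differently.

```python
from typing import Dict

ACB_SCREEN_RULES: Dict[str, Dict[str, str]] = {
    "body": {
        "font-family": "Arial, sans-serif",   # fallback font is an assumption
        "font-size": "18pt",
        "line-height": "1.5",
        "letter-spacing": "0.12em",
        "word-spacing": "0.16em",
        "text-align": "left",
        "margin": "1in",
        "color": "#1a1a1a",                   # assumed value for "near-black"
        "background-color": "#ffffff",
    },
    "h1": {"font-size": "22pt"},              # assumed mapping: headings
    "h2": {"font-size": "20pt"},              # assumed mapping: subheadings
    # Emphasis is underlined rather than italic.
    "em": {"font-style": "normal", "text-decoration": "underline"},
}


def render_css(rules: Dict[str, Dict[str, str]]) -> str:
    """Render a selector -> declarations mapping as CSS text."""
    blocks = []
    for selector, declarations in rules.items():
        body = "\n".join(f"  {prop}: {value};" for prop, value in declarations.items())
        blocks.append(f"{selector} {{\n{body}\n}}")
    return "\n\n".join(blocks)
```

Keeping the values in one mapping makes it easy to compare them against the guideline list line by line.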

Binding margin

Shifts all content 0.5 inches to the right so text near the left edge is not hidden by a binding. Only needed for documents that will be physically printed and bound. The total left margin becomes 1.5 inches (1 inch standard plus 0.5 inch binding).
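
The margin arithmetic can be sketched as follows. Expressing it as a CSS @page rule is an assumption made for illustration. The page does not say which CSS mechanism it uses, and the names are hypothetical.

```python
STANDARD_MARGIN_IN = 1.0   # standard margin on every side, in inches
BINDING_EXTRA_IN = 0.5     # extra left margin when the binding option is checked


def left_margin_inches(binding: bool) -> float:
    """Return the total left margin in inches."""
    return STANDARD_MARGIN_IN + (BINDING_EXTRA_IN if binding else 0.0)


def page_rule(binding: bool) -> str:
    """Return a CSS @page rule with the computed left margin."""
    return (
        f"@page {{ margin: {STANDARD_MARGIN_IN:g}in; "
        f"margin-left: {left_margin_inches(binding):g}in; }}"
    )
```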

Print stylesheet

Adds CSS rules that activate only when the document is printed (using your browser's Print function or Ctrl+P). The print version uses ACB's print-specific line height of 1.15 (tighter than the 1.5 used on screen) and hides screen-only elements like navigation links.
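
A print-only block of this kind could look like the sketch below. The 1.15 line height comes from the text above. Using the nav selector for screen-only elements is an assumption, as is the function name.

```python
from typing import Sequence


def print_block(screen_only_selectors: Sequence[str] = ("nav",)) -> str:
    """Return a CSS block that applies only when the page is printed."""
    hidden = ", ".join(screen_only_selectors)
    return (
        "@media print {\n"
        "  body { line-height: 1.15; }\n"        # tighter ACB print spacing
        f"  {hidden} {{ display: none; }}\n"     # hide screen-only elements
        "}"
    )
```

Because the rules sit inside @media print, they leave the on-screen line height of 1.5 untouched.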

Step 3: Document title (optional)

Sets the document title used by browsers, screen readers, and PDF metadata. Leave blank to use the filename as the title. Applies to HTML, Word, EPUB, and PDF output.
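
The fallback rule can be sketched in a few lines. Whether the site keeps or drops the file extension when it uses the filename is not stated, so this sketch assumes the extension is dropped. The function name is hypothetical.

```python
from pathlib import Path
from typing import Optional


def resolve_title(title: Optional[str], filename: str) -> str:
    """Use the supplied title, or fall back to the filename without its extension."""
    cleaned = (title or "").strip()
    # Assumption: a blank or whitespace-only title counts as "left blank".
    return cleaned or Path(filename).stem
```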

Step 4: Upload your file

Maximum file size: 500 MB. The accepted file types depend on which output format you chose in Step 1.
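
As an illustration of how the size limit and the per-format rules combine, the sketch below checks an upload against the "Accepts" lists from Step 1. The dictionary keys and function name are hypothetical. Treating 500 MB as 500 * 1024 * 1024 bytes is an assumption. The Pipeline option is left out because its accepted input depends on the specific conversion.

```python
from pathlib import Path
from typing import Optional

MAX_UPLOAD_BYTES = 500 * 1024 * 1024  # assumes binary megabytes

PANDOC_TEXT_INPUTS = {".md", ".rst", ".odt", ".rtf"}

ACCEPTED_EXTENSIONS = {
    "plain_text": {".docx", ".xlsx", ".xls", ".pptx", ".pdf", ".html",
                   ".csv", ".json", ".xml", ".epub", ".zip"},
    "web_page": PANDOC_TEXT_INPUTS | {".docx", ".epub"},
    "word": PANDOC_TEXT_INPUTS | {".epub"},    # .docx input is not accepted
    "epub": PANDOC_TEXT_INPUTS | {".docx"},    # .epub input is not accepted
    "pdf": PANDOC_TEXT_INPUTS | {".docx", ".epub"},
}


def check_upload(output_format: str, filename: str, size_bytes: int) -> Optional[str]:
    """Return None when the upload is acceptable, otherwise a short reason."""
    if size_bytes > MAX_UPLOAD_BYTES:
        return "file is larger than 500 MB"
    extension = Path(filename).suffix.lower()
    if extension not in ACCEPTED_EXTENSIONS[output_format]:
        return f"{extension or 'this file type'} is not accepted for {output_format}"
    return None
```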

Which file types are accepted?

For plain text (Markdown) output

You can convert almost any document format:

  • Word (.docx) -- full text, headings, lists, tables, and links are preserved
  • Excel (.xlsx, .xls) -- each sheet becomes a section with its data in table format
  • PowerPoint (.pptx) -- slide titles, text content, and speaker notes are extracted
  • PDF (.pdf) -- text is extracted from tagged and untagged PDFs (quality depends on the PDF structure)
  • HTML (.html) -- text content extracted from web pages
  • ePub (.epub) -- full text content from e-books
  • Data files (.csv, .json, .xml) -- structured data converted to readable text
  • ZIP (.zip) -- text extracted from files inside the archive

For accessible web page (HTML) output

Best results come from text-oriented document formats:

  • Markdown (.md) -- best input format; heading structure, lists, links, and tables convert cleanly
  • Word (.docx) -- headings, lists, tables, and basic formatting are preserved
  • ePub (.epub) -- e-book content converted to a single accessible web page
  • reStructuredText (.rst) -- technical documentation format
  • OpenDocument (.odt) -- LibreOffice and Google Docs export format
  • Rich Text (.rtf) -- legacy text format with basic formatting

For Word document (.docx) output

All text-oriented formats except Word itself (since the file is already a .docx):

  • Markdown (.md) -- best input; heading styles, lists, and tables map directly to Word styles
  • ePub (.epub) -- e-book content extracted into a Word document
  • reStructuredText (.rst) -- technical documentation format
  • OpenDocument (.odt) -- LibreOffice and Google Docs export format
  • Rich Text (.rtf) -- legacy text format with basic formatting

For EPUB 3 e-book (via Pandoc) output

All text-oriented formats except EPUB itself:

  • Markdown (.md) -- best input; heading structure becomes EPUB chapters and navigation
  • Word (.docx) -- headings, lists, tables, and basic formatting are preserved
  • reStructuredText (.rst) -- technical documentation format
  • OpenDocument (.odt) -- LibreOffice and Google Docs export format
  • Rich Text (.rtf) -- legacy text format with basic formatting

For accessible PDF output

All text-oriented formats supported by Pandoc:

  • Markdown (.md) -- best input; produces the cleanest PDF with proper heading hierarchy
  • Word (.docx) -- headings, lists, tables, and basic formatting are preserved
  • ePub (.epub) -- e-book content rendered as a print-ready PDF
  • reStructuredText (.rst) -- technical documentation format
  • OpenDocument (.odt) -- LibreOffice and Google Docs export format
  • Rich Text (.rtf) -- legacy text format with basic formatting

For EPUB or DAISY output (via DAISY Pipeline)

Each Pipeline conversion accepts a specific input type:

  • Word to EPUB 3 -- accepts Word (.docx). Converts through DTBook intermediate format for structure preservation.
  • HTML to EPUB 3 -- accepts HTML (.html). Packages a web page into a portable, accessible e-book.
  • EPUB to DAISY 2.02 -- accepts ePub (.epub). Creates a talking book format for dedicated DAISY players.
  • EPUB to DAISY 3 -- accepts ePub (.epub). Creates a DTBook structured text format.

DAISY conversions that produce a folder of files (DAISY 2.02, DAISY 3) are packaged as a ZIP download.
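
Packaging an output folder as a ZIP can be done with the Python standard library, as in this minimal sketch. The function name is hypothetical, and the service may package its output differently.

```python
import shutil
from pathlib import Path


def package_daisy_folder(folder: str) -> Path:
    """Zip the contents of a DAISY output folder and return the archive path."""
    folder_path = Path(folder)
    archive = shutil.make_archive(
        base_name=str(folder_path),   # writes <folder>.zip next to the folder
        format="zip",
        root_dir=str(folder_path),    # paths inside the ZIP are relative to the folder
    )
    return Path(archive)
```

Writing the archive next to the folder rather than inside it keeps the ZIP from being added to itself.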

What happens to my file?

Your document is converted on the server and the result is sent back to you immediately as a download. Your original file is deleted from the server right after the conversion is complete. We do not store, read, or share your documents.
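
One common way to guarantee this kind of cleanup is to do all the work inside a temporary directory that is removed as soon as the conversion returns. The sketch below shows that pattern only. It is not a description of this site's server code, and the names are hypothetical.

```python
import tempfile
from pathlib import Path
from typing import Callable


def convert_and_discard(upload: bytes, filename: str,
                        convert: Callable[[Path], bytes]) -> bytes:
    """Run convert on a temporary copy of the upload, then delete that copy."""
    with tempfile.TemporaryDirectory() as workdir:
        # Keep only the final path component of the client-supplied name.
        source = Path(workdir) / Path(filename).name
        source.write_bytes(upload)
        result = convert(source)
    # The temporary directory and the uploaded copy are gone at this point,
    # even if convert raised an exception.
    return result
```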

Tips for getting the best results