Mammoth is designed to convert .docx documents, such as those created by Microsoft Word, to HTML. Mammoth aims to produce simple and clean HTML by using semantic information in the document and ignoring other details. For instance, Mammoth converts any paragraph with the style Heading 1 to h1 elements, rather than attempting to exactly copy the styling (font, text size, colour, etc.) of the heading. This allows you to paste from Word documents without the usual mess.
There's a large mismatch between the structure used by .docx and the structure of HTML, meaning that the conversion is unlikely to be perfect for more complicated documents. Mammoth works best if you only use styles to semantically mark up your document.
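The idea of converting by style rather than by appearance can be illustrated with a toy model. This is a sketch only, not Mammoth's implementation: the paragraph objects, the `defaultTags` table and the `paragraphToHtml` function are hypothetical names made up for the example.

```javascript
// Toy model only: shows style-driven conversion, not Mammoth's real code.
// Hypothetical table from Word style names to HTML tags.
const defaultTags = new Map([
    ["Heading 1", "h1"],
    ["Heading 2", "h2"]
]);

// Escape the characters that are significant in HTML text content.
function escapeHtml(text) {
    return text
        .replace(/&/g, "&amp;")
        .replace(/</g, "&lt;")
        .replace(/>/g, "&gt;");
}

// Only the style name decides the element; font, size and colour are ignored.
function paragraphToHtml(paragraph) {
    const tag = defaultTags.get(paragraph.styleName) || "p";
    return "<" + tag + ">" + escapeHtml(paragraph.text) + "</" + tag + ">";
}

const html = [
    { styleName: "Heading 1", text: "Q&A", font: "Calibri", sizePt: 28, colour: "#1F3864" },
    { styleName: "Normal", text: "Body text.", font: "Times New Roman", sizePt: 11 }
].map(paragraphToHtml).join("\n");

console.log(html);
// <h1>Q&amp;A</h1>
// <p>Body text.</p>
```

Because the visual properties never reach the output, a paragraph that is merely made large and bold in Word, without a heading style applied, would come out as a plain p element in this model.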
The following features are currently supported:
Tables. The formatting of the table itself, such as borders, is currently ignored, but the formatting of the text is treated the same as in the rest of the document.
Footnotes and endnotes.
Bold, italics, superscript and subscript.
Text boxes. The contents of the text box are treated as a separate paragraph that appears after the paragraph containing the text box.
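The text box behaviour described above can be pictured with a small sketch. It assumes a made-up in-memory shape (each paragraph has `text` and an optional `textBoxes` array, where each text box is a list of paragraphs); Mammoth's internal document model is not shown here and may differ.

```javascript
// Toy model only: the paragraph/text box shape here is hypothetical.
// Returns paragraph texts in output order, with the contents of each
// text box placed after the paragraph that contains it.
function flattenTextBoxes(paragraphs) {
    const result = [];
    for (const paragraph of paragraphs) {
        result.push(paragraph.text);
        for (const box of paragraph.textBoxes || []) {
            // A text box holds its own paragraphs, so recurse into it.
            result.push(...flattenTextBoxes(box));
        }
    }
    return result;
}

const flattened = flattenTextBoxes([
    { text: "Intro", textBoxes: [[{ text: "Sidebar note" }]] },
    { text: "Conclusion" }
]);

console.log(flattened);
// [ 'Intro', 'Sidebar note', 'Conclusion' ]
```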
By default, Mammoth maps some common .docx styles to HTML elements. For instance, a paragraph with the style name Heading 1 is converted to an h1 element. If you have a document with your own custom styles, you can use an embedded style map to tell Mammoth how those styles should be mapped. For instance, you could convert paragraphs with the style named WarningHeading to h1 elements with class="warning" with the style mapping:
p[style-name='WarningHeading'] => h1.warning:fresh
An online tool can be used to embed style maps into an existing document. Details of how to write style maps can be found in the mammoth.js documentation.
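When a document uses several custom paragraph styles, the mappings can be generated rather than typed by hand. The sketch below is illustrative: `buildStyleMap` is a hypothetical helper, not part of Mammoth, and the "Aside Text" style is invented for the example. It only produces paragraph mappings of the same form as the one shown above.

```javascript
// Hypothetical helper (not part of Mammoth): builds style mapping lines
// from an object whose keys are Word paragraph style names and whose
// values are the HTML targets.
function buildStyleMap(paragraphStyles) {
    return Object.entries(paragraphStyles).map(([styleName, target]) => {
        // Keep the sketch simple: refuse names that would break the quoting.
        if (styleName.includes("'")) {
            throw new Error("Style name must not contain a single quote: " + styleName);
        }
        return "p[style-name='" + styleName + "'] => " + target + ":fresh";
    });
}

const styleMap = buildStyleMap({
    "WarningHeading": "h1.warning",
    "Aside Text": "p.aside"
});

console.log(styleMap.join("\n"));
// p[style-name='WarningHeading'] => h1.warning:fresh
// p[style-name='Aside Text'] => p.aside:fresh
```

mammoth.js accepts a list of mappings like this through its `styleMap` conversion option. How that option reaches mammoth.js from within this plugin is not covered by the sketch.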
Add basic Gutenberg support.
Update mammoth.js to 1.4.7.
Allow MAMMOTH_OPTIONS to override idPrefix.
Update mammoth.js to 1.4.0. This includes improved handling of hyperlinks, and converts table headers into thead elements.
Handle unsuccessful image uploads where the HTTP request succeeds, but WordPress rejects the file. Fixes an issue where documents with EMF images couldn't be imported.
Update mammoth.js to 1.3.2. This includes a fix for documents where images are referenced by a URI relative to the base URI.
Update mammoth.js to 1.3.1. This includes new ways to map styles, such as style name prefixes.
Improve styling of preview to match the editor.
Fix a bug where images wouldn't upload on certain server configurations.
Allow options to be passed to mammoth.js through a MAMMOTH_OPTIONS global variable.
Include wp-image-* class when inserting images. This allows the WordPress editor to correctly identify the image and show appropriate editing options.
If an image has an alt text description in the original document, set the alt text in the media library when uploading that image.
If an image has an alt text description in the original document, use it to generate the filename.
Set image filename extension based on the image content type.
Show a message while the document is being inserted.
Update mammoth.js to 0.3.30. This includes better support for lists made in LibreOffice.
Update mammoth.js to 0.3.18. Includes support for hyperlinks to bookmarks in the same document.
Add support for CKEditor.
Support any post type that supports the WordPress editor.
Generate consistent footnote and endnote IDs based on the post ID.
Update mammoth.js to 0.3.15.
Fix preview rendering on Chrome.
Update mammoth.js to 0.3.12.
Update mammoth.js to 0.2.2.
Pretty print HTML output.
Hide inline image data in raw HTML preview.