The .ODT word processor documents produced by LibreOffice and Apache OpenOffice are in fact ZIP archives consisting of XML files, other files, and directories. It is actually straightforward to crack open the archive and get at the files and directories within. This can be helpful for fixing bugs, or just for exploring. I recently had cause to do just that. Here is how to open up .ODT documents, and then turn those files and directories back into a document file.

Though LibreOffice and Apache OpenOffice are different office suites, they share a common “Open Document Format”. Their word processor files are marked with the extension .ODT (think, “Open Document Text”). That format is described in the large, multi-part specification, Open Document Format for Office Applications (OpenDocument) Version 1.3. The compressed archive structure of the document files is described in Part 2: Packages, section 2.2.1 OpenDocument Package. There are important details in section 3.3 MIME Media Type, about the contained text file named mimetype. The Wikipedia article, OpenDocument technical specification, is also helpful.

Before we go through how to crack open these archives, let’s consider the reasons to avoid it, and look at an alternative. There are multiple files inside the archive. The Open Document Format is complex. The relationships between these files are complex. If you make the right change within those files, you can repair or improve them, but if you make the wrong change, you might mess the document up irrecoverably. There is an alternative.

LibreOffice allows you to save a document as a “Flat XML ODF Text Document”. This document type is one of the choices in the “File Type” menu of the LibreOffice “File”, “Save As…” dialog. Files like this typically get the extension .FODT. (I’m not discussing Apache OpenOffice further, to keep this post manageable.) The result is a single XML file, which you can view directly with a text editor. There are relationships within this file which need preserving, but there are no other files to keep consistent with it. Section 3 Document Structure, Section 3.1.2 <office:document> (Single OpenDocument XML Files) is an entry point to its technical specification. You can open this format of document back up in LibreOffice, and it will let you save it as a normal document (that is, a compressed archive). Consider doing your work on a Flat XML document instead of opening up the .ODT package.
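
To give a feel for how approachable the flat format is, here is a minimal Python sketch that pulls paragraph text out of a Flat XML document using only the standard library. Python is my choice for illustration, not part of the workflow described here, and the function name paragraph_texts is made up for this example. It assumes paragraphs are <text:p> elements in the standard OpenDocument text namespace, and it ignores headings, tables, and everything else.

```python
import xml.etree.ElementTree as ET

# OpenDocument namespace for text elements such as <text:p>.
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"


def paragraph_texts(fodt_xml):
    """Return the plain text of every <text:p> element in a flat ODF document.

    fodt_xml is the document's XML, as bytes or str. Headings (<text:h>)
    and other element types are deliberately left out of this sketch.
    """
    root = ET.fromstring(fodt_xml)
    return ["".join(p.itertext()) for p in root.iter(f"{{{TEXT_NS}}}p")]
```

You might call it as paragraph_texts(open("example.fodt", "rb").read()), where example.fodt is a hypothetical file saved from LibreOffice.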

But if you still want to open up that .ODT package, here is how to do it.

First, choose a ZIP archiving and extraction tool. I like to use the zip and unzip command-line tools on macOS. If you prefer an application with a window-based graphical interface, there are several: 7-Zip, WinRAR, and many more.

Put the document’s .ODT file in a directory where you can work on it and keep its parts together.

Uncompress the archive by using a command like this. example.odt is the name of the document to open up. example.unzipped/ is the name of a directory to receive the parts of the document package.

% unzip example.odt -d example.unzipped/ 
Archive:  example.odt
 extracting: example.unzipped/mimetype  
  inflating: example.unzipped/meta.xml  
  inflating: example.unzipped/settings.xml  
  inflating: example.unzipped/styles.xml  
  inflating: example.unzipped/manifest.rdf  
   creating: example.unzipped/Configurations2/accelerator/
   creating: example.unzipped/Configurations2/statusbar/
   creating: example.unzipped/Configurations2/floater/
   creating: example.unzipped/Configurations2/menubar/
   creating: example.unzipped/Configurations2/popupmenu/
   creating: example.unzipped/Configurations2/images/Bitmaps/
   creating: example.unzipped/Configurations2/toolbar/
   creating: example.unzipped/Configurations2/progressbar/
   creating: example.unzipped/Configurations2/toolpanel/
  inflating: example.unzipped/layout-cache  
 extracting: example.unzipped/Thumbnails/thumbnail.png  
  inflating: example.unzipped/META-INF/manifest.xml  
  inflating: example.unzipped/content.xml  

You can explore these files and directories. What each does, and how to change them safely, is a long story, told by the Open Document Format. It is beyond our scope here. In my case, content.xml was the file containing the material which I wanted to edit.
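
If you would rather script the extraction than use a command-line archiver, the same step can be sketched with Python's standard zipfile module. This is an illustration under my own assumptions, not the method used in this post; the function name unpack_odt is hypothetical.

```python
import zipfile


def unpack_odt(odt_path, dest_dir):
    """Extract every entry of the package into dest_dir.

    Returns the entry names in the order the archive lists them, which
    is a handy way to see what the package contains.
    """
    with zipfile.ZipFile(odt_path) as zf:
        zf.extractall(dest_dir)
        return zf.namelist()
```

For example, unpack_odt("example.odt", "example.unzipped") would play the role of the unzip command shown earlier.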

Once you have made any changes you want to make, remake the document package in two steps. First, add the mimetype file, uncompressed, to a new archive. It must be first in the archive, and it must be in uncompressed form. Second, add the remaining files to the archive. They may be compressed.

These are the commands I used. My new document is example2.odt. The “../” locates it next to and outside the package directory. Note that you must run the zip command from within the package directory, so that the files in the root of the package directory end up in the root of the package archive. zip -0 adds a file without compressing. --recurse-paths tells the archiver to include all the subdirectories and their contents as well. --exclude mimetype tells the archiver to skip the mimetype file this time, because it is already in the archive.

% cd example.unzipped/
% zip -0 ../example2.odt mimetype 
  adding: mimetype (stored 0%)
% zip --recurse-paths ../example2.odt * --exclude mimetype
  adding: Configurations2/ (stored 0%)
  adding: Configurations2/menubar/ (stored 0%)
  adding: Configurations2/images/ (stored 0%)
  adding: Configurations2/images/Bitmaps/ (stored 0%)
  adding: Configurations2/statusbar/ (stored 0%)
  adding: Configurations2/toolbar/ (stored 0%)
  adding: Configurations2/progressbar/ (stored 0%)
  adding: Configurations2/popupmenu/ (stored 0%)
  adding: Configurations2/accelerator/ (stored 0%)
  adding: Configurations2/toolpanel/ (stored 0%)
  adding: Configurations2/floater/ (stored 0%)
  adding: META-INF/ (stored 0%)
  adding: META-INF/manifest.xml (deflated 73%)
  adding: Thumbnails/ (stored 0%)
  adding: Thumbnails/thumbnail.png (deflated 5%)
  adding: content.xml (deflated 92%)
  adding: layout-cache (deflated 55%)
  adding: manifest.rdf (deflated 71%)
  adding: meta.xml (deflated 53%)
  adding: settings.xml (deflated 86%)
  adding: styles.xml (deflated 91%)

Now, you can open example2.odt as an ordinary document. LibreOffice should not give you any error messages.
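
The two-step repacking can also be sketched in Python with the standard zipfile module. Treat this as a rough equivalent under stated assumptions rather than a tested replacement for the zip commands: the function name pack_odt is hypothetical, the output file must live outside the package directory, and I have only reasoned about the archive layout, not about every case LibreOffice might care about.

```python
import os
import zipfile


def pack_odt(src_dir, out_path):
    """Rebuild an .odt package from an unpacked directory.

    The mimetype file goes in first and uncompressed; everything else
    follows, compressed. out_path must be outside src_dir, otherwise the
    walk below would try to add the archive to itself.
    """
    with zipfile.ZipFile(out_path, "w") as zf:
        # Step 1: mimetype first, stored without compression.
        zf.write(os.path.join(src_dir, "mimetype"), "mimetype",
                 compress_type=zipfile.ZIP_STORED)
        # Step 2: the remaining directories and files.
        for root, dirs, files in os.walk(src_dir):
            dirs.sort()
            rel_root = os.path.relpath(root, src_dir)
            if rel_root != ".":
                # Keep directory entries, as zip --recurse-paths does.
                zf.write(root, rel_root.replace(os.sep, "/"))
            for name in sorted(files):
                full = os.path.join(root, name)
                arcname = os.path.relpath(full, src_dir).replace(os.sep, "/")
                if arcname == "mimetype":
                    continue  # already added in step 1
                zf.write(full, arcname, compress_type=zipfile.ZIP_DEFLATED)
```

A call such as pack_odt("example.unzipped", "example2.odt") corresponds to the two zip commands above. Unlike the shell version, it does not depend on the current working directory, because archive names are computed relative to src_dir.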

One easy mistake is to create the entire archive in one step, with a command like zip --recurse-paths ../example3.odt * and nothing more. LibreOffice can open a document like this, but it will warn you that the document is corrupted and needs repair. Adding the mimetype file first, uncompressed, avoids this warning.
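
To catch this mistake before opening the file, you could check the two conditions directly. The sketch below uses Python's standard zipfile module; the function name mimetype_problems is hypothetical. It tests only the two rules named in this post (mimetype first, mimetype uncompressed). I am assuming, not asserting, that these are what trigger the LibreOffice warning.

```python
import zipfile


def mimetype_problems(odt_path):
    """Return a list of problems with the package's mimetype entry.

    An empty list means mimetype is the first entry and is stored
    uncompressed. Entry order is taken from the archive's directory
    listing, which matches write order for archives built sequentially.
    """
    problems = []
    with zipfile.ZipFile(odt_path) as zf:
        entries = zf.infolist()
    first = entries[0] if entries else None
    if first is None or first.filename != "mimetype":
        problems.append("mimetype is not the first entry")
    elif first.compress_type != zipfile.ZIP_STORED:
        problems.append("mimetype is compressed")
    return problems
```

An archive built in one step would typically report that mimetype is not the first entry, since the shell expands * in sorted order and names such as Configurations2 sort ahead of it.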

Another approach to getting a single file from the package, then replacing it, is to use a zip command option to freshen just the files you changed, within a copy of the document file. Here is how to uncompress the archive (as before), copy the document file to a new example4.odt, then use zip -f to “freshen” the changed content.xml file.

% unzip example.odt -d example.unzipped/ 
Archive:  example.odt …
% # make edits as desired, for example to content.xml
% cp example.odt example4.odt
% # run zip from inside the package directory, so the name matches the archive entry
% cd example.unzipped/
% zip -f ../example4.odt content.xml
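
The same idea, replacing one file in a copy of the document, can be sketched in Python. The standard zipfile module has no in-place replace, so this hypothetical replace_member function writes a new archive, copying each entry in its original order and with its original compression setting, and substituting new bytes for the one entry you changed. It is an illustration under those assumptions, not the approach used in this post.

```python
import zipfile


def replace_member(src_odt, dst_odt, member, new_data):
    """Copy src_odt to dst_odt, swapping in new_data for one entry.

    Entry order and per-entry compression are preserved, so a mimetype
    entry that was first and uncompressed stays that way.
    """
    with zipfile.ZipFile(src_odt) as zin, \
            zipfile.ZipFile(dst_odt, "w") as zout:
        for info in zin.infolist():
            if info.filename == member:
                data = new_data
            else:
                data = zin.read(info.filename)
            # Build a fresh header rather than reusing the one read from
            # the source archive, carrying over only what we need.
            fresh = zipfile.ZipInfo(info.filename, date_time=info.date_time)
            fresh.compress_type = info.compress_type
            fresh.external_attr = info.external_attr
            zout.writestr(fresh, data)
```

For example, replace_member("example.odt", "example4.odt", "content.xml", edited_bytes) would produce example4.odt with only content.xml changed, where edited_bytes holds your edited XML.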

Some of the tools with window-based graphical interfaces (7-Zip among them) allow you to open an archive as if it were a directory in your Finder or File Explorer, without actually extracting the files from the archive. Then you can run a text editor on the contained files, such as content.xml. When you save the file in the editor, the archive tool asks if you want to replace that file in the archive. Answer yes, then save the archive. It is now ready to open with LibreOffice.

This blog post expands on answers to a question, “How can I uncompress a LibreOffice document to get its XML internals, then make a new document from them?”, which I posted at the LibreOffice support site. I am grateful to contributor Regina. Better answers and tips might accumulate there after this blog post is published. Also, the discussion at LibreOffice Bug 131025 – “Writer document with tables lost data in cells (apparently) replacing with 0” recommended cracking open document archives as a way of repairing documents affected by a bug. Mark van Rossum’s comments there are particularly helpful.