Recently, I remodelled a Joomla template for one of my clients.  As I was installing newer versions of the template, I noticed an error message started appearing on installation:

XML Parsing Error at 1:1. Error 4: Empty document

However, the installation seemed to be successful. And I had a valid templateDetails.xml file in my template’s .zip file. There was a Joomla! forum thread “XML Parsing Error at 1:1. Error 4: Empty document” dating from December 28, 2008 about this problem, so it wasn’t just me.

Here is my diagnosis of the problem, and my solution.  Since this is one of those problems that drove me crazy, I’m going to put my findings here in hopes that search engines will find it and bring it to others who might benefit from the tip. Hence, it’s tagged “robobait”.

The short answer: the error message comes from a metadata file named ._templateDetails.xml, which the Mac OS X 10.5.x Finder inserted into the ZIP archive. This file didn’t contain XML, but the Joomla installer tried to parse it as XML anyway because its name ends in “.xml”. And the content it did have happened to be the kind of invalid content that provoked an error message instead of a silent failure. The immediate solution is to generate the ZIP file in such a way that it contains no metadata file named ._templateDetails.xml. Long-term, it would be nice if Joomla! would not display an error message in this situation.

Read on for more about the problem, the diagnosis, and possible solutions.

The installable archive structure

Joomla templates (and other extensions like modules and plugins) are structured as a compressed archive (.zip, .tar.gz, or .tar.bz2 formats all work) containing the various files of the extension, plus an XML-format file which gives meta-information about the extension. This meta-information includes the name of the extension, the version number, a list of the files used by the extension, and so on.

As I made revisions to the template, I generated the installable compressed archive using the Mac OS X 10.5.x Finder. I selected all the files and directories of the template, right-clicked on one of the file icons, and selected “Compress”. Finder produced a file “Archive.zip”, which I renamed.

I had forgotten that Mac OS X archive tools, such as Finder’s “Compress” command and tar, use the AppleDouble file format by default. AppleDouble stores metadata and resource forks for files in sidecar files named with a “._” prefix followed by the main file’s name. Finder’s ZIP archives put these in a subdirectory tree named __MACOSX. Thus my archives had an 82-byte file __MACOSX/._templateDetails.xml in addition to the correct templateDetails.xml. See OSX Considered harmful for someone else who found this behaviour an obstacle. The Mac OS X archive utilities look at a shell environment variable “COPYFILE_DISABLE” (it was “COPY_EXTENDED_ATTRIBUTES_DISABLE” in Mac OS X 10.3) to disable the generation of the AppleDouble sidecar files. See the blog Resource forks and tar for more on the subject.
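
One way to check an archive for these sidecar files before uploading it is to list its members and look for the “._” prefix or the __MACOSX directory. The following is a minimal sketch in Python (not part of Joomla); the function name find_appledouble_entries and the in-memory sample archive are made up for illustration, and the four bytes written into the sidecar are only the AppleDouble magic number, not a complete AppleDouble file.

```python
import io
import zipfile


def find_appledouble_entries(archive):
    """Return names of archive members that look like AppleDouble sidecars."""
    suspects = []
    with zipfile.ZipFile(archive) as zf:
        for name in zf.namelist():
            parts = name.split("/")
            basename = parts[-1]
            # Finder puts sidecars under __MACOSX/; tar puts "._name" next to "name".
            if parts[0] == "__MACOSX" or basename.startswith("._"):
                suspects.append(name)
    return suspects


# Build a small in-memory archive that imitates the layout described above.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("templateDetails.xml", "<install type='template'/>")
    zf.writestr("index.php", "<?php // template body ?>")
    zf.writestr("__MACOSX/._templateDetails.xml", b"\x00\x05\x16\x07")

buffer.seek(0)
for name in find_appledouble_entries(buffer):
    print("AppleDouble sidecar:", name)
```

To check a real archive, pass its path instead of the in-memory buffer, for example find_appledouble_entries("mytemplate.zip").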

The Joomla installer code

The error, “XML Parsing Error at 1:1. Error 4: Empty document”, appears to be raised by the routine JSimpleXML::_handleerror(), which is called by JSimpleXML::_parse(), which is a thin wrapper around PHP’s xml_parse(), which in turn is based on the expat XML parsing library. See more at the JSimpleXML class documentation.

In particular, JSimpleXML::_handleerror() line 256 contains the statement:


    JError::raiseWarning( 'SOME_ERROR_CODE' , 'XML Parsing Error at '.$line.':'.$col.'. Error '.$code.': '.xml_error_string($code));

where $line, $col, and $code are all variables passed back from PHP’s xml_parse error routines. Code 4 appears to be expat’s XML_ERROR_INVALID_TOKEN per PHP’s XML parsing documentation, and the expat.h header file. However, PHP’s xml_error_string() gives the string “Empty document” for this code. Reasonable, perhaps; it looks like expat returns XML_ERROR_INVALID_TOKEN for empty files and invalid UTF-8 code sequences, and an empty file might be the more common occurrence of the two.

Thus, it looks like it is a file which is empty or has an invalid UTF-8 code sequence that is provoking the error message. But my templateDetails.xml was valid UTF-8 (valid US-ASCII, in fact). So what file was provoking the error?
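
To illustrate how an XML parser reacts to AppleDouble content, here is a minimal sketch using Python’s binding to expat rather than PHP’s xml_parse(), so the exact codes and messages need not match what Joomla shows. The sample bytes are a stand-in, assumed for illustration: the AppleDouble magic number and version followed by filler text, not the actual 82-byte file from my archive. The helper name try_parse is likewise made up.

```python
from xml.parsers import expat

APPLEDOUBLE_MAGIC = b"\x00\x05\x16\x07"    # AppleDouble magic number
APPLEDOUBLE_VERSION = b"\x00\x02\x00\x00"  # format version 2
# Stand-in for the start of a sidecar file (illustrative, not the real file).
sidecar_bytes = APPLEDOUBLE_MAGIC + APPLEDOUBLE_VERSION + b"Mac OS X        "


def try_parse(data):
    """Return None on success, or (code, line, message) on a parse failure."""
    parser = expat.ParserCreate()
    try:
        parser.Parse(data, True)
    except expat.ExpatError as err:
        return err.code, err.lineno, expat.ErrorString(err.code)
    return None


manifest_result = try_parse(b"<install type='template'><name>demo</name></install>")
sidecar_result = try_parse(sidecar_bytes)
print("manifest:", manifest_result)
print("sidecar: ", sidecar_result)
```

The well-formed manifest parses cleanly, while the binary header should be rejected on line 1 with expat’s invalid-token code, which is consistent with an error reported at the very start of the file.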

I believe (but haven’t confirmed by stepping through with a debugger) that when you ask Joomla to install a template, Joomla creates a JInstaller object, and calls JInstaller::install(“/path/to/install/file.zip”) or similar. This seems eventually to call a method JInstaller::_findManifest(). (There is also a JInstallerTemplate class, with a method JInstallerTemplate::install(). This method also seems eventually to call the same _findManifest().)

JInstaller::_findManifest() finds all files ending in the extension “.xml”. Thus it would find both the correct templateDetails.xml, and the bogus ._templateDetails.xml. From line 1103 in that method:

    // Get an array of all the xml files from teh installation directory
    $xmlfiles = JFolder::files($this->getPath('source'), '.xml$', 1, true);

The method then calls JSimpleXML::_parse() on each one of the files. It looks like there was something about the ._templateDetails.xml file which caused the parser to raise an error, not just return a failure code. This provokes JSimpleXML to produce the message “XML Parsing Error at 1:1. Error 4: Empty document”. The method notes the error message, and crosses that file off its list. Then the method parses the correct templateDetails.xml file to complete the installation.
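
The behaviour described above can be imitated outside Joomla. The following Python sketch is a loose model of my reading of the installer, not a port of JInstaller::_findManifest(): it collects every file ending in “.xml”, tries to parse each one, records a warning for those that fail, and keeps the first file whose root element is <install>. The function name find_manifest and the sample files are made up for illustration.

```python
import os
import tempfile
import xml.etree.ElementTree as ET


def find_manifest(source_dir):
    """Return (manifest_path, warnings) after trying every .xml file."""
    candidates = []
    for folder, _dirs, files in os.walk(source_dir):
        for filename in files:
            if filename.endswith(".xml"):
                candidates.append(os.path.join(folder, filename))

    manifest = None
    warnings = []
    for path in sorted(candidates):
        try:
            root = ET.parse(path).getroot()
        except ET.ParseError as err:
            # Record the failure with the file name and move on to the next file.
            warnings.append(f"{os.path.basename(path)}: {err}")
            continue
        if manifest is None and root.tag == "install":
            manifest = path
    return manifest, warnings


with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "templateDetails.xml"), "w") as fh:
        fh.write("<install type='template'><name>demo</name></install>")
    os.mkdir(os.path.join(tmp, "__MACOSX"))
    with open(os.path.join(tmp, "__MACOSX", "._templateDetails.xml"), "wb") as fh:
        fh.write(b"\x00\x05\x16\x07\x00\x02\x00\x00")  # stand-in sidecar bytes

    manifest, warnings = find_manifest(tmp)
    print("manifest:", os.path.basename(manifest))
    for warning in warnings:
        print("warning:", warning)
```

In this model the installation still succeeds, because the real manifest is found, but the sidecar file leaves one warning behind, which matches what I saw.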

I observed this error message on my client’s site, which runs Joomla! 1.5.9. Just for fun, I tried installing that very same template on an experimental Joomla 1.5.3 site which is close to a bare install of the basic Joomla distribution. It installed with no error message. I then upgraded that experimental site to Joomla 1.5.9 using the 1.5.3 -> 1.5.9 upgrader. I uninstalled the template, then installed it again. This time, the error message occurred. This makes me think that Joomla! 1.5.9 behaves differently from 1.5.3 in some way that displays an error for my ._templateDetails.xml file. I’m guessing that 1.5.9 is more diligent about passing error messages on to the user.

Solutions

What could Joomla! do to prevent this problem?  I think it would be helpful if Joomla would extend this error message to include a file name. That alone would have saved me hours of debugging: thinking my real templateDetails.xml file was at fault, I stripped it down, checked its validity, and of course found nothing. Joomla could also perhaps not pass on such error messages as long as one of the XML files in the archive is valid.  It could even change its pattern to exclude the AppleDouble metadata files. Since the second parameter to JFolder::files is eventually passed to PHP’s preg_match(), maybe “^[^\.][^_].*xml$” would do the trick. (This regexp also excludes any file name whose second character is an underscore, such as “a_b.xml”, but those are unlikely names for a Joomla! manifest anyway.)
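
The effect of such a pattern is easy to check against a few sample file names. This is a small sketch using Python’s re module, whose syntax agrees with PCRE for these constructs; it assumes the filter is applied to the bare file name rather than the full path. The second pattern is a hypothetical alternative of my own, not something Joomla uses: it rejects only the “._” prefix and insists on a literal “.xml” extension.

```python
import re

# The pattern suggested in the text.
proposed = re.compile(r"^[^\.][^_].*xml$")
# Hypothetical alternative: refuse only the "._" prefix, require ".xml".
alternative = re.compile(r"^(?!\._).*\.xml$")

names = [
    "templateDetails.xml",
    "._templateDetails.xml",
    "a.xml",
    "a_b.xml",
    "notxml",
]

for name in names:
    hit_proposed = bool(proposed.search(name))
    hit_alternative = bool(alternative.search(name))
    print(f"{name:24} proposed={hit_proposed!s:6} alternative={hit_alternative}")
```

Both patterns accept templateDetails.xml and reject ._templateDetails.xml. The suggested pattern additionally rejects “a_b.xml” and accepts “notxml”, because it never requires a dot before “xml”; the alternative does the opposite in both cases.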

To fix this problem as a template developer, try one of:

  • Use some platform other than Mac OS X to generate your installable archives.
  • On Mac OS X, compress from the shell. Run “export COPYFILE_DISABLE=true” first, then your compress command, such as “zip -r ../mytemplate.zip *”.
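
Another option that works the same way on any platform is to build the archive with a small script that skips the metadata files explicitly. The following Python sketch shows the idea; the names build_clean_zip and is_mac_metadata, the “mytemplate” directory, and the list of skipped names are assumptions for illustration, so adjust them to your own template.

```python
import os
import tempfile
import zipfile


def is_mac_metadata(name):
    """True for AppleDouble sidecars and other Finder metadata entries."""
    return name.startswith("._") or name in (".DS_Store", "__MACOSX")


def build_clean_zip(source_dir, zip_path):
    """Zip the contents of source_dir, leaving out Mac metadata."""
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for folder, dirs, files in os.walk(source_dir):
            # Prune metadata directories so os.walk does not descend into them.
            dirs[:] = [d for d in dirs if not is_mac_metadata(d)]
            for filename in files:
                if is_mac_metadata(filename):
                    continue
                full = os.path.join(folder, filename)
                # Store paths relative to the template root, as the installer expects.
                zf.write(full, os.path.relpath(full, source_dir))


# Demonstration on a throwaway directory with one sidecar file in it.
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "mytemplate")
    os.makedirs(os.path.join(src, "css"))
    for rel in ("templateDetails.xml", "._templateDetails.xml", "css/template.css"):
        with open(os.path.join(src, *rel.split("/")), "w") as fh:
            fh.write("placeholder")

    out = os.path.join(tmp, "mytemplate.zip")
    build_clean_zip(src, out)
    with zipfile.ZipFile(out) as zf:
        packed = sorted(zf.namelist())
    print(packed)
```

The resulting archive contains templateDetails.xml and css/template.css but not ._templateDetails.xml.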

    May no-one else have to spend the hours I spent diagnosing this problem!