I use GnuCash, the free software accounting program. Like many accounting tools, it can import bank or credit card transactions, and it learns how to map transaction data to my own account structure. Sometimes the tool gets the mapping wrong, and the learned mapping needs to be reset. Here is how I was able to perform this reset. I post it here in the hope that it will help others. Fasten your seatbelt: it involves some awfully technical command-line tools, including XSLT processing.

GnuCash currently (as of version 2.6) can import bank or credit card transactions in the form of OFX transaction files. You invoke this via the menu option File… Import… Import OFX/QFX…. GnuCash lists the transactions from the file, and lets you indicate, for each transaction, which of your accounts it belongs to. It also tries to learn from each of your choices which transaction wording matches which account, so that it can suggest the right mapping in future imports. This mapping uses a Bayesian probability formula, and is described (tersely) in the “Bayes” article of the GnuCash wiki.

A drawback of this adaptive mapping is that GnuCash never lets go of anything it learns. If you at first match the wrong account to a transaction, that wrong match continues to favour wrong suggestions in future imports. The mapping data is also tied to the specific account tree and account names in your GnuCash file (see GnuCash bug 745896, “Bayes Import Map does not use GUID but account name, leading to outdated entries”). If you rename or reorganise your accounts, the old mapping data no longer points at valid accounts, but it still overshadows new matches to the new or corrected account names. Thus, it is helpful to be able to list the mapping data in GnuCash’s file, and to clear out mapping data relating to account names which are wrong or out of date.

The GnuCash format (tersely)

GnuCash currently (as of version 2.6) stores its accounting data as either a database or a compressed XML format file. GnuCash’s format is described (tersely) in the “GnuCash XML Format” article of the GnuCash wiki. At the risk of being terse myself: the transaction mapping data for the Bayes formula for a chequing or credit card account is stored in the corresponding <gnc:account>/<act:slots> element, in a <slot> whose <slot:key> element is “import-map-bayes”, and whose <slot:value type="frame"> element contains the Bayes data. The XML format is standardised, and many tools exist for editing XML format files in automated ways.
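To make that nesting concrete, here is a small Python sketch that walks a hand-written sample fragment and prints the same pairs that the listing stylesheet below produces. The sample is hypothetical: the token “INTEREST”, the count 3, and the type="integer" on the innermost value are my assumptions for illustration, and the element nesting is inferred from the match patterns in the stylesheets rather than taken from a GnuCash specification.

```python
import xml.etree.ElementTree as ET

# Namespace URIs as declared in the stylesheets in this post.
NS = {
    "gnc": "http://www.gnucash.org/XML/gnc",
    "act": "http://www.gnucash.org/XML/act",
    "slot": "http://www.gnucash.org/XML/slot",
}

# Hypothetical sample: one receiving account with one learned token.
SAMPLE = """<gnc-v2 xmlns:gnc="http://www.gnucash.org/XML/gnc"
        xmlns:act="http://www.gnucash.org/XML/act"
        xmlns:slot="http://www.gnucash.org/XML/slot">
  <gnc:account version="2.0.0">
    <act:name>MyBank Chequing</act:name>
    <act:slots>
      <slot>
        <slot:key>import-map-bayes</slot:key>
        <slot:value type="frame">
          <slot>
            <slot:key>INTEREST</slot:key>
            <slot:value type="frame">
              <slot>
                <slot:key>Income:Interest</slot:key>
                <slot:value type="integer">3</slot:value>
              </slot>
            </slot:value>
          </slot>
        </slot:value>
      </slot>
    </act:slots>
  </gnc:account>
</gnc-v2>"""


def bayes_pairs(root):
    """Return sorted, de-duplicated (target account, receiving account) pairs."""
    pairs = set()
    for account in root.iter("{http://www.gnucash.org/XML/gnc}account"):
        receiving = account.findtext("act:name", default="", namespaces=NS)
        for top in account.findall("act:slots/slot", NS):
            if top.findtext("slot:key", namespaces=NS) != "import-map-bayes":
                continue
            # Two frames down: the token level, then the target account level.
            path = "slot:value/slot/slot:value/slot/slot:key"
            for key in top.findall(path, NS):
                pairs.add((key.text, receiving))
    return sorted(pairs)


for target, receiving in bayes_pairs(ET.fromstring(SAMPLE)):
    print(f"{target}\t{receiving}")
```

Note that the <slot> elements carry no namespace prefix, while their <slot:key> and <slot:value> children do. That is why the paths above, like the XSLT patterns below, mix unprefixed and prefixed names.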

It is possible to get GnuCash to store the accounting data as an uncompressed XML file. You do this by opening the GnuCash preferences, going to the General tab, then the Files section, and unchecking the option “Compress files”. Close the preferences. Then invoke the menu option File… Save As…. At the top of the Save As dialogue, there is a pull-down menu labelled “Data Format”. Be sure the “xml” option is selected here. Enter a file name (I suggest using the extension .gnucash.xml), and finish the Save As dialogue. The result is an uncompressed XML format file for which you can edit the mapping data.  Open the file with a text editor, to be sure it is really uncompressed and really in XML format. You need not make any edits with the text editor.
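If you would rather check from a script than with a text editor, the following Python sketch tests whether a file starts with the gzip signature, and decompresses it if so. It assumes that GnuCash's compressed XML files are ordinary gzip files; the function names are my own, not part of any GnuCash tooling.

```python
import gzip
import shutil

# Every gzip stream begins with these two bytes.
GZIP_MAGIC = b"\x1f\x8b"


def is_gzip_compressed(path):
    """Return True if the file at `path` starts with the gzip signature."""
    with open(path, "rb") as f:
        return f.read(2) == GZIP_MAGIC


def decompress_to(src, dst):
    """Write the decompressed contents of gzip file `src` to `dst`."""
    with gzip.open(src, "rb") as fin, open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout)
```

For example, is_gzip_compressed("my_bookkeeping.gnucash.xml") should return False for a file saved with “Compress files” unchecked. Work on a copy: neither function modifies the source file, but decompress_to overwrites its destination.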

Install an XSLT processor. I used xsltproc, which is part of the libxslt package, which I installed via MacPorts. xsltproc takes an XSLT stylesheet containing transformation instructions and an input XML file, and it writes an output file, the result of applying the instructions to the input XML file.

list-import-map-bayes-accounts.xslt

The following XSLT stylesheet lists the Bayes mapping data for each account in a GnuCash XML file. This lets you see which target accounts (e.g. expense “Groceries” or income “Interest”) have been matched, and which receiving account (e.g. a chequing or credit card account) received each of those transactions.

Usage: xsltproc list-import-map-bayes-accounts.xslt my_bookkeeping.gnucash.xml >account_list.txt

The file account_list.txt will have entries like this, where <tab> stands for a tab character:

Income:Interest <tab> MyBank Chequing

<?xml version="1.0" encoding="utf-8" ?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="1.0"
    xmlns:gnc="http://www.gnucash.org/XML/gnc"
    xmlns:act="http://www.gnucash.org/XML/act"
    xmlns:slot="http://www.gnucash.org/XML/slot"
    >
    <!-- list-import-map-bayes-accounts.xslt
        Last updated 2016-02-15. For GnuCash 2.6.11 and similar.
        Reads a GnuCash data file (in uncompressed XML format), printing
        some history data used by the Generic Importer feature,
        for Bayesian matching of imported transactions to accounts.
        It gives you a sorted list of the name of each target account that
        has matched to transactions, and the name of the account which
        received the import.

            Target_account_A  <tab>  Receiving_account_1
            Target_account_A  <tab>  Receiving_account_2
            Target_account_B  <tab>  Receiving_account_1
            Target_account_B  <tab>  Receiving_account_3

        This is especially helpful if you have changed names or parents
        of accounts, but the history data still refers to the old
        account names and overwhelms the history data of new names.
        (See bug https://bugzilla.gnome.org/show_bug.cgi?id=745896
        "Bayes Import Map does not use GUID but account name, leading to outdated entries.")

        Copyright Jim DeLaHunt (jdlh.com) 2016.
        Licensed under the GNU General Public License, Version 2, or
        (at your option) Version 3.  No liability and no warranty. -->
    <xsl:output method="xml" omit-xml-declaration="yes" indent="no" encoding="utf-8" />

    <!-- Index each target-account key by "target <tab> receiving account",
         so that duplicates can be filtered out below. &#9; is a tab. -->
    <xsl:key name='text-ancestor'
        match='/gnc-v2//gnc:account[@version="2.0.0"]/act:slots/
              slot[slot:key/text()="import-map-bayes"]/slot:value[@type="frame"]/
              slot/slot:value[@type="frame"]/slot/slot:key'
        use='concat(text(), "&#9;", ancestor::gnc:account[1]/act:name/text())'
    />

    <!-- Prevent text nodes from being copied out. -->
    <xsl:template match='text()' />

    <!-- Print specific account targets from under import-map-bayes -->
    <xsl:template match='/'>
        <xsl:apply-templates select='/gnc-v2//
            gnc:account[@version="2.0.0"]/act:slots/
            slot[slot:key/text()="import-map-bayes"]/slot:value[ @type="frame"]/
            slot/slot:value[ @type="frame" ]/slot/slot:key[
                generate-id() = generate-id( key("text-ancestor",
                    concat(text(), "&#9;",
                        ancestor::gnc:account[1]/act:name/text())
                    )[1] )
            ]'
        >
            <xsl:sort select='text()' />
            <xsl:sort select='ancestor::gnc:account[1]/act:name/text()' />
        </xsl:apply-templates>
    </xsl:template>

    <!-- One output line per entry: target, tab, receiving account, newline -->
    <xsl:template  match='/gnc-v2//gnc:account[@version="2.0.0"]/act:slots/
        slot[slot:key/text()="import-map-bayes"]/slot:value[@type="frame"]/
        slot/slot:value[@type="frame"]/slot/slot:key'
    >
        <xsl:value-of select='text()' /><xsl:text>&#9;</xsl:text>
        <xsl:value-of select='ancestor::gnc:account[1]/act:name/text()' />
        <xsl:text>&#10;</xsl:text>
    </xsl:template>

</xsl:stylesheet>

reset-import-map-bayes-history.xslt

The following XSLT stylesheet resets the import mapping, by deleting all the Bayes mapping data for every account in a gnucash XML file.

Usage: xsltproc -o my_bookkeeping_new.gnucash.xml reset-import-map-bayes-history.xslt my_bookkeeping.gnucash.xml

<?xml version="1.0" encoding="utf-8" ?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="1.0"
    xmlns:gnc="http://www.gnucash.org/XML/gnc"
    xmlns:act="http://www.gnucash.org/XML/act"
    xmlns:slot="http://www.gnucash.org/XML/slot"
    >
    <!-- reset-import-map-bayes-history.xslt
        Last updated 2016-02-15. For GnuCash 2.6.11 and similar.
        Transforms a GnuCash data file (in uncompressed XML format) by
        removing all history data used by the Generic Importer feature,
        for Bayesian matching of imported transactions to accounts.
        Result is an equivalent GnuCash data file, with all accounts and
        transactions intact, and only the history data gone.
        Copyright Jim DeLaHunt (jdlh.com) 2016.
        Licensed under the GNU General Public License, Version 2, or
        (at your option) Version 3.  No liability and no warranty. -->
    <xsl:output method="xml" omit-xml-declaration="no" indent="yes" encoding="utf-8" />

    <!-- Identity transform: copy everything that no other template matches -->
    <xsl:template match="node() | @*">
        <xsl:copy>
            <xsl:apply-templates select="node() | @*" />
        </xsl:copy>
    </xsl:template>

    <!-- Drop slot with slot:key import-map-bayes entirely -->
    <xsl:template match='gnc-v2//gnc:account[@version="2.0.0"]/act:slots/
        slot[slot:key/text()="import-map-bayes"]' />
    <!-- Drop the whitespace text node just before the dropped slot,
         to close up the blank line -->
    <xsl:template match='gnc-v2//gnc:account[@version="2.0.0"]/act:slots/
        text()[following-sibling::slot[1]/slot:key/text()="import-map-bayes"]' />
    <!-- The redundancy in match patterns can't be helped. We can't put the
    common part in a variable. XSLT 1.0 spec says, "It is an error for
    the value of the match attribute to contain a VariableReference." -->

</xsl:stylesheet>

prune-import-map-bayes-accounts.xslt

The following XSLT stylesheet prunes the import mapping data for certain target accounts in a gnucash XML file. You need to edit the XSLT stylesheet to specify which account names to prune, using a plain-text editor. There are comments in the XSLT stylesheet which show a 5-line block of text to duplicate and adapt for each account you want to prune.

Usage: xsltproc -o my_bookkeeping_new.gnucash.xml prune-import-map-bayes-accounts.xslt my_bookkeeping.gnucash.xml

Note: this stylesheet leaves behind blank lines where the pruned elements used to be. I haven’t been able to figure out how to prevent those blank lines. This is the subject of a StackOverflow question, How can my XSLT filter avoid leaving blank lines in output XML when deleting elements, without changing indentation otherwise? If you, gentle reader, know how to do this, I would appreciate the answer on the StackOverflow page or as a comment to this post.

<?xml version="1.0" encoding="utf-8" ?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="1.0"
    xmlns:gnc="http://www.gnucash.org/XML/gnc"
    xmlns:act="http://www.gnucash.org/XML/act"
    xmlns:slot="http://www.gnucash.org/XML/slot"
    >
    <!-- prune-import-map-bayes-accounts.xslt
        Last updated 2016-02-15. For GnuCash 2.6.11 and similar.

        Transforms a GnuCash data file (in uncompressed XML format) by
        removing some history data used by the Generic Importer feature,
        for Bayesian matching of imported transactions to accounts.
        This is especially helpful if you have changed names of accounts,
        or moved them to different parents, but the history data still
        refers to the old account names, and overwhelms the history
        data of new names. (See bug https://bugzilla.gnome.org/show_bug.cgi?id=745896
        "Bayes Import Map does not use GUID but account name, leading to outdated entries.")
        Result is an equivalent GnuCash data file, with all accounts and
        transactions intact, and the history data without the deleted entries.

        In order to use this tool, you must edit this file to supply
        names of accounts you want to drop from the matching history.
        Look for the comment "Template: copy this whole xsl:template…"
        below. Copy the 5-line template, and paste the copy right after
        the template. Modify the string in double-quotes to be the text
        at the start of the target account name you want to prune. (It
        can even be the entire account name.)
        Save this xslt file.  Then run your GnuCash uncompressed XML file
        and this XSLT stylesheet through an XSLT processor, generating a
        pruned GnuCash uncompressed XML file.

        Copyright Jim DeLaHunt (jdlh.com) 2016.
        Licensed under the GNU General Public License, Version 2, or
        (at your option) Version 3.  No liability and no warranty. -->
    <xsl:output method="xml" omit-xml-declaration="no" indent="yes" encoding="utf-8" />

    <!-- Identity transform: copy everything that no other template matches -->
    <xsl:template match="node() | @*">
        <xsl:copy>
            <xsl:apply-templates select="node() | @*" />
        </xsl:copy>
    </xsl:template>

    <!-- Prune import-map-bayes child slots, which match these patterns  -->
    <!-- Template: copy this whole xsl:template (5 lines), then modify the copy
             to have your account name, in quotes, in place of "Imbalance". -->
    <xsl:template match='gnc-v2//gnc:account[@version="2.0.0"]/act:slots/
        slot[slot:key/text()="import-map-bayes"]/slot:value[@type="frame"]/
        slot/slot:value[@type="frame"]/slot[starts-with(slot:key/text(),
            "Imbalance"
    )]' />
    <xsl:template match='gnc-v2//gnc:account[@version="2.0.0"]/act:slots/
        slot[slot:key/text()="import-map-bayes"]/slot:value[@type="frame"]/
        slot/slot:value[@type="frame"]/slot[starts-with(slot:key/text(),
            "Orphan"
    )]' />

    <!-- TODO: Close up the blank lines where the slots used to be. -->

</xsl:stylesheet>

Usage

Use list-import-map-bayes-accounts.xslt to list the account mapping data in your GnuCash file.

Decide for which accounts to prune the mapping data. Edit prune-import-map-bayes-accounts.xslt to specify those accounts. Then run prune-import-map-bayes-accounts.xslt on your GnuCash file, to get a new GnuCash file with that import mapping data gone.

Or, it may be simpler to throw out all the mapping data, and retrain the Bayes import mapper.  In that case, run reset-import-map-bayes-history.xslt on your GnuCash file, to get a new GnuCash file with all import mapping data gone.

Then, open this new GnuCash file with GnuCash. Check that its contents look fine. Use the menu option File… Save As… to put the new data file in place of the old. Adjust the Data Format setting as necessary. Finally, reselect the Preferences option “Compress files” if you wish.
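Before opening the new file in GnuCash, a quick scripted comparison can add some confidence that only the mapping data changed. The following Python sketch counts accounts, transactions, and import-map-bayes slots in the old and new files, and checks that a full reset kept the first two counts and brought the third to zero. The function names are my own, and it assumes that transactions are stored as <gnc:transaction> elements; the two inline documents are hypothetical stand-ins for real files.

```python
import xml.etree.ElementTree as ET

GNC = "{http://www.gnucash.org/XML/gnc}"
SLOT = "{http://www.gnucash.org/XML/slot}"


def summarize(root):
    """Count accounts, transactions, and import-map-bayes slots under root."""
    return {
        "accounts": sum(1 for _ in root.iter(GNC + "account")),
        "transactions": sum(1 for _ in root.iter(GNC + "transaction")),
        "bayes_slots": sum(
            1 for key in root.iter(SLOT + "key") if key.text == "import-map-bayes"
        ),
    }


def check_reset(before_root, after_root):
    """True if accounts and transactions survived and no Bayes data remains."""
    before, after = summarize(before_root), summarize(after_root)
    return (
        before["accounts"] == after["accounts"]
        and before["transactions"] == after["transactions"]
        and after["bayes_slots"] == 0
    )


# Hypothetical stand-ins for the files before and after the reset.
BAYES = '<slot><slot:key>import-map-bayes</slot:key><slot:value type="frame"/></slot>'
BEFORE = f"""<gnc-v2 xmlns:gnc="http://www.gnucash.org/XML/gnc"
        xmlns:act="http://www.gnucash.org/XML/act"
        xmlns:slot="http://www.gnucash.org/XML/slot">
  <gnc:account version="2.0.0"><act:slots>{BAYES}</act:slots></gnc:account>
  <gnc:transaction version="2.0.0"/>
</gnc-v2>"""
AFTER = BEFORE.replace(BAYES, "")

print(check_reset(ET.fromstring(BEFORE), ET.fromstring(AFTER)))
```

With real files, replace the ET.fromstring(...) calls with ET.parse(path).getroot(). After a prune rather than a full reset, some Bayes data remains by design, so compare only the account and transaction counts from summarize.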