An XSLT to turn XML into CSV

By Confusion on Friday 29 August 2008 21:23 - Comments (3)
Categories: Software engineering, XML, Views: 10.866

Today I needed to turn an XML file into a CSV file. I was sure someone would have solved this problem before, but I could not find a suitable XSLT stylesheet. The problem can be separated into two subproblems. The first is 'flattening' the XML, by which I mean turning it from

XML:
<root>
  <element>
    <foo>1</foo>
    <bar>
      <baz>2</baz>
      <fooz>3</fooz>
    </bar>
  </element>
  <element>
    <foo>1</foo>
  </element>
</root>


into

XML:
<root>
  <element>
    <foo>1</foo>
    <bar.baz>2</bar.baz>
    <bar.fooz>3</bar.fooz>
  </element>
  <element>
    <foo>1</foo>
  </element>
</root>


since I want to convert each 'element' into one CSV line.
The 'namespaced' element names are required because they serve as the CSV column headers, which must be unique. Prefixing each name with the name of its parent guarantees that in our case.
The other subproblem is the actual conversion from XML to CSV, where the main challenge was making sure the last value on a line is not followed by a comma.
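
The dotted column names can also be generated directly as a header row, without first producing the flattened XML. The following is only a minimal sketch in XSLT 1.0, not necessarily how the flattening was done for this post. It assumes the document looks like the sample above (a 'root' containing 'element' nodes) and that the first 'element' contains every column that occurs in the file.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Assumption: the leaves of the first 'element' define all CSV columns. -->
  <xsl:template match="/">
    <!-- Visit every descendant of the first 'element' that has no child elements. -->
    <xsl:for-each select="/root/element[1]//*[not(*)]">
      <!-- Print the names of the ancestors that lie below 'element',
           outermost first, each followed by a dot. -->
      <xsl:for-each select="ancestor::*[ancestor::element]">
        <xsl:value-of select="name()"/>
        <xsl:text>.</xsl:text>
      </xsl:for-each>
      <xsl:value-of select="name()"/>
      <!-- Separator after every column name except the last one. -->
      <xsl:if test="position() != last()">
        <xsl:text>,</xsl:text>
      </xsl:if>
    </xsl:for-each>
    <xsl:text>&#x0A;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

For the first sample document this prints the single line foo,bar.baz,bar.fooz. Because all leaves are selected as one node-set, position() and last() refer to the complete list of columns, so the trailing comma is easy to avoid here.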

In the end, I came up with the templates below to print the values of the 'childless' elements.

XML:
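<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <!-- Stylesheet wrapper added so the templates can be run as they are: emit plain
         text and drop whitespace-only text nodes, which would otherwise end up in the CSV. -->
    <xsl:output method="text"/>
    <xsl:strip-space elements="*"/>

    <!-- One CSV line per 'element'. -->
    <xsl:template match="//element">
        <xsl:apply-templates select="*"/>
        <xsl:text>&#x0A;</xsl:text>
    </xsl:template>

    <!-- Every descendant of an 'element': recurse into containers, quote leaf values. -->
    <xsl:template match="//element//*">
        <xsl:choose>
            <xsl:when test="count(child::*) > 0">
                <xsl:apply-templates select="*"/>
            </xsl:when>
            <xsl:otherwise>
                <xsl:text>"</xsl:text>
                <xsl:value-of select="."/>
                <xsl:text>"</xsl:text>
            </xsl:otherwise>
        </xsl:choose>
        <!-- Comma after every node except the last one among its siblings. -->
        <xsl:if test="position() != last()">
            <xsl:text>,</xsl:text>
        </xsl:if>
    </xsl:template>
</xsl:stylesheet>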


The upper template applies the lower template to the child elements of each 'element' node, one 'element' at a time, and ends each line with a newline. The lower template checks whether the node has any child elements. If it has none, it prints the node's value in quotes. Otherwise, it recursively applies itself to those child elements. Finally, a comma is emitted after every node that is not the last in its node-set. The tricky part is that no comma is emitted after the last grandchild of an 'element', so you might expect a comma to be missing between that value and the next one. That comma is supplied by the parent: if the parent is not the last in its own node-set, a comma is emitted after the parent has processed its children.
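
For comparison, here is a sketch of an alternative that is not the approach used above: select all childless descendants of an 'element' as a single node-set, so that position() and last() count across nesting levels and no parent has to supply a comma for its children. It assumes the same sample structure, and like the templates above it does not escape double quotes inside values.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <!-- Drop whitespace-only text nodes so the built-in templates print nothing extra. -->
  <xsl:strip-space elements="*"/>

  <xsl:template match="element">
    <!-- All leaves of this 'element', at any depth, in document order. -->
    <xsl:for-each select=".//*[not(*)]">
      <xsl:text>"</xsl:text>
      <xsl:value-of select="."/>
      <xsl:text>"</xsl:text>
      <!-- position() and last() now refer to the whole row, not to siblings. -->
      <xsl:if test="position() != last()">
        <xsl:text>,</xsl:text>
      </xsl:if>
    </xsl:for-each>
    <xsl:text>&#x0A;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

For the first sample document this prints "1","2","3" on the first line and "1" on the second. The recursive version keeps the tree walk explicit, which makes it easier to treat particular nodes differently; this version trades that for a simpler comma rule.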


Comments


By Tweakers user pasz, Saturday 30 August 2008 16:54

Thanx for the solution, but why does every company have the need to export CSV files....

By Tweakers user Confusion, Sunday 31 August 2008 14:02

There are two major reasons:
1) They have been in business for quite a while, started exporting data in CSV format long ago (when XML was not mainstream yet) and still have to support customers that haven't switched (we are talking business to business here) or
2) They are dealing with new customers that wish to receive data in CSV format, because they can not (or do not want to) handle XML

By Maolis Tsilikidis, Tuesday 26 July 2011 16:51

I have to convert XML files to .csv format. I can write the XSLT code for the transformation, but how do I run it? I am new to all this and I would appreciate any help given. Thank you.
