XSLT - remove whitespace from template

XmlXslt

Xml Problem Overview


I am using XML to store a small contact list and trying to write an XSL template that will transform it into a CSV file. The problem I am having is with whitespace in the output.

The output:

Friend, John, Smith, Home,
		123 test,
	   Sebastopol,
	   California,
	   12345,
	 Home 1-800-123-4567, Personal [email protected]

I have indented/spaced both the source XML file and the associated XSL Template to make it easier to read and develop, but all that extra white space is getting itself into the output. The XML itself doesn't have extra whitespace inside the nodes, just outside of them for formatting, and the same goes for the XSLT.

In order for the CSV file to be valid, each entry needs to be on its own line, not broken up. Besides stripping all extra white space from the XML and XSLT (making them just one long line of code), is there another way to get rid of the whitespace in the output?

Edit: Here is a small XML sample:

<PHONEBOOK>
	<LISTING>
		<FIRST>John</FIRST>
		<LAST>Smith</LAST>
		<ADDRESS TYPE="Home">
			<STREET>123 test</STREET>
			<CITY>Sebastopol</CITY>
			<STATE>California</STATE>
			<ZIP>12345</ZIP>
		</ADDRESS>
		<PHONE>1-800-123-4567</PHONE>
		<EMAIL>[email protected]</EMAIL>
		<RELATION>Friend</RELATION>
	</LISTING>
</PHONEBOOK>

And here is the XSLT:

<?xml version="1.0" ?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text" />

 <xsl:template match="/">
   <xsl:for-each select="//LISTING">
	<xsl:value-of select="RELATION" /><xsl:text>, </xsl:text>
	<xsl:value-of select="FIRST" /><xsl:text>, </xsl:text>
	<xsl:value-of select="LAST" /><xsl:text>, </xsl:text>

	<xsl:if test="ADDRESS">
	 <xsl:for-each select="ADDRESS">
	   <xsl:choose>
		<xsl:when test="@TYPE">
		 <xsl:value-of select="@TYPE" />,
		</xsl:when>
			<xsl:otherwise>
			<xsl:text>Home </xsl:text>
			</xsl:otherwise>
	   </xsl:choose>
	   <xsl:value-of select="STREET" />,
	   <xsl:value-of select="CITY" />,
	   <xsl:value-of select="STATE" />,
	   <xsl:value-of select="ZIP" />,
	 </xsl:for-each>
	</xsl:if>

	<xsl:for-each select="PHONE">
	  <xsl:choose>
	   <xsl:when test="@TYPE">
		<xsl:value-of select="@TYPE" />  
	   </xsl:when>
	   <xsl:otherwise><xsl:text>Home </xsl:text></xsl:otherwise>
	  </xsl:choose>
	 <xsl:value-of select="."  /><xsl:text  >, </xsl:text>
	</xsl:for-each>

	<xsl:if test="EMAIL">
	 <xsl:for-each select="EMAIL">
	  <xsl:choose>
	   <xsl:when test="@TYPE">
		<xsl:value-of select="@TYPE" /><xsl:text  > </xsl:text> 
	   </xsl:when>
	   <xsl:otherwise><xsl:text  >Personal </xsl:text></xsl:otherwise>
	  </xsl:choose>
	  <xsl:value-of select="."  /><xsl:text  >, </xsl:text>
	 </xsl:for-each>
	</xsl:if>
	<xsl:text>&#10;&#13;</xsl:text>
   </xsl:for-each>
 </xsl:template>

</xsl:stylesheet>

Xml Solutions


Solution 1 - Xml

In XSLT, white-space is preserved by default, since it can very well be relevant data.

The best way to prevent unwanted white-space in the output is not to create it in the first place. Don't do:

<xsl:template match="foo">
  foo
</xsl:template>

because that's "\n··foo\n", from the processor's point of view. Rather do

<xsl:template match="foo">
  <xsl:text>foo</xsl:text>
</xsl:template>

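White-space in the stylesheet is ignored only when it forms a text node made up entirely of white-space, that is, when it sits purely between XML elements. As soon as a text node contains any other character, all of its white-space is kept. Simply put: never use "naked" text anywhere in your XSLT code, always enclose it in an <xsl:text> element.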

Also, using an unspecific:

<xsl:apply-templates />

is problematic, because the default XSLT rule for text nodes says "copy them to the output". This applies to "white-space-only" nodes as well. For instance:

<xml>
  <data> value </data>
</xml>

contains three text nodes:

  1. "\n··" (right after <xml>)
  2. "·value·"
  3. "\n" (right before </xml>)

To keep #1 and #3 from sneaking into the output (which is the most common reason for unwanted spaces), you can override the default rule for text nodes by declaring an empty template:

<xsl:template match="text()" />

All text nodes are now muted and text output must be created explicitly:

<xsl:value-of select="data" />
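
Put together, a minimal sketch of this approach for the <xml>/<data> sample (the complete stylesheet and the match="data" template are assumptions added for illustration) could look like this:

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Mute the built-in rule that copies every text node to the output -->
  <xsl:template match="text()"/>

  <!-- Create the output explicitly; xsl:value-of reads the string value
       directly and is not affected by the empty text() template -->
  <xsl:template match="data">
    <xsl:value-of select="."/>
  </xsl:template>
</xsl:stylesheet>
```

With the sample above, only the string value of <data> reaches the output, including its own leading and trailing space; the indentation around it is dropped.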

To remove white-space from a value, you could use the normalize-space() XPath function:

<xsl:value-of select="normalize-space(data)" />

Be careful, though: the function strips leading and trailing white-space and collapses every internal run of white-space to a single space, e.g. "·value··1·" would become "value·1".

Additionally you can use the <xsl:strip-space> and <xsl:preserve-space> elements, though usually this is not necessary (and personally, I prefer explicit white-space handling as indicated above).
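
As a rough sketch of the <xsl:strip-space> route, assuming the PHONEBOOK sample from the question: the declaration is a top-level element and removes white-space-only text nodes from the source tree before any template runs. It does not affect literal text typed into the stylesheet itself, so separators still belong in <xsl:text>. The match="LISTING" template is only an illustration.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Remove white-space-only text nodes (indentation, line breaks)
       from every element of the source document -->
  <xsl:strip-space elements="*"/>

  <!-- One output line per LISTING; all separators are explicit -->
  <xsl:template match="LISTING">
    <xsl:value-of select="FIRST"/>
    <xsl:text>, </xsl:text>
    <xsl:value-of select="LAST"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

Without the <xsl:strip-space> line, the built-in rules would also copy the indentation between <PHONEBOOK> and <LISTING> to the output.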

Solution 2 - Xml

By default, XSLT keeps white-space-only text nodes from the source document, as if <xsl:preserve-space elements="*"/> were in effect, and they can end up in your output. You can add <xsl:strip-space elements="*"/> at the top level of the stylesheet to tell the processor which elements to strip them from.

You may also need to include a normalize-space() call, like so:

<xsl:template match="text()"><xsl:value-of select="normalize-space(.)"/></xsl:template> 

Here is an example for preserve/strip space from W3 Schools.

Solution 3 - Xml

As far as removing tabs but retaining separate lines, I tried the following XSLT 1.0 approach, and it works rather well. Whether you use version 1.0 or 2.0 largely depends on which platform you're using. It looks like .NET is still dependent on XSLT 1.0, so there you're limited to extremely messy templates (see below). If you're using Java or something else, please refer to the much cleaner XSLT 2.0 approach listed towards the very bottom.

These examples are meant to be extended by you to meet your specific needs. I'm using tabs here as an example, but this should be generic enough to be extensible.

XML:

<?xml version="1.0" encoding="UTF-8"?>
<text>
    	adslfjksdaf
        
    			dsalkfjdsaflkj
          
    		lkasdfjlsdkfaj
</text>

...and the XSLT 1.0 template (required if you use .NET):

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet  
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">   
 <xsl:template name="search-and-replace">
   <xsl:param name="input"/>
   <xsl:param name="search-string"/>
   <xsl:param name="replace-string"/>
   <xsl:choose>
    <xsl:when test="$search-string and 
                    contains($input,$search-string)">
       <xsl:value-of
           select="substring-before($input,$search-string)"/>
       <xsl:value-of select="$replace-string"/>
       <xsl:call-template name="search-and-replace">
         <xsl:with-param name="input"
               select="substring-after($input,$search-string)"/>
         <xsl:with-param name="search-string"
               select="$search-string"/>
         <xsl:with-param name="replace-string"
               select="$replace-string"/>
       </xsl:call-template>
    </xsl:when>
    <xsl:otherwise>
      <xsl:value-of select="$input"/>
    </xsl:otherwise>
   </xsl:choose>
  </xsl:template>                
  <xsl:template match="text">
   <xsl:call-template name="search-and-replace">
     <xsl:with-param name="input" select="text()" />
     <xsl:with-param name="search-string" select="'&#x9;'" />
     <xsl:with-param name="replace-string" select="''" />
   </xsl:call-template>    
  </xsl:template>
</xsl:stylesheet>

XSLT 2.0 makes this trivial with the replace function:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" 
      xmlns:xs="http://www.w3.org/2001/XMLSchema"
      exclude-result-prefixes="xs"
      version="2.0">
 <xsl:template match="text">
  <xsl:value-of select="replace(text(), '&#x9;', '')" />
 </xsl:template>
</xsl:stylesheet>
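
If you are stuck on XSLT 1.0 and only need to delete individual characters such as tabs, the XPath 1.0 translate() function may be enough and avoids the recursive template. This is a sketch under that assumption; it cannot replace multi-character strings, and the <xsl:output method="text"/> line is an addition not present in the templates above.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="text"/>

  <xsl:template match="text">
    <!-- translate() maps each character of the second argument to the
         character at the same position in the third; characters with no
         counterpart (here: the tab) are removed -->
    <xsl:value-of select="translate(., '&#x9;', '')"/>
  </xsl:template>
</xsl:stylesheet>
```

Line breaks and ordinary spaces in the <text> element are left untouched, since only the tab character is listed.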

Solution 4 - Xml

Others have already pointed out the general problem. The specific one in your stylesheet is that you forgot <xsl:text> for the commas:

   <xsl:choose>
    <xsl:when test="@TYPE">
     <xsl:value-of select="@TYPE" />,
    </xsl:when>
    <xsl:otherwise>Home </xsl:otherwise>
   </xsl:choose>
   <xsl:value-of select="STREET" />,
   <xsl:value-of select="CITY" />,
   <xsl:value-of select="STATE" />,
   <xsl:value-of select="ZIP" />,

This makes whitespace following every comma significant, and so it ends up in the output. If you wrap each comma in <xsl:text>, the problem disappears.

Also, get rid of that disable-output-escaping. It doesn't do anything here, since you're not outputting XML.
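
An alternative way to get the same effect, offered only as a sketch: the XPath concat() function can build the address portion inside <xsl:value-of>, so no literal text is left in the template to carry stray whitespace. Element names follow the question's sample; the surrounding stylesheet, the trailing line break, and the "Home, " default (with a comma, to match the @TYPE branch) are assumptions.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <xsl:for-each select="PHONEBOOK/LISTING/ADDRESS">
      <xsl:choose>
        <xsl:when test="@TYPE">
          <xsl:value-of select="concat(@TYPE, ', ')"/>
        </xsl:when>
        <xsl:otherwise>
          <!-- Assumed default, written with a comma like the branch above -->
          <xsl:text>Home, </xsl:text>
        </xsl:otherwise>
      </xsl:choose>
      <!-- All separators live inside the XPath expression -->
      <xsl:value-of select="concat(STREET, ', ', CITY, ', ', STATE, ', ', ZIP, ', ')"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

The whitespace between the elements inside the template is still there, but it forms whitespace-only text nodes, which the processor discards.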

Solution 5 - Xml

My previous answer is wrong; every comma must be output via an <xsl:text> element:

<?xml version="1.0" ?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="text"/>
    <xsl:template match="/PHONEBOOK">
        <xsl:for-each select="LISTING">
            <xsl:value-of select="RELATION" /><xsl:text>, </xsl:text>
            <xsl:value-of select="FIRST" /><xsl:text>, </xsl:text>
            <xsl:value-of select="LAST" /><xsl:text>, </xsl:text>
            
                <xsl:for-each select="ADDRESS">
                    <xsl:choose>
                        <xsl:when test="@TYPE">
                            <xsl:value-of select="@TYPE" /><xsl:text>,</xsl:text>
                        </xsl:when>
                        <xsl:otherwise><xsl:text>Home </xsl:text></xsl:otherwise>
                    </xsl:choose>
                <xsl:value-of select="STREET/text()" /><xsl:text>,</xsl:text>
                    <xsl:value-of select="CITY/text()" /><xsl:text>,</xsl:text>
                    <xsl:value-of select="STATE/text()" /><xsl:text>,</xsl:text>
                    <xsl:value-of select="ZIP/text()" /><xsl:text>,</xsl:text>
                </xsl:for-each>
            
            <xsl:for-each select="PHONE">
                <xsl:choose>
                    <xsl:when test="@TYPE">
                        <xsl:value-of select="@TYPE" /><xsl:text> </xsl:text>
                    </xsl:when>
                    <xsl:otherwise><xsl:text>Home </xsl:text></xsl:otherwise>
                </xsl:choose>
                <xsl:value-of select="."  /><xsl:text  >, </xsl:text>
            </xsl:for-each>
            
            <xsl:if test="EMAIL">
                <xsl:for-each select="EMAIL">
                    <xsl:choose>
                        <xsl:when test="@TYPE">
                            <xsl:value-of select="@TYPE" /><xsl:text  > </xsl:text> 
                        </xsl:when>
                        <xsl:otherwise><xsl:text  >Personal </xsl:text></xsl:otherwise>
                    </xsl:choose>
                    <xsl:value-of select="."  /><xsl:text  >, </xsl:text>
                </xsl:for-each>
            </xsl:if>
            <xsl:text>&#13;&#10;</xsl:text>
        </xsl:for-each>
    </xsl:template>
    <xsl:template match="text()|@*">
        <xsl:text>-</xsl:text>
    </xsl:template>
    
</xsl:stylesheet>

Solution 6 - Xml

This may not be a direct answer to the problem, but it is a general way to solve this kind of issue. Create a named template:

<xsl:template name="strip-space">
    <xsl:param name="data"/>
    <xsl:value-of select="normalize-space($data)"/>
</xsl:template>

Now call it to remove excess white-space:

<xsl:template match="my-element">
    <xsl:call-template name="strip-space">
        <xsl:with-param name="data">
            <xsl:apply-templates/>
        </xsl:with-param>
    </xsl:call-template>
</xsl:template>

For example, consider the below XML fragment:

<?xml version="1.0" encoding="UTF-8"?>
<test>
    <my-element>
        <e1>some text</e1> <e2>some other text</e2> <e3>some other text</e3>
    </my-element>
</test>

Suppose someone wants to convert it to the text below:

{test{my-element{e1some text} {e2some other text} {e3some other text}}}

Now comes the stylesheet:

<?xml version="1.0" ?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="text" />
    
    <xsl:template match="/">
        <xsl:apply-templates mode="t1"/>
        <xsl:text>&#xa;</xsl:text>
        <xsl:apply-templates mode="t2"/>
    </xsl:template>
    
    <xsl:template match="*" mode="t1">
        <xsl:text>{</xsl:text>
        <xsl:value-of select="local-name()"/>
        <xsl:call-template name="strip-space">
            <xsl:with-param name="data">
                <xsl:apply-templates mode="t1"/>
            </xsl:with-param>
        </xsl:call-template>
        <xsl:text>}</xsl:text>
    </xsl:template>
    
    <xsl:template match="*" mode="t2">
        <xsl:text>{</xsl:text>
        <xsl:value-of select="local-name()"/>
        <xsl:value-of select="."/>
        <xsl:text>}</xsl:text>
    </xsl:template>
    
    <xsl:template name="strip-space">
        <xsl:param name="data"/>
        <xsl:value-of select="normalize-space($data)"/>
    </xsl:template>
    
</xsl:stylesheet>

After applying the stylesheet, it produces:

{test{my-element{e1some text} {e2some other text} {e3some other text}}}

{test
    
        some text some other text some other text
    
}

The output shows how mode="t1" (the xsl:call-template approach) differs from mode="t2" (the <xsl:value-of select="."/> approach). Hope this helps somebody.

Solution 7 - Xml

Add this template to your XSLT:

<xsl:template match="text()"/>

Solution 8 - Xml

If the raw XML file is produced by your own code, check how that code formats it. With the indented formatting property set, the writer itself adds the extra blank white-space, which then shows up in the exported Excel file.

Remove or comment out the lines related to formatting the XML, such as the line below, and try again.

xmlWriter.Formatting = System.Xml.Formatting.Indented;

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | Robert DeBoer | View Question on Stackoverflow
Solution 1 - Xml | Tomalak | View Answer on Stackoverflow
Solution 2 - Xml | Noah Heldman | View Answer on Stackoverflow
Solution 3 - Xml | David Andres | View Answer on Stackoverflow
Solution 4 - Xml | Pavel Minaev | View Answer on Stackoverflow
Solution 5 - Xml | Nick Groznykh | View Answer on Stackoverflow
Solution 6 - Xml | Cylian | View Answer on Stackoverflow
Solution 7 - Xml | Nick Groznykh | View Answer on Stackoverflow
Solution 8 - Xml | Tejas Sawant | View Answer on Stackoverflow