How do I include &, <, > etc in XML attribute values
Java Problem Overview
I want to create an XML file that will store the structure of a Java program. I can successfully parse the Java program and create the tags as required. The problem arises when I try to include the source code inside my tags, since Java source code may contain many reserved characters such as &, < and >. As a result, I am not able to create valid XML.
My XML should go like this:
<?xml version="1.0"?>
<prg name="prg_name">
<class name="class_name">
<parent>parent class</parent>
<interface>Interface name</interface>
.
.
.
<method name= "method_name">
<statement>the ordinary java statement</statement>
<if condition="Conditional Expression">
<statement> true statements </statement>
</if>
<else>
<statement> false statements </statement>
</else>
<statement> usual control statements </statement>
.
.
.
</method>
</class>
.
.
.
</prg>
Like this, but the problem is that the conditional expressions of if and other statements contain many & characters and other reserved symbols, which prevents the XML from validating. Since all this data (source code) is supplied by the user, I have little control over it. Escaping the characters will be very costly in terms of time.
I can use CDATA to escape the element text, but it cannot be used for attribute values containing conditional expressions. I am using the ANTLR Java grammar to parse the Java program and obtain the attributes and content for the tags. Is there any other workaround?
Java Solutions
Solution 1 - Java
You will have to escape
" to &quot;
' to &apos;
< to &lt;
> to &gt;
& to &amp;
for XML.
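As a rough illustration of the five replacements above, here is a minimal sketch of a single-pass escaper. The class name XmlEscape, the method name escapeXml and the sample condition string are made up for this example and do not come from the question. Because it visits each character once, the cost grows linearly with the length of the source text.

```java
public class XmlEscape {

    /** Replaces the five XML special characters with their predefined entities. */
    public static String escapeXml(String text) {
        StringBuilder out = new StringBuilder(text.length() + 16);
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&apos;"); break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '&':  out.append("&amp;");  break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Hypothetical condition text taken from a parsed Java "if" statement.
        String condition = "a < b && name != \"x\"";
        // Prints: <if condition="a &lt; b &amp;&amp; name != &quot;x&quot;"/>
        System.out.println("<if condition=\"" + escapeXml(condition) + "\"/>");
    }
}
```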
Solution 2 - Java
In XML attributes you must escape
" with &quot;
< with &lt;
& with &amp;
if you wrap attribute values in double quotes ("), e.g.
<MyTag attr="If a&lt;b &amp; b&lt;c then a&lt;c, it's obvious"/>
meaning tag MyTag with attribute attr with text If a<b & b<c then a<c, it's obvious. Note: there is no need to use &apos; to escape the ' character.
If you wrap attribute values in single quotes (') then you should escape these characters:
' with &apos;
< with &lt;
& with &amp;
and you can write " as is.
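The two sets of rules differ only in which quote character needs escaping, so they can be sketched as one helper that takes the delimiter as a parameter. This is an illustrative sketch, not a library API: the class name AttrEscape and the method name escapeAttr are invented here.

```java
public class AttrEscape {

    /**
     * Escapes text for an attribute value delimited by the given quote
     * character ('"' or '\''). Only '<', '&' and the delimiter itself are
     * replaced; the other quote character and '>' are written as is.
     */
    public static String escapeAttr(String value, char quote) {
        StringBuilder out = new StringBuilder(value.length() + 16);
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            if (c == '<') {
                out.append("&lt;");
            } else if (c == '&') {
                out.append("&amp;");
            } else if (c == quote) {
                out.append(quote == '"' ? "&quot;" : "&apos;");
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String text = "If a<b & b<c then a<c, it's obvious";
        // Double-quoted attribute: the apostrophe stays as is.
        System.out.println("<MyTag attr=\"" + escapeAttr(text, '"') + "\"/>");
        // Single-quoted attribute: the apostrophe becomes &apos;.
        System.out.println("<MyTag attr='" + escapeAttr(text, '\'') + "'/>");
    }
}
```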
Escaping > with &gt; in attribute text is not required; e.g. <a b=">"/> is well-formed XML.
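One way to check these claims is to feed the snippets to the JDK's built-in DOM parser and read the attribute back. The sketch below assumes the standard javax.xml.parsers API; the class name AttrParseCheck and the helper readRootAttr are hypothetical names chosen for this example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class AttrParseCheck {

    /**
     * Parses a small XML document and returns the named attribute of its root
     * element. Parse failures (e.g. ill-formed input) are rethrown unchecked.
     */
    public static String readRootAttr(String xml, String attrName) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getAttribute(attrName);
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse: " + xml, e);
        }
    }

    public static void main(String[] args) {
        // An unescaped '>' inside an attribute value is accepted by the parser.
        System.out.println(readRootAttr("<a b=\">\"/>", "b"));
        // The escaped entities are turned back into the original characters.
        System.out.println(readRootAttr(
                "<MyTag attr=\"If a&lt;b &amp; b&lt;c then a&lt;c, it's obvious\"/>",
                "attr"));
    }
}
```

If the parser accepts the input, the value read back is the original, unescaped text, so the escaping is only visible in the serialized file.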