org.xml.sax.SAXParseException: Premature end of file for *VALID* XML

Tags: Java, XML, XML Parsing

Java Problem Overview


For the last few days I have been getting a very strange "Premature end of file." exception on one of our servers. The same configuration XML works fine on another server. Both servers run Tomcat 5.0.28. This code has been working for ages (7+ years); the problem appeared on one of the servers only after a recent server crash. Neither the XML nor the Java parsing code has changed. :(

The only difference I can see is in Java versions -

Problem server:
java version "1.6.0_16"
Java(TM) SE Runtime Environment (build 1.6.0_16-b01)
Java HotSpot(TM) 64-Bit Server VM (build 14.2-b01, mixed mode)

Working server:
java version "1.6.0_07"
Java(TM) SE Runtime Environment (build 1.6.0_07-b06)
Java HotSpot(TM) 64-Bit Server VM (build 10.0-b23, mixed mode)

Here is the Java code that has been working for several years -

private void readSource(final InputSource in ) {
	try {
		DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
		DocumentBuilder db = dbf.newDocumentBuilder();
		Document doc = db.parse(in);
		Element elt = doc.getDocumentElement();
	
		this.readElement( elt );
	} catch ( Exception ex ) {
		ex.printStackTrace();
		throw new ConfigurationException( "Unable to parse configuration information", ex );
	}
}

And here is the exception.

[Fatal Error] :-1:-1: Premature end of file.
org.xml.sax.SAXParseException: Premature end of file.
        at org.apache.xerces.parsers.DOMParser.parse(Unknown Source)
        at org.apache.xerces.jaxp.DocumentBuilderImpl.parse(Unknown Source)
        at com.circus.core.Configuration.readSource(Configuration.java:706)

I have already validated the XML and found no errors. Any idea where else I can look for the problem?

Any pointers would be highly appreciated!

TIA,
Manish

Java Solutions


Solution 1 - Java

This is a problem with how the Java InputStream is used. Once a stream has been read, its position is at the end of the data. A subsequent read from the same stream finds nothing left, and the parser reports this error. Either close the stream and open a new one, or call inputStream.reset() to move the position back. Note that reset() only works on streams that support mark/reset (markSupported() returns true).
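A minimal sketch of that behaviour, assuming an in-memory ByteArrayInputStream (which supports reset()) in place of the real configuration stream. The class name ConsumedStreamDemo and the sample XML are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;

class ConsumedStreamDemo {
    /** Returns the root element name, or "ERROR: ..." if parsing fails. */
    static String parse(InputStream in) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(in).getDocumentElement().getNodeName();
        } catch (Exception ex) {
            return "ERROR: " + ex.getMessage();
        }
    }

    /** Parses the same stream three times: fresh, exhausted, and after reset(). */
    static String[] run() {
        InputStream in = new ByteArrayInputStream(
                "<config><item/></config>".getBytes(StandardCharsets.UTF_8));
        String first = parse(in);   // reads the stream to its end
        String second = parse(in);  // nothing left to read, so the parser fails
        String third;
        try {
            in.reset();             // back to the start; only valid if markSupported()
            third = parse(in);
        } catch (Exception ex) {
            third = "ERROR: " + ex.getMessage();
        }
        return new String[] { first, second, third };
    }

    public static void main(String[] args) {
        for (String result : run()) {
            System.out.println(result);
        }
    }
}
```

For streams that do not support reset(), such as most file and network streams, opening a new stream is the only option.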

Solution 2 - Java

This is resolved. The problem was elsewhere: other code, run from a cron job, was truncating the XML file to zero length. I have taken care of that.
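A truncated file like this is easier to diagnose if the code checks for it before handing the file to the parser. The following is only a sketch of such a guard; EmptyFileGuard and parseNonEmpty are hypothetical names, not part of the original code.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

class EmptyFileGuard {
    /** Parses the file, but reports a zero-length file explicitly instead of letting the parser fail. */
    static Document parseNonEmpty(Path file) throws Exception {
        if (Files.size(file) == 0) {
            throw new IOException("XML file is empty (0 bytes): " + file);
        }
        try (InputStream in = Files.newInputStream(file)) {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(in);
        }
    }

    /** Tries a zero-length temp file first, then the same file with content written to it. */
    static String[] demo() {
        String[] results = new String[2];
        try {
            Path file = Files.createTempFile("config", ".xml"); // created empty
            try {
                try {
                    parseNonEmpty(file);
                    results[0] = "parsed";
                } catch (IOException ex) {
                    results[0] = "rejected";
                }
                Files.write(file, "<config/>".getBytes(StandardCharsets.UTF_8));
                results[1] = parseNonEmpty(file).getDocumentElement().getNodeName();
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (Exception ex) {
            results[1] = "ERROR: " + ex;
        }
        return results;
    }
}
```

The error message then names the file and the actual cause, rather than a parser location of -1:-1.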

Solution 3 - Java

This exception typically happens when the parser is given empty input, such as an empty String or an empty byte array.

Below is a snippet that reproduces it:

String xml = ""; // <-- deliberately an empty string.
ByteArrayInputStream xmlStream = new java.io.ByteArrayInputStream(xml.getBytes());
Unmarshaller u = JAXBContext.newInstance(...).createUnmarshaller();
u.setSchema(...);
u.unmarshal( xmlStream ); // <-- here it will fail

Solution 4 - Java

Please make sure that you are not consuming your InputStream anywhere before parsing. Sample code follows: `response` is the HTTP response, and the main content is contained in its StringEntity (getEntity()) in the form of an InputStream (getContent()).

InputStream rescontent = response.getEntity().getContent();
tsResponse=(TsResponse) transformer.convertFromXMLToObject(rescontent );
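If the content really does need to be read more than once (for example, logged and then parsed), one option is to read it into a byte array a single time and give each consumer its own stream over that array. This is a sketch under that assumption; BufferOnceDemo and its methods are illustrative names, and an in-memory stream stands in for response.getEntity().getContent().

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

class BufferOnceDemo {
    /** Reads the stream to its end and returns everything that was read. */
    static byte[] readFully(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        int n;
        while ((n = in.read(chunk)) != -1) {
            out.write(chunk, 0, n);
        }
        return out.toByteArray();
    }

    /** Uses the body twice (as text and as XML) without touching the original stream twice. */
    static String logThenParse(InputStream content) {
        try {
            byte[] body = readFully(content);                        // the only read of the original stream
            String text = new String(body, StandardCharsets.UTF_8); // first use, e.g. for logging
            System.out.println("response body: " + text);
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(body));          // second use: a fresh stream
            return doc.getDocumentElement().getNodeName();
        } catch (Exception ex) {
            return "ERROR: " + ex.getMessage();
        }
    }

    /** Stand-in for an HTTP response body. */
    static String demo() {
        InputStream content = new ByteArrayInputStream(
                "<tsResponse status=\"ok\"/>".getBytes(StandardCharsets.UTF_8));
        return logThenParse(content);
    }
}
```

Buffering the whole body is only reasonable for responses that fit comfortably in memory.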

Solution 5 - Java

This exception may also happen if the input stream is not handled properly. Make sure the stream has not already been used before the point where you intend to read it: if the same input stream is read a second time within a single operation, the second read gets this exception. Also make sure to close the input stream, for example in a finally block.

Solution 6 - Java

Are you sure that the XML file is in the correct character encoding? FileReader always uses the platform default encoding, so if the "working" server had a default encoding of (say) ISO-8859-1 and the "problem" server uses UTF-8 you would see this error if the XML contains any non-ASCII characters.

Does it work if you create the InputSource from a FileInputStream instead of a FileReader?
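The difference between the two kinds of InputSource can be sketched as follows. A byte stream lets the parser pick the encoding from the XML declaration, while a Reader has already decoded the bytes before the parser sees them. This example makes two assumptions for the sake of being self-contained: a ByteArrayInputStream stands in for the FileInputStream, and an InputStreamReader with an explicitly wrong charset stands in for a FileReader on a server whose default encoding does not match the file.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;

class EncodingSourceDemo {
    // ISO-8859-1 bytes of a document that declares its encoding and contains a non-ASCII character.
    static final byte[] LATIN1_XML =
            "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?><name>caf\u00e9</name>"
                    .getBytes(StandardCharsets.ISO_8859_1);

    /** Returns the text of the root element, or "ERROR: ..." if parsing fails. */
    static String rootText(InputSource source) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(source).getDocumentElement().getTextContent();
        } catch (Exception ex) {
            return "ERROR: " + ex.getMessage();
        }
    }

    /** Byte stream: the parser reads the declaration and decodes the bytes itself. */
    static String viaByteStream() {
        return rootText(new InputSource(new ByteArrayInputStream(LATIN1_XML)));
    }

    /** Reader: the bytes are decoded as UTF-8 before the parser ever sees them. */
    static String viaMismatchedReader() {
        return rootText(new InputSource(
                new InputStreamReader(new ByteArrayInputStream(LATIN1_XML), StandardCharsets.UTF_8)));
    }
}
```

Comparing the results of viaByteStream() and viaMismatchedReader() on a sample of the real file is a quick way to check whether encoding plays any part in the problem.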

Solution 7 - Java

A new occurrence of this same error, on December 6, 2021!

Sample traces:

XmlBeanDefinitionStoreException: Line -1 in XML document from ServletContext resource [<here a reference to spring context .xml file>] is invalid; nested exception is org.xml.sax.SAXParseException; Premature end of file.
Caused by: org.xml.sax.SAXParseException; Premature end of file.
at org.apache.xerces.util.ErrorHandlerWrapper.createSAXParseException(ErrorHandlerWrapper.java:201)
at org.apache.xerces.util.ErrorHandlerWrapper.fatalError(ErrorHandlerWrapper.java:175)
at org.apache.xerces.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:398)
at org.apache.xerces.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:325)
at org.apache.xerces.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:282)
at org.apache.xerces.impl.XMLVersionDetector.determineDocVersion(XMLVersionDetector.java:204)
at org.apache.xerces.impl.xs.opti.SchemaParsingConfig.parse(SchemaParsingConfig.java:576)
at org.apache.xerces.impl.xs.opti.SchemaParsingConfig.parse(SchemaParsingConfig.java:679)
at org.apache.xerces.impl.xs.opti.SchemaDOMParser.parse(SchemaDOMParser.java:527)
at org.apache.xerces.impl.xs.traversers.XSDHandler.getSchemaDocument(XSDHandler.java:2148)
at org.apache.xerces.impl.xs.traversers.XSDHandler.parseSchema(XSDHandler.java:557)

The HINT:

  1. The Apache Camel team decided to reorganize the Camel schema files that are commonly referenced, for instance, in Spring XML contexts. Instead of simply removing the usual schema location http://camel.apache.org/schema/spring/camel-spring-2.16.4.xsd (which would have produced a clean exception, easy to spot), they implemented an HTTP 301 redirect to HTTPS.

  2. The Apache Xerces library does not throw an error on an HTTP 301, but assumes it has received an empty file! This produces weird and misleading exceptions that put you on the wrong track, because they report the failure against the top-level XML file and not against the schema descriptor that is actually at fault.

  3. The complete revalidation of all schema descriptors and their XSD dependencies at WAR reload/deploy time (from the data/content repo cache) in our app server platform was totally unexpected, and is totally useless. Worse, it creates a runtime dependency on the public network just to reload components and descriptors that were already validated at build time.

1+2+3 above, and bang: a major service disruption. The production server was unable to reload any component with a dependency on Camel.

Fixes:

Two candidates:

a) In the xsi:schemaLocation attributes in the XML, simply add an 's' so that http://camel.apache.org/schema/etc becomes https://..., and accept living with a public network dependency on every component reload.

b) Replace all http://... schema locations with classpath: locations. You will need to download all XSDs and the schemas they depend on, and deploy them along with the WAR so that they are visible to the classloader. E.g. put the files in java/main/resources/somename.xsd and supply classpath:somename.xsd as the schema location path.
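The idea behind fix (b), serving the schema from a local copy instead of the network, can also be sketched outside Spring with plain JAXP. This is an analogue under assumptions, not what Spring's classpath: handling does internally: it uses an LSResourceResolver on a SchemaFactory, a made-up remote location (http://example.invalid/...), and inline strings in place of XSD files bundled with the WAR.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSInput;
import org.w3c.dom.ls.LSResourceResolver;

class LocalSchemaResolverDemo {
    // Hypothetical remote schema location that should never be fetched at runtime.
    static final String REMOTE = "http://example.invalid/schema/types.xsd";

    // Local copy of the remote schema (in a real deployment, a file on the classpath).
    static final String TYPES_XSD =
            "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
          + "<xs:simpleType name='Port'><xs:restriction base='xs:int'/></xs:simpleType>"
          + "</xs:schema>";

    // Main schema that refers to the remote location.
    static final String MAIN_XSD =
            "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
          + "<xs:include schemaLocation='" + REMOTE + "'/>"
          + "<xs:element name='port' type='Port'/>"
          + "</xs:schema>";

    /** Compiles the main schema and validates a document, answering the include locally. */
    static String validateOffline() {
        try {
            DOMImplementationLS ls = (DOMImplementationLS) DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().getDOMImplementation().getFeature("LS", "3.0");
            LSResourceResolver resolver = (type, namespaceUri, publicId, systemId, baseUri) -> {
                if (!REMOTE.equals(systemId)) {
                    return null; // not ours: fall back to the default lookup
                }
                LSInput input = ls.createLSInput();
                input.setSystemId(systemId);
                input.setStringData(TYPES_XSD); // local copy replaces the network fetch
                return input;
            };
            SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            factory.setResourceResolver(resolver);
            Schema schema = factory.newSchema(new StreamSource(new StringReader(MAIN_XSD)));
            schema.newValidator().validate(new StreamSource(new StringReader("<port>8080</port>")));
            return "OK";
        } catch (Exception ex) {
            return "ERROR: " + ex.getMessage();
        }
    }
}
```

With a resolver like this, a redirect or outage on the remote host no longer affects schema loading, because the remote location is never contacted.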

Solution 8 - Java

In our case it was an empty AndroidManifest.xml.

While upgrading Eclipse we ran into the usual trouble, and AndroidManifest.xml must have been checked into SVN by the build script after being clobbered.

Found it by compiling from inside Eclipse, instead of from the command line.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | Manish | View Question on Stackoverflow
Solution 1 - Java | dadob051 | View Answer on Stackoverflow
Solution 2 - Java | Manish | View Answer on Stackoverflow
Solution 3 - Java | Zo72 | View Answer on Stackoverflow
Solution 4 - Java | Amit Ty | View Answer on Stackoverflow
Solution 5 - Java | supernova | View Answer on Stackoverflow
Solution 6 - Java | Ian Roberts | View Answer on Stackoverflow
Solution 7 - Java | berhauz | View Answer on Stackoverflow
Solution 8 - Java | Danny Schoemann | View Answer on Stackoverflow