Bug in XMLStreamConverter for encoding

Post by jduff » Tue Jun 15, 2010 8:33 am

The method AbtXmlStreamConverter>>convertStream:withEncoding: has a bug in it.

Per the W3C's own XML Recommendation (http://www.w3.org/TR/REC-xml/#charencoding):

In the absence of information provided by an external transport protocol (e.g. HTTP or MIME), it is a fatal error for an entity including an encoding declaration to be presented to the XML processor in an encoding other than that named in the declaration, or for an entity which begins with neither a Byte Order Mark nor an encoding declaration to use an encoding other than UTF-8. Note that since ASCII is a subset of UTF-8, ordinary ASCII entities do not strictly need an encoding declaration.



But the code in that method (and in other places), including the explicit comment " No special encoding type found. Answer the stream as is ", ignores this requirement and processes the XML in whatever code page the system is using.
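To make the quoted rule concrete, here is a minimal sketch in Python (not VA Smalltalk, and not the AbtXmlStreamConverter implementation). The name detect_xml_encoding and the 300-byte scan window are assumptions chosen for illustration. It checks for a Byte Order Mark, then for an encoding declaration, and otherwise falls back to UTF-8 rather than the system code page.

```python
import codecs
import re

# Longer BOMs are listed first so UTF-32 LE is not mistaken for UTF-16 LE.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]

# Matches encoding="..." inside the XML declaration only (never past the first '>').
_DECL = re.compile(
    rb'^<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)


def detect_xml_encoding(data: bytes) -> str:
    """Hypothetical helper: pick a codec name for an XML entity.

    Assumes no external transport information (HTTP, MIME) is available.
    """
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    match = _DECL.match(data[:300])  # 300 bytes is an arbitrary scan window
    if match:
        return match.group(1).decode("ascii").lower()
    # Neither a BOM nor an encoding declaration: the spec requires UTF-8.
    return "utf-8"


# A version-only declaration, like the SOAP payload in this report.
payload = '<?xml version="1.0" ?><name>Zo\u00eb</name>'.encode("utf-8")
text = payload.decode(detect_xml_encoding(payload))
```

The sketch only handles ASCII-compatible declarations; BOM-less UTF-16 or EBCDIC input would need the extra byte-pattern checks described in the Recommendation's appendix on autodetection.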

An example is this SOAP payload that only indicates the xml version:

<?xml version="1.0" ?><S:Envelope xmlns:S="http://schemas.xmlsoap.org/soap/envelope/"><S:Body>.......and so on.



An additional twist: the HTTP response header for the web service call DID specify an encoding, but that value is not passed on to the AbtXmlMappingParser methods, e.g.:
SstHttpResponseHeader{
HTTP/1.1 200 OK
Server: Apache-Coyote/1.1
Content-Type: text/xml;charset=utf-8
Date: Mon, 14 Jun 2010 23:27:26 GMT
Content-Length: 121800
}
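For illustration, here is a small Python sketch (not VA Smalltalk) of how a charset from the transport layer could be carried through to the decoding step. The function names charset_from_content_type and decode_xml_body are hypothetical; the header parsing relies on the standard library's email.message.Message. The UTF-8 fallback is a simplifying assumption for an entity with no BOM and no encoding declaration.

```python
from email.message import Message
from typing import Optional


def charset_from_content_type(content_type: str) -> Optional[str]:
    """Return the charset parameter of a Content-Type value, or None."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()


def decode_xml_body(body: bytes, content_type: Optional[str]) -> str:
    """Hypothetical helper: prefer the transport charset, else assume UTF-8."""
    charset = charset_from_content_type(content_type) if content_type else None
    return body.decode(charset or "utf-8")


# The Content-Type value from the response header shown above.
print(charset_from_content_type("text/xml;charset=utf-8"))
```

The point of the sketch is only that the charset has to travel with the bytes; once the body is handed to the parser as a bare stream, that information is lost.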

Re: Bug in XMLStreamConverter for encoding

Post by jduff » Tue Jun 15, 2010 9:34 am

Additional notes:

AbtXmlStreamConverter class>>encodingFromStream:

   |  streamEncoding xmlString scanStream convertedString streamCP xmlEncoding |   
   " An encoding will always be returned.  "
   convertedString := xmlString := aStream copyFrom: 1 to: ( 300 min: aStream size ).
   " See if the string is readable in the current Locale.  The first character must be a tag marker, and the 2nd must be non-null "
   ( convertedString size > 1
      and: [ convertedString first = $<
      and: [ ( convertedString at: 2) value ~= 16r00 ]] )
      ifFalse: [
         streamEncoding := self primEncodingFromStream: aStream.
         streamCP := self ibmCodePageForEncoding: streamEncoding.
   " For unicode encodings, no additional processing is required "
         ( self isUnicodeEncoding: streamEncoding )
            ifTrue: [ ^streamEncoding ]].



Based upon the comment, the ifFalse: should probably be ifTrue:.
Besides, #primEncodingFromStream: seems to handle all of the guard-clause checks anyway.
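As a rough illustration of what the quoted guard accepts, here is the same test restated in Python (not VA Smalltalk). The name passes_guard is hypothetical, and the sketch covers only the fragment shown above, not the rest of the method.

```python
def passes_guard(data: bytes) -> bool:
    """Restates the quoted check: size > 1, first byte '<', second byte non-null."""
    return len(data) > 1 and data[0] == ord("<") and data[1] != 0x00


samples = {
    "utf-8, version-only declaration": '<?xml version="1.0" ?>'.encode("utf-8"),
    "utf-16-le without BOM": "<?xml".encode("utf-16-le"),
    "utf-8 with BOM": b"\xef\xbb\xbf<a/>",
}

for label, data in samples.items():
    # In the quoted fragment the ifFalse: block, and therefore
    # primEncodingFromStream:, is reached only when this check fails.
    print(label, passes_guard(data))
```

A UTF-8 document that starts with a plain "<?xml" passes the check, so in the fragment shown the encoding lookup is never reached for it. The check cannot tell UTF-8 apart from a single-byte local code page.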