Microsoft Knowledge Base Article
The contents of this article are Microsoft copyrighted material.
© 2005-2007 Microsoft Corporation. All rights reserved.
Article ID: 315580 - Last Review: July 17, 2003 - Revision: 1.2
PRB: Error Message When an XML Document Contains Low-Order ASCII Characters
This article was previously published under Q315580
SYMPTOMS
When you use version 3.0 or later of the MSXML parser to parse XML
documents that contain certain low-order non-printable ASCII characters
(that is, characters below ASCII 32), you may receive the following error
message:
An Invalid character was found in text content.
CAUSE
Versions 3.0 and later of the MSXML parser strictly enforce
the valid XML character ranges that are defined by the World Wide Web
Consortium (W3C) XML language specification. XML documents that are parsed
using versions 3.0 or later of MSXML cannot contain characters that fall
outside the defined valid XML character ranges. The low-order non-printable
ASCII characters in the ranges that are listed in the "More Information"
section are not valid XML characters. An XML document that contains instances
of these characters is not conformant with the W3C specifications and cannot be
parsed successfully with versions 3.0 and later of MSXML.
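The same rule can be observed with other conformant XML parsers. The following is a minimal sketch that uses Python's standard `xml.etree.ElementTree` module (not MSXML) purely as an illustration. The helper name `parses_cleanly` is hypothetical, and the error text that Python reports differs from the MSXML message.

```python
import xml.etree.ElementTree as ET


def parses_cleanly(xml_text: str) -> bool:
    """Return True if the document parses, False if the parser rejects it."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        # Raised for documents that are not well formed, including ones
        # that contain control characters outside the XML 1.0 ranges.
        return False


if __name__ == "__main__":
    print(parses_cleanly("<a>x\ty</a>"))     # tab (#x9) is a valid XML character
    print(parses_cleanly("<a>x\x01y</a>"))   # #x1 is not a valid XML character
```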
RESOLUTION
To resolve this problem, either remove instances of the
low-order non-printable ASCII characters, or replace the characters with an
alternate valid character such as the space character (ASCII 32, hex #x20).
This solution makes the XML document compliant with the W3C specifications.
However, removing or replacing instances of these characters may affect other
applications that use the data and to which the characters are significant.
Such additional impact can only be identified by testing and will need to be
addressed by implementing a fix or workaround that is appropriate for a
specific situation.
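As an illustration of the cleanup described above, the following minimal Python sketch replaces each rejected low-order character with a space, or removes it when an empty replacement is passed. The function name `sanitize_low_order` is hypothetical. The sketch assumes the document text is already available as a decoded string, and it covers only the low-order ranges discussed in this article.

```python
import re

# Low-order control characters that are not valid XML 1.0 characters:
# #x0-#x8, #xB-#xC, and #xE-#x1F. Tab (#x9), line feed (#xA), and
# carriage return (#xD) are valid and are left untouched.
_LOW_ORDER_INVALID = re.compile(r"[\x00-\x08\x0B\x0C\x0E-\x1F]")


def sanitize_low_order(text: str, replacement: str = " ") -> str:
    """Replace each rejected character; pass "" to remove it instead."""
    return _LOW_ORDER_INVALID.sub(replacement, text)


if __name__ == "__main__":
    raw = "<note>first\x01second\tthird</note>"
    print(repr(sanitize_low_order(raw)))      # replace with a space
    print(repr(sanitize_low_order(raw, "")))  # remove entirely
```

Whether to replace or remove depends on the applications that consume the data, so test either choice against those applications before you adopt it.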
STATUS
This behavior is by design.
MORE INFORMATION
Versions 2.6 and earlier of the MSXML parser permit XML
documents to contain low-order non-printable ASCII characters that fall outside
the W3C valid XML character ranges. However, the design of versions 3.0 and
later of the MSXML parser has been changed to strictly enforce the valid XML
character ranges that are defined in the W3C XML language specification. This
design change is required to be able to identify non-conformant XML documents.
The following are the valid XML characters and character ranges (hex
values) as defined by the W3C XML 1.0 specification:
#x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
The following are the character ranges for low-order non-printable
ASCII characters that are rejected by MSXML versions 3.0 and later:
#x0 - #x8 (ASCII 0 - 8)
#xB - #xC (ASCII 11 - 12)
#xE - #x1F (ASCII 14 - 31)
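To locate offending characters before a document is handed to the parser, the valid ranges listed above can be expressed as a negated character class. The following Python sketch is one possible approach. The function name `find_invalid_xml_chars` is hypothetical, and the input is assumed to be an already decoded string.

```python
import re

# Matches any character that is NOT in the valid XML 1.0 ranges:
# #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
_INVALID_XML_CHAR = re.compile(
    r"[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def find_invalid_xml_chars(text: str) -> list[tuple[int, str]]:
    """Return (offset, hex code) pairs for every invalid character."""
    return [
        (match.start(), f"#x{ord(match.group()):X}")
        for match in _INVALID_XML_CHAR.finditer(text)
    ]


if __name__ == "__main__":
    for offset, code in find_invalid_xml_chars("ok\x0Bbad\x1F"):
        print(f"offset {offset}: {code}")
```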
This design change may affect the following users and applications:
- Internet Explorer users: Users who browse and view XML documents that contain
one or more instances of the specified low-order non-printable ASCII
characters with Internet Explorer 5.5 or earlier (and who did not install
MSXML 3.0 in Replace mode) encounter the error message after upgrading to
Internet Explorer 6.0. Internet Explorer 6.0 installs MSXML 3.0 SP2 in
Replace mode and uses it to parse XML documents.
- MDAC and ADO users: Developers and users who load ADO-persisted XML documents that
contain one or more instances of the specified low-order non-printable ASCII
characters into ADO Recordset objects encounter the error message after
upgrading to MDAC 2.7 because MDAC 2.7 installs MSXML 3.0 SP2, which is the
version of the MSXML parser that the ADO 2.7 Recordset object uses.
- Applications that use the MSXML Document Object Model (DOM): Applications that
use version-independent PROGIDs to instantiate the MSXML DOM objects that
parse XML documents generate the specified error when MSXML 3.0 or one of
its service packs is installed in Replace mode, or when the code is modified
to use the MSXML 3.0 or 4.0 version-specific PROGIDs.
REFERENCES
For additional information about other known causes of, and workarounds
for, the error message that is described in the "Symptoms" section, click
the article numbers below to view the articles in the Microsoft Knowledge
Base:
238833 (http://kbalertz.com/Feedback.aspx?kbNumber=238833/EN-US/) PRB: XML Parser: Invalid Character Was Found in Text Content
275883 (http://kbalertz.com/Feedback.aspx?kbNumber=275883/EN-US/) INFO: XML Encoding and DOM Interface Methods
APPLIES TO
- Microsoft XML Parser 3.0
- Microsoft XML Parser 3.0 Service Pack 1
- Microsoft XML Parser 3.0 Service Pack 2
- Microsoft XML Core Services 4.0
- Microsoft Data Access Components 2.8