Microsoft Knowledge Base Article

The content of this article is Microsoft copyrighted material.
©2005-2007 Microsoft Corporation. All rights reserved.

Article ID: 238833 - Last Review: July 18, 2003 - Revision: 2.2

PRB: XML Parser: Invalid Character Was Found in Text Content

This article was previously published under Q238833

SYMPTOMS

When parsing XML that contains "special characters" using the Microsoft XML parser (MSXML), the parser may report the following error message at the line and position of the first special character:
An Invalid character was found in text content.

CAUSE

The XML document is not marked with the proper character encoding scheme.

RESOLUTION

Specify the proper encoding scheme in the XML declaration at the top of the document.

- or -

Re-encode the XML data as proper UTF-8.

STATUS

This behavior is by design.

MORE INFORMATION

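"Special character" refers to any character outside the standard ASCII character set range of 0x00 - 0x7F, such as Latin characters with accents, umlauts, or other diacritics. When an XML document does not declare an encoding and has no byte order mark, the parser assumes UTF-8. UTF-8 encodes characters with a value of 0x80 or higher as multi-byte sequences, whereas single-byte encoding schemes such as iso-8859-1 store each of them as one byte. A lone byte of 0x80 or higher from such a scheme is therefore usually not a valid UTF-8 sequence, and the parser rejects it.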
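
The difference between the two byte representations can be checked with a short sketch. Python is used here only for illustration, and the helper name is_valid_utf8 is hypothetical.

```python
ch = "\u00e1"  # a with acute accent, code point 0xE1

latin1_bytes = ch.encode("iso-8859-1")  # one byte: 0xE1
utf8_bytes = ch.encode("utf-8")         # two bytes: 0xC3 0xA1


def is_valid_utf8(raw: bytes) -> bool:
    """Return True if the bytes can be decoded as UTF-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# The lone iso-8859-1 byte is not a complete UTF-8 sequence,
# which is the condition the parser reports as an invalid character.
latin1_is_utf8 = is_valid_utf8(latin1_bytes)
```

Characters in the ASCII range produce identical bytes in both schemes, which is why documents that contain only ASCII parse without an encoding declaration.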

Most often, you see this problem if you are working with data that uses the simple "iso-8859-1" encoding scheme. In this case, the quickest solution is usually the first option listed in the RESOLUTION section: declare the encoding. For example, use the following XML declaration:
   <?xml version="1.0" encoding="iso-8859-1" ?>
   <rootelement>
   ...XML data...
   </rootelement>
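
The effect of the declaration can be reproduced with any conforming XML parser. The sketch below uses Python's standard xml.etree.ElementTree module as a stand-in for MSXML, so the error text differs from the MSXML message; the names BODY, DECLARATION, and try_parse are hypothetical.

```python
import xml.etree.ElementTree as ET

# The same iso-8859-1 bytes, with and without an encoding declaration.
BODY = b"<rootelement>\xe1</rootelement>"
DECLARATION = b'<?xml version="1.0" encoding="iso-8859-1" ?>\n'


def try_parse(data: bytes):
    """Return the root element's text, or None if the document is rejected."""
    try:
        return ET.fromstring(data).text
    except ET.ParseError:
        return None


with_declaration = try_parse(DECLARATION + BODY)  # parses; text is U+00E1
without_declaration = try_parse(BODY)             # rejected as malformed UTF-8
```

The declaration must describe the bytes as they are actually stored. Declaring iso-8859-1 for data that is really in another encoding would parse without error but produce the wrong characters.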
				
Alternatively, you can encode each of those characters as a numeric character reference. For example, for the special character á, use <test>&#225;</test> (decimal version) or <test>&#x00E1;</test> (hex version).
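
Replacing the characters by hand is error-prone for larger documents. The following is a minimal sketch of automating it in Python; the function name to_char_refs is hypothetical, and the function only rewrites characters above 0x7F (it does not escape markup characters such as & or <).

```python
import xml.etree.ElementTree as ET


def to_char_refs(text: str) -> str:
    """Replace every non-ASCII character with a decimal numeric character reference."""
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


escaped = to_char_refs("\u00e1")  # "&#225;"

# Both the decimal and the hexadecimal form resolve to the same character.
decimal_text = ET.fromstring("<test>&#225;</test>").text
hex_text = ET.fromstring("<test>&#x00E1;</test>").text
```

Because the resulting document contains only ASCII bytes, it parses the same way regardless of which ASCII-compatible encoding the parser assumes.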

APPLIES TO
  • Microsoft Internet Explorer 5.0
  • Microsoft Internet Explorer 5.5
  • Microsoft XML Parser 3.0
  • Microsoft XML Parser 3.0 Service Pack 1
  • Microsoft XML Core Services 4.0
Keywords: kbintl kbintldev kbprb kbfaq KB238833
