Overview
In this chapter you will learn:
- How to escape characters?
- What is CDATA?
- How is parsing different for CDATA?
Escape Characters for XML
If you want to use any of the special characters, such as &, <, >, and ", as ordinary characters, you must "escape" them by using the general entities that represent them. To escape a character means to hide its special meaning from the software or process that handles it next. In computing, the term usually refers to prefixing or replacing certain characters with a special character string so that they are not interpreted as special characters. The following table shows the predefined general entities:
| Character | Replacement |
| & | &amp; |
| ' | &apos; |
| > | &gt; |
| < | &lt; |
| " | &quot; |
What is CDATA?
CDATA stands for Character DATA. A CDATA section is a part of an XML document in which markup is not interpreted as markup but is passed to the application as it is. In other words, CDATA sections are used to escape blocks of text containing characters that would otherwise be recognized as markup. You can also escape markup characters by using the predefined entities and character references, but replacing every markup character in a long piece of text is tedious, and there are cases when you want to keep all those characters exactly as they are. The way to do this is to use a CDATA section.
Inside a CDATA section, all tags and entity references are ignored by the XML processor, which treats them just like any other character data.
CDATA blocks have been provided as a convenience for cases where you want to include large blocks of special characters as character data.
Example
<![CDATA[ This is a text containing <5 lines> character data &!%# and it leaves the XML processor alone!]]>
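To see how a parser hands this example to an application, here is a small sketch using Python's standard xml.etree.ElementTree module. The wrapping note element is an assumption added for the illustration, because a CDATA section has to appear inside an element.

```python
import xml.etree.ElementTree as ET

# <note> is a hypothetical wrapper element; the CDATA section is the example above.
doc = (
    "<note><![CDATA[ This is a text containing <5 lines> character data &!%# "
    "and it leaves the XML processor alone!]]></note>"
)

note = ET.fromstring(doc)

# The application receives the characters exactly as written.
print(note.text)

# "<5 lines>" was not treated as a tag, so <note> has no child elements.
print(len(note))
```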
You cannot put one CDATA section inside another.
Nothing that appears between the opening delimiter (<![CDATA[) and the closing delimiter (]]>) will be recognized as markup.
Comments are not recognized in a CDATA section.
CDATA does not work in HTML.
All text in an XML document is parsed by the parser, except for text inside a CDATA section, which the parser does not interpret as markup.
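Two of the rules above, that comments are not recognized and that sections cannot be nested, can be checked with a short sketch. It assumes Python's standard xml.etree.ElementTree module; the element name a and the variable names are placeholders for the illustration.

```python
import xml.etree.ElementTree as ET

# A comment inside a CDATA section is plain text, not a comment.
with_comment = ET.fromstring("<a><![CDATA[<!-- just text -->]]></a>")
comment_text = with_comment.text
print(comment_text)

# Attempted nesting: the first "]]>" closes the section, so the second
# "]]>" is left over in ordinary content and the document is rejected.
nested = "<a><![CDATA[ outer <![CDATA[ inner ]]> tail ]]></a>"
try:
    ET.fromstring(nested)
    nested_ok = True
except ET.ParseError as error:
    nested_ok = False
    print("not well-formed:", error)
```

The inner <![CDATA[ is itself treated as ordinary characters; the failure comes from the stray closing delimiter that remains after the section has already ended.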
How Is Parsing Different for CDATA?
CDATA is parsed differently from other data: the XML processor does not interpret anything inside a CDATA section, except to look for the section's closing delimiter, ]]>. So, you can include text just as you want it to appear.
Data inside a CDATA section is plain, unparsed character data. The XML parser skips over the text within the CDATA section, passes the enclosed text block to its output unchanged, and then "forgets" that a CDATA section ever existed.
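The difference can be sketched by parsing the same text written once with entities and once as CDATA. This example assumes Python's standard xml.etree.ElementTree module, which is only one possible parser; other parsers may keep CDATA sections as separate nodes.

```python
import xml.etree.ElementTree as ET

# The same character data, written with entities and as a CDATA section.
escaped = ET.fromstring("<a>1 &lt; 2 &amp; 3</a>")
cdata = ET.fromstring("<a><![CDATA[1 < 2 & 3]]></a>")
same_text = escaped.text == cdata.text
print(same_text)

# Inside CDATA an entity reference is not expanded; it stays as four characters.
literal = ET.fromstring("<a><![CDATA[&lt;]]></a>").text
print(literal)

# After parsing, nothing records that a CDATA section was used:
# writing the element back out produces entities instead.
serialized = ET.tostring(cdata, encoding="unicode")
print(serialized)
```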
Summary
In this chapter you have learnt:
- How to escape special characters.
- What CDATA is.
- How CDATA is parsed.
Review Questions
Fill in the Blanks
- ________ is an acronym for Character DATA.
- The > character is replaced with _______ when it is escaped.
- _______ is used as CDATA section closing delimiter.
Solutions
- CDATA
- &gt;
- ]]>
True or False
- Anything written inside a CDATA section is escaped by the parser.
- Markup characters cannot be escaped.
- CDATA sections can be nested.
- Comments are not recognized in a CDATA section.
- HTML supports CDATA.
Solutions
- True
- False
- False
- True
- False