Jesper Tverskov, November 11, 2009
All textbooks about XML have long chapters about how to style XML with CSS. This is misleading because it is used very little in the real world. It is fair to say that CSS for XML is only relevant as an exception to the rule. Students of XML should know that we can style XML with CSS, but actually doing it is mostly a waste of time.
In the year 2000 it was still up in the air whether XML was going to be a common file format on the Internet, styled with CSS to be viewed as webpages. Simon St. Laurent wrote a series of articles about the state of the art of "XML browsing" at www.xml.com. At that time it was a common misunderstanding that webpages should be made with XML styled with CSS. Ten years later, it is fair to say that the dream of XML browsing didn't come true.
Before we look at how styling XML with CSS is actually used, let me show a simple example of how to apply CSS to an XML file.
Look at the following XML file. Note <?xml-stylesheet type="text/css" href="products.css"?>, a so-called Processing Instruction pointing to a CSS stylesheet. A very small W3C standard, Associating Style Sheets with XML documents, defines how an XML document can include a CSS or an XSLT stylesheet.
products.xml:
<?xml version="1.0" encoding="UTF-8"?>
In products.xml above, I have added two CSS class attributes to make a few experiments.
products.css:
/* This CSS stylesheet is an example of how we can style XML with CSS */
The CSS stylesheet above is just a set of experiments to show how an XML file can be styled with CSS. The end result can be seen below.
We have used Google Chrome as an example, but products.xml is rendered more or less the same in all browsers.
When we look at an XML file in browsers like Internet Explorer, Firefox and Opera, they use a built-in JavaScript to format the XML with CSS, displaying an indented XML tree. Google Chrome regards XML as unknown markup and just displays the text nodes.
If the XML file uses a Processing Instruction, <?xml-stylesheet type="text/css" href="products.css"?>, pointing to a CSS or an XSLT stylesheet, the browser uses that stylesheet to display the document.
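To make the mechanism concrete, here is a minimal Python sketch of how a program might find the stylesheet Processing Instruction in a document. It is only an illustration, not how any particular browser does it. The sample document and the function name find_stylesheets are assumptions made for this example; only the Processing Instruction itself is taken from the text above.

```python
import re
from xml.dom import minidom

# Hypothetical stand-in document: only the processing instruction
# is taken from the article, the rest is made up for illustration.
SAMPLE = (
    '<?xml-stylesheet type="text/css" href="products.css"?>'
    "<products><product>Tea</product></products>"
)


def find_stylesheets(xml_text):
    """Return one dict of pseudo-attributes per xml-stylesheet PI in the prolog."""
    document = minidom.parseString(xml_text)
    stylesheets = []
    # Processing instructions placed before the root element are
    # direct children of the document node.
    for node in document.childNodes:
        if (node.nodeType == node.PROCESSING_INSTRUCTION_NODE
                and node.target == "xml-stylesheet"):
            # The PI data looks like attributes but is plain text,
            # so the name="value" pairs are pulled out with a regex.
            pairs = re.findall(r"""([\w-]+)\s*=\s*(["'])(.*?)\2""", node.data)
            stylesheets.append({name: value for name, _quote, value in pairs})
    return stylesheets


if __name__ == "__main__":
    print(find_stylesheets(SAMPLE))
```

The type pseudo-attribute is what tells the consumer whether the referenced stylesheet is CSS or XSLT.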
An XML file not pointing to a stylesheet is rendered like above in most browsers. It is important to note that what we see above is not the source code but a transformed view made with JavaScript and CSS. The coloring is not in the source code, the option to close and open the XML tree is not in the source code, etc.
The source code of the "products" XML file as seen in Notepad.
When an XML file is not well-formed, it is not XML! The XML standard says that if an XML file is not well-formed, the application making use of the file, e.g. a browser, must show an error message: "Validating and non-validating processors alike MUST report violations of this specification's well-formedness constraints...". All modern browsers have a built-in XML processor.
We could have used any browser, but the Opera browser, as shown above, has the best error messages when an XML file is not well-formed.
When RSS and Atom web feeds started, they were rendered by browsers like above: some built-in JavaScript creates a webpage formatted with CSS, displaying an indented XML tree.
I liked that because you could see that the news feeds were XML. But as XML news feed syndication became mainstream, browsers shipped with an additional built-in JavaScript rendering news feeds not as an XML tree but as a standard webpage displayed with headings, paragraphs, etc. At the moment all major browsers do that except Google Chrome.
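The same rule applies to any XML processor, not only the one inside a browser. As a small sketch, the parser in Python's standard library refuses a document that is not well-formed and reports the violation. The helper name check_well_formed and the two sample strings are assumptions made for this example.

```python
import xml.etree.ElementTree as ET


def check_well_formed(xml_text):
    """Return (True, "") if xml_text parses, else (False, parser error message)."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as error:
        # The parser stops at the first well-formedness violation
        # and reports where it happened.
        return False, str(error)
    return True, ""


if __name__ == "__main__":
    good = "<products><product>Tea</product></products>"
    bad = "<products><product>Tea</products>"  # <product> is never closed
    print(check_well_formed(good))
    print(check_well_formed(bad))
```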
The image above shows the source code of an RSS file from the New York Times. Note that the XML file has a Processing Instruction pointing to an XSLT stylesheet. The idea is that browsers must use this stylesheet to render the page. NYT probably thinks that it is too confusing for some users to render the source code as an XML tree, or there are other reasons for wanting to style XML with XSLT and CSS.
In the following we will show how this source code is rendered in Internet Explorer, Firefox, Opera and Google Chrome. Please note that news feeds are updated fast, so the images of the browsers are not showing exactly the same news feed.
Despite the fact that the XML news feed above contains a Processing Instruction (most often they don't) in order to load an XSLT stylesheet at NYT to transform and format the page, Internet Explorer ignores this PI and uses a built-in JavaScript to style the page with CSS, the way IE wants it to look!
The introduction with yellow background-color and the form elements to the left are not in the source code.
Despite the fact that the XML news feed above contains a Processing Instruction (most often they don't) in order to load an XSLT stylesheet at NYT to transform and format the page, Firefox ignores this PI and uses a built-in JavaScript to style the page with CSS, the way Firefox wants it to look!
The introduction with yellow background-color and the form elements are not in the source code.
Despite the fact that the XML news feed above contains a Processing Instruction (most often they don't) in order to load an XSLT stylesheet at NYT to transform and format the page, Opera ignores this PI and uses a built-in JavaScript to style the page with CSS, the way Opera wants it to look!
Note that the CSS layout of Opera for this page is more advanced than in IE and Firefox.
At the moment Google Chrome does not ship with a built-in JavaScript to display XML with CSS. Google Chrome just displays the text nodes, following the rules of what a browser should do if it doesn't recognize the markup.
But the RSS XML news feed from the New York Times contains a Processing Instruction. Google Chrome uses the XSLT stylesheet of the New York Times to transform and display the page, the way NYT wants it to look. For example, the form elements are added by XSLT and styled with CSS. The source code is exactly the same XML document as in the other examples.
The following is a great example showing that styling XML with CSS is not only possible but can be very useful. Prince is a computer program that converts XML and HTML into PDF documents. Prince can read many XML formats, including XHTML and SVG. Prince formats documents according to style sheets written in CSS.
PRINCExml is one way of creating PDF from XML. The standard method supported in most professional XML editors is to use XSLT to create XSL-FO and then to convert XSL-FO to PDF using an XSL-FO processor.
PRINCExml, using CSS in a process to convert XML to PDF, is one of the best examples proving that styling XML with CSS can be useful.
CSS is good for styling: applying colors, font-size, margins, padding, font-family, etc. But CSS cannot add or rearrange markup, and cannot group or sort the content. For that we need a client-side transformation language like JavaScript or XSLT, or server-side scripting using XSLT or a number of other programming languages.
Turning some XML markup into a webpage using only CSS is almost never possible, because even in the simplest cases we always need to add a few details that cannot be done with CSS alone. And as soon as we start using some technology of transformation, we can just as well transform the XML input to XHTML or HTML output made for the web, made to be displayed in browsers.
Almost all examples that look like XML being displayed as webpages using CSS turn out to be transformations using JavaScript, XSLT, etc. What we see in the browser is actually XHTML or HTML styled with CSS; it is not the original XML input file styled with CSS directly. Even PRINCExml will have to use some form of transformation to get something useful to style with CSS, except if the input is XHTML, DocBook, etc.
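As a rough illustration of the kind of step CSS cannot do, the following Python sketch sorts some product elements by name and writes them out as an HTML-style list. It stands in for the server-side scripting mentioned above. The element names product, name and price, the sample data and the function name are all assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical input: the element names are made up for this sketch.
SAMPLE = """<products>
  <product><name>Tea</name><price>4.50</price></product>
  <product><name>Coffee</name><price>6.00</price></product>
  <product><name>Cocoa</name><price>5.25</price></product>
</products>"""


def products_to_list_markup(xml_text):
    """Sort the products by name and return them as a <ul> element string."""
    root = ET.fromstring(xml_text)
    # Sorting changes the order of the content, which CSS alone cannot do.
    products = sorted(
        root.findall("product"),
        key=lambda product: product.findtext("name", default=""),
    )
    # New markup is created that a browser already knows how to display.
    ul = ET.Element("ul")
    for product in products:
        li = ET.SubElement(ul, "li")
        li.text = f"{product.findtext('name')}: {product.findtext('price')}"
    return ET.tostring(ul, encoding="unicode")


if __name__ == "__main__":
    print(products_to_list_markup(SAMPLE))
```

Once a step like this exists, the output might as well be markup made for browsers, and CSS is then applied to that output rather than to the original XML.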
XML files in general are not made for display. It seems more natural to use an XML application like XHTML made to display content on the web. That is to transform XML to XHTML.
Because XHTML is known by the browsers, the browsers ship with a default CSS stylesheet to display it. The webpage author can also make use of CSS but usually only needs to do it for a few details that should be different from the browser default.
When styling some homegrown XML with CSS, on the other hand, the browsers don't have a default CSS stylesheet. The webpage author must style much more than for XHTML just to get the document rendered as a usable webpage.
XHTML, like HTML, is poor in semantics, but at least we have a handful of easy-to-understand elements like headings, paragraphs, lists, tables, links, images, etc. This makes it easier for web crawlers to index a webpage. Words found in meta tags, title, headings and links ought to tell us more about what a page is all about than words found in a paragraph or a table cell. Search engines use the content of the title element as text for links to the webpage.
Web crawlers know nothing about CSS and only make use of the source code. If we just use any markup, web crawlers will find it more difficult to index content in a meaningful way.
Even screen readers, used by people who want a webpage read aloud, like blind people, only read the source code. If you make a headline, a list or a table with unknown markup styled with CSS, a screen reader has no way of reporting "this is a headline", "this is a list of 5 items", etc.
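A toy sketch can show why known markup helps a program that only reads the source code. The Python function below collects the text of title, heading and link elements in the XHTML namespace. It is not how any real crawler works; the function name, the set of element names and both sample documents are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "{http://www.w3.org/1999/xhtml}"
# Elements this toy indexer treats as more important than body text.
SIGNPOSTS = {"title", "h1", "h2", "h3", "h4", "h5", "h6", "a"}

XHTML_PAGE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Products</title></head>
  <body><h1>Our products</h1><p>Tea and coffee.</p></body>
</html>"""

# Homegrown markup: it could be styled to look the same with CSS,
# but the element names mean nothing to a program reading the source.
HOMEGROWN_PAGE = """<page>
  <name>Products</name>
  <headline>Our products</headline><text>Tea and coffee.</text>
</page>"""


def find_signposts(xml_text):
    """Return (element name, text) pairs for known XHTML signpost elements."""
    root = ET.fromstring(xml_text)
    found = []
    for element in root.iter():
        if element.tag.startswith(XHTML_NS):
            local_name = element.tag[len(XHTML_NS):]
            if local_name in SIGNPOSTS:
                found.append((local_name, "".join(element.itertext()).strip()))
    return found


if __name__ == "__main__":
    print(find_signposts(XHTML_PAGE))
    print(find_signposts(HOMEGROWN_PAGE))
```

With the XHTML page the function finds the title and the heading; with the homegrown page it finds nothing, even though both could look identical on screen.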
Instead of using any XML for webpages, XHTML, an XML application made for the web to be viewed in browsers, began competing with HTML. Even XHTML, in widespread use today, has only been partly successful. As long as an XHTML page is not well-formed, it is not XML. Most XHTML webpages are not well-formed, but because they use MIME type "text/html" the browsers show them anyway as if they were HTML.
If MIME type "application/xhtml+xml", not yet supported by IE, is used, the browser must show an error message if the webpage is not well-formed. If we serve real XHTML to Internet Explorer, a black screen is displayed. As an XML evangelist, I have used XHTML as XML for webpages for very many years, and served "application/xhtml+xml" to browsers understanding it, and MIME type "text/html" to IE. This page is an example.
Update, 2011-08-03: this webpage is now XHTML5 polyglot.
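One common way to decide which MIME type to send is to look at the Accept header of the HTTP request. The Python sketch below shows the idea under that assumption; the article does not say how this site actually does it, and the function name choose_content_type and the sample header values are made up for this example.

```python
def choose_content_type(accept_header):
    """Pick application/xhtml+xml only if the client lists it explicitly."""
    for item in accept_header.split(","):
        media_type, _, params = item.strip().partition(";")
        if media_type.strip().lower() != "application/xhtml+xml":
            continue
        # Respect an explicit q=0, which means "not acceptable".
        quality = 1.0
        for param in params.split(";"):
            name, _, value = param.strip().partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0
        if quality > 0:
            return "application/xhtml+xml"
    # Fall back to text/html for clients that do not ask for XHTML,
    # including those that only send a wildcard.
    return "text/html"


if __name__ == "__main__":
    # Header values below are illustrative, not captured from real browsers.
    print(choose_content_type("text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"))
    print(choose_content_type("image/gif, image/jpeg, */*"))
```

A wildcard such as */* is deliberately not treated as support for XHTML, since a browser that cannot parse it may still send one.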
Updated: 2011-08-09