TinkerTech.net - resources for the aspiring web designer

Making Documents XHTML Compliant

This document is my attempt to explain to my students how they can create XHTML-compliant documents. I feel that it is important that I walk the walk and talk the talk, so as I write this tutorial I am also implementing XHTML compliance on this site. I will note issues that I encountered during the process and the steps that I took to correct them.

Definition of Terms

HTML = Hypertext Markup Language. The markup language used to create web pages. It provides a set of predefined tags for structuring and formatting a page.

XML = Extensible Markup Language. XML was created because HTML is limited in its capabilities: with XML, developers can create their own tags and specify how data is handled. XML is often used to store and exchange data, much like a database. An XML document includes only markup (structure) and content; no styles are applied within it. Visit Learn XML in 11.5 minutes for a very brief tutorial on XML. Below is a sample of some XML code.

Sample XML Code

<!-- A well-formed XML document needs a single root element; cd plays that role here -->
<cd>
<cdtitle>Kind of Blue</cdtitle>
<artist>Miles Davis</artist>
<contents>
<track>So What (9:22)</track>
<track>Freddie Freeloader (9:46)</track>
<track>Blue in Green (5:37)</track>
<track>All Blues (11:33)</track>
<track>Flamenco Sketches (9:26)</track>
</contents>
</cd>

XHTML = Extensible HyperText Markup Language. XHTML is a combination of HTML and XML: it keeps HTML's set of predefined tags for creating web pages but follows XML's stricter syntax rules. It was designed as the successor to HTML 4.

What is XHTML?

XHTML is "a reformulation of HTML 4 as an XML 1.0 application" (W3.org, XHTML 1.0 W3C Recommendation, http://www.w3.org/TR/xhtml1/). Essentially, XHTML is a standard for writing web pages; it is the successor to HTML 4. The W3C has created (and continues to modify) a set of rules or guidelines on how web pages should be written. XHTML combines HTML 4 with XML (eXtensible Markup Language) and has a stricter set of rules. By using XHTML in combination with CSS (Cascading Style Sheets), developers are able to separate content from style and layout.

Traditionally in HTML we apply style within the HTML document. Font tags, bold tags, and other formatting are all applied within the HTML code. XHTML moves in the direction of removing the style from the HTML document while applying a stricter set of rules for creating web documents. XHTML documents are more accessible because they use a set of rules (standards) that browsers can easily interpret.
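
As a rough sketch of the difference, here is the same paragraph styled the traditional way and then with the style moved into a CSS rule. The class name intro, the font, and the color are made up for illustration.

Traditional HTML, with the style mixed into the content:

<p><font face="Arial" color="#336699"><b>Welcome to my site</b></font></p>

XHTML with CSS, where the rule lives in a style sheet and the markup only describes the content:

<style type="text/css">
p.intro { font-family: Arial, sans-serif; color: #336699; font-weight: bold; }
</style>

<p class="intro">Welcome to my site</p>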

Separating content from style becomes very important as more new devices are developed to surf the web. Cell phones, PDAs, televisions, refrigerators, and other devices have been and are being developed with Internet access and web browsing capabilities. In order to view web documents on these various devices, different style and layout rules (CSS) must be applied for each device. Think about it: a cell phone has a very tiny display. It's impossible to fit all of the navigation, images, etc., that are normally displayed on a 15" monitor into a 1.5" display area. In order to create different style rules for different devices, the formatting and layout must be removed from XHTML documents.
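
One way to picture this is a single XHTML page that links to a different style sheet for each kind of device through the media attribute of the link tag. The three file names below are hypothetical:

<link rel="stylesheet" type="text/css" media="screen" href="css/screen.css" />
<link rel="stylesheet" type="text/css" media="handheld" href="css/handheld.css" />
<link rel="stylesheet" type="text/css" media="print" href="css/print.css" />

The content of the page stays the same; only the style rules change. How well a given device honors these media types varies, so treat this as a starting point rather than a guarantee.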

Finally, coding to XHTML standards ensures accessibility. By conforming to XHTML and other standards, your pages are readable and consistent in standards compliant browsers.

How to make documents XHTML compliant

Step 1 - The Doctype Statement

Fortunately, making a document XHTML compliant does not mean learning an entirely new language. Because XHTML is a combination of HTML and XML, you will just need to learn a few new "rules" in order to make your documents XHTML compliant.

When creating valid XHTML documents you must specify the DTD, or Document Type Definition, at the beginning of your document. You do this with a Doctype statement, which must be placed before any other code. The Doctype statements were created by the W3C; each one names the version of the language used in the document and, in essence, tells the browser how to interpret and display the rest of the page. There are three Doctypes that you can use when working with XHTML 1.0: strict, transitional, and frameset. For a complete list of Doctype statements visit the W3C's list of valid DTDs. Here are the three Doctype statements for XHTML:

Strict - <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

Transitional - <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

Frameset - <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Frameset//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-frameset.dtd">

Doctype Statements Explained

Strict - Use the "Strict" document type definition when your document follows the XHTML standard exactly and no style is included within the document. This means that you cannot use tags like font, and you can't use the height and width attributes on table cells; the target attribute is also not allowed in an <a href> tag. Deprecated tags cannot be used in XHTML Strict (see deprecated tags below). You must be prepared to style your page entirely with CSS if you are going to use the Strict Doctype.

Transitional - This is the Doctype statement that most people use. It is a little more flexible, allowing you to use some style codes within your document.

Frameset - If you plan to create XHTML documents that use framesets you should use the frameset Doctype.
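
To make the Strict rules a little more concrete, here is a hedged sketch of a table cell written for the Transitional Doctype and then rewritten for Strict. The class name price, the colors, and the 120 pixel width are placeholders chosen for illustration.

Transitional (presentation attributes and a font tag in the markup):

<td width="120" bgcolor="#ffffcc"><font color="#cc0000">$19.99</font></td>

Strict (the same presentation moved into a rule in the style sheet):

td.price { width: 120px; background-color: #ffffcc; color: #cc0000; }

<td class="price">$19.99</td>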

Again, the Doctype statement is the first statement in your XHTML document. The Doctype is case sensitive, so you should use the exact casing shown in the samples above. To learn more, please read this article on Rendering Mode and DocType Switching.

You will also want to declare the XHTML namespace in the html tag, just below the Doctype statement, using the xmlns attribute. A complete block of opening code in an XHTML document may look something like this:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
 <html xmlns="http://www.w3.org/1999/xhtml">
 <head>
 <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
 <title>Title Goes Here</title>
 </head>

Defining the Character Set

In addition to the Doctype statement you must also define the character encoding used in the document. This is done in a meta tag placed between the <head></head> tags. The character set definition looks like this:

<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />

or

<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />

Common character encodings used are UTF-8 and iso-8859-1. Evolt has a really good Character Entity Chart that I would recommend bookmarking.

Step 2 - HTML, Head, Title, and Body Tags are Mandatory

While some browsers may be able to display pages without the html, head, title, and body tags, your documents must include them if you want them to be XHTML compliant.
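
Putting Steps 1 and 2 together, a bare-bones page might look like the sketch below. The title and paragraph text are placeholders, and the Transitional Doctype is used only as an example:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
<title>Title Goes Here</title>
</head>
<body>
<p>Page content goes here.</p>
</body>
</html>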

Step 3 - All Tags and Attributes Should Be Lowercase

<TITLE>This is my web page</TITLE> is incorrect XHTML format; the correct format is

<title>This is my web page</title>

The text within the tags can be mixed case, but the tag names and any attribute names within the tags (between the < and > characters) must be lowercase. CSS should also be lowercase.
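
The same rule applies to attribute names. In this small illustration the file name Page.htm and the title text are just placeholders; notice that the attribute values keep whatever casing they need.

Incorrect example:

<A HREF="Page.htm" TITLE="My Page">My Page</A>

Correct example:

<a href="Page.htm" title="My Page">My Page</a>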

Step 4 - Tags Should Not Overlap - Coding should be well-formed

This means that tags must be nested properly: close them in the reverse of the order in which you opened them, so that the first tag you open is the last tag you close.

<p>Bad formatting example, <strong><i>not</strong></i> properly formatted.</p>

<p>Proper formatting example, <strong><i>properly</i></strong> formatted.</p>

Step 5 - Always Close Tags

All tags must have ending tags or must be properly closed.

Incorrect example:

In HTML it is acceptable to code a list like this:

<ul>
<li>This is a list item
<li>This is a list item
</ul>

Correct example:

In XHTML we must code the list like this:

<ul>
<li>This is a list item</li>
<li>This is a list item</li>
</ul>

If the tag does not have an ending tag use a space and a slash at the end of the tag to make it self-closing like this:

<img src="mypicture.gif" />
<br />
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
<input type="submit" name="btn" value="Search" />
<link href="../../css/newstyle.css" rel="stylesheet" type="text/css" />
<hr />

Step 6 - Attributes Should Always be in Quotes

In the examples below, submit is the value of the type attribute. Notice that the = (equals) symbol separates each attribute from its value. In XHTML, whatever comes after the = symbol is a value and must always be placed in quotes.

Incorrect example:

<input type=submit name=btn value=Search /> - No quotes used

Correct example:

<input type="submit" name="btn" value="Search" /> - Quotes used around the values

Step 7 - Don't Nest Links

<a href="mypage.htm">here is a <a href="page.htm">nested link</a></a>

The above example is incorrect: an a tag cannot contain another a tag, and browsers will not handle it reliably.
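
A version that should validate keeps the links separate, so that one a tag is closed before the next one opens. The file names are the same placeholder pages used above, and the link text is invented:

<p>Here is a <a href="mypage.htm">first link</a> followed by a <a href="page.htm">second link</a>.</p>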

Step 8 - Don't Use the Ampersand (&) Alone in Your Code

Instead of using the & symbol by itself, use its character entity in your code. The character entity for the & symbol is &amp;. This can be quite tricky when you're linking to other pages. While trying to make this site XHTML compliant I have found this to be my biggest issue, because so many of the pages that I link to include an & symbol in the URL.

Incorrect example:

http://pub25.bravenet.com/guestbook/show.php?usernum=2120422499&cpv=1

Correct example:

http://pub25.bravenet.com/guestbook/show.php?usernum=2120422499&amp;cpv=1
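
Inside a page, the corrected URL would sit in the href attribute of a link, roughly like this (the link text is made up for the example):

<a href="http://pub25.bravenet.com/guestbook/show.php?usernum=2120422499&amp;cpv=1">Sign my guestbook</a>

The entity only belongs in the page's source code. The browser converts &amp; back to a plain & before it requests the page, and an address typed directly into the browser's address bar still uses a plain &.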

While Dreamweaver's Find and Replace tool can be used to search for an & symbol in source code and replace it with &amp;, this will sometimes cause the URL to no longer work. If you would like to use Dreamweaver to find and replace the & symbol, click here for a screen capture of Dreamweaver's Find and Replace tool. I ended up manually fixing most of the links on the site that included the & symbol: since I had to check every changed link to ensure that it still worked, I opted to alter the code by hand at the same time.

Character definitions and entities should always be used when you want to insert a symbol into your code. For a listing of standard character entities visit HTML & XHTML The Complete Reference. Be aware that XHTML Strict only allows certain Special Character Definitions.

Step 9 - Don't Use HTML Comments Inside Script or Style Containers

Strict version only: HTML comments (<!-- -->) are frequently used to "hide" JavaScript code from older browsers that do not understand JavaScript. This can no longer be done in XHTML Strict. Instead, you can enclose your scripts this way if you want to hide them from older browsers:

<script type="text/javascript"><!--//--><![CDATA[//><!--
script goes here
//--><!]]></script>

Note that the language attribute value is not set in the above statement. Language is used for backwards compatibility with Netscape and IE browsers that predate version 3.0.

You can find more information on using CDATA and scripts at Que Publishing.
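
Another option, one that the XHTML 1.0 Recommendation itself mentions, is to move scripts and styles into external files so that there is nothing in the page to hide. The file names below are placeholders:

<script type="text/javascript" src="scripts/menu.js"></script>
<link href="css/newstyle.css" rel="stylesheet" type="text/css" />

Note that the script tag is written with a separate closing tag rather than the self-closing form, because some browsers do not handle a self-closed script tag when the page is served as text/html.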

Step 10 - No One Word Attributes

Some attributes can be used as a single word in HTML. For example a checkbox can have an attribute of checked like the example below:

<input type="checkbox" checked> - Incorrect

In properly formatted XHTML documents you must use the following attribute value pair format:

<input type="checkbox" checked="checked" /> - Correct

A few other examples of these shortcut attributes are: selected, noshade, and compact.
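
Written out in full, those attributes follow the same pattern. The select name and option text are invented for the example; noshade and compact are deprecated, so they are not allowed under the Strict Doctype:

<select name="language">
<option value="xhtml" selected="selected">XHTML</option>
</select>
<hr noshade="noshade" />
<ul compact="compact">
<li>This is a list item</li>
</ul>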

Step 10 - You Cannot Nest Forms

A form tag cannot contain another form tag.
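
If a page needs two forms, one workable arrangement is to place them one after the other instead of one inside the other. The action URLs and field names here are hypothetical:

<form action="search.php" method="get">
<p><input type="text" name="q" /> <input type="submit" value="Search" /></p>
</form>

<form action="subscribe.php" method="post">
<p><input type="text" name="email" /> <input type="submit" value="Subscribe" /></p>
</form>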

Step 11 - Use ID and Name to Identify Elements

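In XHTML the id attribute replaces name as the way to identify an element; name has been deprecated for this purpose on tags such as a, form, img, and map. Some browsers still rely on name, however, and form fields such as input still need a name for their values to be submitted. For the time being use both attributes in forms to ensure that they work properly.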
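
A small sketch of what that looks like in practice (the form name, field name, and action URL are placeholders): the form and the field each carry both attributes with the same value, and the label uses the for attribute to point at the field's id.

<form id="contact" name="contact" action="contact.php" method="post">
<p>
<label for="email">E-mail address</label>
<input type="text" id="email" name="email" />
</p>
</form>

The name attribute on the form tag itself is only valid with the Transitional Doctype; under Strict, drop it from the form tag and keep it on the fields.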

Step 12 - Always Use Alt Attributes

An img (image) tag should always include an alt attribute.

<img src="mypicture.gif" alt="image of an elephant - 3kb" />

If the image does not warrant a description, for example a spacer.gif, leave the value empty (two quotes with nothing between them) so that the alt attribute is still present but screen readers skip the image.

<img src="spacer.gif" alt="" />

If you're using Dreamweaver, I would recommend that you change the program's accessibility settings so that it automatically prompts you to insert alt attributes when you insert an image or another object that requires one. This can be done by clicking Edit > Preferences > Accessibility and then checking each item. Dreamweaver will then prompt you to add an alt attribute each time you insert an object that requires one.

Step 13 - Don't Embed Flash and Other Objects

The embed tag is not part of XHTML. It was never included in a W3C standard, so a page that uses it will not validate. This doesn't mean that the tag won't work (it will), but it is not favored by the organization that sets standards for web pages. The object tag is the standard way to embed external media. For extensive information on including objects, applets, and other media objects (plug-ins) in your web documents, I would recommend that you read the W3.org's implementation recommendations. Because each object has its own set of parameters, I would also recommend that you search for specific code information as needed.

Here are a few resources on the object tag that you may find helpful/interesting:

Just Another Wiki - Help on Embedding (lots of object examples, but our firewall will block this page in the classroom)
Spartanicus' Web Tips
Mozilla Using Web Standards on Your Web Pages
O'Reilly Guide - Embedded Content

Flash Objects

A List Apart has an excellent article on Embedding Flash While Supporting Standards.

In a nutshell you will have to use an object tag that looks similar to this example when you want to insert a Flash object:

<object type="application/x-shockwave-flash" data="images/smnetworkresize.swf" width="135" height="102" align="left">
<param name="movie" value="images/smnetworkresize.swf" />
</object>

There is an issue with Flash movie streaming if you use the above modified, compliant code from A List Apart. If you have a large Flash movie you will need to modify the movie itself (by adding some ActionScript) as discussed in the Satay Method section of the A List Apart article.

Step 14 - Validate Your Documents

The World Wide Web Consortium or W3C offers XHTML document validation services. You can visit the validation page and enter the URL of the page that you want to validate. You can also validate a file on your local hard drive from this page. You can also use the W3.org site to validate CSS and other document types. Unfortunately, you must validate each page of your site individually.

Validation errors can be somewhat difficult to understand. Black Widow Web Design has a nice list of common validation errors and their solutions.

Once your document is "valid" you can post the W3C's "Valid XHTML 1.0!" graphic on your page to show that you have taken the time to make your pages compliant.

The XHTML and CSS validation services can also be used to debug problem web page code. If you have a page that is not displaying correctly try validating the CSS and XHTML code. The results may help you to find your problematic code.

Dreamweaver MX 2004 XHTML

Dreamweaver also has validation tools built into the program. They can be accessed via the Results panel (Window > Results). You can start the validation process or adjust the validation settings via the Validation icon in the Results panel. While this is a handy way to quickly check your documents for XHTML compliance, I have found that it does not pick up all errors.

You can clean up existing HTML/XHTML documents in Dreamweaver via the Commands > Clean Up XHTML menu.

You can also convert existing HTML documents to XHTML. This option is found in File > Convert > XHTML.

The University of Waterloo offers a free extension for checking your pages for accessibility.

A Few Extra Tips

Make your documents XHTML compliant to begin with. Use the above steps when you first create your documents. It is much easier to create compliant documents than it is to update them at a later date.

Watch the URLs that you link to and make sure that they don't include a bare & symbol. If they do, check to see if you can alter the URL to use &amp; instead. If altering the URL causes an invalid link, use the URL of the home page instead.

Use CSS!!! By using CSS to control style and layout the code is much cleaner and less likely to produce validation errors. It is also much easier to change the style or look of documents if you are using CSS.

If you are working on making an existing HTML site XHTML compliant, keep notes. I created a two column, valid/not valid, list of all of the HTML pages in my site. Originally, all of the pages were listed on the not valid side of the page. After checking the pages at the W3.org validation service, correcting issues, and ensuring compliance, I then moved the page to the "valid" column.

Don't count on Dreamweaver to create compliant code. While its tools are great, many of its validation tools do not pick up errors that the W3 validation service does.

Turn on the accessibility features in Dreamweaver. Dreamweaver will then prompt you to insert alt attributes into your code. This has saved me many hours.

Validate your pages as you work. Don't wait to validate your entire site at one time.

Watch out for cut-n-paste code from other web sites. I have used cut-n-paste code from Google and Amazon on this site and I have had to edit the code snippets because they're not XHTML compliant. Google did not include quotes around the values in their code snippets and a lot of their code was in ALL CAPS. I had trouble with code casing and attributes in tables with Amazon code. I would recommend that if you are going to use code snippets from third party sites that you make a page that contains all of the code snippets that are XHTML compliant and styled the way you like them. You can then use them repeatedly on your site. By doing this you won't have to "fix" the code each time you need to use it.

If you cannot figure out what is causing a validation problem, search Google for answers. If you still cannot find the problem, post a message to one of the many web development boards.

Avoid deprecated HTML tags. These are older tags that the W3C has declared as obsolete. There are now CSS alternatives that should be used for these tags. Here are some additional sites with information on deprecated tags and attributes.

What's a Deprecated Tag/Attribute - These tags should be avoided; replace them with CSS. Fantasi's HTML Quick References also has a short list of deprecated tags and their CSS replacements. Translations from HTML to CSS is a condensed, printable list of deprecated HTML and their CSS alternatives from CSS Pointers Group.

Additional Resources

New York Public Library On-line Style Guide
The Structure of Accessible Pages by Joe Clark (read the entire book)
If you are interested in learning about the next version of XHTML and what changes are in store, read XHTML 2.0 Explained at the Developer Shed. You can read the complete working draft of XHTML 2.0 at the W3.
XHTML Cheat sheet (and other resources) from Tunna Resources. This is a great site to learn about accessibility!

©Copyright 2001-2006 - Robin Wood - Send Questions or comments to robin at tinkertech dot net.
Last Updated: December 19, 2006