
Beginner's Guide to XHTML

Wednesday 22nd June 2005 - Saturday 7th January 2006

Categories: Guides, Internet, Code

The Basics

If you want to design a website, there are various ways to do it. Although you can get programs to write the pages for you, you get far more control if you write the pages yourself. This is done in a language called HTML (Hypertext Markup Language). The next version of HTML is XHTML - it has stricter rules, meaning that the look of webpages is more consistent, and the code is easier to edit and understand. As such, I will be looking at XHTML, although most of what is said in this guide can be applied to standard HTML as well.

This part of the guide will look at the bare essentials for building a website, including the structure of the page, headings and paragraphs. All you need is a text editor, such as Notepad or gedit, and you can start writing. When saving the file, use the extension .html (make sure there isn't a .txt on the end).

If you are using Windows, you might want to use an alternative to Notepad, such as Crimson Editor - this will perform syntax highlighting on your document so that it is easier to read as you work. It is also generally nicer to work with than Notepad, and makes switching between documents easier through the use of tabs. Those using Linux should already have a decent text editor, such as gedit, which includes syntax highlighting as well as tabs.

When writing XHTML documents, you will be making heavy use of tags. These are used to do virtually everything in the webpage. In XHTML, they must always be written in lower case. Tags are always between < and >. The most basic tags are those that define the structure of the website. The first tag you want in a website is <html>. This states when the document begins. The last tag you always want is </html>. This is a closing tag - all tags must be closed, and most are closed by the corresponding closing tag, shown by a forward slash just after the <.

Next, we want to split the page into two parts. The first is called the head, and contains information about the page, while the second part, the body, contains the actual data that will appear on the page. We use <head> and <body> tags, along with their closing equivalents, to define the two different areas of the webpage. This means that the basic page layout is like this:

<html>
<head>
</head>
<body>
</body>
</html>

We also need a title for the page, which will appear in the titlebar for whichever browser you are using. Since the title doesn't appear on the page itself, it goes in the head section. Between the <head> and </head> tags, type <title>Whatever you want your title to be</title>. Your page should be looking similar to this now:

<html>
<head>
<title>Your Title</title>
</head>
<body>
</body>
</html>

After that, we should start actually adding stuff to the page itself. The method for this is to stick whatever you want between the body tags, since it appears on the actual page. Since, most of the time, you'll be writing in paragraphs, you need to stick a <p> tag at the beginning of each paragraph, and a </p> at the end of each paragraph. If you want to start a new line, use the <br /> tag. Note that, since it does not have a closing tag, it has a space and slash at the end instead.
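As a small sketch of the paragraph and line-break tags just described, the body of a page might look like the following. The text and the address are made up for illustration, and the lines between <!-- and --> are comments, which are covered at the end of this part.

```html
<body>
<!-- Each paragraph sits inside its own <p> ... </p> pair -->
<p>This is the first paragraph.</p>
<p>This is the second paragraph.</p>

<!-- <br /> starts a new line without starting a new paragraph -->
<p>12 Example Street<br />
Exampletown</p>
</body>
```

Note that the line break in the last paragraph comes from the <br /> tag, not from pressing Enter in the text editor - browsers treat a new line in the file as an ordinary space.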

You can also put in headings in the same way, ranging from the largest, <h1>, to the smallest, <h6>. When designing a website, choose the heading according to its position in the page rather than its size: for the title of a page, use <h1>; for subheadings, use <h2>; for subheadings within those, use <h3>. Don't pick a heading because of its font size. This should give you something like this:

<html>
<head>
<title>Your Title</title>
</head>
<body>
<h1>My first web page</h1>
<p>This is my first web page. It's not very good, but it works!</p>
</body>
</html>
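To illustrate choosing headings by position rather than size, here is a sketch of a page body with three levels of heading. The topic and section names are invented for the example, and the lines between <!-- and --> are comments that the browser does not display.

```html
<body>
<!-- h1: the title of the whole page -->
<h1>My Holiday</h1>
<p>An introduction to the page.</p>

<!-- h2: a subheading under the page title -->
<h2>Spain</h2>
<p>Something about Spain.</p>

<!-- h3: a subheading within the "Spain" section -->
<h3>Barcelona</h3>
<p>Something about Barcelona.</p>

<!-- h2 again: a new section at the same level as "Spain" -->
<h2>France</h2>
<p>Something about France.</p>
</body>
```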

Now that we've got a basic webpage, there are just a couple more things to do to make sure it is a proper web page. The first is to change the <html> tag. We need to tell the web browser that this page is XHTML, so change the tag to <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">. The xml:lang="en" part states the language, in this case English; change it to the code for whatever language you're using.
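For instance, a page written in German would use the code de in place of en. This is only a sketch: the German title and text are made up, and the language code is the one thing that differs from the tag above.

```html
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="de">
<!-- xml:lang="de" marks the content of the page as German -->
<head>
<title>Meine erste Webseite</title>
</head>
<body>
<h1>Meine erste Webseite</h1>
<p>Dies ist meine erste Webseite.</p>
</body>
</html>
```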

The last thing we have to do is add a doctype. A doctype dictates what version of XHTML you are using. At this point in time, the main versions are XHTML1.0 Strict, Transitional and Frameset, and XHTML1.1, which is essentially XHTML1.0 Strict. Frameset is, funnily enough, for use if you have frames in your webpage. However, it's probably a good idea to avoid frames and use an alternative, such as divs. Strict is just that: it cuts out many features of HTML, largely presentational ones, and leaves presentation to CSS instead. Transitional sits between Strict and HTML, and still includes the presentational features. In this guide, I will be concentrating on XHTML1.0 Strict.

When I said the <html> tag was the first tag, I lied a little - the doctype has to go first, although it isn't really an XHTML tag. You use a different doctype tag depending on which doctype you choose - if you're following the guide, I recommend XHTML1.0 Strict for now (I explain later why not to use XHTML1.1).

For XHTML1.0 Frameset, you need:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Frameset//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-frameset.dtd">

For XHTML1.0 Transitional, you need:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

For XHTML1.0 Strict, you need:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

For XHTML1.1, you need:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">

There is another part that is not required, but recommended. Since this is now an XML document, you should add the XML declaration at the very top, before the doctype: <?xml version="1.0" encoding="ISO-8859-1"?>, replacing ISO-8859-1 with whatever character encoding you want to use. I'm in Western Europe, so ISO-8859-1 is the right one for me. A link to a list can be found at the bottom of the page. You should also add the following line in the head section:

<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" />

Here, we come across attributes - these take the form of attribute="attributevalue". They are found inside tags; the attribute name must be in lower case, and the attribute value must be in quotation marks. The 'charset' should match whatever encoding you put in the XML declaration. The 'content' attribute tells the browser what type of document it is reading. XHTML should actually be defined as an XML document, but it is allowed to stay as text/html so that internet browsers can read the page properly. However, XHTML1.1 should always be defined as an XML document, rather than as HTML as it is in this case, which is why I do not recommend using XHTML1.1.
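To show the two encoding settings kept in step, here is a sketch of the top of a page that assumes you picked UTF-8 rather than ISO-8859-1. UTF-8 is just an example choice here; the point is that the same name appears in both places.

```html
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
<head>
<!-- Attribute names (http-equiv, content) are in lower case; values are in quotation marks -->
<!-- charset=UTF-8 matches encoding="UTF-8" in the XML declaration on the first line -->
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
<title>Your Title</title>
</head>
<body>
<p>Some text.</p>
</body>
</html>
```

Whichever encoding you name, make sure your text editor actually saves the file in that encoding, otherwise some characters may be displayed wrongly.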

This means that you should be left with a page that looks similar to this:

<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" />
<title>Your Title</title>
</head>
<body>
<h1>My first web page</h1>
<p>This is my first web page. It's not very good, but it works!</p>
</body>
</html>

There! A fully working webpage! It may not do much, but it's a start. You can check that it is all written properly using a validator. Before the end of this part, there is one other tag that might be useful - the comment tag. You can add a comment anywhere, simply by using this format:

<!-- This is a comment -->
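As a sketch of how comments might be used inside the example page (the wording of the comments is only illustrative):

```html
<body>
<!-- Main heading for the page -->
<h1>My first web page</h1>

<!--
  A comment can span several lines.
  The browser does not display any of it.
-->
<p>This is my first web page. It's not very good, but it works!</p>
</body>
```

One thing to watch out for: the text of a comment must not contain two hyphens in a row (--), since that sequence is reserved for ending the comment.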

In the next part, we'll be looking at images and hyperlinks.

Summary

Useful Links