HTML: Structures and basics

...in which we learn what HTML looks like.

HTML

HTML can be imagined as a plain text file which contains special markers, called tags, to inform the browser about which bits mean what. Almost everything in an HTML page sits between an open tag and a close tag.

A tag is written between '<' and '>', and many different tags are defined. For example, to start a paragraph, the tag is <p>. The corresponding close tag is </p>. This is enough to let us write our first bit of HTML:

<p>Hello, this is our first paragraph in HTML.</p>
<p>This is a totally different paragraph.</p>

...which results in:

Hello, this is our first paragraph in HTML.

This is a totally different paragraph.

Whitespace (i.e. newlines, spaces and tabs) between tags is mostly ignored, so we'd have got exactly the same effect if the example had been written:

<p>Hello, this is our first paragraph in HTML.</p><p>This is a totally different paragraph.</p>
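As a further sketch of the same point (the extra spacing and indentation here are our own, not part of the example above), the following version should also render as the same two paragraphs, because a run of whitespace inside text is treated as a single space:

```html
<p>
    Hello,   this is our first
    paragraph in HTML.
</p>


<p>This is a totally different paragraph.</p>
```

This leaves you free to lay out the source of a page in whatever way is easiest to read.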

Block-level and text-level elements

There are really two types of tag: a block-level element is a tag which defines a block on the page; for example, a paragraph or table. A text-level element is part of a block-level element; for example, a word in italics or a different colour:

<p>Hello, this is our <em>second</em> paragraph in HTML.</p>
<p>This paragraph has a <b>bold</b> word!</p>

...which gives:

Hello, this is our second paragraph in HTML.

This paragraph has a bold word!

The <em> tag is different to the <b> tag as the former - emphasis - imparts meaning to the text within it, whereas the latter - bold - just suggests to the browser how to render it.

This is an important difference to remember when catering for speech browsers: emphasis can be conveyed to a blind person through the tone of voice, whereas a bold word is meaningless. Always use "meaningful" tags, such as <em> for emphasis, wherever possible.
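To illustrate, here is a small sketch (the sentence is made up for this example) which marks up the same warning in two ways. Lines between <!-- and --> are comments, which the browser ignores. Both paragraphs will probably look identical in a graphical browser, but only the first tells a speech browser that the word matters:

```html
<!-- Meaningful: the word is marked as emphasised -->
<p>Do <em>not</em> press the red button.</p>

<!-- Presentational: the word is merely drawn in italics -->
<p>Do <i>not</i> press the red button.</p>
```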

Other useful tags include:

| Tag | Name | Block (B) or text (T) level? | Example | Result | Notes |
| --- | --- | --- | --- | --- | --- |
| i | Italics | T | <i>This is in italics</i> | This is in italics | |
| u | Underline | T | <u>Underlined text should be avoided as it can confuse users</u> | Underlined text should be avoided as it can confuse users | The confusion can arise as underlined text is usually reserved for links. |
| strong | Strong emphasis | T | This is an <strong>example</strong> | This is an example | |
| sup | Superscript | T | On the 2<sup>nd</sup> of June... | On the 2nd of June... | |
| sub | Subscript | T | Water is also known as H<sub>2</sub>O | Water is also known as H2O | |
| big | Bigger text | T | This is <big>big and <big>bigger</big></big> | This is big and bigger | Each <big> increases the font size, but there must be a matching </big> for each one. |
| small | Smaller text | T | This is getting <small>smaller and <small>smaller</small></small> | This is getting smaller and smaller | See <big>. |
| h2 | Heading | B | <h2>Introduction</h2> | Introduction | h1-h6 are different size headings. |
| br | Line break | - | This is on<br>two lines | This is on [line break] two lines | Multiple <br>s may be collapsed, even if separated by whitespace. To force a blank line, use <br>&nbsp;<br>. |
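As a rough sketch of how several of the tags from the table above can be used together (the wording of the text is invented for this example), a heading can be followed by a paragraph containing a subscript, a line break, a superscript and strong emphasis:

```html
<h2>Notes</h2>
<p>Water is also known as H<sub>2</sub>O.<br>
The meeting is on the 2<sup>nd</sup> of June, and it is <strong>not</strong> optional.</p>
```

Note that <br> has no close tag, and that the text-level tags all sit inside the block-level <p>.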

Note also that tags must be closed in the correct sequence, i.e. the most recently opened tag must always be closed first. Compare these two lines:

<p>This is <b>bold <i>italic</b></i> - honest.</p>
<p>This is <b>bold <i>italic</i></b> - honest.</p>

The first line is incorrect as the bold tag is closed before the italics tag, even though the italics tag was opened more recently. The second line is correct and gives:

This is bold italic - honest.
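The same rule holds however deeply tags are nested. In this sketch (our own example, using tags described earlier), three tags are opened in the order b, i, u and so are closed in the reverse order u, i, b:

```html
<p>This is <b>bold, <i>bold italic and <u>bold italic underlined</u></i></b> text.</p>
```

A handy way to check a line is to read the close tags from left to right: they should mirror the open tags read from right to left.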

Character entities

As with any markup language, you have a problem when you want to include in your text the very characters which mark up the content: in this case, '<' and '>'. If you wanted to say:

2 + x < 4 + x < 8 - x

...then you'd have a problem: the browser would see the '<' and think that a tag had been opened, would stop displaying your content until the next '>' and then would have to try and work out what tag starts < 4 + x ....

Not good.

So, to encode '<' you use a character entity: an ampersand - '&' - followed by a short name and then a semicolon. For '<' the name is lt, short for "less than":

<p>Did you know 2 &lt; 4?</p>

...gives:

Did you know 2 < 4?

The following entities are the ones you'll probably need most often:

| Entity | Character |
| --- | --- |
| &lt; | < |
| &gt; | > |
| &quot; | " |
| &pound; | £ |
| &copy; | © |
| &reg; | ® |
| &amp; | & |
| &nbsp; | A non-breakable space |
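As a short sketch using only the entities listed above (the second and third sentences are made up for illustration), here is the troublesome inequality from earlier written safely, along with a couple of other common cases:

```html
<p>2 + x &lt; 4 + x &lt; 8 - x</p>
<p>Fish &amp; chips: &pound;5&nbsp;per&nbsp;portion</p>
<p>To start a paragraph, type &lt;p&gt;.</p>
```

The browser should display the first line as 2 + x < 4 + x < 8 - x and the last as: To start a paragraph, type <p>. In the second line the ampersand itself has to be encoded, and the non-breakable spaces keep "£5 per portion" from being split across two lines.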

A full HTML page

Now that we've learnt the basic tags, the structure of HTML and character entities we can pull it all together into a full HTML page:

<html>
  <head>
    <title>My first webpage</title>
  </head>
  <body>
    <h1>Welcome to my first webpage!</h1>
    <p>I hope you enjoy <strong>my</strong> page...</p>
  </body>
</html>

We'll now go through this simple page step-by-step:

<html>...</html>

The block-level element which contains the whole page, and indicates that this is, after all, HTML.

<head>...</head>

The content of this element isn't displayed; instead it holds meta information about the page, such as its title, author and keywords.

<title>My first webpage</title>

The page title - displayed by search engines as the title of your page and usually by the web browser in the title of your window.

<body>...</body>

The block-level element containing the content that is actually rendered by the web browser to the user.

Within the <body> tag are the same tags we saw above. It's that easy to create a simple HTML page. Of course, the real power of HTML comes from linking to other pages, but for that we need to expand our understanding of tags, which we'll do in the next part.
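For practice, here is a slightly fuller page, offered as a sketch rather than a template: the title, headings and text are invented for this example, but every tag and entity in it has been covered in this part. It combines two levels of heading, correctly nested text-level tags and a few character entities:

```html
<html>
  <head>
    <title>My second webpage</title>
  </head>
  <body>
    <h1>My second webpage</h1>
    <h2>Sums &amp; symbols</h2>
    <p>Did you know that 2 &lt; 4 and that 4 &gt; 2?</p>
    <p>This is <strong>really <em>very</em> important</strong>.</p>
    <p>&copy; An example author</p>
  </body>
</html>
```

Saving this as a plain text file with a name ending in .html and opening it in a web browser is a quick way to see the result.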

Beginner's Guide to HTML:
  1. Introduction and history
  2. Structure and basics
  3. Attributes and links
  4. Images
  5. Fonts
  6. Lists