HTML as a semantic and descriptive language
Before I start I want to share with you that while writing this article I felt like I was stating the obvious. This article is meant first and foremost for beginners. But sometimes even stating the obvious might help enlighten advanced users.
Hyper Text Markup Language
I want to delve into this for a minute or so. HTML is a language designed to mark up text. It is meant to give contextual meaning to the text it is used on. One of the great injustices done to HTML is that this simple truth has almost always been neglected.
HTML is actually quite beautiful. It is extremely rich, and provides some very adaptive patterns to describe complex content structures.
In this article (which might develop into a series) I intend to take the time to explain how to recognize the semantic meaning of various structures, and how to best utilize HTML to describe them.
I find that many times this issue is treated as "advanced knowledge", but I actually believe that this is the core of HTML, and thus should be taught from the very first day learning HTML.
Semantic Web
First, I find it important to state this simple fact – all but a few elements in HTML have a strict semantic meaning. They were created to describe specific kinds of content. A simple sign of a well-structured HTML document is that when you read its source, or view it without its styles, it carries the same meaning as the styled page, and sometimes more.
A few simple examples – an OL stands for an ordered list – this means that it contains a list of elements which are related by their order. The P element is used to identify a paragraph of text. The H1-6 elements are heading elements. Tables are used to mark up probably the hardest structure of them all – tabular data – such as a schedule.
In the rest of this article I will list a few common and less common elements and how to use them to give meaning to structure.
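As a rough sketch of the tabular case, a small schedule might be marked up along these lines. The caption, times and session names are made up for illustration.

```html
<!-- A schedule is tabular data: every cell is read against its row and column headers -->
<table>
  <caption>Workshop schedule</caption>
  <thead>
    <tr>
      <th scope="col">Time</th>
      <th scope="col">Monday</th>
      <th scope="col">Tuesday</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <th scope="row">09:00</th>
      <td>HTML basics</td>
      <td>Lists</td>
    </tr>
    <tr>
      <th scope="row">11:00</th>
      <td>Headings</td>
      <td>Tables</td>
    </tr>
  </tbody>
</table>
```

The TH cells name the rows and columns, so each TD can be understood even with no styling applied.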
Lists
HTML offers us 3 types of lists – OL, UL, and DL. Up until HTML5, which I will not cover in this series, HTML lists are probably the best way to describe a relationship between several elements of the same type.
- UL
  The UL element is a way to describe any list of items that are connected by their type. A good example of this is a list of links. Another very common example is a menu. An image gallery is essentially a list of images. In fact, the UL should be your default choice for grouping similar elements on your page.
- OL
  As I mentioned before, the OL is a specific subset of the UL, which defines the relationship within the group by its order. For example, if your article list is ordered by any means (usually publication date), you should use an OL to mark it up. Another good example is a breadcrumbs menu – a specific type of list of links in which the order is strictly important.
- DL
  This is a third, less known, seldom used type of list. If you don't know how to use this list, I suggest you read about it here. This list groups elements as descriptions of their term – their title. A classic example is a contact information list, where a user can have several addresses. Another example is this very list, which is made of terms and their definitions. Another, less common use of DLs is to mark up forms. You can find a good description of this use on the link above.
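To make the three cases above a little more concrete, here is a minimal sketch of a menu, a breadcrumbs trail and a contact list. All URLs, names and addresses are placeholders, not taken from a real site.

```html
<!-- A menu: the items are related by type, and their order carries no meaning -->
<ul>
  <li><a href="/articles">Articles</a></li>
  <li><a href="/gallery">Gallery</a></li>
  <li><a href="/contact">Contact</a></li>
</ul>

<!-- Breadcrumbs: the same kind of links, but here the order is the point -->
<ol>
  <li><a href="/">Home</a></li>
  <li><a href="/articles">Articles</a></li>
  <li>HTML as a semantic language</li>
</ol>

<!-- Contact information: one term can have several descriptions -->
<dl>
  <dt>Address</dt>
  <dd>1 Example Street, Springfield</dd>
  <dd>PO Box 123, Springfield</dd>
  <dt>Phone</dt>
  <dd>555-0100</dd>
</dl>
```

Note that the DL allows two DD elements under a single DT, which is what makes it a natural fit for a user with several addresses.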
Headers (h1-6)
This is an often misused set of elements. Many beginners use them to style their documents. But actually, the H-family elements are one of the strongest tools we have to describe the hierarchy of our page. To understand this, we first need to remember that almost all HTML pages are at their core simple documents. Every document has its hierarchy – it has a main title, its chapters have titles, these have their own sub-titles and so on. The H-family is there specifically to help us describe this type of hierarchy.
Every page on our site has a main header – the page's title – and this would be our H1. This doesn't have to be the site's logo. If it's an article, it could be the article's header, but every page must have one H1, and unless we're using HTML5, it should usually have just the one. Every header under that page's title should be nested in a chain of H-tags. When I say a chain I mean that we shouldn't have an H3 if the page doesn't have an H2. An H6 should only exist if it is nested under a chain of H1 through H5.
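A small sketch of such a chain might look like the following. The topic and titles are invented for the example; what matters is that no level is skipped.

```html
<!-- One H1 for the page title; each deeper level sits under the level above it -->
<h1>Growing tomatoes</h1>
<p>An introduction to the whole article.</p>

<h2>Choosing a variety</h2>
<p>What to consider before buying seeds.</p>

<h3>Cherry tomatoes</h3>
<p>Small fruit, well suited to pots.</p>

<h3>Beefsteak tomatoes</h3>
<p>Large fruit that needs more space.</p>

<h2>Planting</h2>
<p>When and where to plant.</p>
```

Reading only the headings gives an outline of the document, which is a quick way to check that the hierarchy makes sense.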
BIG , STRONG , SMALL , EM
This is another case of common misuse. In the old days of HTML prior to CSS2.1, many tags were used specifically to style our elements – the most common were the U, B and I tags. This has left us in a place where many web developers, along with many WYSIWYG editors, still use the default styles of some elements to style their content. For example, the strong tag comes with a bold font by default. The em comes as italic. But these elements are actually there to help us distinguish special meanings in our text.
The elements listed above help us describe the importance of the text they mark. strong and big help us describe our text as important, and small as less relevant – maybe a comment on the text. The em is short for emphasis – we should use it to emphasize a specific word in a sentence, for example. You might have noticed that these tags are actually quite close to one another. This is actually an advantage – it can help us develop our own style and preferences, and define unique nuances.
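As a hedged illustration of the distinction, the sentences below (made up for this example) use each element for its meaning rather than for its default look.

```html
<!-- strong: this part of the text is important -->
<p><strong>Warning:</strong> do not unplug the device while it is updating.</p>

<!-- em: stress on one word changes how the sentence is read -->
<p>I said the <em>second</em> door, not the first.</p>

<!-- small: a less relevant side note to the main text -->
<p>Prices start at $10 <small>(shipping not included)</small>.</p>
```

If the bold or italic look is all you are after, CSS is the place for it, and the markup stays free to say what the text means.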
ABBR, DFN
These are another couple of cool elements:
- The ABBR marks abbreviations, such as initialisms and acronyms (although these have their own element – acronym, I prefer the abbr, and this is also the W3C recommendation). You can see various examples of this use throughout this text. The way we use it is that we give the ABBR a title that is the full meaning of the abbreviation. An example – HTML.
- The DFN element is a way for us to define an inline definition in a paragraph. When content within a paragraph is wrapped with it, it is treated as the term the paragraph defines. Just inspect the markup of this paragraph to see this in use.
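In markup, the two elements above might be used roughly like this. The sentences are illustrative only.

```html
<!-- abbr: the title attribute holds the full meaning of the abbreviation -->
<p><abbr title="Hyper Text Markup Language">HTML</abbr> is a language designed to mark up text.</p>

<!-- dfn: wraps the term that the surrounding paragraph defines -->
<p>A <dfn>breadcrumbs menu</dfn> is a list of links whose order shows where the current page sits in the site.</p>
```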
Those who have no meaning
Lastly, HTML defines 2 elements that were meant to have no specific meaning – the DIV and SPAN elements. Well, actually, they aren't quite the same.
The DIV element is meant to help us define a block-level division of our content. In the pre-HTML5 era, we use the div element to define sections of our page, such as header, footer, main content etc. Basically, the DIV has 2 main uses – one is to divide our content into sections. The 2nd use is to help us style our content without harming our semantic markup.
The SPAN element should only be used for the 2nd use – a means to style specific parts of our text that have no specific semantic meaning. This is almost always unnecessary, but on the special occasions where it happens, it's there for us to use.
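A minimal pre-HTML5 sketch of both uses could look like this. The id and class names are hypothetical; any names that fit your stylesheet would do.

```html
<!-- div: divides the page into sections -->
<div id="header">
  <h1>My site</h1>
</div>

<div id="content">
  <!-- span: a styling hook for text that has no specific semantic meaning -->
  <p>Our logo is printed in <span class="brand-color">orange</span> on every page.</p>
</div>

<div id="footer">
  <p><small>Last updated in May.</small></p>
</div>
```

Neither element adds meaning of its own, so the headings and paragraphs inside them still carry all the semantics.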
Conclusion
This was just the tip of the iceberg. I haven't included other very important tags, such as code, q, blockquote and many more. There are around 100 tags defined in HTML, and we can use them in an endless number of combinations to mark up our content – just to give you a taste of this, look at these 3 examples:
<ol>
  <li>
    <h4>Bricks</h4>
    <p>A square stone.</p>
  </li>
  <li>
    <h4>Pipes</h4>
    <p>A round metal object.</p>
  </li>
  <li>
    <h4>Stick</h4>
    <p>A wooden object</p>
  </li>
</ol>

<ul>
  <li>
    <p><dfn>Bricks</dfn> - A square stone.</p>
  </li>
  <li>
    <p><dfn>Pipes</dfn> - A round metal object.</p>
  </li>
  <li>
    <p><dfn>Stick</dfn> - A wooden object</p>
  </li>
</ul>

<dl>
  <dt>Bricks</dt>
  <dd>A square stone.</dd>
  <dt>Pipes</dt>
  <dd>A round metal object.</dd>
  <dt>Stick</dt>
  <dd>A wooden object</dd>
</dl>
All 3 lists have a very similar meaning, but each is a bit different from the others. This is the magic of HTML, and this is where we go deep.
Additional Reading
- Using DLs to style forms.
- POSH - Plain Old Semantic HTML.
- W3C Recommendations on using lists.
- More W3C on DLs.