Wednesday 24 August 2011

What Is HTML?


HTML is a markup language, although the original intent was to create a content description language.
It contains commands that, much like the formatting codes in a word processor file, tell the computer, in a very loose sense, what the content of the document is. For example, using HTML, you can tell the computer that a document contains a paragraph, a bulleted list, a table, or an image. The HTML rendering engine is responsible for actually displaying the text and images on the screen.
The difference between HTML and word processors is that word processors work with proprietary formats. Because they're proprietary, one word processor usually can't read another word processor's native file format directly. Instead, word processors use special programs called import/export filters to translate one file format to another. In contrast, HTML is an open, worldwide standard. If you create a file using the commands available in HTML version 3.2, it can be displayed on almost any browser running on almost any computer with any operating system, anywhere in the world. The latest version of HTML, version 4.0, works on about 90 percent of the browsers currently in use.
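As a rough illustration of the content description mentioned above, the following sketch marks up a paragraph, a bulleted list, a table, and an image. The text, the title, and the file name example.gif are placeholders invented for this example, not part of any real site.

```html
<html>
<head><title>Content description example</title></head>
<body>

<!-- A paragraph -->
<p>This is a paragraph of text.</p>

<!-- A bulleted (unordered) list with two items -->
<ul>
  <li>First item</li>
  <li>Second item</li>
</ul>

<!-- A table with one header row and one data row -->
<table>
  <tr><th>Name</th><th>Value</th></tr>
  <tr><td>Example</td><td>42</td></tr>
</table>

<!-- An image; "example.gif" is a hypothetical file name -->
<img src="example.gif" alt="An example image">

</body>
</html>
```

Note that the file only says what each piece of content is. How the paragraph, list, table, and image actually look on the screen is left to the browser's rendering engine.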
HTML is a simplified application of a much more full-featured markup language called Standard Generalized Markup Language (SGML). SGML has been under development for about 15 years and contains many desirable features that HTML lacks, but it is also complex to implement. This complexity makes SGML documents both difficult to create and difficult to display properly. HTML was derived from SGML to provide a lightweight standard for displaying text and images over the slow dial-up connections of the World Wide Web. Originally, HTML had very few features; it has grown considerably in the past few years. Nevertheless, you can still learn the core command set for HTML in just a few hours.
HTML contains only two kinds of information: markup, which consists of all the text contained between angle brackets (<>), and content, which is all the text not contained between angle brackets. The difference between the two is that browsers don't display markup; instead, markup contains the information that tells the browser how to display the content.
For example, this HTML:
<html>
<head><title></title></head>
<body>
</body>
</html>
is a perfectly valid HTML file. You can save that set of commands as a file, navigate to it in your browser, and display the file without errors, but you won't see anything, because the file doesn't contain any content. All the text in the file is markup.
In contrast, a file with the following content contains no markup:
This is a file with no markup
Although most browsers will display the contents of a file with no markup, it is not a valid HTML file.
The individual parts of the markup between the brackets are tags, sometimes called commands. There are two types of tags, start tags and end tags, and they usually appear in pairs (although they may be widely separated in the file). The only difference is that the end tag begins with a forward slash, for instance </html>. Other than the forward slash, a start tag and its matching end tag use the same name.
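To see tag pairs and content together, here is a minimal sketch that extends the empty file shown earlier. The title and sentence are invented for illustration.

```html
<html>
<head><title>Tag pairs</title></head>
<body>
<!-- <p> and </p> form one pair; <b> and </b> form another, nested inside it -->
<p>This sentence is content, with one <b>bold</b> word.</p>
</body>
</html>
```

Here the <html> start tag on the first line is matched by the </html> end tag on the last line, so that pair is widely separated, while <b> and </b> sit only a word apart. Everything inside angle brackets is markup; the sentence between <p> and </p> is content and is what a browser would be expected to show in the page, and the title text typically appears in the window's title bar rather than in the page itself.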