IST 523 Spring 2008
Notes
Session 4
Basic HTML
Background
HTML is short for hypertext markup language. The basics behind the name are:
• Hyper is the opposite of linear. It used to be that computer programs had to
move in a linear fashion. This before this, this before this, and so on. HTML does
not hold to that pattern and allows the person viewing the World Wide Web page
to go anywhere, any time they want.
• Text is what you will use. Real, honest to goodness English letters.
• Mark up is what you will do. You will write in plain English and then mark up
what you wrote.
• Language: it's a language, really, but the language is plain English.
You can write HTML by hand using tools such as Notepad on Windows, or SimpleText on
the Mac. Some people insist on using an HTML assistant program because it makes it
easier. That's true, but it also makes it harder to understand and to learn since the
program does the work for you.
HTML documents must be text only. When you save an HTML document, you must save
only the text, nothing else. Browsers read HTML as plain text and don't understand
word-processor formats. To test this point, go ahead and create an HTML document
and save it in Word as a regular Word document. Then try to open it in your browser.
It will not display as a Web page. Go ahead and try it. You won't hurt anything. However,
if you are using Notepad or SimpleText, the document will be saved as text with no
extra prompting. Just choose SAVE. (WordPad can also save plain text, but you may
need to pick a text-only file type when saving.)
A great way to learn HTML is to look at Web pages. To look at the source code of a Web
page:
1. When you find a page you like, click on VIEW at the top of the screen.
2. Choose DOCUMENT SOURCE from the menu. Sometimes it only reads SOURCE.
3. The HTML document will appear on the screen.
It may look like chicken-scratch right now, but you should be able to pick out elements
and later, you'll be able to find exactly how a certain HTML presentation was performed.
Many Web page authors write notes or describe what is happening within the HTML
document, but only want these notes to show in the HTML source, not the Web page
display. To achieve this, they write their notes in the following format:
<!-- start of syllabus and definitions -->
Naming Web documents
What you name your document is very important. You must first give your document a
name and then add a suffix to it. That's the way everything works in HTML. You give a
name and then a suffix.
Follow this format to name your document:
1. Choose a name. Anything.
2. Add a suffix. For all HTML documents, you will add either ".htm" or ".html"
(".htm" for PCs running Windows 3.x and ".html" for Mac and Windows 95/98/XP
machines).
For example, choose the name tree and add .html, so the file name is tree.html.
The .html tells the computer that this file is an HTML document. When we get into
graphics, you'll see a different suffix. All files used on the Web will follow the format of
"name.suffix." Always.
Basic HTML anatomy
HTML works in a very simple, very logical, format. It reads like you do, top to bottom, left
to right. That's important to remember. HTML is written with TEXT. What you use to set
certain sections apart, to specify their format (as bigger text, smaller text, bold text,
underlined text) is a series of tags.
Think of tags as commands. If you want a line of text to be bold you will put a tag at the
exact point you want the bold lettering to start and another tag where you want the bold
lettering to stop. If you want just a word to be italic, you will place a start italic tag at the
beginning of the word and an end italic tag at the end of the word.
All tags (you may want to think of them as commands) have the same format. They begin
with a less-than sign: < and end with a greater-than sign: >. Always. No exceptions.
What goes inside the < and > is the tag. Learning HTML is learning the tag to perform
whatever command you want to do. The tag for bold lettering is "B". Here's what the
tags look like to turn the word "Sam" bold: <B>Sam</B>
A peek behind the scenes:
1. <B> is the beginning bold tag.
2. "Sam" is the word being affected by the <B> tag.
3. </B> is the end bold tag. Notice it is exactly the same as the beginning tag except
there is a slash in front of the tag command.
4. This is what the bold tags above produce in a Web browser: Sam
Quick points
The end tag for other commands is simply the begin tag with the added slash. The tags
themselves do not show up on the page: the commands are placed inside the < and >
marks, so they alter the text but stay hidden from view unless you view the page
source. The command inside the <> does not have to be a capital letter;
the browser doesn’t care. But when coding or reviewing HTML code you
have written, keeping the tags in capital letters sets them apart from the normal text.
Not everything you write for a web page needs to have tags. But if you want to format
the text with italics or underlines or bold, or if you want to center text, etc., you will need
to use tags. If you forget to add an end tag it will be obvious when you view the
document in your browser, as everything after the point where you forgot the
end tag will be affected. To fix this, just go back into the document, add the end tag,
save, and then reload the document into the browser.
You can have two tags affect text at the same time, like bold and italic. You just need to
make sure to begin and end both.
<B><I>Bold and Italic</I></B> gives you Bold and Italic
When using more than one tag, add the beginning and end tags as a pair, always
placing them on the farthest ends of the item being affected. In the example above note
that the Bold tags are on the far ends. Next in line are the Italics. Just keep setting
commands at the farthest ends each time you add them and you'll stay in good form.
Start every page with this tag: <HTML> You are denoting that this is an HTML
document.
The next tags will always be these: <TITLE> and </TITLE>
Whatever you put between these two tags will show up in the title bar way at the top of
the browser.
Finally, you'll end every page you write with this tag: </HTML>
You started the page with HTML and you will end the page with /HTML.
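Putting the pieces above together, a complete first page might look like the sketch below. The file name tree.html comes from the naming example earlier; the title and the sentence are made up for illustration, and only tags covered so far in these notes are used.

```html
<HTML>
<!-- tree.html: a minimal first page (title and wording are made up) -->
<TITLE>My Tree Page</TITLE>
This page is about <B>trees</B>, especially <B><I>very old</I></B> trees.
</HTML>
```

Save it as text only and open it in your browser. The title should appear in the title bar, and the comment line should stay hidden unless you view the source.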
Headings
Heading commands are used extensively in HTML documents to, you guessed it, create
headings! There are six (6) heading commands: <H1> through <H6>. <H1> is the
largest and <H6> is the smallest. Heading commands create nice, bold text, as shown
above, and are quite easy to use. It's a simple H# and /H# command. However, they do
have one annoying trait. They like to be alone. When you use a heading command, by
default, you set the text alone. It's like the heading commands carry a <P> command
with them. You cannot get other text to sit right up against a heading.
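As a rough illustration of the heading commands, here is a short sketch. The heading wording is made up; the point is the H# and /H# pattern and the way each heading sits on its own line.

```html
<!-- sample headings; the wording is made up for illustration -->
<H1>My Tree Page</H1>
<H3>Oak trees</H3>
This sentence starts on its own line below the heading.
<H6>The smallest heading</H6>
```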
Font size
To have a little more control over your text size use the <FONT SIZE> commands.
Heading commands are great for right at the top of the page, but these font commands
are going to end up as the workhorses of your pages. There are twelve (12) font size
commands available to you: +6 through +1 and -1 through -6.
+6 is the largest (it's huge); -6 is the smallest (it's a little too small). Browsers
work with only seven font sizes in all, so values at the far ends may look the same as
their neighbors. Here are a few of them in action. There's no need to show all of them.
You'll get the idea of their relative sizes. Follow this pattern to place one on your page.
<FONT SIZE="+3">This is +3</FONT>
<FONT SIZE="+1">This is +1</FONT>
<FONT SIZE="-1">This is -1</FONT>
Notice that this first command, <FONT SIZE="--"> is actually doing two things: it's
asking for a new font size... and then offering a number to denote the font size.
Aligning text
If you just write text it will appear in the browser window justified to the left of the
screen, as that’s the default. To center text you surround the text you want centered with
simple <CENTER> and </CENTER> commands: <CENTER> All text in here will be
centered </CENTER>
If you want to align text on the right, set the text aside as a separate paragraph using
the <P> command plus an attribute: <P ALIGN="right">Text in the paragraph is pushed
to the right</P>.
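The three alignments can be compared side by side with a small sketch like this one. The wording is made up for illustration; only the tags described above are used.

```html
<!-- centered, right-aligned, and default (left) text -->
<CENTER>Welcome to my page</CENTER>
<P ALIGN="right">Last updated in the spring</P>
This line has no tags, so it stays on the left.
```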
Linking
Creating a link to another page, either within your website or at an entirely different site,
follows a set tag format. To create a link to another web page:
<A HREF="http://URL of the web page">text you want to display on the web page</A>
• A stands for Anchor. It begins the link to another page.
• HREF stands for Hypertext REFerence. That says to the browser, "This is where
the link is going to go."
• URL of the web page is the FULL ADDRESS of the link. Also notice that the address
has an equal sign in front of it and is enclosed in quotes. Why? Because it's an
attribute of the Anchor tag, a command inside of a command.
Where it reads "text you want to display on the web page" is where you write the text
you want to appear on the page. What is in that space will appear on the page for the
viewer to click. So, write something that denotes the link.
</A> ends the entire link command.
E-mail links are known as mailto: commands. They follow the same coding scheme as the
hypertext link above. What this format does is place wording on the screen that people
can click to send you a piece of e-mail. The pattern is:
<A HREF="mailto:e-mail address">text to display</A>
Notice it's the same format as a link except you write "mailto:" in place of the "http://"
and your e-mail address in place of the page address. Yes, you still need the </A> tag at
the end. Please notice there is NO SPACE between the colon and the e-mail address.
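Filled in, the two link patterns above might look like the sketch below. The address www.example.com and the e-mail address are placeholders chosen for illustration, not real destinations from these notes; swap in your own.

```html
<!-- example.com and the e-mail address are placeholders -->
<A HREF="http://www.example.com/">Visit the example site</A>
<A HREF="mailto:someone@example.com">Send me e-mail</A>
```

In both lines the words between the opening tag and </A> are what the viewer sees and clicks.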
Images
The command to place an image is constant. You will use the same format every time.
The format for placing an image: <IMG SRC="filename.gif">
• IMG stands for "image." It announces to the browser that an image will go here
on the page. The image will pop up right where you write in the image tag.
• SRC stands for "source." This is an attribute, a command inside a command. It's
telling the browser where to go to find the image.
• filename.gif is the name of the image. Notice it's following the same type of
format as your HTML documents. There is a name (of the image file) then a dot
and then there is a suffix (gif).
It's best for you to place the images you want to use in the same directory as the page.
This way you can call for the image by name alone. If you start to place your images all
over the place, you'll have to start adding directories and sub-directories to the SRC
attribute. Some folks place all their images in an image directory, and that can cut down
on the confusion. Just be consistent, or else the image won’t display and you'll see just
a blank space with a funny little box in the upper left-hand corner.
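To make the difference concrete, here is a sketch of the same image called two ways. The file name tree.gif and the directory name images are made up for illustration.

```html
<!-- image stored in the same directory as the page: name alone is enough -->
<IMG SRC="tree.gif">
<!-- image stored in a subdirectory named images (a made-up name) -->
<IMG SRC="images/tree.gif">
```

Whichever layout you choose, the SRC value has to match where the file actually sits relative to the page.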
Image file types
There are three basic image formats you will find on the Web. Each is denoted to the
browser by a different suffix.
.gif This is generally pronounced "gif" (hard "G"). This is an acronym for Graphics
Interchange Format. The format was invented by CompuServe and it's very
popular, because it's a simple format. It's a series of colored picture elements, or
dots, known as pixels, that line up to make a picture. Your television's picture is
created much the same way. Browsers can handle this format quite easily.
.jpeg or .jpg (pronounced "j-peg") There are two names to denote this format
because PCs once allowed only 3 letters after the dot, while the Mac allows 4. JPEG is
an acronym for Joint Photographic Experts Group, the organization that invented
the format.
The format is stored compressed. That means the file on disk is folded up smaller
than the picture it holds, and the computer unfolds it each time the image is
displayed. For example, if the picture is 10K bytes when displayed, it may be only 4K
bytes when stored. This saves on hard drive space and download time, but also tends
to require a bit of memory and work on the computer's part to unfold the image.
JPEG compression is lossy: it discards some fine detail to make the file smaller.
.gif images are also stored compressed, but their compression is lossless, so no
detail is thrown away.
.bmp (pronounced "bimp") This is a "bitmap." You will probably never place a
bitmap as an image, although Internet Explorer browsers allow it. A bitmap is an
image that a computer produces and places for you. A counter is an example.
Even though Internet Explorer will allow you to place an image as a BMP, don’t.
No other browsers will be able to display it. Go with .gif or JPEG.
To have a “clickable” image, one where if you click on it you activate a hypertext link to
another web page, follow this format: <A HREF="http://URL of the web page"><IMG
SRC="filename.gif"></A>
With this command an image tag is placed where normally there would be words. The
entire image is “clickable,” or active.
Whenever you place an image in a web page, use the “alt” attribute to provide alternate
text to display when the image itself cannot be shown (some browsers also show it when
you hover over the image). This alternate text is especially important
for users browsing with a text-only browser, or for those that cannot see and are using
audible readers with their browser to surf the web. The format is ALT="brief description
of image", and it goes inside the IMG tag after the SRC attribute: <IMG SRC="UpArrow.gif" ALT="Up">
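A clickable image and alternate text can be combined in one line, as in this sketch. The image file UpArrow.gif is the one named above; the address www.example.com is a placeholder chosen for illustration.

```html
<!-- a clickable image with alternate text; example.com is a placeholder -->
<A HREF="http://www.example.com/"><IMG SRC="UpArrow.gif" ALT="Up"></A>
```

Because the ALT text sits inside the link, a text-only browser or screen reader still has something to present as the link.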
Tables
Tables are very useful for presentation of tabular information as well as a boon to
creative HTML authors who use the table tags to present their regular Web pages, as
tables can control page layout.
The general format of a table looks like this:
<TABLE>
<!-- start of table definition -->
<CAPTION> caption contents </CAPTION>
<!-- caption definition -->
<TR>
<!-- start of header row definition -->
<TH> first header cell contents </TH>
<TH> last header cell contents </TH>
</TR>
<!-- end of header row definition -->
<TR>
<!-- start of first row definition -->
<TD> first row, first cell contents </TD>
<TD> first row, last cell contents </TD>
</TR>
<!-- end of first row definition -->
<TR>
<!-- start of last row definition -->
<TD> last row, first cell contents </TD>
<TD> last row, last cell contents </TD>
</TR>
<!-- end of last row definition -->
</TABLE>
<!-- end of table definition -->
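Filled in with real contents, the general format above might look like the following sketch. The caption, headers, and numbers are made up for illustration; the tags and their order follow the general format.

```html
<!-- a two-column table with one header row and two data rows -->
<TABLE>
<CAPTION>Trees in the park</CAPTION>
<TR>
<TH>Tree</TH>
<TH>Count</TH>
</TR>
<TR>
<TD>Oak</TD>
<TD>12</TD>
</TR>
<TR>
<TD>Maple</TD>
<TD>7</TD>
</TR>
</TABLE>
```

Each TR is one row, and the cells inside it are read left to right, so every row here has the same number of cells as the header row.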
Publishing your page on the web
You need to use an FTP (File Transfer Protocol) program, a small program that allows
you to copy files from your computer to your service provider's computer. You should get
an FTP program from your service provider along with directions on how to use it. If not,
check the utilities section at the end of these notes.
XML
XML (an acronym for Extensible Markup Language) is a set of rules, published by the
W3C (World Wide Web Consortium), for building new languages. The languages in
question are not written or spoken primarily for human consumption; they're intended to
simplify -- and simultaneously enrich -- information sharing among software and humans.
These languages, and the documents in which they're "written," all share some common
characteristics. XML delimits blocks of content with intelligible, structure-defining markup
to add meaning to the content itself.
An XML document does exactly that with its plain-text contents: it scatters little verbal
signposts among the content, imposing on it a structure which is immediately
understandable even if what is being structured is not obvious. These signposts are
collectively referred to as markup. And here's where those special characters come into
play. The most important such characters -- every XML document includes them --
are the <, >, and / (less-than, greater-than, and slash, respectively). Here's an XML-ified
version of a sentence:
<sentence><clause>Benedict Arnold didn<punctuation type="apostrophe"/>t
cross the Delaware<punctuation type="semi-colon"/></clause><clause>he
crossed his country<punctuation type="period"/></clause></sentence>
The markup in this "XML document" is contained within the angle brackets. Notice that
the markup breaks up the overall sentence into smaller chunks, in a nested structure.
Often this structure is made more obvious for legibility using line breaks and spaces, like
this:
<sentence>
  <clause>Benedict Arnold didn<punctuation type="apostrophe"/>t cross the
  Delaware<punctuation type="semi-colon"/></clause>
  <clause>he crossed his country<punctuation type="period"/></clause>
</sentence>
Each clause is subordinate to the overall sentence, and within a clause there may be a
mixture of the plain text and punctuation. The punctuation could have been left as literal
text, rather than defined via markup. Furthermore, the markup itself is human-readable:
anyone with an elementary understanding of English grammar knows what the words
"sentence," "clause," and "punctuation" mean.
XML is all about well-formedness. Well-formedness is the set of specific rules with
which all XML documents must comply in order to be minimally legitimate XML.
Examples of these rules include:
• element and attribute names are case-sensitive (a SENTENCE element is not the
same as a sentence element), and the corresponding markup is as well;
• attribute values must be enclosed in single or double quotation marks; and
• the nesting of one element within another, as defined by the placement of tags, is
precise. Every start tag must be balanced with one end tag, and no overlap of the
boundaries between one element and the next is permitted.
Implicit in that last point, by the way, is that each well-formed XML document has one
and only one "outermost element," within which all the others are nested. This outermost
element is called the root element.
So, in conclusion, an XML document is a string of plain text, delimited by markup, in a
well-structured form including a single root element and others, nested inside one
another. In other words, XML was designed to describe data and to focus on what data
is, while HTML was designed to display data and to focus on how data looks.
Metadata
The term metadata evokes a technical image and it is not viewed as a "user friendly"
topic. Simply defined, metadata is "data about data." Used in the context of digital spatial
data, metadata is the background information that describes the content, quality,
condition, and other appropriate characteristics of the data. Paper maps contain
metadata, primarily as part of the map legend. In this form, metadata is readily apparent
and easily transferred between map producers and map users. When map data are in a
digital form, metadata is equally important, but its development and maintenance
often require a more conscious effort on the part of data producers and the chain of
subsequent users who may modify the data to suit their particular needs.
A good source for info on metadata is
http://www.getty.edu/research/conducting_research/standards/intrometadata/
HTML utilities and info sites
Style guides
Composing good HTML http://www.ology.org/tilt/cgh/
Web style guide http://www.webstyleguide.com/
Web authoring
http://www.ku.edu/acs/documentation/docs/web-authoring_introduction.pdf
Web building tutorials http://www.w3schools.com/default.asp
Utilities
HTML validator http://www.htmlhelp.com/tools/validator/
Doctor HTML http://www2.imagiware.com/RxHTML/
HTML Tidy
http://www.w3.org/People/Raggett/tidy/ or http://tidy.sourceforge.net/
HTML editors
SoThink http://www.tucows.com/preview/194496.html
Power HTML http://library.thinkquest.org/C001341/phtml/index.php3
Netscape Composer http://channels.netscape.com/ns/browsers/download.jsp
Browsers
Mozilla Firefox http://www.mozilla.org/products/firefox/
Opera http://www.opera.com/
Netscape http://channels.netscape.com/ns/browsers/download.jsp
Internet Explorer http://www.microsoft.com/windows/ie/default.mspx
HTML info sites
HTML tags http://www.hypergurl.com/htmltags.html
HTML tags cheat sheet http://html-tags.info/
How to use meta tags http://searchenginewatch.com/webmasters/article.php/2167931
How did they do that http://www.tashian.com/htmlguide/
Image sources, info
Google Images http://images.google.com/
Barry’s clipart server http://www.barrysclipart.com/
Kids Domain clipart http://www.kidsdomain.com/clip/
Including an image
http://www.w3.org/TR/REC-html40-971218/struct/objects.html#h-13.2
ALT text attributes
http://builder.com.com/5100-31_14-5073307.html?tag=search
Last updated 12/31/2007
Denise A. Garofalo