| Basics | Anchors | Images, movies, sound | Background color & image | Tables | Forms |
The "Markup" in HyperText Markup Language means that "tags" are added to the source page (a regular word-processing text file). For example, if the source contains
A couple of <b>bold, bold</b> words
the displayed text looks like:
A couple of bold, bold words.
Many tags have attributes: options that provide more precise control of the behaviour of the tag. These are included with the tag, within the angle brackets.
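As a minimal sketch of what attributes look like in practice (the text and the values are made up for illustration):

```html
<!-- align is an attribute of the p tag; "center" is its value -->
<p align="center">This paragraph is centered.</p>

<!-- a tag can carry several attributes, separated by spaces -->
<hr size="3" width="50%" align="left">
```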
The following is a list of the common HTML 3.2 and 4 tags, recognized by most current browsers, that should allow you to achieve pretty complicated page designs. Be warned that HTML is rapidly evolving. Different browsers may implement their own sets of unofficial tags that may or may not be recognized by other browsers; i.e., your page may not look too good in older, or even just different, browsers if you design at the bleeding edge.
Another issue is that HTML is moving to a new and more powerful XML-based model called XHTML, and its recommendations should be followed to ensure backward and forward compatibility.
For example, one of the most glaring differences is that HTML 4 tag names are not case sensitive (although the HTML 4 specification writes them in upper case), while XHTML is case sensitive and requires all regular HTML tags to be in lower case. Since most current browsers recognize both lower and upper case tags, the best bet is to use lower case tags exclusively.
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 3.2//EN">
<html>...all the stuff on the page...</html>
By definition, it contains the head and body of the page. It is supposed to define everything on the page that is to be displayed. But older documents don't have this tag, so it is optional (until XHTML comes along).
<head>...stuff in the header...</head>
defines material that belongs in the header. It is conceivable that a viewer or an indexing program can download just the header, to decide whether to bother to download the rest of the page. Thus the header is a good place to put a succinct description of what the rest of the page is all about.
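A header with such a description might look like the sketch below. The title and meta tags are standard HTML, but the wording inside them is invented for illustration.

```html
<head>
<!-- the title is shown in the browser's title bar and in bookmarks -->
<title>A short, descriptive page title</title>
<!-- a one-line summary that an indexing program can read without the body -->
<meta name="description" content="An introduction to common HTML tags">
</head>
```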
The head can contain a number of other tags, including:
isindex, which marks the page as a searchable index. It has a prompt attribute that allows you to use your own wording instead of whatever default text is normally used (for NetScape Navigator, the default is "This is a searchable index. Enter search keywords:"). Obviously, this does not work by itself; you need to have some program on the web server to service the database search request.
base, which sets the base URL that is used to resolve relative URLs on the page. The syntax is
<head><base href="URL"></head>
Essentially, the base URL (retaining all directory information but omitting any specific document) is prefixed to the relative URL (omitting any preceding dot-slash, UNIX-style shorthands).
For example, if we have
<head><base href="http://www.microbiology.wustl.edu/sindbis/default.html"></head>
then a relative link of
<A href="genes/nsP1p">
is interpreted as
<a href="http://www.microbiology.wustl.edu/sindbis/genes/nsP1p">
(see Anchors to make sense of this).
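Put together as a whole page, the example might look like the following sketch. It assumes the usual placement of base as its own tag inside the head; the title and the anchor text are invented.

```html
<html>
<head>
<title>Sindbis genes</title>
<!-- every relative URL on this page is resolved against this base -->
<base href="http://www.microbiology.wustl.edu/sindbis/default.html">
</head>
<body>
<!-- resolves to http://www.microbiology.wustl.edu/sindbis/genes/nsP1p -->
<a href="genes/nsP1p">nsP1</a>
</body>
</html>
```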
<body>...everything else...</body>
defines all the rest of the material on the page.
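The body tag has a number of attributes for specifying the background color, the background image, and the colors of the text and links on the page (see Background color & image).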
<br> (Mnemonic: break)
Forces a line break. To also move subsequent text beyond a floating image (see below), use
<br clear="left|right|all">
Text is moved until it is clear of the image on the left, right or both sides.
<p>(contents of the line)</p> (Mnemonic: paragraph)
<nobr>a line that should not break</nobr>
You can specify a word break using <wbr>, which marks where the line may break, if necessary, in an otherwise non-breaking line. (This might not work with some newer browser versions.)
Example, using the same long line as above:
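A minimal sketch of the two tags working together, using an invented line of text:

```html
<!-- the line stays on one line; wbr marks the only place it may break -->
<nobr>This line is kept together on one line,<wbr> except that the browser may break it here if the window is too narrow.</nobr>
```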
(All 'entities' or keywords are identified by an initial & and the required semicolon at the end.)
The following HTML code:
Here are some non-breaking spaces
looks like
Here are some non-breaking spaces
(Note how the regular spaces are scrunched, while the non-breaking spaces are observed. Also, note that the lower case is required. The upper case &NBSP; doesn't work!).
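In source form, a line of this kind could be written as in the sketch below. The exact number and placement of the spaces is an assumption, chosen only to show the contrast.

```html
<!-- runs of regular spaces collapse to one; each &nbsp; is kept -->
Here   are   some&nbsp;&nbsp;&nbsp;non-breaking&nbsp;&nbsp;&nbsp;spaces
```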
Here are some frequently used entities to display special characters (all need the & prefix and ; at the end):
| character code | displayed character |
|---|---|
| amp | & |
| lt | < |
| gt | > |
| quot | " |
| reg | ® |
| copy | © |
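A short sketch of these entities in use, for example to show a tag literally instead of having the browser interpret it (the sentences are made up):

```html
<!-- the browser shows the b and i tags as text instead of applying them -->
Use &lt;b&gt; for bold &amp; &lt;i&gt; for italics.
<br>
<!-- displays a copyright sign, a registered sign and a quoted word -->
&copy; &reg; &quot;quoted&quot;
```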
Characters that take the tilde, grave accent, acute accent, circumflex, or umlaut marks are constructed as follows
Examples:
Note that many of these are case-sensitive
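As a sketch of the pattern, each of the following is a letter followed by the name of the accent, wrapped in the usual & and ; (the particular letters are chosen only for illustration):

```html
<!-- letter + accent name: tilde, grave, acute, circ, uml -->
&ntilde; &egrave; &eacute; &ocirc; &uuml;
<!-- case matters: &Eacute; is the capital letter, &eacute; the small one -->
&Eacute; &eacute;
```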
<hx>.......</hx>
where x = 1 to 6.
The first 5 levels of headers look like the following
<basefont size="x">
x can be from 1 to 7.
To increase or decrease the font size of a block of text, use
<font size="x">........</font>
x can be an absolute number, e.g.,
Size is 1,
Size is 2,
Size is 3,
Size is 4,
Size is 5,
Size is 6,
Size is 7,
x can also be a relative number (e.g., +1, +2, -1, -2), to change the font size relative to the size specified by the basefont tag:
Size is +2. Size is +1. Regular or base size. Size is -1. Size is -2.
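A small sketch of absolute and relative sizes side by side, assuming a base size of 3 (the sentences are invented):

```html
<basefont size="3">
<font size="5">Absolute size 5.</font>
<font size="+2">Two steps larger than the base size (3 + 2 = 5).</font>
<font size="-1">One step smaller than the base size (3 - 1 = 2).</font>
```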
A few of the so-called "logical" styles or "Phrase elements" are
HTML Code
Note that tags within the CODE (and also the PRE) tag are interpreted.
If you don't want the code text to be interpreted, enclose it within a &lt; and &gt; pair instead of < and >.
This can be quite annoying once the initial novelty wears off.
<center>some centered text</center>
to get
alignment are cumulative.
Commonly used lists are:
<ol type="I" start="4">
<li>item 1
<li>item 2
</ol>
type and start are optional attributes.
When used, the type attribute can be 1 (Arabic numerals, the default), I or i (upper or lower case Roman numerals), or A or a (upper or lower case letters).
The start attribute is used to start the list with a value other than the default 1, I, i, A, or a.
Listed items are defined by the li tag. (Note that a closing tag is not needed for HTML, but is required by XHTML.)
Here is an ordered list, with type="a" and start="6"
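The source for such a list could look like this sketch (the item text is invented; the closing li tags are included for XHTML compatibility):

```html
<!-- type="a" with start="6" labels the first item f, the next g -->
<ol type="a" start="6">
<li>first item, labelled f</li>
<li>second item, labelled g</li>
</ol>
```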
<ul type="disc|circle|square">
<li>an item
<li>another item
</ul>
type is an optional attribute. It is used to specify the type of bullet at the start of a list item. The usual order for a nested set of lists is disc, then circle, then square. type is used to override the normal sequence.
Here is an unordered list with type="square"
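A minimal sketch of the source for such a list, reusing the item text from the syntax above:

```html
<!-- type="square" overrides the default disc bullet -->
<ul type="square">
<li>an item</li>
<li>another item</li>
</ul>
```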
Here is a whimsical countdown list
<hr size="x" width="y|y%" align="left|center|right" noshade>
Examples: <hr> by itself looks like
This is not the case when you specify a width
<hr width="50" align="left">
looks like
<hr size="3" width="300" align="center">
looks like
<hr size="5" width="50%" align="right">
looks like
<hr noshade>
It looks like
<XMP>....</XMP>
Text that is formatted as XMP always goes on a separate line, even overriding <NOBR>.
This tag is 'deprecated', meaning that it might not be supported in the future.
Instead, use &lt; and &gt; to enclose the code.
The links are accomplished by special elements called ANCHORS. An anchor is a page element (text, image, etc.) that a viewer can click to jump to the linked item of information (some place on the same page, or on another page anywhere else), or to initiate some action (email, gopher, ftp, etc.).
The format or syntax for the anchor tag is
<a href="RESOURCE:URL">anchor text or picture</a>
The 3 components to be specified are the RESOURCE, the URL or uniform resource locator, and the anchor text or picture.
The anchor text or picture is what the viewer sees and can activate by clicking. Always try to choose an anchor text that is informative by itself or in context, so that the viewer has some idea of what is about to happen. You can also place an image in place of or in addition to the anchor text.
The most commonly encountered RESOURCEs are a file on the same server, specified by its path:
"my_hard_drive/folder_1/an_inner_folder/innermost_folder/the_file"
or
"../another_inner_folder/the_file"
and http://, for a document served by a web server:
"http://www.servername.wustl.edu/a_hard_drive/folder_1/an_inner_folder/the_file"
Note the addition of the name of the server, and its location in the internet, that is serving up the document.
Both of the above may point to a specific place in the page to go to. This is accomplished by giving a name to that place (how else would a dumb computer know where to go?), and including the name in the URL, separating it from the storage location with #:
<a href="storage_location/doc_name#named_place_in_the_doc">anchor text</a>
or
<a href="http://servername/storage_location/doc_name#named_place_in_the_doc">anchor text</a>
To name a place in the document, insert the naming tag
<a name="some_name"></a>
at the place to go to. Obviously, for the linkage to work, the name in the URL must be identical to the name in the naming tag. For forward compatibility with XHTML, add id="name" as an extra attribute:
<a id="some_name" name="some_name"></a>
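The two halves can be sketched together as follows. The name "forms" and the anchor text are invented; the link uses only the # part, which points at a named place on the same page.

```html
<!-- the link: jumps to the place named "forms" on the same page -->
<a href="#forms">Go to the section on forms</a>

<!-- the target: placed where the jump should land -->
<a id="forms" name="forms"></a>
```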
Other commonly encountered RESOURCEs are:
mailto: to activate email. The URL is the email address of the intended recipient.
gopher:// to download a file or directory information using gopher file transfer protocol. The URL gives the location (including the host name, and the full UNIX-style path to the directory or file) of the information to go for.
ftp:// to use standard file transfer protocol to download a file or a directory. The URL gives the location (including the host name, and the full or UNIX-style shorthand of the path to the file) of the information to download.
telnet to use telnet to log into some host computer. The URL is the name and location of the host computer (e.g., servername.wustl.edu).
file:/// to open a file on your local hard drive (note three / instead of the usual 2). The URL is the full path to the file you want to open. For example,
<A href="file:///myHardDrive/Applications/Netscape folder/documents/myHomePage.html">my home page</A>
Some of the RESOURCEs will require the assistance of other programs. For example, with telnet, most browsers will turn the job over to a communication program. Browsers typically have a menu option for you to choose specific helper programs to use.
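A few of these RESOURCEs, sketched as anchors. The ftp host and file path are hypothetical, and the anchor text is invented.

```html
<!-- opens the viewer's email program with the address filled in -->
<a href="mailto:www@borcim.wustl.edu">Send us a comment</a>

<!-- downloads a file by FTP (hypothetical host and path) -->
<a href="ftp://ftp.example.edu/pub/readme.txt">Download the readme file</a>

<!-- hands the session over to a telnet helper program -->
<a href="telnet://servername.wustl.edu">Log in to the server</a>
```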
<IMG SRC="image_URL">
where image_URL is the URL of the image file. The URL used is identical to that of regular text-only files. For 'in-line' display (image is displayed on the web page), the images must be in the GIF (filename must end with .gif) or X Bitmap (filename must end with .xbm) format. Newer browsers can also display JPEG images in-line. Various plug-ins further extend the ability of browsers to display other image formats (including pdf, Adobe Acrobat format, as well as sound, video, etc., etc.). All other formats will be displayed in a separate window, by the designated helper applications. The following image attributes are available:
To move text down and beyond these floating images, use <BR clear="left|right|all"> to get clear of any images on the left side only, the right side only, or on both sides.
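A minimal sketch of a floating image followed by a clearing break, assuming an image floated with align=left (the text is invented):

```html
<!-- align=left floats the image at the left margin; text flows around it -->
<img src="image_URL" align="left" alt="my picture" width="206" height="84">
This text wraps along the right side of the image.
<!-- clear=left pushes what follows below the left-floated image -->
<br clear="left">
This text starts below the image.
```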
[Figure: green rectangles aligned with align = top, texttop, middle, absmiddle, baseline and absbottom.]
Specifying the image width and height can speed up the page display process.
If the specified width is less (more) than the actual image width, the image is contracted (expanded) accordingly. This can be used to distort an image horizontally.
If the specified height is less (more) than the actual image height, the image is contracted (expanded) accordingly. This can be used to distort an image vertically.
If both width and height are decreased (increased) by the same factor, the image is contracted (expanded) without distortion.
The attributes are included with the image tag, separated by spaces. For example:
<IMG SRC="image_URL" align=top alt="my picture" width=206 height=84>
Image files are typically much larger than text files, and will take proportionately longer to transmit. The viewer might get impatient and simply go somewhere else instead, thus never seeing your masterpiece. So, use images judiciously and efficiently.
For example, use:
<A HREF="URL_of_image"><IMG SRC="../images/wumsMM.gif" width=69 height=28></A> WUMS logo (1.2 MB)
to display:
(This is the same image as the one at the top of this page. It is contracted by specifying proportionately smaller width and height attributes.)
Normally you should not take this shortcut, since the whole image has to be downloaded, regardless of the final displayed size. You should create a separate, miniature version to use in-line.
I did it here:
1. to give an example of shrinking or expanding images using the width and height attributes.
2. Because the image has already been downloaded for display at the top of the page, it is available for use without needing to download it again.
Interlacing large GIF image files helps a lot.
<A HREF= "URL_of_the_file.file-extension">description of file</A> (5 MB)
(Again, the inclusion of the file size is a courtesy, since movies etc. are usually even bigger than image files).
Depending on the browser/plug-ins and the format of movies, sound, etc., the file is downloaded and then actually (dis)played by the browser, or by so-called "helper" applications. In the latter case, the browser downloads the file, and then uses the file extension to figure out what type of file it is, and, if it does not have the built-in capabilities, turns it over to the helper application to process. (If the viewer does not have the requisite helper applications, a well-mannered browser will offer to save it for later (dis)play).
Most browsers have a menu option for you to select the specific helper application to handle each type of downloaded file. Some common file formats are:
| File type | File extension | Comments |
|---|---|---|
| TIFF image | .tiff | Quicktime will do this in-line. Otherwise, use e.g. a graphics program |
| JPEG image | .jpg or .jpeg | helper not needed by newer browsers |
| rich text format files | .rtf | use e.g. Microsoft Word |
| Adobe Acrobat files | .pdf | use Adobe Acrobat viewer |
| AIFF sound | .aiff | Quicktime will do this in-line. Otherwise, use e.g. MoviePlayer |
| AU sound | .au | Quicktime will do this in-line. Otherwise, use e.g. MoviePlayer |
| QuickTime movie | .mov | Quicktime will do this in-line. Otherwise, use e.g. MoviePlayer |
| MPEG movie | .mpeg or .mpg | Quicktime will do this in-line. Otherwise, use e.g. Sparkle |
| binhexed files | .hqx | use e.g. Stuffit Expander, Stuffit, unstuffit |
If the viewer has turned off "Auto load images", background images will not be loaded (nor will foreground images). If the background color is not specified, then the colors of the text elements that you specify are ignored and they are set to the default colors.
<BODY BGCOLOR="#hxhxhx">
where each hx is a two-digit hexadecimal (hex) number, and the three hex numbers are, in order, the intensity of red, green, blue to be displayed. 000000 is black, and FFFFFF is white.
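As a sketch, the following sets a pale yellow background (the color is chosen only for illustration):

```html
<!-- red = FF, green = FF, blue = CC: full red and green, slightly less blue -->
<body bgcolor="#FFFFCC">
...everything else...
</body>
```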
Note that the default color table of the common browsers uses only the hex values of 00, 33, 66, 99, CC, FF for each of the R, G and B colors (hence the 6x6x6 color table or color cube, or the so-called web-safe palette or color table). All other colors are achieved by dithering, which yields colors that may differ among different browsers and operating systems. If you want a precise color to be shown, use one of the 6x6x6 colors. This constraint is true for images in general. Try to use indexed colors, with the internet safe color table, when you save images in the .gif format (also remember to have diffusion selected).
Check that the color you use provides good legibility to the text and images on your page. Otherwise, the person looking at your page might see a big color rectangle, with some patches and specks scattered on it.
BGCOLOR is covered up by a background image. NetScape Navigator will paint the background color anyway, before covering it up with the image. The effect to the viewer is a quick flash of color, and then the background image appears. Unless you can use it to good effect, use a background color or an image, not both.
<BODY BACKGROUND="URL_of_the_image">
There are a few points to consider:
So, unless you have good reasons to do otherwise, make the image small. A 60 x 60 pixel image is nice and small. NetScape Navigator will tile it across and down the page.
Otherwise, the image might be too distracting, or might actually obscure the real text or images on the page.
Embossed paper (the effect I hope to achieve with the 'micro' background image of this page, with a size of 318 bytes), watermarks and light pastel textures work well.
These can be created in Photoshop reasonably easily. Many such images can also be downloaded from the web.
The color of each of the 4 types of text (regular text, links, visited links, and active links) may be set (as attributes of the BODY tag) as follows:
<BODY TEXT="#hxhxhx" LINK="#hxhxhx" VLINK="#hxhxhx" ALINK="#hxhxhx">
where hx is a hexadecimal (hex) number, and the three hex numbers are, in order, the intensity of red, green, blue to be displayed.
000000 is black, and FFFFFF is white.
I don't like the default bright blue and red colors used by NetScape Navigator, so for this page I made the colors slightly darker, using
LINK="#0202aa" and VLINK="#aa0202"
Just for fun (since you barely see it anyway) I used a medium dark green (025502, bright green is almost illegible to my old eyes) for ALINK. Note that the precise color is not so important, and I didn't bother to conform to the internet safe color table.
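Put together, the body tag described here might look like the sketch below. The LINK, VLINK and ALINK values are the ones quoted above; the black TEXT value is an assumption.

```html
<!-- text: black (assumed); link: dark blue; vlink: dark red; alink: dark green -->
<body text="#000000" link="#0202aa" vlink="#aa0202" alink="#025502">
...everything else...
</body>
```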
<FONT COLOR="#336600">Colored text</FONT>
gives
Colored text
If you set the text colors, make sure that the text is legible against whatever background color or image that you choose to use! Some colors just don't show up very well.
Also, make sure that the different types of text have clearly distinct colors, to make them easily distinguishable by the person looking at your page.
<table>....</table>
All of the following table attributes must be within the TABLE tag to be interpreted properly.
Optional table attributes are
BORDER, WIDTH, CELLSPACING, CELLPADDING, ALIGN, VALIGN
Each Table Row is defined by <tr>....</tr>
The number of Table Row tags determines the number of rows in the table.
Optional table row attributes are
ALIGN, VALIGN
Table Data that go into each cell of a row are defined by <td>....</td>
The number of Table Data tags determines the number of cells in the current row of the table.
Optional table data attributes are
ALIGN, VALIGN, WIDTH, COLSPAN, ROWSPAN, NOWRAP
Note that when you start messing with WIDTH and COLSPAN and ROWSPAN and NOWRAP, it is easy to get UGLY tables with jagged edges, or worse. You have to keep track of the width (the number of cells) of each row and perhaps even the width of the individual cells.
No pain, no gain.
CAPTION and TH
<table width=480 cellpadding=3 cellspacing=3 border=4>
<caption><font size=+1><b>Table Title</b></font></caption>
<tr>
<td colspan=3 align=center>Cell that spans multiple columns</td>
</tr>
<tr align=center>
<td width=160 rowspan=2>Cell that spans multiple rows
<table width=80% border=1 cellpadding=5>
<tr><td>Table</td><td>inside</td></tr>
<tr><td align=center>another</td><td align=center>table</td></tr>
</table></td>
<th colspan=2>A Header</th>
</tr>
<tr>
<td width=160 align=center>This could be text, an anchor, an image, a form, etc.</td>
<td width=160 valign=bottom>cell with <code>valign=bottom</code></td>
</tr>
<tr>
<td align=right>right and</td>
<td colspan=2 align=left>left align</td>
</tr>
</table>
produces this gem:
| Cell that spans multiple columns | | |
|---|---|---|
| Cell that spans multiple rows (with a nested table: Table, inside, another, table) | A Header | |
| | This could be text, an anchor, an image, a form, etc. | cell with valign=bottom |
| right and | left align | |
When you use a web browser to view various documents out there on the Web, your browser program is the 'client'. It sends a request for a specific file to the 'server'. The server is the program out there that finds the file and sends it to your browser client. Your browser client then interprets the information in the file, including the HTML tags described above, and draws the displayed page on your monitor.
In addition to this basic transaction between client and server, there are provisions for additional communications. The client can send information to the server, which the server can process and respond to.
A form is one way the client can send information to the server. Obviously, the information in the form needs to be processed by the server to achieve anything useful.
Thus there is no point in including forms in your page unless you have a server and the appropriate cgi's to process the data returned by the form.
Just as the client browser uses a host of helper applications, the server typically takes advantage of other applications whose services are available to it. Since most applications were written with no regard to the web, the server cannot interact directly with them. Instead, communication between the server and these applications uses intermediaries called cgi's (Common Gateway Interfaces). A cgi is a small program, typically created using some scripting system, that accepts data from the web server, translates it into language that a particular application can understand, and sends the data to that application. It might also mediate the reverse transfer: the application might return some information, which is translated by the cgi and sent back to the web server, which in turn processes the data and sends it to your browser client. More sophisticated cgi's can carry out all of the processing, independent of other applications.
<FORM ACTION="URL_of_cgi_that_processes_the_info" METHOD="GET|POST">
(various elements to get user's input)
</FORM>
The choice of GET or POST determines the method of exchanging data between browser and server: GET appends the form data to the URL, while POST sends it in the body of the request.
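A minimal sketch of a form that uses GET. The cgi path and the field name "query" are hypothetical.

```html
<!-- with GET, submitting "nsP1" requests /cgi-bin/search.cgi?query=nsP1 -->
<form action="/cgi-bin/search.cgi" method="get">
Search for: <input type="text" name="query" size="20">
<input type="submit" value="Search">
</form>
```

Because the data travel in the URL, a GET request can be bookmarked, which suits simple searches; POST is the usual choice for longer or more private input.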
All of the following tags must be bracketed by the FORM tag to be interpreted properly.
Putting it all together:
<FORM>
<p>Type something:<INPUT type="text" value="Try it" size=20 maxlength=30>
<p>Enter your password:<INPUT type="password" value="secret" size=30 maxlength=20>
<p><INPUT type="checkbox" name="kase" value="cheddar">I like cheddar.
<br><INPUT type="checkbox" name="sausage" value="pepperoni" checked>I like pepperoni.
<p><INPUT type="radio" name="grp_1" value="B11" checked>It's night. Button 1 of group 1
<br><INPUT type="radio" name="grp_1" value="B12">It's day. Button 2 of group 1
<p><INPUT type="radio" name="grp_2" value="B21" checked>It's wet. Button 1 of group 2
<br><INPUT type="radio" name="grp_2" value="B22">It's dry. Button 2 of group 2
<p>If the ACTION attribute of the FORM tag is missing, as it is here, the current page is simply reloaded, and all inputs are reset when the SUBMIT button is clicked.
<INPUT type="submit" value="Send it into the ozone">
<INPUT type="reset" value="Erase it all!">
</FORM>
This is interpreted by the browser as follows, to demonstrate the various INPUT types:
Here is an example of a SELECT field
<FORM action="some URL" method="post">
<SELECT name="a_list_of_choices" size=2 multiple>
<OPTION>choice 1
<OPTION selected>choice 2
<OPTION>choice 3
<OPTION>choice 4
<OPTION>choice 5
</SELECT>
</FORM>
becomes
Example of a text area:
Comments:<P>
<form action="some URL" method="post">
<TEXTAREA name="user_comments" rows=2 cols=40>
</textarea>
</form>
becomes:
Comments:
Here is the code for a simple form (with the help of an invisible table to space things out nicely):
<form action="URL_of_cgi_that_processes_this_form" method="post">
<table border=0 width=450>
<tr>
<td colspan=3>Address: <input name="name_of_input" type=text size=40></td>
</tr>
<tr>
<td width=175>&nbsp;</td>
<td width=150><input type="submit" value="Submit info"></td>
<td width=125><input type="reset" value="Erase entry"></td>
</tr>
</table>
</form>
This is what it looks like:
Return to Molecular Microbiology Home Page
Send suggestions and comments to: www@borcim.wustl.edu
Tel 314-362-7250, FAX 314-362-1232