Trinity College



          Markup Languages
                           Timothy Richards




    Trinity College, Hartford CT • Department of Computer Science • CPSC 225
HTML
            Hypertext Markup Language


           A Family of Related Languages


          Most documents communicated
        on the web are written using HTML.




HTML
After the next few lectures you should be able to...
• Create standards-compliant static HTML documents.
• Know where to find the reference definitions of HTML and XML,
  and understand (most of) those definitions.
• Determine whether an XHTML document is syntactically correct
  by consulting an XML document type definition or schema.
• Describe the history of HTML and the relationship between
  HTML, XML, and XHTML.
• Discuss the pros and cons of following standards.
• Explain the additions in the next version: HTML 5.

HTML Example

  <!DOCTYPE html
     PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
     "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
  <html xmlns="http://www.w3.org/1999/xhtml">
     <head>
        <title>HelloWorld</title>
     </head>
     <body>
        <p>Hello World!</p>
     </body>
  </html>

HTML Example
Every HTML document contains two types of information

HTML Example
          The markup information (tags)

HTML Example
The character data of the document (not tags)

HTML Example
   The first three lines form the document type declaration (more later)

HTML Example
   The document instance: everything from <html> through </html>

HTML Example
    Each tag is either a start tag (<head>) or an end tag (</head>)

HTML Example
The “word” in a tag is called the element name
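The terms introduced so far (markup, character data, start tags, end tags, element names) can be illustrated with a short Python sketch. It is not part of the original slides: it assumes Python's standard html.parser module, and the names DOCUMENT, MarkupSplitter and split_markup are hypothetical. The document string is the Hello World example typed with straight quotes and the XHTML 1.0 Strict DTD URL as its system identifier.

```python
from html.parser import HTMLParser

# The Hello World example (assumption: straight quotes, Strict DTD URL).
DOCUMENT = """<!DOCTYPE html
   PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
   "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
   <head>
      <title>HelloWorld</title>
   </head>
   <body>
      <p>Hello World!</p>
   </body>
</html>
"""


class MarkupSplitter(HTMLParser):
    """Collects element names from tags, and character data, separately."""

    def __init__(self):
        super().__init__()
        self.start_tags = []      # element names seen in start tags
        self.end_tags = []        # element names seen in end tags
        self.character_data = []  # text that is not markup

    def handle_starttag(self, tag, attrs):
        self.start_tags.append(tag)

    def handle_endtag(self, tag):
        self.end_tags.append(tag)

    def handle_data(self, data):
        # Skip the whitespace that only indents the markup.
        if data.strip():
            self.character_data.append(data.strip())


def split_markup(text):
    """Feed a document to the parser and return the filled-in splitter."""
    splitter = MarkupSplitter()
    splitter.feed(text)
    splitter.close()
    return splitter


result = split_markup(DOCUMENT)
print("start tags:    ", result.start_tags)
print("end tags:      ", result.end_tags)
print("character data:", result.character_data)
```

Every element name shows up once in a start tag and once in an end tag, and the only character data is the text of the title and the paragraph.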

HTML Example
Everything between <head> and </head> is called the content of the head element.

HTML Example
       Each document has a root element.
       This is always html in HTML documents.

HTML Example
        This document strictly conforms
          to the XHTML 1.0 standard

HTML Example
When viewed as a tree, the root element of an XHTML 1.0 document always
has two children: head and body.
• The head element is used to provide certain instructions to the browser.
• The body element defines the content of the page.

HTML Example
               This document as a tree.


                                html


                head                            body



                 title                            p



           “HelloWorld”                   “Hello World!”
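The same tree can be recovered programmatically. The sketch below is an illustration rather than part of the lecture: it assumes Python's standard xml.etree.ElementTree module, and the helper names local_name and outline are hypothetical. The document string is the Hello World example typed with straight quotes and the XHTML 1.0 Strict DTD URL.

```python
import xml.etree.ElementTree as ET

# The Hello World example (assumption: straight quotes, Strict DTD URL).
DOCUMENT = """<!DOCTYPE html
   PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
   "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
   <head>
      <title>HelloWorld</title>
   </head>
   <body>
      <p>Hello World!</p>
   </body>
</html>
"""


def local_name(tag):
    """Drop the '{namespace}' prefix ElementTree puts on element names."""
    return tag.rsplit("}", 1)[-1]


def outline(element, depth=0):
    """Return the tree as indented lines: element names, then quoted text."""
    lines = ["  " * depth + local_name(element.tag)]
    text = (element.text or "").strip()
    if text:
        lines.append("  " * (depth + 1) + '"' + text + '"')
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines


root = ET.fromstring(DOCUMENT)
tree_lines = outline(root)
print("\n".join(tree_lines))
```

The root is html, its two children are head and body, and the character data sits at the leaves, matching the diagram.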

HTML History
• Tim Berners-Lee (CERN, 1990)
• CERN - Physics Research Center
• Originally designed with science and
 engineering interests in mind.

• 1992 Elements:
 •   title
 •   paragraph
 •   hyperlinks
 •   headings
 •   simple lists
 •   glossaries
 •   monospace text
 •   address blocks & search terms in URL

That was it!

HTML History
• Marc Andreessen, Eric Bina
 • National Center for Supercomputing Applications (NCSA)
 • Graphical Browser: Mosaic (1993)
• Key Developers Left...
 • To form Netscape Communications!
• Microsoft
 • Created a team to develop Internet Explorer.
• The “Browser Wars”!
 • 1993-1997 HTML was defined by browser support
 • This was BAD! Why?

HTML History
• HTML Developers
 • Required to “code” to each browser’s idiosyncrasies
• World Wide Web Consortium (W3C)
 • Launched in October of 1994 (16 years ago this month!)
 • Tim Berners-Lee
 • Goal: Produce Web Standards!



HTML History
• Standards lagged behind de facto standards
 • HTML 2.0 became a standard 6 months after the draft for 3.0 was released
 • HTML 3.0 never became a standard
 • HTML 3.2 was adopted as a standard by the W3C in 1997
 • The 3.2 specification captured the “practice” of 1996 (a year behind)
 • HTML 4 was released in December 1997
 • HTML 4.01 is the “standard”
 • HTML 5 is up and coming!

HTML History
      HTML standards are now being
      adopted from the W3C rather than
         browser manufacturers.

           There are two important aspects
            of standardization for HTML.


Syntax                                              Semantics



HTML History
                          The Syntax



 Defines the strings of characters that can be
 used to represent an HTML document and
             those that cannot.

             < > A-Z a-z / * & % $ @ ! 0-9




HTML History
                      The Semantics



A description of what the various elements of
  a syntactically correct document mean.

      The p element represents a paragraph
       The a element represents an anchor
     The href attribute represents a hyperlink


HTML History
                      The Semantics



    Formal methods do exist for defining
semantics; often, however, a language's semantics
  are defined using natural-language descriptions.

   For the syntax of computer languages,
 however, we use a metalanguage to describe
         components of the language.


HTML History
For languages such as Java, a formal notation
known as Backus-Naur Form (BNF) is used.

      BNF could be used to define HTML...

                  But, SGML
   (Standard Generalized Markup Language)
            is used for HTML 4.01




HTML History
        Turns out SGML is VERY complex!

 W3C introduced XML in 1998 to describe
               HTML...

      This resulted in XHTML 1.0, which is
      syntactically identical to HTML 4.01
            (with some restrictions)

HTML History
• XHTML 1.0
 • Semantically identical to HTML 4.01
 • Restricts some of HTML 4.01's generality
• Abstract Syntax Trees (AST)
 • Representation of HTML elements “abstractly” as trees
• Concrete Syntax Trees (CST)
 • Representation of HTML elements as characters in trees
• XHTML 1.0 AST == HTML 4.01 AST
• XHTML 1.0 CST != HTML 4.01 CST
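The two comparisons above can be sketched in Python. This is a simplified illustration, not the lecture's own material: it assumes the standard html.parser module (a lenient HTML parser, not an SGML-based HTML 4.01 parser), it uses a flat list of parse events as a stand-in for the abstract syntax tree, and the sample fragments and the names EventRecorder and abstract_structure are hypothetical.

```python
from html.parser import HTMLParser

# Hypothetical fragments: the same content in two concrete spellings.
HTML_401 = "<P CLASS=intro>Hello <A HREF=page.html>World</A>!</P>"
XHTML_10 = '<p class="intro">Hello <a href="page.html">World</a>!</p>'


class EventRecorder(HTMLParser):
    """Records the abstract structure: elements, attributes, and text."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        # html.parser reports tag and attribute names in lowercase.
        self.events.append(("start", tag, tuple(attrs)))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))

    def handle_data(self, data):
        self.events.append(("text", data))


def abstract_structure(text):
    """Return the list of parse events for a fragment."""
    recorder = EventRecorder()
    recorder.feed(text)
    recorder.close()
    return recorder.events


same_characters = HTML_401 == XHTML_10
same_structure = abstract_structure(HTML_401) == abstract_structure(XHTML_10)
print("same characters (concrete):", same_characters)
print("same structure (abstract): ", same_structure)
```

The character strings differ in case and quoting, so the concrete forms are not equal, while the recorded elements, attributes, and text are the same.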
HTML History
• XHTML 1.0 Differences
 • Tags may not be omitted (HTML 4.01 allows some to be left out)
 • All element and attribute names must be lowercase
   (HTML 4.01 names are case insensitive)

 • All attribute values must be quoted (not always necessary
   in HTML 4.01)

• Differences are not burdensome
 • They make it easier to write software to process HTML
   documents
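One way to see why these restrictions help software: XHTML can be handed to any generic XML parser. The sketch below is an illustration under stated assumptions, not the lecture's method. It uses Python's standard xml.etree.ElementTree module, the sample fragments and the name is_well_formed are hypothetical, and it checks only XML well-formedness, not validity against the XHTML DTD.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragments: three HTML 4.01 habits and one XHTML-style fragment.
SAMPLES = {
    "omitted end tag": "<ul><li>one<li>two</ul>",
    "unquoted attribute value": "<p class=intro>Hello</p>",
    "mismatched case": "<P>Hello</p>",
    "xhtml style": '<ul><li class="first">one</li><li>two</li></ul>',
}


def is_well_formed(fragment):
    """Return True if a generic XML parser accepts the fragment."""
    try:
        ET.fromstring(fragment)
    except ET.ParseError:
        return False
    return True


results = {name: is_well_formed(text) for name, text in SAMPLES.items()}
for name, accepted in results.items():
    print(f"{name}: {'accepted' if accepted else 'rejected'}")
```

Only the XHTML-style fragment is accepted. Uppercase names used consistently, as in <P>Hello</P>, would pass this check, because XML is case-sensitive but does not mandate lowercase; catching that case takes validation against the XHTML DTD.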


Digging into XHTML


               More on this next time!

                       Any questions?





Contenu connexe

En vedette

Website design and devlopment
Website design and devlopmentWebsite design and devlopment
Website design and devlopmentPX Media
 
Struts2 course chapter 1: Evolution of Web Applications
Struts2 course chapter 1: Evolution of Web ApplicationsStruts2 course chapter 1: Evolution of Web Applications
Struts2 course chapter 1: Evolution of Web ApplicationsJavaEE Trainers
 
The Evolution of the Web
The Evolution of the WebThe Evolution of the Web
The Evolution of the WebCJ Gammon
 
Chapter17 system implementation
Chapter17 system implementationChapter17 system implementation
Chapter17 system implementationDhani Ahmad
 
Issues of web design and structure
Issues of web design and structureIssues of web design and structure
Issues of web design and structureDotTourism
 
eXtensible Markup Language (By Dr.Hatem Mohamed)
eXtensible Markup Language (By Dr.Hatem Mohamed)eXtensible Markup Language (By Dr.Hatem Mohamed)
eXtensible Markup Language (By Dr.Hatem Mohamed)MUFIX Community
 
Website Design Issues
Website Design IssuesWebsite Design Issues
Website Design Issuesrakudepp
 
Web1, web2 and web 3
Web1, web2 and web 3Web1, web2 and web 3
Web1, web2 and web 3mercedeh37
 
Web Evolution Nova Spivack Twine
Web Evolution   Nova Spivack   TwineWeb Evolution   Nova Spivack   Twine
Web Evolution Nova Spivack TwineNova Spivack
 
Web 1.0 to Web 3.0 - Evolution of the Web and its Various Challenges
Web 1.0 to Web 3.0 - Evolution of the Web and its Various ChallengesWeb 1.0 to Web 3.0 - Evolution of the Web and its Various Challenges
Web 1.0 to Web 3.0 - Evolution of the Web and its Various ChallengesSubhash Basistha
 
Fundamentals of Web for Non-Developers
Fundamentals of Web for Non-DevelopersFundamentals of Web for Non-Developers
Fundamentals of Web for Non-DevelopersLemi Orhan Ergin
 

En vedette (18)

Website design and devlopment
Website design and devlopmentWebsite design and devlopment
Website design and devlopment
 
Class2
Class2Class2
Class2
 
Struts2 course chapter 1: Evolution of Web Applications
Struts2 course chapter 1: Evolution of Web ApplicationsStruts2 course chapter 1: Evolution of Web Applications
Struts2 course chapter 1: Evolution of Web Applications
 
The Evolution of the Web
The Evolution of the WebThe Evolution of the Web
The Evolution of the Web
 
Chapter17 system implementation
Chapter17 system implementationChapter17 system implementation
Chapter17 system implementation
 
Issues of web design and structure
Issues of web design and structureIssues of web design and structure
Issues of web design and structure
 
Introduction to XML
Introduction to XMLIntroduction to XML
Introduction to XML
 
eXtensible Markup Language (By Dr.Hatem Mohamed)
eXtensible Markup Language (By Dr.Hatem Mohamed)eXtensible Markup Language (By Dr.Hatem Mohamed)
eXtensible Markup Language (By Dr.Hatem Mohamed)
 
Web evolution (Part I)
Web evolution (Part I) Web evolution (Part I)
Web evolution (Part I)
 
Client side and server side scripting
Client side and server side scriptingClient side and server side scripting
Client side and server side scripting
 
Website Design Issues
Website Design IssuesWebsite Design Issues
Website Design Issues
 
Web1, web2 and web 3
Web1, web2 and web 3Web1, web2 and web 3
Web1, web2 and web 3
 
Web Evolution Nova Spivack Twine
Web Evolution   Nova Spivack   TwineWeb Evolution   Nova Spivack   Twine
Web Evolution Nova Spivack Twine
 
Web 1.0 2.0-3.0-4.0 Overview
Web 1.0 2.0-3.0-4.0 OverviewWeb 1.0 2.0-3.0-4.0 Overview
Web 1.0 2.0-3.0-4.0 Overview
 
Web 1.0 to Web 3.0 - Evolution of the Web and its Various Challenges
Web 1.0 to Web 3.0 - Evolution of the Web and its Various ChallengesWeb 1.0 to Web 3.0 - Evolution of the Web and its Various Challenges
Web 1.0 to Web 3.0 - Evolution of the Web and its Various Challenges
 
Fundamentals of Web for Non-Developers
Fundamentals of Web for Non-DevelopersFundamentals of Web for Non-Developers
Fundamentals of Web for Non-Developers
 
ARPANET
ARPANETARPANET
ARPANET
 
Basic Web Concepts
Basic Web ConceptsBasic Web Concepts
Basic Web Concepts
 

Similaire à Markup Languages

Similaire à Markup Languages (20)

XHTML
XHTMLXHTML
XHTML
 
HTML5
HTML5HTML5
HTML5
 
Learning HTML
Learning HTMLLearning HTML
Learning HTML
 
Intro to Web Standards
Intro to Web StandardsIntro to Web Standards
Intro to Web Standards
 
Wt-UNNIT-1 (1).ppt
Wt-UNNIT-1 (1).pptWt-UNNIT-1 (1).ppt
Wt-UNNIT-1 (1).ppt
 
Introduction to Web Standards
Introduction to Web StandardsIntroduction to Web Standards
Introduction to Web Standards
 
Girl Develop It Cincinnati: Intro to HTML/CSS Class 1
Girl Develop It Cincinnati: Intro to HTML/CSS Class 1Girl Develop It Cincinnati: Intro to HTML/CSS Class 1
Girl Develop It Cincinnati: Intro to HTML/CSS Class 1
 
Base HTML & CSS
Base HTML & CSSBase HTML & CSS
Base HTML & CSS
 
HTML/CSS Lecture 1
HTML/CSS Lecture 1HTML/CSS Lecture 1
HTML/CSS Lecture 1
 
Intro to JavaScript
Intro to JavaScriptIntro to JavaScript
Intro to JavaScript
 
Doctype html public
Doctype html publicDoctype html public
Doctype html public
 
Web development using html 5
Web development using html 5Web development using html 5
Web development using html 5
 
Sustainable livelihood-framework-sr-presentation
Sustainable livelihood-framework-sr-presentationSustainable livelihood-framework-sr-presentation
Sustainable livelihood-framework-sr-presentation
 
lect9
lect9lect9
lect9
 
lect9
lect9lect9
lect9
 
HTML guide for beginners
HTML guide for beginnersHTML guide for beginners
HTML guide for beginners
 
Eclampsia 4-real-presentation
Eclampsia 4-real-presentationEclampsia 4-real-presentation
Eclampsia 4-real-presentation
 
DIWE - Coding HTML for Basic Web Designing
DIWE - Coding HTML for Basic Web DesigningDIWE - Coding HTML for Basic Web Designing
DIWE - Coding HTML for Basic Web Designing
 
Intro to JavaScript
Intro to JavaScriptIntro to JavaScript
Intro to JavaScript
 
XML Transformations With PHP
XML Transformations With PHPXML Transformations With PHP
XML Transformations With PHP
 

Plus de University of Massachusetts Amherst (7)

Community College Day Spring 2013
Community College Day Spring 2013Community College Day Spring 2013
Community College Day Spring 2013
 
Freshmen Advising Spring 2013
Freshmen Advising Spring 2013Freshmen Advising Spring 2013
Freshmen Advising Spring 2013
 
Basic SQL Part 2
Basic SQL Part 2Basic SQL Part 2
Basic SQL Part 2
 
Lecture 07 - Basic SQL
Lecture 07 - Basic SQLLecture 07 - Basic SQL
Lecture 07 - Basic SQL
 
Java review-2
Java review-2Java review-2
Java review-2
 
Lecture 06
Lecture 06Lecture 06
Lecture 06
 
java review
java reviewjava review
java review
 

Markup Languages

  • 1. Trinity College Markup Languages Timothy Richards Trinity College, Hartford CT • Department of Computer Science • CPSC 225
  • 2. HTML Hypertext Markup Language Trinity College, Hartford CT • Department of Computer Science • CPSC 225 2
  • 3. HTML Hypertext Markup Language A Family of Related Languages Trinity College, Hartford CT • Department of Computer Science • CPSC 225 3
  • 4. HTML Hypertext Markup Language A Family of Related Languages Most documents communicated on the web are written using HTML. Trinity College, Hartford CT • Department of Computer Science • CPSC 225 4
  • 5. HTML After the next few lectures you should be able to... • Create standards-compliant static HTML documents • Know where to find the reference definitions of HTML and XML and be able to understand (most of) these defns. • Determine if an XHTML document is syntactically correct by consulting an XML document type definition or schema. • Describe the history of HTML and relationship between HTML, XML, and XHTML. • Discuss pros and cons of following standards. • Explain the new additions to the next version: HTML 5 Trinity College, Hartford CT • Department of Computer Science • CPSC 225 5
  • 6. HTML Example <!DOCTYPE html PUBLIC “-//W3C//DTD XHTML 1.0 Strict//EN” “http://www.w3.org/1999/xhtml”> <html xmlns=”http://www.w3.org/1999/xhtml”> <head> <title>HelloWorld</title> </head> <body> <p>Hello World!</p> </body> </html> Trinity College, Hartford CT • Department of Computer Science • CPSC 225 6
  • 7. HTML Example Every HTML document contains two types of information <!DOCTYPE html PUBLIC “-//W3C//DTD XHTML 1.0 Strict//EN” “http://www.w3.org/1999/xhtml”> <html xmlns=”http://www.w3.org/1999/xhtml”> <head> <title>HelloWorld</title> </head> <body> <p>Hello World!</p> </body> </html> Trinity College, Hartford CT • Department of Computer Science • CPSC 225 7
  • 8. HTML Example The markup information (tags) <!DOCTYPE html PUBLIC “-//W3C//DTD XHTML 1.0 Strict//EN” “http://www.w3.org/1999/xhtml”> <html xmlns=”http://www.w3.org/1999/xhtml”> <head> <title>HelloWorld</title> </head> <body> <p>Hello World!</p> </body> </html> Trinity College, Hartford CT • Department of Computer Science • CPSC 225 8
  • 9. HTML Example The character data of the document (not tags) <!DOCTYPE html PUBLIC “-//W3C//DTD XHTML 1.0 Strict//EN” “http://www.w3.org/1999/xhtml”> <html xmlns=”http://www.w3.org/1999/xhtml”> <head> <title>HelloWorld</title> </head> <body> <p>Hello World!</p> </body> </html> Trinity College, Hartford CT • Department of Computer Science • CPSC 225 9
  • 10. HTML Example Document Type Declaration (more later) <!DOCTYPE html PUBLIC “-//W3C//DTD XHTML 1.0 Strict//EN” “http://www.w3.org/1999/xhtml”> <html xmlns=”http://www.w3.org/1999/xhtml”> <head> <title>HelloWorld</title> </head> <body> <p>Hello World!</p> </body> </html> Trinity College, Hartford CT • Department of Computer Science • CPSC 225 10
  • 11. HTML Example Document Instance <!DOCTYPE html PUBLIC “-//W3C//DTD XHTML 1.0 Strict//EN” “http://www.w3.org/1999/xhtml”> <html xmlns=”http://www.w3.org/1999/xhtml”> <head> <title>HelloWorld</title> </head> <body> <p>Hello World!</p> </body> </html> Trinity College, Hartford CT • Department of Computer Science • CPSC 225 11
  • 12. HTML Example Each tag is either a start tag or end tag <!DOCTYPE html PUBLIC “-//W3C//DTD XHTML 1.0 Strict//EN” “http://www.w3.org/1999/xhtml”> <html xmlns=”http://www.w3.org/1999/xhtml”> <head> <title>HelloWorld</title> </head> <body> <p>Hello World!</p> </body> </html> Trinity College, Hartford CT • Department of Computer Science • CPSC 225 12
  • 13. HTML Example The “word” in a tag is called the element name <!DOCTYPE html PUBLIC “-//W3C//DTD XHTML 1.0 Strict//EN” “http://www.w3.org/1999/xhtml”> <html xmlns=”http://www.w3.org/1999/xhtml”> <head> <title>HelloWorld</title> </head> <body> <p>Hello World!</p> </body> </html> Trinity College, Hartford CT • Department of Computer Science • CPSC 225 13
  • 14. HTML Example This is called the content of the head element. <!DOCTYPE html PUBLIC “-//W3C//DTD XHTML 1.0 Strict//EN” “http://www.w3.org/1999/xhtml”> <html xmlns=”http://www.w3.org/1999/xhtml”> <head> <title>HelloWorld</title> </head> <body> <p>Hello World!</p> </body> </html> Trinity College, Hartford CT • Department of Computer Science • CPSC 225 14
  • 15. HTML Example Each document has a root element <!DOCTYPE html PUBLIC “-//W3C//DTD XHTML 1.0 Strict//EN” “http://www.w3.org/1999/xhtml”> <html xmlns=”http://www.w3.org/1999/xhtml”> <head> <title>HelloWorld</title> </head> <body> <p>Hello World!</p> </body> </html> Trinity College, Hartford CT • Department of Computer Science • CPSC 225 15
  • 16. HTML Example Each document has a root element <!DOCTYPE html PUBLIC “-//W3C//DTD XHTML 1.0 Strict//EN” “http://www.w3.org/1999/xhtml”> <html xmlns=”http://www.w3.org/1999/xhtml”> <head> <title>HelloWorld</title> </head> <body> This is always <p>Hello World!</p> html in HTML </body> documents </html> Trinity College, Hartford CT • Department of Computer Science • CPSC 225 16
  • 17. HTML Example This document strictly conforms to the XHTML 1.0 standard <!DOCTYPE html PUBLIC “-//W3C//DTD XHTML 1.0 Strict//EN” “http://www.w3.org/1999/xhtml”> <html xmlns=”http://www.w3.org/1999/xhtml”> <head> <title>HelloWorld</title> </head> <body> <p>Hello World!</p> </body> </html> Trinity College, Hartford CT • Department of Computer Science • CPSC 225 17
  • 18. HTML Example When viewed as a tree, XHTML 1.0 Documents always have two children: head and body <!DOCTYPE html PUBLIC “-//W3C//DTD XHTML 1.0 Strict//EN” “http://www.w3.org/1999/xhtml”> <html xmlns=”http://www.w3.org/1999/xhtml”> <head> <title>HelloWorld</title> </head> <body> <p>Hello World!</p> </body> </html> Trinity College, Hartford CT • Department of Computer Science • CPSC 225 18
  • 19. HTML Example When viewed as a tree, XHTML 1.0 Documents always have two children: head and body <!DOCTYPE html PUBLIC “-//W3C//DTD XHTML 1.0 Strict//EN” “http://www.w3.org/1999/xhtml”> <html xmlns=”http://www.w3.org/1999/xhtml”> <head> <title>HelloWorld</title> </head> The head element is <body> used to provide certain <p>Hello World!</p> instructions to the browser </body> </html> Trinity College, Hartford CT • Department of Computer Science • CPSC 225 19
  • 20. HTML Example When viewed as a tree, XHTML 1.0 Documents always have two children: head and body <!DOCTYPE html PUBLIC “-//W3C//DTD XHTML 1.0 Strict//EN” “http://www.w3.org/1999/xhtml”> <html xmlns=”http://www.w3.org/1999/xhtml”> <head> <title>HelloWorld</title> </head> The body element defines <body> the content of the page. <p>Hello World!</p> </body> </html> Trinity College, Hartford CT • Department of Computer Science • CPSC 225 20
  • 21. HTML Example This document as a tree. html head body title p “HelloWorld” “Hello World!” Trinity College, Hartford CT • Department of Computer Science • CPSC 225 21
  • 22. HTML History • Tim Berners-Lee (CERN, 1990) • CERN - Physics Research Center • Originally designed with science and engineering interest in mind. • •1992 Elements: title • paragraph • hyperlinks • headings • simple lists • glossaries • monospace text • address blocks & search terms in URL Trinity College, Hartford CT • Department of Computer Science • CPSC 225 22
  • 23. HTML History • Tim Berners-Lee (CERN, 1990) • CERN - Physics Research Center • Originally designed with science and engineering interest in mind. • •1992 Elements: title • paragraph • hyperlinks That was it! • headings • simple lists • glossaries • monospace text • address blocks & search terms in URL Trinity College, Hartford CT • Department of Computer Science • CPSC 225 23
  • 25. HTML History
      • Marc Andreessen, Eric Bina
          • National Center for Supercomputing Applications (NCSA)
          • Graphical browser: Mosaic (1993)
      • Key developers left...
          • To form Netscape Communications!
      • Microsoft
          • Created a team to develop Internet Explorer.
      • The “Browser Wars”!
          • 1993-1997: HTML was defined by browser support
      • This was BAD! Why?
  • 27. HTML History
      • HTML developers
          • Required to “code” to each browser’s idiosyncrasies
      • World Wide Web Consortium (W3C)
          • Launched in October of 1994 (16 years ago this month!)
          • Tim Berners-Lee
          • Goal: Produce Web standards!
  • 28. HTML History
      • Standards lagged behind de facto standards
          • 2.0 became a standard 6 months after the draft for 3.0 was released
          • 3.0 was never a standard
          • 3.2 was adopted as a standard by W3C in 1997
          • The 3.2 specification captured the “practice” of 1996 (a year behind)
      • HTML 4 released in December 1997
      • HTML 4.01 is the “standard”
      • HTML 5 is up and coming!
  • 32. HTML History: HTML standards are now being adopted from W3C rather than browser manufacturers. There are two important aspects of standardization for HTML: syntax and semantics.
  • 34. HTML History: The Syntax. Defines the strings of characters that can be used to represent an HTML document and those that cannot.

        < > A-Z a-z / * & % $ @ ! 0-9
  • 36. HTML History: The Semantics. A description of what the various elements of a syntactically correct document mean.
      • The p element represents a paragraph
      • The a element represents an anchor
      • The href attribute represents a hyperlink
  • 38. HTML History: The Semantics. Formal methods do exist for defining semantics; however, the semantics of a language are often defined using natural-language descriptions. For the syntax of computer languages, however, we use a metalanguage to describe components of the language.
  • 41. HTML History: For languages such as Java, a formal notation known as Backus-Naur Form (BNF) is used. BNF could be used to define HTML... But SGML (Standard Generalized Markup Language) is used for HTML 4.01.
  • 45. HTML History: For languages such as Java, a formal notation known as Backus-Naur Form (BNF) is used. BNF could be used to define HTML... Turns out SGML is VERY complex! W3C introduced XML in 1998 to describe HTML... This resulted in XHTML 1.0, which is syntactically identical to HTML 4.01, with some restrictions.
  • 46. HTML History
      • XHTML 1.0
          • Semantically identical to HTML 4.01
          • Restricts some of the generality of HTML 4.01
      • Abstract Syntax Trees (AST)
          • Representation of HTML elements “abstractly” as trees
      • Concrete Syntax Trees (CST)
          • Representation of HTML elements as characters in trees
      • XHTML 1.0 AST == HTML 4.01 AST
      • XHTML 1.0 CST != HTML 4.01 CST
  • 47. HTML History
      • XHTML 1.0 differences
          • Omitted tags are not allowed
          • All element and attribute names must be lowercase (HTML 4.01 names are case insensitive)
          • All attribute values must be quoted (not always necessary in HTML 4.01)
      • Differences are not burdensome
          • They make it easier to write software to process HTML documents
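The restrictions listed above can be seen in action by handing a few fragments to an XML parser. This is a rough sketch, assuming Python's standard xml.etree.ElementTree module; the function is_well_formed and the sample fragments are invented for illustration. It checks XML well-formedness only. The lowercase-names rule comes from the XHTML DTD, which this sketch does not consult, so the mixed-case sample only shows that XML names are case sensitive.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup):
    """Return True if the XML parser accepts the markup, False otherwise."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# Hypothetical fragments: the first three are the kind of markup HTML 4.01
# browsers tolerate; the last one follows the XHTML 1.0 restrictions.
samples = {
    "omitted end tags": "<body><p>one<p>two</body>",
    "unquoted attribute value": "<p><a href=index.html>home</a></p>",
    "mixed-case tag pair": "<P>text</p>",
    "XHTML-style markup": '<body><p>one</p><p><a href="index.html">home</a></p></body>',
}

results = {label: is_well_formed(markup) for label, markup in samples.items()}
for label, ok in results.items():
    print(f"{label}: {'accepted' if ok else 'rejected'}")
```

Because every fragment is either accepted or rejected by a generic parser, software that processes XHTML does not need the browser-specific recovery rules that HTML 4.01 tag omission requires.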
  • 48. Digging into XHTML: More on this next time! Any questions?

Editor's Notes