3. Agenda
Introduction to XML
XML Tree
XML Syntax Rules
XML Elements
XML Attributes
XML Namespaces
XML Encoding
XML with CSS
4. Introduction to XML
What is XML?
• XML is a markup language much like HTML
• XML was designed to describe data.
• XML tags are not predefined.
• XML is a W3C Recommendation
XML is not a replacement for HTML
• XML specifies what data is.
• HTML specifies how data looks.
XML does not do anything by itself.
Software must be written to send, receive, store, or display it.
5. Advantages of XML:
• XML Separates Data from HTML
• XML Simplifies Data Sharing
• XML Simplifies Data Transport
• XML Simplifies Platform Changes
• Several Internet languages are written in XML.
XHTML
XML Schema
SVG
WSDL
RSS
6. XML Tree
• XML documents form a tree structure
• XML documents are made up of
Elements
Attributes
Text
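The tree structure can be seen by parsing a small document and walking it. The following is a minimal sketch using Python's standard xml.etree.ElementTree module; the bookstore document and the walk helper are made up for illustration and do not come from the slides.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, used only to illustrate the tree structure.
DOC = """<bookstore>
  <book category="web">
    <title lang="en">Learning XML</title>
    <price>39.95</price>
  </book>
</bookstore>"""


def walk(element, depth=0):
    """Return one line per element: tag, attributes, and trimmed text."""
    text = (element.text or "").strip()
    lines = [f"{'  ' * depth}{element.tag} {element.attrib} {text!r}"]
    for child in element:
        lines.extend(walk(child, depth + 1))
    return lines


root = ET.fromstring(DOC)
tree_lines = walk(root)
print("\n".join(tree_lines))
```

Each printed line shows the three building blocks named above: the element tag, its attributes as a dictionary, and its text.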
7. XML Syntax Rules
• XML Elements Must Have a Closing Tag
• XML Tags are Case Sensitive
• XML Elements Must be Properly Nested
• XML Documents Must Have a Root Element
• Entity References
• Comments in XML
• XML must be well formed
Valid XML:
<color id="2">green</color> <!-- The color is green -->
Invalid XML:
<color id=2>green</Color
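One way to check these rules is to hand the text to a parser and see whether it is accepted. The sketch below uses Python's standard xml.etree.ElementTree; the is_well_formed helper is a hypothetical name, and the parser only checks well-formedness, not validity against a DTD or schema.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the parser accepts the text, False on a syntax error."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# The two snippets from the slide.
valid = '<color id="2">green</color> <!-- The color is green -->'
invalid = "<color id=2>green</Color"

print(is_well_formed(valid))
print(is_well_formed(invalid))

# Improper nesting and an unescaped ampersand are also rejected.
print(is_well_formed("<a><b></a></b>"))
print(is_well_formed("<note>Tom & Jerry</note>"))
# The entity reference &amp; is the accepted way to write an ampersand.
print(is_well_formed("<note>Tom &amp; Jerry</note>"))
```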
8. XML Elements
• An XML element is everything from its start tag to its end tag, inclusive.
• An element can contain
other elements
text
attributes
or a mix of all of the above.
• XML Elements must follow naming rules.
E.g.:
<country type="subcontinent">India</country>
XML Attributes
• Attributes provide additional information about an element.
• XML Attribute Values Must be Quoted
• Prefer elements for data; use attributes only to store metadata.
E.g.:
<file type="gif">computer.gif</file>
9. XML Namespaces
• Namespaces – to avoid name conflicts
Syntax:
xmlns:prefix="URI"
Default Namespace:
• Removes the need to use a prefix on every child element
Syntax:
xmlns="namespaceURI"
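To see how prefixes and a default namespace resolve name conflicts, the sketch below parses two kinds of table element with Python's standard xml.etree.ElementTree. The example.com URIs and the element names are assumptions made for illustration only.

```python
import xml.etree.ElementTree as ET

# The URIs and element names here are made up for illustration.
DOC = """<root xmlns:h="http://example.com/html"
      xmlns:f="http://example.com/furniture">
  <h:table><h:td>Apples</h:td></h:table>
  <f:table><f:name>Coffee table</f:name></f:table>
</root>"""

root = ET.fromstring(DOC)
# Map prefixes to URIs so that search paths can use the same prefixes.
ns = {"h": "http://example.com/html", "f": "http://example.com/furniture"}

html_cell = root.find("h:table/h:td", ns)
furniture_name = root.find("f:table/f:name", ns)
print(html_cell.text, "|", furniture_name.text)

# With a default namespace, child elements need no prefix in the document,
# yet they still belong to the namespace.
DEFAULT_NS_DOC = '<table xmlns="http://example.com/furniture"><name>Desk</name></table>'
table = ET.fromstring(DEFAULT_NS_DOC)
print(table.tag)
print(table[0].tag)
```

ElementTree reports each tag as {URI}localname, which shows that the two table elements are distinct names even though they look alike in the source.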
10. XML Encoding
• XML documents can contain international characters
Syntax:
<?xml version="1.0" encoding="UTF-8"?>
Unicode:
• Unicode is an industry standard for character encoding of text documents
• Unicode defines several encoding forms; the two most common for XML are:
UTF-8
UTF-16
• UTF = Universal character set Transformation Format.
• UTF-8 uses 1 byte (8 bits) to represent characters in the ASCII set, and two to
four bytes for the rest.
• UTF-16 uses 2 bytes (16 bits) for most characters, and four bytes for the rest.
• UTF-8 is the default for documents without encoding information.
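The byte counts can be checked directly. The sketch below encodes a few sample characters in both forms and then parses a UTF-8 byte string that carries an encoding declaration; the sample characters and the encoded_sizes helper are chosen for illustration. Characters outside the Basic Multilingual Plane, such as emoji, take four bytes in both encodings.

```python
import xml.etree.ElementTree as ET


def encoded_sizes(character):
    """Return (UTF-8 byte count, UTF-16 byte count) for one character."""
    # "utf-16-be" is used so that no byte order mark is counted.
    return len(character.encode("utf-8")), len(character.encode("utf-16-be"))


samples = {
    "A": "A",                # ASCII letter
    "e-acute": "\u00e9",     # Latin-1 range
    "euro sign": "\u20ac",   # elsewhere in the Basic Multilingual Plane
    "emoji": "\U0001f600",   # outside the Basic Multilingual Plane
}
for label, character in samples.items():
    print(label, encoded_sizes(character))

# A parser reads the encoding declaration when it is given raw bytes.
raw = '<?xml version="1.0" encoding="UTF-8"?><city>Z\u00fcrich</city>'.encode("utf-8")
city = ET.fromstring(raw)
print(city.text)
```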
11. XML with CSS
• XML documents can be formatted with CSS (Cascading Style Sheets)
• Formatting XML with CSS is not the most common method.
• W3C recommends using XSLT instead.
13. Agenda
Introduction to DTD
DTD Building Blocks
DTD Elements
DTD Attributes
DTD Entities
14. Introduction to DTD
• DTD defines the document structure with a list of legal elements and
attributes.
• An XML document that is well formed and also conforms to its DTD is valid.
Why DTD?
• With a DTD, each XML file can carry a description of its own format.
• To verify that XML received from outside sources is valid
• To maintain a standard for interchanging data
DTD Declaration Types:
1. Internal DTD Declaration
2. External DTD Declaration
15. 1. Internal DTD Declaration:
• The DTD is declared inside the XML file
Syntax:
<!DOCTYPE root-element [element-declarations]>
2. External DTD Declaration
• The DTD is declared in an external file
• The XML document refers to the DTD file
Syntax:
<!DOCTYPE root-element SYSTEM "filename">
16. DTD Building Blocks
• From a DTD point of view, all XML documents are made up of the following
building blocks
Elements
Attributes
Entities
PCDATA
CDATA
17. DTD Elements
• In DTD, elements are declared with an ELEMENT declaration.
Syntax:
<!ELEMENT element-name category>
or
<!ELEMENT element-name (element-content)>
Element Types:
• <!ELEMENT element-name EMPTY>
• <!ELEMENT element-name (#PCDATA)>
• <!ELEMENT element-name ANY>
• <!ELEMENT element-name (child1, child2, ...)>
• <!ELEMENT element-name (child-name)>
• <!ELEMENT element-name (child-name+)>
• <!ELEMENT element-name (child-name*)>
• <!ELEMENT element-name (child-name?)>
• <!ELEMENT element-name (child1, child2, (child3|child4))>
• <!ELEMENT element-name (#PCDATA|child1|child2|child3|child4)*>
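The occurrence operators in a content model (?, *, +), the comma for sequence, and the bar for choice behave much like regular-expression operators. The sketch below uses that similarity to check an element's children against an element-only content model. It is an illustration under stated assumptions, not a DTD validator: the function names are hypothetical, and #PCDATA, EMPTY, and ANY are not handled.

```python
import re
import xml.etree.ElementTree as ET


def content_model_to_regex(model):
    """Translate an element-only DTD content model into a regular expression.

    Each child name becomes a token ending in ';'. Commas (sequence) are
    dropped, while |, ?, * and + keep the same meaning in regex syntax.
    """
    compact = re.sub(r"\s+", "", model)
    tokens = re.sub(
        r"[A-Za-z_][\w.\-]*",
        lambda match: "(?:" + re.escape(match.group()) + ";)",
        compact,
    )
    return re.compile(tokens.replace(",", ""))


def children_match(model, element):
    """Return True if the element's child tags satisfy the content model."""
    names = "".join(child.tag + ";" for child in element)
    return content_model_to_regex(model).fullmatch(names) is not None


note = ET.fromstring("<note><to/><from/><body/></note>")
print(children_match("(to, from, body)", note))
print(children_match("(to, body, from)", note))
print(children_match("(to, from, (heading|body))", note))
print(children_match("(item+)", ET.fromstring("<list/>")))
```

A real validator also checks attributes, text content, and entity declarations, so a DTD-aware parser is the right tool for production use.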
19. DTD Entities
• Entities are like variables: a name that stands for a piece of text
• Entities can be declared internally or externally
1. Internal Entity Declaration:
Syntax:
<!ENTITY entity-name "entity-value">
2. External Entity Declaration:
Syntax:
<!ENTITY entity-name SYSTEM "URI/URL">
Entity reference in XML document:
<element-name>&entity-name;</element-name>
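An internal entity can be seen in action by declaring it in an internal DTD and parsing the document. The sketch below uses Python's standard xml.etree.ElementTree, whose underlying parser expands internal entities; the entity names and values are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Entity names and values are hypothetical examples.
DOC = """<?xml version="1.0"?>
<!DOCTYPE note [
  <!ENTITY writer "Donald Duck">
  <!ENTITY rights "All rights reserved">
]>
<note><author>&writer;</author><footer>&rights;</footer></note>"""

root = ET.fromstring(DOC)
author = root.find("author").text
footer = root.find("footer").text
print(author)
print(footer)
```

Because entity expansion happens inside the parser, documents from untrusted sources should be parsed with care.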
21. XML Schema
• XML schema describes the structure of an XML document.
• XSD (XML Schema Definition) is the XML Schema language
What is an XML Schema?
• XML Schema defines the legal building blocks of an XML document.
An XML Schema -
defines elements that can appear in a document
defines attributes that can appear in a document
defines which elements are child elements
defines the order of child elements
defines the number of child elements
defines whether an element is empty or can include text
defines data types for elements and attributes
defines default and fixed values for elements and attributes
22. Advantages of XML Schema over DTD
• XML Schemas are written in XML
• XML Schemas support data types
• XML Schemas support namespaces
XML Schema Syntax:
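• The <schema> element is the root element of every XML Schema.
• It must declare the XML Schema namespace, conventionally bound to the xs prefix.
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
...
...
</xs:schema>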
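Because a schema is itself an XML document, it can be read with an ordinary XML parser. The sketch below lists the element declarations in a small schema using Python's standard xml.etree.ElementTree. The note schema is a made-up example, and the code only reads the schema; it does not validate documents against it.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical schema for a <note> element with three string children.
SCHEMA = """<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="note">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="to" type="xs:string"/>
        <xs:element name="from" type="xs:string"/>
        <xs:element name="body" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>"""

schema = ET.fromstring(SCHEMA)
# Collect (name, type) for every xs:element declaration, in document order.
declared = [
    (element.get("name"), element.get("type"))
    for element in schema.iter(f"{{{XS}}}element")
]
print(schema.tag)
print(declared)
```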
XML With XSD:
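• An XML document can reference an XML Schema (XSD document), typically through the xsi:schemaLocation or xsi:noNamespaceSchemaLocation attribute on its root element.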
23. Agenda
XML Schema
XSD Simple Types
XSD Complex Types
XSD Complex Types – Indicators
XSD Complex Types - any & anyAttribute
XSD Complex Types - Element Substitution
Writing XML Schema
XSD Data types
24. XSD Simple Types
• The Simple Types in XSD are –
Simple Element
Attribute
1. Simple Element:
• A simple element contains only text; it cannot contain other elements or attributes.
Syntax:
<xs:element name="element-name" type="element-type"/>
• Simple elements can have default and fixed values
• XML Schema has a lot of built-in data types. The most common types are:
xs:string
xs:decimal
xs:integer
xs:boolean
xs:date
xs:time
25. 2. Attribute:
• Simple elements cannot have attributes.
• The attribute itself is a simple type.
Syntax:
<xs:attribute name="attribute-name" type="attribute-type"/>
E.g.:
<lastname lang="EN">Smith</lastname> <!--Element with Attribute -->
<xs:attribute name="lang" type="xs:string"/> <!-- Attribute definition -->
XSD Restrictions/Facets:
• Restrictions define acceptable values for XML elements or attributes.
• Restrictions on XML elements are called facets.
Different Restrictions:
• Restrictions on Values
• Restrictions on set of values
• Restrictions on a Series of Values
• Restrictions on Whitespace Characters
• Restrictions on Length
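To make the idea of facets concrete, the sketch below reads the facets of an xs:restriction and applies a few of them to candidate values. It is only an illustration: the type definitions, the facets and accepts helpers, and the facet subset are assumptions, and XSD pattern syntax is approximated with Python regular expressions, which agree only for simple patterns.

```python
import re
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"
DECL = 'xmlns:xs="http://www.w3.org/2001/XMLSchema"'

# Hypothetical simple types: a value range, a set of values, and a pattern.
AGE_TYPE = f"""<xs:simpleType name="ageType" {DECL}>
  <xs:restriction base="xs:integer">
    <xs:minInclusive value="0"/>
    <xs:maxInclusive value="120"/>
  </xs:restriction>
</xs:simpleType>"""

CAR_TYPE = f"""<xs:simpleType name="carType" {DECL}>
  <xs:restriction base="xs:string">
    <xs:enumeration value="Audi"/>
    <xs:enumeration value="Golf"/>
    <xs:enumeration value="BMW"/>
  </xs:restriction>
</xs:simpleType>"""

CODE_TYPE = f"""<xs:simpleType name="codeType" {DECL}>
  <xs:restriction base="xs:string">
    <xs:pattern value="[A-Z]{{3}}"/>
  </xs:restriction>
</xs:simpleType>"""


def facets(simple_type_xml):
    """Collect facet name -> list of values from the xs:restriction child."""
    restriction = ET.fromstring(simple_type_xml).find(XS + "restriction")
    found = {}
    for facet in restriction:
        found.setdefault(facet.tag.replace(XS, ""), []).append(facet.get("value"))
    return found


def accepts(simple_type_xml, value):
    """Check a string value against a small subset of facets."""
    found = facets(simple_type_xml)
    if "minInclusive" in found and int(value) < int(found["minInclusive"][0]):
        return False
    if "maxInclusive" in found and int(value) > int(found["maxInclusive"][0]):
        return False
    if "enumeration" in found and value not in found["enumeration"]:
        return False
    if "pattern" in found and not re.fullmatch(found["pattern"][0], value):
        return False
    return True


print(accepts(AGE_TYPE, "35"), accepts(AGE_TYPE, "150"))
print(accepts(CAR_TYPE, "Golf"), accepts(CAR_TYPE, "Ford"))
print(accepts(CODE_TYPE, "ABC"), accepts(CODE_TYPE, "abc"))
```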
26. XSD Complex Types
• A complex type element contains other elements and/or attributes.
• There are four kinds of complex elements -
empty elements
elements that contain only other elements
elements that contain only text
elements that contain other elements, attributes and text
** The Complex Type Elements can be Extended or Restricted
Empty elements:
• An empty complex element cannot have content; it can only carry attributes.
E.g.: <product prodid="1345" />
** Giving a complexType element a name, and giving elements a type attribute
that refers to that name, lets several elements share the same complex
type
27. Elements that contain only other elements:
• An "elements-only" complex type contains an element that contains only
other elements.
E.g.: <person>
<firstname>John</firstname>
<lastname>Smith</lastname>
</person>
Elements that contain only text:
• A complex text-only element can contain text and attributes.
E.g.: <shoesize country="france">35</shoesize>
• This type contains only simple content (text and attributes)
• We add a simpleContent element around the content.
28. Elements that contain other elements, attributes and text (Mixed):
• A mixed complex type element can contain attributes, elements, and text.
E.g.: <letter id="123">
Dear Mr.<name>John Smith</name>.
Your order <orderid>1032</orderid>
will be shipped on <shipdate>2001-07-13</shipdate>.
</letter>
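In a parsed tree, mixed content shows up as text interleaved with child elements. The sketch below parses a single-line version of the letter with Python's standard xml.etree.ElementTree, which stores the text before the first child in .text and the text after each child in that child's .tail. The whitespace in this version is chosen for illustration.

```python
import xml.etree.ElementTree as ET

# Single-line variant of the letter example, so whitespace is predictable.
LETTER = (
    '<letter id="123">Dear Mr. <name>John Smith</name>. '
    "Your order <orderid>1032</orderid> "
    "will be shipped on <shipdate>2001-07-13</shipdate>.</letter>"
)

letter = ET.fromstring(LETTER)
child_tags = [child.tag for child in letter]
full_text = "".join(letter.itertext())

print(letter.get("id"))
print(child_tags)
print(repr(letter.text), repr(letter[0].tail))
print(full_text)
```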
29. XSD Complex Types - Indicators
• We can control HOW elements are to be used in documents with indicators.
• There are seven indicators, classified into three types
a) Order indicators:
• Order indicators define the order of the elements.
All: The child elements can appear in any order, but each can occur
at most once
Choice: Either one child element or another can occur, but not both
Sequence: The child elements must appear in a specific order
b) Occurrence indicators:
• Occurrence indicators define the number of times an element can appear
maxOccurs: Maximum number of times an element can occur
minOccurs: Minimum number of times an element can occur
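The occurrence indicators can be illustrated by reading minOccurs and maxOccurs from a sequence and counting children in an instance. The sketch below makes several assumptions: the person example and helper names are hypothetical, both indicators default to 1 when omitted, and only counts are checked, not the order that the sequence indicator also requires.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical content model: one full_name, any number of child_name.
SEQUENCE = """<xs:sequence xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="full_name" type="xs:string"/>
  <xs:element name="child_name" type="xs:string"
              minOccurs="0" maxOccurs="unbounded"/>
</xs:sequence>"""


def occurrence_bounds(sequence_xml):
    """Map each declared element name to (min, max); max is None if unbounded."""
    bounds = {}
    for declaration in ET.fromstring(sequence_xml).findall(XS + "element"):
        low = int(declaration.get("minOccurs", "1"))
        high_raw = declaration.get("maxOccurs", "1")
        high = None if high_raw == "unbounded" else int(high_raw)
        bounds[declaration.get("name")] = (low, high)
    return bounds


def counts_ok(sequence_xml, instance):
    """Check that each declared child occurs an allowed number of times."""
    for name, (low, high) in occurrence_bounds(sequence_xml).items():
        count = len(instance.findall(name))
        if count < low or (high is not None and count > high):
            return False
    return True


person = ET.fromstring(
    "<person><full_name>Tove</full_name>"
    "<child_name>Hege</child_name><child_name>Stale</child_name></person>"
)
print(occurrence_bounds(SEQUENCE))
print(counts_ok(SEQUENCE, person))
print(counts_ok(SEQUENCE, ET.fromstring("<person/>")))
```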
30. c) Group indicators:
• Group indicators define related sets of elements.
Element Groups:
• Define related sets of elements
• Element groups are defined with the group declaration.
Syntax:
<xs:group name="groupname">
...
</xs:group>
Attribute Groups:
• Define related sets of attributes.
• Attribute groups are defined with the attributeGroup declaration
Syntax:
<xs:attributeGroup name="groupname">
...
</xs:attributeGroup>
31. XSD Complex Types - any & anyAttribute
any Element:
• The <any> element lets an XML document include elements not
specified by the schema.
anyAttribute Element:
• The <anyAttribute> element lets an XML document include
attributes not specified by the schema.
32. XSD Complex Types - Element Substitution
• With Element Substitution one element can substitute another in different
instances
• The substitutionGroup attribute on an element declaration names the head element it can substitute for.
• Substitution can be blocked by using attribute block="substitution"
33. Writing XML Schema
• An XML Schema can be written in the following ways
Hierarchical manner
Divide the Schema
Using Named Types
34. XSD Data types
• XSD provides the following categories of data types
String
Date
Numeric
Miscellaneous
Boolean
Binary
AnyURI
Reference:
http://www.w3schools.com