XML Demystified


        Presented by Mr. Viraf Karai
       to staff of Sila Solutions Group
                 Seattle WA


    Fri. Feb 27, 2009 and Fri. Mar 6, 2009



Agenda (session # 1)
    What is XML
    History of XML
    XML syntax & semantics
    Well formed XML
    Valid XML
    DTDs
    XML schemas
    Relax NG
    Structure of XML docs
    XML node types
    Where XML is being used today
    Advantages & disadvantages of XML
    XML vocabularies
    XML authoring tools

What is XML
    Stands for eXtensible Markup Language
    It is a World Wide Web Consortium (W3C) standard
    A markup language like HTML, except that you can build your own tags
    Meant for machine consumption, yet still human-readable
    Hierarchical by nature
    In widespread use since the late '90s

History of XML
    SGML has been around since the '80s (ISO 8879:1986); it was the international standard for data markup
    Discussions began in 1996, with the focus on a new, simpler markup language
        The XML spec by Tim Bray et al. is only 26 pages; the SGML spec is ≈ 500 pages
    Approved as a W3C standard (spec 1.0) in Feb. 1998
    W3C spec (v 1.1) in Feb. 2004
        Mainly deals with Unicode enhancements (v4.0)

XML syntax & semantics
    Generally speaking, an XML document
        is well-formed if it passes the syntax checks
        is valid if it is well-formed and also passes the semantic checks specified by a grammar (DTD, XSD or Relax NG)
    A conforming parser must reject a document that is not well-formed; a validating parser additionally reports documents that break the grammar
    Yes, XML is very fussy!

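The well-formedness check described above can be sketched in a few lines of Java. This is a minimal illustration rather than a definitive implementation; the class and method names (WellFormedCheck, isWellFormed) are made up for this example, and only the standard JAXP parser that ships with the JDK is assumed.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: name and shape are assumptions for illustration.
public class WellFormedCheck {

    /** Returns true if the text parses as well-formed XML. No grammar is consulted. */
    public static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // Replace the default error handler so nothing is printed to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // The parser raises a SAXException for any syntax violation.
            return false;
        } catch (Exception e) {
            // Parser configuration or I/O trouble, unrelated to the document itself.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isWellFormed("<a><b>ok</b></a>"));
        System.out.println(isWellFormed("<a><b>broken</a></b>")); // nesting order violated
    }
}
```

Validity is a separate, stricter check: it needs a grammar in addition to a successful parse.
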
Well formed XML
    Element names
        must start with a letter or underscore (_)
        may then contain any number of letters, digits, underscores, hyphens or periods
        may not contain embedded spaces
        are case sensitive
    Every element must be closed; an empty element may use the self-closing form, e.g. <br/>

Well formed XML (cont'd)
    Element nesting order must be obeyed
    Enclose attribute values in single (') or double (") quotes
    Escape special characters (in text and attribute values)
        & → &amp;
        < → &lt;
        " → &quot; # if used in a double quoted attr
        ' → &apos; # if used in a single quoted attr

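The escaping rules above can be sketched as a small Java helper. The class and method names (XmlEscape, escape) are assumptions made for this example; the sketch escapes all five predefined characters unconditionally, which is always safe even where an escape is not strictly required.

```java
// Hypothetical helper: name and shape are assumptions for illustration.
public class XmlEscape {

    /** Replaces the five XML special characters with their predefined entity references. */
    public static String escape(String raw) {
        StringBuilder out = new StringBuilder(raw.length());
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            switch (c) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&apos;"); break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(escape("AT&T <rocks> \"today\""));
    }
}
```

In practice a DOM, StAX or similar API does this escaping when it writes the document, so hand-written escaping is mostly needed when XML is assembled from plain strings.
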
Sample well-formed XML doc

<?xml version='1.0' encoding='UTF-8'?>
<smallCompanies country='US' dateOfIncorporation='20081101'>
  <company name='Aztec Consulting Services'>
    <employee id='1' role='area manager'>Carly Whitman</employee>
    <employee id='2' role='bizdev manager'>Meg Fiorina</employee>
    <employee id='3' role='business analyst'>Ken Immelt</employee>
    <employee id='4' role='business analyst'>Jeff Lewis</employee>
    <employee id='5' role='technical analyst'>Mavis Rudd</employee>
    <employee id='6' role='software architect'>Perry Yang</employee>
    <employee id='7' role='db developer'>Amanda Blackwell</employee>
    <employee id='8' role='office manager'>Brenda Russo</employee>
    <employee id='9' role='accountant'>Andrea Barnes</employee>
    <employee id='10' role='mailroom clerk'>Tina Russell</employee>
  </company>
</smallCompanies>




Valid XML
    Validity is enforced by well-known frameworks and APIs, e.g. Spring, Hibernate, Apache SOAP and Java EE, for their config files and RPC messages
    Three basic models for constraints
        Document Type Definition (DTD)
        XML schemas (XSD)
        Relax NG
    A constraint model defines the structure of an XML document; the document usually refers to it by URI

DTDs (extn: .dtd)
    Introduced as part of the XML 1.0 spec
    Oldest constraint model
    First used in SGML
    Still used in the HTML 4.x spec
    Non-XML syntax
    Simple to use but inflexible
    Poor validation capabilities (e.g. no data types)

XML doc and its DTD

XML document:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE people_list SYSTEM "example.dtd">
<people_list>
  <person>
    <name>Fred Bloggs</name>
    <birthdate>27/11/2008</birthdate>
    <gender>Male</gender>
  </person>
</people_list>

DTD (example.dtd):

<!ELEMENT people_list (person*)>
<!ELEMENT person (name, birthdate?, gender?, socialsecuritynumber?)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT birthdate (#PCDATA)>
<!ELEMENT gender (#PCDATA)>
<!ELEMENT socialsecuritynumber (#PCDATA)>

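A sketch of DTD validation in Java, using the people_list grammar from the slide. To keep the example self-contained, the DTD is embedded as an internal subset instead of being loaded from example.dtd; that choice, and the names DtdValidation and validate, are assumptions made for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

// Hypothetical helper: name and shape are assumptions for illustration.
public class DtdValidation {

    // Same declarations as the slide's DTD, written as an internal subset.
    static final String DOCTYPE =
            "<!DOCTYPE people_list [\n"
          + "  <!ELEMENT people_list (person*)>\n"
          + "  <!ELEMENT person (name, birthdate?, gender?, socialsecuritynumber?)>\n"
          + "  <!ELEMENT name (#PCDATA)>\n"
          + "  <!ELEMENT birthdate (#PCDATA)>\n"
          + "  <!ELEMENT gender (#PCDATA)>\n"
          + "  <!ELEMENT socialsecuritynumber (#PCDATA)>\n"
          + "]>\n";

    /** Returns the list of problems found; an empty list means the document is valid. */
    public static List<String> validate(String body) {
        final List<String> problems = new ArrayList<>();
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(true); // switch on DTD validation
            DocumentBuilder builder = factory.newDocumentBuilder();
            builder.setErrorHandler(new ErrorHandler() {
                public void warning(SAXParseException e) { /* ignored in this sketch */ }
                public void error(SAXParseException e) { problems.add(e.getMessage()); }
                public void fatalError(SAXParseException e) throws SAXParseException { throw e; }
            });
            builder.parse(new InputSource(new StringReader(DOCTYPE + body)));
        } catch (Exception e) {
            // Not well-formed, or the parser could not be set up.
            problems.add(e.getMessage());
        }
        return problems;
    }

    public static void main(String[] args) {
        String good = "<people_list><person><name>Fred Bloggs</name></person></people_list>";
        String bad = "<people_list><person><gender>Male</gender></person></people_list>";
        System.out.println(validate(good));
        System.out.println(validate(bad)); // name is mandatory, so this reports a problem
    }
}
```

Validity violations arrive through the error callback, whereas a document that is not even well-formed triggers fatalError; the sketch collects both.
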
XML schemas (XSD extn: .xsd)
    W3C's successor to DTDs
    Far more powerful validation than DTDs
    Vastly more flexible compared to DTDs
    Rich built-in data types (strings, numbers, dates, ...) plus user-defined types
    Complex and difficult to author by hand
    Many XML gurus are unhappy with the complexity

XML doc and its XSD

XML document:

<?xml version="1.0"?>
<Process
   xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
   xsi:noNamespaceSchemaLocation="ComplexTypes.xsd">
  <Name>Bill Evjen</Name>
  <Address>123 Main Street</Address>
  <City>Saint Charles</City>
  <State>Missouri</State>
  <Country>USA</Country>
</Process>

XSD (ComplexTypes.xsd):

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Process">
    <xs:complexType>
      <xs:all>
        <xs:element name="Name" type="xs:string"/>
        <xs:element name="Address" type="xs:string"/>
        <xs:element name="City" type="xs:string"/>
        <xs:element name="State" type="xs:string"/>
        <xs:element name="Country" type="xs:string"/>
      </xs:all>
    </xs:complexType>
  </xs:element>
</xs:schema>

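A sketch of validating the Process document against its schema with the javax.xml.validation API from the JDK. The schema is held in a string so that the example runs on its own; the class and method names (XsdValidation, isValid) are assumptions made for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

// Hypothetical helper: name and shape are assumptions for illustration.
public class XsdValidation {

    // Same structure as the slide's ComplexTypes.xsd.
    static final String XSD =
            "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
          + "<xs:element name='Process'><xs:complexType><xs:all>"
          + "<xs:element name='Name' type='xs:string'/>"
          + "<xs:element name='Address' type='xs:string'/>"
          + "<xs:element name='City' type='xs:string'/>"
          + "<xs:element name='State' type='xs:string'/>"
          + "<xs:element name='Country' type='xs:string'/>"
          + "</xs:all></xs:complexType></xs:element></xs:schema>";

    /** Returns true if the document conforms to the schema above. */
    public static boolean isValid(String xml) {
        try {
            SchemaFactory factory =
                    SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = factory.newSchema(new StreamSource(new StringReader(XSD)));
            // With no custom error handler, validate() throws on the first violation.
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String doc = "<Process><Name>Bill Evjen</Name><Address>123 Main Street</Address>"
                   + "<City>Saint Charles</City><State>Missouri</State>"
                   + "<Country>USA</Country></Process>";
        System.out.println(isValid(doc));
        System.out.println(isValid("<Process><Name>Bill Evjen</Name></Process>"));
    }
}
```

Because the schema uses xs:all, the five children may appear in any order, but each must be present.
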
Relax NG (extn: .rng / .rnc)
    Stands for REgular LAnguage for XML Next Generation
    Not a W3C standard; developed at OASIS and later published as ISO/IEC 19757-2
    Offers an alternative to XSD complexity
    Based on Murata Makoto's RELAX and James Clark's TREX
    Mostly satisfies the Pareto principle: covers the common validation needs with far less complexity
    Has both an XML syntax (.rng) and a compact non-XML syntax (.rnc)

Sample RNG schemas

Basic pattern:

<?xml version="1.0" encoding="UTF-8"?>
<element name="phonebook">
  <element name="entry">
    <element name="firstName">
      <text/>
    </element>
    <element name="lastName">
      <text/>
    </element>
    <!-- etc... -->
  </element>
</element>

With occurrence constraints:

<?xml version="1.0" encoding="UTF-8"?>
<zeroOrMore>
  <element name="phonebook">
    <oneOrMore>
      <element name="entry">
        <element name="firstName"><text/></element>
        <optional>
          <element name="middleName"><text/></element>
        </optional>
        <element name="lastName"><text/></element>
        <!-- etc... -->
      </element>
    </oneOrMore>
  </element>
</zeroOrMore>

Sample XML doc with DTD decl

<?xml version="1.0" encoding="UTF-8"?>
<!-- DTD declaration -->
<!DOCTYPE beans PUBLIC "-//SPRING//DTD BEAN//EN"
  "http://www.springframework.org/dtd/spring-beans.dtd">

<beans>
  <!-- Axis2 Web Service, but to Spring, it's just another bean
       that has dependencies -->
  <bean id="springAwareService" class="spring.SpringAwareService">
    <property name="myBean" ref="myBean"/>
  </bean>

  <!-- just another bean / interface with a wired implementation,
       that's injected by Spring into the Web Service -->
  <bean id="myBean" class="spring.MyBeanImpl">
    <property name="val" value="Spring, emerge thyself"/>
  </bean>
</beans>

Sample XML doc with XSD decl

<?xml version="1.0" encoding="UTF-8"?>
<!-- xmlns attributes: XML namespaces; xsi:schemaLocation: XSD declaration -->
<beans
   xmlns="http://www.springframework.org/schema/beans"
   xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
   xmlns:aop="http://www.springframework.org/schema/aop"
   xsi:schemaLocation="
  http://www.springframework.org/schema/beans
  http://www.springframework.org/schema/beans/spring-beans-2.0.xsd
  http://www.springframework.org/schema/aop
  http://www.springframework.org/schema/aop/spring-aop-2.0.xsd">

   <bean class="org.springframework.beans.factory.config.PropertyPlaceholderConfigurer">
     <property name="locations"
               value="classpath:pestt_jdbc.properties"/>
   </bean>
</beans>

Structure of XML docs 




Structure of XML docs (cont'd)

<?xml version="1.0" encoding="UTF-8"?>
<nutrition>
  <?page skip?>
  <daily-values>
    <total-fat units="g">65</total-fat>
    <saturated-fat units="g">20</saturated-fat>
    <cholesterol units="mg">300</cholesterol>
    <sodium units="mg">2400</sodium>
    <carb units="g">300</carb> <!-- this is a comment -->
    <fiber units="g">25</fiber>
    <protein units="g">50</protein>
    <notes><![CDATA[Daily values for an adult <male> ]]></notes>
  </daily-values>
</nutrition>

Parts labelled on the slide:
    XML declaration: <?xml version="1.0" encoding="UTF-8"?>
    Doc element / End doc element: <nutrition> ... </nutrition>
    Processing instruction: <?page skip?>
    Element / End element: <daily-values> ... </daily-values>
    Attribute: units="g"
    Text: 65, 20, 300, ...
    XML comment: <!-- this is a comment -->
    Text (CDATA) / End of CDATA: <![CDATA[ ... ]]>

Hierarchy of previous example

nutrition
    ?page (PI)
    daily-values
        total-fat        attribute: units    text: 65
        saturated-fat    attribute: units    text: 20
        cholesterol      attribute: units    text: 300
        sodium           attribute: units    text: 2400

XML Node Types
    Document
    Element
    Attribute
    Text
    Comment
    Document Fragment
    Entity
    Entity reference
    Processing instruction

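To see the node types in a real tree, the sketch below parses a small document with the JDK's DOM parser and lists every node with its type. The names NodeTypeWalker and describe are assumptions made for illustration; attributes are listed explicitly because DOM does not treat them as children of their element.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper: name and shape are assumptions for illustration.
public class NodeTypeWalker {

    /** Returns one line per node: indentation (two spaces per level), type, node name. */
    public static List<String> describe(String xml) {
        List<String> lines = new ArrayList<>();
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            walk(doc, 0, lines);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return lines;
    }

    private static void walk(Node node, int depth, List<String> lines) {
        lines.add(indent(depth) + typeName(node) + " " + node.getNodeName());
        NamedNodeMap attrs = node.getAttributes(); // null unless the node is an element
        if (attrs != null) {
            for (int i = 0; i < attrs.getLength(); i++) {
                Node attr = attrs.item(i);
                lines.add(indent(depth + 1) + typeName(attr) + " " + attr.getNodeName());
            }
        }
        for (Node c = node.getFirstChild(); c != null; c = c.getNextSibling()) {
            walk(c, depth + 1, lines);
        }
    }

    private static String typeName(Node node) {
        switch (node.getNodeType()) {
            case Node.DOCUMENT_NODE:               return "Document";
            case Node.ELEMENT_NODE:                return "Element";
            case Node.ATTRIBUTE_NODE:              return "Attribute";
            case Node.TEXT_NODE:                   return "Text";
            case Node.CDATA_SECTION_NODE:          return "CDATA";
            case Node.COMMENT_NODE:                return "Comment";
            case Node.PROCESSING_INSTRUCTION_NODE: return "ProcessingInstruction";
            default:                               return "Other";
        }
    }

    private static String indent(int depth) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < depth; i++) sb.append("  ");
        return sb.toString();
    }

    public static void main(String[] args) {
        String xml = "<nutrition><?page skip?><fat units='g'>65</fat><!-- note --></nutrition>";
        for (String line : describe(xml)) System.out.println(line);
    }
}
```
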
XML Nodes: Document
    Represents the entire document
    Conceptual root of the document tree
    Provides primary access to the doc's data
    Every other node belongs to a Document
    Used in DOM for full traversal
    Used in SAX to signal the start of a document

XML Nodes: Element
    Represents an element in an XML doc
    May have 0 or more attributes
    May have 0 or more children (other elements, text, comments, CDATA, etc.)
    No children → leaf element
    All elements must be closed, e.g. <color>pink</color> or <color/>
    <color></color> is equivalent to <color/>

XML Nodes: Attributes
    Only associated with elements; always optional
    Specify metadata about an element
    Written as name/value pairs, e.g. Num='6'
    Each attribute value must be quoted with (') or (")
    Elements may have any number of attrs
    Each attr name must be unique within its element

XML Nodes: Text
    Represents textual content of elements
    Text nodes are leaves
    Some chars such as '<' and '&' must be escaped when authoring text
    Use CDATA when such chars occur frequently, since escaping impairs readability
    International chars may freely be inserted in text

XML Nodes: Comments
    Mainly exist for human readability
    Can span multiple lines
    Char sequence '--' is illegal inside comments
    Usually ignored by parsers
    Can't have nested comments

XML Nodes: Entity references
    Used to substitute for a single char that is also a markup delimiter in XML
    Using these prevents a literal char from being mistaken for a markup delimiter
    Predefined entity references:
        < → &lt;
        & → &amp;
        > → &gt;
        " → &quot;
        ' → &apos;

XML Nodes: Processing Instrs
    Provide info to the app processing the document
    E.g. how to process or render the doc
    The XML declaration looks like a PI, but isn't
    PIs consist of a target and data, e.g. <?sort alpha-ascending?>
        target: sort
        data: alpha-ascending

Where XML is being used today
    Config files
        Devices (Sony PRS-505)
        IDEs (IntelliJ, Eclipse)
        Frameworks (Spring, Hibernate, JPA)
    Java project build files (Ant, Maven)
    Web services (SOAP, REST) and XML-RPC
    Document storage (OpenOffice, MS-Office)
    Doc publishing (S1000D)
    Expr & query languages (XPath, XQuery)
    Transform languages (XSLT, XSL-FO)
    DBMS storage (SQLSrv, DB2, Oracle support XML)
    Web, PDAs, smart phones
    Vector graphics (SVG)
    Log, data files

Advantages of XML
    Full-blown W3C standard
    I18N support: UTF-8 and UTF-16
    Expressive: can describe CS data structures (records, lists, trees)
    Semantic validation using DTD, XSD, Relax NG
    Parsers on all platforms (Java, .NET, LAMP)
    Widely in use in industry and academia
    Several vocabularies, e.g. MathML, CML
    Rigid syntax → predictability in parsing

Disadvantages of XML
    Verbose syntax, not compact
    Can't easily represent binary data (it must be text-encoded, e.g. Base64)
    DOM and SAX APIs are complex
    Difficult to diff similar XML files
    Overheads in data transmission
    Different data model compared to RDBMS (hierarchical vs. relational)
    Storing XML in an RDBMS is unnatural

XML vocabularies
    S1000D (aerospace)
    SVG (graphics)
    MathML
    Food XML
    Legal XML
    Manufacturing XML
    Healthcare XML
    Human XML
    Address XML
    Finance XML
    Physics XML
    News XML
    Astronomy XML

XML authoring tools
    XMLSpy ($) 




    Arbortext ($)  




    Oxygen ($) 




    XML Copy Editor 




    EditiX ($)




    Stylus Studio  





Questions – session # 1




Agenda (session # 2)
    Sample PO (fixed fmt & XML)
    Typical Java enterprise tech
    The XML family
    Parsing XML
    DOM, SAX, StAX parsing
    XML parsing with Groovy
    Compare DOM, SAX, StAX
    XML usage in Java5 and beyond
    XMLEncoder sample code
    Reading, writing XML property files
    Sample log4j.xml config file
    XML to build Java projects
    The JAX family
    XML in Java enterprise development
    XML and databases
    XML and web services
    References

Typical Java enterprise technologies
    CORBA           RMI          Velocity    Spring      Struts
    JSF             Terracotta   GridGain    JBoss       Logging
    Oracle          Hibernate    XML         Reporting   Scripting
    JMX             JPA          LDAP        JavaMail    JAXB
    Quartz          AJAX         TestNG      JMS         JTA
    Tag Libraries   JNDI         JAAS        SNMP        Maven

Sample purchase order (rec types)
100: Jack Nicholson

120: 621 Mulholland Drive

140: Bel Air

150: CA

170: 90077

200: 3

210: 248

220: Decorative Widget, Red, Large

300: 19.95

200: 1

210: 1632

220: Packed electron storage container, AA, 4-pack

300: 4.95




Sample purchase order (XML)

<!DOCTYPE purchase.order SYSTEM "po.dtd">
<purchase.order>
<date>16 June 2009</date>
<billing.address>
  <name>Jack Nicholson</name>
  <street>621 Mulholland Drive</street>
  <city>Bel Air</city>
  <state>CA</state>
  <zip>90077</zip>
</billing.address>
<items>
  <item>
    <quantity>3</quantity>
    <product.num>248</product.num>
    <description>Decorative Widget, Red, Large</description>
    <unitcost>19.95</unitcost>
  </item>
  <item>
    <quantity>1</quantity>
    <product.num>1632</product.num>
    <description>Packed electron storage container, AA, 4-pack</description>
    <unitcost>4.95</unitcost>
  </item>
</items>
</purchase.order>

The XML family
    Has a number of members
        XHTML: HTML that's well-formed XML
        XSLT: transform XML to XML/HTML/Text
        XSL-FO: transform XML to PDF/PS/RTF
        XPath: XML expression lang (used in XSLT, XQuery and code, e.g. Groovy, Java, Python)
        XQuery: extract, manipulate data in an XML document
    All of the above are W3C standards
    All have full-blown I18N support (UTF)

Parsing XML
    Splendid support for parsing XML in Java, Groovy, C, C++, Ruby, C# (.NET), Python, Haskell, Scala, Lisp, Erlang, Perl, etc.
    Parsers available from handheld devices to massive supercomputers
    Most widely used techniques for parsing:
        DOM: build a tree representing the XML document
        SAX: fire events during parsing (push model)
        StAX: cursor and iterator based (pull model)

Parsing XML (cont'd)
    Above techniques are low-level
    DOM parsing heavily used in CAS Toolbox project
    Other (less common) parsing techniques:
        (a) JAXB (used in Toolbox Metadata Loader)
        (b) XML Beans
        (c) Commons Digester
    (a) and (b) bind XML to Java objects (OO); Digester uses rules that map the XML structure to objects

DOM parsing
    Full support for DOM in Java 5 and beyond
    Builds an in-memory tree of the XML doc
    Can't access the tree until it is fully built
    Random access to tree once built
    Extremely impractical for huge XML files: can fully consume virtual memory
    Foolproof way to build small & medium XML docs: build tree, then write out

DOM parsing (cont'd)

import java.io.File;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

DocumentBuilderFactory dbfactory = DocumentBuilderFactory.newInstance();
dbfactory.setNamespaceAware(true);
DocumentBuilder domparser = dbfactory.newDocumentBuilder();

// parse the XML and create the DOM
Document doc = domparser.parse(new File("data.xml"));

// to create a new DOM from scratch:
// Document doc = domparser.newDocument();

// use the DOM once you have the Document handle

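The "build tree, then write out" approach from the DOM slide can be sketched as follows. The element names are borrowed from the sample company document earlier in the deck; the class and method names (DomBuildAndWrite, buildCompany) are assumptions, and the identity Transformer shown here is one common way to serialize a DOM with the standard JDK APIs, not the only one.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper: name and shape are assumptions for illustration.
public class DomBuildAndWrite {

    /** Builds a tiny company document in memory and returns it as XML text. */
    public static String buildCompany() {
        try {
            // Step 1: build the tree.
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .newDocument();
            Element company = doc.createElement("company");
            company.setAttribute("name", "Aztec Consulting Services");
            doc.appendChild(company);

            Element employee = doc.createElement("employee");
            employee.setAttribute("id", "1");
            employee.setTextContent("Carly Whitman");
            company.appendChild(employee);

            // Step 2: write it out with an identity transform.
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(buildCompany());
    }
}
```

Since the serializer does the escaping and tag balancing, output produced this way is well-formed by construction.
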
SAX parsing
    Brainchild of David Megginson
    Solid interface-centric design
    Full support for SAX in Java 5 and beyond
    Very low memory footprint; ideal for processing massive XML documents
    No random access: housekeeping reqd
    Fires synchronous events which should be intercepted by code
    Read-only API: can't use it to build XML

SAX parsing (cont'd)

import java.io.File;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.helpers.DefaultHandler;

SAXParserFactory spfactory = SAXParserFactory.newInstance();
spfactory.setNamespaceAware(true);
SAXParser saxparser = spfactory.newSAXParser();

// write your handler for processing events and handling errors
// (MyHandler is your own subclass of DefaultHandler)
DefaultHandler handler = new MyHandler();

// parse the XML and report events and errors (if any) to the handler
saxparser.parse(new File("data.xml"), handler);

SAX parsing (cont'd)

XML document:

<?xml version="1.0" encoding="utf-8"?>
<CarRental>
  <customerName>John Doe</customerName>
  <date>2009-02-28</date>
  <model>Oldsmobile Alero</model>
</CarRental>

Events fired, in order:

    Start Document
    Start Element "CarRental"
    Start Element "customerName"
    Character Data "John Doe"
    End Element "customerName"
    Start Element "date"
    Character Data "2009-02-28"
    End Element "date"
    Start Element "model"
    Character Data "Oldsmobile Alero"
    End Element "model"
    End Element "CarRental"
    End Document

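The event sequence above can be reproduced with a handler that simply records what the parser reports. This is a sketch under stated assumptions: the names SaxEventLog and events are invented for the example, whitespace-only character data is dropped to keep the log short, and a real handler should be prepared for the text of one element arriving in several characters() calls.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: name and shape are assumptions for illustration.
public class SaxEventLog {

    /** Parses the text with SAX and returns the events in the order they fired. */
    public static List<String> events(String xml) {
        final List<String> log = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override public void startDocument() { log.add("Start Document"); }
            @Override public void endDocument()   { log.add("End Document"); }

            @Override
            public void startElement(String uri, String local, String qName, Attributes atts) {
                log.add("Start Element " + qName);
            }

            @Override
            public void endElement(String uri, String local, String qName) {
                log.add("End Element " + qName);
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                String text = new String(ch, start, length).trim();
                if (!text.isEmpty()) {
                    log.add("Character Data " + text);
                }
            }
        };
        try {
            SAXParserFactory.newInstance()
                    .newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return log;
    }

    public static void main(String[] args) {
        String xml = "<CarRental><customerName>John Doe</customerName>"
                   + "<date>2009-02-28</date><model>Oldsmobile Alero</model></CarRental>";
        for (String event : events(xml)) System.out.println(event);
    }
}
```

The parser pushes events at the handler; any state needed across events (such as the current element) has to be tracked by the handler itself, which is the housekeeping mentioned on the SAX slide.
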
StAX parsing
    StAX → Streaming API for XML




    Fully integrated in Java 6 




    JSR 173 sponsored by BEA (part of Oracle)




    A pull model to parse XML (SAX → push)




    Pull model → app asks parser for events




    Unlike SAX, no interfaces to implement




    Unlike SAX, you can read and write XML




    Offers cursor API and event iterator APIs




StAX parsing (cont'd)

import java.io.FileInputStream;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;
import javax.xml.stream.events.StartElement;
import javax.xml.stream.util.XMLEventAllocator;

XMLInputFactory xmlif = XMLInputFactory.newInstance();
// XMLEventAllocatorImpl is an implementation class supplied by the StAX provider
xmlif.setEventAllocator(new XMLEventAllocatorImpl());
XMLEventAllocator allocator = xmlif.getEventAllocator();

XMLStreamReader xmlr =
    xmlif.createXMLStreamReader(filename, new FileInputStream(filename));

// Step through the document with the cursor API
while (xmlr.hasNext())
{
    int eventType = xmlr.next();

    // Get all "Book" elements as XMLEvent objects
    if (eventType == XMLStreamConstants.START_ELEMENT &&
        xmlr.getLocalName().equals("Book"))
    {
        StartElement event = allocator.allocate(xmlr).asStartElement();
    }
}

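Unlike SAX, StAX can also write XML. The sketch below uses the standard XMLStreamWriter from the JDK; the class and method names (StaxWrite, writeBook) and the Book element with its id attribute are assumptions made for illustration.

```java
import java.io.StringWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical helper: name and shape are assumptions for illustration.
public class StaxWrite {

    /** Writes a single Book element whose text content is the given title. */
    public static String writeBook(String title) {
        StringWriter out = new StringWriter();
        try {
            XMLStreamWriter writer =
                    XMLOutputFactory.newInstance().createXMLStreamWriter(out);
            writer.writeStartElement("Book");
            writer.writeAttribute("id", "1");
            writer.writeCharacters(title); // special characters are escaped by the writer
            writer.writeEndElement();
            writer.flush();
            writer.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(writeBook("Java & XML"));
    }
}
```

The writer is forward-only, like the reader: once an element has been written it cannot be revisited, which keeps memory use flat for large outputs.
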
XML parsing with Groovy

XML document:

<?xml version="1.0"?>
<!-- person.xml -->
<person id="100">
  <firstname>Jane</firstname>
  <lastname>Wells</lastname>
  <address type="home">
    <street>343 Evans Ave</street>
    <city>Denver</city>
    <state>CO</state>
    <zip>80020</zip>
  </address>
</person>

Groovy script:

def file = new File("person.xml")
person = new XmlSlurper().parse(file)
println person.firstname
===> Jane
println person.address.city
===> Denver
println person.address.@type
===> home

Online XPath demo
    XPath allows users to directly address portions of an XML document
    A DOM representing the XML doc must be built first
    XPath operates against the DOM tree
    XPath is typically used in XSLT but can be used standalone or in your code
    XPath does a lot of the heavy lifting in XSLT scripts

http://www.orbeon.com/ops/sandbox-transformations/xpath/

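XPath can also be used directly from Java code through the javax.xml.xpath API that ships with the JDK. The sketch below evaluates expressions against a shortened copy of the person document from the Groovy slide; the class and method names (XPathDemo, eval) are assumptions made for illustration.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical helper: name and shape are assumptions for illustration.
public class XPathDemo {

    static final String PERSON =
            "<person id='100'>"
          + "<firstname>Jane</firstname>"
          + "<address type='home'><city>Denver</city></address>"
          + "</person>";

    /** Evaluates an XPath expression against PERSON and returns the result as a string. */
    public static String eval(String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            // The source is parsed into a tree behind the scenes before evaluation.
            return xpath.evaluate(expression, new InputSource(new StringReader(PERSON)));
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(eval("/person/firstname"));
        System.out.println(eval("/person/address/city"));
        System.out.println(eval("/person/address/@type"));
    }
}
```

For repeated queries against the same document, it is cheaper to parse it into a DOM once and pass that Document to evaluate() instead of re-reading the source each time.
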
Comparing DOM, SAX and StAX
Feature              DOM              SAX                StAX
API type             In memory tree   Streaming – push   Streaming – pull
Ease of use          High             Medium             High
XPath capability     Yes              No                 No
CPU & mem overhead   Varies           Good               Good
Full navigation      Yes              No                 No
Read XML             Yes              Yes                Yes
Write XML            Yes              No                 Yes
CRUD                 Yes              No                 No

XML usage in Java5 and beyond
    DOM and SAX parsers, XSLT and XPath all standard in Java 5 (StAX avail in Java 6)
    Log4j not part of Java 5, but is ubiquitous; preferred config file format is XML
    Property files can also be defined in XML
    XMLEncoder and XMLDecoder: persistence mechanism for Java Beans
    XMLSignature: a W3C recommendation

XMLEncoder sample code
// Serialize orderBean to disk
//*****************************
XMLEncoder encoder = new XMLEncoder(new
    FileOutputStream("serializedBeans/orderBean.xml"));
encoder.writeObject(orderBean);
encoder.close();
// Now read the serialized objects back into Java beans
//*****************************************************
XMLDecoder decoder = new XMLDecoder(new
    FileInputStream("serializedBeans/orderBean.xml"));
OrderBean orderBean = (OrderBean) decoder.readObject();
decoder.close();

Reading XML property files

Java code:

import java.util.*;
import java.io.*;

public class LoadSampleXML
{
    public static void main(String args[]) throws Exception
    {
        Properties prop = new Properties();
        FileInputStream fis = new FileInputStream("props.xml");
        prop.loadFromXML(fis);
        prop.list(System.out);
        System.out.println("\nfavStar property: " +
            prop.getProperty("favStar"));
    }
}

XML property file:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">
<!-- props.xml -->
<properties>
  <comment>This is an XML props file</comment>
  <entry key="fruit">mango</entry>
  <entry key="favGroup">Led Zeppelin</entry>
  <entry key="favStar">Jack Nicholson</entry>
</properties>

Writing XML property files
import java.util.*;
import java.io.*;

public class StoreXML {
    public static void main(String args[]) throws Exception {
        Properties prop = new Properties();
        prop.setProperty("one-two", "buckle my shoe");
        prop.setProperty("three-four", "shut the door");
        prop.setProperty("five-six", "pick up sticks");
        prop.setProperty("seven-eight", "lay them straight");
        prop.setProperty("nine-ten", "a big, fat hen");
        FileOutputStream fos = new FileOutputStream("rhyme.xml");
        prop.storeToXML(fos, "Rhyme");   // storeToXML leaves the stream open
        fos.close();
    }
}

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">
<!-- rhyme.xml -->
<properties>
<comment>Rhyme</comment>
<entry key="seven-eight">lay them straight</entry>
<entry key="five-six">pick up sticks</entry>
<entry key="nine-ten">a big, fat hen</entry>
<entry key="three-four">shut the door</entry>
<entry key="one-two">buckle my shoe</entry>
</properties>


                                                                                                           55
Sample log4j.xml config file
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE log4j:configuration SYSTEM "log4j.dtd">
<log4j:configuration xmlns:log4j="http://jakarta.apache.org/log4j/" debug="true">
  <appender name="rolling" class="org.apache.log4j.RollingFileAppender">
     <param name="File" value="${LOG_FILE_PATH}/pestt.log" />
     <param name="MaxFileSize" value="1000KB" />
     <param name="MaxBackupIndex" value="5" />
     <param name="Threshold" value="debug" />
     <layout class="org.apache.log4j.PatternLayout">
        <param name="ConversionPattern" value="%d %-5p [%c] - %m%n" />
     </layout>
  </appender>
  <root>
     <priority value="debug" />
     <appender-ref ref="rolling" />
  </root>
</log4j:configuration>


                                                                                    56
XML to build Java projects
    Ant – standard way to build Java for many years
    Maven – smart way to build projects and manage complex dependencies
    Ant is procedural, whereas Maven is declarative
    Ant files are typically build.xml
    Maven files are typically pom.xml
    Maven is more complex but worth learning
    Ant and Maven build files are very readable





                                                      57
Java API for XML (JAX)
    Set of packages comprising
        Java API for XML Processing (JAXP)
        Java API for XML-based RPC (JAX-RPC)
        Java API for XML Registries (JAXR)
        Java Architecture for XML Binding (JAXB)
    Implemented in Java SE and Java EE





                                                   58
Java API for XML Processing (JAXP)
    Consists of APIs to parse, search and transform XML files
    JAXP was standard issue with Java 5
    JAXP components
        DOM parser
        SAX parser
        XPath lookups
        XSLT
    Parser and transformer implementations are hidden from end users behind factory classes
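
As a rough illustration of the JAXP components listed above, the sketch below parses a document into a DOM tree, runs an XPath lookup against it, and applies a small XSLT stylesheet. It is not from the original slides: the class JaxpSketch, its two helper methods and the second sample person are hypothetical. Only factory classes from javax.xml are used, so whichever parser and transformer implementation the JDK provides is picked up without being named.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper class, named for this sketch only
class JaxpSketch {

    // DOM + XPath: parse the document into a tree, then evaluate an expression on it.
    static String lookup(String xml, String xpathExpression) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return XPathFactory.newInstance().newXPath().evaluate(xpathExpression, doc);
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse or query the XML", e);
        }
    }

    // XSLT: apply a stylesheet to a source document and return the result as a string.
    static String transform(String xml, String stylesheet) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(stylesheet)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)),
                    new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not transform the XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<people_list>"
                + "<person><name>Fred Bloggs</name></person>"
                + "<person><name>Jane Doe</name></person>"
                + "</people_list>";
        // Stylesheet that outputs the number of <person> elements as plain text
        String countPeople =
                "<xsl:stylesheet version=\"1.0\""
                + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
                + "<xsl:output method=\"text\"/>"
                + "<xsl:template match=\"/\">"
                + "<xsl:value-of select=\"count(//person)\"/>"
                + "</xsl:template>"
                + "</xsl:stylesheet>";

        System.out.println(lookup(xml, "/people_list/person[1]/name")); // prints Fred Bloggs
        System.out.println(transform(xml, countPeople));                // prints 2
    }
}
```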



                                                       59
Java API for XML Registries  (JAXR)
    Enables lookup of XML registries
    A registry is infrastructure that enables building, deploying and discovering web services
    JAXR supports UDDI and ebXML
    Consists of a JAXR client and a JAXR provider

                                      60
Java API for XML-based RPC (JAX-RPC)
    Renamed JAX-WS as of v2.0
    JavaEE, .Net, LAMP interoperability
    SOAP, REST support
    Client invokes a local proxy
    Proxy hands the call to the JAX-RPC runtime system (RS)
    Method call is converted to a SOAP msg
    SOAP msg is transmitted over HTTP
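
To make the "method → SOAP msg" step more concrete, here is a minimal sketch of what a runtime might produce for a single-argument call. It is not the JAX-RPC implementation: it builds a SOAP 1.1 envelope by hand with the JDK's DOM and Transformer classes, and the class SoapEnvelopeSketch, the service namespace http://www.example.com/orders, the method getOrder and the parameter orderId are all hypothetical. Sending the message over HTTP is left out.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper class, named for this sketch only
class SoapEnvelopeSketch {

    // Namespace of the SOAP 1.1 envelope
    static final String SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/";

    // Wraps one method call with one string parameter in a SOAP envelope.
    static String buildEnvelope(String serviceNs, String method,
                                String paramName, String paramValue) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            Document doc = dbf.newDocumentBuilder().newDocument();

            Element envelope = doc.createElementNS(SOAP_NS, "soap:Envelope");
            Element body = doc.createElementNS(SOAP_NS, "soap:Body");
            // The method name becomes an element, its argument a child element
            Element call = doc.createElementNS(serviceNs, "m:" + method);
            Element param = doc.createElementNS(serviceNs, "m:" + paramName);
            param.setTextContent(paramValue);

            call.appendChild(param);
            body.appendChild(call);
            envelope.appendChild(body);
            doc.appendChild(envelope);

            // Serialize the DOM tree to a string
            Transformer serializer = TransformerFactory.newInstance().newTransformer();
            serializer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            serializer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not build the SOAP envelope", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(buildEnvelope(
                "http://www.example.com/orders", "getOrder", "orderId", "42"));
    }
}
```

In a real JAX-RPC or JAX-WS client the proxy and runtime generate this envelope, so application code never assembles it manually.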
                                 61
Java Architecture for XML Binding (JAXB)
    Part of Java 6
    Does a mapping of XSD ↔ Java
    Is object-oriented
    More high-level than DOM, SAX and StAX parsing
    xjc := XSD → Java
    schemagen := Java → XSD





                                      62
JAXB (continued)
// Unmarshalling to Java from XML
// JAXBContext (javax.xml.bind) is the JAXB 2.x entry point for
// creating marshallers and unmarshallers
JAXBContext context = JAXBContext.newInstance(Person.class);
Unmarshaller unmarshaller = context.createUnmarshaller();
Person person = (Person)
   unmarshaller.unmarshal(new File("lola.xml"));
System.out.println(person.getFirstName());

// Marshalling to XML from Java
Person newPerson = new Person();
newPerson.setFirstName("Lola");
newPerson.setLastName("Boone");
Marshaller marshaller = context.createMarshaller();
marshaller.marshal(newPerson, new FileWriter("lola.xml"));

<?xml version="1.0"?>
<!-- lola.xml -->
<person xmlns="http://www.example.com/person">
  <firstName>Lola</firstName>
  <lastName>Boone</lastName>
</person>


                                                                         63
XML in Java enterprise development
    Spring – wildly popular dependency injection framework
    Hibernate – the most widely used object → relational mapping framework
    iBATIS – a popular object → query mapping framework
    SOAP – web service configuration and messages
    JPA – relatively new persistence spec (part of EJB3)
    App servers and web containers (JBoss, Tomcat, Jetty)
    Struts, JSF, AJAX – Java web frameworks


                                                        64
XML in databases 
    Enterprise RDBMSs support XML – Oracle, SQL Server, DB2 and Sybase
    Oracle has had XMLType since v 9.x
    XML DOMs can be persisted in Oracle (and queried). Syntax somewhat clunky, but queries from SQL*Plus possible.
    Consider other options if your RDBMS doesn't persist XML natively (e.g. write to a CLOB)


                                                     65
Native XML databases
    A popular fad in the early 21st century.
    Not too many use cases for this – possibly document publishing, web publishing
    Mark Logic most popular native XML db
    Open source XML dbs include Xindice (no support for XQuery) & eXist (perf issues)
    Tamino – first native XML db – now abandoned





                                                      66
XML and web services
    Two kinds of web services – SOAP, REST
    SOAP is still the dominant way to do web services, but REST is gaining popularity
    WSDL is an XML-based language to describe web services and how to access them.
    Sample WSDL: http://geocoder.us/dist/eg/clients/GeoCoder.wsdl
    Besides WSDL, SOAP messages are also specified in XML (demo GeoCoder example)
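
Because a WSDL file is plain XML, the operations a service offers can be listed with an ordinary namespace-aware parser. The sketch below is an assumption-laden illustration, not the GeoCoder demo: the class WsdlOperationsSketch, the inline WSDL fragment and its operation names are invented for the example, and a real WSDL would also contain types, messages and service elements.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, named for this sketch only
class WsdlOperationsSketch {

    // Namespace of WSDL 1.1 elements
    static final String WSDL_NS = "http://schemas.xmlsoap.org/wsdl/";

    // Returns the names of the operations declared inside portType elements.
    static List<String> operationNames(String wsdl) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            Document doc = dbf.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(wsdl)));

            List<String> names = new ArrayList<>();
            // Only look under portType, since binding elements repeat the operations
            NodeList portTypes = doc.getElementsByTagNameNS(WSDL_NS, "portType");
            for (int i = 0; i < portTypes.getLength(); i++) {
                NodeList ops = ((Element) portTypes.item(i))
                        .getElementsByTagNameNS(WSDL_NS, "operation");
                for (int j = 0; j < ops.getLength(); j++) {
                    names.add(((Element) ops.item(j)).getAttribute("name"));
                }
            }
            return names;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not read the WSDL", e);
        }
    }

    public static void main(String[] args) {
        // Made-up WSDL fragment, heavily trimmed for the example
        String wsdl = "<definitions xmlns=\"http://schemas.xmlsoap.org/wsdl/\""
                + " name=\"AddressLookup\">"
                + "<portType name=\"AddressLookupPortType\">"
                + "<operation name=\"lookupAddress\"/>"
                + "<operation name=\"lookupIntersection\"/>"
                + "</portType>"
                + "<binding name=\"AddressLookupBinding\">"
                + "<operation name=\"lookupAddress\"/>"
                + "</binding>"
                + "</definitions>";
        System.out.println(operationNames(wsdl)); // prints [lookupAddress, lookupIntersection]
    }
}
```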

                                                         67
References
    http://www.w3schools.com
    http://www.xml.com
    http://www.xml.org
    http://www.zvon.org
    https://jaxp.dev.java.net/1.4/
    http://www.saxproject.org
    http://projects.apache.org/indexes/category.html#xml
    Professional XML by Bill Evjen et al.
    Pro XML Development with Java Technology by Ajay Vohra et al.
    Java and XML by Brett McLaughlin et al.





                                                                   68
Questions – session # 2




                              69

Xml Demystified

  • 1. XML Demystified Presented by Mr. Viraf Karai to staff of Sila Solutions Group Seattle WA Fri. Feb 27, 2009 and Fri. Mar 6, 2009   1
  • 2. Agenda (session # 1) What is XML Structure of XML docs    History of XML XML node types   XML syntax & semantics Where XML is being used    today Well formed XML  Advantages &   Valid XML  disadvantages of XML DTDs   XML vocabularies  XML schemas  XML authoring tools   Relax NG    2
  • 3. What is XML Stands for eXtensible Markup Language  It is a World Wide Web Consortium std.  A markup language like HTML – except you can   build your own tags  For machine consumption ­ still readable  Hierarchical by nature  Widespread use since late '90s    3
  • 4. History of XML SGML around since 80's. SGML was the int'l std for data markup. Discussions began in 1996. Focus Spec by Tim on new simple markup language. Bray et al only 26 pages. SGML ≈ 500 pages Approved as W3C standard (spec 1.0) in Nov. 1998. Mainly deals with Unicode enhancements (v4.0) W3C spec (v 1.1) in Feb. 2004   4
  • 5. XML syntax & semantics Generally speaking an XML document  is well­formed if it clears syntax checks   is valid if it is well­formed and clears semantic checks   (specified by grammar) Computers can't process XML documents that fail   syntax or semantic validation  Yes, XML is very fussy!      5
  • 6. Well formed XML Elements   must start with letter or underscore (us)  may contain any number of  letters, digits, underscores,   hyphens or periods  no embedded spaces  are case sensitive  must be closed (unless they're leaf elements)    6
  • 7. Well formed XML (cont'd) Element nesting order must be obeyed  Encase attrs in single (')/double(”) quotes  Escape special entities (in text and attrs)  & →  &amp;   < →  &lt;   ” →  &quot; # if used in a double quoted attr  ' →  &apos; # if used in a single quoted attr    7
  • 8. Sample well­formed XML doc <?xml version='1.0' encoding='UTF-8'?> <smallCompanies country='US' dateOfIncorporation='20081101'> <company name='Aztec Consulting Services'> <employee id='1' role='area manager'>Carly Whitman</employee> <employee id='2' role='bizdev manager'>Meg Fiorina</employee> <employee id='3' role='business analyst'>Ken Immelt</employee> <employee id='4' role='business analyst'>Jeff Lewis</employee> <employee id='5' role='technical analyst'>Mavis Rudd</employee> <employee id='6' role='software architect'>Perry Yang</employee> <employee id='7' role='db developer'>Amanda Blackwell</employee> <employee id='8' role='office manager'>Brenda Russo</employee> <employee id='9' role='accountant'>Andrea Barnes</employee> <employee id='10' role='mailroom clerk'>Tina Russell</employee> </company> </smallCompanies>   8
  • 9. Valid XML Valid XML docs enforced by well­known APIs e.g.   Spring, Hibernate, Apache SOAP, Java EE – in  config files and RPC msgs Three basic models for constraints  Document Type Defintion (DTD)   XML schemas (XSD)  Relax NG  Constraint models define structure of an XML   document – usually specified by URI   9
  • 10. DTDs (extn: .dtd) Introduced part of XML 1.0 spec  Oldest constraint model  First used in SGML  Still used in HTML 4.x spec  Non­XML like syntax  Simple to use but inflexible  Poor validation capabilities    10
  • 11. XML doc and its DTD <?xml version=quot;1.0quot; <!ELEMENT people_list (person*)> encoding=quot;UTF-8quot;?> <!ELEMENT person (name, birthdate?, gender?,  <!DOCTYPE people_list SYSTEM socialsecuritynumber?)> quot;example.dtdquot;> <!ELEMENT name (#PCDATA)> <people_list> <!ELEMENT birthdate (#PCDATA)> <person> <!ELEMENT gender (#PCDATA)> <name>Fred Bloggs</name> <!ELEMENT socialsecuritynumber (#PCDATA)> <birthdate>27/11/2008</birthdate> <gender>Male</gender> </person> </people_list>   11
  • 12. XML schemas (XSD extn: .xsd) W3Cs succesor to DTDs   Extremely powerful semantic validation  Vastly more flexible compared to DTDs  Mind boggling support for rich data types  Complex and difficult to author by hand  Many XML gurus unhappy with complexity    12
  • 13. XML doc and its XSD <?xml version=quot;1.0quot;?> <?xml version=quot;1.0quot; encoding=quot;UTF-8quot; standalone=quot;yesquot;?> <Process <xs:schema xmlns:xsi=quot;http://www.w3.org/200 xmlns:xs=quot;http://www.w3.org/2001/XMLSchemaquot;> 1/XMLSchema-instancequot; <xs:element name=quot;Processquot;> xsi:noNamespaceSchemaLocation= <xs:complexType> quot;ComplexTypes.xsdquot;> <xs:all> <Name>Bill Evjen</Name> <xs:elementname=quot;Namequot;type=quot;xs:stringquot;/> <Address>123 Main Street</Address> <xs:element name=quot;Addressquot; type=quot;xs:stringquot; /> <City>Saint Charles</City> <xs:element name=quot;Cityquot; type=quot;xs:stringquot; /> <State>Missouri</State> <xs:element name=quot;Statequot; type=quot;xs:stringquot; /> <xs:element name=quot;Countryquot; type=quot;xs:stringquot;/> <Country>USA</Country> </xs:all> </Process> </xs:complexType> </xs:element> </xs:schema>   13
  • 14. Relax NG (extn: .rng / .rnc) Stands for REgular LAnguage for XML Next   Generation Not a W3C standard – part of OASIS  Offers alternative to XSD complexity  Based on Murata Makoto's RELAX and James   Clark's TREX. Mostly satisfies Pareto principle    XML and non­XML syntax    14
  • 15. XML doc and its RNG <?xml version=”1.0” encoding=”UTF-8”?> <?xml version=”1.0” encoding=”UTF-8”?> <zeroOrMore> <element name=quot;phonebookquot;> <element name=quot;phonebookquot;> <element name=quot;entryquot;> <oneOrMore> <element name=quot;firstNamequot;> <element name=quot;entryquot;> <text/> <element name=quot;firstNamequot;><text/></element> </element> <optional> <element name=quot;firstNamequot;> <element name=quot;middleNamequot;/> <text/> </optional> <element name=quot;firstNamequot;><text/></element> </element> <!-- etc... --> <!-- etc... --> </element> </element> </oneOrMore> </element> </element> </zeroOrMore>   15
  • 16. Sample XML doc with DTD decl DTD <?xml version=quot;1.0quot; encoding=quot;UTF-8quot;?> decl <!DOCTYPE beans PUBLIC quot;-//SPRING//DTD BEAN//ENquot; quot;http://www.springframework.org/dtd/spring-beans.dtdquot;> <beans> <!-- Axis2 Web Service, but to Spring, its just another bean that has dependencies --> <bean id=quot;springAwareServicequot; class=quot;spring.SpringAwareServicequot;> <property name=quot;myBeanquot; ref=quot;myBeanquot;/> </bean> <!-- just another bean / interface with a wired implementation, that's injected by Spring into the Web Service --> <bean id=quot;myBeanquot; class=quot;spring.MyBeanImplquot;> <property name=quot;valquot; value=quot;Spring, emerge thyselfquot; /> </bean> </beans>   16
  • 17. Sample XML doc with XSD decl XML NS <?xml version=quot;1.0quot; encoding=quot;UTF-8quot;?> <beans xmlns=quot;http://www.springframework.org/schema/beansquot; xmlns:xsi=quot;http://www.w3.org/2001/XMLSchema-instancequot; XSD xmlns:aop=quot;http://www.springframework.org/schema/aopquot; Decl xsi:schemaLocation=quot; http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans-2.0.xsd http://www.springframework.org/schema/aop http://www.springframework.org/schema/aop/spring-aop-2.0.xsdquot;> <bean class=quot;org.springframework.beans.factory.config.PropertyPlaceholde rConfigurerquot;> <property name=quot;locationsquot; value=quot;classpath:pestt_jdbc.propertiesquot;/> </bean> </beans>   17
  • 19. Structure of XML docs (cont'd) XML declaration <?xml version=quot;1.0quot; encoding=quot;UTF-8quot;?> Doc element <nutrition> <?page skip?> Procs'ng Instr Element End Element <daily-values> <total-fat units=quot;gquot;>65</total-fat> <saturated-fat units=quot;gquot;>20</saturated-fat> Text <cholesterol units=quot;mgquot;>300</cholesterol> XML comment <sodium units=quot;mgquot;>2400</sodium> <carb units=quot;gquot;>300</carb> <!-- this is a comment --> <fiber units=quot;gquot;>25</fiber> <! this is a comment Attribute <protein units=quot;gquot;>50</protein> <notes><![CDATA[Daily values for an adult <male> ]]></notes> Text (CDATA) </daily-values> End of CDATA </nutrition> End Doc element   19
  • 20. Hierarchy of previous example Nutrition ?page (PI) daily-values total-fat saturated-fat Sodium Cholestorol units units units units 2400 20 300 65   20
  • 21. XML Node Types Document Document Fragment   Element Entity   Attribute Entity reference   Text Processing instruction   Comment    21
  • 22. XML Nodes: Document Represents the entire document  Conceptual root of the document tree  Provides primary access to doc's data  Other node types must have a parent Document  Used in DOM for full traversal    Used in SAX to signal start of a document    22
  • 23. XML Nodes: Element Represents an element in an XML doc  May have 0 or more attributes  May have 0 or more children (other elements, text,   comments, CDATA, etc.) No children → leaf elements  All elements must have closing elements e.g.   <color>pink</color> or <color/> <color></color>       <color/>    23
  • 24. XML Nodes: Attributes Only associated with elements – optional  Specify metadata about an element  Shown as name/value pairs eg. Num='6'  Each attribute must be quoted (') or (”)  Elements may have any number of attrs  Each attr must be unique for an element    24
  • 25. XML Nodes: Text Represents textual content of elements  Text nodes are leaves  Some chars such as '<' and '&' must be escaped   when authoring text  Use CDATA when such chars occur frequently.   Escaping impairs readability. Intn'l chars may freely be inserted in text    25
  • 26. XML Nodes: Comments Mainly exist for human readability   Can span multiple lines  Char sequence '­­' illegal inside comments  Usually ignored by most parsers  Can't have nested comments    26
  • 27. XML Nodes: Entity references Used to substitute for a single char that is also a   markup delimiter in XML Using these prevents a literal char from being   mistaken for a markup delimiter  Predefined entity references:  <  → &lt;  &  → &amp;  >  → &gtr;  ”  → &quot;  '  → &apos;    27
  • 28. XML Nodes: Processing Instrs Provides info to app processing document  E.g. how to process or render the doc  XML declaration looks like a PI, but isn't  Pis comprise of a target and data e.g.    <?sort alpha­  ascending?> target data   28
  • 29. Where XML is being used today Config files  Doc publishing (S1000D)   Expr & query languages     Devices (Sony PRS­505)   (XPath, XQuery) IDEs (IntelliJ, Eclipse)  Transform languages (XSLT,  Frameworks (Spring,    XSL­FO) Hibernate, JPA) Java project build files (Ant,  DBMS storage (SQLSrv,    Maven) DB2,Oracle  support XML) Web services (SOAP, REST)  Web, PDAs, smart phones   and XML­RPC  Vector graphics (SVG)  Document storage (OpenOffice,   Log, data files  MS­Office)   29
  • 30. Advantages of XML Full blown W3C standard  I18N support –  UTF­8 and UTF­16  Expressive – define CS data structures  Semantic vald'n using DTD, XSD, RelaxNG  Parsers on all platforms (Java, .net, LAMP)  Widely in use in industry and academia  Several vocabularies e.g. MathML, CML  Rigid syntax → predictability in parsing    30
  • 31. Disadvantages of XML Verbose syntax – not compact   Can't easily represent binary  DOM and SAX APIs are complex  Difficult to diff similar XML files  Overheads in data transmission  Different data model compared to RDBMS  Storing XML in RDBMS is unnatural    31
  • 32. XML vocabularies S1000d(aerospace) Legal XML   SVG (graphics) Human XML   MathML Address XML   Food XML Finance XML   Legal XML Physics XML   Manufacturing XML News XML   Healthcare XML Astronomy XML     32
  • 33. XML authoring tools XMLSpy ($)   Arbortext ($)    Oxygen ($)   XML Copy Editor   EditiX ($)  Stylus Studio      33
  • 35. Agenda (session # 2) Sample PO (fixed fmt & XML) Reading, writing XML property    files Typical Java enterprise tech  Sample log4j.xml config file  The XML family  XML to build Java projects  Parsing XML  The JAX family  DOM, SAX, StaX parsing  XML in Java enterprise   XML parsing with Groovy  development Compare DOM, SAX, StAX  XML and databases  XML usage in Java5 and   XML and web services  beyond References  XMLEncoder sample code    35
  • 36. Typical Java enterpise technologies CORBA RMI Velocity Spring Struts JSF Terracotta GridGain JBoss Logging Oracle Hibernate XML Reporting Scripting JMX JPA LDAP JavaMail JAXB Quartz AJAX TestNG JMS JTA Tag JNDI JAAS SNMP Maven Libraries   36
  • 37. Sample purchase order (rec types) 100: Jack Nicholson 120: 621 Mulholland Drive 140: Bel Air 150: CA 170: 90077 200: 3 210: 248 220: Decorative Widget, Red, Large 300: 19.95 200: 1 210: 1632 220: Packed electron storage container, AA, 4-pack 300: 4.95   37
  • 38. Sample purchase order (XML)
      <!DOCTYPE purchase.order SYSTEM "po.dtd">
      <purchase.order>
        <date>16 June 2009</date>
        <billing.address>
          <name>Jack Nicholson</name>
          <street>621 Mulholland Drive</street>
          <city>Bel Air</city>
          <state>CA</state>
          <zip>90077</zip>
        </billing.address>
        <items>
          <item>
            <quantity>3</quantity>
            <product.num>248</product.num>
            <description>Decorative Widget, Red, Large</description>
            <unitcost>19.95</unitcost>
          </item>
          <item>
            <quantity>1</quantity>
            <product.num>1632</product.num>
            <description>Packed electron storage container, AA, 4-pack</description>
            <unitcost>4.95</unitcost>
          </item>
        </items>
      </purchase.order>
  • 39. The XML family has a number of members: XHTML – HTML that's well-formed XML; XSLT – transform XML to XML/HTML/text; XSL-FO – transform XML to PDF/PS/RTF; XPath – XML expression language (used in XSLT, XQuery, and code, e.g. Groovy, Java, Python); XQuery – extract and manipulate data in an XML document. All of the above are W3C standards. All have full-blown I18N support (UTF).
  • 40. Parsing XML: Splendid support for parsing XML in Java, Groovy, C, C++, Ruby, C# (.NET), Python, Haskell, Scala, Lisp, Erlang, Perl, etc. Parsers are available from handheld devices to massive supercomputers. Most widely used techniques for parsing: DOM – build a tree representing the XML document; SAX – fire events during parsing (push model); StAX – cursor and iterator based (pull model).
  • 41. Parsing XML (cont'd): The above techniques are low-level. DOM parsing is heavily used in the CAS Toolbox project. Other (less common) parsing techniques: (a) JAXB (used in Toolbox Metadata Loader); (b) XML Beans; (c) Commons Digester. (a) and (b) bind XML to Java objects (OO). Digester uses rules that describe the XML structure.
  • 42. DOM parsing: Full support for DOM in Java 5 and beyond. Builds an in-memory tree of the XML doc. The tree can't be accessed until it is fully built. Random access to the tree once built. Extremely impractical for huge XML files: can fully consume virtual memory. Foolproof way to build small and medium XML docs: build the tree, then write it out.
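The "build tree, then write out" approach can be sketched with the standard JAXP classes. This is a minimal illustration rather than code from the talk: the class name `DomBuildSketch` and the method `buildOrder` are hypothetical, and the element names are borrowed from the sample purchase order. An identity `Transformer` is one common way to serialize a DOM tree; other serializers exist.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical sketch: build a small DOM tree in memory, then serialize it.
class DomBuildSketch {

    // Builds <purchase.order><billing.address>...</billing.address></purchase.order>
    // and returns it as a string without an XML declaration.
    static String buildOrder(String name, String city) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();

            Element order = doc.createElement("purchase.order");
            doc.appendChild(order);

            Element address = doc.createElement("billing.address");
            order.appendChild(address);

            Element nameEl = doc.createElement("name");
            nameEl.setTextContent(name);
            address.appendChild(nameEl);

            Element cityEl = doc.createElement("city");
            cityEl.setTextContent(city);
            address.appendChild(cityEl);

            // An identity transform copies the DOM tree to the output writer.
            Transformer writer = TransformerFactory.newInstance().newTransformer();
            writer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            writer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("could not build XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(buildOrder("Jack Nicholson", "Bel Air"));
    }
}
```

Because the tree is complete before anything is written, the output is always well-formed, which is why this route suits small and medium documents.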
  • 43. DOM parsing (cont'd)
      import java.io.File;
      import javax.xml.parsers.DocumentBuilder;
      import javax.xml.parsers.DocumentBuilderFactory;
      import org.w3c.dom.Document;

      DocumentBuilderFactory dbfactory = DocumentBuilderFactory.newInstance();
      dbfactory.setNamespaceAware(true);
      DocumentBuilder domparser = dbfactory.newDocumentBuilder();
      // parse the XML and create the DOM
      Document doc = domparser.parse(new File("data.xml"));
      // to create a new DOM from scratch:
      // Document doc = domparser.newDocument();
      // use the DOM once you have the Document handle
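Once the Document handle exists, the tree can be queried at random. The following is a rough, self-contained sketch of that step, assuming an in-memory sample document instead of data.xml; the class `DomParseSketch`, the method `textOf`, and the sample XML are all hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

// Hypothetical sketch: parse XML into a DOM tree and read element text from it.
class DomParseSketch {

    // Made-up sample, loosely modelled on the purchase order slide.
    static final String SAMPLE =
          "<items>"
        + "<item><quantity>3</quantity><description>Widget</description></item>"
        + "<item><quantity>1</quantity><description>Container</description></item>"
        + "</items>";

    // Returns the text of every element with the given tag name, in document order.
    static List<String> textOf(String xml, String tagName) {
        try {
            DocumentBuilderFactory dbfactory = DocumentBuilderFactory.newInstance();
            dbfactory.setNamespaceAware(true);
            DocumentBuilder domparser = dbfactory.newDocumentBuilder();
            Document doc = domparser.parse(
                    new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));

            // The whole tree is in memory, so any part of it can be visited.
            NodeList nodes = doc.getElementsByTagName(tagName);
            List<String> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                result.add(nodes.item(i).getTextContent());
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(textOf(SAMPLE, "description"));
        System.out.println(textOf(SAMPLE, "quantity"));
    }
}
```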
  • 44. SAX parsing: Brainchild of David Megginson. Solid interface-centric design. Full support for SAX in Java 5 and beyond. Very low memory footprint: ideal for processing massive XML documents. No random access, so the application must do its own housekeeping. Fires synchronous events that your code must handle. Read-only API: can't be used to build XML.
  • 45. SAX parsing (cont'd)
      import java.io.File;
      import javax.xml.parsers.SAXParser;
      import javax.xml.parsers.SAXParserFactory;
      import org.xml.sax.helpers.DefaultHandler;

      SAXParserFactory spfactory = SAXParserFactory.newInstance();
      spfactory.setNamespaceAware(true);
      SAXParser saxparser = spfactory.newSAXParser();
      // write your handler for processing events and handling errors
      DefaultHandler handler = new MyHandler();
      // parse the XML and report events and errors (if any) to the handler
      saxparser.parse(new File("data.xml"), handler);
  • 46. SAX parsing (cont'd)
      XML input:
      <?xml version="1.0" encoding="utf-8"?>
      <CarRental>
        <customerName>John Doe</customerName>
        <date>2009-02-28</date>
        <model>Oldsmobile Alero</model>
      </CarRental>

      Events fired:
      Start Document
      Start Element "CarRental"
      Start Element "customerName"
      Character Data "John Doe"
      End Element "customerName"
      Start Element "date"
      Character Data "2009-02-28"
      End Element "date"
      Start Element "model"
      Character Data "Oldsmobile Alero"
      End Element "model"
      End Element "CarRental"
      End Document
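The event sequence above can be reproduced with a small handler. This is a sketch under stated assumptions, not the MyHandler from the talk: `SaxTraceSketch`, `TraceHandler`, and `trace` are hypothetical names, and the handler buffers text because a parser may deliver one text node in several `characters` calls.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical sketch: record the SAX events fired for a small document.
class SaxTraceSketch {

    static final String SAMPLE =
          "<CarRental>"
        + "<customerName>John Doe</customerName>"
        + "<date>2009-02-28</date>"
        + "<model>Oldsmobile Alero</model>"
        + "</CarRental>";

    static class TraceHandler extends DefaultHandler {
        final List<String> events = new ArrayList<>();
        private final StringBuilder text = new StringBuilder();

        @Override public void startDocument() { events.add("Start Document"); }
        @Override public void endDocument()   { events.add("End Document"); }

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            flushText();
            events.add("Start Element " + qName);
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            flushText();
            events.add("End Element " + qName);
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // Buffer the text; it is reported once the next element event arrives.
            text.append(ch, start, length);
        }

        // Emits buffered text as a single event, skipping whitespace-only runs.
        private void flushText() {
            String s = text.toString().trim();
            if (!s.isEmpty()) {
                events.add("Character Data " + s);
            }
            text.setLength(0);
        }
    }

    // Parses the XML and returns the recorded events in firing order.
    static List<String> trace(String xml) {
        try {
            SAXParser saxparser = SAXParserFactory.newInstance().newSAXParser();
            TraceHandler handler = new TraceHandler();
            saxparser.parse(new InputSource(new StringReader(xml)), handler);
            return handler.events;
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        trace(SAMPLE).forEach(System.out::println);
    }
}
```

The handler keeps its own state (the text buffer and the event list), which is the "housekeeping" that a push parser leaves to the application.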
  • 47. StAX parsing: StAX → Streaming API for XML. Fully integrated in Java 6. JSR 173, sponsored by BEA (now part of Oracle). A pull model to parse XML (SAX → push). Pull model → the app asks the parser for events. Unlike SAX, no interfaces to implement. Unlike SAX, you can read and write XML. Offers a cursor API and an event iterator API.
  • 48. StAX parsing (cont'd)
      XMLInputFactory xmlif = XMLInputFactory.newInstance();
      // XMLEventAllocatorImpl is a parser implementation class, not part of the StAX API
      xmlif.setEventAllocator(new XMLEventAllocatorImpl());
      XMLEventAllocator allocator = xmlif.getEventAllocator();
      XMLStreamReader xmlr =
          xmlif.createXMLStreamReader(filename, new FileInputStream(filename));
      // Walk the document with the cursor API
      int eventType = xmlr.getEventType();
      while (xmlr.hasNext()) {
          eventType = xmlr.next();
          // Get all "Book" elements as XMLEvent objects
          if (eventType == XMLStreamConstants.START_ELEMENT
                  && xmlr.getLocalName().equals("Book")) {
              StartElement event = allocator.allocate(xmlr).asStartElement();
          }
      }
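A simpler variant uses the cursor API alone, with no event allocator. The sketch below is illustrative only: `StaxCursorSketch`, `bookTitles`, and the catalog document are hypothetical, and it reads from a string rather than a file.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

// Hypothetical sketch: pull parsing with the StAX cursor API.
class StaxCursorSketch {

    // Made-up catalog used only for illustration.
    static final String SAMPLE =
          "<Catalog>"
        + "<Book><Title>Java and XML</Title></Book>"
        + "<Book><Title>Professional XML</Title></Book>"
        + "</Catalog>";

    // Collects the text of every <Title> element, in document order.
    static List<String> bookTitles(String xml) {
        try {
            XMLInputFactory xmlif = XMLInputFactory.newInstance();
            XMLStreamReader xmlr = xmlif.createXMLStreamReader(new StringReader(xml));
            List<String> titles = new ArrayList<>();
            // The application drives the parser: it asks for the next event.
            while (xmlr.hasNext()) {
                int eventType = xmlr.next();
                if (eventType == XMLStreamConstants.START_ELEMENT
                        && xmlr.getLocalName().equals("Title")) {
                    // getElementText() reads up to the matching end tag.
                    titles.add(xmlr.getElementText());
                }
            }
            xmlr.close();
            return titles;
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(bookTitles(SAMPLE));
    }
}
```

Since the loop is ordinary application code, it can stop early or skip sections, which a SAX handler cannot do as easily.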
  • 49. XML parsing with Groovy
      def file = new File("person.xml")
      person = new XmlSlurper().parse(file)
      println person.firstname        // ===> Jane
      println person.address.city     // ===> Denver
      println person.address.@type    // ===> home

      <?xml version="1.0"?>
      <!-- person.xml -->
      <person id="100">
        <firstname>Jane</firstname>
        <lastname>Wells</lastname>
        <address type="home">
          <street>343 Evans Ave</street>
          <city>Denver</city>
          <state>CO</state>
          <zip>80020</zip>
        </address>
      </person>
  • 50. Online XPath demo: XPath allows users to randomly access portions of an XML document. A DOM representing the XML doc must be built first; XPath operates against the DOM tree. XPath is typically used in XSLT but can be used standalone or in your code. XPath does a lot of the heavy lifting in XSLT scripts. http://www.orbeon.com/ops/sandbox-transformations/xpath/
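Using XPath "in your code" might look like the following in Java, via the standard `javax.xml.xpath` package. This is a minimal sketch: `XPathSketch` and its `evaluate` helper are hypothetical, and the document is a shortened copy of the person.xml sample.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical sketch: evaluate XPath expressions against a small document.
class XPathSketch {

    static final String PERSON =
          "<person id=\"100\">"
        + "<firstname>Jane</firstname>"
        + "<lastname>Wells</lastname>"
        + "<address type=\"home\"><city>Denver</city></address>"
        + "</person>";

    // Parses the XML and returns the string value of the expression.
    static String evaluate(String expression, String xml) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            // The source is parsed into a tree before the expression is evaluated.
            return xpath.evaluate(expression, new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not evaluate XPath", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(evaluate("/person/firstname", PERSON));
        System.out.println(evaluate("/person/address/city", PERSON));
        System.out.println(evaluate("/person/address/@type", PERSON));
    }
}
```

For repeated queries against one document, parsing it once into a DOM and evaluating against that node avoids re-parsing on every call.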
  • 51. Comparing DOM, SAX and StAX
      Feature              DOM              SAX                StAX
      API type             In-memory tree   Streaming (push)   Streaming (pull)
      Ease of use          High             Medium             High
      XPath capability     Yes              No                 No
      CPU & mem overhead   Varies           Good               Good
      Full navigation      Yes              No                 No
      Read XML             Yes              Yes                Yes
      Write XML            Yes              No                 Yes
      CRUD                 Yes              No                 No
  • 52. XML usage in Java 5 and beyond: DOM and SAX parsers, XSLT and XPath are all standard in Java 5 (StAX available in Java 6). Log4j is not part of Java 5, but is ubiquitous; its preferred config file format is XML. Property files can also be defined in XML. XMLEncoder and XMLDecoder – persistence mechanism for Java Beans. XMLSignature – a W3C recommendation.
  • 53. XMLEncoder sample code
      // Serialize orderBean to disk
      XMLEncoder encoder = new XMLEncoder(
          new FileOutputStream("serializedBeans/orderBean.xml"));
      encoder.writeObject(orderBean);
      encoder.close();

      // Now read the serialized object back into a Java bean
      XMLDecoder decoder = new XMLDecoder(
          new FileInputStream("serializedBeans/orderBean.xml"));
      OrderBean restoredBean = (OrderBean) decoder.readObject();
      decoder.close();
  • 54. Reading XML property files
      import java.util.*;
      import java.io.*;

      public class LoadSampleXML {
          public static void main(String args[]) throws Exception {
              Properties prop = new Properties();
              FileInputStream fis = new FileInputStream("props.xml");
              prop.loadFromXML(fis);
              prop.list(System.out);
              System.out.println("\nfavStar property: " + prop.getProperty("favStar"));
          }
      }

      <?xml version="1.0" encoding="UTF-8"?>
      <!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">
      <!-- props.xml -->
      <properties>
        <comment>This is an XML props file</comment>
        <entry key="fruit">mango</entry>
        <entry key="favGroup">Led Zeppelin</entry>
        <entry key="favStar">Jack Nicholson</entry>
      </properties>
  • 55. Writing XML property files
      import java.util.*;
      import java.io.*;

      public class StoreXML {
          public static void main(String args[]) throws Exception {
              Properties prop = new Properties();
              prop.setProperty("one-two", "buckle my shoe");
              prop.setProperty("three-four", "shut the door");
              prop.setProperty("five-six", "pick up sticks");
              prop.setProperty("seven-eight", "lay them straight");
              prop.setProperty("nine-ten", "a big, fat hen");
              FileOutputStream fos = new FileOutputStream("rhyme.xml");
              prop.storeToXML(fos, "Rhyme");
              fos.close();
          }
      }

      <?xml version="1.0" encoding="UTF-8"?>
      <!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">
      <!-- rhyme.xml -->
      <properties>
        <comment>Rhyme</comment>
        <entry key="seven-eight">lay them straight</entry>
        <entry key="five-six">pick up sticks</entry>
        <entry key="nine-ten">a big, fat hen</entry>
        <entry key="three-four">shut the door</entry>
        <entry key="one-two">buckle my shoe</entry>
      </properties>
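Reading and writing can be combined into a round trip that needs no files on disk. The sketch below assumes in-memory byte streams in place of rhyme.xml; `XmlPropertiesSketch`, `toXml`, `fromXml`, and `roundTrip` are hypothetical helper names.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

// Hypothetical sketch: store Properties as XML in memory and load them back.
class XmlPropertiesSketch {

    // Serializes the properties to the XML properties format (UTF-8 by default).
    static String toXml(Properties props, String comment) {
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            props.storeToXML(out, comment);
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        } catch (Exception e) {
            throw new IllegalStateException("could not write properties", e);
        }
    }

    // Parses XML in the properties format back into a Properties object.
    static Properties fromXml(String xml) {
        try {
            Properties props = new Properties();
            props.loadFromXML(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return props;
        } catch (Exception e) {
            throw new IllegalStateException("could not read properties", e);
        }
    }

    // Writes and immediately re-reads, which should preserve every entry.
    static Properties roundTrip(Properties props, String comment) {
        return fromXml(toXml(props, comment));
    }

    public static void main(String[] args) {
        Properties prop = new Properties();
        prop.setProperty("one-two", "buckle my shoe");
        prop.setProperty("three-four", "shut the door");
        System.out.println(toXml(prop, "Rhyme"));
        System.out.println(roundTrip(prop, "Rhyme").getProperty("three-four"));
    }
}
```

Entry order in the generated XML is not guaranteed to match insertion order, which is consistent with the shuffled rhyme.xml shown on the slide.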
  • 56. Sample log4j.xml config file
      <?xml version="1.0" encoding="UTF-8"?>
      <!DOCTYPE log4j:configuration SYSTEM "log4j.dtd">
      <log4j:configuration xmlns:log4j="http://jakarta.apache.org/log4j/" debug="true">
        <appender name="rolling" class="org.apache.log4j.RollingFileAppender">
          <param name="File" value="${LOG_FILE_PATH}/pestt.log" />
          <param name="MaxFileSize" value="1000KB" />
          <param name="MaxBackupIndex" value="5" />
          <param name="Threshold" value="debug" />
          <layout class="org.apache.log4j.PatternLayout">
            <param name="ConversionPattern" value="%d %-5p [%c] - %m%n" />
          </layout>
        </appender>
        <root>
          <priority value="debug" />
          <appender-ref ref="rolling" />
        </root>
      </log4j:configuration>
  • 57. XML to build Java projects: Ant – the standard way to build Java for many years. Maven – a smart way to build projects and manage complex dependencies. Ant is procedural, whereas Maven is declarative. Ant files are typically named build.xml; Maven files are typically named pom.xml. Maven is more complex but worth learning. Ant and Maven build files are very readable.
  • 58. Java API for XML (JAX): A set of packages comprising: Java API for XML Processing (JAXP); Java API for XML-based RPC (JAX-RPC); Java API for XML Registries (JAXR); Java Architecture for XML Binding (JAXB). Implemented in Java SE and Java EE3.
  • 59. Java API for XML Processing (JAXP): Consists of APIs to parse, search and transform XML files. JAXP was standard issue with Java 5. JAXP components: DOM parser; SAX parser; XPath lookups; XSLT. Implementations are hidden from end users.
  • 60. Java API for XML Registries (JAXR): Enables lookup of XML registries. A registry is infrastructure that enables building, deployment, and discovery of web services. JAXR supports UDDI and ebXML. Consists of a JAXR client and a JAXR provider.
  • 61. Java API for XML-based RPC (JAX-RPC): V2.0 → JAX-WS. Java EE, .NET, LAMP interoperability. SOAP, REST support. Client → proxy. Proxy → JAX-RPC RS. Method → SOAP msg. SOAP msg transmitted over HTTP.
  • 62. Java Architecture for XML Binding (JAXB): Part of Java 6. Maps XSD ↔ Java. Is object-oriented. More high-level. xjc := XSD → Java. schemagen := Java → XSD.
  • 63. JAXB (continued)
      <?xml version="1.0"?>
      <person xmlns="http://www.example.com/person">
        <firstName>Lola</firstName>
        <lastName>Boone</lastName>
      </person>

      // Unmarshalling to Java from XML
      JAXBContext context = JAXBContext.newInstance(Person.class);
      Unmarshaller unmarshaller = context.createUnmarshaller();
      Person person = (Person) unmarshaller.unmarshal(new File("lola.xml"));
      System.out.println(person.getFirstName());

      // Marshalling to XML from Java
      Person lola = new Person();
      lola.setFirstName("Lola");
      lola.setLastName("Boone");
      Marshaller marshaller = context.createMarshaller();
      marshaller.marshal(lola, new FileWriter("lola.xml"));
  • 64. XML in Java enterprise development: Spring – wildly popular dependency injection framework. Hibernate – the most widely used object → relational mapping framework. iBATIS – a popular object → query mapping framework. SOAP – web service configuration and messages. JPA – relatively new persistence spec (part of EJB3). App servers and web containers (JBoss, Tomcat, Jetty). Struts, JSF, AJAX – Java web frameworks.
  • 65. XML in databases: Enterprise RDBMSs support XML: Oracle, SQL Server, DB2 and Sybase. Oracle has had XMLType since v 9.x. XML DOMs can be persisted in Oracle (and queried). The syntax is somewhat clunky, but queries from SQL*Plus are possible. Consider other options if your RDBMS doesn't persist XML natively (e.g. write to a CLOB).
  • 66. Native XML databases: A popular fad in the early 21st century. Not too many use cases for this: possibly document publishing and web publishing. Mark Logic is the most popular native XML db. Open source XML dbs include Xindice (no support for XQuery) and eXist (performance issues). Tamino – first native XML db – now abandoned.
  • 67. XML and web services: Two kinds of web services: SOAP and REST. SOAP is still the dominant way to do web services, but REST is gaining popularity. WSDL is an XML-based language to describe web services and how to access them. Sample WSDL: http://geocoder.us/dist/eg/clients/GeoCoder.wsdl. Besides WSDL, SOAP messages are also specified in XML (demo GeoCoder example).
  • 68. References: http://www.w3schools.com; http://www.xml.com; http://www.xml.org; http://www.zvon.org; https://jaxp.dev.java.net/1.4/; http://www.saxproject.org; http://projects.apache.org/indexes/category.html#xml; Professional XML by Bill Evjen et al.; Pro XML Development with Java Technology by Ajay Vohra et al.; Java and XML by Brett McLaughlin et al.