Parsing XML with SAX, DOM & JDOM
         Hicham Qaissi
          hicham.qaissi@gmail.com




Contents
    0.   What is an XML parser?
    1.   Describing the example to develop
    2.   SAX
    3.   DOM
    4.   JDOM
    5.   Conclusion




0. What is an XML parser?
        XML parsers let us analyze and compose XML documents. By analyzing the data and
structure of an XML document, we can build objects in a programming language (Java in our
case). We can also do the inverse, in other words, build an XML document from some data
objects (see Fig. 1). In this manual, I walk through examples of three APIs: SAX, DOM and
JDOM.




1. Describing the example to develop
        The example that I develop is an entertaining one, and it is the same for all three
APIs (SAX, DOM and JDOM). It consists of analyzing an XML document that contains information
about some books: ISBN code (isbn is an attribute), name, author name, price and editorial.
The program expects a book code (ISBN) and searches for that book in the XML. If the book
exists, all its information is printed to standard output; otherwise, we print a message
saying that the book doesn't exist in the XML. Are you finding it as amusing as I do?
Let's go!




The example XML file (books.xml) is the following:
<books>
     <book isbn="0000000001">
          <name>Book 1</name>
          <author>Author name 1</author>
          <price>12.54</price>
          <editorial>Editorial 1</editorial>
     </book>
     <book isbn="0000000002">
          <name>Book 2</name>
          <author>Author name 2</author>
          <price>58.25</price>
          <editorial>Editorial 2</editorial>
     </book>
     <book isbn="0000000003">
          <name>Book 3</name>
          <author>Author name 3</author>
          <price>29.45</price>
          <editorial>Editorial 3</editorial>
     </book>
     <book isbn="0000000004">
          <name>Book 4</name>
          <author>Author name 4</author>
          <price>78.95</price>
          <editorial>Editorial 4</editorial>
     </book>
     <book isbn="0000000005">
          <name>Book 5</name>
          <author>Author name 5</author>
          <price>61.25</price>
          <editorial>Editorial 5</editorial>
     </book>
</books>




For all parsers (SAX, DOM & JDOM), I use this DTO (Data Transfer Object):

public class MyBook {

    private String isbn;
    private String name;
    private String author;
    private String price;
    private String editorial;

    public String getIsbn() {
      return isbn;
    }
    public void setIsbn(String isbn) {
      this.isbn = isbn;
    }
    public String getName() {
      return name;
    }
    public void setName(String name) {
      this.name = name;
    }
    public String getAuthor() {
      return author;
    }
    public void setAuthor(String author) {
      this.author = author;
    }
    public String getPrice() {
      return price;
    }
    public void setPrice(String price) {
      this.price = price;
    }
    public String getEditorial() {
      return editorial;
    }
    public void setEditorial(String editorial) {
      this.editorial = editorial;
    }
}




2. SAX
         SAX (Simple API for XML) works with events and associated methods. As the parser
reads the XML document and finds its components (elements, attributes, values, etc.) or
detects errors, it raises events by invoking the methods that the programmer has associated
with them. You can find more information about SAX at www.saxproject.org.
         First, be sure that the SAX jar is on the classpath (the jar file can be downloaded
from http://sourceforge.net/projects/sax/files/). We must instantiate the reader. This reader
implements the XMLReader interface; we can obtain it from the abstract class SAXParser, which
I obtain in turn from the SAXParserFactory. The parse method of XMLReader analyzes the XML
document:
import java.io.IOException;
import org.xml.sax.SAXException;
import javax.xml.parsers.ParserConfigurationException;

import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.XMLReader;

public class MySAXSearcher {

    public static void main(String[] args) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            factory.setValidating(true);
            SAXParser saxParser = factory.newSAXParser();
            XMLReader xr = saxParser.getXMLReader();

            // args[0] is the path (or URI) of the XML document to parse
            xr.parse(args[0]);
        } catch (IOException ioe) {
            System.out.println("Error: " + ioe.getMessage());
        } catch (SAXException saxe) {
            System.out.println("Error: " + saxe.getMessage());
        } catch (ParserConfigurationException pce) {
            System.out.println("Error: " + pce.getMessage());
        }
    }
}

         If the program compiles, Java and the jar file are set up correctly. Nevertheless,
the program doesn't do anything visible yet, because we haven't registered interest in any
event. Note that we must catch the exceptions java.io.IOException, org.xml.sax.SAXException
and javax.xml.parsers.ParserConfigurationException.




To handle the events, our main class extends org.xml.sax.helpers.DefaultHandler.
DefaultHandler implements the following interfaces:
org.xml.sax.ContentHandler: events about data (the most widely used)
org.xml.sax.ErrorHandler: events about errors
org.xml.sax.DTDHandler: DTD treatment
org.xml.sax.EntityResolver: external entities

        We can also write our own classes implementing ContentHandler and ErrorHandler to
treat the events we are interested in:
        Data: implement ContentHandler and associate it with the reader (parser) through the
method setContentHandler().
        Errors: implement ErrorHandler and associate it with the reader (parser) through the
method setErrorHandler().


        The most important methods of the ContentHandler interface (implemented by
DefaultHandler, which is extended by our class MySAXSearcher) are:

    •   startDocument(): receives notification of the beginning of a document.
    •   endDocument(): receives notification of the end of a document.
    •   startElement(): receives notification of the beginning of an element.
    •   endElement(): receives notification of the end of an element.
    •   characters(): receives notification of character data.

        See more about ContentHandler at
http://download.oracle.com/javase/1.4.2/docs/api/org/xml/sax/ContentHandler.html.
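
        To see the order in which these callbacks fire, here is a minimal sketch that records
every event for a small in-memory document. The class name EventLogger and its log() helper
are assumptions made for illustration; they are not part of the example developed in this
manual. The sketch relies only on the JAXP classes bundled with the JDK.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, not part of the original example.
class EventLogger extends DefaultHandler {

  // Event descriptions, in the order the parser fired them.
  final List<String> events = new ArrayList<String>();

  @Override
  public void startDocument() {
    events.add("startDocument");
  }

  @Override
  public void endDocument() {
    events.add("endDocument");
  }

  @Override
  public void startElement(String uri, String local, String raw, Attributes attrs) {
    // The factory below is not namespace aware, so the qualified name is used.
    events.add("startElement:" + raw);
  }

  @Override
  public void endElement(String uri, String local, String raw) {
    events.add("endElement:" + raw);
  }

  @Override
  public void characters(char[] ch, int start, int length) {
    String text = new String(ch, start, length).trim();
    if (!text.isEmpty()) {
      events.add("characters:" + text);
    }
  }

  // Parses an in-memory XML string and returns the recorded events.
  static List<String> log(String xml) throws Exception {
    SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
    EventLogger handler = new EventLogger();
    parser.parse(new InputSource(new StringReader(xml)), handler);
    return handler.events;
  }

  public static void main(String[] args) throws Exception {
    for (String event : log("<book isbn=\"1\"><name>Book 1</name></book>")) {
      System.out.println(event);
    }
  }
}
```

        Extending DefaultHandler keeps the sketch short, because only the callbacks of
interest need to be overridden; every other method keeps its empty default implementation.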


        Now, MySAXSearcher is the following. I've written my own ContentHandler and
ErrorHandler classes, which is much cleaner than overriding the relevant ContentHandler and
ErrorHandler methods in our class that extends DefaultHandler:




MySAXSearcher.java:
import java.io.IOException;

import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;


public class MySAXSearcher extends DefaultHandler {

  public static void main(String[] args) {
    MySAXSearcher searcher = new MySAXSearcher();
    searcher.searchBook(args[0], args[1]);
  }

  private void searchBook(String xml, String isbn) {
    try {
      SAXParserFactory factory = SAXParserFactory.newInstance();
      factory.setNamespaceAware(true);
      factory.setValidating(true);
      SAXParser saxParser = factory.newSAXParser();

      XMLReader xr = saxParser.getXMLReader();

      // Assigning my own ContentHandler to my XMLReader.
      MyContentHandler ch = new MyContentHandler();
      ch.isbnSearched = isbn;
      xr.setContentHandler(ch);
      // Assigning my own ErrorHandler to my XMLReader.
      xr.setErrorHandler(new MyErrorHandler());

      xr.setFeature("http://xml.org/sax/features/validation", false);
      xr.setFeature("http://xml.org/sax/features/namespaces", true);

      long before = System.currentTimeMillis();
      xr.parse(xml);
      long after = System.currentTimeMillis();
      printResult(xml, ch, after - before);
    } catch (IOException ioe) {
      System.out.println("Error: " + ioe.getMessage());
    } catch (SAXException saxe) {
      System.out.println("Error: " + saxe.getMessage());
    } catch (ParserConfigurationException pce) {
      System.out.println("Error: " + pce.getMessage());
    }
  }

  public void printResult(String xml, MyContentHandler ch, long time) {
    System.out.println("Document " + xml + ". Parsed in : " + time + " ms");
    if (ch.book != null) {
      System.out.println("Book found:");
      System.out.println(" Isbn: "      + ch.book.getIsbn());
      System.out.println(" Name: "      + ch.book.getName());
      System.out.println(" Author: "    + ch.book.getAuthor());
      System.out.println(" Price: "     + ch.book.getPrice());
      System.out.println(" Editorial: " + ch.book.getEditorial());
    } else {
      System.out.println("Book not found");
    }
  }
}


MyContentHandler.java:
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.Locator;
import org.xml.sax.SAXException;

public class MyContentHandler implements ContentHandler {

  boolean isBookFound = false;
  String isbnSearched = "";
  String currentNode = "";
  MyBook book = null;

  @Override
  public void startDocument() throws SAXException {
    System.out.println("***Start document***");
  }

  @Override
  public void endDocument() throws SAXException {
    System.out.println("***End document***");
  }

  @Override
  public void startElement(String uri, String local, String raw,
      Attributes attrs) {
    currentNode = local;
    if ("book".equals(local) && !isBookFound) {
      // The book node has only one attribute (isbn)
      if ("isbn".equals(attrs.getLocalName(0))
          && isbnSearched.equals(attrs.getValue(0))) {
        isBookFound = true;
        book = new MyBook();
        book.setIsbn(isbnSearched);
      }
    }
  }

  @Override
  public void characters(char ch[], int start, int length) {
    // I get the text value
    String value = new String(ch, start, length).trim();
    if (!"".equals(value) && isBookFound) {
      if ("name".equals(currentNode)) {
        book.setName(value);
      } else if ("author".equals(currentNode)) {
        book.setAuthor(value);
      } else if ("price".equals(currentNode)) {
        book.setPrice(value);
      } else if ("editorial".equals(currentNode)) {
        // editorial is the last child of <book>, so the search ends here
        book.setEditorial(value);
        isBookFound = false;
      }
    }
  }

  // The remaining ContentHandler methods are not needed for this example.

  @Override
  public void endElement(String arg0, String arg1, String arg2)
      throws SAXException {
  }

  @Override
  public void endPrefixMapping(String arg0) throws SAXException {
  }

  @Override
  public void ignorableWhitespace(char[] arg0, int arg1, int arg2)
      throws SAXException {
  }

  @Override
  public void processingInstruction(String arg0, String arg1)
      throws SAXException {
  }

  @Override
  public void setDocumentLocator(Locator arg0) {
  }

  @Override
  public void skippedEntity(String arg0) throws SAXException {
  }

  @Override
  public void startPrefixMapping(String arg0, String arg1)
      throws SAXException {
  }
}


MyErrorHandler.java:
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

public class MyErrorHandler implements ErrorHandler {

  @Override
  public void warning(SAXParseException ex) {
    System.err.println("[Warning] : " + ex.getMessage());
  }

  @Override
  public void error(SAXParseException ex) {
    System.err.println("[Error] : " + ex.getMessage());
  }

  @Override
  public void fatalError(SAXParseException ex) throws SAXException {
    System.err.println("[Error!] : " + ex.getMessage());
  }
}

       With our XML file (books.xml) and the book code 0000000003 to search for, we can
execute our program with:
                     java MySAXSearcher "books.xml" "0000000003"




The result should look like the following (the parsing time will vary):

                       ***Start document***
                       ***End document***
                       Document books.xml. Parsed in : 141 ms
                       Book found:
                         Isbn: 0000000003
                         Name: Book 3
                         Author: Author name 3
                         Price: 29.45
                         Editorial: Editorial 3
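
       One caveat about MyContentHandler: SAX allows a parser to deliver the text of a single
element in several characters() calls, for instance around an entity reference such as
&amp;. A handler that stores each chunk directly may then keep only the last piece. A common
remedy is to accumulate the text in a buffer and read it in endElement(). The following is a
minimal sketch of that idea; the class name BufferedTextHandler and its parse() helper are
assumptions made for illustration, not part of the example above.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;

import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: collects the text of every leaf element by name.
class BufferedTextHandler extends DefaultHandler {

  private final StringBuilder buffer = new StringBuilder();
  final Map<String, String> values = new LinkedHashMap<String, String>();

  @Override
  public void startElement(String uri, String local, String raw, Attributes attrs) {
    buffer.setLength(0); // a new element starts: discard text seen so far
  }

  @Override
  public void characters(char[] ch, int start, int length) {
    buffer.append(ch, start, length); // may be called several times per element
  }

  @Override
  public void endElement(String uri, String local, String raw) {
    String text = buffer.toString().trim();
    if (!text.isEmpty()) {
      values.put(raw, text); // the complete text is only known here
    }
    buffer.setLength(0);
  }

  // Parses an in-memory XML string and returns element name -> text.
  static Map<String, String> parse(String xml) throws Exception {
    BufferedTextHandler handler = new BufferedTextHandler();
    SAXParserFactory.newInstance().newSAXParser()
        .parse(new InputSource(new StringReader(xml)), handler);
    return handler.values;
  }

  public static void main(String[] args) throws Exception {
    String xml = "<book isbn=\"0000000003\"><name>Book 3</name>"
        + "<price>29.45</price></book>";
    System.out.println(parse(xml));
  }
}
```

       The same buffering could be added to MyContentHandler by appending in characters() and
calling the MyBook setters from endElement().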




3. DOM
       While SAX gives sequential access to the elements of the document as it reads them,
DOM (Document Object Model) delivers the parsed document as a tree that can be traversed and
transformed. Compared with SAX, DOM has some disadvantages and advantages:
       Disadvantages:
           •   The data can be accessed only once the entire document has been parsed.
           •   The tree is an object loaded in memory; this is problematic for big and
               complex documents.
       Advantages:
           •   With DOM we can manipulate the XML document (update, delete and add
               elements). We can also create a new XML document.
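
       As an illustration of this advantage, here is a minimal sketch that creates a small
document in memory, updates and deletes elements, and writes the tree back out as XML text.
The class name DomWriterSketch and its two helpers are assumptions made for illustration;
only standard JAXP classes from the JDK are used.

```java
import java.io.StringWriter;

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;

import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper class, not part of the original example.
class DomWriterSketch {

  // Builds <books><book isbn="..."><name>...</name></book></books> in memory.
  static Document buildBooks(String isbn, String name) throws Exception {
    Document doc = DocumentBuilderFactory.newInstance()
        .newDocumentBuilder().newDocument();
    Element books = doc.createElement("books");
    doc.appendChild(books);

    Element book = doc.createElement("book");
    book.setAttribute("isbn", isbn);
    books.appendChild(book);

    Element nameElement = doc.createElement("name");
    nameElement.setTextContent(name);
    book.appendChild(nameElement);
    return doc;
  }

  // Serializes the tree back to XML text, without the XML declaration.
  static String toXml(Document doc) throws Exception {
    Transformer transformer = TransformerFactory.newInstance().newTransformer();
    transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
    StringWriter out = new StringWriter();
    transformer.transform(new DOMSource(doc), new StreamResult(out));
    return out.toString();
  }

  public static void main(String[] args) throws Exception {
    Document doc = buildBooks("0000000006", "Book 6");
    System.out.println(toXml(doc));

    // Update: change the text of <name>.
    doc.getElementsByTagName("name").item(0).setTextContent("Book 6 (2nd ed.)");
    System.out.println(toXml(doc));

    // Delete: remove the <book> element from its parent.
    Element book = (Element) doc.getElementsByTagName("book").item(0);
    book.getParentNode().removeChild(book);
    System.out.println(toXml(doc));
  }
}
```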


       To manipulate an XML document, we must obtain an object that implements the Document
interface (which extends the Node interface). We use the classes
javax.xml.parsers.DocumentBuilderFactory and javax.xml.parsers.DocumentBuilder, and invoke
the parse() method of DocumentBuilder to obtain a Document object.


       To manipulate XML with DOM, some important interfaces are: org.w3c.dom.Document
(represents the entire XML document), org.w3c.dom.Element (an element of the XML document),
org.w3c.dom.Node (any single node of the tree, which may have child nodes) and
org.w3c.dom.Attr (an attribute of an element).
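
       A minimal sketch of how these interfaces fit together: it parses a small in-memory
document with DocumentBuilder.parse() and then reads an attribute and the child elements of
the first <book>. The class name DomTypesSketch and the method describeFirstBook() are
assumptions made for illustration.

```java
import java.io.StringReader;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of the original example.
class DomTypesSketch {

  // Returns "isbn=<value>; <child>=<text>; ..." for the first <book> element.
  static String describeFirstBook(String xml) throws Exception {
    DocumentBuilder builder =
        DocumentBuilderFactory.newInstance().newDocumentBuilder();
    Document doc = builder.parse(new InputSource(new StringReader(xml)));

    Element book = (Element) doc.getElementsByTagName("book").item(0);
    Attr isbn = book.getAttributeNode("isbn");
    StringBuilder description = new StringBuilder("isbn=" + isbn.getValue());

    NodeList children = book.getChildNodes();
    for (int i = 0; i < children.getLength(); i++) {
      Node child = children.item(i);
      // Whitespace between tags shows up as text nodes, so keep elements only.
      if (child.getNodeType() == Node.ELEMENT_NODE) {
        description.append("; ").append(child.getNodeName())
            .append('=').append(child.getTextContent());
      }
    }
    return description.toString();
  }

  public static void main(String[] args) throws Exception {
    String xml = "<books><book isbn=\"0000000001\">\n"
        + "  <name>Book 1</name>\n  <price>12.54</price>\n</book></books>";
    // prints: isbn=0000000001; name=Book 1; price=12.54
    System.out.println(describeFirstBook(xml));
  }
}
```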


        OK, now let's talk in Java code. As DTO (Data Transfer Object), I use the same
MyBook class.




MyDOMSearcher.java:

import java.io.File;
import java.io.IOException;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;

import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

public class MyDOMSearcher {

  public static void main(String[] args) {
    MyDOMSearcher searcher = new MyDOMSearcher();
    searcher.searchBook(args[0], args[1]);
  }

  private void searchBook(String xml, String isbn) {
    long before = System.currentTimeMillis();
    MyBook book = null;
    try {
      DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
      factory.setNamespaceAware(true);
      factory.setValidating(true);
      DocumentBuilder parser = factory.newDocumentBuilder();
      // I assign my own ErrorHandler to my parser
      parser.setErrorHandler(new MyErrorHandler());
      File file = new File(xml);
      Document doc = parser.parse(file);
      // I obtain all the <book> elements.
      // NodeList is an interface that has 2 methods:
      //    1. item(int): returns the Node at the given position.
      //    2. getLength(): returns the length of the list.
      NodeList booksNodes = doc.getElementsByTagName("book");

      for (int i = 0; i < booksNodes.getLength(); i++) {
        Node node = booksNodes.item(i);
        if (node != null && node.hasAttributes()) {
          String isbnAttribute =
              node.getAttributes().getNamedItem("isbn").getNodeValue();
          if (isbnAttribute.equals(isbn)) {
            // I've caught the isbn searched
            if (book == null) {
              book = new MyBook();
              book.setIsbn(isbn);
            }
            if (node.hasChildNodes()) {
              NodeList bookChildsNodes = node.getChildNodes();
              for (int j = 0; j < bookChildsNodes.getLength(); j++) {
                Node child = bookChildsNodes.item(j);
                if ("name".equals(child.getNodeName())) {
                  book.setName(child.getTextContent());
                } else if ("author".equals(child.getNodeName())) {
                  book.setAuthor(child.getTextContent());
                } else if ("price".equals(child.getNodeName())) {
                  book.setPrice(child.getTextContent());
                } else if ("editorial".equals(child.getNodeName())) {
                  book.setEditorial(child.getTextContent());
                  // I've found my book. Ending the for iteration
                  break;
                }
              }
            }
          }
        }
      }
    } catch (IOException ioe) {
      System.err.println("[Error] : " + ioe.getMessage());
    } catch (ParserConfigurationException pce) {
      System.err.println("[Error] : " + pce.getMessage());
    } catch (SAXException se) {
      System.err.println("[Error] : " + se.getMessage());
    }
    long after = System.currentTimeMillis();
    printResults(xml, book, after - before);
  }

  public void printResults(String xml, MyBook book, long time) {
    System.out.println("Document " + xml + ". Parsed in : " + time + " ms");
    if (book != null) {
      System.out.println("Book found:");
      System.out.println(" Isbn: "      + book.getIsbn());
      System.out.println(" Name: "      + book.getName());
      System.out.println(" Author: "    + book.getAuthor());
      System.out.println(" Price: "     + book.getPrice());
      System.out.println(" Editorial: " + book.getEditorial());
    } else {
      System.out.println("Book not found");
    }
  }
}




4. JDOM
        The preceding APIs are available for many programming languages, but using them in
Java is laborious. JDOM is an API designed specifically for Java: it builds on Java's own
capabilities and features, which makes XML parsing easier. You can find related information
at www.jdom.org.

        Now, let's build the same example (searching for a book in our XML) with JDOM. Be sure
that the JDOM jar is on your classpath; you can download it from
http://www.jdom.org/dist/binary/.



MyJDOMSearcher.java:

import java.io.IOException;
import java.util.Iterator;
import java.util.List;

import org.jdom.Document;
import org.jdom.Element;
import org.jdom.JDOMException;
import org.jdom.input.SAXBuilder;


public class MyJDOMSearcher {

  private String isbn;
  private MyBook book;
  private boolean noSearchMore = false;

  public static void main(String[] args) {
    try {
      long before = System.currentTimeMillis();
      MyJDOMSearcher searcher = new MyJDOMSearcher();
      // The second parameter is the isbn to search
      searcher.isbn = args[1];
      SAXBuilder saxBuilder = new SAXBuilder();
      Document document = saxBuilder.build(args[0]);
      searcher.searchBook(document.getRootElement());
      long after = System.currentTimeMillis();
      searcher.printResults(args[0], after - before);
    } catch (JDOMException jde) {
      System.err.println("[Error] JDOMException: " + jde.getMessage());
    } catch (IOException ioe) {
      System.err.println("[Error] IOException: " + ioe.getMessage());
    }
  }

  // Recursively descend the tree
  private void searchBook(Element element) {

    inspect(element);

    List content = element.getContent();
    Iterator iterator = content.iterator();
    Element child = null;
    Object object = null;

    // On the first call, element is the "books" node
    while (iterator.hasNext()) {
      object = iterator.next();
      if (object instanceof Element) {
        child = ((Element) object); // Casting from Object to Element
        searchBook(child);
      }
    }
  }

  // Copy the data of the searched book into the DTO
  public void inspect(Element element) {
    if (!noSearchMore) { // If I already have the whole book, I do nothing more
      if ("book".equals(element.getQualifiedName()) && book == null) {
        if (isbn.equals(element.getAttribute("isbn").getValue())) {
          book = new MyBook();
          book.setIsbn(isbn);
        }
      }
      if (book != null) {
        if ("name".equals(element.getQualifiedName())) {
          book.setName(element.getValue());
        }
        if ("author".equals(element.getQualifiedName())) {
          book.setAuthor(element.getValue());
        }
        if ("price".equals(element.getQualifiedName())) {
          book.setPrice(element.getValue());
        }
        if ("editorial".equals(element.getQualifiedName())) {
          book.setEditorial(element.getValue());
          noSearchMore = true;
        }
      }
    }
  }

  private void printResults(String xml, long time) {
    System.out.println("Document " + xml + ". Parsed in : " + time + " ms");
    if (book != null) {
      System.out.println("Book found:");
      System.out.println(" Isbn: "      + book.getIsbn());
      System.out.println(" Name: "      + book.getName());
      System.out.println(" Author: "    + book.getAuthor());
      System.out.println(" Price: "     + book.getPrice());
      System.out.println(" Editorial: " + book.getEditorial());
    } else {
      System.out.println("Book not found");
    }
  }

}




5. Conclusion
       Executing the same example with the three APIs (MySAXSearcher, MyDOMSearcher and
MyJDOMSearcher), passing as parameters the same XML file and the isbn to search
("0000000003"), the execution times obtained are the following:

      MySAXSearcher                 MyDOMSearcher                      MyJDOMSearcher
          93 ms                          750 ms                           609 ms

       In this test, the SAX API is faster than DOM and JDOM, but it is more laborious to
program.
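
       Each listing in this manual times a single run with System.currentTimeMillis(), so the
figures are likely to include one-off costs such as class loading and to vary between
machines and runs. A minimal sketch of a steadier measurement, under the assumption that an
in-memory document is representative enough, is to warm up first and then average several
passes. The class ParseTimingSketch, the ParseTask interface and the run counts are all
assumptions made for illustration; JDOM is left out so that the sketch needs only the JDK.

```java
import java.io.StringReader;

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical timing harness, not part of the original example.
class ParseTimingSketch {

  // One parsing strategy to be measured.
  interface ParseTask {
    void parse(String xml) throws Exception;
  }

  // Runs the task `warmups` times untimed, then returns the mean time in
  // milliseconds over `runs` timed passes.
  static double averageMillis(ParseTask task, String xml, int warmups, int runs)
      throws Exception {
    for (int i = 0; i < warmups; i++) {
      task.parse(xml);
    }
    long start = System.nanoTime();
    for (int i = 0; i < runs; i++) {
      task.parse(xml);
    }
    return (System.nanoTime() - start) / 1000000.0 / runs;
  }

  // SAX pass with an empty handler: measures parsing only.
  static final ParseTask SAX = new ParseTask() {
    public void parse(String xml) throws Exception {
      SAXParserFactory.newInstance().newSAXParser()
          .parse(new InputSource(new StringReader(xml)), new DefaultHandler());
    }
  };

  // DOM pass: parsing plus building the whole tree in memory.
  static final ParseTask DOM = new ParseTask() {
    public void parse(String xml) throws Exception {
      DocumentBuilderFactory.newInstance().newDocumentBuilder()
          .parse(new InputSource(new StringReader(xml)));
    }
  };

  public static void main(String[] args) throws Exception {
    StringBuilder xml = new StringBuilder("<books>");
    for (int i = 0; i < 1000; i++) {
      xml.append("<book isbn=\"").append(i).append("\"><name>Book ")
          .append(i).append("</name></book>");
    }
    xml.append("</books>");
    System.out.println("SAX: " + averageMillis(SAX, xml.toString(), 20, 50) + " ms");
    System.out.println("DOM: " + averageMillis(DOM, xml.toString(), 20, 50) + " ms");
  }
}
```

       The absolute numbers will differ from the table above; the sketch is only meant to
make repeated comparisons on the same machine more stable.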




08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Tech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfTech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdf
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 

SAX, DOM & JDOM parsers for beginners

  • 1. Parsing XML with SAX, DOM & JDOM. Hicham Qaissi, hicham.qaissi@gmail.com
  • 2. Contents
    0. What is an XML parser?
    1. Describing the example to develop
    2. SAX
    3. DOM
    4. JDOM
    5. Conclusion
  • 3. 0. What is an XML parser?
    XML parsers let us analyze and compose XML documents. By analyzing the data and structure of an XML document, we can build objects in a programming language (Java in our case). We can also do the inverse, that is, build an XML document from data objects (see Fig. 1). In this manual I look at three APIs, with examples: SAX, DOM and JDOM.
    1. Describing the example to develop
    The example is deliberately simple, and it is the same for all three APIs (SAX, DOM and JDOM). It analyzes an XML document that contains information about some books: ISBN code (isbn is an attribute), name, author name, price and editorial (publisher). The program expects a book code (ISBN) and searches for that book in the XML. If the book exists, all its information is printed to standard output; otherwise, we print a message saying that the book doesn't exist in the XML. Are you finding it as amusing as I do? Let's go!
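The inverse process mentioned above (from data objects to XML) is not developed further in this manual. As a rough sketch only, it could look like the following with the DOM and Transformer classes bundled in the JDK. The class name BookXmlWriter and its methods are hypothetical and are not part of the original example.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper (not in the original manual): composes a <book> element from plain values.
class BookXmlWriter {

    // Builds a small DOM tree for one book and serializes it to a String.
    static String toXml(String isbn, String name, String price) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();

        Element book = doc.createElement("book");
        book.setAttribute("isbn", isbn);   // isbn is an attribute, as in books.xml
        doc.appendChild(book);

        appendTextChild(doc, book, "name", name);
        appendTextChild(doc, book, "price", price);

        // An identity Transformer copies the DOM tree to the output unchanged.
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    // Adds <tag>text</tag> under the given parent element.
    private static void appendTextChild(Document doc, Element parent, String tag, String text) {
        Element child = doc.createElement(tag);
        child.setTextContent(text);
        parent.appendChild(child);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(toXml("0000000001", "Book 1", "12.54"));
    }
}
```

The sketch uses DOM only because it ships with the JDK; JDOM offers its own classes for the same job.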
  • 4. The XML example (books.xml) is the following:

    <books>
      <book isbn="0000000001">
        <name>Book 1</name>
        <author>Author name 1</author>
        <price>12.54</price>
        <editorial>Editorial 1</editorial>
      </book>
      <book isbn="0000000002">
        <name>Book 2</name>
        <author>Author name 2</author>
        <price>58.25</price>
        <editorial>Editorial 2</editorial>
      </book>
      <book isbn="0000000003">
        <name>Book 3</name>
        <author>Author name 3</author>
        <price>29.45</price>
        <editorial>Editorial 3</editorial>
      </book>
      <book isbn="0000000004">
        <name>Book 4</name>
        <author>Author name 4</author>
        <price>78.95</price>
        <editorial>Editorial 4</editorial>
      </book>
      <book isbn="0000000005">
        <name>PBook 5</name>
        <author>Author name 5</author>
        <price>61.25</price>
        <editorial>Editorial 5</editorial>
      </book>
    </books>
  • 5. For all parsers (SAX, DOM & JDOM), I use this DTO (Data Transfer Object):

    public class MyBook {
        private String isbn;
        private String name;
        private String author;
        private String price;
        private String editorial;

        public String getIsbn() { return isbn; }
        public void setIsbn(String isbn) { this.isbn = isbn; }

        public String getName() { return name; }
        public void setName(String name) { this.name = name; }

        public String getAuthor() { return author; }
        public void setAuthor(String author) { this.author = author; }

        public String getPrice() { return price; }
        public void setPrice(String price) { this.price = price; }

        public String getEditorial() { return editorial; }
        public void setEditorial(String editorial) { this.editorial = editorial; }
    }
  • 6. 2. SAX
    SAX (Simple API for XML) works with events and associated methods. As the parser reads the XML document and finds its components (elements, attributes, values, etc.) or detects errors, it invokes the methods that the programmer has associated with those events. You can find more information about SAX at www.saxproject.org.
    First, be sure that the SAX classes are on the classpath (the jar file can be downloaded from http://sourceforge.net/projects/sax/files/; recent JDKs already bundle the SAX API). We must instantiate the reader. This reader implements the XMLReader interface, and we obtain it from the abstract class SAXParser, which in turn comes from SAXParserFactory. The parse method of XMLReader analyzes the XML document:

    import java.io.IOException;
    import javax.xml.parsers.ParserConfigurationException;
    import javax.xml.parsers.SAXParser;
    import javax.xml.parsers.SAXParserFactory;
    import org.xml.sax.SAXException;
    import org.xml.sax.XMLReader;

    public class MySAXSearcher {
        public static void main(String[] args) {
            try {
                SAXParserFactory factory = SAXParserFactory.newInstance();
                factory.setNamespaceAware(true);
                factory.setValidating(true);
                SAXParser saxParser = factory.newSAXParser();
                XMLReader xr = saxParser.getXMLReader();
                // args[0] is the path (or URI) of the XML document to parse
                xr.parse(args[0]);
            } catch (IOException ioe) {
                System.out.println("Error: " + ioe.getMessage());
            } catch (SAXException saxe) {
                System.out.println("Error: " + saxe.getMessage());
            } catch (ParserConfigurationException pce) {
                System.out.println("Error: " + pce.getMessage());
            }
        }
    }

    If the program compiles, Java and the jar file are set up correctly. Nevertheless, the program doesn't do anything visible yet, because we haven't registered a handler for any event. Note that java.io.IOException, org.xml.sax.SAXException and javax.xml.parsers.ParserConfigurationException must be caught.
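The program above registers no handler, so no callback does any work. As a minimal sketch of what the event model looks like once a handler is attached, the following parses a small in-memory document and records each callback in the order it fires. The class name SaxEventTrace and the inline XML are hypothetical and only serve the illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper (not in the original manual): records SAX callbacks as strings.
class SaxEventTrace {

    // Parses the XML text and returns the events in the order the parser reported them.
    static List<String> trace(String xml) throws Exception {
        final List<String> events = new ArrayList<>();

        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startDocument() { events.add("startDocument"); }

            @Override
            public void endDocument() { events.add("endDocument"); }

            @Override
            public void startElement(String uri, String local, String raw, Attributes attrs) {
                // The factory below is not namespace-aware, so the qualified name (raw) is used.
                events.add("startElement:" + raw);
            }

            @Override
            public void endElement(String uri, String local, String raw) {
                events.add("endElement:" + raw);
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // Whitespace-only chunks are skipped to keep the trace short.
                String text = new String(ch, start, length).trim();
                if (!text.isEmpty()) {
                    events.add("characters:" + text);
                }
            }
        };

        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.newSAXParser().parse(new InputSource(new StringReader(xml)), handler);
        return events;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<books><book isbn=\"0000000001\"><name>Book 1</name></book></books>";
        for (String event : trace(xml)) {
            System.out.println(event);
        }
    }
}
```

A parser is allowed to deliver the text of one element in several characters() calls, so real handlers usually accumulate text until endElement() arrives.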
  • 7. To handle the events, our main class must extend org.xml.sax.helpers.DefaultHandler. DefaultHandler implements the following interfaces:
    • org.xml.sax.ContentHandler: events about data (the most widely used)
    • org.xml.sax.ErrorHandler: events about errors
    • org.xml.sax.DTDHandler: DTD handling
    • org.xml.sax.EntityResolver: external entities
    We can also write our own classes implementing ContentHandler and ErrorHandler to treat the events we are interested in:
    • Data: implement ContentHandler and associate it with the reader (parser) through the method setContentHandler().
    • Errors: implement ErrorHandler and associate it with the reader (parser) through the method setErrorHandler().
    The most important methods in the ContentHandler interface (implemented by DefaultHandler, which our class MySAXSearcher extends) are:
    • startDocument(): receive notification of the beginning of a document.
    • endDocument(): receive notification of the end of a document.
    • startElement(): receive notification of the beginning of an element.
    • endElement(): receive notification of the end of an element.
    • characters(): receive notification of character data.
    See more about ContentHandler at http://download.oracle.com/javase/1.4.2/docs/api/org/xml/sax/ContentHandler.html.
    Now, MySAXSearcher is the following (I've written my own ContentHandler and ErrorHandler; this is much cleaner than overriding the relevant ContentHandler and ErrorHandler methods in our class that extends DefaultHandler):
  • 8. MySAXSearcher.java:

    import java.io.IOException;
    import javax.xml.parsers.ParserConfigurationException;
    import javax.xml.parsers.SAXParser;
    import javax.xml.parsers.SAXParserFactory;
    import org.xml.sax.SAXException;
    import org.xml.sax.XMLReader;
    import org.xml.sax.helpers.DefaultHandler;

    public class MySAXSearcher extends DefaultHandler {

        public static void main(String[] args) {
            MySAXSearcher searcher = new MySAXSearcher();
            // args[0]: the XML file, args[1]: the ISBN to search for
            searcher.searchBook(args[0], args[1]);
        }

        private void searchBook(String xml, String isbn) {
            try {
                SAXParserFactory factory = SAXParserFactory.newInstance();
                factory.setNamespaceAware(true);
                factory.setValidating(true);
                SAXParser saxParser = factory.newSAXParser();
                XMLReader xr = saxParser.getXMLReader();
                // Assign my own ContentHandler to my XMLReader.
                MyContentHandler ch = new MyContentHandler();
                ch.isbnSearched = isbn;
                xr.setContentHandler(ch);
                // Assign my own ErrorHandler (see MyErrorHandler.java) to my XMLReader.
                xr.setErrorHandler(new MyErrorHandler());
                // books.xml has no DTD, so validation is switched off on the reader.
                xr.setFeature("http://xml.org/sax/features/validation", false);
                xr.setFeature("http://xml.org/sax/features/namespaces", true);
                long before = System.currentTimeMillis();
                xr.parse(xml);
                long after = System.currentTimeMillis();
                printResult(xml, ch, after - before);
            } catch (IOException ioe) {
                System.out.println("Error: " + ioe.getMessage());
            } catch (SAXException saxe) {
                System.out.println("Error: " + saxe.getMessage());
            } catch (ParserConfigurationException pce) {
                System.out.println("Error: " + pce.getMessage());
            }
        }

        public void printResult(String xml, MyContentHandler ch, long time) {
            System.out.println("Document " + xml + ". Parsed in : " + time + " ms");
            if (ch.book != null) {
                System.out.println("Book found:");
                System.out.println(" Isbn: " + ch.book.getIsbn());
                System.out.println(" Name: " + ch.book.getName());
                System.out.println(" Author: " + ch.book.getAuthor());
                System.out.println(" Price: " + ch.book.getPrice());
                System.out.println(" Editorial: " + ch.book.getEditorial());
            } else {
                System.out.println("Book not found");
            }
        }
    }

  • 9. MyContentHandler.java:

    import org.xml.sax.Attributes;
    import org.xml.sax.ContentHandler;
    import org.xml.sax.Locator;
    import org.xml.sax.SAXException;

    public class MyContentHandler implements ContentHandler {

        boolean isBookFound = false;   // true while we are inside the searched <book>
        String isbnSearched = "";
        String currentNode = "";       // local name of the element being read
        MyBook book = null;

        @Override
        public void startDocument() throws SAXException {
            System.out.println("***Start document***");
        }

        @Override
        public void endDocument() throws SAXException {
            System.out.println("***End document***");
        }

        @Override
        public void startElement(String uri, String local, String raw, Attributes attrs) {
            currentNode = local;
            if ("book".equals(local) && !isBookFound) {
                // The book node has only one attribute (isbn)
                if ("isbn".equals(attrs.getLocalName(0))
                        && isbnSearched.equals(attrs.getValue(0))) {
                    isBookFound = true;
                    book = new MyBook();
                    book.setIsbn(isbnSearched);
                }
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // Get the text value of the current chunk
            String value = new String(ch, start, length).trim();
            if (!value.isEmpty() && isBookFound) {
                if ("name".equals(currentNode)) {
                    book.setName(value);
                } else if ("author".equals(currentNode)) {
                    book.setAuthor(value);
                } else if ("price".equals(currentNode)) {
                    book.setPrice(value);
                } else if ("editorial".equals(currentNode)) {
                    book.setEditorial(value);
                    // editorial is the last child: the book is complete
                    isBookFound = false;
                }
            }
        }

        // The remaining ContentHandler methods are not needed for this example.
        @Override
        public void endElement(String arg0, String arg1, String arg2) throws SAXException { }
        @Override
        public void endPrefixMapping(String arg0) throws SAXException { }
        @Override
        public void ignorableWhitespace(char[] arg0, int arg1, int arg2) throws SAXException { }
        @Override
        public void processingInstruction(String arg0, String arg1) throws SAXException { }
        @Override
        public void setDocumentLocator(Locator arg0) { }
        @Override
        public void skippedEntity(String arg0) throws SAXException { }
        @Override
        public void startPrefixMapping(String arg0, String arg1) throws SAXException { }
    }

  • 10. MyErrorHandler.java:

    import org.xml.sax.ErrorHandler;
    import org.xml.sax.SAXException;
    import org.xml.sax.SAXParseException;

    public class MyErrorHandler implements ErrorHandler {

        @Override
        public void warning(SAXParseException ex) {
            System.err.println("[Warning] : " + ex.getMessage());
        }

        @Override
        public void error(SAXParseException ex) {
            System.err.println("[Error] : " + ex.getMessage());
        }

        @Override
        public void fatalError(SAXParseException ex) throws SAXException {
            System.err.println("[Error!] : " + ex.getMessage());
        }
    }

    With our XML (books.xml) and the book code to search for, 0000000003, we can execute our program with:

    java MySAXSearcher "books.xml" "0000000003"
  • 11. The result must be the following:

    ***Start document***
    ***End document***
    Document books.xml. Parsed in : 141 ms
    Book found:
     Isbn: 0000000003
     Name: Book 3
     Author: Author name 3
     Price: 29.45
     Editorial: Editorial 3

    3. DOM
    DOM (Document Object Model): while SAX hands us the elements of the document one by one as events, DOM delivers the parsed document as a tree that can be traversed and transformed. Compared with SAX, DOM has some disadvantages and advantages.
    Disadvantages:
    • The data can be accessed only once the entire document has been parsed.
    • The tree is an object loaded in memory; this is problematic for big and complex documents.
    Advantages:
    • With DOM we can manipulate the XML document (update, delete and add elements). We can also create a new XML document.
    To manipulate an XML document, we need an object that implements the Document interface (which extends the Node interface). We use the classes javax.xml.parsers.DocumentBuilderFactory and javax.xml.parsers.DocumentBuilder, and invoke the parse() method to obtain a Document object. Some important interfaces for manipulating XML with DOM are: org.w3c.dom.Document (the entire XML document), org.w3c.dom.Element (an element of the XML document), org.w3c.dom.Node (the base interface for every node of the tree) and org.w3c.dom.Attr (an attribute of an element).
    OK, now let's talk in Java code. As DTO (Data Transfer Object), I use the same MyBook object.
  • 12. MyDOMSearcher.java:

    import java.io.File;
    import java.io.IOException;
    import javax.xml.parsers.DocumentBuilder;
    import javax.xml.parsers.DocumentBuilderFactory;
    import javax.xml.parsers.ParserConfigurationException;
    import org.w3c.dom.Document;
    import org.w3c.dom.Node;
    import org.w3c.dom.NodeList;
    import org.xml.sax.SAXException;

    public class MyDOMSearcher {

        public static void main(String[] args) {
            MyDOMSearcher searcher = new MyDOMSearcher();
            // args[0]: the XML file, args[1]: the ISBN to search for
            searcher.searchBook(args[0], args[1]);
        }

        private void searchBook(String xml, String isbn) {
            long before = System.currentTimeMillis();
            MyBook book = null;
            try {
                DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
                factory.setNamespaceAware(true);
                // books.xml declares no DTD, so validation is switched off
                // (the SAX example does the same on its reader).
                factory.setValidating(false);
                DocumentBuilder parser = factory.newDocumentBuilder();
                // I assign my own ErrorHandler to my parser
                parser.setErrorHandler(new MyErrorHandler());
                File file = new File(xml);
                Document doc = parser.parse(file);
                // I obtain all the <book> elements.
                // NodeList is an interface that has 2 methods:
                // 1. item(int): returns the Node at the given position.
                // 2. getLength(): returns the length of the list.
                NodeList booksNodes = doc.getElementsByTagName("book");
                NodeList bookChildNodes = null;
                String isbnAttribute = "";
                for (int i = 0; i < booksNodes.getLength(); i++) {
                    Node node = booksNodes.item(i);
                    if (node != null && node.hasAttributes()) {
                        isbnAttribute = node.getAttributes().getNamedItem("isbn").getNodeValue();
                        if (isbnAttribute.equals(isbn)) {
                            // I've found the searched isbn
                            if (book == null) {
                                book = new MyBook();
                                book.setIsbn(isbn);
                            }
                            if (node.hasChildNodes()) {
                                bookChildNodes = node.getChildNodes();
                                for (int j = 0; j < bookChildNodes.getLength(); j++) {
                                    Node child = bookChildNodes.item(j);
                                    if ("name".equals(child.getNodeName())) {
                                        book.setName(child.getTextContent());
                                    } else if ("author".equals(child.getNodeName())) {
                                        book.setAuthor(child.getTextContent());
                                    } else if ("price".equals(child.getNodeName())) {
                                        book.setPrice(child.getTextContent());
                                    } else if ("editorial".equals(child.getNodeName())) {
                                        book.setEditorial(child.getTextContent());
                                        // The book is complete: stop iterating over its children
                                        break;
                                    }
                                }
                            }
                        }
                    }
                }
            } catch (IOException ioe) {
                System.err.println("[Error] : " + ioe.getMessage());
            } catch (ParserConfigurationException pce) {
                System.err.println("[Error] : " + pce.getMessage());
            } catch (SAXException se) {
                System.err.println("[Error] : " + se.getMessage());
            }
            long after = System.currentTimeMillis();
            printResults(xml, book, after - before);
        }

        public void printResults(String xml, MyBook book, long time) {
            System.out.println("Document " + xml + ". Parsed in : " + time + " ms");
            if (book != null) {
                System.out.println("Book found:");
                System.out.println(" Isbn: " + book.getIsbn());
                System.out.println(" Name: " + book.getName());
                System.out.println(" Author: " + book.getAuthor());
                System.out.println(" Price: " + book.getPrice());
                System.out.println(" Editorial: " + book.getEditorial());
            } else {
                System.out.println("Book not found");
            }
        }
    }
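The search above only reads the tree. The manipulation that DOM also allows (add, update and delete elements) could be sketched as follows, assuming the same books/book/name layout as books.xml. The class DomEditSketch and its method names are hypothetical and are not part of the original example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper (not in the original manual): edits a parsed books document in memory.
class DomEditSketch {

    // Parses XML text into a DOM tree.
    static Document parse(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    // Returns the <book> element with the given isbn attribute, or null if there is none.
    static Element findBook(Document doc, String isbn) {
        NodeList books = doc.getElementsByTagName("book");
        for (int i = 0; i < books.getLength(); i++) {
            Element book = (Element) books.item(i);
            if (isbn.equals(book.getAttribute("isbn"))) {
                return book;
            }
        }
        return null;
    }

    // Add: appends a new <book isbn="..."><name>...</name></book> under the root element.
    static void addBook(Document doc, String isbn, String name) {
        Element book = doc.createElement("book");
        book.setAttribute("isbn", isbn);
        Element nameElement = doc.createElement("name");
        nameElement.setTextContent(name);
        book.appendChild(nameElement);
        doc.getDocumentElement().appendChild(book);
    }

    // Update: replaces the text of the book's first <name> child.
    static boolean updateName(Document doc, String isbn, String newName) {
        Element book = findBook(doc, isbn);
        if (book == null) {
            return false;
        }
        NodeList names = book.getElementsByTagName("name");
        if (names.getLength() == 0) {
            return false;
        }
        names.item(0).setTextContent(newName);
        return true;
    }

    // Delete: detaches the book from its parent node.
    static boolean removeBook(Document doc, String isbn) {
        Element book = findBook(doc, isbn);
        if (book == null) {
            return false;
        }
        book.getParentNode().removeChild(book);
        return true;
    }

    static int countBooks(Document doc) {
        return doc.getElementsByTagName("book").getLength();
    }
}
```

All changes happen on the in-memory tree; writing them back to a file would need a separate serialization step, which this sketch leaves out.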
  • 14. 4. JDOM
    All the preceding APIs are available for many programming languages, but their use in Java is laborious. JDOM is an API made specifically for Java: it uses Java's own capabilities and features, and therefore makes XML parsing easier. We can find related information at www.jdom.org. Now, let's build the same example (searching for a book in our XML) with JDOM (be sure that the jar is on your classpath; you can download it from http://www.jdom.org/dist/binary/).

    MyJDOMSearcher.java:

    import java.io.IOException;
    import java.util.Iterator;
    import java.util.List;
    import org.jdom.Document;
    import org.jdom.Element;
    import org.jdom.JDOMException;
    import org.jdom.input.SAXBuilder;

    public class MyJDOMSearcher {

        private String isbn;
        private MyBook book;
        private boolean noSearchMore = false;

        public static void main(String[] args) {
            try {
                long before = System.currentTimeMillis();
                MyJDOMSearcher searcher = new MyJDOMSearcher();
                // The second parameter is the isbn to search for
                searcher.isbn = args[1];
                SAXBuilder saxBuilder = new SAXBuilder();
                Document document = saxBuilder.build(args[0]);
                searcher.searchBook(document.getRootElement());
                long after = System.currentTimeMillis();
                searcher.printResults(args[0], after - before);
            } catch (JDOMException jde) {
                System.err.println("[Error] JDOMException: " + jde.getMessage());
            } catch (IOException ioe) {
                System.err.println("[Error] IOException: " + ioe.getMessage());
            }
        }

        // Recursively descend the tree, starting at the root element (<books>)
        private void searchBook(Element element) {
            inspect(element);
            List content = element.getContent();
            Iterator iterator = content.iterator();
            Element child = null;
            Object object = null;
            while (iterator.hasNext()) {
                object = iterator.next();
                // The content also holds text nodes; only elements are of interest
                if (object instanceof Element) {
                    child = (Element) object; // Casting from Object to Element
                    searchBook(child);
                }
            }
        }

        // Copy the data of the current element into the book, if it belongs to the searched book
        public void inspect(Element element) {
            if (!noSearchMore) { // If I already have the complete book, I do nothing
                if ("book".equals(element.getQualifiedName()) && book == null) {
                    if (isbn.equals(element.getAttribute("isbn").getValue())) {
                        book = new MyBook();
                        book.setIsbn(isbn);
                    }
                }
                if (book != null) {
                    if ("name".equals(element.getQualifiedName())) {
                        book.setName(element.getValue());
                    }
                    if ("author".equals(element.getQualifiedName())) {
                        book.setAuthor(element.getValue());
                    }
                    if ("price".equals(element.getQualifiedName())) {
                        book.setPrice(element.getValue());
                    }
                    if ("editorial".equals(element.getQualifiedName())) {
                        book.setEditorial(element.getValue());
                        noSearchMore = true;
                    }
                }
            }
        }

        private void printResults(String xml, long time) {
            System.out.println("Document " + xml + ". Parsed in : " + time + " ms");
            if (book != null) {
                System.out.println("Book found:");
                System.out.println(" Isbn: " + book.getIsbn());
                System.out.println(" Name: " + book.getName());
                System.out.println(" Author: " + book.getAuthor());
                System.out.println(" Price: " + book.getPrice());
                System.out.println(" Editorial: " + book.getEditorial());
            } else {
                System.out.println("Book not found");
            }
        }
    }
  • 16. 5. Conclusion
    Executing the same example with the three APIs (MySAXSearcher, MyDOMSearcher and MyJDOMSearcher), passing as parameters the same XML file and the same isbn to search for ("0000000003"), the times obtained are the following:

    Program          Time
    MySAXSearcher    93 ms
    MyDOMSearcher    750 ms
    MyJDOMSearcher   609 ms

    In this run, the SAX API was faster than DOM and JDOM (but it is more laborious to program).