Databases: the future...

Kortrijk, 31 March 2011




Erik Duval
http://erikduval.wordpress.com
@ErikDuval





Thursday 31 March 2011
http://www.slideshare.net/erik.duval




which database holds the web?



•         XML
•         NoSQL (with thanks to Steven Noels)




? XML ?







                             http://en.wikipedia.org/wiki/Extensible_Markup_Language




                             http://www.itjobboard.be/ICT-banen/xml/Belgie/alle/0/relevantie/nl/
http://www.khbo.be/12385
http://www.w3.org/XML
http://www.w3c.it/talks/2005/openCulture/slide7-0.html




                              http://en.wikipedia.org/wiki/List_of_XML_markup_languages
XML is not ...
   •      Extension of HTML
         •      XHTML is XML-compliant, and extensible

   •      Just for Web pages
         •      Useful when data are stored or exchanged

   •      Concerned with semantics
         •      XML does not define semantics, just syntax

   •      Innovative new technology
         •      Standard, building on existing technology

   •      Only a hype
         •      Though also

XML is ...
   •      Endorsed by W3C and major companies
   •      Extensible
         •      No tag name limitations
         •      No language limitations
   •      Readable by humans, or at least by software developers
         •      Can be processed with basic text tools
   •      Open standard
         •      no vendor lock-in (in theory...)
   •      Easy to implement
         •      powerful, cheap (free), off-the-shelf XML tools

when was XML invented?



               •         1969: GML (Generalized Markup Language), the forerunner of SGML
               •         1986: SGML (Standard Generalized Markup Language) becomes an ISO standard
                     •      Meta-language: describe other languages
                     •      Powerful, but rather complicated

               •         1992: HTML (HyperText Markup Language)
                     •      Based on SGML
                     •      Simple, but limited

               •         1996: Start design of XML
                     •      By World Wide Web Consortium (W3C)

               •         1998: Publication of XML 1.0

Design Goals
               •         Easy to use over the Internet
                     •     Power of SGML
                     •     Simplicity of HTML
               •         Human-legible
               •         Easy to create
               •         Compactness is not an issue
               •         “The ASCII of the Web”

what does XML look like?


XML Basics
            <Person>
                <Name>
                         <First>Thomas</First>
                         <Last>Atkinson</Last>
                </Name>
                <Age>30</Age>
            </Person>



               •         Self-defined, meaningful tags
               •         Separate data and its representation
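
Because the tags carry the meaning and not the layout, a program can pull the data out and decide on its own how to present it. The following is a minimal sketch using Python's standard xml.etree.ElementTree on the Person document above; the helper names read_person and as_sentence are made up for this illustration.

```python
import xml.etree.ElementTree as ET

# The Person document from the slide.
PERSON_XML = """
<Person>
    <Name>
        <First>Thomas</First>
        <Last>Atkinson</Last>
    </Name>
    <Age>30</Age>
</Person>
"""


def read_person(xml_text):
    """Extract (first name, last name, age) from a <Person> document."""
    person = ET.fromstring(xml_text)
    first = person.findtext("Name/First")
    last = person.findtext("Name/Last")
    age = int(person.findtext("Age"))
    return first, last, age


def as_sentence(xml_text):
    """One possible representation; the XML itself does not prescribe any."""
    first, last, age = read_person(xml_text)
    return f"{first} {last} is {age} years old"


print(as_sentence(PERSON_XML))
```

The same data could just as well be rendered as a table row or a business card; only as_sentence would change.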

•      Language for defining syntax
   •      Records and fields have explicit boundaries
         •      parse-able without knowing structure (self-descriptive)
   •      Unicode support (UTF-8, UTF-16, ...)
   •      Web-aware
         •      DTD, ENTITY and Schema can be loaded through URL
   •      Strictly parsed: no ambiguity (case sensitive!)
   •      Extensible: namespaces

<?xml version="1.0" encoding="UTF-8"?>
  <!-- XML declaration: XML follows -->
  <!DOCTYPE addressbook SYSTEM
      "http://www/~koenh/ddml/addressbook.dtd">
        <!-- Document Type Declaration... -->
        <!-- ExternalDTDPointer -->
  <addressbook> <!-- root element -->
    <person first-name="John" family-name="Doe"
      employee-number="1234">
      <contact-info>
        <email address="Jdoe@home.com"/>
      </contact-info>
      <address street="Celestijnenlaan"
        number="200A"/>
    </person>
  </addressbook>

•      Cf. HTML markup tags

           <H1 align="center"> a Heading </H1>

         •      element: opening tag, content and closing tag together
         •      opening tag: <H1 align="center">, carrying the attribute align="center"
         •      content: a Heading
         •      closing tag: </H1>

    •      Major differences:
          •      Case sensitive
          •      Proper nesting: No <A> … <B> … </A> … </B>
          •      Unicode instead of ASCII

Vocabularies

    •      Agreed-upon XML tag sets for specific domain
    •      Examples
         •       Chemical Markup Language (CML)
         •       Business: ebXML, RosettaNet, BizTalk
         •       Mathematics: MathML
         •       Multimedia: Synchronized Multimedia Integration Language (SMIL)
         •       Etc.

•      well-formed: follows XML syntax

         •      Proper tag and attribute names

         •      Tags properly closed

         •      Attributes and text between tags do not contain
                ‘<‘ (escape with &lt;)

   •      valid: well-formed and conforming to a vocabulary (DTD)

         •      All elements and their attributes declared in DTD

         •      Attribute values follow DTD type declarations
              •          CDATA, ID, IDREF, IDREFS, NMTOKEN, NMTOKENS, enumerated

         •      Nesting and sequencing of elements follows DTD
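
The well-formedness rules above can be checked mechanically. The sketch below uses Python's standard xml.etree.ElementTree, which is a non-validating parser: it reports syntax errors but does not check a document against a DTD, so it only covers the "well-formed" half of this slide. The helper name is_well_formed and the sample strings are made up for this illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the text parses as XML, False on any syntax error."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


samples = {
    "proper nesting": "<a><b>text</b></a>",
    "improper nesting": "<a><b>text</a></b>",
    "unescaped <": "<a>1 < 2</a>",
    "escaped <": "<a>1 &lt; 2</a>",
    "case mismatch": "<Name>x</name>",
}

for label, text in samples.items():
    print(f"{label}: {is_well_formed(text)}")
```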

Elements
     •      XML’s container for
          •       Attributes
          •       Character data
          •       Other elements (“child” elements)

     •      Delimited by opening and closing tags
          •       Non-empty element: <name>..</name>
          •       Empty element: <name/>

     •      Form a simple hierarchic tree
          •       Root = “document element”

Attributes and Strings
      •      Attributes
           •       Name-value pairs: name="value"
           •       Only strings as value!
      •      Strings
           •       Enclosed by '...' or "..."
           •       A quote character inside the string → replace with &apos; or &quot;
      •      Character data
           •       Any text that is not markup
           •       ‘&’, ‘<’ and ‘>’ are markup
                    → replace with &amp; &lt; and &gt;
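
In practice the replacements are rarely done by hand. As a small sketch, Python's standard xml.sax.saxutils module provides escape for character data and quoteattr for attribute values; the element name item and the sample strings below are made up for this illustration.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr, unescape

raw_note = "R&D says 1 < 2"        # character data containing markup characters
raw_title = "it's 5\" wide"        # attribute value containing both quote kinds

# escape() replaces &, < and >; quoteattr() also adds the surrounding quotes
# and escapes the quote character where needed.
document = "<item title=%s>%s</item>" % (quoteattr(raw_title), escape(raw_note))
print(document)

# A parser turns the entity references back into the original characters.
parsed = ET.fromstring(document)
print(parsed.get("title"))
print(parsed.text)
```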

Document structure

   •      Prolog (optional)
         •      <?xml version="1.0" encoding="UTF-8"?>
               •      version="number" (compulsory)
               •      encoding="character encoding" (optional)
   •      Document type declaration
         •      <!DOCTYPE document_element ... >
   •      Body
         •      The document element

Another example
<?xml version="1.0" standalone="no"?>
<!DOCTYPE BankAccounts ...>
<!-- This is an example XML document -->
<BankAccounts>
        <Account accountNr="123-456789-01" use="personal">
                <Owners> <Person ID="1258-a8d72-98">
                          <Name>John Smith</Name></Person>
                         <Person ID="5842-df5ef-e9">
                          <Name>Claudia Scott</Name></Person>
                </Owners>
                <CreditCards><CreditCard number="12345"/></CreditCards>
                <Balance Currency="EUR">50000</Balance>
        </Account>
         ...
</BankAccounts>

namespaces: problem
<widget type="gadget">
     <head size="medium"/>
     <big><subwidget ref="gizmo"/></big>
     <info>
          <head><title>Gadget</title></head>
          <body><h1>Gadget</h1>
                A gadget contains a big gizmo
          </body>
     </info>
</widget>

Name collision! (the two <head> elements mean different things)

solution ?



namespaces: approach


   •      A collection of names, identified by a URI
          reference, which are used in XML documents as
          element types and attribute names
         •      Declared with xmlns:prefix="URI"
   •      URI used only as identifier
         •      does not need to point to anything
   •      applies to all nested elements and attributes

namespaces: example
 <widget xmlns="http://www.widget.org"
      xmlns:xhtml="http://www.w3.org/TR/xhtml1"
      type="gadget">
      <head size="medium"/>
      <big><subwidget ref="gizmo"/></big>
      <info>
           <xhtml:head><xhtml:title>Gadget</xhtml:title></xhtml:head>
           <xhtml:body><xhtml:h1>Gadget</xhtml:h1>A gadget contains...
           </xhtml:body>
      </info>
 </widget>
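
To see how a parser tells the two head elements apart, here is a minimal sketch with Python's standard xml.etree.ElementTree, which represents every name as {URI}localname. The prefixes w and x in the lookup table are arbitrary choices made for this illustration; only the URIs have to match those in the document.

```python
import xml.etree.ElementTree as ET

WIDGET_XML = """<widget xmlns="http://www.widget.org"
        xmlns:xhtml="http://www.w3.org/TR/xhtml1"
        type="gadget">
  <head size="medium"/>
  <big><subwidget ref="gizmo"/></big>
  <info>
    <xhtml:head><xhtml:title>Gadget</xhtml:title></xhtml:head>
    <xhtml:body><xhtml:h1>Gadget</xhtml:h1>A gadget contains...</xhtml:body>
  </info>
</widget>"""

# Prefixes used for querying; they need not equal the prefixes in the document.
NS = {"w": "http://www.widget.org", "x": "http://www.w3.org/TR/xhtml1"}

root = ET.fromstring(WIDGET_XML)
widget_head = root.find("w:head", NS)        # the widget's own <head>
xhtml_head = root.find("w:info/x:head", NS)  # the XHTML <head>

print(widget_head.tag)   # qualified with the widget URI
print(xhtml_head.tag)    # qualified with the XHTML URI
print(xhtml_head.findtext("x:title", namespaces=NS).strip())
```

Note that the unprefixed attributes (type, size) are not placed in the default namespace, so widget_head.get("size") works without any URI.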

Another example

<Address>
  <Street>Celestijnenlaan</Street>
  <Nr>200A</Nr>
  <City>Heverlee-Leuven</City>
  <Country>Belgium</Country>
</Address>

<Server>
  <Name>www</Name>
  <Address>
    134.58.43.1
  </Address>
</Server>

?

Another example (2)
<Address
  xmlns="www.all.edu/departments">
  <Street>Celestijnenlaan</Street>
  <Nr>200A</Nr>
  <City>Heverlee-Leuven</City>
  <Country>Belgium</Country>
</Address>

<Server
  xmlns="www.dns.net/servers">
  <Name>www</Name>
  <Address>
    134.58.43.1
  </Address>
</Server>

<Department xmlns:edu="www.all.edu/departments"
            xmlns:dns="www.dns.net/servers">
  <edu:Address>
    <Street>Celestijnenlaan</Street>
    ...
  </edu:Address>
  <dns:Name>www</dns:Name>
  <dns:Address>134.58.43.1</dns:Address>
</Department>

how would you process XML?



Accessing XML documents

   •       Manual text file manipulation
         •      Cumbersome & Error-prone

   •       Parser
         •      Simplifies document manipulation
               •         Ensures proper grammar, well-formedness

               •         Abstracts content from grammar

         •      Accessed through standard API
            • Document Object Model (DOM)
            • Simple API for XML (SAX)

•      DOM parser
         •      create DOM object tree
   •      SAX parser
         •      generates events when elements encountered
         •      one-pass translation
         •      no need to keep whole document tree in memory
   •      Both can be validating or non-validating
   •      Many available
          (most freeware, open source)
         •      ibm xml4j, apache xerces, sun parser, microsoft,
                datachannel, oracle, ...

DOM approach




                                   http://java.sun.com/xml/jaxp/dist/1.1/docs/tutorial/overview/3_apis.html#JAXP

DOM Benefits & Drawbacks

    •      Benefits
          •      W3C Recommendation
          •      Language- and platform-independent
          •      Random access
          •      Intuitive
    •      Drawback
          •      Entire object tree in memory
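
As an illustration of the DOM style, the sketch below uses Python's standard xml.dom.minidom, one of many DOM implementations. It works on a shortened copy of the BankAccounts example; the new balance value is made up for this illustration.

```python
from xml.dom.minidom import parseString

ACCOUNTS_XML = """<BankAccounts>
  <Account accountNr="123-456789-01" use="personal">
    <Owners>
      <Person ID="1258-a8d72-98"><Name>John Smith</Name></Person>
      <Person ID="5842-df5ef-e9"><Name>Claudia Scott</Name></Person>
    </Owners>
    <Balance Currency="EUR">50000</Balance>
  </Account>
</BankAccounts>"""

# The parser builds the complete object tree in memory before returning.
dom = parseString(ACCOUNTS_XML)

# Random access: nodes can be visited in any order, any number of times.
balance = dom.getElementsByTagName("Balance")[0]
balance_before = int(balance.firstChild.data)
owners = [node.firstChild.data for node in dom.getElementsByTagName("Name")]

# The tree is writable and can be serialized back to XML.
balance.firstChild.data = "49000"
updated = dom.documentElement.toxml()

print(owners, balance_before)
```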

Simple API for XML (SAX)

    •      Not an official standard
         •       Ad-hoc product by XML developers
         •       Primarily Java API
    •      Event-based mechanism
         •       Don’t call the parser, the parser calls you
         •       No object model in memory
         •       Programmer must keep state information

SAX approach
http://java.sun.com/xml/jaxp/dist/1.1/docs/tutorial/overview/3_apis.html#JAXP

SAX Benefits & Drawbacks
   •       Benefits
         •      Suitable when
               •         parsing large documents

               •         constructing proprietary object structures

               •         only small subset of information is needed

         •      Simple and fast

   •       Drawbacks
         •      Read-only
         •      No random access
         •      Complex searches messy to program
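
For comparison, here is the event-based style as a minimal sketch with Python's standard xml.sax module. The handler class OwnerNames is made up for this illustration; it shows the state the programmer has to keep (am I inside a Name element?) because no tree is built.

```python
import xml.sax

ACCOUNTS_XML = """<BankAccounts>
  <Account accountNr="123-456789-01" use="personal">
    <Owners>
      <Person ID="1258-a8d72-98"><Name>John Smith</Name></Person>
      <Person ID="5842-df5ef-e9"><Name>Claudia Scott</Name></Person>
    </Owners>
    <Balance Currency="EUR">50000</Balance>
  </Account>
</BankAccounts>"""


class OwnerNames(xml.sax.ContentHandler):
    """Collects the text of every <Name> element in one pass."""

    def __init__(self):
        super().__init__()
        self.names = []
        self._inside_name = False   # state kept by the programmer
        self._buffer = []

    def startElement(self, name, attrs):
        if name == "Name":
            self._inside_name = True
            self._buffer = []

    def characters(self, content):
        # Text may arrive in several chunks, so it is accumulated.
        if self._inside_name:
            self._buffer.append(content)

    def endElement(self, name):
        if name == "Name":
            self.names.append("".join(self._buffer))
            self._inside_name = False


handler = OwnerNames()
# The parser calls the handler; the handler never calls the parser.
xml.sax.parseString(ACCOUNTS_XML.encode("utf-8"), handler)
print(handler.names)
```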

how to define valid instances?



XML Schema
    •      typing of values
          •      e.g. integer, string, etc.
          •      also restrictions on min/max values
    •      user-defined types
    •      is specified in XML syntax
          •      a more standardized representation
    •      is integrated with namespaces
    •      and other features
          •      list types, uniqueness constraints on keys,
                 foreign key (reference) constraints, inheritance, …

XSDL


               •         XML Schema Definition Language
               •         documents with the suffix .xsd

XML Schema: example
       XML schema

       <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
       ....
       <xsd:element name="PWORKER" minOccurs="0" maxOccurs="unbounded">
         <xsd:complexType>
            <xsd:sequence>
              <xsd:element name="HOURS" type="xsd:float"/>
            </xsd:sequence>
            <xsd:attribute name="SSN" type="xsd:IDREF" use="required"/>
         </xsd:complexType>
       </xsd:element>
       ....
       </xsd:schema>


       XML instance

       <PWORKER SSN="_123456789">
         <HOURS>7.5</HOURS>
       </PWORKER>

XML: simple types
–        built-in simple types
        •      string, integer, decimal, float, boolean, date, time, …
        •      <xsd:element name="gebdat" type="xsd:date" />
–        user-defined simple types
        •      defined with the simpleType element
        •      the restriction element gives the base type that is being restricted
        •      <xsd:simpleType name="salaryRange">
                 <xsd:restriction base="xsd:integer">
                   <xsd:minInclusive value="25000" />
                   <xsd:maxInclusive value="100000" />
                 </xsd:restriction>
               </xsd:simpleType>
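
Because a schema is itself an XML document, it can be read with the same tools as any other XML. The sketch below is not a schema validator (Python's standard library does not include one); it only reads the two facets of the salaryRange type with xml.etree.ElementTree and applies them by hand. The helper names integer_range and conforms are made up for this illustration.

```python
import xml.etree.ElementTree as ET

XSD = "{http://www.w3.org/2001/XMLSchema}"

SCHEMA = """<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:simpleType name="salaryRange">
    <xsd:restriction base="xsd:integer">
      <xsd:minInclusive value="25000"/>
      <xsd:maxInclusive value="100000"/>
    </xsd:restriction>
  </xsd:simpleType>
</xsd:schema>"""


def integer_range(schema_text, type_name):
    """Return (min, max) declared for an integer-based simpleType."""
    schema = ET.fromstring(schema_text)
    for simple_type in schema.iter(XSD + "simpleType"):
        if simple_type.get("name") == type_name:
            restriction = simple_type.find(XSD + "restriction")
            low = restriction.find(XSD + "minInclusive").get("value")
            high = restriction.find(XSD + "maxInclusive").get("value")
            return int(low), int(high)
    raise KeyError(type_name)


def conforms(value_text, bounds):
    """Hand-written check: an integer within the inclusive bounds."""
    low, high = bounds
    try:
        value = int(value_text)
    except ValueError:
        return False
    return low <= value <= high


bounds = integer_range(SCHEMA, "salaryRange")
for candidate in ["30000", "24999", "100000", "a lot"]:
    print(candidate, conforms(candidate, bounds))
```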

XML: simple types
       <xsd:simpleType name="studentClassificatie">
         <xsd:restriction base="xsd:string">
           <xsd:enumeration value="bachelorstudent" />
           <xsd:enumeration value="masterstudent" />
           <xsd:enumeration value="doctorstudent" />
         </xsd:restriction>
       </xsd:simpleType>

       <xsd:simpleType name="deptType">
         <xsd:restriction base="xsd:string">
           <xsd:length value="3" />
         </xsd:restriction>
       </xsd:simpleType>

how to query XML?


XPath (example)

             /COMPANY/EMPLOYEE

             ROOT
               COMPANY
                 EMPLOYEE
                   SSN
                     _123456789
                 EMPLOYEE
                   SSN
                     _333445555
                 EMPLOYEE
                   SSN
                     _999887777

XPath

             /COMPANY/EMPLOYEE

           <EMPLOYEE SSN="_123456789" SEX="M"
              SUPERSSN="_333445555" DNO="_5">
             <FNAME>John</FNAME>
             <MINIT>B</MINIT>
             ....
           </EMPLOYEE>
           <EMPLOYEE SSN="_333445555" SEX="M"
              SUPERSSN="_888665555" DNO="_5">
             <FNAME>Franklin</FNAME>
             <MINIT>T</MINIT>
             <LNAME>Wong</LNAME>
             <BDATE>08-DEC-45</BDATE>
           </EMPLOYEE>
           <EMPLOYEE SSN="_999887777" SEX="F"
              SUPERSSN="_987654321" DNO="_4">
             <FNAME>Alicia</FNAME>
             .....
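
The path can be tried out with Python's standard xml.etree.ElementTree, which supports a limited subset of XPath. Two assumptions in this sketch: the sample document keeps only the parts of the three EMPLOYEE elements that are visible on the slide, and because ElementTree evaluates paths relative to an element (here the COMPANY root), /COMPANY/EMPLOYEE is written as ./EMPLOYEE.

```python
import xml.etree.ElementTree as ET

# Trimmed sample: only the parts of the employees shown on the slide.
COMPANY_XML = """<COMPANY>
  <EMPLOYEE SSN="_123456789" SEX="M" SUPERSSN="_333445555" DNO="_5">
    <FNAME>John</FNAME><MINIT>B</MINIT>
  </EMPLOYEE>
  <EMPLOYEE SSN="_333445555" SEX="M" SUPERSSN="_888665555" DNO="_5">
    <FNAME>Franklin</FNAME><MINIT>T</MINIT><LNAME>Wong</LNAME>
    <BDATE>08-DEC-45</BDATE>
  </EMPLOYEE>
  <EMPLOYEE SSN="_999887777" SEX="F" SUPERSSN="_987654321" DNO="_4">
    <FNAME>Alicia</FNAME>
  </EMPLOYEE>
</COMPANY>"""

company = ET.fromstring(COMPANY_XML)

# Equivalent of /COMPANY/EMPLOYEE, relative to the COMPANY element.
employees = company.findall("./EMPLOYEE")
ssns = [e.get("SSN") for e in employees]

# Predicates narrow the selection: on an attribute, or on a child's text.
dept5 = [e.findtext("FNAME") for e in company.findall("./EMPLOYEE[@DNO='_5']")]
wong = company.findall("./EMPLOYEE[LNAME='Wong']")

print(ssns)
print(dept5)
```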

XML family of technologies

   •      XLink: hypertext

   •      XSL: Extensible Stylesheet Language

         •      XSLT: XSL Transformations

         •      Formatting Objects

   •      XML Schema: additional constraints on attribute types

   •      and more...

XML applications
   •       RDF: Resource Description Framework

         •      infra

   •       XHTML: eXtensible HTML and HTML5
         •      XML compliant HTML

   •       MathML

   •       SMIL: synchronized multimedia presentation

   •       Many others

         •      Chemical Markup Language, Vector Graphics Markup Language, Open Software
                Description Format, weather observation, astronomical data, financial data,
                electronic components, workflow, business cards, real estate, newspaper,
                classifieds, javadoc, human resources, advertising, architecture, …

More XPath Features
  •    Operator “|” used to implement union

      •    E.g. //EMPLOYEE[count(DEPENDENT) = 1] | //EMPLOYEE[not(DEPENDENT)]

          •    gives employees with either 0 or 1 dependents

  •    “//” can be used to skip multiple levels of nodes

      •    E.g. /COMPANY//FNAME

          •    finds any FNAME element anywhere under the /COMPANY element, regardless of the
               element in which it is contained.

  •    A step in the path can go to parents, siblings, ancestors and descendants
       of the nodes generated by the previous step, not just to the children

      •    “//”, described above, is a short form for specifying “all descendants”

      •    “..” specifies the parent.

          •    e.g.: /COMPANY//FNAME/../BDATE
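
A small sketch of these features with Python's standard xml.etree.ElementTree. Its XPath subset understands // and .. but not the | operator or functions such as count(), so the union query is emulated in plain Python here. The DEPENDENT values are made up for this illustration, and paths are written relative to the COMPANY root element.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data; only Franklin's BDATE comes from the slides.
COMPANY_XML = """<COMPANY>
  <EMPLOYEE>
    <FNAME>John</FNAME>
    <DEPENDENT>dep-1</DEPENDENT><DEPENDENT>dep-2</DEPENDENT>
  </EMPLOYEE>
  <EMPLOYEE>
    <FNAME>Franklin</FNAME><BDATE>08-DEC-45</BDATE>
    <DEPENDENT>dep-3</DEPENDENT>
  </EMPLOYEE>
  <EMPLOYEE>
    <FNAME>Alicia</FNAME>
  </EMPLOYEE>
</COMPANY>"""

company = ET.fromstring(COMPANY_XML)

# /COMPANY//FNAME: any FNAME at any depth below COMPANY.
fnames = [e.text for e in company.findall(".//FNAME")]

# /COMPANY//FNAME/../BDATE: up to the parent of each FNAME, then down to BDATE.
bdates = [e.text for e in company.findall(".//FNAME/../BDATE")]

# Emulation of
#   //EMPLOYEE[count(DEPENDENT) = 1] | //EMPLOYEE[not(DEPENDENT)]
# i.e. employees with 0 or 1 dependents.
few_dependents = [
    e.findtext("FNAME")
    for e in company.findall(".//EMPLOYEE")
    if len(e.findall("DEPENDENT")) <= 1
]

print(fnames, bdates, few_dependents)
```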

XQuery
   •       allows more general queries to be formulated than XPath
   •       general form: FLWOR expression
                         FOR <for-variable> IN <in-expression>
                         LET <let-variable> := <let-expression>
                         [ WHERE <filter-expression> ]
                         [ ORDER BY <order-specification> ]
                         RETURN <expression>
   •       note: FOR and LET can occur alone or together

•       Q1: first name and last name of all employees who earn more
        than 70000
      •      for $x in doc("www.company.com/info.xml")
                       //employee[employeeSalary > 70000]/employeeName
             return <res> {$x/firstName, $x/lastName} </res>
•       alternative:
        for $x in doc("www.company.com/info.xml")/company/employee
        where $x/employeeSalary > 70000
        return <res> {$x/employeeName/firstName,
                      $x/employeeName/lastName} </res>
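
To make the evaluation order of Q1 concrete, here is the same query written as an ordinary loop in Python with xml.etree.ElementTree. This is only an illustration of the for / where / return clauses, not an XQuery engine. The layout of info.xml is not shown on the slides, so the sample document, its values and the function name q1 are assumptions.

```python
import xml.etree.ElementTree as ET

# Assumed layout of info.xml, with made-up employees.
INFO_XML = """<company>
  <employee>
    <employeeName><firstName>Ann</firstName><lastName>Low</lastName></employeeName>
    <employeeSalary>30000</employeeSalary>
  </employee>
  <employee>
    <employeeName><firstName>Bob</firstName><lastName>High</lastName></employeeName>
    <employeeSalary>75000</employeeSalary>
  </employee>
</company>"""


def q1(company):
    """First and last name of all employees earning more than 70000."""
    results = []
    for x in company.findall("employee"):                 # for clause
        if float(x.findtext("employeeSalary")) > 70000:   # where clause
            results.append((                              # return clause
                x.findtext("employeeName/firstName"),
                x.findtext("employeeName/lastName"),
            ))
    return results


print(q1(ET.fromstring(INFO_XML)))
```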

•       Q3: first name and last name of all employees who work more
        than 20 hours on project number 5, together with that number of hours
•       for $x in doc("www.company.com/info.xml")
                  /company/project[projectNumber = 5]/projectWorker,
            $y in doc("www.company.com/info.xml")/company/employee
        where $x/hours > 20.0 and $y/ssn = $x/ssn
        return <res> {$y/employeeName/firstName,
                      $y/employeeName/lastName,
                      $x/hours} </res>
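
Q3 is a join between project workers and employees on their ssn. The sketch below expresses it in Python with xml.etree.ElementTree. Several things are assumptions: the layout of info.xml (with ssn stored as a child element), the sample values, and the function name q3. Instead of the nested iteration over both variables that the query text suggests, the sketch builds a dictionary of employees first, which gives the same result with a single lookup per worker.

```python
import xml.etree.ElementTree as ET

# Assumed layout of info.xml, with made-up data.
INFO_XML = """<company>
  <employee>
    <ssn>1</ssn>
    <employeeName><firstName>Ann</firstName><lastName>Lee</lastName></employeeName>
  </employee>
  <employee>
    <ssn>2</ssn>
    <employeeName><firstName>Bob</firstName><lastName>Ray</lastName></employeeName>
  </employee>
  <project>
    <projectNumber>5</projectNumber>
    <projectWorker><ssn>1</ssn><hours>32.5</hours></projectWorker>
    <projectWorker><ssn>2</ssn><hours>10.0</hours></projectWorker>
  </project>
  <project>
    <projectNumber>7</projectNumber>
    <projectWorker><ssn>2</ssn><hours>40.0</hours></projectWorker>
  </project>
</company>"""


def q3(company):
    """Names and hours of employees working more than 20 hours on project 5."""
    # Index the employees by ssn once (the $y side of the join).
    by_ssn = {y.findtext("ssn"): y for y in company.findall("employee")}

    results = []
    # The $x side: workers of project number 5.
    for x in company.findall("project[projectNumber='5']/projectWorker"):
        hours = float(x.findtext("hours"))
        y = by_ssn.get(x.findtext("ssn"))
        if hours > 20.0 and y is not None:
            results.append((
                y.findtext("employeeName/firstName"),
                y.findtext("employeeName/lastName"),
                hours,
            ))
    return results


print(q3(ET.fromstring(INFO_XML)))
```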

•         XML
•         NoSQL (with thanks to Steven Noels)
How on top of SQL?

           select fun, profit
           from   real_world
           where  relational=false;

NoSQL
               •         problems with existing relational approach for
                         Amazon (Dynamo) and Google (BigTable)
                     •      flexibility, performance, scaling, cost
                           •   millions of users
                           •   application changes rolled out
                               incrementally without downtime
                     •      now more broadly applicable (velcro)
               •         Open source developments:
                         Facebook, Yahoo! - Cassandra, Hadoop,
                         MapReduce, Hive, Pig

http://www.odbms.org/download/NoSQL-Whitepaper-1.pdf

NoSQL

               •         non-relational
               •         distributed
               •         open source
               •         horizontally scalable
               •         “web scale”
               •         schema free
               •         easy replication
               •         simple API

Systems
               •         Core: Hadoop, HBase, Cassandra, Hypertable, ...
               •         Docs: CouchDB, MongoDB, Riak, Terrastore, ...
               •         Key-Value, tuple: Amazon SimpleDB, Azure, ...
               •         Graph: Neo4J, Bigdata, InfoGrid, HyperGraph, ...
               •         Object: Versant, Perst, ZODB, ...
               •         Grid: GigaSpaces, Hazelcast, ...
               •         XML: Tamino, eXist, Mark Logic, Xindice, ...
               •         ...
http://nosql-databases.org/

http://about.digg.com/blog/looking-future-cassandra
14 seconds

http://www.slideshare.net/oemebamo/database-sharding-at-netlog-presentation

no attempt to ACID
               •         Atomicity
               •         Consistency
               •         Isolation
               •         Durability


               •         BASE (Basically Available, Soft state, Eventual consistency): trade ACID guarantees for high availability
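
The trade-off named in the last bullet can be sketched with a toy store that acknowledges a write after updating a single replica and propagates it to the others later. This is a simplified illustration under stated assumptions, not how any particular system works: the class `EventuallyConsistentStore`, its methods, and the explicit `sync()` step are all hypothetical.

```python
class EventuallyConsistentStore:
    """Toy model of eventual consistency (hypothetical; for illustration only)."""

    def __init__(self, num_replicas=2):
        self.replicas = [dict() for _ in range(num_replicas)]
        self.pending = []  # writes not yet propagated to all replicas

    def write(self, key, value):
        # Acknowledge as soon as one replica has the value: the write stays
        # available even if the other replicas are slow or unreachable.
        self.replicas[0][key] = value
        self.pending.append((key, value))

    def read(self, key, replica_index):
        # A read may hit a replica that has not seen the latest write yet.
        return self.replicas[replica_index].get(key)

    def sync(self):
        # Propagate outstanding writes; afterwards all replicas agree.
        for key, value in self.pending:
            for replica in self.replicas[1:]:
                replica[key] = value
        self.pending.clear()


store = EventuallyConsistentStore()
store.write("balance", 100)
stale = store.read("balance", 1)  # read before propagation
store.sync()
fresh = store.read("balance", 1)  # read after propagation
```

Between `write` and `sync`, the second replica returns no value for `balance`. An ACID system would not expose that intermediate state; a BASE system accepts it in exchange for staying available.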

http://cacm.acm.org/blogs/blog-cacm/50678-the-nosql-discussion-has-nothing-to-do-with-sql/fulltext



Questions?
                         http://erikduval.wordpress.com/
                                twitter: @ErikDuval



