1. A Business Perspective on Use-Case-Driven Challenges for Software
Architectures to Document Study and Variable Information
IASSIST 2013
29.05.2013
Thomas Bosch
GESIS, Germany
thomas.bosch@gesis.org
boschthomas@blogspot.com
Matthäus Zloch
GESIS, Germany
matthaeus.zloch@gesis.org
Dennis Wegener
GESIS, Germany
dennis.wegener@gesis.org
2. Outline
• general information about MISSY
• next generation MISSY
• software architecture overview
• presentation
• business logic
3. general information about MISSY
• Microdata Information System (MISSY)
• currently, MISSY contains only the microcensus survey (largest
household survey in Europe)
• MISSY provides detailed information about individual data sets
• MISSY facilitates the data usage for research
4. general information about MISSY
• MISSY contains metadata of microdata
• MISSY is split into two parts
• MISSY Web for metadata presentation (end-user front-end)
• MISSY Editor for metadata documentation (back-end)
• MISSY documents approx. 500 variables and questions per year
• MISSY covers 25 years, starting from 1973
5. next generation MISSY
further studies
we integrate further studies (e.g. EU-SILC, EU-LFS, EVS, …)
MISSY Editor
we implement the MISSY Editor as a web application
modern web project architecture
we design a modern web project architecture
• multitier software architecture
• Model-View-Controller (MVC) pattern
• Apache Maven as project management software
6. next generation MISSY
physical persistence
MISSY supports multiple types of physical persistence
open source
we publish MISSY as an Open Source project
import
MISSY provides an import from SPSS and XML
export
MISSY provides an export to multiple formats like DDI-L, DDI-C, DDI-RDF, …
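Export to multiple formats can be sketched as one exporter per target format behind a common interface. This is a minimal illustration, not the MISSY implementation: `VariableDoc`, `Exporter`, `ExporterRegistry`, and the format keys are hypothetical names, and the generated output is placeholder markup rather than valid DDI.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical, simplified stand-in for a documented variable (not the MISSY data model).
final class VariableDoc {
    final String name;
    final String label;

    VariableDoc(String name, String label) {
        this.name = name;
        this.label = label;
    }
}

// One exporter per target format; the interface name is an assumption.
interface Exporter {
    String export(VariableDoc variable);
}

final class ExporterRegistry {
    private final Map<String, Exporter> exporters = new LinkedHashMap<>();

    void register(String format, Exporter exporter) {
        exporters.put(format, exporter);
    }

    // Looks up the exporter for the requested format and delegates to it.
    String export(String format, VariableDoc variable) {
        Exporter exporter = exporters.get(format);
        if (exporter == null) {
            throw new IllegalArgumentException("Unsupported export format: " + format);
        }
        return exporter.export(variable);
    }

    // Registry with two toy exporters; the output is placeholder text, not valid DDI.
    static ExporterRegistry withToyExporters() {
        ExporterRegistry registry = new ExporterRegistry();
        registry.register("xml", v ->
                "<variable name=\"" + v.name + "\"><label>" + v.label + "</label></variable>");
        registry.register("csv", v -> v.name + "," + v.label);
        return registry;
    }
}
```

With this shape, supporting a further format means registering one more `Exporter`; the calling code stays unchanged.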
23. DDI-RDF Discovery Vocabulary
• contains only a small subset of DDI-XML + additional axioms
• the conceptual model is derived from use cases which are typical in
the statistical community
• statistical domain experts formulated these use cases, which they
consider most significant for solving frequent problems
• increase visibility of microdata
• increase use of microdata
• enable inferencing on microdata
• harmonize microdata (make microdata comparable)
24. DDI-RDF Discovery Vocabulary
• enables users to publish and discover microdata and metadata about
microdata (research and survey data) in the Web of Linked Data
• enables linking microdata to other microdata, making the data and the
results of research (e.g. publications) more closely connected
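Publishing metadata about microdata as Linked Data can be sketched by writing RDF triples in N-Triples syntax. The sketch below uses no RDF library. The namespace (the vocabulary URL with a trailing `#`) and the terms `Study`, `Variable`, and `variable` are assumptions based on the unofficial draft and may differ in the final specification; the `example.org` URIs are placeholders.

```java
import java.util.ArrayList;
import java.util.List;

final class DiscoSketch {
    // Assumed namespace: the vocabulary URL with a trailing '#'.
    static final String DISCO = "http://rdf-vocabulary.ddialliance.org/discovery#";
    static final String RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type";

    // Formats one triple whose subject, predicate, and object are all IRIs.
    static String triple(String subject, String predicate, String object) {
        return "<" + subject + "> <" + predicate + "> <" + object + "> .";
    }

    // Describes a study and links it to its variables.
    // The class and property names are assumptions, see the lead-in.
    static List<String> describeStudy(String studyUri, List<String> variableUris) {
        List<String> lines = new ArrayList<>();
        lines.add(triple(studyUri, RDF_TYPE, DISCO + "Study"));
        for (String variableUri : variableUris) {
            lines.add(triple(variableUri, RDF_TYPE, DISCO + "Variable"));
            lines.add(triple(studyUri, DISCO + "variable", variableUri));
        }
        return lines;
    }
}
```

Because every resource is identified by a URI, other datasets can point at the same study or variable, which is what makes linking microdata to other microdata possible.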
25. DDI-RDF Discovery Vocabulary
• availability of (meta)data
• Microdata may be available (typically as CSV files)
• In most cases, metadata about microdata is NOT available
• contains major types of metadata of DDI-C and DDI-L
• mappings from DDI-XML to DDI-RDF
• no straightforward mapping from DDI-RDF to DDI-XML
• enables better support for the LD community
• partly no corresponding constructs in DDI-XML
• 26 experts from the statistics and Linked Data communities in
12 different countries have contributed
33. What comes next?
• What does the “next generation MISSY” look like under the
hood?
• How is the data model implemented?
• How does inheritance at data model level work?
• How does persistence work?
• Which modules/APIs does the MISSY Software System offer?
34. thank you for your attention…
• feel free to download the sources from GitHub!
https://github.com/missy-project
• have a look at the unofficial draft of DDI-RDF!
[planned as a specification of the DDI Alliance by 2013]
http://rdf-vocabulary.ddialliance.org/discovery
give us feedback!
feel free to criticize!
36. software architecture
• standard technologies to develop software
• multitier software architecture
• Model-View-Controller (MVC) pattern
• Apache Maven as project management software
• multitier architecture separates the project into logical parts
37. multitier software architecture
• presentation
• users can access the web application using their internet browser
• presentation control
• Maven module responsible for the view the user gets when interacting with
the web application
• business logic
• Maven modules defining the data models (DISCO, MISSY)
• data storage access
• Maven modules defining persistence functionalities for data model
components regardless of the actual type of physical persistence
• data storage
• Maven modules implementing concrete persistence functionalities (e.g. DDI-XML,
DDI-RDF, RDBs) for data model components
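The split between data storage access and data storage can be sketched as an interface plus interchangeable implementations. This is a minimal illustration under assumptions, not the MISSY code: `Study`, `StudyDao`, `InMemoryStudyDao`, and `StudyService` are hypothetical names, and an in-memory map stands in for a real backend such as DDI-XML, DDI-RDF, or a relational database.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical, minimal data model component (business logic tier).
final class Study {
    final String id;
    final String title;

    Study(String id, String title) {
        this.id = id;
        this.title = title;
    }
}

// Data storage access tier: persistence operations with no reference to a physical store.
interface StudyDao {
    void save(Study study);

    Optional<Study> findById(String id);
}

// Data storage tier: one concrete implementation; a map stands in for a real backend.
final class InMemoryStudyDao implements StudyDao {
    private final Map<String, Study> store = new HashMap<>();

    @Override
    public void save(Study study) {
        store.put(study.id, study);
    }

    @Override
    public Optional<Study> findById(String id) {
        return Optional.ofNullable(store.get(id));
    }
}

// Calling code depends only on the interface, so the backend can be swapped.
final class StudyService {
    private final StudyDao dao;

    StudyService(StudyDao dao) {
        this.dao = dao;
    }

    void register(String id, String title) {
        dao.save(new Study(id, title));
    }

    String titleOf(String id) {
        return dao.findById(id).map(s -> s.title).orElse("(unknown study)");
    }
}
```

Switching the physical persistence then means passing a different `StudyDao` implementation to `StudyService`; the upper tiers do not change.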