The document discusses SDMX and how it compares to other data standards. It analyzes SDMX's ability to represent different types of data like dimensional data, questionnaires, and business registers. It also looks at how well SDMX supports the full data process. The analysis is ongoing, with plans to further evaluate SDMX capabilities against real use cases and suggest improvements. Areas for potential improvement include adding an expression language for calculations and better features for versioning codes and representing changes over time.
2. Technical Working Group
Initial issue list:
– Global Registry
– IT tools
– Security Guidelines & Web services
– Documentation
– SDMX & other standards
Further issues:
– Expressions and calculations
– JSON implementation
29 January 2013 SDMX seminar – ISTAT 2
3. SDMX and other standards
The goals of the comparison:
Deal with all possible kinds of data
• Dimensional data
• Questionnaire data
• Business register data
Support all the phases of the business process
• Data design
• Data exchange
• Data processing
4. SDMX and other standards
The status of the analysis:
Mid-way: working plan covers 2012 and 2013
– Assess the SDMX capabilities
(ascertain the ability of SDMX to deal with real use
cases, also in comparison with other standards)
– Suggest the SDMX evolution
(propose SDMX upgrades, also taking advantage of the
state-of-the-art of other standards)
5. Other data models / standards
DDI – Data Documentation Initiative - International standard for
describing data from the social, behavioral, and economic sciences,
developed by the DDI Alliance, a self-sustaining membership organization
GSIM – Generic Statistical Information Model - Reference
framework for the information used in the production of official statistics,
developed by HLG-BAS (High Level Group for Strategic Development in
Business Architecture in Statistics)
Matrix model - User oriented information model developed by the Bank of
Italy and used since 1989 to design, exchange and process statistical data
by means of active metadata
RDF – Resource Description Framework - Standard model for
data interchange on the Web, consisting of a suite of recommendations of
the W3C (World Wide Web Consortium)
XBRL – eXtensible Business Reporting Language - A
royalty-free, open specification to describe financial information for public
and private companies and other organizations developed by XBRL
International, a not-for-profit consortium
6. Some Use Cases
ESCB / Bank of Italy
• Register of Institutions and Affiliates Database (RIAD)
• Balance of Payments direct reporting questionnaires
• Securities Register
• Matrix model & INFOSTAT platform
Eurostat / ISTAT
• EuroGroups Register (EGR)
• Labour Force Survey questionnaires
• EU Survey on Income and Living Conditions of Households
FAO
• Questionnaire on Crop and Livestock production & utilization
Infostat Slovakia
• ESSnet questionnaires in eCollect-X
• Statistical Survey of the Accommodation Establishments
Metadata Technology
• The RDF/SDMX Data Cube Vocabulary
XBRL community
• Formula language
7. Main Outcomes (1)
• The SDMX information model contains the
basic structures to represent dimensional,
questionnaire and register data
– SDMX represents mathematical functions (Cubes)
having independent variables (dimensions) and
dependent variables (measures and attributes)
– Dimensional, questionnaire and register data can be
represented as sets of mathematical functions
according to the approach of the Matrix model,
described in “SDMX support for different data”
(SDMX global conference – Washington - 2011)
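As a minimal sketch of the "cube as mathematical function" idea above, the Python snippet below models a cube as a mapping from dimension values to measures and attributes. The names (ref_area, time_period, obs_value, obs_status) and all figures are illustrative assumptions, not taken from any real DSD or from the Matrix model documentation.

```python
from typing import NamedTuple


class Key(NamedTuple):
    """Independent variables (dimensions) of a hypothetical cube."""
    ref_area: str
    time_period: str


class Value(NamedTuple):
    """Dependent variables: one measure and one attribute."""
    obs_value: float
    obs_status: str


# A cube is a function Key -> Value; a dict guarantees one value per key.
# All numbers below are made up for illustration.
cube: dict[Key, Value] = {
    Key("IT", "2012"): Value(100.0, "A"),
    Key("FR", "2012"): Value(120.0, "A"),
    Key("IT", "2013"): Value(103.5, "P"),
}


def lookup(ref_area: str, time_period: str) -> Value:
    """Evaluate the function at one point of its domain."""
    return cube[Key(ref_area, time_period)]


print(lookup("IT", "2013").obs_value)
```

Under this reading, a questionnaire or a register would be a set of such functions, each with its own dimensions.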
8. Main Outcomes (2)
• The SDMX standard needs an
Expression Language to define
validation and calculation rules
– Matrix model: EXL (EXpression Language)
– XBRL: Formula language
– DDI, GSIM: no calculation language
• New TWG WP Item “Expressions and
calculations”
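To illustrate what validation and calculation rules might look like, the sketch below expresses one calculation and three validations as plain Python functions over a record. It does not use EXL or XBRL Formula syntax; the measure names (CREDIT, DEBIT, NET), the rule names, and the values are hypothetical.

```python
from typing import Callable

Record = dict[str, float]


def calc_net(rec: Record) -> Record:
    """Calculation rule: derive a new measure from existing ones."""
    return {**rec, "NET": rec["CREDIT"] - rec["DEBIT"]}


# Validation rules as (name, predicate) pairs; all names are hypothetical.
RULES: list[tuple[str, Callable[[Record], bool]]] = [
    ("credit_non_negative", lambda r: r["CREDIT"] >= 0),
    ("debit_non_negative", lambda r: r["DEBIT"] >= 0),
    ("net_consistent", lambda r: r["NET"] == r["CREDIT"] - r["DEBIT"]),
]


def validate(rec: Record) -> list[str]:
    """Return the names of the rules that the record violates."""
    return [name for name, ok in RULES if not ok(rec)]


good = calc_net({"CREDIT": 50.0, "DEBIT": 20.0})
bad = {"CREDIT": 50.0, "DEBIT": -5.0, "NET": 10.0}

print(validate(good))
print(validate(bad))
```

A standard expression language would let such rules be exchanged as metadata alongside the data structures, rather than being hard-coded in each system as they are in this sketch.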
9. Main Outcomes (3)
Improve features for historical representation:
• Versioning of Codes to simplify the management of
Codelists, DSDs …
– Codelists are versioned, not Codes
– Problem: if a Code changes, a new version is needed for its
Codelist and for each DSD that uses that Codelist
– Solution: time validity of Codes (see Matrix model)
• Standard representation of changes with time
– merge, incorporate, split … of entities (countries, institutional
units …) (see Matrix model)
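The "time validity of Codes" idea can be sketched as follows: each code carries its own validity interval, so a change affects only that code rather than forcing a new version of the whole codelist. This is an assumption-laden illustration, not the Matrix model's or SDMX's actual structure; the class, field names, and codes are hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Code:
    """A code with its own validity interval (hypothetical structure)."""
    id: str
    name: str
    valid_from: date
    valid_to: Optional[date] = None  # None means "still valid"

    def is_valid_on(self, day: date) -> bool:
        return self.valid_from <= day and (
            self.valid_to is None or day <= self.valid_to
        )


# Made-up codelist: unit A ceases at the end of 2009, C starts in 2010.
CODELIST = [
    Code("A", "Unit A", date(2000, 1, 1), date(2009, 12, 31)),
    Code("B", "Unit B", date(2000, 1, 1)),
    Code("C", "Unit C (successor of A)", date(2010, 1, 1)),
]


def codes_valid_on(day: date) -> list[str]:
    """Return the ids of the codes valid on the given day."""
    return [c.id for c in CODELIST if c.is_valid_on(day)]


print(codes_valid_on(date(2005, 6, 30)))
print(codes_valid_on(date(2012, 1, 1)))
```

The same codelist answers queries for any reference date, which is the simplification the slide attributes to time validity.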
10. Main Outcomes (4)
Improve the data representation features:
• Allow using many measures in exchanging multi-
measure data, without mandatory measure dimension
• Allow many dimensions in the role of “measure
dimension”
• No mandatory names Obs_Value and Time_period
(they force the original names of measures and time to be changed)
• For an “interval” Time_period, allow specifying
start_date / end_date, not only start_date / duration
• Improve the specification of the action to be performed
(Insert/Update/Delete …) and allow specifying the order
of the actions, essential for integrity in some use cases
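The first point above, exchanging several measures without a measure dimension, can be pictured as the difference between a "long" and a "wide" layout. The sketch below pivots hypothetical long-form rows (one row per measure) into one record per key carrying several measures; the column names and values are illustrative assumptions, not an SDMX message format.

```python
from collections import defaultdict

# "Long" form with a measure dimension: one row per (key, measure).
# All values are made up for illustration.
long_rows = [
    {"REF_AREA": "IT", "TIME": "2012", "MEASURE": "IMPORTS", "OBS_VALUE": 10.0},
    {"REF_AREA": "IT", "TIME": "2012", "MEASURE": "EXPORTS", "OBS_VALUE": 12.0},
    {"REF_AREA": "FR", "TIME": "2012", "MEASURE": "IMPORTS", "OBS_VALUE": 20.0},
    {"REF_AREA": "FR", "TIME": "2012", "MEASURE": "EXPORTS", "OBS_VALUE": 18.0},
]


def to_wide(rows):
    """Pivot to 'wide' form: one record per key with several measures."""
    wide = defaultdict(dict)
    for row in rows:
        key = (row["REF_AREA"], row["TIME"])
        wide[key][row["MEASURE"]] = row["OBS_VALUE"]
    return dict(wide)


wide = to_wide(long_rows)
print(wide[("IT", "2012")])
```

In the wide form the measures keep their own names, which also relates to the point about not forcing the name Obs_Value.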
11. Main outcomes (5)
• Support semantic WEB
– Allow dissemination based on RDF (see Data Cube Vocab.)
• Support to “active” questionnaires
– SDMX doesn’t represent questions
– Representation of questions and mapping between
questions and variables (concepts) of the data structure
(see DDI model; see eCollect-X solution)
– To be better analyzed, this is a field of possible integration with
DDI
• Possible unification of DSD and MSD ?
– metadata are data themselves
– To be better evaluated (different opinions were expressed on
this topic)
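For the Semantic Web point above, the sketch below shows roughly how a single observation might be written in Turtle using terms from the RDF Data Cube Vocabulary (qb:Observation, qb:dataSet) and its SDMX-derived dimension and measure properties. The ex: namespace, the identifiers, and the value are hypothetical, and the output is built by plain string formatting rather than an RDF library; in practice the reference area would usually be a URI rather than a string literal.

```python
PREFIXES = """\
@prefix qb: <http://purl.org/linked-data/cube#> .
@prefix sdmx-dimension: <http://purl.org/linked-data/sdmx/2009/dimension#> .
@prefix sdmx-measure: <http://purl.org/linked-data/sdmx/2009/measure#> .
@prefix ex: <http://example.org/> .
"""


def observation_to_turtle(obs_id, dataset, ref_area, ref_period, value):
    """Serialize one observation as Turtle (illustrative only)."""
    return (
        f"ex:{obs_id} a qb:Observation ;\n"
        f"    qb:dataSet ex:{dataset} ;\n"
        f'    sdmx-dimension:refArea "{ref_area}" ;\n'
        f'    sdmx-dimension:refPeriod "{ref_period}" ;\n'
        f"    sdmx-measure:obsValue {value} .\n"
    )


# Hypothetical identifiers and a made-up value.
ttl = PREFIXES + "\n" + observation_to_turtle("obs1", "ds1", "IT", "2012", 10.5)
print(ttl)
```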
12. In summary
• Considerable inputs and findings, which require deeper
analysis to be implemented
• Need to prioritize the suggestions and identify
what can be achieved in the current SDMX version 2.1
and what should be addressed in later versions
• A general comparison of the SDMX, GSIM and DDI
information models (IMs) would be appropriate, mapping
corresponding artefacts as far as possible; contributions
on this topic can come from:
– SDMX – DDI dialogue
– GSIM work
13. Some references
GSIM - from UNECE wiki
http://www1.unece.org/stat/platform/display/metis/Generic+Statistical+Information+Model+%28GSIM%29
SDMX-DDI Dialogue - from UNECE wiki
http://www1.unece.org/stat/platform/display/metis/SDMX+DDI+Dialogue+-+Overview+Page
http://www1.unece.org/stat/platform/display/metis/Usage+scenarios+for+SDMX+and+DDI
SDMX - eCollect-X
http://www.ecollect-x.eu/en/about/project-history.aspx
SDMX-RDF Data Cube Vocabulary
http://www.w3.org/2011/gld/wiki/Data_Cube_Vocabulary
The Matrix model
http://www.bancaditalia.it/statistiche/quadro_norma_metodo/modell_SIS
http://www.czso.cz/conference2009/proceedings/data/process/piazza_paper.pdf
XBRL Formula
http://www.xbrl.org/SpecRecommendations
14. Contributors
– European Central Bank
– Eurostat
– FAO
– Infostat Slovakia
– ISTAT
– Metadata Technology
– National Bank of Italy
– National Bank of Poland
– UNECE
15. SDMX versus other standards
Vincenzo Del Vecchio
vincenzo.delvecchio@bancaditalia.it