1. Pan-European infrastructure for Ocean & Marine Data management
An EU Integrated research Infrastructure Initiative (I3)
How to implement CDI ?
M. Fichaut, V. Tosello
Training course - June 2007, Oostende, Belgium
2. 15/08/13
How to proceed with the files for training
In the shared directory :
With 2 files
CDI_implementation.ppt : this presentation
training2_programme.doc : programme sent by email
And 5 directories :
CDI documentation
Detailed documentation on CDI and mapping
Example database
3 directories for ACCESS, MySQL and EXCEL database
Exercises
Mikado
Mikado software and configuration files
Solutions
Solution of the exercises, empty now, will be filled during the training after
each exercises session
Now : Copy the common directory on your computer
3. 15/08/13
Example database (1)
10 tables describing : the stations of datasets collected during cruises
5 cruises on 2 ships, 10 datasets, 1399 stations
the datasets are either CTD or Bottle stations (mikado_dataset.database_id)
Measured parameters are linked to the datasets (mikado_dataset_parameter)
Mapping with common vocabularies is made through mapping tables
(mikado_map_…)
Document : example_databaseexample_database_description.rtf
4. 15/08/13
Example database (1)
One database in several environments
On Microsoft ACCESS
accesscdi_training_course_2.mdb
On MySQL
Name of the database : mikado, user : mikado, password :
mikado
On EXCEL
excel cdi_training_course_2.xls
5. 15/08/13
First set of Exercises
Manipulate the example database in different
environments
Exercise 1 : MySQL database
Exercise 2 : ACCESS database
Exercise 3 : Excel file
8. 15/08/13
Database mapping to XML schema (1)
Using the documents :
CDI documentation CDI_XML_V0_documentation_3_00_June2007.doc:
Metadata Format and full description of XML schema version 2.04 – as
prepared in Sea-Search
CDI documentation CDI_V1_04_mapping.xls :
Excel file describing the mapping between the CDI fields, ISO-19115 and the
local database
example_databaseexample_database_description.rtf :
Description of the tables of the example database
9. 15/08/13
On the right hand side : local database reference
Table_name.field_name
Database mapping to XML schema (2)
Level 2 Level 3 Level 4 …
WHERE Latitude1 (var1) <dataIdInfo> (15) <geoBox> (42) <southBL> (346)
WHERE Longitude1 (var2) <dataIdInfo> (15) <geoBox> (42) <westBL> (344)
WHERE Latitude2 (var3) <dataIdInfo> (15) <geoBox> (42) <northBL> (347)
WHERE Longitude2 (var4) <dataIdInfo> (15) <geoBox> (42) <eastBL> (345)
WHERE Measuring area type (var5) <spatRepInfo> (12) <VectSpatRep> (176) <geometObjs> (178)
WHERE Data Coordinate system (var6) <refSysInfo> (13) <MdCoRefSys> (189) <datum> (192)
…
WHEN Start date (var12) <dataIdInfo> (15) <dataExt> (45) <tempEle> (337)
WHEN Start time (UT) var(12) see Start date
WHAT Parameters measured (var50) <dataIdInfo> (15) <descKeys> (33) <keyword> (53)
<KeyTyp> (54)
HOW Instrument or gear type (var42) <dataIdInfo> (15) <descKeys> (33) <keyword> (53)
<KeyTyp> (54)
HOW Type of platform (var43) <dataIdInfo> (15) <descKeys> (33) <keyword> (53)
<KeyTyp> (54)
WHO Originator (var16) <dataIdInfo> (15) <idPoc> (29) <rpOrgNameId>
(not ISO)
<role> (379)
PART CDI FIELD XML TAGS LOCAL DATABASE FIELD
(TABLE_NAME.FIELD_NAME)
On the left hand side : CDI reference
For each : XML tree (XML tags, with ISO-19115 number)
10. 15/08/13
Database mapping to XML schema (3)
WHERE Latitude1 (var1) MIKADO_STATION.STATION_LAT
WHERE Longitude1 (var2) MIKADO_STATION.STATION_LON
WHERE Latitude2 (var3) not relevant
WHERE Longitude2 (var4) not relevant
WHERE Measuring area type (var5)
WHERE Data Coordinate system (var6)
…
WHEN Start date (var12)
WHEN Start time (UT) var(12)
WHAT Parameters measured (var50)
HOW Instrument or gear type (var42)
HOW Type of platform (var43)
WHO Originator (var16)
PART CDI FIELD LOCAL DATABASE FIELD
(TABLE_NAME.FIELD_NAME)
Mandatory CDI fields in bold characters
Example of mapping
11. 15/08/13
Second set of Exercises
Map the local database to the CDI fields
Exercise 4 : Find all CDI fields mapping with
MIKADO_CRUISE table
Exercise 5 : Find the mapping for the CDI field Data
Coordinate system and Sampling Interval
Exercise 6 : Full mapping
Look at the final mapping file. Questions?
12. 15/08/13
CDI implementation progress
ISO-19115
XML SCHEMA
defined for the CDI
by SDN – TTT
USER
LOCAL
DATABASE
ORACLE
MSSERVER
MYSQL
…
EXCEL FILES
STATION METADATA
(CTDs, Bottles, Current meters, …)
MAPPING
Common
vocabularies
MAPPING
13. 15/08/13
Principles of the CDI
ISO-19115
XML SCHEMA
defined for the CDI
by SDN – TTT
USER
LOCAL
DATABASE
ORACLE
MSSERVER
MYSQL
…
EXCEL FILES
STATION METADATA
(CTDs, Bottles, Current meters, …)
MAPPING
Central CDI
Common
vocabularies
MAPPING
TOOLS
Mikado
Others
Configuration
Files
Compliant to
XML CDI files
15. 15/08/13
MIKADO : Reminder
MIKADO has been developed by IFREMER in the framework of the
SEA-SEARCH project.
MIKADO can be used to generate EDMED, CSR and CDI XML files.
MIKADO can be used into 2 different ways :
One manual way, to input manually information for EDMED,
CSR and CDI fields in order to generate XML files.
One automatic way, to generate these descriptions automatically
if information is catalogued in a relational database.
16. 15/08/13
MIKADO principle
MIKADO
Java code
JDBC
Java DataBase Connectivity
JVM – Java Virtual Machine
Native Drivers
MYSQL
ORACLE
POSTGRES
SQLServer
Bridge Drivers
using Microsoft ODBC
(ACCESS, EXCEL, SQL SERVER)
DATABASE
Configuration
Files
(*.properties)
Download drivers
from web sites
if not provided with Mikado
17. 15/08/13
List of drivers provided with Mikado
ODBC Bridge (from Java JDK 1.5)
MS Access 97, 2000, 2003
MS Excel
JTDS 1.1 (Open Source licence)
Sybase 10, 11, 12, 15
SQL Server
MySQL 3.1.10 (GPL licence)
MySQL 4.1, 5.x
Oracle 14 (Oracle Technology Network Development and Distribution License)
Oracle 8.1.7, 9i, 10i (with restrictions)
PostgreSQL 8.0.312 (BSD licence)
PostgreSQL
MS SQL server 2005 driver (Microsoft download)
SQL Server 2000 with Service Pack 1 or higher
Please note that Mikado source level is JDK 1.5
18. 15/08/13
How to find the appropriate JDBC
driver?
Native drivers must be downloaded from the appropriate vendors
websites
(i.e Oracle, Sybase, MySQL, PostGres, …)
It must fit with the database and the JDBC versions.
This link may help you to find out the native drivers you need :
http://developers.sun.com/product/jdbc/drivers
For Oracle RDBMS , a native driver may be found on
http://www.oracle.com/technology/software/tech/java/sqlj_jdbc/index.html
For Microsoft SQL server, a native driver may be found on Microsoft
MSDN
http://msdn2.microsoft.com/en-us/data/default.aspx
Bridge drivers (using Microsoft-ODBC Open DataBase Connectivity) are
bundled with the Java 2 SDK.
http://java.sun.com/j2se/1.3/docs/guide/jdbc/getstart/bridge.doc.html
Drivers have then to be installed following the given recommendations
The corresponding .jar files must be copied in the mikadodist directory
19. 15/08/13
MIKADO configuration files
See Mikado user manual
mikadosdn_Mikado_Manual_V0.1.doc
Mikado needs 2 configuration files in the
mikadoconf directory
files.properties file which contains
all the information about the name and the location of the
files used by the software
Cdi.properties file which contains:
Database connection parameters
Queries to retrieve data from the database
Mapping between the queries results (SQL variables) and
the CDI fields
20. 15/08/13
Mikado files.properties file
All the information about the name and the location of
the files used by the software
Name of the files containing the list of
values for CDI [and EDMED and CSR] :
Parameters codes, platforms codes, …
##############
# CDI
##############
#cdiPartnerCodes
cdiPartnerCodes.file = ./conf/cdi_partner_codes.csv
#cdiMeasurementAreaTypeCodes
cdiMeasurementAreaTypeCodes.file = ./conf/cdi_measurement_codes.csv
Name of XML schemas used for CDI [and
EDMED and CSR]
#xsd
xsd.cdi = ./conf/CDI_V1_04.xsd
xsd.cdi_multirecords = ./conf/CDI_V2_01_multirecords.xsd
Name of the files used for CDI XML
automatic generation [and EDMED and
CSR]
21. 15/08/13
Mikado cdi.properties file
3 parts in the file
Database connection parameters
Depending on the DATABASE : ACCESS, MySQL, ORACLE
…
Queries to extract data from the database
To be written from the mapping
Mapping between the queries results (SQL variables) and
the CDI fields
Must not be modified!!!
22. 15/08/13
cdi.properties file : database connection
(1)
DATABASE ACCESS JDBC (Java Data Base
Connectivity) connection parameters
###############################
JDBC parameters for querying database
###############################
bdd.query.timeout = database connection time out
bdd.className= driver Java Class Name (the *.jar file which
implement this Class Name must exist in the
mikadoconfdist directory)
bdd.connectURL = connect string to the database
bdd.user.name = user name
bdd.user.password = user password
24. 15/08/13
cdi.properties file : database connection
(3)
Examples of Native drivers
MySQL database
bdd.className = com.mysql.jdbc.Driver
bdd.connectURL = jdbc:mysql://localhost/mikado
bdd.user.name = mikado
bdd.user.password = mikado
ORACLE
bdd.className = oracle.jdbc.OracleDriver
bdd.connectURL = jdbc:oracle:thin:@server:port:instance
User and password must be provided
Example : jdbc:oracle:thin:@josephine.ifremer.fr:1521:btest920
• MS SERVER
bdd.className = com.microsoft. jdbc.sqlserver. SQLServerDriver
bdd.connectURL =
jdbc:sqlserver://serverName;instanceName:portNumber;property=value[;property=valu
e]
User and password must be provided
Example :
jdbc:sqlserveur://www.bkrclass.org:1065;databaseName=io_mikado;
integratedSecurity=false;
25. 15/08/13
cdi.properties file : mapping
Mapping of the query results to the CDI fields
Mapping table : SQL variables = CDI fields
var1 = southLatitude
var2 = westLongitude
var3 = northLatitude
…..
var42 = instrumentOrGearType
var43 = typeOfPlatform
var50 = parametersMeasured.value
WHERE Latitude1 (var1) MIKADO_STATION.STATION_LAT
WHERE Longitude1 (var2) MIKADO_STATION.STATION_LON
WHERE Latitude2 (var3) not relevant
WHERE Longitude2 (var4) not relevant
PART CDI FIELD LOCAL DATABASE FIELD
(TABLE_NAME.FIELD_NAME)
Must not be
changed
26. 15/08/13
cdi.properties file : SQL queries (1)
Main query
It is the query which identifies all CDI entries that must be exported.
This request must return only one column which is the identifier of
each entry and that will be used for sub-queries.
SQL syntax must be adapted to your local database
############################
# Main query (station identifier list)
############################
query.=select mikado_station.station_id from mikado_station
27. 15/08/13
cdi.properties file : SQL queries (2)
Sub-queries
For each row of the main query, it is possible to define single or
multi sub-queries.
Single sub-queries : return one row for each row of the main query
(ex: dataset_id, cruise_name)
Multi sub-queries : return several rows for each row of the main
query (ex: parameters).
The links between the fields of the database and the variables of
the mapping (var1 to varN) table have to be defined in theses
sub-queries as synonyms of the selected columns
28. 15/08/13
cdi.properties file : SQL queries (3)
Single Sub-queries : return one row
The number of single sub-queries must be
defined
queries.single.number=5
The keyword for single sub-queries is :
queries.single.i=
where i varies from 1 to queries.single.number
Example :
queries.single.1 = select tabA.colX var1 from tabA where
tableA.colY= :$
var1 = synonym for the column name from the mapping table
:$ = the identifier of the row returned by the main query
29. 15/08/13
cdi.properties file : SQL queries (4)
Multiple Sub-queries : return several rows
All the names of the multi sub-queries are
defined and must not be changed
For CDI one multi sub-query is defined
The keyword is :
queries.multi.parametersMeasured =
Example :
queries.multi.parametersMeasured =select tabA.colX var50 from
tabA, tabB where tabB.colX = :$ and tabB.colY = tabA.colZ
var50 = synonym for the column name from the mapping table
:$ = the identifier of the row returned by the main query
30. 15/08/13
Third set of Exercises
Write the cdi.properties files for ACCESS or MY SQL or
EXCEL files
In the exercises only the stations of the Cruise « CITHER2 LEG 1 » will
selected to have a smaller number of stations
Exercise 7 : Look at the main query. Questions?
Exercise 8 : Fulfil the missing information in query.single.1.
Query on MIKADO_DATASET table
Exercise 9 : Fulfil the missing information in query.single.2.
Query on MIKADO_STATION table
Exercise 10 : Write the query.single.4 to find the
datum coordinate system
Exercise 11 : Have a look at the final property file,
questions?
31. 15/08/13
ISO date format
Date format must have the following syntax
YYYY-MM-DD
YYYY-MM-DDTHH24:MI
YYYY-MM-DDTHH24:MI:SS
32. 15/08/13
CDI implementation progress
ISO-19115
XML SCHEMA
defined for the CDI
by SDN – TTT
USER
LOCAL
DATABASE
ORACLE
MSSERVER
MYSQL
…
EXCEL FILES
STATION METADATA
(CTDs, Bottles, Current meters, …)
MAPPING
Common
vocabularies
MAPPING
TOOLS
Mikado
Others
Configuration
Files
33. 15/08/13
XML files generation - Third STEP
MAP YOUR DATABASE to THE CDI FIELDS
MIKADO CONFIGURATION
Generate the XML files
34. 15/08/13
XML CDI files generation
Use the Mikado software to create the XML CDI
files
Make sure that you have the right « properties » files in the
mikadoconf directory files before running the software
Change the connection to the database in the
CDI.properties file, with the path to your local database
Run Mikado
Import the common vocabularies lists for CDI using Mikado
: Menu configuration, library, CDI
Create and output directory for your XML files
35. 15/08/13
XML files check - Fourth STEP
MAP YOUR DATABASE to THE CDI FIELDS
MIKADO CONFIGURATION
Generate the XML files
Check the XML files
36. 15/08/13
CDI implementation progress
ISO-19115
XML SCHEMA
defined for the CDI
by SDN – TTT
USER
LOCAL
DATABASE
ORACLE
MSSERVER
MYSQL
…
EXCEL FILES
STATION METADATA
(CTDs, Bottles, Current meters, …)
MAPPING
Common
vocabularies
MAPPING
TOOLS
Mikado
Others
Configuration
Files
Compliant to
XML CDI files
37. 15/08/13
Send XML files - Fifth STEP
MAP YOUR DATABASE to THE CDI FIELDS
MIKADO CONFIGURATION
Generate the XML files
Check the XML files
Send the XML files
to central CDI
38. 15/08/13
Fourth set of Exercises
Generate the XML CDI files and check them
Exercise 12 : Generate the XML files with Mikado
Exercise 13 : Check the XML files
Notes de l'éditeur
On one hand we have data points that can be described in databases containing only metadata, or metatdata and data, or in flat files like EXCEL files. On the other hand we have the XML SCHEMA defined for the CDI during the SEA-SEARCH project. And we have the common vocabularies that must be used in the frame of SeaDataNet. The first step for implementing a CDI is to map the data point description to the XML schema. Then once the mapping is done, tools can use this mapping to generate XML CDI files which will be delivered to the central CDI in MARIS. During this course we will detail the different steps that are necessary to implement a CDI using the MIKADO tool