Welcome to my presentation on ODD. I have been working with ODD for about 4
years, first because I wanted Universal Dependencies based linguistics in Frisian
corpora, later because I wanted a strict dictionary format based on TEI and
Universal Dependencies. I want to show you how ODD has helped me to deliver
reliable, interoperable solutions.
First I will show you what ODD is, then I will show you how ODD can be used in
development pipelines. We will have a look at how ODDs can inherit from each
other, and have a glance at TEI Publisher. Finally I will share some pros and cons
and the reasons why we at the Fryske Akademy stick to ODD as a basis for solution
development.
TEI and ODD for
LINGUISTICS
A solid basis for development?
edrenth@fryske-akademy.nl
So what is an ODD? An ODD is a regular TEI document in which you define your data
model using a schemaSpec. In the ODD you can document your data model using TEI
elements such as div, p, gloss and def. In this documentation you can include the
actual specifications, to which you can refer from within schemaSpec. This gives you a
nice way to specify your data model in a documented manner.
Inside the specification of an element you can indicate how this element is to be
processed. We don't use this yet, but with the latest versions of TEI Publisher it has
become a very interesting mechanism. Once you have your ODD you can generate
validation schemas, documentation and more.
What is ODD
• One Document Does it all
• It is a TEI document
• Holding one schemaSpec element
• It is the mechanism to customize TEI
• TEI is designed to be customized
• What can it do
• Generate validation
• Generate documentation
• Describe processing model
• https://tei-c.org/guidelines/customization/
• https://tei-c.org/release/doc/tei-p5-doc/en/html/USE.html#IM-unified
• https://tei-c.org/release/doc/tei-p5-doc/en/html/TD.html#TDmodules
• https://tei-c.org/release/doc/tei-p5-doc/en/html/TD.html#TDPMPM
Here you see a basic example of a schema definition in ODD. Often the structure will
be a schemaSpec with references to the modules you want to work with and
specifications of the elements you want to change.
The start attribute on schemaSpec tells which root element or elements are allowed.
Each moduleRef points to an existing module, available in TEI (online) or elsewhere.
A source attribute, left out in this example, allows you to point to the file where
module definitions can be found; I will show you more on this later.
To limit the vast number of elements in modules you use the include or except
attributes on moduleRef. NOTE that if you omit an element from the include attribute
and refer to it later from an elementRef, schema generation will not fail; instead the
element will just not be there. After the module references you usually list some
elementSpecs for elements that you want to change.
NOTE that omitting the mode attribute on elementSpec means add, not change.
Adding an already existing element is weird, but again it often does not make the
transformation fail!
Including a content element in an elementSpec will overwrite the existing content
model. What goes inside content resembles, for example, xsd: basically you can use
sequence and alternate.
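The slide image is not part of this transcript. A minimal sketch of what such a
schema definition could look like is given below. The identifier myTEI and the
selection of modules and elements are assumptions made for illustration, not the
definition from the original slide.

```xml
<!-- Illustrative sketch only: ident and element selection are hypothetical. -->
<schemaSpec xmlns="http://www.tei-c.org/ns/1.0" ident="myTEI" start="TEI">
  <!-- Each moduleRef pulls in an existing TEI module. -->
  <moduleRef key="tei"/>
  <moduleRef key="header"/>
  <!-- include limits the module to the listed elements. -->
  <moduleRef key="core" include="p head hi"/>
  <moduleRef key="textstructure" include="TEI text body div"/>

  <!-- mode="change" modifies the existing element; omitting mode would mean "add". -->
  <elementSpec ident="div" module="textstructure" mode="change">
    <!-- A content element replaces the existing content model. -->
    <content>
      <sequence>
        <elementRef key="head" minOccurs="0"/>
        <alternate minOccurs="1" maxOccurs="unbounded">
          <elementRef key="p"/>
          <elementRef key="div"/>
        </alternate>
      </sequence>
    </content>
  </elementSpec>
</schemaSpec>
```

In this sketch a div may start with an optional head, followed by one or more
paragraphs or nested divs.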
A very nice feature, I think, is the ability to use constraints in element specifications.
Most people will use Schematron with assert, report and XPath.
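As a hedged illustration of such a constraint, the sketch below attaches a Schematron
assertion to the w element. The rule itself and the ident w-has-lemma are invented
for this example. Recent TEI releases use scheme="schematron"; older ODDs may use
"isoschematron" instead.

```xml
<!-- Illustrative sketch only: the constraint and its ident are hypothetical. -->
<elementSpec xmlns="http://www.tei-c.org/ns/1.0"
             xmlns:sch="http://purl.oclc.org/dsdl/schematron"
             ident="w" module="analysis" mode="change">
  <constraintSpec ident="w-has-lemma" scheme="schematron">
    <constraint>
      <!-- The assert is evaluated in the context of each w element. -->
      <sch:assert test="@lemma">
        Every w element is expected to carry a lemma attribute.
      </sch:assert>
    </constraint>
  </constraintSpec>
</elementSpec>
```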
Besides elements, attributes can also be specified; in this case I add linguistic feature
attributes to the analysis module. These attributes can be defined in their own
namespace.
You can define the datatype of an attribute, which can be an XML Schema datatype using
the name attribute (on dataRef) or, like here, a TEI datatype using key.
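A sketch of an attribute class along these lines could look as follows. The class
name att.ud.features, the namespace URI and the two attributes are assumptions for
illustration; the actual attributes on the slide are not in this transcript.

```xml
<!-- Illustrative sketch only: class name, namespace and attributes are hypothetical. -->
<classSpec xmlns="http://www.tei-c.org/ns/1.0"
           ident="att.ud.features" type="atts" module="analysis" mode="add">
  <desc>Linguistic feature attributes (illustrative subset).</desc>
  <attList>
    <attDef ident="Number" ns="http://example.org/ns/ud" usage="opt">
      <desc>Grammatical number.</desc>
      <datatype>
        <!-- key refers to a TEI datatype. -->
        <dataRef key="teidata.enumerated"/>
      </datatype>
      <valList type="closed">
        <valItem ident="Sing"><desc>singular</desc></valItem>
        <valItem ident="Plur"><desc>plural</desc></valItem>
      </valList>
    </attDef>
    <attDef ident="Foreign" ns="http://example.org/ns/ud" usage="opt">
      <desc>Marks a foreign word.</desc>
      <datatype>
        <!-- name refers to an XML Schema datatype. -->
        <dataRef name="boolean"/>
      </datatype>
    </attDef>
  </attList>
</classSpec>
```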
You can refer to previously defined attributes via memberOf. When an element is a
member of an attribute class, the attributes defined in this class are allowed for that
element.
NOTE the mode="change" on classes: if you omit it the default will be "replace",
meaning you will lose all other class memberships.
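A minimal sketch of such a class membership, assuming a hypothetical attribute class
named att.ud.features has been defined elsewhere in the ODD:

```xml
<!-- Illustrative sketch only: att.ud.features is a hypothetical class name. -->
<elementSpec xmlns="http://www.tei-c.org/ns/1.0"
             ident="w" module="analysis" mode="change">
  <!-- mode="change" keeps the existing memberships of w;
       without it the default "replace" would drop them. -->
  <classes mode="change">
    <memberOf key="att.ud.features" mode="add"/>
  </classes>
</elementSpec>
```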
More possibilities worth mentioning, but not in detail. The first two keep things
organized; the model specifies element processing.
More possibilities
• specGrp – specGrpRef: grouping specs
• macroSpec – macroRef: expanding spec content
• model: define behaviour of elements
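To give an idea of the grouping mechanism, the sketch below places a specification
inside the documentation part of an ODD with specGrp and pulls it into the schema
with specGrpRef. The identifiers spec.tokens and myTEI and the prose are assumptions
for illustration.

```xml
<!-- Illustrative sketch only: identifiers and prose are hypothetical. -->
<body xmlns="http://www.tei-c.org/ns/1.0">
  <div>
    <head>Tokens</head>
    <p>Words are encoded with the <gi>w</gi> element.</p>
    <!-- The specification lives next to the prose that documents it. -->
    <specGrp xml:id="spec.tokens">
      <elementSpec ident="w" module="analysis" mode="change">
        <desc>A token as used in this project.</desc>
      </elementSpec>
    </specGrp>
  </div>
  <div>
    <head>Schema</head>
    <schemaSpec ident="myTEI" start="TEI">
      <moduleRef key="tei"/>
      <moduleRef key="header"/>
      <moduleRef key="core"/>
      <moduleRef key="textstructure"/>
      <moduleRef key="analysis"/>
      <!-- Pull the grouped specification into the schema. -->
      <specGrpRef target="#spec.tokens"/>
    </schemaSpec>
  </div>
</body>
```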
Now, this is where the benefits really start. Once you have your ODD you can
construct a pipeline, without the need for coding, that will give you validation and
documentation, which you can use in, for example, editing environments like oXygen.
The first thing to do is "compile", or better said expand, your ODD using the available
TEI stylesheet. The necessary parts from modules in the TEI source will be combined
with your specifications.
After this is done you can transform to rng, again using the available stylesheet, or
transform to a separate schematron. Rng can be transformed to xsd, which you may
want in order to generate jaxb classes. Last but not least there is a nice library that
deals with the complexity of transforming schematron to xslt, the execution of
validation and the processing of validation results.
You can also use oXygen to transform, or OxGarage, or you can use Roma to construct
an ODD online, but the downside is that this gives you less control and insight, and you
get the version of the TEI source and stylesheets available in these tools at the time.
ODD, processing
• odd → odd2odd → .compiled
• odd2rng → rng with schematron
• odd2sch → .sch
• trang → .xsd
• dmaus schxslt → xslt and/or java validation
1. Maven: https://bitbucket.org/fryske-akademy/online-dictionaries/src/master/pom.xml
2. Oxygen
3. https://oxgarage.tei-c.org/
4. https://roma.tei-c.org/
5. Command line / maven
1. https://github.com/TEIC/Stylesheets/tags
2 – 4 use a version you may not want!
This makes me really happy! Recently I discovered it is possible, though verbose, to
define a maven pipeline that implements a lot of steps I mostly performed by hand
before. Now I can just do mvn verify, with no ant needed either and no dependencies
on online sources.
ODD, processing, maven
<transformationSet>
  <stylesheet>src/main/Stylesheets-${stylesheetversion}/odds/odd2odd.xsl</stylesheet>
  <parameters>….</parameters>
  <outputDir>src/main/resources/odd</outputDir>
  <fileMappers>….</fileMappers>
</transformationSet>
<transformationSet>
  <stylesheet>src/main/Stylesheets-${stylesheetversion}/odds/odd2relax.xsl</stylesheet>
  <parameters>….</parameters>
</transformationSet>
<transformationSet>
  <stylesheet>src/main/Stylesheets-${stylesheetversion}/odds/extract-isosch.xsl</stylesheet>
  <outputDir>src/main/resources/schematron</outputDir>
  <fileMappers>….</fileMappers>
</transformationSet>
<plugin>
  <groupId>net.sigmalab.trang</groupId>
  <artifactId>trang-maven-plugin</artifactId>
  <version>1.2</version>
</plugin>
<dependency>
  <groupId>name.dmaus.schxslt</groupId>
  <artifactId>java</artifactId>
  <version>2.0.3</version>
</dependency>
https://bitbucket.org/fryske-akademy/online-dictionaries/src/master/pom.xml
On top of the available transformations from the TEI community I found it very useful
to write my own transformations from ODD, for example to generate a configuration
file for blacklab, which in turn is used to build lucene indexes. Transformations like
that help to stay consistent and in control, for example in case of data model changes.
Naturally they can be included in maven pipelines.
ODD, generation
https://search.maven.org/search?q=a:TeiLinguisticsFa
https://bitbucket.org/fryske-akademy/tei-encoding/src/master/reusables/
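The actual blacklab generation is not shown in this transcript. As a much simpler
sketch of the same idea, the XSLT 2.0 stylesheet below reads a compiled ODD and
writes a plain-text list of element names with the attributes declared locally on
them. The output format is invented for illustration and is not a blacklab
configuration; attributes inherited through classes are not resolved here.

```xml
<!-- Illustrative sketch only: the output format is hypothetical. -->
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:tei="http://www.tei-c.org/ns/1.0">

  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- In a compiled ODD the element specifications sit inside schemaSpec. -->
    <xsl:for-each select="//tei:schemaSpec/tei:elementSpec">
      <xsl:sort select="@ident"/>
      <xsl:value-of select="@ident"/>
      <xsl:text>: </xsl:text>
      <!-- Only attributes declared directly on the element are listed. -->
      <xsl:value-of select="tei:attList/tei:attDef/@ident" separator=", "/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

Because the stylesheet takes the compiled ODD as input, a change in the data model
shows up in the generated file on the next build.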
Something about inheritance now. I must admit I recently abandoned it, because of
added complexity and lack of a use case. The basics are simple: write an ODD, compile
it, write another ODD that uses the compiled first one. The source attribute is crucial.
You can specify it on schemaSpec, which means every moduleRef without a source
attribute will retrieve its content from that source. Every moduleRef with its own
source attribute will retrieve its content from there instead.
An elementRef can also have a source attribute, allowing you for example to re-add an
element left out by the parent ODD.
Despite these simple basics it is kind of cumbersome to find out exactly which
elements and modules come from exactly where, how they are defined, modified,
etc.
Rule of thumb: use fixed versions and keep it simple.
ODD, chaining
Inherit from other ODDs
Compile odd1
• odd2odd.xsl
Create odd2 using compiled odd1
• @source=...
http://teic.github.io/PDF/howtoChain.pdf
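A sketch of a second ODD that builds on a compiled first one could look like this.
The file name odd1.compiled.odd, the ident odd2, the version number and the choice
of modules are assumptions for illustration; how a relative path in source is
resolved depends on how the stylesheets are run.

```xml
<!-- Illustrative sketch only: file name, ident and version are hypothetical. -->
<schemaSpec xmlns="http://www.tei-c.org/ns/1.0"
            ident="odd2" start="TEI" source="odd1.compiled.odd">
  <!-- No source attribute: content comes from the compiled parent ODD. -->
  <moduleRef key="tei"/>
  <moduleRef key="header"/>
  <moduleRef key="core"/>
  <moduleRef key="textstructure"/>
  <!-- Own source attribute: content comes from a fixed TEI release instead. -->
  <moduleRef key="analysis" source="tei:4.3.0"/>
  <!-- Re-add an element that the parent ODD left out. -->
  <elementRef key="note" source="tei:4.3.0"/>
</schemaSpec>
```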
Now, a glance at perhaps one of the most promising possibilities of ODD, especially
when looking at the TEI Publisher implementation of it. You can specify a processing
model for elements.
This allows you to decouple element definition from visual element behaviour.
A model defines behaviour and can do so conditionally. You can provide parameters
for the processing. Parameter values originate from the actual element at the time of
processing.
I think outputRendition should be avoided; instead rendition definitions should be
external, like (s)css and classes.
TEI Publisher takes the processing model a step further through the use of templates,
web components and xquery instead of xpath. We will probably be using it for digital
editions.
ODD, processing model
https://tei-c.org/release/doc/tei-p5-doc/en/html/TD.html#TDPM
https://teipublisher.com
https://e-editiones.org/
Very promising!
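A small sketch of a processing model on the head element, using only the TEI
processing model as described in the Guidelines and none of the TEI Publisher
extensions. The predicate, the class name section-title and the choice of behaviours
are assumptions for illustration.

```xml
<!-- Illustrative sketch only: predicate and class name are hypothetical. -->
<elementSpec xmlns="http://www.tei-c.org/ns/1.0" ident="head" mode="change">
  <!-- Conditional model: applies only to a head that is a child of div. -->
  <model predicate="parent::div" behaviour="heading" cssClass="section-title">
    <!-- The parameter value is computed from the element being processed. -->
    <param name="level" value="count(ancestor::div)"/>
  </model>
  <!-- Fallback for every other head. -->
  <model behaviour="block"/>
</elementSpec>
```

The styling stays outside the ODD here: cssClass only names a class, and the class
itself would be defined in an external stylesheet.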
These are some examples of solutions at the Fryske Akademy. For corpora we
generate blacklab config and javascript from ODD and we use the html stylesheet
from TEI to build a fully functional corpus query system.
For dictionaries we generate rng, xsd and schematron that are used in a
validationhelper which is published to maven central. This library is then used in an
app that publishes approved dictionary articles. An eXist-db app allows querying the
dictionary and presents results in either json or html.
Another example is a library for linguistics in corpora, where the generated xsd is
translated into jaxb classes using a bind.xml that is also generated. This library is
used in a Frisian lexicon service.
Usage in applications
• corpora: ODD → blacklab config, js, docs
• dictionaries: ODD → rng/xsd/schematron → validationhelper → maven central →
  publish app, json service, gui
• corpus linguistics: ODD → rng/xsd, bind.xml → jaxb2/xjc → jaxb classes →
  maven central → eclipse moxy (jax-rs, json rest) and apache cxf wsdl2java
  (jaxb, jax-ws, soap ws) → Frisian lexicon
https://web2.fa.knaw.nl/corpus-frontend
https://web2.fa.knaw.nl/exist/apps/onfw/index.html (TEST!)
https://web2.fa.knaw.nl/foarkarswurdlist-ws/
Wrapping up, I give you a list of pros and cons of ODD based development. The pros
weigh heavier for us. Perhaps the most problematic in practice is the complexity of
the development pipelines, which often consist of multiple generation and publication
steps and possibly inherited dependencies.
For me as a java adept it is a pity that the TEI focus is on rng, not xsd. I really like and
benefit from jaxb and still hope xsd 1.1 will be a success and find its way into a
follow-up for jaxb.
pros
• Reliable build processes that guarantee interoperability
• Maintain data logic in one place
• Generation of rng, schematron, xsd
• Generation using xslt
• Sticking close to TEI, benefit from updates and tools
• Limit knowledge and technologies to maintain
cons
• Niche (complex) knowledge
• Stylesheets may not generate what you want
• Chaining (inheritance) can be confusing
• Hard to debug and test
• ODD change may cascade updates of libs and applications
• Xsd support (via trang) less stable than rng
For us at the Fryske Akademy there are a lot of reasons to stick to our ODD based
approach. Perhaps I raised some curiosity that will lead to increased use of ODD
which in turn will lead to a load of github issues on ODD that will be solved,
improving the usability of ODD.
To ODD or not to ODD
• It is possible to maintain stable build processes based on ODD
• With code generation
• Active community, active maintenance of stylesheets
• It is possible to build reusable libraries based on ODD
• Over the past 4 years few problems
• ODD syntax is rather simple
• ODD with teipublisher for digital editions and integration in blacklab
Thank you for watching, my live version will now be available if you have any
questions.
Thanks
Eduard Drenth
edrenth@fryske-akademy.nl
I would like odd to get a more prominent
place in the TEI stack and community. It
could be a well known goldmine

TEI ODD based development

  • 1. Welcome to my presentation on ODD. I have been working with ODD for about 4 years, first because I wanted universal dependency based linguistics in Frisian corpora, later because I wanted a strict TEI and universal dependency based dictionary format. I want to show you how ODD has helped me to deliver reliable, interoperable solutions. First I will show you what ODD is, then I will show you how ODD can be used in development pipelines. We will have a look at how ODD's can inherit from each other, and have a glance atteipublisher. Finally I will share some pro's and con's and the reasons why we at the Fryske Akademy stick to ODD as a basis for solution development. 1 TEI and ODD for LINGUISTICS A solid basis for development? edrenth@fryske-akademy.nl
  • 2. So what is an ODD? An ODD is a regular TEI document in which you define your data model using a schemaSpec. In the ODD you can document your data model using TEI elements such as div, p, gloss and def. In this documentation you can include the actual specifications, to which you can refer from within schemaSpec. This gives you a nice way to specify your data model in a documented manner. Inside the specification of an element you can indicate how this element is to be processed. We don't use this yet, but since the latest versions ofteipublisher it has become a very interesting mechanism. Once you have your ODD you can generate validation schemes, documentation and more. 2 What is ODD • One Document Doesall • It is a TEI document • Holding one schemaSpec element • It is the mechanism to customize TEI • TEI is designed to be customized • What canit do • Generate validation • Generate documentation • Describe processing model • https://tei-c.org/guidelines/customization/ • https://tei-c.org/release/doc/tei-p5-doc/en/html/USE.html#IM-unified • https://tei-c.org/release/doc/tei-p5-doc/en/html/TD.html#TDmodules • https://tei-c.org/release/doc/tei-p5-doc/en/html/TD.html#TDPMPM
  • 3. Here you see a basic example of a schema definition in ODD. Often the structure will be a schemaSpec with references to modules you want to work with and specifications of elements you want to change. The start attribute on schemaSpec tells which root element(s) is(are) allowed. Each moduleRef points to an existing module, available in TEI (online) or elsewhere. A source attribute, left out in this example, allows you to point to the file where module definitions can be found, I will show you more on this later. To limit the vast number of elements in modules you use the include or except attributes on moduleRef. NOTE that if you omit an element in the include attribute and refer to it later from an elementRef, schema generation will not fail, instead the element will just not be there. After the module references you usually list some elementSpecs for elements that you want to change. NOTE that omitting the mode attribute on elementSpec means add, not change. Adding already existing elements is weird but again often does not make transformation fail! Including a content element in an elementSpec will overwrite existing content. The content of content resembles for example xsd, basically you can use sequence and alternate. 3
  • 4. A very nice feature I think is the ability to use constraints in element specifications, most people will use schematron with assert, report and xpath. 3
  • 5. Besides elements also attributes can be specified, in this case I add linguistic feature attributes to the analysis module. These attributes can be defined in their own namespace. You can define the datatype of attributes which can be an xml schema datatype using the name attribute, or like here, a TEI datatype using key. 4
  • 6. You can refer to previously defined attributes via memberOf. When an element is a member of an attribute class, the attributes defined in this class are allowed for that element. NOTE the "mode is change" on classes, if you omit it the default will be "replace" meaning you will loose all other class memberships. 5
  • 7. More possibilities worth mentioning but not in detail. The first two keep things organized, the model specifies element processing. 6 • specGrp –specRef: grouping specs • macroSpec –macroRef: expanding spec content • model: definebehaviourofelements More possibilities
  • 8. Now, this is where the benefits really start. Once you have your ODD you can construct a pipeline without the need for coding that will give you validation and documentation, which you can use in for example editing environments like oxygen. First thing to do is "compile" or better said expand your ODD using the available TEI stylesheet. The necessary parts from modules in the TEI source will be combined with your specifications. After this is done you can transform to rng, again using the available stylesheet, or, transform to a separate schematron. Rng can be transformed to xsd, which you may want to generate jaxb classes. Last but not least there is a nice library that deals with the complexity of transforming schematron to xslt, the execution of validation and with the processing of validation results. You can also use oxygen to transform, or oxgarage oryou can use roma to construct ODD online, but the downside is this gives you less control and insight and you get the version of TEI source and stylesheets available in these tools at the time. 7 odd odd2odd • .compiled odd2rng • rng with schematron odd2sch • .sch trang • .xsd dmaus schxslt • xslt and/or java validation ODD, processing 1. Maven: https://bitbucket.org/fryske-akademy/online-dictionaries/src/master/pom.xml 2. Oxygen 3. https://oxgarage.tei-c.org/ 4. https://roma.tei-c.org/ 5. Command line / maven 1. https://github.com/TEIC/Stylesheets/tags 2 – 4 use a version you may not want!
  • 9. This makes me really happy! Recently I discovered it is possible, though verbose, to define a maven pipeline that implements a lot of steps I mostly performed by hand before. Now I can just run mvn verify, with no ant needed either and no dependencies on online sources. 8 ODD, processing, maven <transformationSet> <stylesheet>src/main/Stylesheets-${stylesheetversion}/odds/odd2odd.xsl</stylesheet> <parameters>….</parameters> <outputDir>src/main/resources/odd</outputDir> <fileMappers>….</fileMappers> </transformationSet> <transformationSet> <stylesheet>src/main/Stylesheets-${stylesheetversion}/odds/odd2relax.xsl</stylesheet> <parameters>….</parameters> </transformationSet> <transformationSet> <stylesheet>src/main/Stylesheets-${stylesheetversion}/odds/extract-isosch.xsl</stylesheet> <outputDir>src/main/resources/schematron</outputDir> <fileMappers>….</fileMappers> </transformationSet> <plugin> <groupId>net.sigmalab.trang</groupId> <artifactId>trang-maven-plugin</artifactId> <version>1.2</version> 1. <dependency> <groupId>name.dmaus.schxslt</groupId> <artifactId>java</artifactId> <version>2.0.3</version> https://bitbucket.org/fryske-akademy/online-dictionaries/src/master/pom.xml
  • 10. On top of the transformations available from the TEI community, I found it very useful to write my own transformations from ODD, for example to generate a configuration file for blacklab, which in turn is used to build lucene indexes. Transformations like that help to stay consistent and in control, for example when the data model changes. Naturally they can be included in maven pipelines. 9 ODD, generation https://search.maven.org/search?q=a:TeiLinguisticsFa https://bitbucket.org/fryske-akademy/tei-encoding/src/master/reusables/
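To give an idea of what a transformation from ODD can look like, here is a minimal XSLT 1.0 sketch. It is not the presenter's blacklab generator: it only lists, as plain text, the attributes declared directly on the element w in an ODD, the kind of information a generated index configuration could be built from. Attributes that w inherits through attribute classes are not resolved here.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:tei="http://www.tei-c.org/ns/1.0">

  <!-- Emit plain text rather than XML -->
  <xsl:output method="text" encoding="UTF-8"/>

  <xsl:template match="/">
    <!-- One line per attribute declared directly on elementSpec w -->
    <xsl:for-each select="//tei:elementSpec[@ident='w']//tei:attDef">
      <xsl:value-of select="@ident"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

Because the stylesheet reads the same ODD that produces the schemas, a change in the data model shows up in the generated output on the next build.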
  • 11. Something about inheritance now. I must admit I recently abandoned it, because of the added complexity and the lack of a use case. The basics are simple: write an ODD, compile it, then write another ODD that uses the compiled first one. The source attribute is crucial. You can specify it on schemaSpec, which means every moduleRef without a source attribute will retrieve its content from that source. Every moduleRef with a source attribute will retrieve its content from there. An elementRef can also have a source attribute, allowing you, for example, to re-add an element left out by the parent ODD. Despite these simple basics it is rather cumbersome to find out exactly which elements and modules come from where, how they are defined, modified, etc. Rule of thumb: use fixed versions and keep it simple. 10 Compile odd1 • odd2odd.xsl Create odd2 using compiled odd1 • @source=... ODD, chaining http://teic.github.io/PDF/howtoChain.pdf Inherit from other ODDs
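The chaining described in slide 11 can be sketched as a second ODD that reads from the compiled first one. All file names and paths below are hypothetical placeholders, and the choice of modules and elements is only an assumption for illustration.

```xml
<!-- odd2: by default everything is read from the compiled odd1 (hypothetical file name) -->
<schemaSpec ident="odd2" start="TEI" source="odd1.compiled.odd">

  <!-- No source attribute: content comes from the source given on schemaSpec -->
  <moduleRef key="tei"/>
  <moduleRef key="core"/>
  <moduleRef key="header"/>
  <moduleRef key="textstructure"/>

  <!-- Own source attribute: content comes from there instead
       (hypothetical path to a fixed version of the TEI source) -->
  <moduleRef key="analysis" source="path/to/p5subset.xml"/>

  <!-- Re-add an element that odd1 left out, again from the TEI source -->
  <elementRef key="note" source="path/to/p5subset.xml"/>

</schemaSpec>
```

Pointing every source at a fixed, locally available version follows the rule of thumb from the slide and keeps the build reproducible.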
  • 12. Now, a glance at perhaps one of the most promising possibilities of ODD, especially when looking at the teipublisher implementation of it. You can specify a processing model for elements. This allows you to decouple element definition from visual element behaviour. A model defines behaviour and can do so conditionally. You can provide parameters for the processing. Parameter values originate from the actual element at the time of processing. outputRendition should, I think, be avoided; rendition definitions should instead be external, such as (s)css and classes. Teipublisher takes the processing model a step further through the use of templates, web components and xquery instead of xpath. We will probably be using it for digital editions. 11 ODD, processing model https://tei-c.org/release/doc/tei-p5-doc/en/html/TD.html#TDPM https://teipublisher.com https://e-editiones.org/ Very promising!
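A processing model along these lines might look like the fragment below. It is a minimal sketch: the element hi, the predicate and the class name bold are chosen for illustration only. Styling is left to an external stylesheet via cssClass rather than outputRendition, in line with the preference stated above.

```xml
<elementSpec ident="hi" module="core" mode="change">
  <!-- Conditional model: applies only when the predicate matches -->
  <model predicate="@rend='bold'" behaviour="inline" cssClass="bold">
    <!-- The parameter value is evaluated against the element being processed -->
    <param name="content" value="."/>
  </model>
  <!-- Fallback model for every other hi element -->
  <model behaviour="inline"/>
</elementSpec>
```

Models are tried in document order, so the more specific, conditional model comes before the unconditional fallback.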
  • 13. These are some examples of solutions at the Fryske Akademy. For corpora we generate blacklab config and javascript from ODD, and we use the html stylesheet from TEI to build a fully functional corpus query system. For dictionaries we generate rng, xsd and schematron that are used in a validationhelper, which is published to maven central. This library is then used in an app that publishes approved dictionary articles. An exist-db app allows querying the dictionary and presents results in either json or html. Another example is a library for linguistics in corpora, where the generated xsd is translated into jaxb classes using a bind.xml that is also generated. This library is used in a Frisian lexicon service. 12 ODD • corpora blacklab config, js docs corpus linguistics Usage in applications eclipse moxy jax-rs, json rest apache cxf wsdl2java jaxb jax-ws, soap ws maven central jaxb2/xjc jaxb classes ODD rng/xsd bind.xml ODD • dictionaries rng/xsd/schematron validationhelper maven central publish app, json service, gui Frisian lexicon https://web2.fa.knaw.nl/corpus-frontend https://web2.fa.knaw.nl/exist/apps/onfw/index.html (TEST!) https://web2.fa.knaw.nl/foarkarswurdlist-ws/
  • 14. Wrapping up, I give you a list of pros and cons of ODD based development. The pros weigh heavier for us. Perhaps the most problematic in practice is the complexity of the development pipelines, which often consist of multiple generation and publication steps and possibly inherited dependencies. For me as a java adept it is a pity that the TEI focus is on rng, not xsd. I really like and benefit from jaxb, and still hope xsd 1.1 will be a success and find its way into a follow-up for jaxb. 13 pros • Reliable build processes that guarantee interoperability • Maintain data logic in one place • Generation of rng, schematron, xsd • Generation using xslt • Sticking close to TEI, benefit from updates and tools • Limit knowledge and technologies to maintain cons • Niche (complex) knowledge • Stylesheets may not generate what you want • Chaining (inheritance) can be confusing • Hard to debug and test • ODD change may cascade updates of libs and applications • Xsd support (via trang) less stable than rng
  • 15. For us at the Fryske Akademy there are a lot of reasons to stick to our ODD based approach. Perhaps I raised some curiosity that will lead to increased use of ODD, which in turn will lead to a load of github issues on ODD that will be solved, improving the usability of ODD. 14 To ODD or not to ODD • It is possible to maintain stable build processes based on ODD • With code generation • Active community, active maintenance of stylesheets • It is possible to build reusable libraries based on ODD • Over the past 4 years few problems • ODD syntax is rather simple • ODD with teipublisher for digital editions and integration in blacklab
  • 16. Thank you for watching. I will now be available live if you have any questions. 15 Thanks Eduard Drenth edrenth@fryske-akademy.nl I would like ODD to get a more prominent place in the TEI stack and community. It could be a well known goldmine