Mapping and Integration of Multiple Forms into Relational Databases
MAPPING & INTEGRATING MULTIPLE FORMS INTO A DATABASE
Yuan An, Ritu Khare, Il-Yeol Song, Xiaohua Hu

Background

[Fig. 1 content: input form and its back-end database]
FORM "Patient Information": Date; Patient (Name, Gender: M / F, DOB); HPI; Vital Signs (Height, Weight, BP)
DATABASE:
  PatientInformation (piId, Date, Patient, HPI, VitalSign)
  Patient (pId, Name, Gender, DOB)
  Gender (gId, options), with rows (001, Male) and (002, Female)
  Vital Sign (vId, Height, Weight, BP)

Fig. 1 Using forms as the front-end interface mapping to a back-end database is a standard way for data collection. The figure shows a scenario in the healthcare domain.

Motivation and Focus

In the quest for database usability, several DIY and WYSIWYG approaches enable non-technical users to design forms. Such approaches (e.g. FormAssembly) automatically translate forms into databases while shielding the users from technical details. Such approaches, however, neither support database evolution due to changing user requirements nor support multiple users managing a common database.

While there exist many techniques to forward engineer a single form to an individual back-end database, mapping multiple forms to an existing structured database remains unexplored. This work addresses the problem of automatically mapping multiple (possibly overlapping) forms to an existing structured database.

[Fig. 2 content: form "Healthy Living Program": Date; Patient (Name, DOB); Social Activities (Smokes, Alcohol, Hours Watching TV, Hours Exercise)]

Fig. 2 A New Form representing a new (or evolved) user requirement

Challenges in Mapping Forms to Databases

- How to automatically understand a user-created form and extract semantic relationships among form elements?
- How to automatically map the semantic model extracted from a form to the existing database?
- How to automatically evolve the existing database with desired properties and what are these properties?

The FormMapper System

[Fig. 3 content]
Input Form -> Tree Extraction Component -> Semantic Form Tree (w.r.t. the input form) -> Form Mapping and Integration Component -> DB
  Tree Extraction Component: Layered Hidden Markov Models (HMMs); Parent Child Association Rules
  Semantic Form Tree: nodes labelled root, X1, X2, Y1, Y2, Y3, z1, z2, z3
  Form Mapping and Integration Component: Initial Correspondence Generation and Validation; Merging; Database Birthing Algorithm; NEW Algorithm

Fig. 3 The FormMapper System has two components: (1) Tree Extraction (2) Form Integration.

Key Techniques

- Hierarchical Representation of Forms as Form Trees
- Hidden Markov Models for Form Information Extraction
- Sophisticated Matching techniques for Deriving Mapping Correspondences between tree and database
- Form Tree Patterns and DB design principles to translate a form tree into an equivalent database (See Fig. 4)
- Quantitative metric (quality tuning factor) to facilitate the decision of merging (or not merging) two mapped tables

Desirable Characteristics of Database

- Completeness
- Correctness
- Compactness
- Normalization (3NF)
- Optimization (minimize potential NULL values & the number of database elements)

[Fig. 4 content: four pattern diagrams for textbox, radiobutton and checkbox form elements: a) Textbox Pattern, b) Radiobutton Pattern, c) Checkbox Pattern, d) Category – Subcategory Pattern]

Fig. 4 Some Form Tree to Database Mapping Patterns.

Empirical Study in Healthcare

Datasets
- 16 highly complex data-entry forms from 3 healthcare institutions.
- Average 57 form elements per form

Benchmarks
- 16 Gold Standard Trees
- Prepared Using a DIY form design tool.
- Two sets of 3 Gold Standard Databases prepared by 2 database experts each with at least 10 years of experience.

Tree Extraction Component
- Expectation Maximization Algorithm on 52 clinical forms
- Viterbi Algorithm for decoding
- 5 parent child association rules
- Accuracy: 96.93%
- Duration: 0.07 sec per form

Form Integration Component
- Indexing using Lucene
- Quality tuning factor = 0.5
- Duration: 3 sec per form

[Fig. 5 content: three bar charts (Database 1, Database 2, Database 3) comparing FormMapper, Gold 1 and Gold 2 on the number of Tables, Columns, Values and Foreign Keys]

Fig. 5. Scale of the evolved Databases

[Fig. 6 content: two pie charts (FormMapper Vs Gold 1, FormMapper Vs Gold 2) with slices Perfect Match, Positive Mismatch and Negative Mismatch; slice labels 6%, 20%, 40%, 28%, 52%, 54%]

Fig. 6. Comparison of Tables.

FormMapper Vs Gold DB
- On an average, 87% of the database tables are either identical or superior (positive mismatch) to the gold database tables based on the defined database characteristics.
- Inferior cases (negative mismatch) are mostly due to the missing correspondences (due to extraction inaccuracies) and imprecisely derived cardinalities among category/subcategory in forms.

Implications

- High potential to replace the human experts
- As more forms are mapped, the database grows automatically in a principled manner.
- It is challenging to automate the aspects of mapping that rely on human understanding of domain semantics.

Work in Progress

- Leverage Ontology and Controlled Vocabularies to handle semantic heterogeneity.
- More sophisticated Correspondence Generation and Validation Techniques
- Consider more complicated merging situations (e.g. a table corresponds to a column)

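
The translation of a form tree into tables (Fig. 1 and Fig. 4) can be illustrated with a minimal Python sketch. It is not the FormMapper implementation: the `Node` class, the `to_tables` function and the generic `id` key column are hypothetical names, and the rules are only what Fig. 1 suggests (a category becomes a table, a textbox becomes a column, a radiobutton becomes a column plus a lookup table of options). The checkbox pattern and the merging step are left out.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One element of a form tree (hypothetical structure for this sketch)."""
    label: str
    kind: str  # "category", "textbox" or "radiobutton"
    children: list = field(default_factory=list)
    options: list = field(default_factory=list)


def to_tables(node, tables=None):
    """Map a category node to tables; returns {table name: [column names]}."""
    if tables is None:
        tables = {}
    columns = ["id"]  # assumed surrogate key, standing in for piId, pId, ...
    tables[node.label] = columns
    for child in node.children:
        # Every child contributes one column to its parent's table.
        columns.append(child.label)
        if child.kind == "radiobutton":
            # Options live in a lookup table referenced by that column.
            tables[child.label] = ["id", "options"]
        elif child.kind == "category":
            # A subcategory gets its own table, referenced by that column.
            to_tables(child, tables)
    return tables


# The "Patient Information" form of Fig. 1.
form = Node("PatientInformation", "category", [
    Node("Date", "textbox"),
    Node("Patient", "category", [
        Node("Name", "textbox"),
        Node("Gender", "radiobutton", options=["Male", "Female"]),
        Node("DOB", "textbox"),
    ]),
    Node("HPI", "textbox"),
    Node("VitalSign", "category", [
        Node("Height", "textbox"),
        Node("Weight", "textbox"),
        Node("BP", "textbox"),
    ]),
])

tables = to_tables(form)
for name, cols in tables.items():
    print(name, cols)
```

Under these assumptions the sketch yields the four tables of Fig. 1: PatientInformation, Patient, Gender and VitalSign.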
CVDI is a collaboration between the University of Louisiana at Lafayette & Drexel University
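
The poster names the Viterbi algorithm as the decoder of the Tree Extraction Component. The sketch below shows standard Viterbi decoding for a single-layer HMM only; the layered HMMs of FormMapper are not reproduced. The state names (`label`, `field`), the observation symbols (`text`, `input`) and all probabilities are made-up illustration values, whereas the poster reports learning parameters with Expectation Maximization.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most probable state sequence for the observations."""
    # best[t][s] = (probability of the best path ending in s at step t,
    #               previous state on that path)
    best = [{s: (start_p[s] * emit_p[s][observations[0]], None)
             for s in states}]
    for obs in observations[1:]:
        row = {}
        for s in states:
            row[s] = max(
                (best[-1][p][0] * trans_p[p][s] * emit_p[s][obs], p)
                for p in states
            )
        best.append(row)

    # Backtrack from the most probable final state.
    path = [max(states, key=lambda s: best[-1][s][0])]
    for row in reversed(best[1:]):
        path.append(row[path[-1]][1])
    path.reverse()
    return path


# Toy model (assumed values): a form alternates captions and input widgets.
states = ["label", "field"]
start_p = {"label": 0.9, "field": 0.1}
trans_p = {
    "label": {"label": 0.2, "field": 0.8},
    "field": {"label": 0.7, "field": 0.3},
}
emit_p = {
    "label": {"text": 0.9, "input": 0.1},
    "field": {"text": 0.1, "input": 0.9},
}

decoded = viterbi(["text", "input", "text", "input"],
                  states, start_p, trans_p, emit_p)
print(decoded)  # ['label', 'field', 'label', 'field']
```

For long forms, summing log-probabilities instead of multiplying probabilities avoids numerical underflow.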