RELAIS, a powerful instrument to support public statistics
Nicoletta Cibella, Tiziana Tuoto
Istituto Nazionale di Statistica (ISTAT), Direzione centrale per le tecnologie e il supporto metodologico (DCMT)
Outline
Nicoletta Cibella, VSP, April 2011
The problem
Record linkage in Istat
Possible Solutions for Record Linkage
A very fragmented picture, not only in Istat.
Different approaches to deal with record linkage: Exact RL, Deterministic RL, Probabilistic RL (Fellegi and Sunter theory), Bayesian RL, Machine Learning, Knowledge Representation, …
No particular technique has emerged as the best solution for all cases (maybe because such a solution does not exist…).
Several software packages and tools have been proposed, based on different approaches, free or commercial.
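To make the probabilistic approach concrete, here is a minimal sketch of the Fellegi and Sunter decision rule. It is not RELAIS code: the variable names, the m- and u-probabilities, and the two thresholds are all hypothetical values chosen for illustration. In practice these parameters are estimated from the data.

```python
import math

# Hypothetical parameters for three matching variables (illustrative only).
# m = P(variable agrees | the two records are a true match)
# u = P(variable agrees | the two records are not a match)
M_PROBS = {"surname": 0.95, "name": 0.90, "birth_year": 0.85}
U_PROBS = {"surname": 0.01, "name": 0.05, "birth_year": 0.02}


def match_weight(agreement):
    """Sum the log2 likelihood ratios over the compared variables."""
    total = 0.0
    for field, agrees in agreement.items():
        m, u = M_PROBS[field], U_PROBS[field]
        if agrees:
            total += math.log2(m / u)              # agreement weight (positive)
        else:
            total += math.log2((1 - m) / (1 - u))  # disagreement weight (negative)
    return total


def classify(weight, lower=0.0, upper=8.0):
    """Three-way decision; both thresholds are assumed values."""
    if weight >= upper:
        return "match"
    if weight <= lower:
        return "non-match"
    return "possible match"


if __name__ == "__main__":
    pattern = {"surname": True, "name": True, "birth_year": False}
    print(classify(match_weight(pattern)))
```

Pairs falling between the two thresholds are the ones usually sent to clerical review.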
RELAIS, brief history
RELAIS: a solution
1. Decompose RL into phases
2. Choose the most appropriate techniques
3. Build ad-hoc RL workflows
Phases and available techniques:
Preprocessing: Normalization, UpperLowerCase
Search Space Reduction: Blocking, SNM
Comparison Function: Edit Distance, Jaro, Equality
Decision Model: Probabilistic, Deterministic
[Figure: two example workflows, RecLink WF Appl1 and RecLink WF Appl2, each built by selecting techniques from the phases above.]
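The idea of composing a workflow from interchangeable techniques, one per phase, can be sketched as follows. This is an illustrative sketch only and does not reflect the RELAIS internals. All function names, the record layout, and the `id` field are assumptions. The example uses upper-casing for preprocessing, blocking for search space reduction, equality as the comparison function, and a deterministic rule as the decision model.

```python
# Phase 1: preprocessing (trim whitespace and convert to upper case).
def normalize(record):
    return {k: str(v).strip().upper() for k, v in record.items()}


# Phase 2: search space reduction by blocking on a single variable.
def blocking(file_a, file_b, key):
    blocks = {}
    for rec in file_b:
        blocks.setdefault(rec[key], []).append(rec)
    for rec_a in file_a:
        # Only records sharing the same blocking value are compared.
        for rec_b in blocks.get(rec_a[key], []):
            yield rec_a, rec_b


# Phase 3: comparison function (exact equality on each matching variable).
def equality(rec_a, rec_b, fields):
    return {f: rec_a[f] == rec_b[f] for f in fields}


# Phase 4: decision model (deterministic rule: every variable must agree).
def deterministic(comparison):
    return all(comparison.values())


def run_workflow(file_a, file_b, block_key, fields,
                 preprocess=normalize, reduce_space=blocking,
                 compare=equality, decide=deterministic):
    """Chain the four phases; each phase can be replaced independently."""
    a = [preprocess(r) for r in file_a]
    b = [preprocess(r) for r in file_b]
    return [(ra["id"], rb["id"])
            for ra, rb in reduce_space(a, b, block_key)
            if decide(compare(ra, rb, fields))]
```

Passing a different function for `compare` (for instance a string-similarity measure) or for `reduce_space` (for instance a sorted neighborhood generator) would give a different workflow without touching the other phases.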
Main features of RELAIS
RELAIS and open source
Licence: EUPL (European Union Public Licence).
Adopting the open-source philosophy and moving beyond ad-hoc approaches has proved a winning choice.
Experiences and solutions are shared with the NSIs of Spain, UK, Tunisia, Brazil, …
On-the-job training in the UK in January 2011 and in Latvia in July.
Thanks to the modular approach and the open-source licence, adding new techniques to the pool already available is really easy.
RELAIS 2.0 in June 2009
Additions in RELAIS 2.0
Relational database architecture: optimizes performance when managing huge amounts of data throughout the whole record linkage project (input, intermediate phases and output).
Two modes for processing blocks: a) step-by-step execution, when blocks are few or in the exploratory phase; b) one-shot execution, to deal with a large number of blocks (at the suggestion of the Spanish NSI).
Explicit management of the output and residual files, to iterate several processes, and back-up management.
RELAIS 2.1 in May 2010
RELAIS 2.1 is already available on the OSOR and Istat websites.
Relational database support: data input from Oracle or MySQL databases.
New default input values for the parameter estimation of the probabilistic model and a new definition of the candidate pairs for the optimal 1:1 reduction.
More than one variable can be used for search space reduction by the sorted neighborhood method.
Minor bugs have been fixed.
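As an illustration of the sorted neighborhood method with more than one sorting variable, the sketch below merges the two files, sorts them on a composite key, and keeps only cross-file pairs that fall inside a sliding window. It is a generic rendering of the technique, not the RELAIS implementation. The function name, the `id` field, and the default window size are assumptions.

```python
def sorted_neighborhood(file_a, file_b, sort_keys, window=3):
    """Return candidate (id_a, id_b) pairs found within a sliding window.

    sort_keys is a sequence of variable names; records are ordered on the
    tuple of their upper-cased string values, so numeric variables are
    compared as text in this simplified sketch.
    """
    # Tag each record with its source file, then sort the union of both files.
    tagged = [("A", r) for r in file_a] + [("B", r) for r in file_b]
    tagged.sort(key=lambda t: tuple(str(t[1][k]).upper() for k in sort_keys))

    pairs = set()
    for i, (src_i, rec_i) in enumerate(tagged):
        # Compare each record with the next (window - 1) records in sort order.
        for src_j, rec_j in tagged[i + 1:i + window]:
            if src_i != src_j:  # keep only pairs that span the two files
                a, b = (rec_i, rec_j) if src_i == "A" else (rec_j, rec_i)
                pairs.add((a["id"], b["id"]))
    return pairs
```

A larger window yields more candidate pairs and fewer missed matches, at the cost of more comparisons.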
A glance at RELAIS 2.1
RELAIS 2.2 in May 2011
Next challenges
Future research projects
Thanks and Invitation to Cooperate
RELAIS contacts
Computer scientists:
Monica Scannapieco, e-mail: [email_address]
Laura Tosco, e-mail: [email_address]
Luca Valentino, e-mail: [email_address]
Statisticians:
Nicoletta Cibella, e-mail: [email_address]
Tiziana Tuoto, e-mail: [email_address]
http://www.istat.it/strumenti/metodi/software/analisi_dati/relais/
http://www.osor.eu/projects/relais

New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clash
 

Ws2001 sessione8 cibella_tuoto

  • 1. Nicoletta Cibella, Tiziana Tuoto. Istituto Nazionale di Statistica (ISTAT), Direzione centrale per le tecnologie e il supporto metodologico (DCMT). RELAIS, a powerful instrument to support public statistics
  • 2. Outline
  • 3. The problem
  • 4. Record linkage in Istat
  • 5. Possible Solutions for Record Linkage. A very fragmented picture, not only in Istat. Different approaches to record linkage (RL): exact RL, deterministic RL, probabilistic RL (Fellegi and Sunter theory), Bayesian RL, machine learning, knowledge representation, … No particular technique has emerged as the best solution for all cases (maybe because such a solution does not exist…). Several software packages and tools have been proposed, based on different approaches, free or commercial. Nicoletta Cibella, VSP, April 2011
  • 6. RELAIS, brief history
  • 7. RELAIS: a solution
  • 8. 1. Decompose RL in phases
  • 9. 2. Choose the most appropriate techniques
  • 10. 3. Build ad-hoc RL workflows. [Diagram] Phases and available techniques: Preprocessing (Normalization, UpperLowerCase); Search Space Reduction (Blocking, SNM); Comparison Function (Edit Distance, Jaro, Equality); Decision Model (Probabilistic, Deterministic). Two example workflows, RecLink WF Appl1 and RecLink WF Appl2, are each assembled by picking techniques from these phases (the selections shown include SNM, Probabilistic, Normalization, UpperLowerCase, Blocking, Jaro, Deterministic and Equality).
  • 11.
  • 12. RELAIS and the open-source EUPL (European Union Public Licence). The open-source philosophy, and the move away from ad-hoc approaches, proved a winning choice. Experiences and solutions are shared with the NSIs of Spain, UK, Tunisia, Brazil, … On-the-job training was held in the UK in January 2011 and in Latvia in July. Thanks to the modular approach and the open-source licence, adding new techniques to the pool already available is really easy.
  • 13.
  • 14. Additions in RELAIS 2.0. Relational database architecture, to optimize performance when managing huge amounts of data throughout the whole record linkage project (input, intermediate phases and output). Two modes for processing blocks: a) step-by-step execution, when blocks are few or during the exploratory phase, and b) one-shot execution, to deal with a large number of blocks (suggested by the Spanish NSI). Explicit management of the output and residual files, to iterate several processes, and back-up management.
  • 15. RELAIS 2.1 in May 2010. RELAIS 2.1 is already available on the OSOR and Istat websites. Relational database support: data can be input from an Oracle or MySQL database. New default input values for the parameter estimation of the probabilistic model and a new definition of the candidate pairs for the optimal 1:1 reduction. More than one variable can be used for search space reduction by the sorted neighborhood method. Minor bugs have been fixed.
  • 16. A glance at RELAIS 2.1
  • 17.
  • 18.
  • 19.
  • 20. Thanks and invitation to cooperate. RELAIS contacts. Computer scientists: Monica Scannapieco (e-mail: [email_address]), Laura Tosco (e-mail: [email_address]), Luca Valentino (e-mail: [email_address]). Statisticians: Nicoletta Cibella (e-mail: [email_address]), Tiziana Tuoto (e-mail: [email_address]). http://www.istat.it/strumenti/metodi/software/analisi_dati/relais/ http://www.osor.eu/projects/relais
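
The idea in slides 8 to 10 (decompose record linkage into phases, pick one technique per phase, and assemble an ad-hoc workflow) can be illustrated with a minimal Python sketch. This is not RELAIS code (RELAIS is implemented in Java and R); every function name, field name, threshold and sample record below is an assumption made for illustration only.

```python
from collections import defaultdict

# --- Preprocessing phase (hypothetical helper) ---
def normalize(record):
    """Trim and upper-case every string field; other values are left as-is."""
    return {k: v.strip().upper() if isinstance(v, str) else v
            for k, v in record.items()}

# --- Search space reduction phase: two interchangeable options ---
def blocking(a, b, key):
    """Pair only records that share the same value of the blocking key."""
    index = defaultdict(list)
    for rb in b:
        index[rb[key]].append(rb)
    for ra in a:
        for rb in index.get(ra[key], []):
            yield ra, rb

def sorted_neighborhood(a, b, keys, window=3):
    """Sort both files together on one or more keys and pair records from
    different files that fall inside the same sliding window."""
    tagged = [("A", r) for r in a] + [("B", r) for r in b]
    tagged.sort(key=lambda t: tuple(t[1][k] for k in keys))
    for i, (src_i, ri) in enumerate(tagged):
        for src_j, rj in tagged[i + 1:i + window]:
            if src_i != src_j:
                yield (ri, rj) if src_i == "A" else (rj, ri)

# --- Comparison function phase ---
def jaro(s, t):
    """Jaro similarity between two strings, in [0, 1]."""
    if s == t:
        return 1.0
    if not s or not t:
        return 0.0
    radius = max(max(len(s), len(t)) // 2 - 1, 0)
    s_hit, t_hit = [False] * len(s), [False] * len(t)
    matches = 0
    for i, ch in enumerate(s):
        for j in range(max(0, i - radius), min(len(t), i + radius + 1)):
            if not t_hit[j] and t[j] == ch:
                s_hit[i] = t_hit[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    s_m = [c for c, h in zip(s, s_hit) if h]
    t_m = [c for c, h in zip(t, t_hit) if h]
    transpositions = sum(x != y for x, y in zip(s_m, t_m)) // 2
    return (matches / len(s) + matches / len(t)
            + (matches - transpositions) / matches) / 3

# --- Decision model phase: a simple deterministic rule ---
def deterministic(pairs, compare, threshold=0.9):
    """Declare a link when the comparison score reaches the threshold."""
    return [(ra["id"], rb["id"]) for ra, rb in pairs
            if compare(ra, rb) >= threshold]

def run_workflow(file_a, file_b, preprocess, reduce_space, compare, decide):
    """Chain one technique per phase into a single workflow."""
    a = [preprocess(r) for r in file_a]
    b = [preprocess(r) for r in file_b]
    return decide(reduce_space(a, b), compare)

# Made-up sample data
file_a = [{"id": 1, "surname": " Rossi ", "city": "Roma"},
          {"id": 2, "surname": "Bianchi", "city": "Milano"}]
file_b = [{"id": 101, "surname": "rosi", "city": "ROMA"},
          {"id": 102, "surname": "Verdi", "city": "Milano"}]

def compare_surname(ra, rb):
    return jaro(ra["surname"], rb["surname"])

# Workflow 1: normalization + blocking on city + Jaro + deterministic rule
links_blocking = run_workflow(
    file_a, file_b, normalize,
    lambda a, b: blocking(a, b, key="city"),
    compare_surname, deterministic)

# Workflow 2: same phases, but SNM on two sort variables instead of blocking
links_snm = run_workflow(
    file_a, file_b, normalize,
    lambda a, b: sorted_neighborhood(a, b, keys=("city", "surname"), window=2),
    compare_surname, deterministic)

print(links_blocking, links_snm)
```

Swapping `blocking` for `sorted_neighborhood` changes only one phase and leaves the rest of the workflow untouched, which is the kind of modularity the slides describe. A probabilistic decision model in the Fellegi and Sunter style would replace `deterministic` in the same way; it is omitted here to keep the sketch short.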

Editor's notes

  1. Record linkage techniques are a multidisciplinary set of methods and practices.
  2. RELAIS has been implemented in Java and R and has a database architecture (MySQL).