University of Pisa
Mastering the Spatio-Temporal Knowledge Discovery Process
PhD Candidate: Roberto Trasarti
PhD Thesis discussion
Spatio-Temporal context
Research on moving-object data analysis has recently been fostered by the widespread diffusion of new techniques and systems for monitoring, collecting and storing location-aware data, generated by a wealth of technological infrastructures, such as:
- Global Positioning System (GPS)
- Global System for Mobile Communications (GSM)
- Sensor networks
Knowledge Discovery Process
Knowledge discovery is a multi-step process that involves data preprocessing, pattern mining, and pattern post-processing stages.
Motivations
There is no unifying framework in which mining tools are specific components of the knowledge discovery process.
Data and models belong to different worlds, and having elements from different worlds causes an impedance mismatch.
Related Works
- The literature contains no proposals that address the problem of a uniform framework.
- Moving-object database approaches such as Secondo and Hermes provide some primitives.
- The thesis work was inspired by well-known literature on the inductive database vision.
The proposed Framework
A conceptual framework, the Two-Worlds model, provides the basis for the proposed data mining query language and the developed system.
This thesis proposes:
- A set of operators between the two worlds
Object representation of Data and Models
Using the object-relational paradigm, we represent data and models as objects.
The set of attribute types $A$ can be partitioned into three subsets: $A_s$, $A_d$ and $A_m$.
- $A_d$, data types (Data World): spatial object, temporal object, moving object
- $A_m$, model types (Model World): T-Pattern object, cluster object, flock object
Data Types
- A spatial object is an object that has a geometric shape and a position in space.
- A temporal object is an object that has an absolute temporal reference and a duration.
- A moving object is an object that changes in time and space.
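The three data types can be pictured with a minimal Python sketch. The class names, the field layout and the linear interpolation in `position_at` are assumptions made for illustration; the slides do not specify how GeoPKDD stores these objects.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class SpatialObject:
    """A geometric shape with a position: here a polygon given by its vertices."""
    vertices: Tuple[Tuple[float, float], ...]


@dataclass(frozen=True)
class TemporalObject:
    """An absolute temporal reference (start, in seconds) plus a duration."""
    start: float
    duration: float

    @property
    def end(self) -> float:
        return self.start + self.duration


@dataclass(frozen=True)
class MovingObject:
    """A non-empty, time-ordered sequence of (x, y, t) samples."""
    samples: Tuple[Tuple[float, float, float], ...]

    def position_at(self, t: float) -> Tuple[float, float]:
        """Linearly interpolate the position at time t, clamped to the end points."""
        pts = self.samples
        if t <= pts[0][2]:
            return pts[0][:2]
        for (x0, y0, t0), (x1, y1, t1) in zip(pts, pts[1:]):
            if t0 <= t <= t1:
                if t1 == t0:
                    return (x0, y0)
                w = (t - t0) / (t1 - t0)
                return (x0 + w * (x1 - x0), y0 + w * (y1 - y0))
        return pts[-1][:2]


# Hypothetical example: an object moving 10 units along x in 10 seconds.
trip = MovingObject(((0.0, 0.0, 0.0), (10.0, 0.0, 10.0)))
halfway = trip.position_at(5.0)
```

A moving object is modelled here as samples plus interpolation, which is one common choice; other representations (for example, piecewise functions stored in the database) would fit the definition equally well.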
Data-World
The D-World represents the entities to be analyzed, as well as their properties and mutual relationships.
Intuitively, the D-World is the set of entities that describe the trajectory dataset and/or a set of regions and/or a partition of the day.
The D-World is a set of tables defined only by attributes in $A_d$ and $A_s$.
Model Types
- A T-Pattern is a concise description of frequent behaviors, in terms of both space and time.
- A cluster is the spatio-temporal affinity between a set of moving objects w.r.t. a distance function.
- A flock is the spatio-temporal coincidence between a set of moving objects that move together.
[Figure: regions RegionA, RegionB and RegionC, annotated with 10 min and 5 min]
Model-World
The M-World contains all the movement patterns extracted from the data, with their properties and relationships.
It holds the collection of models unveiled at the different stages of the knowledge discovery process.
The M-World is a set of tables defined only by attributes in $A_m$ and $A_s$.
Two-Worlds Operators
Operators can be intra-world or inter-world, and different classes of operators have been defined for each type.
Data Constructor Operators
The aim of this class of operators is to build objects in the D-World starting from the raw data. It realizes the data acquisition step of the knowledge discovery process.
The generic Data Constructor operator is defined as $OP_{constructor}(T, p) \rightarrow T_d$
Model Constructor Operators
This kind of operator realizes the extraction of models from the D-World through data mining algorithms.
The generic Model Constructor operator is defined as $OP_{mining}(T_d, p) \rightarrow T_m$
Transformation Operators
Transformation operators are intra-world tasks aimed at manipulating data and models. These operations are the means for expressing data pre-processing and post-processing tasks.
The generic D-Transformation operator is defined as $OP_{D\text{-}Transf}(T_d, p) \rightarrow T'_d$
The generic M-Transformation operator is defined as $OP_{M\text{-}Transf}(T_m, p) \rightarrow T'_m$
Relation Operators
Relation operators include both intra-world and inter-world operations and have the objective of creating relations between data, models, and the combination of the two.
The generic DD-Relation operator is defined as $OP_{DD\text{-}Relation}(T_{dd}, f) \rightarrow TR_{dd}$
The generic MM-Relation operator is defined as $OP_{MM\text{-}Relation}(T_{mm}, f) \rightarrow TR_{mm}$
The generic DM-Relation operator is defined as $OP_{DM\text{-}Relation}(T_{dm}, f) \rightarrow TR_{dm}$
Predicates of relation operators
The predicate f can be chosen from a large variety of predicates. However, the semantics of these predicates depends on the type of the data (resp. model) objects to which they are applied.
[Figure: predicates for the DD, DM and MM relation operators]
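To make the role of the predicate f concrete, here is a minimal Python sketch of a DM-Relation operator. It assumes that tables are lists of rows with `id` and `obj` fields, that a trajectory is reduced to the ordered list of regions it visits, and that a pattern is an ordered list of regions; the function names and the subsequence-based predicate are hypothetical simplifications, not the predicates defined in the thesis.

```python
from itertools import product
from typing import Any, Callable, Dict, List, Sequence

Table = List[Dict[str, Any]]


def is_subsequence(pattern: Sequence[str], visited: Sequence[str]) -> bool:
    """True if the regions of `pattern` appear in `visited` in the same order."""
    it = iter(visited)
    # `region in it` advances the iterator, so order is respected.
    return all(region in it for region in pattern)


def op_dm_relation(data: Table, models: Table,
                   f: Callable[[Any, Any], bool]) -> Table:
    """Relate every data object to every model object for which f holds."""
    return [
        {"data_id": d["id"], "model_id": m["id"]}
        for d, m in product(data, models)
        if f(d["obj"], m["obj"])
    ]


# Hypothetical toy tables: two trajectories and one pattern.
trajectories = [
    {"id": 1, "obj": ["A", "B", "C"]},
    {"id": 2, "obj": ["C", "A"]},
]
patterns = [{"id": 10, "obj": ["A", "C"]}]

relation = op_dm_relation(
    trajectories, patterns,
    lambda traj, pat: is_subsequence(pat, traj),
)
```

The operator itself is independent of the object types; only the predicate passed as `f` needs to know what a trajectory or a pattern looks like, which mirrors the remark that predicate semantics depend on the objects involved.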
Data Mining Query Language
We defined a data mining query language to support the user during knowledge discovery tasks.
Advantages:
- Iterative querying
- Repeatability of the process
The Design of the GeoPKDD system
The GeoPKDD system is an implementation of the Two-Worlds model and the Data Mining Query Language.
Object Relational Database and Database Manager
As described above, the object-relational database contains the representation of both data and models and grants the power of SQL.
The database manager realizes a middle layer and, using the translation libraries, decouples the system from the underlying database technologies.
Language Parser and Controller
The parser identifies the various types of queries and builds an execution plan for each of them as a sequence of actions for the controller.
Example:
    CREATE MODELS ClusteringTable USING OPTICS
    FROM (Select t.id, t.trajobj from Trajectories t)
    SET OPTICS.distance_method = Route Similarity AND
        OPTICS.eps = 50 AND
        OPTICS.min_size = 100
Plan:
    Retrieve[ Select t.id, t.trajobj from Trajectories t ]
    Translate[ Data type: Moving point ]
    Execute[ Mining algorithm: Optics algorithm, Parameters: ... ]
    Translate[ Model type: Cluster ]
    Store[ Table Name: ClusteringTable ]
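The step from query text to plan can be sketched in Python as follows. This is only an illustration under stated assumptions: the regular expression covers just the `CREATE MODELS ... USING ... FROM (...) SET ...` shape of the example, the inner query is assumed to contain no parentheses, and `MODEL_TYPES` is a hypothetical lookup table. The actual GeoPKDD parser is not described in the slides.

```python
import re
from typing import Dict, List, Tuple

# Matches only the CREATE MODELS form shown in the example above.
QUERY_RE = re.compile(
    r"CREATE\s+MODELS\s+(?P<table>\w+)\s+USING\s+(?P<algo>[\w-]+)\s+"
    r"FROM\s*\((?P<source>.*?)\)\s*"
    r"(?:SET\s+(?P<params>.*))?$",
    re.IGNORECASE | re.DOTALL,
)

# Hypothetical mapping from a mining algorithm to the model type it produces.
MODEL_TYPES = {"OPTICS": "Cluster", "T-PATTERN": "T-Pattern"}


def build_plan(query: str) -> List[Tuple[str, object]]:
    """Turn a CREATE MODELS query into an ordered list of controller actions."""
    m = QUERY_RE.match(query.strip())
    if m is None:
        raise ValueError("not a CREATE MODELS query")
    params: Dict[str, str] = {}
    if m.group("params"):
        clauses = re.split(r"\s+AND\s+", m.group("params").strip(),
                           flags=re.IGNORECASE)
        for clause in clauses:
            key, _, value = clause.partition("=")
            params[key.strip()] = value.strip()
    algo = m.group("algo").upper()
    return [
        ("Retrieve", " ".join(m.group("source").split())),
        ("Translate", "Data type: Moving point"),
        ("Execute", (algo, params)),
        ("Translate", "Model type: " + MODEL_TYPES.get(algo, "unknown")),
        ("Store", m.group("table")),
    ]


EXAMPLE_QUERY = """
CREATE MODELS ClusteringTable USING OPTICS
FROM (Select t.id, t.trajobj from Trajectories t)
SET OPTICS.distance_method = Route Similarity AND
    OPTICS.eps = 50 AND
    OPTICS.min_size = 100
"""

plan = build_plan(EXAMPLE_QUERY)
for action, argument in plan:
    print(action, argument)
```

Keeping the plan as plain data (a list of action tuples) lets a controller execute, log or replay it step by step, which is in the spirit of the repeatability the query language aims for.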
Algorithms Manager
This component is a plug-in module capable of managing different sets of libraries. Each library realizes a different set of operators according to the proposed Two-Worlds framework.
Algorithms Libraries
Data construction library:
- Moving object Reconstruction algorithm
- Spatial object Builder algorithm
- Temporal object Builder algorithm
Model construction library:
- T-Pattern algorithm
- Optics algorithm
- T-Flock algorithm
Transformation library:
- Resampling algorithm
- Intersection algorithm
- Object filtering
- T-Anonymity algorithms
Relation library:
- All the predicates
Example queries:
    CREATE DATA MobilityData BUILDING MOVING_POINTS
    FROM (SELECT userid, lon, lat, datetime
          FROM MobilityRawData
          ORDER BY userid, datetime)
    SET MOVING_POINT.MAX_SPACE_GAP = 2000m AND
        MOVING_POINT.MAX_TIME_GAP = 1800 sec

    CREATE MODELS Patterns USING T-PATTERN
    FROM (Select t.id, t.trajobj from Trajectories t)
    SET T-PATTERN.support = .02 AND
        T-PATTERN.time = 120 sec

    CREATE TRANSFORMATION AnonimizedData USING NWA
    FROM (SELECT t.id, t.trajobj FROM Trajectories t)
    SET ANONYMIZATION.K = 10 AND
        ANONYMIZATION.TIME_SLOT = 600 sec

    CREATE RELATION EntailmentTable USING ENTIAL
    FROM (SELECT t.id, t.trajobj, p.id, p.obj FROM Trajectories t, Patterns p)
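The CREATE DATA query above passes MAX_SPACE_GAP and MAX_TIME_GAP to the moving-object reconstruction step. One plausible reading, sketched below in Python, is that a user's ordered points are cut into a new trajectory whenever two consecutive points are farther apart in space or time than these thresholds. That interpretation, the haversine distance and the function names are assumptions; the slides do not describe the reconstruction algorithm itself.

```python
import math
from typing import List, Sequence, Tuple

# (userid, lon, lat, time in seconds), mirroring the columns of the query.
Point = Tuple[str, float, float, float]


def haversine_m(lon1: float, lat1: float, lon2: float, lat2: float) -> float:
    """Great-circle distance in metres between two lon/lat points."""
    r = 6371000.0  # mean Earth radius in metres
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(a))


def reconstruct(points: Sequence[Point],
                max_space_gap_m: float = 2000.0,
                max_time_gap_s: float = 1800.0) -> List[List[Point]]:
    """Split raw points into trajectories at user changes and large gaps."""
    trajectories: List[List[Point]] = []
    current: List[Point] = []
    for p in sorted(points, key=lambda q: (q[0], q[3])):
        if current:
            prev = current[-1]
            new_user = prev[0] != p[0]
            too_far = haversine_m(prev[1], prev[2], p[1], p[2]) > max_space_gap_m
            too_late = p[3] - prev[3] > max_time_gap_s
            if new_user or too_far or too_late:
                trajectories.append(current)
                current = []
        current.append(p)
    if current:
        trajectories.append(current)
    return trajectories


# Hypothetical raw data: u1 pauses for more than 1800 s before the third point.
raw = [
    ("u1", 10.400, 43.72, 0.0),
    ("u1", 10.401, 43.72, 60.0),
    ("u1", 10.401, 43.72, 4000.0),
    ("u2", 10.500, 43.70, 30.0),
]
trips = reconstruct(raw)
```

With the defaults taken from the query (2000 m and 1800 sec), the toy data yields three trajectories: two for `u1`, split at the long pause, and one for `u2`.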
Extending the system
The GeoPKDD system provides various ways to be extended:
- Algorithm level: new algorithms
- Types level: new data types or model types
Add-ons: Location Prediction
The goal is to construct a predictive model using the set of T-patterns extracted from a set of trajectories. Given a new trajectory, the predictive model can be used to predict its next location.
[Figure: Trajectory dataset, Local patterns, Prediction Tree]
    CREATE TRANSFORMATION TPatternTree USING TPATTERN_TREE
    FROM (Select p.id, p.TpatternObj FROM PatternTable p)
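The idea of turning local patterns into a prediction tree can be illustrated with a small Python sketch. It is a deliberate simplification and not the thesis's algorithm: patterns are reduced to region sequences with a support value, transition times are ignored, and prediction simply follows the longest recent suffix of the trajectory that is a path from the root. All names are hypothetical.

```python
from typing import Dict, List, Optional, Sequence, Tuple


class Node:
    """A prefix-tree node keeping the best support seen on the way through it."""

    def __init__(self) -> None:
        self.support = 0.0
        self.children: Dict[str, "Node"] = {}


def build_tree(patterns: Sequence[Tuple[Sequence[str], float]]) -> Node:
    """Insert every pattern (region sequence, support) as a path from the root."""
    root = Node()
    for regions, support in patterns:
        node = root
        for region in regions:
            node = node.children.setdefault(region, Node())
            node.support = max(node.support, support)
    return root


def predict_next(root: Node, visited: Sequence[str]) -> Optional[str]:
    """Return the best-supported continuation of the longest matching suffix."""
    for start in range(len(visited)):
        node: Optional[Node] = root
        for region in visited[start:]:
            node = node.children.get(region)
            if node is None:
                break  # this suffix is not a path in the tree
        else:
            if node.children:
                best = max(node.children.items(), key=lambda kv: kv[1].support)
                return best[0]
    return None


# Hypothetical patterns over regions A..E with their supports.
toy_patterns: List[Tuple[List[str], float]] = [
    (["A", "B", "C"], 0.05),
    (["A", "B", "D"], 0.03),
    (["B", "E"], 0.04),
]
tree = build_tree(toy_patterns)
prediction = predict_next(tree, ["X", "A", "B"])
```

In the toy data, a trajectory that has just visited A and then B is predicted to continue to C, because the path A, B, C carries a higher support than A, B, D.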
Add-ons: K-Best Map Matching
A new way to perform map matching. In real cases the shortest-path assumption can be violated when external factors play a role (e.g., traffic congestion).
    CREATE DATA K-MobilityData BUILDING K-MOVING_POINTS
    FROM (SELECT userid, lon, lat, datetime
          FROM MobilityRawData
          ORDER BY userid, datetime)
    SET K-MOVING_POINTS.K = 5 AND
        K-MOVING_POINTS.MAP = StreetMapFile.wkt
A Case Study in an Urban Mobility Scenario
A set of experiments performed on a real-world case study demonstrates the capabilities of the GeoPKDD system and how it can be exploited to extract useful knowledge from raw mobility data.
- 17K private cars
- One week of ordinary mobility
- 200K trips (trajectories)

Roberto Trasarti PhD Thesis

  • 1. University of Pisa. Mastering the Spatio-Temporal Knowledge Discovery Process. PhD Candidate: Roberto Trasarti. PhD Thesis discussion
  • 2. Spatio-Temporal context Research on moving-object data analysis has recently been fostered by the widespread diffusion of new techniques and systems for monitoring, collecting and storing location-aware data, generated by a wealth of technological infrastructures such as the Global Positioning System (GPS), the Global System for Mobile Communications (GSM) and sensor networks.
  • 3. Knowledge Discovery Process Knowledge discovery is a multi-step process that involves data preprocessing, pattern mining and pattern post-processing stages.
  • 4. Motivations There is no unifying framework in which mining tools are specific components of the knowledge discovery process. Data and models belong to different worlds, and combining elements from different worlds causes an impedance mismatch.
  • 5. Related Works The literature contains no proposals that address the problem of a uniform framework. There are approaches based on Moving Objects Databases, such as Secondo and Hermes, which provide some primitives. The thesis work has been inspired by well-known literature on the inductive database vision.
  • 8. Object representation of Data and Models Using the object-relational paradigm, we represent data and models as objects. The set of attribute types A can be partitioned into three subsets: As, Ad and Am. Ad contains the data types (Data World): spatial object, temporal object, moving object. Am contains the model types (Model World): T-Pattern object, cluster object, flock object.
  • 9. Data Types A spatial object is an object which has a geometric shape and a position in space. A temporal object is an object which has an absolute temporal reference and a duration. A moving object is an object which changes in time and space.
  • 10. Data-World The D-World represents the entities to be analyzed, as well as their properties and mutual relationships. Intuitively the D-World is the set of entities which describe the trajectory dataset and/or a set of regions and/or a partition of the day. The D-World is a set of tables defined only by attributes in Ad and As
  • 11. Model Types A T-Pattern is a concise description of frequent behaviors, in terms of both space and time. A Cluster is the spatio-temporal affinity between a set of moving objects w.r.t. a distance function. A Flock is the spatio-temporal coincidence between a set of moving objects that move together. [Figure: T-Pattern example over Region A, Region B and Region C with transition times of 10 min and 5 min]
  • 12. Model-World The M-World contains all the movement patterns extracted from the data with their properties and relationships. The M-World contains the collection of models, unveiled at the different stages of the knowledge discovery process. The M-World is a set of tables defined only by attributes in Am and As
  • 13. Two-Worlds Operators Operators can be intra-world or inter-world and for each type different classes of operators have been defined.
  • 14. Data Constructor Operators The aim of this class of operators is to build objects in the D-World starting from the raw data. It realizes the data acquisition step of the knowledge discovery process. The generic Data Constructor operator is defined as OPconstructor(T, p) → Td
  • 15. Model Constructor Operators This kind of operator realizes the extraction of models from the D-World through data mining algorithms. The generic Model Constructor operator is defined as OPmining(Td, p) → Tm
  • 16. Transformation Operators Transformation operators are intra-world tasks aimed at manipulating data and models. These operations are the means for expressing data pre-processing and post-processing tasks. The generic D-Transformation operator is defined as OPD-Transf(Td, p) → T’d. The generic M-Transformation operator is defined as OPM-Transf(Tm, p) → T’m
  • 17. Relation Operators Relation operators include both intra-world and inter-world operations and have the objective of creating relations between data, models, and the combination of the two. The generic DD-Relation operator is defined as OPDD-Relation(Tdd, f) → TRdd. The generic MM-Relation operator is defined as OPMM-Relation(Tmm, f) → TRmm. The generic DM-Relation operator is defined as OPDM-Relation(Tdm, f) → TRdm
  • 18. Predicates of relation operators The predicate f can take a large variety of forms. However, the semantics of each predicate depends on the type of the data (resp. model) objects to which it is applied. Predicates are defined for DD, DM and MM relations.
  • 22. The Design of the GeoPKDD system The GeoPKDD system is an implementation of the Two-Worlds model and the Data Mining Query Language.
  • 23. Object Relational Database and Database Manager As described above, the object-relational database contains the representation of both data and models and grants the power of SQL. The database manager realizes a middle layer and, using the translation libraries, detaches the system from the database technologies.
  • 24. Language Parser and Controller The parser identifies the various types of queries and builds an execution plan for each of them as a sequence of actions for the controller. Example: CREATE MODELS ClusteringTable USING OPTICS FROM (Select t.id, t.trajobj from Trajectories t) SET OPTICS.distance_method = Route Similarity AND OPTICS.eps = 50 AND OPTICS.min_size = 100 Plan: Retrieve[ Select t.id, t.trajobj from Trajectories t ] Translate[ Data type: Moving point ] Execute[ Mining algorithm: Optics algorithm, Parameters: ... ] Translate[ Model type: Cluster ] Store[ Table Name: ClusteringTable ]
  • 25. Algorithms Manager This component is a plug-in module capable of managing different sets of libraries. Each library realizes a different set of operators according to the proposed Two-Worlds framework.
  • 26. Algorithms Libraries Data construction library: Moving object Reconstruction algorithm, Spatial object Builder algorithm, Temporal object Builder algorithm. Model construction library: T-Pattern algorithm, Optics algorithm, T-Flock algorithm. Transformation library: Resampling algorithm, Intersection algorithm, Object filtering, T-Anonymity algorithms. Relation library: all the predicates. Example queries, one per library: CREATE DATA MobilityData BUILDING MOVING_POINTS FROM (SELECT userid,lon,lat,datetime FROM MobilityRawData ORDER BY userid,datetime) SET MOVING_POINT.MAX_SPACE_GAP = 2000m AND MOVING_POINT.MAX_TIME_GAP = 1800 sec; CREATE MODELS Patterns USING T-PATTERN FROM (Select t.id, t.trajobj from Trajectories t) SET T-PATTERN.support = .02 AND T-PATTERN.time = 120 sec; CREATE TRANSFORMATION AnonimizedData USING NWA FROM (SELECT t.id, t.trajobj FROM Trajectories t) SET ANONYMIZATION.K = 10 AND ANONYMIZATION.TIME_SLOT = 600 sec; CREATE RELATION EntailmentTable USING ENTIAL FROM (SELECT t.id, t.trajobj, p.id, p.obj FROM Trajectories t, Patterns p)
  • 28. Algorithm level: new algorithms
  • 30. Add-ons: Location Prediction The goal is to construct a predictive model using the set of T-patterns extracted from a set of trajectories. Given a new trajectory, the predictive model can be used to predict its next location. Pipeline: Trajectory dataset, Local patterns, Prediction Tree. CREATE TRANSFORMATION TPatternTree USING TPATTERN_TREE FROM (Select p.id, p.TpatternObj FROM PatternTable p)
  • 31. Add-ons: K-Best Map Matching A new way to perform map matching. In real cases the shortest-path assumption can be violated when other external factors play a role (e.g. traffic congestion). CREATE DATA K-MobilityData BUILDING K-MOVING_POINTS FROM (SELECT userid, lon, lat, datetime FROM MobilityRawData ORDER BY userid, datetime) SET K-MOVING_POINTS.K = 5 AND K-MOVING_POINTS.MAP = StreetMapFile.wkt
  • 34. One week of ordinary mobility
  • 37. Demo GeoPKDD system: equipped with a very simple GUI which enables the user to write DMQL queries and visualize the results. M-Atlas: the new generation of the GUI, where the DMQL is used to build complex analyses by creating scripts.
  • 39. the definition of a DMQL which realizes the operators of the framework
  • 40. the implementation of a real system capable of handling large amounts of data
  • 41. three extensions of the system: reasoning component, k-best map matching and location prediction algorithms
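
The Language Parser and Controller slide turns a CREATE MODELS statement into a plan of Retrieve, Translate, Execute, Translate and Store actions. The Python sketch below illustrates that idea only; it is not the GeoPKDD parser. The regular expression, the `Step` class, the `build_plan` function and the `MODEL_TYPES` table are all hypothetical names introduced here, and the sketch assumes the inner query contains no nested parentheses and that the input data type is always a moving point.

```python
import re
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Step:
    """One action for the controller (hypothetical structure)."""
    action: str                      # Retrieve, Translate, Execute or Store
    args: dict = field(default_factory=dict)


# Assumed mapping from mining algorithm to the model type it produces,
# based on the model types and algorithms named in the slides.
MODEL_TYPES = {"OPTICS": "Cluster", "T-PATTERN": "T-Pattern", "T-FLOCK": "Flock"}

# Hypothetical grammar covering only the CREATE MODELS form.
_CREATE_MODELS = re.compile(
    r"CREATE\s+MODELS\s+(?P<table>\w+)\s+"
    r"USING\s+(?P<algo>[\w-]+)\s+"
    r"FROM\s*\((?P<query>.*?)\)\s*"
    r"(?:SET\s+(?P<params>.*))?$",
    re.IGNORECASE | re.DOTALL,
)


def build_plan(statement: str) -> list:
    """Translate one CREATE MODELS statement into an ordered list of steps."""
    m = _CREATE_MODELS.match(statement.strip())
    if m is None:
        raise ValueError("not a CREATE MODELS statement")

    # Parameters are 'name = value' pairs joined by AND.
    params = {}
    if m["params"]:
        for item in re.split(r"\s+AND\s+", m["params"].strip(), flags=re.IGNORECASE):
            name, value = item.split("=", 1)
            params[name.strip()] = value.strip()

    algo = m["algo"].upper()
    return [
        Step("Retrieve", {"query": m["query"].strip()}),
        Step("Translate", {"data_type": "Moving point"}),
        Step("Execute", {"algorithm": algo, "parameters": params}),
        Step("Translate", {"model_type": MODEL_TYPES.get(algo, "Unknown")}),
        Step("Store", {"table": m["table"]}),
    ]
```

Calling `build_plan` on the OPTICS query from the slide yields five steps in the same order as the plan shown there, with the SET clause split into a dictionary of parameter names and string values. Keeping the plan as plain data, separate from execution, mirrors the split between parser and controller described in the slides.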