Search, Discovery and Analysis of Sensory Data Streams
Payam Barnaghi
Centre for Vision, Speech and Signal Processing (CVSSP), University of Surrey
Care Technology & Research Centre, The UK Dementia Research Institute (DRI)
SAW2019: 1st International Workshop on Sensors and Actuators on the Web
46 years ago on the 5th of November (submission day)
Source: https://www.cs.princeton.edu/courses/archive/fall06/cos561/papers/cerf74.pdf
• A 32-bit IP address was used, of which the first 8 bits signified the network and the remaining 24 bits designated the host on that network.
• The assumption was that 256 networks would be sufficient for the foreseeable future…
• Obviously this was before LANs (Ethernet was under development at Xerox PARC at that time).
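The address layout described above can be sketched in a few lines. This is a minimal illustration, assuming the 32-bit address is held as an unsigned integer; the function name `split_address` and the sample value are hypothetical and not taken from the cited paper.

```python
from typing import Tuple

NETWORK_BITS = 8   # leading bits that identify the network
HOST_BITS = 24     # remaining bits that identify the host on that network


def split_address(address: int) -> Tuple[int, int]:
    """Split a 32-bit address into (network, host) using the 8/24 layout."""
    if not 0 <= address < 2 ** (NETWORK_BITS + HOST_BITS):
        raise ValueError("address must fit in 32 bits")
    network = address >> HOST_BITS
    host = address & ((1 << HOST_BITS) - 1)
    return network, host


# An 8-bit network field allows at most 2**8 = 256 distinct networks.
MAX_NETWORKS = 2 ** NETWORK_BITS

# Hypothetical sample address, chosen only for illustration.
network, host = split_address(0x0A000001)
print(network, host, MAX_NETWORKS)
```

With only 8 bits for the network field, the 256-network ceiling follows directly from the layout.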
Around 20 years later…
Web search in the early days
And there came Google!
Google says that the web now has over 30 trillion unique individual pages. That figure is probably not even that relevant anymore; lots of resources are dynamic…
The Crawling problem
Source: https://www.bruceclay.com/seo/submit-website/
The Web content search lifecycle
− Creation
− Upload
− Crawling
− Indexing
− Delete/Update
− Query
− Search and discovery
− Processing
− Ranking
− Presentation
Content
Access
However, not only pages are on the web…
Image source: Youmegeek.com
Internet of Things (IoT) Search
http://Thingful.net
Image sources: Wolfram Alpha
Search and automation
source: Passler.com
Sensory data
Sensor Data Flow on the Web
P. Barnaghi, A. Sheth, “On Searching the Internet of Things: Requirements and Challenges”, IEEE Intelligent Systems, 2016.
https://iotcrawler.eu
Searching for…
(Y. Fathy, P. Barnaghi, et al., 2018)
Searching for Sensory Devices (i.e. Resources)
Semantic models
LSM: A Semantic Approach
(Danh Le-Phuoc et al., ISWC, 2011)
A discovery engine for the IoT
(HosseiniTabatabaie, Barnaghi et al., 2018)
A GMM model for indexing
Average success rates: first attempt 92.3% (min); at first DS 92.5% (min); at first DSL2 98.5% (min)
[Figure: percentage of the total queries (log scale, 10^-4 to 10^0) versus number of attempts (0 to 60), for DSL2 capacity 1, 2, 3 and 4]
However, there are also other possible solutions:
(Y. Fathy, P. Barnaghi, et al., 2017)
(A. HosseiniTabatabaie, P. Barnaghi et al., 2019)
The Crawling and Update Issue
The Crawling Challenge
− Uniform policy: re-visiting all pages in the collection with
the same frequency, regardless of their rates of change.
− Proportional policy: re-visiting more often the pages that
change more frequently. The visiting frequency is directly
proportional to the (estimated) change frequency.
Cho, Junghoo; Garcia-Molina, Hector (2003). "Effective page refresh policies for Web
crawlers". ACM Transactions on Database Systems. 28 (4): 390–426.
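The two policies can be written as simple budget allocations. The sketch below is illustrative only: the function names, the notion of a fixed crawl budget (total visits per day) and the sample change rates are assumptions, not taken from the cited paper.

```python
from typing import List


def uniform_policy(change_rates: List[float], budget: float) -> List[float]:
    """Visit every page at the same frequency, ignoring its change rate."""
    return [budget / len(change_rates) for _ in change_rates]


def proportional_policy(change_rates: List[float], budget: float) -> List[float]:
    """Visit each page at a frequency proportional to its estimated change rate."""
    total_rate = sum(change_rates)
    return [budget * rate / total_rate for rate in change_rates]


# Hypothetical example: one page changes 9 times a day, the other once a day,
# and the crawler can afford 10 visits a day in total.
rates = [9.0, 1.0]
print(uniform_policy(rates, budget=10.0))
print(proportional_policy(rates, budget=10.0))
```

In this example the uniform policy gives each page 5 visits a day, while the proportional policy gives 9 visits to the fast page and 1 to the slow one.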
Web Crawling
− Cho and Garcia-Molina proved the surprising result that, in terms of average freshness, the uniform policy outperforms the proportional policy in both a simulated Web and a real Web crawl.
− The reason is that a crawler with a limited budget ends up allocating too many new crawls to rapidly changing pages at the expense of less frequently updating pages.
− A proportional policy allocates more resources to crawling frequently updating pages, but experiences less overall freshness time from them.
Source: Wikipedia
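A small calculation makes the result concrete. The sketch below assumes that each page changes according to a Poisson process with rate λ and is revisited at regular intervals with frequency f; under that model the time-averaged freshness of a page is (1 − e^(−λ/f)) / (λ/f). The two-page collection, the change rates and the visit frequencies are hypothetical values chosen for illustration.

```python
import math
from typing import List


def expected_freshness(change_rate: float, visit_freq: float) -> float:
    """Time-averaged probability that the local copy is up to date,
    assuming Poisson changes and revisits at regular intervals."""
    if change_rate == 0:
        return 1.0  # a page that never changes is always fresh
    if visit_freq <= 0:
        return 0.0  # a changing page that is never revisited is treated as stale
    r = change_rate / visit_freq
    return (1.0 - math.exp(-r)) / r


def average_freshness(change_rates: List[float], visit_freqs: List[float]) -> float:
    """Mean of the per-page expected freshness over the collection."""
    pairs = zip(change_rates, visit_freqs)
    return sum(expected_freshness(lam, f) for lam, f in pairs) / len(change_rates)


rates = [9.0, 1.0]          # changes per day (hypothetical)
uniform = [5.0, 5.0]        # same visit frequency for every page
proportional = [9.0, 1.0]   # visit frequency proportional to change rate

uniform_avg = average_freshness(rates, uniform)
proportional_avg = average_freshness(rates, proportional)
print(f"uniform: {uniform_avg:.3f}, proportional: {proportional_avg:.3f}")
```

With the same total budget of 10 visits a day, the uniform allocation yields the higher average freshness in this example, because the extra visits spent on the fast page buy little freshness while the slow page is left stale for longer.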
Crawling and the Freshness Issue
− To improve freshness, the crawler should penalise the
elements that change too often.
− The optimal re-visiting policy is neither the uniform policy
nor the proportional policy.
− The optimal method for keeping average freshness high includes ignoring the pages that change too often, and the optimal method for keeping average age low is to use access frequencies that monotonically (and sub-linearly) increase with the rate of change of each page.
Junghoo Cho; Hector Garcia-Molina (2003). "Estimating frequency of change". ACM
Transactions on Internet Technology. 3 (3): 256–290.
Source: Wikipedia
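The claim that the freshness-optimal policy ignores pages that change too often can be checked numerically. The sketch below assumes a Poisson change model and regular revisit intervals, and uses a brute-force grid search over a hypothetical two-page collection rather than the analytical solution from the cited work; all names and numbers are illustrative.

```python
import math
from typing import Tuple


def expected_freshness(change_rate: float, visit_freq: float) -> float:
    """Time-averaged freshness under Poisson changes and regular revisits."""
    if change_rate == 0:
        return 1.0
    if visit_freq <= 0:
        return 0.0
    r = change_rate / visit_freq
    return (1.0 - math.exp(-r)) / r


def best_split(slow_rate: float, fast_rate: float, budget: float,
               steps: int = 200) -> Tuple[float, float]:
    """Grid-search how to split a crawl budget between a slow and a fast page
    so that the average freshness of the two pages is maximised."""
    best = (0.0, budget)
    best_score = -1.0
    for i in range(steps + 1):
        slow_freq = budget * i / steps
        fast_freq = budget - slow_freq
        score = (expected_freshness(slow_rate, slow_freq)
                 + expected_freshness(fast_rate, fast_freq)) / 2
        if score > best_score:
            best, best_score = (slow_freq, fast_freq), score
    return best


# Hypothetical pages: one changes once a day, the other 100 times a day;
# the crawler can afford 2 visits a day in total.
print(best_split(slow_rate=1.0, fast_rate=100.0, budget=2.0))
```

With these numbers the search assigns the whole budget to the slowly changing page: visiting the page that changes 100 times a day buys almost no freshness, so the best allocation under this model skips it.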
Searching the content of data streams
Patterns and segmentation of time-series data
But the data is often multidimensional and multivariate
Credit: Shirin Enshaeifar, CR&T Centre, UK Dementia Research Institute/CVSSP, Uni of Surrey
Creating patterns from streaming data
(Gonzalez-Vidal, Barnaghi, Skarmeta, IEEE TKDE, 2018)
IoTCrawler search engine
http://iot-crawler.ee.surrey.ac.uk/search-engine/
Pattern analysis
[Figure: two panels, each showing an aggregated daily pattern (2 weeks); axes: Days and Time]
(Enshaeifar, Barnaghi, et al., PLoS ONE, 2018)
Developing end-to-end solutions
(Enshaeifar, Barnaghi, et al., 2019)
Some of the Research Challenges
− Provenance monitoring and fact-checking algorithms and tools
− Dealing with noisy, incomplete and dynamic data
− Handling and processing large data streams, search and identification of patterns
− Crawling, search and query of changing data
− Multi-modal information analysis and continual and adaptive learning algorithms
− Security, privacy, trust and accessibility
− Solutions to keep (and make) the Web a safe, open, inclusive and collaborative environment
Some (other) important issues
How representative is your data?
The issue of trust and reliability
How stable are the models that you learn from your data?
Credits: Roonak Rezvani, CR&T Centre, UK Dementia Research Institute/CVSSP, Uni of Surrey
Dynamicity and machine learning issue
− Noise and missing data
− Pattern and change representation
− Continual and adaptive learning
− Network and causation analysis
Avoid (unnecessary) complexity
Be ready for setbacks
References
− S. Enshaeifar et al., "Health management and pattern analysis of daily living activities of people with Dementia using in-home sensors and machine learning techniques", PLoS ONE 13(5): e0195605, 2018.
− A. González Vidal, P. Barnaghi, A. F. Skarmeta, "BEATS: Blocks of Eigenvalues Algorithm for Time series Segmentation", IEEE Transactions on Knowledge and Data Engineering (TKDE), 2018.
− Y. Fathy, P. Barnaghi, R. Tafazolli, "An Online Adaptive Algorithm for Change Detection in Streaming Sensory Data", IEEE Systems Journal, 2018.
− Y. Fathy, P. Barnaghi, R. Tafazolli, "Large-Scale Indexing, Discovery and Ranking for the Internet of Things (IoT)", ACM Computing Surveys, 2017.
− S. A. Hosieni Tabatabaei, Y. Fathy, P. Barnaghi, C. Wang, R. Tafazolli, "A Novel Indexing Method for Scalable IoT Source Lookup", IEEE Internet of Things Journal, 2018.
− Y. Fathy, P. Barnaghi, R. Tafazolli, "Distributed Spatial Indexing for the Internet of Things Data Management", Proc. of IFIP/IEEE International Symposium on Integrated Network Management, Lisbon, Portugal, May 2017.
Acknowledgments
Thank you!
http://personal.ee.surrey.ac.uk/Personal/P.Barnaghi/
@pbarnaghi
p.barnaghi@surrey.ac.uk
https://ukdri.ac.uk/team/payam-barnaghi

Editor's notes

  1. The entropy of the (x, y, z) triple on D, where D is the set of data items