Social Web
                                            Lecture VI
                    How can we MINE, ANALYSE and VISUALIZE the Social
                                Web? (II): The Web Science

                                            Lora Aroyo
                                         The Network Institute
                                        VU University Amsterdam
                               (based on slides from Les Carr, Nigel Shadbolt)




Tuesday, March 13, 12
The Web
                   the most used and one of the most transformative
                 applications in the history of computing, e.g. how the
                     Social Web has transformed the world's
                                    communication

                               approximately 10^10 people
                             more than 10^11 web documents

Web is NOT a Thing
   •       it’s not a verb, or a
           noun

   •       it’s a performance, not
           an object

   •       co-constructed with
           society

   •       activity of individuals
           who create interlinked content
           that both reflects and
           reinforces the
           interlinkedness of
           society and social
           interaction                  ... and a record of
                                        that performance
The Web
                                   Great success as a technology,
                        it’s built on significant computing infrastructure,
                                                 but
                             as an entity surprisingly unstudied




Science & Engineering
                    • physical science: analytic discipline to find laws
                        that generate or explain observed phenomena
                    • CS is mainly synthetic: formalisms & algorithms
                        are created to support specific desired
                        behaviors
                    • Web Science: web needs to be studied &
                        understood as a phenomenon but also to be
                        engineered for future growth and capabilities


L.A. Carr, C.J. Pope, W. Hall, N.R. Shadbolt
                                  http://webscience.ecs.soton.ac.uk/
Simple micro rules give
                   rise to complex macro
                         phenomena
                        •   at microscale an infrastructure of artificial languages and
                            protocols: a piece of engineering
                        •   however, interaction of people creating, linking and
                            consuming information generates web's behavior as
                            emergent properties at macroscale
                        •   properties require new analytic methods to be
                            understood
                        •   some properties are desirable and are to be engineered
                            in, others are undesirable and if possible engineered out
A new way of software
                       development
                    •   software applications designed based on appropriate
                        technology (algorithm, design) and with envisioned
                        'social' construct
                    •   usually tested in the small, testing microscale properties
                    •   a macrosystem evolving from people using the
                        microsystem and interacting in often unpredicted ways, is
                        far more interesting and must be analyzed in different
                        ways
                    •   also the macrosystems exhibit challenges that do not
                        exist at microscale

Evolution of Search
                                 Engines
                              1: techniques designed to rank documents
                          2: people were gaming to influence algorithms &
                                      improve their search rank
                        3: adapt search technologies to defeat this influence




The Web Graph
        •       to understand the web, in good
                CS tradition, we look at the graph
              •         nodes are web pages (HTML)
              •         edges are hypertext links
                        between nodes
        •       first analysis shows that in-degree
                and out-degree follow power law
                distribution => shown to hold for
                large samples
        •       this gave insight into the growth of
                the web
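A minimal sketch of the first step of such an analysis: counting how many pages have each in-degree and out-degree. The edge list and page names are hypothetical toy data, far too small to show a power law; on a real crawl the same histograms would be plotted on log-log axes to check for one.

```python
from collections import Counter

def degree_distributions(edges):
    """Return (in_degree_histogram, out_degree_histogram) for a directed edge list.

    Each histogram maps a degree k to the number of pages with that degree.
    """
    in_deg = Counter()
    out_deg = Counter()
    pages = set()
    for source, target in edges:
        out_deg[source] += 1   # one more hyperlink leaving `source`
        in_deg[target] += 1    # one more hyperlink pointing at `target`
        pages.update((source, target))
    # Pages missing from a Counter have degree 0.
    in_hist = Counter(in_deg[p] for p in pages)
    out_hist = Counter(out_deg[p] for p in pages)
    return in_hist, out_hist

# Hypothetical toy web: most pages link to "hub".
toy_edges = [("a", "hub"), ("b", "hub"), ("c", "hub"),
             ("d", "hub"), ("a", "b"), ("hub", "a")]
in_hist, out_hist = degree_distributions(toy_edges)
```

Here one page has in-degree 4 while the others have 0 or 1, a tiny instance of the skew that the power-law observation describes.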


Search Algorithms
       • the Web graph is also the
               basis of algorithms for
               search engines:
             • HITS or PageRank
                        assume that inserting
                        a hyperlink symbolizes
                        an endorsement of
                        authority of the page
                        linked to
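The endorsement idea can be sketched as a textbook power-iteration PageRank. This is an illustrative simplification, not the algorithm as deployed by any search engine; the damping factor 0.85, the iteration count and the toy link structure are assumptions.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Power-iteration PageRank over a dict {page: [pages it links to]}."""
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Rank held by pages without out-links is spread evenly over all pages.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new_rank = {p: (1.0 - damping) / n + damping * dangling / n for p in pages}
        for page, targets in links.items():
            if targets:
                # Each hyperlink passes on an equal share of the linking page's rank.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
        rank = new_rank
    return rank

# Hypothetical toy web: "hub" is endorsed by three other pages.
toy_links = {"a": ["hub"], "b": ["hub"], "c": ["hub", "a"], "hub": ["a"]}
ranks = pagerank(toy_links)
best = max(ranks, key=ranks.get)
```

In this toy graph the page with the most incoming links ends up with the highest rank, and the ranks sum to 1.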

User State is Important
                    •   the original Web graph is too simple: it starts from quasi-static
                        HTML
                        •   for personalization or customization, different representations
                            (of sources) may be served to different requesters, e.g. via cookies
                    •   graph-based models often do not account for this sort of
                        user-dependent state, and are not fit for all the information behind
                        the servers in the Deep Web
                    •   it’s no longer a simple HTTP GET (but an HTTP POST or an
                        HTTP GET with a complex URI) that is the basis for defining
                        nodes in the graph
                    •   URIs that carry user state are heavily used in Web applications,
                        but are not in the model and are largely unanalyzed
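A small sketch of why state-carrying URIs complicate the node definition: the same resource appears as several distinct identifiers until the user-state parameters are stripped. The parameter names in STATE_PARAMS and the example.org URIs are hypothetical; real applications encode state in many other ways (cookies, POST bodies) that this does not cover.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical query parameters assumed to carry per-user state.
STATE_PARAMS = {"sessionid", "sid", "user"}

def strip_user_state(uri, state_params=STATE_PARAMS):
    """Return the URI with query parameters that carry user state removed."""
    parts = urlsplit(uri)
    kept = [(key, value)
            for key, value in parse_qsl(parts.query, keep_blank_values=True)
            if key.lower() not in state_params]
    return urlunsplit((parts.scheme, parts.netloc, parts.path,
                       urlencode(kept), parts.fragment))

requests_seen = [
    "http://example.org/item?id=7&sessionid=abc",
    "http://example.org/item?id=7&sessionid=xyz",
    "http://example.org/item?id=7",
]
raw_nodes = set(requests_seen)                                   # one node per identifier
canonical_nodes = {strip_user_state(u) for u in requests_seen}   # one node per resource
```

Three identifiers collapse to one node once the session parameter is removed, so the choice of node definition changes the graph being measured.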


According to Google
               each day 20-25% of searches have not been seen before, i.e.
                              generate a new identifier
                            thus a new node in the graph

                   more than 20 million new links per day, 200 per second

                    do they follow the same power laws & growth models?
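As a quick check of the quoted rate, assuming links arrive evenly over a 24-hour day: 20 million per day works out to a little over 230 per second, so the slide's 200 per second is a round figure of the right order.

```python
SECONDS_PER_DAY = 24 * 60 * 60        # 86,400 seconds
new_links_per_day = 20_000_000        # figure quoted on the slide

links_per_second = new_links_per_day / SECONDS_PER_DAY
print(round(links_per_second))        # 231
```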

validating such models is hard

                       exponential growth of content
                  changes in number & power of servers
                         increasing diversity in users

Social Web Sites
                •       modern websites (on the social web)
                        •  have large script systems running in browser
                        •  store personal information
           many Social Web sites are not part of the (open) graph model
                        do these systems show a similar behavior? (macro)
                        are they stable? are they fair?
                        do they need to be regulated?
                        are the access restrictions, for personal
                        information, assured?
             there is a need for understanding and intervening/engineering
Wikipedia
                •       purely mathematical (technology-based) models do not capture the
                        whole story
                •       the Wikipedia structure (link labels) shows a Zipf-like distribution
                        just like other tag-based systems
                •       Wikipedia is built on MediaWiki software
                •       but other MediaWiki-based applications did not generate such
                        significant use
                         •   the pure 'technological' explanation cannot explain it
                         •   must be related to the 'social model' of how Wikipedia is
                             organized


    this is referred to as the dynamics of a 'social machine' (already in TBL’s original vision of WWW)
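One way to test for the Zipf-like distribution mentioned above is to rank labels by frequency and fit a line to log(frequency) against log(rank); a slope near -1 is the classic Zipf signature. The sketch below uses synthetic labels constructed to follow that law exactly, not Wikipedia data, and a plain least-squares fit, which is only a rough diagnostic.

```python
import math
from collections import Counter

def zipf_exponent(labels):
    """Estimate the slope of log(frequency) versus log(rank) by least squares."""
    freqs = sorted(Counter(labels).values(), reverse=True)
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(freq) for freq in freqs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var

# Synthetic data: label i occurs 120 / i times (120, 60, 40, 30, 24, 20).
synthetic = [f"label{i}" for i in range(1, 7) for _ in range(120 // i)]
slope = zipf_exponent(synthetic)
```

On real link labels or tags the fit would be noisier, and the slide's point stands: a matching distribution describes the structure but does not explain why Wikipedia, rather than other MediaWiki sites, attracted the use.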

Social Machines
                    •   today's interactive applications are very early
                        social machines limited by being largely isolated from
                        one another
                        •   more effective social machines can be expected
                        •   social processes in society interlink, so they
                            should also interlink on the web
                        •   technology needed to allow user communities to
                            construct, share & adapt social machines to get
                            success through trial, use & refinement


Next Generation
                           Social Machines
                    •   what are fundamental theoretical properties of social
                        machines, what algorithms are needed to create them?
                    •   what underlying architectural principles are needed to
                        effectively engineer new web components for this social
                        software?
                    •   how can we extend current web infrastructure with
                        mechanisms that make the social properties of information
                        sharing explicit and conform to relevant social-policy
                        expectations?
                    •   how do cultural differences affect development and use of
                        social mechanisms?

Modeling the Social
                           Machines
                    •   trustworthiness, reliability or silent expectations about
                        use of information
                    •   privacy, copyright, legal rules


                    •   we lack structures for formally representing &
                        reasoning over such properties
                    •   thus, without scalable models for these issues it is
                        hard to help the web go in the best possible
                        direction
Web Science is about
       additionality


         not the union of
          disciplines, but
           intersection




Society is Diverse
     different parts of society have different objectives and hence incompatible
     Web requirements, e.g. openness, security, transparency, privacy




Understanding the
                          Socio-Cultural
     •       POWER DISTANCE: The extent to which
             power is distributed equally within a society
             and the degree that society accepts this
             distribution.
     •       UNCERTAINTY AVOIDANCE: The degree to
             which individuals require set boundaries and
             clear structures
     •       INDIVIDUALISM vs COLLECTIVISM: The degree
             to which individuals base their actions on self-
             interest versus the interests of the group.
     •       MASCULINITY vs FEMININITY: A measure of a
             society's goal orientation
     •       TIME ORIENTATION: The degree to which a
             society does or does not value long-term
             commitments and respect for tradition.


Understanding the
                            variation
      •       Ecology of the Web - structure
              of the environment, producers
              and consumers
      •       Populations (individuals and
              species), traits/characteristics,
              heredity, genotypes and
              phenotypes
      •       Mechanisms - variation
              (mutation, migration, HGT,
              genetic drift), selection
      •       Outcomes - adaptation,
              co-evolution, competition,
              co-operation, speciation,
              extinction
but
                        How to do the Science?



Web Science
                                    Reflections
                        Is the Web changing faster than our ability to observe it?
                                How to measure or instrument the Web?
                                How to identify behaviors and patterns?
                           How to analyze the changing structure of the Web?



Big Bang:
                          Web Information
                    • the assumption of the open exchange of
                        information is being imposed on society
                    • do the Web, open access, open data and
                        scientific and creative commons offer a
                        beneficial opportunity or a dangerous
                        cul-de-sac?



Open Questions
                    •   How is the world changing as other parts of society
                        impose their requirements on the Web? E.g. current
                        examples with SOPA/PIPA and ACTA: requirements for
                        security and policing taking over the free exchange of
                        information and unrestricted transfer of knowledge
                    •   Are the public and open aspects of the Web a
                        fundamental change in society’s information
                        processes, or just a temporary glitch? E.g. are open
                        source, open access, open science & creative commons
                        efficient alternatives to fee-based knowledge transfer?


Open Questions
                    •   do we take the Web for granted as a provider of free
                        and unrestricted information exchange?
                    •   is Web Science the response to the pressure for the
                        Web to change - to respond to the issues of
                        security, commerce, criminality and privacy?
                    •   what are the challenges for Web Science?
                        • to explain how the Web impacts society?
                        • to predict the outcomes of proposed changes
                          to Web infrastructure on business & society?


What can you do as a
                        Computer Scientist?
                            specifically for the Social Web




Hands-on Teaser


         •       Q&A on Assignments
         •       Pitch of the Social Web Apps




                                                image source: http://www.flickr.com/photos/bionicteaching/1375254387/

 
unit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxunit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptx
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine Tuning
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdf
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 

Lecture 6: Social Web & Web Science (2012)

  • 1. Social Web Lecture VI How can we MINE, ANALYSE and VISUALIZE the Social Web? (1I) : The Web Science Lora Aroyo The Network Institute VU University Amsterdam (based on slides from Les Carr, Nigel Shadbolt) Tuesday, March 13, 12
  • 2. The Web: the most used and one of the most transformative applications in the history of computing, e.g. how the Social Web has transformed the world's communication; approximately 10^10 people, more than 10^11 web documents
  • 3. Web is NOT a Thing • it’s not a verb, or a noun • it’s a performance, not an object • co-constructed with society • activity of individuals who create interlinked content that both reflects and reinforces the interlinkedness of society and social interaction ... and a record of that performance
  • 4. The Web Great success as a technology, it’s built on significant computing infrastructure, but as an entity surprisingly unstudied
  • 5. Science & Engineering • physical science: analytic discipline to find laws that generate or explain observed phenomena • CS is mainly synthetic: formalisms & algorithms are created to support specific desired behaviors • Web Science: web needs to be studied & understood as a phenomenon but also to be engineered for future growth and capabilities
  • 6. L.A. Carr, C.J. Pope, W. Hall, N.R. Shadbolt http://webscience.ecs.soton.ac.uk/
  • 7. Simple micro rules give rise to complex macro phenomena • at microscale an infrastructure of artificial languages and protocols: a piece of engineering • however, interaction of people creating, linking and consuming information generates web's behavior as emergent properties at macroscale • properties require new analytic methods to be understood • some properties are desirable and are to be engineered in, others are undesirable and if possible engineered out
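The micro-to-macro point on this slide can be illustrated with a toy simulation. The sketch below is not from the lecture: it assumes one simple micro rule (each new page links to one existing page, chosen with probability proportional to that page's in-degree plus one) and shows that the resulting in-degree distribution is highly skewed. The function name `grow_web` and all parameters are hypothetical.

```python
import random
from collections import Counter

def grow_web(num_pages: int, seed: int = 0) -> Counter:
    """Grow a toy link graph: each new page adds one link to an existing page.

    The target is chosen with probability proportional to (in-degree + 1),
    a 'rich get richer' micro rule. Returns the in-degree of every page.
    """
    rng = random.Random(seed)
    in_degree = Counter({0: 0})
    # Each page appears in the urn once for existing and once per inbound
    # link, so a uniform draw is proportional to in-degree + 1.
    urn = [0]
    for page in range(1, num_pages):
        target = rng.choice(urn)
        in_degree[target] += 1
        in_degree[page] = 0
        urn.extend([target, page])
    return in_degree

in_degree = grow_web(10_000)
ranked = sorted(in_degree.values(), reverse=True)
print("top 5 in-degrees:", ranked[:5])
print("median in-degree:", ranked[len(ranked) // 2])
```

No single page is treated specially by the rule, yet a few pages end up with far more inbound links than the typical page, which is the kind of macroscale property the slide says needs its own analytic methods.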
  • 8. A new way of software development • software applications designed based on appropriate technology (algorithm, design) and with envisioned 'social' construct • usually tested in the small, testing microscale properties • a macrosystem evolving from people using the microsystem and interacting in often unpredicted ways is far more interesting and must be analyzed in different ways • also the macrosystems exhibit challenges that do not exist at microscale
  • 9. Evolution of Search Engines 1: techniques designed to rank documents 2: people were gaming to influence algorithms & improve their search rank 3: adapt search technologies to defeat this influence
  • 10. The Web Graph • to understand the web, in good CS tradition, we look at the graph • nodes are web pages (HTML) • edges are hypertext links between nodes • first analysis shows that in-degree and out-degree follow a power-law distribution => shown to hold for large samples • this gave insight into the growth of the web
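As a rough illustration of the graph view on this slide, the sketch below computes in-degree and out-degree histograms from an edge list. The edge list is a hypothetical toy example, not crawl data, and the helper name `degree_histogram` is an assumption. For a power law, P(k) ∝ k^(-γ), such a histogram would fall on a straight line when plotted on log-log axes; seven edges are of course far too few to show that.

```python
from collections import Counter

# Hypothetical toy edge list (source page -> linked page); not real crawl data.
edges = [
    ("a", "b"), ("a", "c"), ("b", "c"), ("d", "c"),
    ("e", "c"), ("e", "b"), ("c", "a"),
]

def degree_histogram(edges, direction="in"):
    """Map each degree value to the number of pages having that degree."""
    index = 1 if direction == "in" else 0
    nodes = {n for edge in edges for n in edge}
    degree = Counter(edge[index] for edge in edges)
    # Pages that never appear in the chosen position count as degree 0.
    return Counter(degree[n] for n in nodes)

in_hist = degree_histogram(edges, "in")
out_hist = degree_histogram(edges, "out")
for k in sorted(in_hist):
    print(f"in-degree {k}: {in_hist[k]} page(s)")
```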
  • 11. Search Algorithms • the Web graph is also the basis of algorithms for search engines: • HITS or PageRank assume that inserting a hyperlink symbolizes an endorsement of the authority of the page linked to
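The endorsement idea can be sketched with a small power-iteration version of PageRank. This is a minimal textbook-style sketch, not the production algorithm of any search engine; the damping factor 0.85, the iteration count, and the four-page `toy_web` are assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank over a dict {page: [pages it links to]}."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Rank held by pages without outgoing links is spread evenly.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new_rank = {p: (1.0 - damping) / n + damping * dangling / n for p in pages}
        for page, targets in links.items():
            for target in targets:
                # Each link passes on an equal share of the linking page's rank.
                new_rank[target] += damping * rank[page] / len(targets)
        rank = new_rank
    return rank

# Hypothetical four-page web: three pages link to "hub".
toy_web = {"a": ["hub"], "b": ["hub"], "c": ["hub", "a"], "hub": ["a"]}
ranks = pagerank(toy_web)
for page, score in sorted(ranks.items(), key=lambda kv: -kv[1]):
    print(f"{page}: {score:.3f}")
```

Pages that nobody links to ("b" and "c" here) keep only the baseline share, while pages that receive links accumulate rank from their endorsers.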
  • 12. User State is Important • the original Web graph is too simple, starts from quasi static HTML • for personalization or customization different representations (of sources) may be served to different requesters, e.g. cookies • graph based models often do not account for this sort of user-dependent state, and do not fit all the information behind the servers, in the Deep Web • it’s not a simple HTTP-GET anymore (but HTTP-POST or HTTP-GET with complex URI) that is the basis for defining nodes in the graph • URIs that carry user state are heavily used in Web applications, but are not in the model and largely unanalyzed
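One way to see why user state complicates the node definition is to separate the state-carrying part of a URI from the rest. The sketch below uses Python's `urllib.parse`; the parameter names in `USER_STATE_PARAMS` and the example URI are assumptions made up for illustration, since real applications encode state in many different ways.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Assumed names of parameters that carry per-user state; purely illustrative.
USER_STATE_PARAMS = {"sessionid", "sid", "userid", "token"}

def split_user_state(uri):
    """Return (node_uri, user_state) with state-carrying parameters removed."""
    parts = urlsplit(uri)
    query = parse_qsl(parts.query, keep_blank_values=True)
    state = {k: v for k, v in query if k.lower() in USER_STATE_PARAMS}
    # Sort the remaining parameters so equivalent URIs map to one node.
    kept = sorted((k, v) for k, v in query if k.lower() not in USER_STATE_PARAMS)
    node = urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))
    return node, state

node, state = split_user_state("http://example.org/shop?item=42&sessionid=abc123")
print(node)
print(state)
```

Under this assumption two requests that differ only in `sessionid` collapse to the same graph node, while the stripped state is exactly the information a purely link-based model leaves out.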
  • 13. According to Google, each day 20-25% of searches have not been seen before, i.e. they generate a new identifier and thus a new node in the graph; more than 20 million new links per day, 200 per second. Do they follow the same power laws & growth models?
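As a quick sanity check of the rate quoted on this slide, the daily figure can be converted to a per-second figure. Only the 20 million lower bound comes from the slide; the rest is plain arithmetic.

```python
SECONDS_PER_DAY = 24 * 60 * 60

new_links_per_day = 20_000_000  # lower bound quoted on the slide
links_per_second = new_links_per_day / SECONDS_PER_DAY
print(f"{links_per_second:.0f} new links per second")  # prints "231 new links per second"
```

So "200 per second" is a rounded-down figure, consistent with "more than 20 million" per day.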
  • 14. Validating such models is hard: exponential growth of content; changes in number & power of servers; increasing diversity in users. According to Google, each day 20-25% of searches have not been seen before, i.e. they generate a new identifier and thus a new node in the graph; more than 20 million new links per day, 200 per second. Do they follow the same power laws & growth models?
  • 15. Social Web Sites • modern websites (on the social web) • have large script systems running in browser • store personal information. Many Social Web sites are not part of the (open) graph model. Do these systems show a similar behavior (macro)? Are they stable? Are they fair? Do they need to be regulated? Are the access restrictions, for personal information, assured? There is a need for understanding and intervening/engineering
  • 16. Wikipedia • purely mathematical (technology-based) models do not capture the whole story • the Wikipedia structure (link labels) shows a Zipf-like distribution just like other tag-based systems • Wikipedia is built on MediaWiki software • but other MediaWiki-based applications did not generate such significant use • the pure 'technological' explanation cannot explain it • must be related to the 'social model' of how Wikipedia is organized; this is referred to as the dynamics of a 'social machine' (already in TBL’s original vision of WWW)
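A Zipf-like distribution means that frequency falls off roughly as 1/rank, so log(frequency) plotted against log(rank) has a slope near -1. The sketch below estimates that slope with an ordinary least-squares fit. The labels are synthetic, constructed so that frequencies are exactly 120/rank; they are not Wikipedia data, and `zipf_slope` is a hypothetical helper name.

```python
import math
from collections import Counter

def zipf_slope(labels):
    """Least-squares slope of log(frequency) against log(rank)."""
    freqs = sorted(Counter(labels).values(), reverse=True)
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var

# Synthetic labels whose frequencies are exactly 120 / rank; not Wikipedia data.
labels = [f"label{r}" for r in range(1, 7) for _ in range(120 // r)]
print(f"fitted slope: {zipf_slope(labels):.2f}")
```

On real tag or link-label data the fitted slope would only be approximately -1, and as the slide notes, the fit alone says nothing about the social organization that produced it.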
  • 17. Social Machines • today's interactive applications are very early social machines limited by being largely isolated from one another • more effective social machines can be expected • social processes in society interlink, so they should also interlink on the web • technology needed to allow user communities to construct, share & adapt social machines to get success through trial, use & refinement
  • 18. Next Generation Social Machines • what are fundamental theoretical properties of social machines, what algorithms are needed to create them? • what underlying architectural principles are needed to effectively engineer new web components for this social software? • how can we extend current web infrastructure with mechanisms that make the social properties of information sharing explicit and conform to relevant social-policy expectations? • how do cultural differences affect development and use of social mechanisms?
  • 19. Modeling the Social Machines • trustworthiness, reliability or silent expectations about use of information • privacy, copyright, legal rules • we lack structures for formally representing & reasoning over such properties • thus, without scalable models for these issues it is hard to help the web go in the best possible direction
  • 21. L.A. Carr, C.J. Pope, W. Hall, N.R. Shadbolt http://webscience.ecs.soton.ac.uk/
  • 22. Web Science is about additionality: not the union of disciplines, but intersection
  • 23. Society is Diverse: different parts of society have different objectives and hence incompatible Web requirements, e.g. openness, security, transparency, privacy
  • 24. Understanding the Socio-Cultural • POWER DISTANCE: The extent to which power is distributed equally within a society and the degree that society accepts this distribution. • UNCERTAINTY AVOIDANCE: The degree to which individuals require set boundaries and clear structures. • INDIVIDUALISM vs COLLECTIVISM: The degree to which individuals base their actions on self-interest versus the interests of the group. • MASCULINITY vs FEMININITY: A measure of a society's goal orientation. • TIME ORIENTATION: The degree to which a society does or does not value long-term commitments and respect for tradition.
  • 25. Understanding the variation • Ecology of the Web - structure of the environment, producers and consumers • Populations (individuals and species), traits/characteristics, heredity, genotypes and phenotypes • Mechanisms - variation (mutation, migration, HGT, genetic drift), selection • Outcomes - adaptation, co-evolution, competition, co-operation, speciation, extinction
  • 29. but How to do the Science?
  • 30. Web Science Reflections Is the Web changing faster than our ability to observe it? How to measure or instrument the Web? How to identify behaviors and patterns? How to analyze the changing structure of the Web?
  • 31. Big Bang: Web Information • assumption of the open exchange of information is being imposed on society • do the Web, open access, open data and scientific and creative commons offer a beneficial opportunity or a dangerous cul-de-sac?
  • 32. Open Questions • How is the world changing as other parts of society impose their requirements on the Web? E.g. current examples with SOPA/PIPA and ACTA: requirements for security and policing taking over free exchange of information and unrestricted transfer of knowledge • Are the public and open aspects of the Web a fundamental change in society’s information processes, or just a temporary glitch? E.g. are open source, open access, open science & creative commons efficient alternatives to fee-based knowledge transfer?
  • 33. Open Questions • do we take the Web for granted as provider of a free and unrestricted information exchange? • is Web Science the response to the pressure for the Web to change - to respond to the issues of security, commerce, criminality and privacy? • What are the challenges for Web science? • to explain how the Web impacts society? • to predict the outcomes of proposed changes to Web infrastructure on business & society?
  • 34. What can you do as a Computer Scientist? specifically for the Social Web
  • 35. Hands-on Teaser • Q&A on Assignments • Pitch of the Social Web Apps image source: http://www.flickr.com/photos/bionicteaching/1375254387/