Guessing the Unknown
The Quest
Observations from a black box: 5, 12, 3, 9, 28
What can we say about this black box?
E.g., what is the probability that it generates a number bigger than 5?
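
As a first, naive answer to the question above, one can count how often the five observations exceed 5. The helper name `empirical_prob` is illustrative and not from the slides, and with only five observations the result is a rough estimate at best.

```python
def empirical_prob(observations, predicate):
    """Fraction of the observations that satisfy the predicate."""
    hits = sum(1 for x in observations if predicate(x))
    return hits / len(observations)

observations = [5, 12, 3, 9, 28]  # the five numbers shown on the slide
p_bigger_than_5 = empirical_prob(observations, lambda x: x > 5)
print(p_bigger_than_5)  # 12, 9 and 28 exceed 5, so this prints 0.6
```
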
Distributions
What if we had many, many observations?

Value   Frequency
-1      0.3
0       0.2
1       0.1
2       0.1
3       0.1
4       0.1
5       0.1

This table is the distribution associated with this black box.
The sum of the frequencies is 1.
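
A distribution table like this one can be held as a mapping from value to frequency, and the probability of any event is then a sum over matching rows. This is a minimal sketch: the names `distribution` and `prob` are illustrative, and exact fractions are used only to avoid floating-point rounding.

```python
from fractions import Fraction

# The table from the slide: value -> frequency.
distribution = {
    -1: Fraction(3, 10),
    0: Fraction(2, 10),
    1: Fraction(1, 10),
    2: Fraction(1, 10),
    3: Fraction(1, 10),
    4: Fraction(1, 10),
    5: Fraction(1, 10),
}

def prob(dist, event):
    """Probability of an event: sum the frequencies of the values it contains."""
    return sum(f for v, f in dist.items() if event(v))

total = sum(distribution.values())
p_gt_2 = prob(distribution, lambda v: v > 2)
p_gt_5 = prob(distribution, lambda v: v > 5)
print(total, p_gt_2, p_gt_5)  # prints: 1 3/10 0
```

For this particular box, a number bigger than 5 never appears, so the probability is 0.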
Distributions Graphically
[Figure: the distribution drawn as a curve over the values -1, 0, 1, 2, 3, 4, 5]
Area under the curve is 1
The Challenge
We do not have many, many observations.
So we cannot infer the distribution from the observations.
What can we do then?
What can we do with few observations?
Assume the distribution is known (from prior knowledge or other means), e.g., Normal, Binomial, etc.
I.e., model approximately using a canonical distribution.
But the parameters are not known.
Can these parameters be determined from the observations?
Why Canonical Distributions

Value   Frequency
-1      0.3
0       0.2
1       0.1
2       0.1
3       0.1
4       0.1
5       0.1

Too verbose a description for the distribution.
Can the entire distribution be described (even approximately) by just a few parameters, while modeling the data accurately?
Example: Binomial Distribution
A coin that yields 1 with probability p and 0 with probability 1-p, tossed n times, independently.
Observations: 1 0 1 1 1 ….
Number of 1's? Possible values: 0, 1, 2, ..., n-1, n, each with its own frequency.
Distribution: μ = np, σ² = np(1-p)
Can one determine p from the (few) observations?
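
The frequency of each count of 1's follows the standard Binomial formula C(n, k) p^k (1-p)^(n-k). The sketch below checks the stated mean and variance numerically for illustrative values of n and p (these values are assumptions, not from the slides). It then estimates p as the fraction of 1's among the five tosses shown, which is one natural choice rather than something the slides prescribe.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k ones in n independent tosses."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 10, 0.3  # illustrative values, not from the slides
pmf = [binomial_pmf(k, n, p) for k in range(n + 1)]
mean = sum(k * q for k, q in enumerate(pmf))
variance = sum((k - mean) ** 2 * q for k, q in enumerate(pmf))
print(round(mean, 6), round(variance, 6))  # np = 3.0 and np(1-p) = 2.1

# Assumption: estimate p by the fraction of 1's among the observed tosses.
tosses = [1, 0, 1, 1, 1]  # only the five tosses visible on the slide
p_hat = sum(tosses) / len(tosses)
print(p_hat)  # 0.8
```
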
Other Canonical Distributions
Normal: μ, σ²
Poisson: μ = r, σ² = r
Negative Binomial: μ = rp/(1-p), σ² = rp/(1-p)²
Gamma: μ = kθ, σ² = kθ²
What are these? Later talk.
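
The mean and variance formulas listed on this slide can be collected into one small lookup, which makes it easy to see how few parameters each family needs. The function name `moments` and its keyword names are hypothetical, chosen only for this sketch.

```python
def moments(family, **params):
    """Return (mean, variance) for the families listed on the slide."""
    if family == "normal":
        return params["mu"], params["sigma2"]
    if family == "poisson":
        r = params["r"]
        return r, r
    if family == "negative_binomial":
        r, p = params["r"], params["p"]
        return r * p / (1 - p), r * p / (1 - p) ** 2
    if family == "gamma":
        k, theta = params["k"], params["theta"]
        return k * theta, k * theta**2
    raise ValueError(f"unknown family: {family}")

print(moments("poisson", r=4))                   # (4, 4)
print(moments("negative_binomial", r=3, p=0.5))  # (3.0, 6.0)
print(moments("gamma", k=2, theta=3))            # (6, 18)
```
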
Back to the Quest
We have few observations.
Assume these are from a known distribution family, but with unknown parameters.
How do we determine the parameters?
How do we determine μ, σ²?
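
One common way to go from μ and σ² to a family's parameters is to invert the moment formulas (the method of moments). The slides do not name this approach, so treat it as an assumption. The sketch uses the Gamma formulas μ = kθ and σ² = kθ², and the helper name `gamma_from_moments` is hypothetical.

```python
def gamma_from_moments(mean, variance):
    """Solve mean = k*theta and variance = k*theta**2 for (k, theta)."""
    if mean <= 0 or variance <= 0:
        raise ValueError("a Gamma distribution needs positive mean and variance")
    theta = variance / mean  # (k*theta^2) / (k*theta) = theta
    k = mean / theta
    return k, theta

k, theta = gamma_from_moments(6.0, 18.0)
print(k, theta)  # 2.0 3.0
```

In practice μ and σ² are themselves unknown and would be replaced by estimates computed from the observations.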
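Estimating Mean
[Figure: distribution with parameters μ, σ²]
Estimate for the mean; a good estimate??

[Figure: distribution with parameters μ, σ²]
What is the mean and variance of this distribution?
Normal!! For modest n.

[Figure: distribution with parameters μ, σ²]
Unbiased
Tight as n grows larger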
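
The two claims, unbiased and tight as n grows, can be checked exactly on a small example. The sketch assumes the estimate in question is the sample mean (the average of n independent observations), since the formula itself is not in the slide text. It reuses the value/frequency table from the earlier slide and enumerates every possible sample of size n.

```python
from fractions import Fraction
from itertools import product

# Distribution table from the earlier slide: value -> frequency.
dist = {-1: Fraction(3, 10), 0: Fraction(2, 10), 1: Fraction(1, 10),
        2: Fraction(1, 10), 3: Fraction(1, 10), 4: Fraction(1, 10),
        5: Fraction(1, 10)}

mu = sum(v * f for v, f in dist.items())
sigma2 = sum((v - mu) ** 2 * f for v, f in dist.items())

def sample_mean_moments(dist, n):
    """Exact mean and variance of the sample mean over all samples of size n."""
    first = second = Fraction(0)
    for sample in product(dist, repeat=n):
        weight = Fraction(1)
        for x in sample:
            weight *= dist[x]  # probability of this particular sample
        xbar = Fraction(sum(sample), n)
        first += weight * xbar
        second += weight * xbar**2
    return first, second - first**2

n = 3
mean_of_xbar, var_of_xbar = sample_mean_moments(dist, n)
print(mean_of_xbar == mu)         # True: the sample mean is unbiased
print(var_of_xbar == sigma2 / n)  # True: its variance shrinks like 1/n
```

The enumeration grows as 7^n, so it is only meant for small n. It does not demonstrate the approximate normality claimed on the slide.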
Estimating Variance
[Figure: distribution with parameters μ, σ²]
Estimate for the variance; a good estimate??

[Figure labels: μ, σ² (twice)]
Bias
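
The bias can be made concrete with an exact calculation. The sketch assumes the estimate in question is the average squared deviation from the sample mean, i.e. the sum of (x - x̄)² divided by n; the formula is not in the slide text, so this is an assumption. A fair 0/1 coin is used purely for illustration.

```python
from fractions import Fraction
from itertools import product

dist = {0: Fraction(1, 2), 1: Fraction(1, 2)}  # a fair 0/1 coin, chosen for illustration
mu = sum(v * f for v, f in dist.items())
sigma2 = sum((v - mu) ** 2 * f for v, f in dist.items())

def expected_variance_estimate(dist, n, divisor):
    """Exact expectation of sum((x - xbar)^2) / divisor over all samples of size n."""
    total = Fraction(0)
    for sample in product(dist, repeat=n):
        weight = Fraction(1)
        for x in sample:
            weight *= dist[x]  # probability of this particular sample
        xbar = Fraction(sum(sample), n)
        squared_deviations = sum((x - xbar) ** 2 for x in sample)
        total += weight * squared_deviations / divisor
    return total

n = 4
biased = expected_variance_estimate(dist, n, n)
print(sigma2, biased)  # prints: 1/4 3/16
```

On average the divide-by-n estimate comes out at (n-1)/n times the true variance, so it underestimates σ².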
Estimating Variance Correctly
[Figure: distribution with parameters μ, σ²]
Unbiased!!
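
The usual unbiased estimate divides the sum of squared deviations by n - 1 instead of n; the slide text does not preserve its formula, so this reading is an assumption. Python's standard `statistics` module offers both versions: `pvariance` divides by n and `variance` divides by n - 1. The sketch applies them to the five black-box observations from the first slide.

```python
import statistics

observations = [5, 12, 3, 9, 28]  # the black-box observations from the first slide
n = len(observations)

divide_by_n = statistics.pvariance(observations)         # divides by n
divide_by_n_minus_1 = statistics.variance(observations)  # divides by n - 1

# Roughly 78.64 and 98.3; the second is the first scaled by n / (n - 1).
print(divide_by_n, divide_by_n_minus_1)
```
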
A Mind Reading Game
• Your friend chooses a number (one of 1,3,5) in his/her
  mind
   – Call this i

• He/She then rolls a 6-faced die n = 30 times, privately
   – For each roll, he/she declares Heads if the number on the
     die is <=i, and Tails otherwise

• Your goal is to guess i solely from this sequence of n
  Heads and Tails.

• Can you read your friend’s mind?
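
One way to play the game is to treat it as parameter estimation: if the die is fair, each declaration is Heads with probability i/6, so the number of Heads is Binomial with p = i/6. The sketch below picks the candidate i that makes the observed sequence most likely (maximum likelihood). This strategy, the fair-die assumption, and the name `guess_i` are ours and are not stated on the slide.

```python
from math import log

CANDIDATES = (1, 3, 5)  # the numbers the friend may choose

def guess_i(declarations):
    """Guess i from a string of 'H'/'T' declarations by maximum likelihood."""
    heads = declarations.count("H")
    tails = declarations.count("T")

    def log_likelihood(i):
        p_heads = i / 6  # a fair die shows a number <= i with probability i/6
        return heads * log(p_heads) + tails * log(1 - p_heads)

    return max(CANDIDATES, key=log_likelihood)

print(guess_i("H" * 5 + "T" * 25))   # 1
print(guess_i("H" * 15 + "T" * 15))  # 3
print(guess_i("H" * 25 + "T" * 5))   # 5
```

With 30 rolls the three candidates give expected Heads counts of 5, 15 and 25, which are far enough apart that the guess is usually right, though not always.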
Thank You

Introduction to statistics
