Public Health Methodologies

        Biostatistics

drrkb@hotmail.com
Data
• Data is a collection of facts, such as values or
  measurements.
                          OR
• Data is information that has been translated into
  a form that is more convenient to move or
  process.
                          OR
• Data are any facts, numbers, or text that can be
  processed by a computer.

3/3/2012              Dr. Riaz A. Bhutto              2
Statistics

    Statistics is the study of the
    collection, summarizing, organization, analysis,
    and interpretation of data.




Vital statistics

    Vital statistics is the
    collection, summarizing, organization, analysis,
    presentation, and interpretation of data
    related to the vital events of life, such as births, deaths,
    marriages, divorces,
    health, and disease.


Biostatistics

    Biostatistics is the application of statistical
    techniques to scientific research in
    health-related fields, including
    medicine, biology, and public health.




Descriptive Statistics
    The term descriptive statistics refers to
    statistics that are used to describe and summarize
    a set of data. When using descriptive statistics,
    every member of a group or population is measured. A good
    example of descriptive statistics is the
    Census, in which all members of a population
    are counted.



Inferential or Analytical Statistics

 Inferential statistics are used to draw
 conclusions and make predictions based on the
 analysis of numeric data.




Primary & Secondary Data
• Raw or Primary data: data as collected, still
  containing a lot of unnecessary, irrelevant, and
  unwanted information
• Treated or Secondary data: data from which this
  unnecessary, irrelevant, and unwanted information
  has been removed
• Cooked data: data that were not genuinely
  collected and are false or fictitious

Ungrouped & Grouped Data
• Ungrouped data: data presented or observed individually. For
  example, if we observe the no. of children in 6 families:

                               2, 4, 6, 4, 6, 4

• Grouped data: identical values are grouped together with their frequency.
  For example, the above data on children in 6 families can be grouped as:
                              No. of children      Families
                                      2              1
                                      4              3
                                      6              2

    or alternatively we can make classes:

                           No. of children      Frequency
                                2-4                 4
                                5-7                 2
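
The grouping above can be sketched in a few lines of Python. This is only an illustration: the helper name `class_frequency` and the variable names are assumptions, not part of the lecture.

```python
from collections import Counter

# Number of children observed in 6 families (the ungrouped data).
children = [2, 4, 6, 4, 6, 4]

# Group identical values by how often they occur.
frequency = Counter(children)

def class_frequency(values, classes):
    """Count how many values fall in each inclusive (low, high) class."""
    return {(low, high): sum(low <= v <= high for v in values)
            for low, high in classes}

# The two classes used on the slide: 2-4 and 5-7.
by_class = class_frequency(children, [(2, 4), (5, 7)])

for value in sorted(frequency):
    print(value, frequency[value])
for (low, high), count in by_class.items():
    print(f"{low}-{high}", count)
```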


Variable

    A variable is a characteristic or value that can
    change or vary from one observation to another. For
    example: age, height, weight, blood pressure,
    etc.




Types of Variable
    Independent variable: typically the
    variable whose value is being
    manipulated or changed. For example,
    smoking.
    Dependent variable: the observed result of
    the independent variable being manipulated.
    For example, cancer (ca) of the lung.
    Confounding variable: a variable associated with both
    the exposure and the disease. For example, age is a
    factor in many events.
Categories of DATA




9/3/2012          Dr. Riaz A. Bhutto   12
Quantitative or Numerical data
    This data is used to describe a type of
    information that can be counted or expressed
    numerically (numbers)

               2, 4 , 6, 8.5, 10.5




Quantitative or Numerical data (cont.)
This data is of two types
1. Discrete Data: takes whole-number values only and
   has no fractions. For example:
   Number of children in a family = 4
   Number of patients in hospital = 320

2. Continuous Data (infinitely many possible values): measured on a
   continuous scale. It can take fractional values. For example:
   Height of a person = 5 feet 6 inches (5′6″)
   Temperature            = 92.3 °F

Qualitative or Categorical data
This is non-numerical data, such as
                Male/Female, Short/Tall
It is of two types:
1. Nominal Data: has a series of unordered categories
    (one cannot tick more than one category at a time). For example:

      Sex = Male/Female           Blood group = O/A/B/AB

2.     Ordinal or Ranked Data: has distinct ordered/ranked
       categories. For example:

      Height can be recorded as = Short / Medium / Tall
      Degree of pain can be = None / Mild / Moderate / Severe

Measures of Central Tendency &
       Variation (Dispersion)




Measures of Central Tendency
    Measures of central tendency are quantitative indices that
    describe the center of a distribution of data. These are

• Mean
• Median          (the three Ms)
• Mode



Mean
  Mean or arithmetic mean is also called the AVERAGE and is
  only calculated for numerical data. For example:
• What is the average age of the children, in years?
        Child          1  2  3  4  5  6  7
        Age (years)    6  4  4  3  2  4  5

  Formula:   $$\bar{X} = \frac{\sum X}{n}$$

  Mean = (6 + 4 + 4 + 3 + 2 + 4 + 5) / 7 = 28 / 7 = 4 years
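
As a minimal sketch, the same calculation in Python, using the seven ages summed on the slide (6, 4, 4, 3, 2, 4, 5). The function name `arithmetic_mean` is an illustrative choice, not something from the lecture.

```python
def arithmetic_mean(values):
    """Return the sum of the values divided by how many there are."""
    return sum(values) / len(values)

ages = [6, 4, 4, 3, 2, 4, 5]    # ages in years of the 7 children
print(arithmetic_mean(ages))     # 4.0
```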
Median
• It is the central-most value. For example, what is the
  central value in the data 2, 3, 4, 4, 4, 5, 6?
• If we divide the ordered data into two equal groups around the middle,
  2, 3, 4, [4], 4, 5, 6, then 4 is the central-most
  value.
• The formula gives the position of the central value in the ordered data:
        Median position = (n + 1) / 2   (here n is the total no. of values)
        Median position = (7 + 1) / 2 = 8 / 2 = 4
        so the median is the 4th value, which is 4
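
A short sketch of the median calculation follows. The slide only covers an odd number of values; averaging the two middle values for an even count is the usual convention and is added here as an assumption. The function name `median` is illustrative.

```python
def median(values):
    """Return the central value of the data after sorting it."""
    ordered = sorted(values)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        # Odd n: the value at 1-based position (n + 1) / 2.
        return ordered[middle]
    # Even n (not covered on the slide): average the two central values.
    return (ordered[middle - 1] + ordered[middle]) / 2

data = [2, 3, 4, 4, 4, 5, 6]
print(median(data))  # 4
```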
Mode
• The mode is the most frequently occurring (most repeated)
  value in a set of observations. Examples:
• No mode
  Raw data:       10.3 4.9 8.9 11.7 6.3 7.7
• One mode
  Raw data:        2 3 4 4 4 5 6
• More than 1 mode
  Raw data:        21 28 28 41 43 43
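
The three cases above can be checked with a small sketch. The function name `modes` and the choice to return an empty list when no value repeats are assumptions made for illustration.

```python
from collections import Counter

def modes(values):
    """Return every value tied for the highest count, or [] if nothing repeats."""
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:
        return []                      # no value repeats, so there is no mode
    return sorted(v for v, c in counts.items() if c == highest)

print(modes([10.3, 4.9, 8.9, 11.7, 6.3, 7.7]))  # []
print(modes([2, 3, 4, 4, 4, 5, 6]))             # [4]
print(modes([21, 28, 28, 41, 43, 43]))          # [28, 43]
```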

Measures of Dispersion
Measures of dispersion are quantitative indices that describe
  the spread of a data set. These are
• Range
• Mean deviation
• Variance
• Standard deviation
• Coefficient of variation
• Percentile

Range
    It is the difference between the highest and lowest
    values in a data series. For example:

           the ages (in years) of 10 children are
              2, 6, 8, 10, 11, 14, 1, 6, 9, 15

  here the range of age will be 15 - 1 = 14 years
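
A minimal sketch of the same calculation; the function name `value_range` is an illustrative choice.

```python
def value_range(values):
    """Return the highest value minus the lowest value."""
    return max(values) - min(values)

ages = [2, 6, 8, 10, 11, 14, 1, 6, 9, 15]   # ages in years of 10 children
print(value_range(ages))                     # 14
```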

Mean Deviation

    This is the average absolute deviation of all observations
    from the mean:

    $$\text{Mean Deviation} = \frac{\sum |X - \bar{X}|}{n}$$

      here X = value, \bar{X} = mean,
      n = total no. of values



Mean Deviation Example
  A student took 5 exams in a class and had scores of
  92, 75, 95, 90, and 98. Find the mean deviation of her
  test scores.
• First step: find the mean.

    $$\bar{x} = \frac{\sum x}{n} = \frac{92 + 75 + 95 + 90 + 98}{5} = \frac{450}{5} = 90$$
• Second step: find the mean deviation.

    Value (X)     Mean (X̄)     Deviation from mean (X - X̄)     Absolute deviation |X - X̄|
                                                                (ignoring signs)
       92            90                    2                              2
       75            90                  -15                             15
       95            90                    5                              5
       90            90                    0                              0
       98            90                    8                              8

    Total = 450                                                   ∑|X - X̄| = 30

    n = 5

    $$\text{Mean Deviation} = \frac{\sum |X - \bar{X}|}{n} = \frac{30}{5} = 6$$

    The average deviation from the mean is 6.
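
The two steps of the worked example can be sketched as follows; the function name `mean_deviation` is illustrative.

```python
def mean_deviation(values):
    """Average absolute distance of the values from their mean."""
    mean = sum(values) / len(values)                  # step 1: the mean
    absolute_deviations = [abs(x - mean) for x in values]
    return sum(absolute_deviations) / len(values)     # step 2: average them

scores = [92, 75, 95, 90, 98]
print(mean_deviation(scores))  # 6.0
```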

Variance
• It is a measure of variability that takes into
  account the difference between each
  observation and the mean.
• The sample variance is the sum of the squared
  deviations from the mean divided by the
  number of values in the series minus 1 (n - 1).
• Sample variance is written s² and population variance
    is written σ².
Variance (cont.)

• The Variance is defined as the average of the squared
  differences from the Mean.
• To calculate the variance, follow these steps:
  1. Work out the Mean (the simple average of the
     numbers).
  2. Then for each number: subtract the Mean and
     square the result (the squared difference).
  3. Then work out the average of those squared
     differences (for a sample variance, divide their sum by n - 1 instead of n).

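Example: The household size of 5 families was recorded as follows:
                   2, 5, 4, 6, 3       Calculate the variance of these data.

    Step 1          Step 2          Step 3                          Step 4
    Value (X)       Mean (X̄)       Deviation from mean (X - X̄)    (X - X̄)²
        2              4                  -2                          4
        5              4                   1                          1
        4              4                   0                          0
        6              4                   2                          4
        3              4                  -1                          1

    Step 5:  ∑(X - X̄)² = 10

    Step 6:  dividing by n gives the variance as the average squared deviation:

    $$\sigma^2 = \frac{\sum (X - \bar{X})^2}{n} = \frac{10}{5} = 2 \text{ persons}^2$$

    Treating the 5 families as a sample, the divisor is n - 1:

    $$s^2 = \frac{\sum (X - \bar{X})^2}{n - 1} = \frac{10}{4} = 2.5 \text{ persons}^2$$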
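
The steps of the household example can be sketched in Python. The function name `variance` and its `sample` flag are illustrative assumptions; the flag switches the divisor between n and n - 1.

```python
def variance(values, sample=False):
    """Sum of squared deviations from the mean, divided by n (or n - 1)."""
    n = len(values)
    mean = sum(values) / n                               # step 2
    squared = [(x - mean) ** 2 for x in values]          # steps 3 and 4
    total = sum(squared)                                 # step 5
    divisor = n - 1 if sample else n                     # step 6
    return total / divisor

household_sizes = [2, 5, 4, 6, 3]
print(variance(household_sizes))               # 2.0  (divide by n)
print(variance(household_sizes, sample=True))  # 2.5  (divide by n - 1)
```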
Standard Deviation
• The Standard Deviation is a measure of how
  spread out numbers are.
• Its symbol is σ (the Greek letter sigma) for a population and s for a sample.
• The formula is easy: it is the square root of
  the Variance, i.e.  s = √s²
• SD is the most useful measure of dispersion

  $$s = \sqrt{\frac{\sum (x - \bar{x})^2}{n}} \quad (\text{if } n > 30)$$

  $$s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}} \quad (\text{if } n < 30)$$
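
As a hedged illustration of the two divisors, Python's standard `statistics` module provides `pstdev` (divides by n) and `stdev` (divides by n - 1). The data reused here are the household sizes 2, 5, 4, 6, 3 from the variance example; the variable names are illustrative.

```python
import math
import statistics

household_sizes = [2, 5, 4, 6, 3]

# Square root of the variance computed with divisor n.
population_sd = statistics.pstdev(household_sizes)
# Square root of the variance computed with divisor n - 1.
sample_sd = statistics.stdev(household_sizes)

print(round(population_sd, 3))  # 1.414
print(round(sample_sd, 3))      # 1.581
```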
Example
    You and your friends have just measured the heights of your
                       dogs (in millimeters):




• The heights (at the shoulders) are: 600mm, 470mm, 170mm, 430mm and
    300mm.
•   Find out the Mean, the Variance, and the Standard Deviation.

Your first step is to find the Mean:
                                   Answer:
             Mean = (600 + 470 + 170 + 430 + 300) / 5 = 1970 / 5 = 394
     so the mean (average) height is 394 mm.

     [Figure: chart of the dog heights with the mean marked]




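Now, we calculate each dog's difference from the Mean:

      600 - 394 = 206,  470 - 394 = 76,  170 - 394 = -224,
      430 - 394 = 36,   300 - 394 = -94

      To calculate the Variance, take each difference, square it, and then average
      the result:

      Variance = (206² + 76² + (-224)² + 36² + (-94)²) / 5
               = (42,436 + 5,776 + 50,176 + 1,296 + 8,836) / 5
               = 108,520 / 5

      So, the Variance is 21,704 mm².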
And the Standard Deviation is just the square root of the Variance, so:
           Standard Deviation: σ = √21,704 = 147.32... = 147 (to the nearest mm)

   And the good thing about the Standard Deviation is that it is useful. Now we can
    show which heights are within one Standard Deviation (147 mm) of the Mean, that is,
    between 394 - 147 = 247 mm and 394 + 147 = 541 mm:

      within one Standard Deviation:   470 mm, 430 mm, 300 mm
      outside one Standard Deviation:  600 mm (extra large), 170 mm (extra small)

• So, using the Standard Deviation we have a "standard" way of knowing
  what is normal, and what is extra large or extra small.
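
The whole dog-height example can be reproduced with a short sketch. The variable names are illustrative, and the variance divides by n, as in the example.

```python
import math

heights_mm = [600, 470, 170, 430, 300]   # shoulder heights of the 5 dogs

mean = sum(heights_mm) / len(heights_mm)
variance = sum((h - mean) ** 2 for h in heights_mm) / len(heights_mm)
sd = math.sqrt(variance)

# Heights no more than one standard deviation away from the mean.
within = [h for h in heights_mm if abs(h - mean) <= sd]
outside = [h for h in heights_mm if abs(h - mean) > sd]

print(mean, variance, round(sd))   # 394.0 21704.0 147
print(within, outside)             # [470, 430, 300] [600, 170]
```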

