4. Biostatistics
The study of statistics as applied to biological areas
such as biological laboratory experiments, medical
research (including clinical research), and public health
services research.
‘Biostatistics, far from being an unrelated mathematical
science, is a discipline essential to modern medicine – a
pillar in its edifice’ (Journal of the American Medical
Association, 1966)
5. Bioinformatics
Bioinformatics is a science straddling the domains of
biomedicine, informatics, mathematics and statistics.
Applying computational techniques to biological data:
Functional genomics
Proteomics
Sequence analysis
Phylogenetics
etc.
6. “Informatics” in Bioinformatics
Databases
Building, Querying
Object DB
Text String Comparison
Text Search
Finding Patterns
AI / Machine Learning
Clustering
Data mining
etc
7. Current Research
Statistical methods for high-throughput data analyses,
particularly for next-generation sequencing (NGS) data
(whole-genome-seq, exome-seq and RNA-seq).
RNA microarray expression studies and GWAS in cancer
and cardiovascular diseases.
Classification in NGS data.
R-Graphical User Interface (R-GUI) for high-throughput
data analyses.
8. Course Outline
Basic concepts of research
Problem identification and hypothesis
Literature review
Research design
Quantitative research
Writing a scientific report/paper
9. Course Workload
40% Theory, 60% practice
Group Project (5 students)
Presentation every week
Slides can be seen at :
http://www.slideshare.net/hafidztio/
Setia Pramana Survival Data Analysis
10. Research
An organized, systematic, data-based, critical,
objective, scientific inquiry or investigation into a
specific problem, undertaken with the purpose of
finding answers or solutions to it.
It provides the needed information that guides
managers to make informed decisions to successfully
deal with problems.
The information provided could be the result of a
careful analysis of data gathered firsthand or of data
that are already available (in the company, industry,
archives, etc.).
11. Purpose of Research
Review or synthesize existing knowledge
Investigate existing situations or problems
Provide solutions to problems
Explore and analyze more general issues
Construct or create new procedures or systems
Explain new phenomena
Generate new knowledge
or a combination of any of the above!
18. Deductive Reasoning
Starts out with a general statement, or hypothesis, and
examines the possibilities to reach a specific, logical
conclusion.
The scientific method uses deduction to test hypotheses
and theories.
Ex: "All men are mortal. Harold is a man. Therefore,
Harold is mortal."
Theory
Hypothesis
Observation
Confirmation
19. Inductive Reasoning
The opposite of deductive reasoning.
Makes broad generalizations from specific observations.
Ex: "Harold is a grandfather. Harold is bald. Therefore,
all grandfathers are bald."
Theory
Tentative Hypothesis
Pattern
Confirmation
21. Basic Steps
1. Develop a research question
2. Conduct thorough literature review
3. Refine research question/hypothesis
4. Design research methodology/study
5. Create research proposal
22. Basic Steps
6. Apply for funding
7. Apply for ethics approval
8. Collect and analyze data / develop
and test software
9. Draw conclusions and relate findings
25. Research Question Development
Problem Identification
Limit the research scope
Research Question Identification
Goals Identification
Hypothesis
Statistical Hypothesis
Hypothetical Statement
27. Possible Source of RQs
Observational Research
Discussions, brainstorming
Experts, academics and industry
Bibliographies, journals, research reports,
popular science magazines, etc.
28. A Research Question Should
Have research value: original, can be
tested/evaluated.
Feasible: can be answered, data are available, and
can be completed within the available cost and time.
Match the researcher's qualifications
29. FINER criteria for RQ
F  Feasible: adequate number of subjects; adequate technical expertise; affordable in time and money; manageable in scope
I  Interesting: getting the answer intrigues the investigator, peers and community
N  Novel: confirms, refutes or extends previous findings
Hulley S, Cummings S, Browner W, et al. Designing clinical research. 3rd ed.
Philadelphia (PA): Lippincott Williams and Wilkins; 2007.
30. FINER criteria for RQ
E  Ethical: amenable to a study that the institutional review board will approve
R  Relevant: to scientific knowledge; to clinical and health policy; to future research
Hulley S, Cummings S, Browner W, et al. Designing clinical
research. 3rd ed. Philadelphia (PA): Lippincott Williams and
Wilkins; 2007.
33. Research Hypothesis
The primary research question should be driven by the
hypothesis rather than the data.
The research question and hypothesis should be
developed before the start of the study.
A good hypothesis must be based on a good research
question at the start of a study and drive data collection
for the study.
34. Hypothesis
Is a clear statement of what is intended to be
investigated.
It should be specified before research is conducted and
openly stated in reporting the results.
It allows you to identify:
the research objectives
the key abstract concepts involved in the research
its relationship to both the problem statement and the
literature review
35. Source of Hypothesis
Environment
Literature
Other Empirical Data
Personal Experience
39. Example
There is no significant gain between the pre-test and post-test
scores of students exposed to Computer-Aided
Instruction in Analytic Geometry.
43. 1-sided or 2-sided hypotheses?
A 2-sided hypothesis states that there is a difference
between the experimental group and the control group,
but it does not specify in advance the expected
direction of the difference.
A 1-sided hypothesis states a specific direction (e.g.,
there is an improvement in outcomes with computer-
assisted surgery).
A 2-sided hypothesis should be used unless there is a
good justification for using a 1-sided hypothesis.
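As a rough illustration of the difference, the sketch below computes one-sided and two-sided p-values for the same test statistic. It assumes a z statistic that follows the standard normal distribution under the null hypothesis; the function name `z_test_p_values` and the value 1.8 are hypothetical and not taken from the slides.

```python
from statistics import NormalDist


def z_test_p_values(z: float) -> dict:
    """Return two-sided and one-sided p-values for a z statistic.

    Assumes the statistic is standard normal under the null hypothesis.
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    return {
        # Two-sided: a difference in either direction counts as evidence.
        "two_sided": 2.0 * (1.0 - std_normal.cdf(abs(z))),
        # One-sided: only the direction stated in advance counts.
        "one_sided_greater": 1.0 - std_normal.cdf(z),
        "one_sided_less": std_normal.cdf(z),
    }


p = z_test_p_values(1.8)  # hypothetical test statistic
print(p)
```

With this hypothetical statistic the one-sided p-value (about 0.036) falls below 0.05 while the two-sided p-value (about 0.072) does not, which is one reason a 1-sided hypothesis needs to be justified in advance.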
47. Research objective
The primary objective should be coupled with the
hypothesis of the study.
Study objectives define the specific aims of the study and
should be clearly stated in the introduction of the
research protocol.
Example:
Hypothesis: there is no difference in functional outcomes
between computer-assisted acetabular component placement
and free-hand placement.
The primary objective can be stated as follows: this study will
compare the functional outcomes of computer-assisted
acetabular component insertion versus free-hand placement
in patients undergoing total hip arthroplasty.
48. Research objective
The study objective is an active statement about
how the study is going to answer the specific
research question.
Objectives state exactly which outcome measures
are going to be used within their statements.
They not only guide the development of the
protocol and the design of the study, but also
play a role in sample size calculations and in
determining the power of the study.
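To make the link between objectives, sample size and power concrete, here is a minimal sketch of a sample size calculation for comparing two group means. It uses the common normal-approximation formula n = 2 * ((z_(1-alpha/2) + z_power) / d)^2 per group, where d is a standardized effect size. The formula choice, the function name `sample_size_per_group` and the default values are assumptions made for illustration, not part of the slides.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_group(effect_size: float,
                          alpha: float = 0.05,
                          power: float = 0.80) -> int:
    """Approximate subjects needed per group for a two-sided, two-sample test.

    effect_size is the standardized difference in means (difference / SD).
    Uses the normal approximation, so it slightly underestimates the
    sample size a t-test would require.
    """
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1.0 - alpha / 2.0)  # two-sided critical value
    z_power = std_normal.inv_cdf(power)              # quantile for desired power
    return ceil(2.0 * ((z_alpha + z_power) / effect_size) ** 2)


n = sample_size_per_group(0.5)  # hypothetical medium effect size
print(n)  # 63
```

Asking for higher power or for the detection of a smaller effect increases the required sample size, which is why the outcome measure and expected effect should be fixed in the objectives before data collection.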
50. Literature Review
Is an evaluative report of studies found in the
literature related to your selected area.
Should describe, summarize, evaluate and
clarify this literature.
Gives a theoretical basis for the research and
helps you determine the nature of your own
research.
51. Select a limited number of works that are central
to your area rather than trying to collect a large
number of works that are not as closely connected
to your topic area.
Boote, D.N. & Beile, P. (2005). Scholars before
researchers: On the centrality of the dissertation
literature review in research preparation. Educational
Researcher 34/6, 3-15.
52. Literature Review Purpose
Provide a context for the research
Justify the research
Ensure the research hasn't been done before
(or that it is not just a "replication study")
Show where the research fits into the
existing body of knowledge
Enable the researcher to learn from
previous theory on the subject
53. Literature Review Purpose
Illustrate how the subject has been studied
previously
Highlight flaws in previous research
Outline gaps in previous research
Show that the work is adding to the
understanding and knowledge of the field
Help refine, refocus or even change the
topic
60. What you should do
Compare
Contrast
Criticize
Synthesize
Summarize
Hasibuan, 2007, Metode Penelitian
Komputasi
61. Sources
Articles in international journals
Theses
Dissertations
Proceedings
Magazines
Abstract books
Websites
62. Literature Citation
Whenever you quote, summarize, paraphrase or refer to
the work of another person you need to cite it.
Citing gives credit to the original author for any information
that you learn through your research process and share
with the readers.
Citing is the way to give credit to others' work when you
use it in your papers, speeches and projects.
Citing others' work is a very important step in the
academic writing process and the best way to avoid
plagiarism.
63. Literature Citation
Two ways:
Use a sentence that introduces the author
Add the author’s name at the end of the sentence
We must provide the last name and year of publication
Paraphrase with a signal phrase:
“According to Smith (2004), the cost of treating alcoholism is increasing
dramatically.”
Direct quote:
“the cost of treating alcoholism is exceeded only by the cost of
treating illness from tobacco use, and is increasing exponentially”
(Smith, 2004)
65. Research Design
A plan or strategy for conducting the
research
Spells out the basic strategies that
researchers adopt to develop evidence that
is accurate and interpretable.
Deals with matters such as selecting
participants for the research and preparing
for data collection.
66. Purposes of Research Design
1. To provide answers to
research questions
2. To control variance
68. Characteristics of good research design
1. Freedom from bias
2. Freedom from confounding
3. Control of extraneous
variables
4. Statistical correctness for
testing hypotheses
69. TYPES OF RESEARCH
1. Experimental research – involves
manipulating conditions and studying
effects – (IPO-Input-Process-Output)
2. Correlational research – involves studying
relationships among variables within a
single group, and frequently suggests the
possibility of cause and effect.
3. Survey research – involves describing the
characteristics of a group by means of
such instruments as interview schedules,
questionnaires, and tests.
70. Ethnographic research - concentrates on documenting
or portraying the everyday experiences of people using
observation and interviews.
Such studies ask how well, how much or how efficiently something
is done, and what knowledge, attitudes, opinions and the like exist.
Case study – is a detailed analysis of one or a few
individuals
Historical research – involves studying some aspect of
the past
Action research – is a type of research by practitioners
designed to help improve their practice.
71. GENERAL RESEARCH TYPES
It is useful to consider the various
research methodologies we have
described as falling within one or
more general research categories –
Descriptive
Associational
Intervention-type Studies
72. 1. DESCRIPTIVE STUDIES
Describe a given state of affairs as fully and carefully
as possible.
Examples:
- In Biology, where each variety of plant and
animal species is meticulously described and
information is organized into useful taxonomic
categories.
- In educational research, the most common
descriptive methodology is the survey, as when
researchers summarize the characteristics
(abilities, preferences, behaviors, and so on) of
individuals or groups or physical environment
(school)
73. 2. ASSOCIATIONAL RESEARCH
Research that investigates relationships is
often referred to as associational research
Correlational and causal-comparative
methodologies are the principal examples
of associational research.
Example: Studying relationship
(a) between achievement and attitude
(b) between childhood experiences and
adult characteristics
74. (c) between teacher characteristic and
student achievement
(d) between methods of instruction &
achievement (comparing students who
have been taught by each method)
(e) between gender and attitude
(comparing attitudes of males and females)
75. Descriptive research is not fully satisfying, since
most researchers want a more complete understanding
of people and things, which requires analysis beyond
mere description.
Associational studies, too, are ultimately unsatisfying:
- They do not permit researchers to “do something”
to influence or change outcomes.
- Simply determining the interest or achievement
of students does not tell us how to change or
improve either interest or achievement.
76. 3. INTERVENTION STUDIES
To find out whether one thing will have an
effect on something else, researchers need
to conduct some form of intervention
study.
A particular treatment is expected to
influence one or more outcomes.
Such studies enable researchers to assess
77. For example:
- the effectiveness of various teaching
methods,
- curriculum models,
- classroom arrangements
- and other efforts to influence the
characteristics of individuals or groups.
Experiment is the primary methodology
used in intervention research
Some types of research may combine these
3 general types
78. Quantitative vs. qualitative research
Areas | Quantitative | Qualitative
Goals | Theory testing, establishing facts, statistical description, prediction, relationship between variables | Sensitizing concepts, describing multiple realities, grounded theory, developing understanding
Design | Structured, predetermined, formal, specific, detailed plan of operation | Evolving, flexible
79. Areas | Quantitative | Qualitative
Data | Quantitative, quantifiable coding, counts, measures, operationalized variables, statistics | Descriptive, personal documents, field notes, photographs, people’s own words, official documents
Sample | Large, stratified, control groups, precise, random, control of extraneous variables | Small, non-representative, focused, purposeful, convenient
80. Areas | Quantitative | Qualitative
Techniques or methods | Experiments, surveys, structured interviewing, structured observation | Observation, participant observation, review of documents, open-ended interviewing, first-person accounts
Relationship with subjects | Detached, short term, distant, subject-researcher restricted | Empathy, emphasis on trust, democratic
81. Areas | Quantitative | Qualitative
Data analysis | Deductive, statistical | Ongoing; models, themes, concepts; inductive, analytic, constant comparative
Problems | Controlling other variables, validity, reliability | Time consuming, data reduction difficulties, procedures not standardized, difficult to study large populations
82. Research Types under Quantitative & Qualitative
Quantitative | Qualitative
1. Experimental Research | 1. Ethnographic Research
2. Single-Subject Research | 2. Historical Research
3. Correlational Research |
4. Causal-Comparative Research |
5. Survey Research |
83. IDENTIFY WHAT TYPE OF RESEARCH
A historical study of college entrance
requirements over time that examines the
relationship between those requirements
and achievement in mathematics.
An ethnographic study that describes in
detail the daily activities of an inner-city
high school and also finds a relationship
between media attention and teacher
morale in school
An investigation of the effects of different
teaching methods on concept learning and
gender
84. We can classify designs into a simple
threefold classification by asking some
key questions.
85. This threefold classification is especially useful for
describing the design with respect to internal validity.
A randomized experiment generally is the strongest of
the three designs when your interest is in establishing a
cause-effect relationship.
A non-experiment is generally the weakest in this
respect, but only with regard to internal validity or causal assessment.
In fact, the simplest form of non-experiment is a one-shot
survey design that consists of nothing but a single
observation O.
The most common forms of research are descriptive ones.
86. What research type would be appropriate for
these research problems?
1. How do parents feel about the elementary school
counseling program?
2. Do students who have high score on reading tests also
have high scores on writing tests?
3. What effect does the gender of a counselor have on how
he or she is “received by counselees”?
4. How can Tom Adams be helped to learn to read?
87. ANSWER
1. ETHNOGRAPHIC STUDY
2. CORRELATIONAL STUDY
3. CAUSAL-COMPARATIVE STUDY/INTERVENTION STUDY
4. EXPERIMENT/CORRELATIONAL OR
ASSOCIATIONAL-INTERVENTION STUDY
91. What exactly IS a “sample”?
A subset of the population, selected
by either “probability” or “non-probability”
methods. If you have a
“probability sample” you simply
know the likelihood of any member
of the population being included (not
necessarily that it is “random”).
92. SAMPLING
A sample is “a smaller (but hopefully
representative) collection of units from a
population used to determine truths about that
population” (Field, 2005)
Why sample?
Resources (time, money) and workload
Gives results with known accuracy that can be
calculated mathematically
The sampling frame is the list from which the
potential respondents are drawn
Registrar’s office
Class rosters
Must assess sampling frame errors
93. SAMPLING…….
3 factors that influence sample representativeness
Sampling procedure
Sample size
Participation (response)
When might you sample the entire population?
When your population is very small
When you have extensive resources
When you don’t expect a very high response
94. Assumptions of quantitative
sampling
We want to generalize to the
population.
Random events are
predictable.
Therefore…We can compare random
events to our results.
Probability sampling is the
best approach.
97. Process
The sampling process comprises several stages:
Defining the population of concern
Specifying a sampling frame, a set of items or
events possible to measure
Specifying a sampling method for selecting
items or events from the frame
Determining the sample size
Implementing the sampling plan
Sampling and data collecting
Reviewing the sampling process
98. Assumptions of qualitative
sampling
Social actors are not
predictable like objects.
Randomized events are
irrelevant to social life.
Probability sampling is
expensive and inefficient.
Therefore…
Non-probability sampling is
the best approach.
100. Types of Samples
Probability (Random) Samples
Simple random sample
Systematic random sample
Stratified random sample
Multistage sample
Multiphase sample
Cluster sample
Non-Probability Samples
Convenience sample
Purposive sample
Quota
101. Simple Random Sample
1. Get a list or “sampling frame”
a. This is the hard part! It must not systematically exclude
anyone.
b. Remember the famous sampling mistake?
2. Generate random numbers
3. Select one person per random number
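The three steps above can be sketched with Python's standard library. The sampling frame here is a made-up list of 100 identifiers, and the function name `simple_random_sample` and the seed are hypothetical choices for illustration.

```python
import random


def simple_random_sample(frame, n, seed=None):
    """Draw n distinct units from the sampling frame, each equally likely."""
    rng = random.Random(seed)    # a seed makes the draw reproducible
    return rng.sample(frame, n)  # sampling without replacement


# Hypothetical sampling frame: a complete list of 100 people.
frame = [f"person_{i:03d}" for i in range(1, 101)]
srs = simple_random_sample(frame, 10, seed=42)
print(srs)
```

The code is the easy part. As the slide notes, the quality of the sample depends on the frame not systematically excluding anyone.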
102. SIMPLE RANDOM SAMPLING……..
Advantages
Estimates are easy to calculate.
Simple random sampling is always an EPS (equal probability of selection) design, but not all
EPS designs are simple random sampling.
Disadvantages
If the sampling frame is large, this method is impracticable.
Minority subgroups of interest in the population may not be
present in the sample in sufficient numbers for study.
103. Systematic Random Sample
1. Select a random number, which will be known as k
2. Get a list of people, or observe a flow of people (e.g.,
pedestrians on a corner)
3. Select every kth person
a. Careful that there is no systematic rhythm to the flow
or list of people.
b. If every 4th person on the list is, say, “rich” or “senior”
or some other consistent pattern, avoid this method
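A minimal sketch of systematic selection follows. It assumes the usual textbook variant in which the interval k is the frame size divided by the desired sample size and the random number is the starting position within the first interval. The function name `systematic_sample` and the example list are hypothetical.

```python
import random


def systematic_sample(frame, n, seed=None):
    """Select every k-th unit from the frame after a random start."""
    k = len(frame) // n          # sampling interval
    if k < 1:
        raise ValueError("sample size exceeds frame size")
    rng = random.Random(seed)
    start = rng.randrange(k)     # random start within the first interval
    return [frame[start + i * k] for i in range(n)]


# Hypothetical list of 100 people, in the order they appear on the list.
people = list(range(100))
chosen = systematic_sample(people, 10, seed=3)
print(chosen)
```

If the list has a rhythm with the same period as k (for example, every 10th entry is a supervisor), every selected unit shares that trait, which is the hidden-periodicity risk described above.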
104. SYSTEMATIC SAMPLING……
ADVANTAGES:
Sample easy to select
Suitable sampling frame can be identified easily
Sample evenly spread over entire reference population
DISADVANTAGES:
Sample may be biased if hidden periodicity in population
coincides with that of selection.
Difficult to assess precision of estimate from one survey.
105. Stratified Random Sample
1. Separate your population into groups or “strata”
2. Do either a simple random sample or systematic
random sample from there
a. Note you must know easily what the “strata” are before
attempting this
b. If your sampling frame is sorted by, say, school district,
then you’re able to use this method
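The two steps can be sketched as below, assuming proportional allocation (the same sampling fraction in every stratum) and a simple random sample within each stratum. The function name `stratified_sample`, the district labels and the 10% fraction are hypothetical.

```python
import random
from collections import defaultdict


def stratified_sample(units, stratum_of, fraction, seed=None):
    """Draw a simple random sample of the given fraction within each stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in units:                       # step 1: separate into strata
        strata[stratum_of(unit)].append(unit)
    sample = {}
    for name, members in strata.items():     # step 2: sample inside each one
        size = max(1, round(fraction * len(members)))
        sample[name] = rng.sample(members, size)
    return sample


# Hypothetical frame: 60 students in a "north" district and 40 in "south".
students = [("north", i) for i in range(60)] + [("south", i) for i in range(40)]
picked = stratified_sample(students, stratum_of=lambda s: s[0],
                           fraction=0.1, seed=1)
print({name: len(members) for name, members in picked.items()})
```

Because each stratum is sampled separately, every stratum is guaranteed to appear in the sample, which a simple random sample does not guarantee for small subgroups.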
106. STRATIFIED SAMPLING……
Drawbacks to using stratified sampling.
First, sampling frame of entire population has to be
prepared separately for each stratum
Second, when examining multiple criteria, stratifying
variables may be related to some, but not to others,
further complicating the design, and potentially reducing
the utility of the strata.
Finally, in some cases (such as designs with a large number
of strata, or those with a specified minimum sample size
per group), stratified sampling can potentially require a
larger sample than would other methods
107. Multi-stage Cluster Sample
1. Get a list of “clusters,” e.g., branches of a company
2. Randomly sample clusters from that list
3. Have a list of, say, 10 branches
4. Randomly sample people within those branches
a. This method is complex and expensive!
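A two-stage version of this procedure might look like the following sketch: clusters are sampled first, then people within the chosen clusters. The company with 30 branches of 20 employees each, the function name `two_stage_cluster_sample` and all sizes are made up for illustration.

```python
import random


def two_stage_cluster_sample(clusters, n_clusters, n_per_cluster, seed=None):
    """Stage 1: sample clusters. Stage 2: sample units inside each chosen cluster."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)  # random clusters
    return {
        name: rng.sample(clusters[name],
                         min(n_per_cluster, len(clusters[name])))
        for name in chosen
    }


# Hypothetical company: 30 branches with 20 employees each.
branches = {
    f"branch_{b:02d}": [f"b{b:02d}_emp{e:02d}" for e in range(20)]
    for b in range(30)
}
sample = two_stage_cluster_sample(branches, n_clusters=10,
                                  n_per_cluster=5, seed=7)
print(sorted(sample))
```

Only the selected branches need a list of employees, which is the practical appeal of cluster designs when no complete frame of individuals exists.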
109. The Snowball Sample
1. Find a few people that are relevant to your topic.
2. Ask them to refer you to more of them.
110. The Quota Sample
1. Determine what the population looks like in terms of
specific qualities.
2. Create “quotas” based on those qualities.
3. Select people for each quota.
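As a rough sketch of step 3, the code below accepts people in the order they are encountered until each quota is full. The quota sizes, the `sex` attribute and the function name `quota_sample` are hypothetical. Note that no random selection is involved, which is why this is a non-probability method.

```python
def quota_sample(stream, quota_key, quotas):
    """Accept people in arrival order until every quota is filled."""
    remaining = dict(quotas)          # copy so the caller's dict is untouched
    selected = []
    for person in stream:
        group = quota_key(person)
        if remaining.get(group, 0) > 0:
            selected.append(person)
            remaining[group] -= 1
        if not any(remaining.values()):
            break                     # all quotas filled
    return selected


# Hypothetical stream of passers-by with one recorded quality.
passers_by = [{"id": i, "sex": "F" if i % 3 else "M"} for i in range(30)]
chosen_people = quota_sample(passers_by, lambda p: p["sex"], {"F": 3, "M": 2})
print([p["id"] for p in chosen_people])
```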
113. Types of Research for the Undergraduate Thesis (Skripsi)
in Statistical Computing at STIS
Development of statistical information systems
Computer-based information systems developed
to support activities in the statistics domain.
Examples: Statistical Reference Information System, Geographic
Information Systems that use (processed) statistical
data, Statistical Dissemination Information System, and
Data Entry and Monitoring Information Systems for
statistical data collection activities.
114. Types of Research for the Undergraduate Thesis (Skripsi)
in Statistical Computing at STIS
Development of statistical applications
Application programs built to support problem solving
in the field of statistics.
The program must be written by the student, and the problem must
not yet be solvable with existing statistical data-processing
packages; or the program may be built on an existing
library that does not yet have an interface; or the problem
can be solved with existing packages but the process/procedure
is not (yet) efficient, so an integrated application is needed.
Examples: Development of a Regression Fitting Application, an Application
for Hypothesis Testing Using the Permutation Test in
Resampling.
115. Types of Research for the Undergraduate Thesis (Skripsi)
in Statistical Computing at STIS
Technology studies in the field of statistical computing
Studies conducted across these two disciplines whose
results can benefit the development of computer science
or statistics.
Research topics that do not fit the first or second type
can be placed in this third type if the topic is judged to
have originality and innovation as well as a high level of
contribution to the development of computer science or
statistics, to Badan Pusat Statistik, or to society.
Examples: Development of a Database-Driven Expert System Inference Engine
(Case Study: Determining the Method for Compiling Price
and Production Indices), Development of a Statistical Search Engine Based on
Supervised Learning and Relevance Feedback.
117. Questionnaires
The most common instrument or tool of research for
obtaining data beyond the physical reach of the
observer.
Closed form / Closed-ended
Open form / Open-ended
118. Questionnaires
Clarity of language
Singleness of purpose
Relevant to the objective of the study
Correct grammar
119. Questionnaire: Advantages
Facilitates data gathering
Is easy to test data for reliability and validity
Is less time-consuming than interview and
observation
Preserves the anonymity and confidentiality of the
respondents’ reactions and answers
120. Questionnaire: Disadvantages
Printing and mailing are costly
Response rate may be low
Respondents may provide only socially acceptable
answers
There is less chance to clarify ambiguous answers
Respondents must be literate and with no physical
handicaps
Rate of retrieval can be low because retrieval itself is
difficult
121. Interview
Purpose:
to verify information gathered from written sources
to clarify points of information
to update information and
to collect data
123. How to measure the instruments?
Validity – the instrument measures what it is intended to measure
External validity: Can the results of a study be generalized from a
sample to a population?
Content validity: The appropriateness of the content of an
instrument. In other words, do the measures (questions, observation
logs, etc.) accurately assess what you want to know?
Reliability – stability in maintaining consistent measurement in
a test administered twice
Inter-Rater/Observer Reliability: The degree to which different
raters/observers give consistent answers or estimates.
Test-Retest Reliability: The consistency of a measure evaluated over
time.
Parallel-Forms Reliability: The reliability of two tests constructed the
same way, from the same content.
Internal Consistency Reliability: The consistency of results across
items, often measured with Cronbach’s Alpha.
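For internal consistency, Cronbach's alpha can be computed as alpha = k/(k-1) * (1 - sum of item variances / variance of total scores), where k is the number of items. The sketch below implements that formula with sample variances. The response matrix is invented for illustration, and the function name `cronbach_alpha` is a hypothetical choice.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(scores[0])                     # number of items
    items = list(zip(*scores))             # transpose: one tuple per item
    item_var_sum = sum(variance(item) for item in items)
    total_var = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1.0 - item_var_sum / total_var)


# Hypothetical questionnaire: 5 respondents answering 3 items on a 1-5 scale.
responses = [
    [3, 4, 3],
    [4, 5, 5],
    [2, 2, 3],
    [5, 5, 4],
    [1, 2, 2],
]
alpha = cronbach_alpha(responses)
print(round(alpha, 3))
```

Values closer to 1 indicate that the items vary together. Which threshold counts as acceptable depends on the field and the purpose of the instrument.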