1
QUANTITATIVE RESEARCH METHODOLOGY
NOR IDAYU MAHAT
CENTRE FOR UNIVERSITY-INDUSTRY COLLABORATION (CUIC)
UNIVERSITI UTARA MALAYSIA
04-928 4098 / [email protected]
2
Contents
Basic concepts
o Statistics and research
Sampling, techniques and procedures
Measurements:
o Scale
o Adequacy, validity, reliability and sensitivity
Exploring your data
Statistical inference
Hypothesis testing
Analysis of difference
Complex analyses
3
Basic concept: Research Activity Types
Pure Basic
• Experimental and theoretical work undertaken to acquire new knowledge for the advancement of knowledge.
Strategic Basic
• Experimental and theoretical work undertaken to acquire new knowledge for specified broad areas in the expectation of useful discoveries.
Applied
• Original work undertaken to acquire new knowledge with a specific application in view, e.g. to determine possible uses for the findings of basic research.
Experimental
• Systematic work, using existing knowledge for the purpose of creating new or improved products/processes.
4
Example Research
Example: Modification of an existing control chart (Nor Idayu Mahat & Sharipah Soaad, 2011)
This study discusses the problem of constructing control charts for multiple quality characteristics when the traditional Hotelling $T^2$ fails to detect shifts in the mean or in the relationship among the measured quality characteristics. Alternative control charts based on a modified one-step M-estimator, which is robust to outliers, are proposed to overcome this weakness ... Results from simulation studies showed that the proposed robust control charts offer better performance ... when the variables are independent or dependent.
5
Example Research
Example: The use of Principal Component Analysis in monitoring gear faults (Li et al., 2003)
This paper presents a study that uses principal component analysis to reduce the dimensionality of the feature space and to get an optimal subspace for machine fault classification ... The experimental results indicate that the method extracts diagnostic information effectively for gear fault classification and has a good potential for application in practice.
6
Basic concept
Quantitative Research
Scientific application of mathematical principles to the collection, analysis and presentation of numerical data.
Mathematical principles (?)
Collection – applying knowledge of survey and experimental design in order to get information.
Analysis – processing and analysing the collected information to answer some questions.
Presentation – interpreting the results obtained from the analysis in some meaningful ways.
7
Basic concept
When is quantitative analysis needed?
There is a need to present and to interpret numerical data.
There is a need to test some defined statements mathematically.
The aim is to classify variables, count them, and construct statistical models in an attempt to explain what is observed.
Precise prediction is a major concern.
8
Basic concept: What is data?
Numbers
Measurements
Words
Figures
9
Basic concept: Types of data
o Secondary data
• data that has already been collected.
• It could be raw data or compiled data.
o Secondary sources:
• Hardcopies – books, articles, directories, conference papers, newspapers, magazines, research reports and market reports.
• Electronic resources – CD-ROM, on-line databases, internet, videos and broadcasts.
10
Basic concept: Types of data
o Primary data ~ the researcher collects the data herself.
o Methods
• Observation
• Experiment
• Interviews: face-to-face interview, focus group, panels
• Questionnaire
• Diaries
• Portfolios
11
Basic concept: Types of data
Secondary data | Primary data
May not match your need. | Commonly matches your need.
Access may be difficult or costly. | Original.
May save some costs and time. | Sometimes involves some costs and time.
Allows for longitudinal studies. | May not be appropriate for longitudinal studies.
Concern: validity of some secondary data (e.g. internet sources). | Concern: validity of the process of collecting the data.
12
Population and sample
Where can we get the data?
Population – all entities (people or items) with the characteristics one wishes to study.
Population structure describes the relative numbers of entities with similar characteristics.
Sample – a subset of the entities in the population, used to answer questions about the population as a whole.
13
Population and sample
Principle of Sampling
o Entities in a sample must be
• taken from the target population following some standard procedures.
• able to represent the actual population.
• adequate in number for the planned analysis.
• adequate to supply necessary information to the research questions.
14
Population and sample
[Figure: Sample A, Sample B and Sample C, each drawn from the population. Which one should be used?]
15
Basic concept of statistical tools
Before we decide to use either a population or a sample, let us focus on statistical tools ...
Descriptive statistics
Procedures to summarise and to describe the important characteristics of a set of measurements.
The art of statistics.
Inferential statistics
Procedures to make inferences about population characteristics from information contained in a sample drawn from the target population.
16
Basic concept of statistical tools
o Probability sampling
• Every object in the population has a known, non-zero chance of being chosen for the sample (an equal chance under simple random sampling).
• Less biased sampling procedure.
o Nonprobability sampling
• Objects in a sample are usually selected on the basis of accessibility.
• Biased sampling procedure.
17
Sampling methods
Probability sampling
1. Simple random
2. Systematic sampling
3. Stratified sampling
4. Cluster sampling (and multi-stage)
Nonprobability sampling
5. Quota
6. Snow-ball
7. Convenience (opportunity)
8. Purposive
9. Self-selection
18
Probability sampling
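o The researcher must ensure that every object has a known, non-zero opportunity for selection (an equal one under simple random sampling).
o Randomisation is a must.
o With proper randomisation the techniques guard against systematic selection bias; sampling error remains but can be quantified.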
19
Sampling methods: example
In the early stages of planning a school restructuring effort, school district board members are considering a year round schooling program. For the moment, the board is interested in the degree to which parents/legal guardians favor such a change. A simple random sample (n = 300) of parents/legal guardians was drawn from 1,850 families (only one adult per household) and given a questionnaire.
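The draw described in this example can be sketched in a few lines. This is only an illustration: the family IDs are hypothetical stand-ins for the district's actual list, and the seed is fixed solely to make the draw reproducible.

```python
import random

# Hypothetical sampling frame: one ID per family (1,850 families, as in the example).
families = list(range(1, 1851))

rng = random.Random(42)  # fixed seed only so the draw is reproducible

# Simple random sampling without replacement: every family has the same
# chance of selection (300 / 1850, roughly 16%).
sample = rng.sample(families, k=300)

print(len(sample), sorted(sample)[:5])
```

Sampling without replacement matters here because each household should receive at most one questionnaire.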
20
Sampling methods
1. Simple random sampling
The ideal choice for obtaining objects at random.
Each object in the sample must
o be chosen at random from the population list.
o have the same chance of being selected.
Drawbacks
o A population list is difficult to obtain.
o It is sometimes difficult to reach the objects that have been identified.
21
Sampling methods
2. Systematic sampling
Sampling procedure
1. Prepare a list of all objects in the population.
2. Choose the first object at random from the population list.
3. Choose each subsequent object at an interval of k from the previous choice.
4. Repeat the selection process (3) until the number of objects obtained meets the required sample size.
22
1 11 21 31
2 12 22 32
3 13 23 33
4 14 24 34
5 15 25 35
6 16 26 36
7 17 27 37
8 18 28 38
9 19 29 39
10 20 30 40
List of students in Class A (5 students are needed, one at every 7th position)
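The class-list illustration above (40 positions, 5 students, interval k = 7) can be sketched as follows. The function name and the seed are assumptions made for this illustration only.

```python
import random

def systematic_sample(population, n, k, rng):
    """Pick a random start among the first k positions, then every k-th object."""
    start = rng.randrange(k)              # 0-based index of the first pick
    picks = population[start::k][:n]      # step through the list at interval k
    if len(picks) < n:
        raise ValueError("list too short for the requested n and k")
    return picks

students = list(range(1, 41))             # positions 1..40 in the class list
rng = random.Random(7)                    # seed chosen arbitrarily
chosen = systematic_sample(students, n=5, k=7, rng=rng)
print(chosen)
```

Only the first position is random; every later pick is determined by it, which is why the ordering of the list should be unrelated to the variable being studied.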
23
Sampling methods
3. Stratified sampling
Sampling procedure
1. Every object in the population is arranged into groups (strata) according to a particular attribute (e.g. gender, socio-economic status and income).
2. Choose a number of objects at random from each stratum, using either
the same percentage for every stratum, or
a different percentage for each stratum.
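A minimal sketch of the two allocation options (the same percentage in every stratum, or different percentages by stratum). The strata, their sizes and the sampling fractions are hypothetical.

```python
import random

def stratified_sample(strata, fractions, rng):
    """Draw a simple random sample inside each stratum.

    strata:    dict mapping stratum name -> list of objects
    fractions: dict mapping stratum name -> sampling fraction (0..1)
    """
    sample = {}
    for name, members in strata.items():
        size = round(len(members) * fractions[name])
        sample[name] = rng.sample(members, size)
    return sample

# Hypothetical population split by gender.
strata = {
    "female": [f"F{i}" for i in range(60)],
    "male": [f"M{i}" for i in range(40)],
}
rng = random.Random(1)

equal = stratified_sample(strata, {"female": 0.10, "male": 0.10}, rng)
unequal = stratified_sample(strata, {"female": 0.10, "male": 0.25}, rng)
print({k: len(v) for k, v in equal.items()})
print({k: len(v) for k, v in unequal.items()})
```

With unequal fractions, the analysis would normally need weights so that the over-sampled stratum does not dominate population estimates.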
24
Sampling methods
4. Cluster sampling
Cluster sampling
o closely resembles stratified sampling.
o Clusters from the population are chosen at random, then all objects in the chosen clusters are taken as the study sample.
Multi-stage sampling is suitable for cases that involve a geographical structure.
25
Sampling methods
5. Quota sampling
Closely resembles stratified sampling, but it is not random.
It is commonly used
o in studies that involve interviews.
o when the population size is infinite.
26
Sampling methods
The researcher fixes the number of objects to be represented for each trait; this number is the quota.
Example:
Gender | Age (years) | Quota
Male | 20–29 | 56
Male | 30–44 | 104
Female | 20–29 | 50
Female | 30–44 | 110
27
Sampling methods
6. Snowball sampling
This method is suitable when objects in the population are difficult to locate.
Sampling strategy:
1. The researcher needs to find a first object that is suitable for the study.
2. The second and subsequent objects are identified with help from the objects already identified.
3. The objects in the sample are not random.
28
Sampling methods
7. Convenience: objects are chosen because they are easy to obtain.
8. Purposive: the researcher chooses only objects that are suitable for achieving the study objectives.
9. Self-selection: the sample consists of objects that take part voluntarily.
29
More sampling methods
Line-intersect sampling
An element in a region is sampled if a chosen line segment intersects it.
Panel sampling
A sampling group is chosen (usually at random), and is asked for the same information repeatedly over a period of time.
Event sampling
The behaviour of interest is recorded each time it occurs during a specified observation period.
30
More sampling methods: Hypothetical data
A set of data that is generated randomly from some known distribution(s).
When can a hypothetical data set be used?
To test the performance of a new model/approach under in-control conditions.
To help a researcher identify possible problems with the proposed model/approach.
31
Hypothetical data: Example
Phase I: construction of control chart
Step 1 Generate 5000 samples of observations, $X_{ij}$, $i = 1, 2, \dots, p$ and $j = 1, 2, \dots, n$, from $N_p(0, I_p)$.
Step 2 Compute the robust location and scale estimates for each sample.
Step 3 Randomly generate a new observation, $X_i$, from $N_p(0, I_p)$.
Step 4 Compute the respective $T^2$.
Step 5 Identify the UCL at the 95th (99th) percentile of the 5000 $T^2$ values in Step 4.
Step 6 Generate 1000 samples of observations, $X_{ij}$, $i = 1, 2, \dots, p$ and $j = 1, 2, \dots, n$, from the contaminated model.
Step 7 Compute the robust location and scale estimates for each sample.
32
Checklist….
• Population vs. Sample
• Objects/respondents
• Variables vs. Constant value
• Parameter vs. Estimator
• Randomness
• Types of data:
• Cross-sectional
• Time series
• Functional series
• Spatial data
33
Errors in research activities
Sampling error – caused by sampling design
Selection error
Estimation error
Non-sampling error – caused by factors other than the sampling design, e.g. coverage, data collection and data processing
Over / under coverage
Processing error
Non-response
Measurement error
34
Measurement
Constant value – an actual value or a specific characteristic whose value does not change.
Variable – a characteristic with values that may vary.
Level of measurement:
Nominal
Ordinal
Interval
Ratio
35
Nominal
Which of the following daily newspapers have you read during the past month?
Read Not read Don’t know
The Star
The New Straits Times
Berita Harian
36
Ordinal
One can ask respondents to place things in rank order.
Example: Please number each of the factors listed in order of importance in your choice of a new car.
a. Price ____
b. Fuel economy ____
c. Acceleration ____
d. Safety features ____
Otherwise, one may use the common scales such as Likert, semantic differential scale, Guttman scale and Thurstone scale.
37
Measurement
Data
o Categorical: Nominal, Ordinal
o Quantifiable: Interval, Ratio
Increasing precision: Nominal, Ordinal, Interval, Ratio
38
Exploring your data
It is a good practice to understand your data before any complex analysis is performed.
Objective:
o To identify any strange behaviour in the data.
o To determine a suitable technique that can be applied to the data.
o For validation purposes.
o To interpret the obtained results better.
39
Exploring your data
• Missing value: Objects with no value in some variables.
• Some strategies to handle missing value:
o Exclude objects with missing value.
o Replace a missing value by the mean of all available values for the relevant variable.
o Imputation: missing values can be replaced by some suitable numerical entries.
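The first two strategies can be sketched as below. The variable and its values are hypothetical, and None marks a missing entry.

```python
from statistics import mean

# Hypothetical variable with two missing entries (None).
ages = [23, 31, None, 27, 45, None, 38]

observed = [x for x in ages if x is not None]

# Strategy 1: exclude objects with a missing value.
complete_cases = observed

# Strategy 2: replace each missing value by the mean of the available values.
fill = mean(observed)
imputed = [fill if x is None else x for x in ages]

print(len(complete_cases), fill, imputed)
```

Mean replacement keeps the sample size but shrinks the variance of the variable, so it is best treated as a quick fix rather than a default.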
40
Exploring your data
• Outliers: Values that are distinctly different from other values.
• Outliers may bias the estimated values and so lead to misleading results.
• Strategies to handle outliers:
o Outliers due to recording errors should be corrected.
o If the values are genuine then some thought must be given as to whether or not they should be retained.
41
Exploring your data
The effect of an outlier in computing the average value.
 | Sales (RM) | Sales (RM)
 | 70.63 | 70.63
 | 56.28 | 56.28
 | 70.98 | 70.98
 | 7.00 | 70.00
 | 68.42 | 68.42
 | 56.74 | 56.74
 | 60.04 | 60.04
Mean | 55.73 | 64.73
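The two averages in the last row of the table can be checked directly; the two lists below are the two columns of sales figures, which differ only in the fourth value (7.00 versus 70.00).

```python
from statistics import mean

column_1 = [70.63, 56.28, 70.98, 7.00, 68.42, 56.74, 60.04]   # contains the outlier 7.00
column_2 = [70.63, 56.28, 70.98, 70.00, 68.42, 56.74, 60.04]  # same data with 70.00 instead

mean_1 = mean(column_1)
mean_2 = mean(column_2)
print(f"mean with 7.00:  {mean_1:.2f}")
print(f"mean with 70.00: {mean_2:.2f}")
```

A single value out of seven moves the average by about RM9, which is the point of the slide.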
42
How to explore your data?
Tabular display
Plot (e.g. histogram, bar chart etc.)
o Better than statistical values but limited to 2 or 3 variables at one time.
Statistical values
o Common statistical values can be used such as mean, variance etc.
Map the data
• Can be done using e.g. Principal Component Analysis, Factor Analysis, Multidimensional Scaling etc.
43
Tabular display
Frequency table: Sex
 | Frequency | Percent | Valid Percent | Cumulative Percent
Female | 87 | 32.2 | 32.2 | 32.2
Male | 183 | 67.8 | 67.8 | 100.0
Total | 270 | 100.0 | 100.0 |
Cross tabulation: Sex by number of repeated exams
Sex | 1 | 2 | 3 | 4 | Total
Female | 59 | 12 | 12 | 4 | 87
Male | 101 | 46 | 21 | 15 | 183
Total | 160 | 58 | 33 | 19 | 270
What information can be extracted from these tables?
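One way to start answering this question is to turn the cross tabulation into row percentages and a chi-square statistic. The sketch below uses the counts from the table; the choice of the Pearson chi-square statistic is an assumption of this illustration, not something the slide prescribes.

```python
# Observed counts from the cross tabulation above
# (rows: sex; columns: number of repeated exams 1..4).
observed = {
    "Female": [59, 12, 12, 4],
    "Male": [101, 46, 21, 15],
}

row_totals = {sex: sum(counts) for sex, counts in observed.items()}
col_totals = [sum(col) for col in zip(*observed.values())]
grand_total = sum(row_totals.values())

# Row percentages: distribution of repeated exams within each sex.
row_pct = {
    sex: [round(100 * c / row_totals[sex], 1) for c in counts]
    for sex, counts in observed.items()
}

# Pearson chi-square statistic against the independence model:
# expected count = row total * column total / grand total.
chi_square = sum(
    (obs - row_totals[sex] * col_totals[j] / grand_total) ** 2
    / (row_totals[sex] * col_totals[j] / grand_total)
    for sex, counts in observed.items()
    for j, obs in enumerate(counts)
)
dof = (len(observed) - 1) * (len(col_totals) - 1)

print(row_pct)
print(f"chi-square = {chi_square:.2f} on {dof} degrees of freedom")
```

The statistic would then be compared with a chi-square distribution on the stated degrees of freedom to judge whether the number of repeated exams depends on sex.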
44
Tabular display
Bad presentation
45
Pie chart
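[Figure: pie charts of Years Experience (categories: 5 or less, 6-10, 11-15, 16-20, 21-35, 36 or more; slice labels 8, 16, 26, 25, 18 and 7), one version marked GOOD and one marked BAD]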
46
Line chart / series plot
For continuous measurements.
Often used to highlight some patterns or behaviour of the target variable.
47
Scatter plot
48
Bar chart
Alternative presentation for table.
For categorical measurement only.
Sometimes can be useful to identify the distribution of the data.
49
Box and Whiskers
Suitable for numerical values.
This plot summarises some important statistics (and features) which include:
o Median
o Quartiles
o Potential outliers
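The statistics that a box-and-whiskers plot displays can be computed directly. The data are hypothetical, and flagging points more than 1.5 × IQR beyond the quartiles (Tukey's rule) is an assumed convention; plotting software may use a different rule or quartile definition.

```python
from statistics import quantiles

data = [12, 15, 14, 10, 18, 21, 13, 16, 55, 17, 14]  # hypothetical values

q1, q2, q3 = quantiles(data, n=4)       # quartiles; q2 is the median
iqr = q3 - q1

# Tukey's rule (an assumed convention): flag points beyond 1.5 * IQR from the box.
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outliers = [x for x in data if x < lower or x > upper]
print(q1, q2, q3, outliers)
```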
50
Histogram
51
Numerical values
The centre (middle) of the distribution of measurements.
Some measurements:
o Mode
o Median
o Sum
o Arithmetic mean
o Trimmed mean
o Robust mean
52
Numerical values
Measures of dispersion represent how the data scatter around the centre point (i.e. the central tendency value).
Some measurements:
o Range
o Percentile
o Quartiles ; interquartile range (IQR)
o Variance
o Standard deviation; coefficient of variation (CV)
o Standard error of mean
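Several of these measures can be sketched with the standard library alone. The data reuse the second sales column from the earlier outlier slide; the variance and standard deviation are the sample versions (divisor n − 1), which is an assumption about the intended definition.

```python
from math import sqrt
from statistics import mean, stdev, variance

data = [70.63, 56.28, 70.98, 70.00, 68.42, 56.74, 60.04]  # sales (RM)

n = len(data)
data_range = max(data) - min(data)
var = variance(data)            # sample variance (divides by n - 1)
sd = stdev(data)                # sample standard deviation
cv = sd / mean(data) * 100      # coefficient of variation, in percent
se = sd / sqrt(n)               # standard error of the mean

print(f"range = {data_range:.2f}, variance = {var:.2f}, sd = {sd:.2f}")
print(f"CV = {cv:.1f}%, SE of mean = {se:.2f}")
```

The CV is unit-free, so it is the natural choice when comparing the spread of variables measured on different scales.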
53
Weakness of descriptive tools
Descriptive statistics cannot support broader statements about differences and relationships in the data.
They cannot be used to draw conclusions or make predictions about the properties of a population when the information is obtained from a sample.
54
Statistical inference
Why is inference about the population necessary?
o Sometimes relevant facts are abundant.
o Plots may yield conflicting opinions regarding conclusions among decision makers.
o Humans are incapable of utilising large amounts of data.
So, information contained in a sample is used to make inferences about a population. Common methods are
o estimation.
o statistical hypothesis testing.
55
Statistical inference
Estimation: a process that estimates the value of a parameter of interest. It answers the following question:
• What is the value of the population parameter?
• Example: What is the average salary of Malaysians?
Statistical hypothesis testing: a procedure that tests a hypothesis about the value of a parameter of interest. It answers the following question:
• Is the parameter value equal to this specific value?
• Example: Is it true that Malaysians earn RM2200 monthly on average?
56
Hypothesis testing
Step 1: Formulate hypotheses.
Step 2: Identify an appropriate test statistic to assess the hypotheses.
Step 3: Compute the test statistic (or the p-value).
Step 4: Compare the test statistic to the critical value of the related distribution (or the p-value to the chosen significance level, α).
Step 5: Make decision and conclusion.
57
Hypothesis testing
• Null hypothesis (H0): the hypothesis of no effect, e.g. the process change makes no difference.
• Alternative hypothesis (H1): the choice that can be considered if H0 can be ruled out, e.g. the process change has an effect.
58
Hypothesis testing
59
Hypothesis testing: identifying test statistics
• Test statistic: a quantity computed from the sample data.
• Test statistic vs. distribution value (e.g. normal dist., chi-square dist etc.)
• p-value: the probability, assuming H0 is true, of obtaining a test statistic at least as extreme as the one observed.
• Also known as the observed level of significance.
• p-value vs. identified value of α.
60
Hypothesis testing: decision making
Choose either one:
If the p-value is less than or equal to α, we have enough evidence to reject H0.
If the p-value is greater than α, we do not have enough evidence to reject H0 (but it doesn’t mean that H0 is true).
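The decision rule can be sketched with a one-sample z-test of the RM2200 salary question. Everything numerical here is hypothetical: the ten salaries are made up, and the population standard deviation (sigma = 300) is assumed known so that the normal distribution applies; with an unknown sigma a t-test would be used instead.

```python
from math import sqrt
from statistics import NormalDist, mean

def z_test_two_sided(sample, mu0, sigma):
    """One-sample z-test of H0: mu = mu0 when the population sd (sigma) is known."""
    z = (mean(sample) - mu0) / (sigma / sqrt(len(sample)))
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Hypothetical monthly salaries (RM); sigma = 300 is an assumed known value.
salaries = [2350, 2100, 2480, 2250, 2600, 2150, 2400, 2300, 2550, 2200]
alpha = 0.05

z, p_value = z_test_two_sided(salaries, mu0=2200, sigma=300)
decision = "reject H0" if p_value <= alpha else "do not reject H0"
print(f"z = {z:.3f}, p-value = {p_value:.4f}: {decision}")
```

For these made-up figures the p-value exceeds α, so H0 is not rejected; that is a lack of evidence against RM2200, not proof that it is the true mean.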
61
ANALYSIS OF DIFFERENCE
One population comparison
Two populations comparison
Multiple populations comparison
62
One Population Comparison
To test the central value, CT, of a target population. Various hypothesis tests:
Two-tail test
$H_0: CT = \mu_0$ vs. $H_1: CT \neq \mu_0$
One-tail tests
$H_0: CT \le \mu_0$ vs. $H_1: CT > \mu_0$
or
$H_0: CT \ge \mu_0$ vs. $H_1: CT < \mu_0$
The value of $\mu_0$ is known. It might be obtained from previous studies, experts’ opinion etc.
63
One Population Comparison
Parametric methods
o Appropriate if the population is normally distributed.
o Strategy:
1. Write a research hypothesis.
2. Choose an appropriate test statistic (either the Z-statistic or the t-statistic) and calculate its value (or the p-value) based on the obtained sample.
3. Check the rejection region. Reject H0 if the p-value is less than the fixed Type I error rate, α.
4. Draw conclusions.
64
One Population Comparison
Non-parametric methods
o May be the best methods when the population distribution is highly skewed or heavy-tailed.
o Often, the median is used.
o Example methods: sign test and Binomial test.
o Strategy:
1. Identify the value of the population median.
2. Values are ordered from the smallest to the largest.
3. The sample median, $\hat{M}$, is calculated.
4. Compare the sample median and the population median.
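A minimal sketch of the sign test named above: count how many observations fall above and below the hypothesised median and refer the smaller count to a Binomial(n, 0.5) distribution. The passenger counts are hypothetical and the function name is an assumption of this illustration.

```python
from math import comb

def sign_test(sample, median0):
    """Two-sided sign test of H0: population median = median0."""
    above = sum(x > median0 for x in sample)
    below = sum(x < median0 for x in sample)   # ties with median0 are dropped
    n = above + below
    k = min(above, below)
    # P(X <= k) for X ~ Binomial(n, 0.5), doubled for a two-sided test.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return above, below, min(1.0, 2 * tail)

# Hypothetical passenger counts (thousands); 270 is the hypothesised median.
passengers = [229, 229, 241, 250, 258, 262, 269, 284, 305, 331]
above, below, p_value = sign_test(passengers, median0=270)
print(above, below, round(p_value, 4))
```

Because only the signs are used, the test makes no assumption about the shape of the distribution, at the cost of lower power than a t-test when the data really are normal.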
65
Example
Suppose that, normally, the average number of passengers flying on a local flight during school breaks is 270 thousand.
We might then be interested in checking whether this number (270) still holds in the current situation.
Mode = 229.00
Median = 265.50
Mean = 280.30
66
Example
Parametric test’s result:
Non-parametric test’s result:
67
Two Populations Comparison
Aim: to compare the central values of two different populations. (Need to consider whether both populations have homogeneous variances.)
Inferences about $\mu_1 - \mu_2$: independent samples, with three different cases:
o Both population distributions are normal with equal variances.
o Both sample sizes are large.
o The sample sizes are small and the population distributions are non-normal.
68
Two Populations Comparison
Two-tail test:
$H_0: \mu_1 = \mu_2$ vs. $H_1: \mu_1 \neq \mu_2$
One-tail tests:
$H_0: \mu_1 \le \mu_2$ vs. $H_1: \mu_1 > \mu_2$
or
$H_0: \mu_1 \ge \mu_2$ vs. $H_1: \mu_1 < \mu_2$
Parametric tests:
- Independent samples t-test with equal variances.
- Independent samples t-test with unequal variances.
Non-parametric tests:
- Mann-Whitney U test
- Wilcoxon Rank Sum test
69
Example
An experiment was conducted to evaluate the effectiveness of a treatment for tapeworm in the stomachs of sheep. A random sample of 24 worm-infected lambs of approximately the same age and health was randomly divided into two groups: drug-treated sheep and untreated sheep.
70
Example: initial data analysis
What is your expected result?
Drug treated
Untreated
71
Parametric test’s result
Non-parametric test’s result:
72
Two Populations Comparison
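Inferences about $\mu_1 - \mu_2$: paired data
Appropriate for studies in which a measurement in one sample is matched or paired with a particular measurement in the other sample.
Hypothesis
Two-tail test:
$H_0: \mu_1 - \mu_2 = D_0$ vs. $H_1: \mu_1 - \mu_2 \neq D_0$
One-tail tests:
$H_0: \mu_1 - \mu_2 \le D_0$ vs. $H_1: \mu_1 - \mu_2 > D_0$
or
$H_0: \mu_1 - \mu_2 \ge D_0$ vs. $H_1: \mu_1 - \mu_2 < D_0$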
73
Example
To compare the wearing qualities of two automobile tires, A and B, one tire of type A and one of type B are randomly assigned and mounted on the rear wheels of each of five automobiles. The automobiles are then operated for a specified number of miles, and the amount of wear is recorded for each tire.
Automobile | Tire A | Tire B
1 | 10.6 | 10.2
2 | 9.8 | 9.4
3 | 12.3 | 11.8
4 | 9.7 | 9.1
5 | 8.8 | 8.3
Mean | 10.24 | 9.76
Std. dev | 1.32 | 1.33
74
Example
Independent Samples Test (wear)
Levene's Test for Equality of Variances: F = .003, Sig. = .960
t-test for Equality of Means:
 | t | df | Sig. (2-tailed) | Mean Difference | Std. Error Difference | 95% CI of the Difference: Lower | Upper
Equal variances assumed | .574 | 8 | .582 | .4800 | .8362 | -1.4482 | 2.4082
Equal variances not assumed | .574 | 7.999 | .582 | .4800 | .8362 | -1.4483 | 2.4083
Paired Samples Test
Paired Differences:
 | Mean | Std. Deviation | Std. Error Mean | 95% CI of the Difference: Lower | Upper | t | df | Sig. (2-tailed)
Pair 1: wearA - wearB | .4800 | .0837 | .0374 | .3761 | .5839 | 12.829 | 4 | .000
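The two t statistics in the output above (.574 for independent samples, 12.829 for paired samples) can be reproduced from the tire wear data. This sketch computes only the statistics; the p-values would require the t distribution, which is not part of the Python standard library.

```python
from math import sqrt
from statistics import mean, stdev, variance

tire_a = [10.6, 9.8, 12.3, 9.7, 8.8]
tire_b = [10.2, 9.4, 11.8, 9.1, 8.3]
n = len(tire_a)

# Independent-samples t (equal variances): ignores the pairing by automobile.
pooled_var = (variance(tire_a) + variance(tire_b)) / 2   # equal group sizes
t_independent = (mean(tire_a) - mean(tire_b)) / sqrt(pooled_var * 2 / n)

# Paired t: works on the within-automobile differences.
diffs = [a - b for a, b in zip(tire_a, tire_b)]
t_paired = mean(diffs) / (stdev(diffs) / sqrt(n))

print(f"independent t = {t_independent:.3f} (df = {2 * n - 2})")
print(f"paired t      = {t_paired:.3f} (df = {n - 1})")
```

The mean difference is 0.48 in both analyses; pairing removes the large car-to-car variation from the standard error, which is why the same difference is highly significant in the paired test and not in the independent one.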
75
Multi-Populations Comparison
To check whether k populations share the same central tendency value.
76
Multi-Populations Comparison
A factory produces disc brakes for high-performance automobiles. The following table summarises the disc brake diameters produced by four machines. The target diameter for the brake is 322 mm.
Disc Brake Diameter (mm)
Machine Number | 1 | 2 | 3 | 4
Mean | 321.9985 | 322.0143 | 321.9983 | 321.9954
Std. Deviation | .0111568 | .0106913 | .0104812 | .0069883
77
Multi-Populations Comparison
Total variation
= variation within groups + variation between groups
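The decomposition can be verified numerically. The three groups below are hypothetical; the sums of squares follow the usual one-way ANOVA definitions, and the F statistic is the ratio of the between-group to the within-group mean square.

```python
from statistics import mean

# Hypothetical measurements for three groups.
groups = {
    "A": [5.1, 4.9, 5.4, 5.0],
    "B": [5.8, 6.1, 5.9, 6.2],
    "C": [5.2, 5.5, 5.1, 5.4],
}

all_values = [x for values in groups.values() for x in values]
grand_mean = mean(all_values)

ss_total = sum((x - grand_mean) ** 2 for x in all_values)
ss_within = sum((x - mean(v)) ** 2 for v in groups.values() for x in v)
ss_between = sum(len(v) * (mean(v) - grand_mean) ** 2 for v in groups.values())

k, n = len(groups), len(all_values)
f_stat = (ss_between / (k - 1)) / (ss_within / (n - k))

print(f"total = {ss_total:.4f}, within + between = {ss_within + ss_between:.4f}")
print(f"F = {f_stat:.2f} on ({k - 1}, {n - k}) degrees of freedom")
```

A large F means the group means differ by more than the within-group scatter would explain.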
78
Multi-Populations Comparison
Hypothesis testing:
Parametric test
o One-way ANOVA
Nonparametric test
o Kruskal-Wallis H
o Median test
$H_0: \mu_1 = \mu_2 = \dots = \mu_k$
$H_1$: at least two populations are different
79
Parametric test’s result:
Nonparametric test’s result:
80
Think!!
Job satisfaction was investigated in two different factories A and B. In factory A the employees are on a fixed shift system while in factory B the workers have a rotating shift system. In factory A, a worker always works the same shift, while in factory B, a worker rotates through the three shifts. A satisfaction score was collected from each employee and the aim is to identify difference in job satisfaction between the two groups of workers.
Q: What information is needed in order to determine the choice of test?
81
MEASUREMENT ADEQUACY
o Validity
• Does the instrument measure what it is supposed to?
o Reliability
• Does the instrument consistently measure what it is supposed to?
o Sensitivity
• How good is the instrument at detecting the smallest amount that it can measure?
82
Validity
• In general, there are two types: internal and external validity.
• Internal validity refers to the rigor with which the study was performed.
• Design of the study
• Measurements chosen
• Factors involved especially in a study of causal relationships
• External validity refers to the extent to which the results of a study are generalisable or transferable (authenticity).
83
Internal validity
Face validity
Content validity
Criterion-related validity
Predictive validity occurs when the criterion measures are obtained at a time after the test e.g. career tests.
Concurrent validity occurs when the criterion measures are obtained at the same time as the test scores e.g. level of depression.
Construct validity
Convergent
Discriminant
84
1. Face validity
It is the basic and minimal index of validity.
It is concerned with how a measure or procedure appears to, and is understood by, the respondents.
Does it seem well designed?
Does it seem as though it will work reliably?
Testing strategy: a questionnaire is given to a sample of respondents to judge their reaction to the items.
85
2. Criterion-Related Validity
Also known as instrumental validity.
It demonstrates the accuracy of a measure or procedure by comparing it with another measure or procedure which has been demonstrated to be valid.
Example: suppose we have a hands-on driving test that has been shown to be an accurate test of driving skills, and someone proposes a new written driving test. The written test can be validated using a criterion-related strategy in which the hands-on driving test is compared with the written test.
86
2. Criterion-Related Validity
Predictive validity
Indicates the ability of the measuring instrument to differentiate among individuals on a future criterion.
Example: employees ability test
Concurrent validity
Indicates the ability of the measuring instrument to differentiate among individuals who are known to be different (they should score differently on the instrument).
Example: work ethic among welfare recipients.
87
3. Construct validity
Construct validity testifies to the agreement between a theoretical concept and a specific measuring device or procedure.
Example: A doctor would like to test the effectiveness of painkillers on chronic back sufferers. Every day, he asks the test subjects to rate their pain level on a scale of one to ten. In this case, construct validity would test whether the doctor actually was measuring pain and not numbness, discomfort, anxiety or any other factor.
88
3. Construct validity
Convergent validity is the actual general agreement among ratings, gathered independently of one another, where the measures should be theoretically related.
The scores obtained by two different instruments measuring the same concept are highly correlated.
Discriminant validity is the lack of a relationship among measures which theoretically should not be related.
Two variables are predicted to be uncorrelated, and the scores obtained by measuring them are indeed empirically found to be so.
89
3. Construct validity
Strategy to achieve construct validity:
Literature review
Confirmatory factor analysis
Correlation analysis
Some multivariate analyses
90
4. Content validity
Content validity ensures measures include an adequate and representative set of items that tap the concept.
Example:
1. A researcher needing to measure an attitude like self-esteem must decide what constitutes a relevant domain of content for that attitude.
2. In socio-cultural studies, content validity forces the researchers to define the very domains they are attempting to study.
91
4. Content validity
Strategy to achieve content validity:
Existing literature
Qualitative research
Judgment of panel of experts
92
Reliability
Reliability is defined as the extent to which an instrument consistently measures what it is supposed to.
Classical test theory – reliability is the ratio of the true-score variance to the observed-score variance.
The true-score model: observed score = true score + error ($X = T + E$)
93
Approaches to estimate reliability
1. Equivalency reliability
2. Stability
o Test-retest reliability
o Parallel-form reliability
3. Internal consistency
o Inter-item consistency reliability
o Split-half reliability
4. Inter-rater reliability
94
1. Equivalency reliability
o Equivalency reliability is the extent to which two items measure identical concepts at an identical level of difficulty.
o Equivalency reliability is determined by relating two sets of test scores to one another to highlight the degree of relationship or association.
95
2. Stability
A set of measures is considered stable if it gives consistent results over time, despite uncontrollable testing conditions or the state of the respondents themselves.
Example: The method of maintaining weights used by the U.S. Bureau of Standards. Platinum objects of fixed weight (one kilogram, one pound, etc...) are kept locked away. Once a year they are taken out and weighed, allowing scales to be reset so they are "weighing" accurately. Keeping track of how much the scales are off from year to year establishes a stability reliability for these instruments. In this instance, the platinum weights themselves are assumed to have a perfectly fixed stability reliability.
96
2. Stability
Test-retest reliability is the correlation between two successive measurements with the same test.
Example:
you can give your test in the morning to your pilot sample and then again in the afternoon. The two sets of data should be highly correlated if the test is reliable.
97
2. Stability
Parallel-form reliability is assessed by administering two parallel forms of the same test in succession and correlating the scores.
Examples:
The SAT has two versions that measure Verbal and Math skills. Two forms for measuring Math should be highly correlated, and that would document reliability.
In an exam, two groups of students are given questions having similar items and the same response format, differing only in the wording and the ordering of questions.
98
3. Internal consistency
It indicates the homogeneity of the items in the measure that tap the construct.
Example: a questionnaire was designed to find out about college students' dissatisfaction with a particular textbook. The researcher then needs to analyse the internal consistency of the survey items dealing with dissatisfaction, which reveals the extent to which the items on the questionnaire focus on the notion of dissatisfaction.
99
3. Internal consistency
Inter-item consistency reliability tests the consistency of respondents’ answers to all items in a measure.
In other words, it ensures that the items are homogeneous or all measuring the same construct.
Statistical procedures like KR-20 (Kuder-Richardson) or Cronbach's Alpha are commonly used for these purposes.
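Cronbach's Alpha can be sketched from its usual definition, $\alpha = \frac{k}{k-1}\left(1 - \frac{\sum_i s_i^2}{s_T^2}\right)$, where $k$ is the number of items, $s_i^2$ the variance of item $i$ and $s_T^2$ the variance of the total score. The responses below are hypothetical and the function name is an assumption of this illustration.

```python
from statistics import variance

def cronbach_alpha(scores):
    """scores: one row per respondent, one column per item."""
    k = len(scores[0])
    item_vars = [variance(col) for col in zip(*scores)]      # variance of each item
    total_var = variance([sum(row) for row in scores])       # variance of total scores
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# Hypothetical 5-point Likert responses: 6 respondents x 4 items.
responses = [
    [4, 5, 4, 4],
    [2, 2, 3, 2],
    [5, 5, 4, 5],
    [3, 3, 3, 4],
    [1, 2, 2, 1],
    [4, 4, 5, 4],
]
alpha = cronbach_alpha(responses)
print(f"Cronbach's alpha = {alpha:.3f}")
```

Alpha approaches 1 when respondents who score high on one item also score high on the others, i.e. when the items move together.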
100
3. Internal consistency
Split-half reflects the correlation between two halves of an instrument.
Example: you have the SAT Math test and divide its items into two parts. If you correlate the scores on the first half of the items with the scores on the second half, the two halves should be highly correlated if the test is reliable.
101
4. Inter-rater reliability
Inter-rater reliability reflects the consistency of the judgment of several raters on how they interpret the responses. In other words, it is the extent to which two or more individuals (coders or raters) agree.
Scenario: Two or more researchers are observing a high school classroom. The class is discussing a movie that they have just viewed as a group. The researchers have a sliding rating scale (1 being most positive, 5 being most negative) with which they are rating the students' oral responses. Inter-rater reliability assesses the consistency of how the rating system is implemented.
102
Power of a statistical test
It is the probability of rejecting the null hypothesis when the null hypothesis is false.
Power also represents the sensitivity of the undertaken analysis.
Factors influencing power: (i) the statistical significance criterion (alpha value), (ii) the magnitude of the effect under the alternative hypothesis (effect size) and (iii) the sample size.
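The three factors can be seen at work in a small sketch. It assumes the simplest possible setting, a one-sided, one-sample z-test with known standard deviation, for which power has the closed form $1 - \Phi(z_{1-\alpha} - d\sqrt{n})$ with $d$ the standardised effect size; other tests need other formulas.

```python
from math import sqrt
from statistics import NormalDist

def z_test_power(effect_size, n, alpha=0.05):
    """Power of a one-sided, one-sample z-test.

    effect_size is the standardised shift (mu1 - mu0) / sigma under H1.
    """
    z_crit = NormalDist().inv_cdf(1 - alpha)
    return 1 - NormalDist().cdf(z_crit - effect_size * sqrt(n))

for n in (10, 30, 100):
    print(n, round(z_test_power(0.5, n), 3))
```

Power rises with the sample size, with the effect size and with a more lenient alpha; when the effect size is zero the "power" is simply alpha.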
103
Complex analysis: choosing the right statistical tool
Number of variables: 1 variable | 2 variables | More than 2 variables
Homogeneous sample?
104
Bivariate analysis
Bivariate analysis studies two variables simultaneously.
Common studies
• Correlation – measuring the relationship between two continuous variables.
• Cross tabulation – measuring the relationship between two categorical (or binary) variables.
• Simple modelling – finding the curve (e.g. a straight line) that best explains how one variable (the independent variable) influences the other (the dependent variable).
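Correlation and simple modelling can be sketched together. The data are hypothetical, and the two helper functions are written out by hand (Pearson's r and an ordinary least-squares straight line) so that the formulas are visible.

```python
from math import sqrt
from statistics import mean

def pearson_r(x, y):
    """Pearson correlation coefficient between two continuous variables."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def fit_line(x, y):
    """Least-squares straight line y = intercept + slope * x."""
    mx, my = mean(x), mean(y)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    return my - slope * mx, slope

# Hypothetical data: independent variable x, dependent variable y.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

r = pearson_r(x, y)
intercept, slope = fit_line(x, y)
print(f"r = {r:.3f}, fitted line: y = {intercept:.2f} + {slope:.2f} x")
```

The correlation is symmetric in x and y, whereas the fitted line treats x as the explanatory variable, which is the distinction the last bullet draws.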
105
MULTIVARIATE ANALYSIS
Multivariate data arise when more than one variable or measurement is made on each object.
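Data arrangement ($n$ objects in rows, $p$ variables in columns)
$$
X = \begin{pmatrix}
x_{11} & x_{12} & \cdots & x_{1p} \\
x_{21} & x_{22} & \cdots & x_{2p} \\
\vdots & \vdots & & \vdots \\
x_{n1} & x_{n2} & \cdots & x_{np}
\end{pmatrix}
$$
Type of studies:
o Descriptive multivariate studies
o Inferential studies
o Modelling and prediction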
106
Multivariate analysis
Interdependence methods
Involve only either independent variables or dependent variables.
Aim: to seek patterns or any hidden information.
Methods: principal component analysis, factor analysis, multidimensional scaling, cluster analysis, projection pursuit etc.
Dependence methods
Both independent variable(s) and dependent variable(s) are measured.
Methods: multiple regression, discriminant analysis, MANOVA, canonical analysis, SEM etc.
107
~: The End :~