Polit ln ch02

Page 1: Polit ln ch02

Copyright ©2010 by Pearson Education, Inc. Upper Saddle River, New Jersey 07458

All rights reserved.

Statistics and Data Analysis for Nursing Research, Second Edition
Denise F. Polit

CHAPTER 2

Frequency Distributions: Tabulating and Displaying Data

Page 2: Polit ln ch02

WHAT THE HECK ARE ALL THOSE NUMBERS???

Page 3: Polit ln ch02

Frequency Distributions

• That’s what a frequency distribution is for—to help impose order on the data

• A frequency distribution is a systematic arrangement of data values, with a count of how many times each value occurred in a dataset

Page 4: Polit ln ch02

Constructing a Frequency Distribution

• List each data value in a sequence (usually, ascending order) 1, 2, 3, 4, 5…

• Tally each occurrence of the value

• Total the frequencies for each value (f)

• The sum of fs for all data values must equal the sample size:

Σf = N
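
The steps above can be sketched in Python. This is only a minimal illustration: the function name `frequency_distribution` and the list of scores are hypothetical and do not come from the text.

```python
from collections import Counter

def frequency_distribution(values):
    """Return (value, f) pairs in ascending order of value."""
    counts = Counter(values)       # tally each occurrence of each value
    return sorted(counts.items())  # list the values in ascending order

# Hypothetical scores, for illustration only
scores = [3, 1, 2, 3, 5, 3, 4, 2, 3, 4]
table = frequency_distribution(scores)
for value, f in table:
    print(value, f)

# The frequencies must sum to the sample size (sum of f equals N)
assert sum(f for _, f in table) == len(scores)
```

The closing check is the Σf = N rule from the slide; if it fails, a value was missed or counted twice during tallying.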

Page 5: Polit ln ch02

Elements of a Typical Frequency Distribution

• Data values

• Absolute frequencies (counts)

• Relative frequencies (percentages)

• Cumulative relative frequencies (the percentage for a given score value, combined with percentages for all preceding values)

Page 6: Polit ln ch02

Example of a Frequency Distribution

Data Value   Frequency (f)   Percentage (%)   Cumulative Percentage
1            10              10.0             10.0
2            20              20.0             30.0
3            40              40.0             70.0
4            15              15.0             85.0
5            15              15.0             100.0
TOTAL        100             100.0
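
As a rough sketch, the percentage and cumulative percentage columns of this table can be derived from the frequencies alone. The function name `add_percentages` is hypothetical; the frequencies are the ones shown in the table.

```python
def add_percentages(freqs):
    """freqs: list of (value, f) pairs in ascending order of value.
    Returns rows of (value, f, percent, cumulative percent)."""
    n = sum(f for _, f in freqs)  # sample size N
    rows = []
    running = 0                   # running total of frequencies so far
    for value, f in freqs:
        running += f
        rows.append((value, f,
                     round(100 * f / n, 1),
                     round(100 * running / n, 1)))
    return rows

# Frequencies from the example table
freqs = [(1, 10), (2, 20), (3, 40), (4, 15), (5, 15)]
rows = add_percentages(freqs)
for row in rows:
    print(*row)
```

Each cumulative percentage is the percentage for that value plus the percentages for all preceding values, so the last row always reaches 100.0.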

Page 7: Polit ln ch02

Frequency Distributions and Measurement Levels

• Frequency distributions can be constructed for variables measured at any level of measurement

• BUT…for categorical (nominal-level) variables, cumulative frequencies do not make sense

• Also…

Page 8: Polit ln ch02

Frequency Distributions for Variables with Many Values

• When a variable has many possible values, a regular frequency distribution may be unwieldy

– For example, weight values (here, in pounds)

Weight   f
98       1
99       1
100      1
101      0
102      2
103      1
104      0
105      2
106      1
Etc. to 285 lb

Page 9: Polit ln ch02

Grouped Frequency Distributions

• In a grouped frequency distribution, contiguous values are grouped into sets (class intervals)

• Depending on purpose, 4 to 10 intervals work well

• Use intervals that are psychologically appealing (e.g., in groups of 2, 5, 10, 50, etc.)

Page 10: Polit ln ch02

Grouped Frequency Distributions—Example

• This grouping communicates information more conveniently than individual weights

Weight Interval   f
75 - 100          6
101 - 125         15
126 - 150         33
151 - 175         26
176 - 200         24
201 - 225         14
226 - 250         9
251 - 275         6
276 - 300         2
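
A grouping like this could be produced along the following lines. This is a sketch under stated assumptions: `grouped_frequencies` is a hypothetical helper, only the first three class intervals from the table are used, and the weights are invented for illustration.

```python
def grouped_frequencies(values, intervals):
    """Count how many values fall in each (low, high) class interval.
    Both interval limits are treated as inclusive."""
    counts = {interval: 0 for interval in intervals}
    for v in values:
        for low, high in intervals:
            if low <= v <= high:
                counts[(low, high)] += 1
                break  # intervals do not overlap, so stop at the first match
    return counts

# First three class intervals from the table above
intervals = [(75, 100), (101, 125), (126, 150)]
# Hypothetical weights in pounds, for illustration only
weights = [98, 99, 100, 102, 102, 103, 105, 105, 106, 130, 149]
grouped = grouped_frequencies(weights, intervals)
for (low, high), f in grouped.items():
    print(f"{low} - {high}: {f}")
```

Eleven individual weights collapse into three rows, which is the convenience the slide describes.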

Page 11: Polit ln ch02

Reporting Frequency Information

• Can be reported narratively in text (e.g., “83% of study participants were male”)

• In a frequency distribution table (multiple variables often presented in a single table)

• In a graph: Different graphs used for different types of data

Page 12: Polit ln ch02

Bar Graphs

• Bar graphs: Used for nominal (and many ordinal) level variables

• Bar graphs have a horizontal dimension (X axis) that specifies categories (i.e., data values)

• The vertical dimension (Y axis) specifies either frequencies or percentages

• Bars for each category are drawn to the height that indicates the frequency or percentage

Page 13: Polit ln ch02

Bar Graphs

• Example of a bar graph

• Note the bars do not touch each other

Page 14: Polit ln ch02

Pie Chart

• Pie Charts: Also used for nominal (and many ordinal) level variables

• Circle is divided into pie-shaped wedges corresponding to percentages for a given category or data value

• All pieces add up to 100%

• Place wedges in order, with biggest wedge starting at “12 o’clock”

Page 15: Polit ln ch02

Pie Chart

• Example of a pie chart, for same marital status data

Page 16: Polit ln ch02

Histograms

• Histograms: Used for interval- and ratio-level data

• Similar to a bar graph, with an X and Y axis—but adjacent values are on a continuum so bars touch one another

• Data values on X axis are arranged from lowest to highest

• Bars are drawn to the height that shows the frequency or percentage (Y axis)

Page 17: Polit ln ch02

Histograms (cont’d)

• Example of a histogram: Heart rate data

[Figure: histogram of heart rate data. X axis: heart rate in bpm, 55 to 74; Y axis: frequency (f), 0 to 12]

Page 18: Polit ln ch02

Frequency Polygons

• Frequency polygons: Also used for interval- and ratio-level data

• Similar to histograms, but instead of bars, a dot is used above score values to designate frequency/percentage

• Better than histograms for showing shape of distribution of scores, and is usually preferred if variable is continuous

Page 19: Polit ln ch02

Frequency Polygons (cont’d)

• Example of a frequency polygon (created in SPSS)

• Note that the line is brought down to zero for the score below lowest data point (54) and above highest data point (75)

[Figure: Frequency Polygon, Heart Rate. X axis: Heart Rate in bpm, 54 to 74; Y axis: Frequency, 0 to 12]

Page 20: Polit ln ch02

Shapes of Distributions

• Distributions of data values can be described in terms of:

– Modality

– Symmetry

– Kurtosis

Page 21: Polit ln ch02

Modality

• Modality concerns how many peaks (values with high frequencies) there are

• Unimodal = 1 peak

• Bimodal = 2 peaks

• Multimodal = multiple peaks

Page 22: Polit ln ch02

Modality Examples

• Unimodal:

• Bimodal:

Page 23: Polit ln ch02

Symmetry

• Symmetric Distribution: the two halves of the distribution, folded over in the middle, are identical

Page 24: Polit ln ch02

Symmetry (cont’d)

• Asymmetric (Skewed) Distribution: Peaks are “off center” and there is a tail trailing off for data values with low frequency

– Positive skew: Longer tail trails off to right (fewer people with high values, like for income)

– Negative skew: Longer tail trails off to left (fewer people with low values, like age at death)

Page 25: Polit ln ch02

Direction of Skew

• Examples of distributions with different skews:

Page 26: Polit ln ch02

Skewness Index

• Indexes have been developed to quantify degree of skewness

• One skewness index (e.g., in SPSS) has:

– Negative values, for a negative skew

– 0, for no skew

– Positive values, for a positive skew

• If the absolute value of the skewness index is less than twice its standard error (to be explained later in the book), the distribution can be treated as not skewed

Page 27: Polit ln ch02

Skewness Index Examples

• Skewness index = 0.80

• Standard error = 0.33

• Positive skew

• Skewness index = -0.72

• Standard error = 0.34

• Negative skew
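
The rule of thumb from the previous slide can be checked against these two examples. The sketch below is hypothetical (the function name `is_skewed` is not from the text) and assumes the comparison is made on the absolute value of the index, so that negative skew is handled the same way as positive skew.

```python
def is_skewed(skewness_index, standard_error):
    """Rule of thumb: treat the distribution as skewed unless the index,
    in absolute value, is less than twice its standard error.
    Using the absolute value is an assumption made for this sketch."""
    return abs(skewness_index) >= 2 * standard_error

print(is_skewed(0.80, 0.33))   # compares 0.80 with 2 * 0.33 = 0.66
print(is_skewed(-0.72, 0.34))  # compares 0.72 with 2 * 0.34 = 0.68
```

In both examples the index exceeds twice its standard error, which is consistent with the slide labeling them as positively and negatively skewed.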

Page 28: Polit ln ch02

Kurtosis

• Kurtosis: Degree of pointedness or flatness of the distribution’s peak

• Leptokurtic: Very thin, sharp peak

• Platykurtic: Flat peak

• Mesokurtic: Neither pointy nor flat

• Like skewness, there is an index of kurtosis

– Positive values: Greater peakedness

– Negative values: Greater flatness

Page 29: Polit ln ch02

Kurtosis Examples

• Leptokurtic (+ index)

• Platykurtic (– index)

Page 30: Polit ln ch02

Normal Distribution

• A normal distribution (aka normal curve, bell-shaped curve, Gaussian distribution) is:

– Unimodal

– Symmetric

– Neither peaked nor flat

• Plays an important role in inferential statistics

Page 31: Polit ln ch02

Normal Distribution (cont’d)

• Some human characteristics are normally distributed (approximately), like height

1 short person, 3 medium persons, 1 tall person

Page 32: Polit ln ch02

Uses of Frequency Distributions in Data Analysis

• First step in understanding your data!

– Begin by looking at the frequency distributions for all or most variables, to “get a feel” for the data

– Through inspection of frequency distributions, you can begin to assess how “clean” the data are

Page 33: Polit ln ch02

Data Cleaning

• One aspect of data cleaning involves seeing whether the frequency distribution contains:

– Outliers: Values that lie outside the normal range of values, and that may or may not be legitimate

– Wild codes: Impossible or invalid codes, like a code of “3” for the variable sex when valid codes are 1 (female) and 2 (male)

Page 34: Polit ln ch02

Wild Codes

Codes for Sex   Frequency   Percent
1 (Female)      49          49.0%
2 (Male)        47          47.0%
3               1           1.0%
7               2           2.0%
Total           100         100.0%

The codes 3 and 7 are WILD!
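
A check for wild codes might look like the following sketch. The helper `find_wild_codes` and the short list of raw codes are hypothetical; the valid codes 1 (female) and 2 (male) are the ones given in the text.

```python
from collections import Counter

def find_wild_codes(codes, valid_codes):
    """Return {code: frequency} for every code that is not a valid code."""
    counts = Counter(codes)
    return {code: f for code, f in sorted(counts.items())
            if code not in valid_codes}

# Hypothetical raw data for the variable sex: 1 = female, 2 = male
sex_codes = [1, 2, 2, 1, 3, 1, 7, 2, 7, 1]
wild = find_wild_codes(sex_codes, valid_codes={1, 2})
print(wild)
```

Any code the function returns needs to be traced back to the source records and corrected or set to missing before analysis.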

Page 35: Polit ln ch02

Missing Values

• Frequency distributions can help you assess the pervasiveness of a thorny problem in data analysis:

– Missing data

Wanted: Missing Number!

Description: Data Values in Important Study

Last seen: Date of Enrollment

Missing from: My Dataset

If Found: Contact Me!

Page 36: Polit ln ch02

Inspection for Missing Values

Sex           Frequency   Percent   Valid %
1 (Female)    46          46.0      51.7
2 (Male)      43          43.0      48.3
7 (Refused)   11          11.0
Total         100         100.0     100.0

11.0% of the data are missing because participants refused to report their sex
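
The Valid % column is computed with the missing cases left out of the denominator. A minimal sketch, using the frequencies from the table and a hypothetical helper named `valid_percentages`:

```python
def valid_percentages(freqs, missing_codes):
    """freqs: {code: f}. Returns {code: valid percent}, where the
    denominator excludes cases with a missing-value code."""
    n_valid = sum(f for code, f in freqs.items()
                  if code not in missing_codes)
    return {code: round(100 * f / n_valid, 1)
            for code, f in freqs.items()
            if code not in missing_codes}

# Frequencies from the table above; code 7 = refused (treated as missing)
freqs = {1: 46, 2: 43, 7: 11}
valid = valid_percentages(freqs, missing_codes={7})
print(valid)
```

With 11 refusals removed, the denominator is 89 rather than 100, which is why 46 females are 46.0% of all cases but 51.7% of valid cases.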

Page 37: Polit ln ch02

Assumptions

• Frequency distributions can help you assess validity of certain assumptions for many statistical tests

– An assumption is a condition that is presumed to be true and that, when violated, can lead to invalid results

– For many inferential statistics, a normal distribution (for the dependent variable) is assumed

Page 38: Polit ln ch02

Describe Sample

• Frequency distributions can help you better understand the type of people who are in your study sample:

– What percent are men?

– What percent are African American?

– What percent have a college degree?

Page 39: Polit ln ch02

Answer Descriptive Questions

• Frequency distributions can sometimes be used to answer descriptive research questions

• BUT…inferential statistics are almost always needed, because they allow you to draw inferences about a broader group than the study sample

Page 40: Polit ln ch02

Frequency Distributions in SPSS

• Use the Analyze → Descriptive Statistics → Frequencies command

• Click “Analyze” in the top toolbar menu, which brings up a pop-up menu; select Descriptive Statistics, then Frequencies

Page 41: Polit ln ch02

Frequencies Command in SPSS

• All variables in dataset are listed in box on left

• Use arrow to move desired variable into slot marked “Variable(s)”

• Pushbuttons provide various options

Page 42: Polit ln ch02

Frequencies: Statistics Options in SPSS

• Many available options within Frequencies: Statistics

• Here we see that we can select statistics for skewness and kurtosis

Page 43: Polit ln ch02

Frequencies: Chart Options in SPSS

• The Charts option allows you to create bar charts, pie charts, and histograms

• Normal curve superimposed: An option for Histograms

• Chart values can be Frequencies or Percentages (not available for Histograms)

Page 44: Polit ln ch02

Graphs in SPSS

• An even wider array of graphs can be created using the Graphs menu on the main toolbar