EMJ SERIES ON METHODS AND STATISTICS Part III: Presenting and Summarizing Data Using Summary Statistics

Sileshi Lulseged, Sanni Ali, Girmay Medhin


Research data are collected on individual units of observation and can only be interpreted meaningfully once they are analyzed and summarized using descriptive statistics. This summarization, often done as an initial step in the analysis of a data set, provides simple summaries of the sample and of the observations that have been made. The summaries may take the form of summary statistics, tables or graphs. They may form the basis of the initial description of the data as part of a more extensive statistical analysis, or they may be sufficient in themselves for a particular investigation. The measures most commonly used to summarize or describe a data set are measures of central tendency (location) and measures of dispersion (variability). Measures of central tendency include the mean, median and mode, while measures of dispersion include the standard deviation (or variance), the range and the interquartile range. The mean is the most informative measure when the data are symmetrical, and the median is the preferred measure of central tendency when the data are skewed (non-symmetrical). Measures of dispersion are usually reported in conjunction with a measure of central tendency. A thorough understanding of these descriptive measures is an important first step in data analysis: it supports informed selection of inferential analysis methods, critical appraisal of the literature and the write-up of scientific articles.
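The measures named above can be illustrated with a short Python sketch. It is not taken from the article: the function name `summarize`, the two example data sets and the choice of the "inclusive" quartile method are assumptions made purely for illustration, and other quartile conventions would give slightly different interquartile ranges.

```python
import statistics


def summarize(values):
    """Return common measures of central tendency and dispersion.

    Hypothetical helper for illustration; uses the sample (n - 1)
    standard deviation and variance.
    """
    data = sorted(values)
    # Quartiles via the "inclusive" method (an assumed convention).
    q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return {
        "mean": statistics.mean(data),
        "median": statistics.median(data),
        "mode": statistics.mode(data),
        "sd": statistics.stdev(data),
        "variance": statistics.variance(data),
        "range": data[-1] - data[0],
        "iqr": q3 - q1,
    }


# Made-up values: a roughly symmetrical sample, and the same sample
# with its largest value replaced by an extreme one (right-skewed).
symmetric = [2, 4, 4, 5, 6, 6, 6, 7, 8, 8, 10]
skewed = [2, 4, 4, 5, 6, 6, 6, 7, 8, 8, 100]

for name, sample in (("symmetric", symmetric), ("skewed", skewed)):
    print(name, summarize(sample))
```

In the symmetrical sample the mean and median coincide. In the skewed sample the single extreme value pulls the mean, standard deviation and range upward, while the median and interquartile range are unchanged, which is why the median is preferred for skewed data.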






