Summarizing Data
Descriptive analysis condenses large datasets into a few numbers and visualizations, providing a quick overview of an unknown dataset.
Descriptive analysis, also known as data summarization, focuses solely on the presentation of aggregated data. It doesn't involve interpretation or hypothesis formulation, which fall under exploratory analysis. A simple yet extreme example of data summarization is counting: How many customers are in the database? How many products do we have? How many measurements were recorded? Each of these questions condenses a vast dataset into a single number (see also the section on typical questions of exploratory analysis).
A more nuanced descriptive analysis might categorize customers by ZIP code, products by category, or measurement points by hour. Here, the condensation results in a number per group (ZIP code, product category, hour) rather than a single figure.
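The difference between a single count and a count per group can be sketched in a few lines of Python. This is only an illustration: the customer records, the field names `name` and `zip`, and the ZIP codes themselves are invented for the example.

```python
from collections import Counter

# Hypothetical customer records; names and ZIP codes are made up for illustration.
customers = [
    {"name": "Ada", "zip": "10115"},
    {"name": "Ben", "zip": "10115"},
    {"name": "Cleo", "zip": "20095"},
    {"name": "Dan", "zip": "80331"},
    {"name": "Eve", "zip": "20095"},
    {"name": "Finn", "zip": "10115"},
]

# Condense the whole dataset into a single number.
total_customers = len(customers)

# Condense the dataset into one number per group (here: per ZIP code).
customers_per_zip = Counter(customer["zip"] for customer in customers)

print(total_customers)
print(dict(customers_per_zip))
```

Grouping by product category or by hour would follow the same pattern, with only the grouping key changed.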
Summarizing isn't limited to counting; other calculations work too. For instance, yearly revenue is obtained by summing, while average revenue per order uses the arithmetic mean. Identifying the smallest or largest value also condenses data. Other common measures in descriptive analysis include the median, other percentiles, the mode (most frequent value), and dispersion measures such as the standard deviation or the range (largest minus smallest value).
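| Type | Measure |
| --- | --- |
| Location parameter | Mean |
| Location parameter | Median |
| Location parameter | Mode |
| Dispersion parameter | Minimum |
| Dispersion parameter | Maximum |
| Dispersion parameter | Range |
| Dispersion parameter | Variance |
| Dispersion parameter | Standard deviation |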
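The measures listed above can be computed with Python's standard `statistics` module. The following is a minimal sketch, assuming a small list of hypothetical order revenues (the values are invented). It uses the population variants of variance and standard deviation, which is itself an assumption: if the data is only a sample of a larger dataset, `statistics.variance` and `statistics.stdev` would be the usual choice.

```python
import statistics

# Hypothetical order revenues; the values are invented for illustration.
order_revenues = [20.0, 35.0, 35.0, 50.0, 60.0]

summary = {
    # Location parameters
    "sum": sum(order_revenues),
    "mean": statistics.mean(order_revenues),
    "median": statistics.median(order_revenues),
    "mode": statistics.mode(order_revenues),
    # Dispersion parameters
    "minimum": min(order_revenues),
    "maximum": max(order_revenues),
    "range": max(order_revenues) - min(order_revenues),
    # Population variants: the list is treated as the complete dataset.
    "variance": statistics.pvariance(order_revenues),
    "standard deviation": statistics.pstdev(order_revenues),
}

# Other percentiles: the three cut points that split the data into quartiles.
quartiles = statistics.quantiles(order_revenues, n=4)

for measure, value in summary.items():
    print(f"{measure}: {value}")
print(f"quartiles: {quartiles}")
```

Each entry condenses the same five values into one number, so the choice of measure decides which aspect of the data survives the summary.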