Rug plot

Definition

Visualization of the distribution of a single quantitative variable using marks along an axis

Also known as

Rug chart, marginal rug, one-dimensional scatter plot

Summary

A rug plot displays the distribution of a single quantitative variable by placing small tick marks or lines along an axis, where each mark represents an individual data point. This visualization technique provides a direct view of the actual data values and their density along a single dimension. Rug plots are often used as supplementary elements alongside other visualizations to show the underlying data distribution.

Anatomy

  • Axis: A single horizontal or vertical line representing the scale of the quantitative variable
  • Marks (ticks): Short perpendicular lines positioned at the exact value of each data point
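
The two parts above can be described as plain geometry. The following is a minimal sketch, assuming a horizontal axis at y = 0. The function name `rug_segments` and the default `tick_height` are hypothetical choices for illustration and do not come from any plotting library.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Segment = Tuple[Point, Point]

def rug_segments(values: List[float], tick_height: float = 0.05) -> List[Segment]:
    """Return one short vertical segment per observation for a horizontal rug.

    tick_height is an illustrative default, expressed in the same units as y.
    """
    # Each mark starts on the axis (y = 0) and rises perpendicular to it,
    # positioned at the exact value of its data point.
    return [((x, 0.0), (x, tick_height)) for x in values]

segments = rug_segments([1.2, 1.3, 4.8])
```

Each returned segment could then be passed to whatever line-drawing routine a plotting tool provides.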

Interpreting a rug plot

Each mark in a rug plot corresponds to one observation in the dataset. Dense clusters of marks indicate regions where many data points have similar values, while sparse areas show ranges with few observations. Gaps between marks reveal intervals in which no observations fall. By examining the overall pattern of marks, you can identify the range, the approximate center, and potential outliers of the distribution. Heavy overlap in a region signals a concentration of values, but the individual points there become difficult to distinguish, so the exact count is not readable from the marks alone.
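
To make clusters, gaps, and overlap concrete, here is a rough text-only sketch that renders a rug as a single line of characters. The function `text_rug`, its parameters, and the sample data are all hypothetical and exist only for illustration.

```python
def text_rug(values, lo, hi, width=40):
    """Render a one-line rug: '|' where at least one value falls, ' ' elsewhere."""
    cells = [" "] * width
    for v in values:
        if lo <= v <= hi:
            # Map the value to a column index in [0, width - 1].
            col = min(int((v - lo) / (hi - lo) * width), width - 1)
            cells[col] = "|"
    return "".join(cells)

# Made-up sample: a cluster near 1, a pair near 5, and an isolated point near 10.
data = [1.0, 1.1, 1.2, 1.3, 5.0, 5.1, 9.8]
rug = text_rug(data, lo=0.0, hi=10.0, width=40)
print(rug)
```

With these inputs the seven observations collapse into four visible marks, because values that map to the same column overlap. The long blank runs between marks correspond to intervals with no observations.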

When and how to use a rug plot

Strengths

  • Shows actual data points rather than summarized representations
  • Reveals gaps, outliers, and multimodal distributions that aggregated charts might obscure
  • Requires minimal space, making it suitable as a marginal addition to other plots
  • Preserves the exact location of each observation without binning or smoothing
  • Effective for small to medium-sized datasets where individual points matter

Caveats and limitations

  • Becomes cluttered and unreadable with large datasets due to overplotting
  • Provides limited information about the precise frequency of values when marks overlap
  • Difficult to compare multiple distributions side-by-side
  • May require jittering or transparency adjustments to handle overlapping points
  • Does not scale well beyond a few hundred observations without modifications

Use cases

  • As a marginal plot alongside scatter plots, box plots, or density curves to show raw data
  • Exploratory data analysis to quickly assess distribution characteristics
  • Comparing a theoretical distribution to observed data points
  • Displaying the distribution of residuals in regression diagnostics

Recommendations

Position rug plots along the margins of primary visualizations rather than as standalone charts. For datasets with more than a few hundred points, consider using transparency (alpha blending) or a small random jitter along the axis to separate tied or near-identical values and reduce overplotting. When working with very large datasets, combine rug plots with density estimates or use sampling techniques to show a representative subset. For comparing multiple distributions, consider alternatives like violin plots or ridge plots that incorporate rug-like elements more effectively. Ensure adequate contrast between the rug marks and background to maintain readability.
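
The sampling and transparency suggestions can be sketched as follows. This is only an illustration under stated assumptions: the helper names `thin_for_rug` and `mark_alpha`, the cap of 300 marks, and the `50 / n` opacity heuristic are hypothetical choices, not established rules.

```python
import random

def thin_for_rug(values, max_marks=300, seed=0):
    """Return at most max_marks values, drawn uniformly without replacement."""
    values = list(values)
    if len(values) <= max_marks:
        return values
    # A fixed seed keeps the displayed subset reproducible between runs.
    rng = random.Random(seed)
    return rng.sample(values, max_marks)

def mark_alpha(n_marks, floor=0.05):
    """Assumed heuristic: make marks more transparent as their number grows."""
    return max(floor, min(1.0, 50.0 / max(n_marks, 1)))

# Made-up data: 10,000 evenly spaced values, far too many for a readable rug.
big = [i / 10 for i in range(10_000)]
subset = thin_for_rug(big)
alpha = mark_alpha(len(subset))
```

The subset and opacity would then be handed to the plotting tool in use. A uniform random sample keeps the overall shape of the distribution on average but can drop rare outliers, so it may be worth drawing extreme values separately.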

Links

Wikidata entity: Q56291746 (Rug plot)

Wikipedia page: Rug plot