

histogram(; bins=automatic, datalimits=automatic, closed=:left, normalization=:none)

Compute a histogram.

The attribute bins can be an Integer, an AbstractVector (in particular, a range), or a Tuple of either integers or abstract vectors (useful for 2- or 3-dimensional histograms). When bins is an Integer, it denotes the approximate number of equal-width intervals used to compute the histogram. In that case, the range covered by the intervals is defined by datalimits (it defaults to the extrema of the whole data). The keyword argument datalimits can be a tuple of two values, e.g. datalimits=(0, 10), or a function to be applied group by group, e.g. datalimits=extrema. When bins is an AbstractVector, it denotes the intervals directly.
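To make the interaction between an integer bins and datalimits concrete, here is a simplified sketch in plain Julia. It is not the library's internal algorithm: since an integer bins is only an approximate count, the edges actually chosen may differ. The data and the variable names (`nbins`, `edges_auto`, `edges_fixed`) are hypothetical.

```julia
# Simplified illustration: with an integer number of bins, equal-width
# edges span the interval supplied by the data limits.
x = [0.3, 1.2, 2.7, 3.1, 4.8, 5.5, 7.9, 9.4]   # hypothetical data

nbins = 5

# Plays the role of datalimits=extrema: the limits come from the data.
lo_auto, hi_auto = extrema(x)
edges_auto = range(lo_auto, hi_auto, length=nbins + 1)

# Plays the role of datalimits=(0, 10): the limits are fixed by the user.
lo_fixed, hi_fixed = (0, 10)
edges_fixed = range(lo_fixed, hi_fixed, length=nbins + 1)

println(collect(edges_auto))
println(collect(edges_fixed))
```

Passing an AbstractVector as bins skips this step entirely, because the vector already lists the edges.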

closed determines whether the intervals are closed to the left or to the right.
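The difference only matters for values that fall exactly on a bin edge. The following plain-Julia sketch counts hypothetical data both ways. The helper names (`left_bin`, `right_bin`, `count_bins`) are illustrative, and the treatment of values on the outermost edges is an assumption of this sketch rather than a statement about the library.

```julia
edges = [0, 1, 2, 3]
x = [0, 1, 1, 2, 3]   # hypothetical data sitting on bin boundaries

# Left-closed bins [a, b): index of the last edge that is <= the value.
left_bin(v) = searchsortedlast(edges, v)

# Right-closed bins (a, b]: index of the first edge that is >= the value, minus one.
right_bin(v) = searchsortedfirst(edges, v) - 1

nb = length(edges) - 1

# Count the values assigned to each of the nb bins; indices outside 1:nb are dropped.
count_bins(f) = [count(v -> f(v) == i, x) for i in 1:nb]

left_counts = count_bins(left_bin)
right_counts = count_bins(right_bin)

println(left_counts)
println(right_counts)
```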

The histogram can be normalized by setting normalization. Possible values are:

  • :pdf: Normalize by sum of weights and bin sizes. Resulting histogram has norm 1 and represents a PDF.
  • :density: Normalize by bin sizes only. Resulting histogram represents count density of input and does not have norm 1.
  • :probability: Normalize by sum of weights only. Resulting histogram represents the fraction of probability mass for each bin and does not have norm 1.
  • :none: Do not normalize.
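The arithmetic behind these options can be sketched in plain Julia for a single group. The bin edges and counts below are made-up values, and every observation is assumed to have weight 1, so the sum of weights equals the total count.

```julia
# Hypothetical bin edges and raw counts for one group.
edges = [0.0, 1.0, 2.0, 4.0]
counts = [2.0, 6.0, 2.0]

widths = diff(edges)    # bin sizes
total = sum(counts)     # sum of weights (unit weights assumed)

pdf = counts ./ (total .* widths)    # :pdf
density = counts ./ widths           # :density
probability = counts ./ total        # :probability

# :pdf integrates to 1 over the bins, while :probability sums to 1.
println(sum(pdf .* widths))
println(sum(probability))
```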

Weighted data is supported via the keyword weights (passed to mapping).


Normalizations are computed within groups. For example, in the case of normalization=:pdf, the sum of weights within each group will be equal to 1.

using AlgebraOfGraphics, CairoMakie

df = (x=randn(5000), y=randn(5000), z=rand(["a", "b", "c"], 5000))
specs = data(df) * mapping(:x, layout=:z) * histogram(bins=range(-2, 2, length=15))
specs = data(df) * mapping(:x, dodge=:z, color=:z) * histogram(bins=range(-2, 2, length=15))
specs = data(df) * mapping(:x, stack=:z, color=:z) * histogram(bins=range(-2, 2, length=15))
specs = data(df) *
    mapping((:x, :z) => ((x, z) -> x + 5 * (z == "b")) => "new x", col=:z) *
    histogram(datalimits=extrema, bins=20)
draw(specs, facet=(linkxaxes=:minimal,))
data(df) * mapping(:x, :y, layout=:z) * histogram(bins=15) |> draw


density(; datalimits=automatic, kernel=automatic, bandwidth=automatic, npoints=200)

Fit a kernel density estimation of data.

Here, datalimits specifies the range for which the density should be calculated (it defaults to the extrema of the whole data). The keyword argument datalimits can be a tuple of two values, e.g. datalimits=(0, 10), or a function to be applied group by group, e.g. datalimits=extrema. The keyword arguments kernel and bandwidth are forwarded to KernelDensity.kde. npoints is the number of points used by Makie to draw the line.
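As a rough picture of how datalimits and npoints fit together, here is a hand-rolled one-dimensional estimate in plain Julia. It assumes a Gaussian kernel and a fixed, arbitrarily chosen bandwidth `h`; KernelDensity.kde selects its kernel and bandwidth differently, so this is an illustration and not the library's computation. The data and helper names are hypothetical.

```julia
# Standard normal density, used here as the kernel (an assumption of this sketch).
gaussian(u) = exp(-u^2 / 2) / sqrt(2π)

x = [-1.2, -0.4, 0.0, 0.3, 0.9, 1.5]   # hypothetical data
h = 0.5                                # fixed bandwidth, chosen arbitrarily
npoints = 200

# Plays the role of datalimits=extrema: evaluate only between the data extrema.
lo, hi = extrema(x)
grid = range(lo, hi, length=npoints)

# Average of kernels centered on each observation, evaluated at one grid point.
kde_at(g) = sum(gaussian((g - xi) / h) for xi in x) / (length(x) * h)

dens = kde_at.(grid)
println(length(dens))
```

A larger npoints gives a smoother-looking curve at the cost of more evaluations; it does not change the underlying estimate.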

Weighted data is supported via the keyword weights (passed to mapping).

df = (x=randn(5000), y=randn(5000), z=rand(["a", "b", "c", "d"], 5000))
specs = data(df) * mapping(:x, layout=:z) * AlgebraOfGraphics.density(datalimits=((-2.5, 2.5),))

specs = data(df) *
    mapping((:x, :z) => ((x, z) -> x + 5 * (z ∈ ["b", "d"])) => "new x", layout=:z) *
    AlgebraOfGraphics.density(datalimits=extrema)
draw(specs, facet=(linkxaxes=:minimal,))
data(df) * mapping(:x, :y, layout=:z) * AlgebraOfGraphics.density(npoints=50) |> draw
specs = data(df) * mapping(:x, :y, layout=:z) *
    AlgebraOfGraphics.density(npoints=50) * visual(Surface)

draw(specs, axis=(type=Axis3, zticks=0:0.1:0.2, limits=(nothing, nothing, (0, 0.2))))


df = (x=rand(["a", "b", "c"], 100), y=rand(["a", "b", "c"], 100), z=rand(["a", "b", "c"], 100))
specs = data(df) * mapping(:x, layout=:z) * frequency()
specs = data(df) * mapping(:x, layout=:z, color=:y, stack=:y) * frequency()
specs = data(df) * mapping(:x, :y, layout=:z) * frequency()


df = (x=rand(["a", "b", "c"], 100), y=rand(["a", "b", "c"], 100), z=rand(100), c=rand(["a", "b", "c"], 100))
specs = data(df) * mapping(:x, :z, layout=:c) * expectation()
specs = data(df) * mapping(:x, :z, layout=:c, color=:y, dodge=:y) * expectation()
specs = data(df) * mapping(:x, :y, :z, layout=:c) * expectation()


linear(; interval=automatic, level=0.95, dropcollinear=false, npoints=200)

Compute a linear fit of y ~ 1 + x. An optional named mapping weights determines the weights. Use interval to specify what type of interval the shaded band should represent, for a given coverage level (the default 0.95 corresponds to alpha = 0.05). Valid values of interval are :confidence, to delimit the uncertainty of the predicted relationship, and :prediction, to delimit estimated bounds for new data points. Use interval = nothing to compute only the line fit, without any uncertainty estimate. By default, this analysis throws an error on singular (collinear) data. To avoid that, set dropcollinear=true. npoints is the number of points used by Makie to draw the shaded band.
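For intuition about what a fit of y ~ 1 + x with optional weights amounts to, the following plain-Julia sketch solves the least-squares problem directly. It is an illustration under stated assumptions, not the code path the analysis uses: the data are synthetic and noise-free, the weights are arbitrary, and no interval is computed.

```julia
x = collect(1.0:5.0)
y = 2 .+ 3 .* x              # synthetic data lying exactly on a line

# Design matrix for y ~ 1 + x: a column of ones (intercept) and x (slope).
X = [ones(length(x)) x]

# Ordinary least squares via the backslash operator.
coef = X \ y

# Weighted least squares: scale rows by the square root of the weights.
w = [1.0, 1.0, 2.0, 2.0, 1.0]   # hypothetical weights
sw = sqrt.(w)
coef_w = (sw .* X) \ (sw .* y)

# Evaluate the fitted line on an evenly spaced grid, as npoints does for drawing.
npoints = 200
xgrid = range(extrema(x)..., length=npoints)
yhat = coef[1] .+ coef[2] .* xgrid

println(coef)
println(coef_w)
```

Because the synthetic data are exactly linear, both solves recover the same intercept and slope; with noisy data the weighted and unweighted coefficients would generally differ.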

Weighted data is supported via the keyword weights (passed to mapping).

x = 1:0.05:10
a = rand(1:7, length(x))
y = 1.2 .* x .+ a .+ 0.5 .* randn.()
df = (; x, y, a)
specs = data(df) * mapping(:x, :y, color=:a => nonnumeric) * (linear() + visual(Scatter))


smooth(; span=0.75, degree=2, npoints=200)

Fit a loess model. span is the degree of smoothing, typically in [0, 1]. Smaller values result in a smaller local context in fitting. degree is the polynomial degree used in the loess model. npoints is the number of points used by Makie to draw the line.

x = 1:0.05:10
a = rand(1:7, length(x))
y = sin.(x) .+ a .+ 0.1 .* randn.()
df = (; x, y, a)
specs = data(df) * mapping(:x, :y, color=:a => nonnumeric) * (smooth() + visual(Scatter))


contours(; levels=5, kwargs...)

Create contour lines over the grid spanned over x and y by args 1 and 2 in the mapping, with height values z passed via arg 3.

You can pass the number of levels as an integer or a vector of levels. The levels are calculated across the whole z data if they are specified as an integer.

Note that visual(Contour) only works in a limited way with AlgebraOfGraphics since version 0.7, because the internal calculations it does are not compatible with the scale system. With visual(Contour), you can only have categorically-colored contours (for example to visualize contours of multiple categories). Alternatively, if you set the colormap attribute, you can get continuously-colored contours but the levels will not be known to AlgebraOfGraphics, so they won't be synchronized across facets and there will not be a colorbar.

All other keyword arguments are forwarded as attributes to the underlying Contour plot.

x = repeat(1:10, 10)
y = repeat(11:20, inner = 10)
z = sqrt.(x .* y)
df = (; x, y, z)
specs = data(df) * mapping(:x, :y, :z) * contours(levels = 8)
x = repeat(1:10, 10)
y = repeat(11:20, inner = 10)
z = sqrt.(x .* y)
df = (; x, y, z)
specs = data(df) * mapping(:x, :y, :z) * contours(levels = 8, labels = true)

Filled Contours

filled_contours(; bands=automatic, levels=automatic)

Create filled contours over the grid spanned over x and y by args 1 and 2 in the mapping, with height values z passed via arg 3.

You can pass either the number of bands to bands or a vector of levels (the boundaries of the bands) to levels, but not both. When levels is passed, the number of bands is length(levels) - 1. The levels are calculated across the whole z data if the number of bands is specified. If neither levels nor bands is specified, the default is bands = 10.
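The relationship between levels and bands can be checked with a few lines of plain Julia. The range below is just a sample levels value, and the names `nbands` and `bands` are illustrative only.

```julia
levels = 3:2:15               # boundaries of the bands

# Each band lies between two consecutive levels.
nbands = length(levels) - 1

# Pair up consecutive levels to list the (lower, upper) boundary of each band.
bands = collect(zip(levels[1:end-1], levels[2:end]))

println(nbands)
println(bands)
```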

Note that visual(Contourf) does not work with AlgebraOfGraphics since version 0.7, because the internal binning it does is not compatible with the scale system.

x = repeat(1:10, 10)
y = repeat(11:20, inner = 10)
z = sqrt.(x .* y)
df = (; x, y, z)
specs = data(df) * mapping(:x, :y, :z) * filled_contours(levels = 3:2:15)

