Data Transformations
Histogram
AlgebraOfGraphics.histogram - Function
histogram(; bins=automatic, weights=automatic, normalization=:none)
Compute a histogram. bins can be an Int, which creates that number of equal-width bins over the range of values. Alternatively, it can be a sorted iterable of bin edges.
The histogram can be normalized by setting normalization. Possible values are:
:pdf: Normalize by the sum of weights and by bin sizes. The resulting histogram has norm 1 and represents a PDF.
:density: Normalize by bin sizes only. The resulting histogram represents the count density of the input and does not have norm 1.
:probability: Normalize by the sum of weights only. The resulting histogram represents the fraction of probability mass in each bin and does not have norm 1.
:none: Do not normalize.
Weighted data is supported via the keyword weights.
Normalizations are computed within groups. For example, with normalization=:pdf, each group is normalized separately, so the histogram of every group has norm 1.
using AlgebraOfGraphics, CairoMakie
set_aog_theme!()
df = (x=randn(1000), y=randn(1000), z=rand(["a", "b", "c"], 1000))
specs = data(df) * mapping(:x, layout=:z) * histogram(bins=range(-2, 2, length=15))
draw(specs)
specs = data(df) * mapping(:x, dodge=:z, color=:z) * histogram(bins=range(-2, 2, length=15))
draw(specs)
specs = data(df) * mapping(:x, stack=:z, color=:z) * histogram(bins=range(-2, 2, length=15))
draw(specs)
data(df) * mapping(:x, :y, layout=:z) * histogram(bins=15) |> draw
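To make the four normalization options in the docstring above concrete, here is a minimal sketch in plain Julia. The helper normalized_histogram is hypothetical and is not part of AlgebraOfGraphics; it assumes half-open bins [edge, next edge) and silently drops values outside the edges, which may differ from what the library does.

```julia
# Hypothetical helper, not part of AlgebraOfGraphics: bin `xs` into the
# half-open intervals [edges[i], edges[i+1]) and normalize the result.
function normalized_histogram(xs, edges; weights=ones(length(xs)), normalization=:none)
    nbins = length(edges) - 1
    counts = zeros(nbins)
    for (x, w) in zip(xs, weights)
        i = searchsortedlast(edges, x)        # last edge that is <= x
        1 <= i <= nbins && (counts[i] += w)   # ignore values outside the edges
    end
    widths = diff(collect(edges))
    total = sum(counts)
    if normalization == :pdf
        return counts ./ (total .* widths)    # sum of weights and bin sizes
    elseif normalization == :density
        return counts ./ widths               # bin sizes only
    elseif normalization == :probability
        return counts ./ total                # sum of weights only
    elseif normalization == :none
        return counts
    else
        throw(ArgumentError("unknown normalization: $normalization"))
    end
end

xs = [0.1, 0.2, 0.7, 1.5, 1.6, 1.9]
edges = [0.0, 0.5, 1.0, 2.0]                  # bin widths 0.5, 0.5, 1.0

counts = normalized_histogram(xs, edges)      # [2.0, 1.0, 3.0]
dens = normalized_histogram(xs, edges; normalization=:density)   # [4.0, 2.0, 3.0]
prob = normalized_histogram(xs, edges; normalization=:probability)
pdf = normalized_histogram(xs, edges; normalization=:pdf)
```

In this sketch, "norm 1" means that the bin values multiplied by the bin widths sum to 1, which holds for pdf. The values in prob sum to 1 on their own, without the widths.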
Density
AlgebraOfGraphics.density - Function
density(; extrema, npoints, kernel, bandwidth)
Fit a kernel density estimation of data.
df = (x=randn(5000), y=randn(5000), z=rand(["a", "b", "c", "d"], 5000))
data(df) * mapping(:x, layout=:z) * AlgebraOfGraphics.density() |> draw
data(df) * mapping(:x, :y, layout=:z) * AlgebraOfGraphics.density(npoints=50) |> draw
specs = data(df) * mapping(:x, :y, layout=:z) *
    AlgebraOfGraphics.density(npoints=50) * visual(Surface)
draw(specs, axis=(type=Axis3, zticks=0:0.1:0.2, limits=(nothing, nothing, (0, 0.2))))
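As a rough illustration of what a one-dimensional kernel density estimate computes, the sketch below evaluates a Gaussian kernel estimate on a grid in plain Julia. The names gaussian and kde_sketch are hypothetical, and the fixed Gaussian kernel and hand-picked bandwidth are assumptions; the library's own defaults are not reproduced here.

```julia
# Hypothetical helpers, not the AlgebraOfGraphics implementation.
gaussian(u) = exp(-u^2 / 2) / sqrt(2π)

# Evaluate the kernel density estimate of `xs` at every point of `grid`:
# an average of kernels centred on the data, each scaled by the bandwidth.
function kde_sketch(xs, grid; bandwidth)
    n = length(xs)
    return [sum(gaussian((g - x) / bandwidth) for x in xs) / (n * bandwidth) for g in grid]
end

xs = [-1.0, 0.0, 0.5, 2.0]
grid = range(-8, 9, length=400)       # plays the role of extrema and npoints
dens = kde_sketch(xs, grid; bandwidth=0.5)

# A Riemann sum over the grid approximates the integral of the estimate,
# which should be close to 1 because the grid covers essentially all the mass.
area = sum(dens) * step(grid)
```

A larger bandwidth gives a smoother, flatter curve, while a smaller one follows individual data points more closely.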
Frequency
AlgebraOfGraphics.frequency - Function
frequency()
Compute a frequency table of the arguments.
df = (x=rand(["a", "b", "c"], 100), y=rand(["a", "b", "c"], 100), z=rand(["a", "b", "c"], 100))
specs = data(df) * mapping(:x, layout=:z) * frequency()
draw(specs)
specs = data(df) * mapping(:x, layout=:z, color=:y, stack=:y) * frequency()
draw(specs)
specs = data(df) * mapping(:x, :y, layout=:z) * frequency()
draw(specs)
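For reference, a frequency table simply counts how often each value, or each combination of values, occurs. The sketch below does this in plain Julia; frequency_table is a hypothetical helper and not the function the library uses internally.

```julia
# Hypothetical helper: count the occurrences of each distinct element.
function frequency_table(labels)
    counts = Dict{eltype(labels), Int}()
    for label in labels
        counts[label] = get(counts, label, 0) + 1
    end
    return counts
end

x = ["a", "b", "a", "c", "a", "b"]
y = ["u", "u", "v", "u", "v", "u"]

# One argument: counts per category of x.
table = frequency_table(x)                    # "a" => 3, "b" => 2, "c" => 1

# Two arguments: counts per (x, y) combination.
joint = frequency_table(collect(zip(x, y)))   # e.g. ("a", "v") => 2
```

The one-argument case corresponds to the bar plots above, and the two-argument case to the plot that maps both :x and :y.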
Expectation
AlgebraOfGraphics.expectation - Function
expectation(args...)
Compute the expected value of the last argument conditioned on the preceding ones.
df = (x=rand(["a", "b", "c"], 100), y=rand(["a", "b", "c"], 100), z=rand(100), c=rand(["a", "b", "c"], 100))
specs = data(df) * mapping(:x, :z, layout=:c) * expectation()
draw(specs)
specs = data(df) * mapping(:x, :z, layout=:c, color=:y, dodge=:y) * expectation()
draw(specs)
specs = data(df) * mapping(:x, :y, :z, layout=:c) * expectation()
draw(specs)
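"The expected value of the last argument conditioned on the preceding ones" can be read as a group-wise mean. A minimal sketch in plain Julia follows; conditional_mean is a hypothetical helper used only for illustration.

```julia
# Hypothetical helper: mean of `zs` within each distinct value of `groups`.
function conditional_mean(groups, zs)
    sums = Dict{eltype(groups), Float64}()
    counts = Dict{eltype(groups), Int}()
    for (g, z) in zip(groups, zs)
        sums[g] = get(sums, g, 0.0) + z
        counts[g] = get(counts, g, 0) + 1
    end
    return Dict(g => sums[g] / counts[g] for g in keys(sums))
end

x = ["a", "b", "a", "b", "c"]
y = ["u", "v", "u", "v", "u"]
z = [1.0, 2.0, 3.0, 6.0, 5.0]

# Condition on x only: "a" => 2.0, "b" => 4.0, "c" => 5.0
by_x = conditional_mean(x, z)

# Condition on x and y together by grouping on the (x, y) pairs.
by_xy = conditional_mean(collect(zip(x, y)), z)
```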
Linear
AlgebraOfGraphics.linear - Function
linear(; interval)
Compute a linear fit of y ~ 1 + x. An optional named mapping weights determines the weights. Use interval to specify what type of interval the shaded band should represent. Valid values of interval are :confidence, delimiting the uncertainty of the predicted relationship, and :prediction, delimiting estimated bounds for new data points.
x = 1:0.05:10
a = rand(1:7, length(x))
y = 1.2 .* x .+ a .+ 0.5 .* randn.()
df = (; x, y, a)
specs = data(df) * mapping(:x, :y, color=:a => nonnumeric) * (linear() + visual(Scatter))
draw(specs)
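The model y ~ 1 + x is an ordinary least-squares fit with an intercept and a slope. The sketch below solves it directly in plain Julia, with and without weights. It only illustrates the fit itself under the assumption of noiseless data; it is not how the library computes the line, and it does not compute the interval band.

```julia
using LinearAlgebra

x = collect(1.0:0.5:5.0)
y = 2.0 .+ 1.2 .* x                 # noiseless data, so the fit is exact

# Design matrix for y ~ 1 + x: a column of ones and a column of x values.
X = hcat(ones(length(x)), x)

# Least-squares solution: coef[1] is the intercept, coef[2] is the slope.
coef = X \ y

# Weighted least squares: scale each row by the square root of its weight.
w = collect(range(0.5, 2.0, length=length(x)))   # illustrative weights
sw = sqrt.(w)
coef_weighted = (sw .* X) \ (sw .* y)
```

Because the data here lie exactly on a line, both fits recover an intercept of about 2.0 and a slope of about 1.2. With noisy data, the weights shift the fit toward the observations that carry more weight.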
Smoothing
AlgebraOfGraphics.smooth - Function
smooth(span=0.75, degree=2)
Fit a loess model. span is the degree of smoothing, typically in [0,1]. Smaller values result in smaller local context in fitting. degree is the polynomial degree used in the loess model.
x = 1:0.05:10
a = rand(1:7, length(x))
y = sin.(x) .+ a .+ 0.1 .* randn.()
df = (; x, y, a)
specs = data(df) * mapping(:x, :y, color=:a => nonnumeric) * (smooth() + visual(Scatter))
draw(specs)
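To show how span and degree interact, here is a simplified loess-style sketch in plain Julia: at each evaluation point it fits a weighted polynomial of the given degree to the nearest fraction span of the data. The helper names, the tricube weight function, and the neighbour selection are assumptions about a textbook local regression, not the implementation the library relies on.

```julia
using LinearAlgebra

# Tricube weight: 1 at distance 0, falling smoothly to 0 at distance 1.
tricube(u) = abs(u) < 1 ? (1 - abs(u)^3)^3 : 0.0

# Hypothetical helper: local polynomial estimate of y at the point x0.
function loess_sketch(x, y, x0; span=0.75, degree=2)
    n = length(x)
    q = clamp(ceil(Int, span * n), degree + 1, n)  # size of the local context
    d = abs.(x .- x0)
    h = sort(d)[q]                                 # distance to the q-th nearest point
    w = tricube.(d ./ h)                           # points beyond h get weight 0
    X = [(xi - x0)^p for xi in x, p in 0:degree]   # polynomial basis centred at x0
    sw = sqrt.(w)
    coef = (sw .* X) \ (sw .* y)                   # weighted least squares
    return coef[1]                                 # value of the local fit at x0
end

x = collect(0:0.1:4)
y = x .^ 2                                         # exactly quadratic data
fitted = loess_sketch(x, y, 2.0)                   # close to 4.0 for degree=2
```

A smaller span shrinks q, so each local fit sees fewer neighbours and the curve follows the data more closely. The sketch does not handle the case of many repeated x values, where h can be zero.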
This page was generated using Literate.jl.