# Analyses

## Histogram

`AlgebraOfGraphics.histogram` (function)

`histogram(; bins=automatic, datalimits=automatic, closed=:left, normalization=:none)`

Compute a histogram.

The attribute `bins` can be an `Integer`, an `AbstractVector` (in particular, a range), or a `Tuple` of either integers or abstract vectors (useful for 2- or 3-dimensional histograms). When `bins` is an `Integer`, it denotes the approximate number of equal-width intervals used to compute the histogram. In that case, the range covered by the intervals is defined by `datalimits` (it defaults to the extrema of the whole data). The keyword argument `datalimits` can be a tuple of two values, e.g. `datalimits=(0, 10)`, or a function to be applied group by group, e.g. `datalimits=extrema`. When `bins` is an `AbstractVector`, it denotes the intervals directly.

`closed` determines whether the intervals are closed to the left or to the right.

The histogram can be normalized by setting `normalization`. Possible values are:

- `:pdf`: Normalize by sum of weights and bin sizes. Resulting histogram has norm 1 and represents a PDF.
- `:density`: Normalize by bin sizes only. Resulting histogram represents count density of input and does not have norm 1.
- `:probability`: Normalize by sum of weights only. Resulting histogram represents the fraction of probability mass for each bin and does not have norm 1.
- `:none`: Do not normalize.

Weighted data is supported via the keyword `weights` (passed to `mapping`).

Normalizations are computed within groups. For example, in the case of `normalization=:pdf`, the sum of weights *within each group* will be equal to `1`.
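The normalization modes can be illustrated with a small worked calculation. This is only a sketch in plain Julia: the counts and bin width are made-up values, and the variable names are illustrative rather than part of AlgebraOfGraphics.

```julia
# Hypothetical counts for three equal-width bins (not produced by AlgebraOfGraphics).
counts = [2, 3, 5]
width = 0.5                 # bin size
total = sum(counts)         # sum of weights (unit weights assumed)

pdf_values         = counts ./ (total * width)   # like normalization=:pdf
density_values     = counts ./ width             # like normalization=:density
probability_values = counts ./ total             # like normalization=:probability

# Only the :pdf values integrate to 1 over the bins.
@assert sum(pdf_values .* width) ≈ 1
# The :probability values sum to 1, but their integral over the bins is not 1.
@assert sum(probability_values) ≈ 1
```

With grouped data, the same arithmetic would be applied separately to the counts of each group.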

```
using AlgebraOfGraphics, CairoMakie
set_aog_theme!()
df = (x=randn(5000), y=randn(5000), z=rand(["a", "b", "c"], 5000))
specs = data(df) * mapping(:x, layout=:z) * histogram(bins=range(-2, 2, length=15))
draw(specs)
```

```
specs = data(df) * mapping(:x, dodge=:z, color=:z) * histogram(bins=range(-2, 2, length=15))
draw(specs)
```

```
specs = data(df) * mapping(:x, stack=:z, color=:z) * histogram(bins=range(-2, 2, length=15))
draw(specs)
```

```
specs = data(df) *
    mapping((:x, :z) => ((x, z) -> x + 5 * (z == "b")) => "new x", col=:z) *
    histogram(datalimits=extrema, bins=20)
draw(specs, facet=(linkxaxes=:minimal,))
```

`data(df) * mapping(:x, :y, layout=:z) * histogram(bins=15) |> draw`

## Density

`AlgebraOfGraphics.density` (function)

`density(; datalimits=automatic, kernel=automatic, bandwidth=automatic, npoints=200)`

Fit a kernel density estimate of `data`.

Here, `datalimits` specifies the range for which the density should be calculated (it defaults to the extrema of the whole data). The keyword argument `datalimits` can be a tuple of two values, e.g. `datalimits=(0, 10)`, or a function to be applied group by group, e.g. `datalimits=extrema`. The keyword arguments `kernel` and `bandwidth` are forwarded to `KernelDensity.kde`. `npoints` is the number of points used by Makie to draw the line.

Weighted data is supported via the keyword `weights` (passed to `mapping`).
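To make the roles of `datalimits`, `bandwidth` and `npoints` concrete, here is a minimal hand-rolled Gaussian kernel density sketch in plain Julia. It is not how AlgebraOfGraphics computes the estimate (that is delegated to `KernelDensity.kde`); the names `gauss` and `kde_sketch` are hypothetical, and the keyword names merely mirror the ones above.

```julia
# Standard Gaussian kernel.
gauss(u) = exp(-u^2 / 2) / sqrt(2π)

# Evaluate a Gaussian kernel density estimate on `npoints` evenly spaced
# points spanning `datalimits`. Illustrative only.
function kde_sketch(xs; datalimits=extrema(xs), bandwidth=0.5, npoints=200)
    grid = range(datalimits[1], datalimits[2]; length=npoints)
    n = length(xs)
    dens = [sum(gauss((g - x) / bandwidth) for x in xs) / (n * bandwidth) for g in grid]
    return grid, dens
end

grid, dens = kde_sketch([0.0, 1.0, 2.0]; datalimits=(-5.0, 7.0), bandwidth=0.5)

# With limits wide enough to cover the tails, the estimate integrates to about 1.
@assert isapprox(sum(dens) * step(grid), 1; atol=1e-3)
```

Narrower `datalimits` truncate the curve rather than renormalize it, so the drawn area can be smaller than 1.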

```
df = (x=randn(5000), y=randn(5000), z=rand(["a", "b", "c", "d"], 5000))
specs = data(df) * mapping(:x, layout=:z) * AlgebraOfGraphics.density(datalimits=((-2.5, 2.5),))
draw(specs)
```

```
specs = data(df) *
    mapping((:x, :z) => ((x, z) -> x + 5 * (z ∈ ["b", "d"])) => "new x", layout=:z) *
    AlgebraOfGraphics.density(datalimits=extrema)
draw(specs, facet=(linkxaxes=:minimal,))
```

`data(df) * mapping(:x, :y, layout=:z) * AlgebraOfGraphics.density(npoints=50) |> draw`

```
specs = data(df) * mapping(:x, :y, layout=:z) *
    AlgebraOfGraphics.density(npoints=50) * visual(Surface)
draw(specs, axis=(type=Axis3, zticks=0:0.1:0.2, limits=(nothing, nothing, (0, 0.2))))
```

## Frequency

`AlgebraOfGraphics.frequency` (function)

`frequency()`

Compute a frequency table of the arguments.
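As a rough illustration of what a frequency table is, the following plain-Julia sketch counts how often each distinct value occurs. The helper `frequency_table` is hypothetical and is not part of AlgebraOfGraphics.

```julia
# Count occurrences of each distinct value (illustrative helper).
function frequency_table(xs)
    counts = Dict{eltype(xs), Int}()
    for x in xs
        counts[x] = get(counts, x, 0) + 1
    end
    return counts
end

table = frequency_table(["a", "b", "a", "c", "a"])
@assert table["a"] == 3
```

With two positional arguments, the same idea applies to pairs of values, e.g. `frequency_table(collect(zip(xs, ys)))`.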

```
df = (x=rand(["a", "b", "c"], 100), y=rand(["a", "b", "c"], 100), z=rand(["a", "b", "c"], 100))
specs = data(df) * mapping(:x, layout=:z) * frequency()
draw(specs)
```

```
specs = data(df) * mapping(:x, layout=:z, color=:y, stack=:y) * frequency()
draw(specs)
```

```
specs = data(df) * mapping(:x, :y, layout=:z) * frequency()
draw(specs)
```

## Expectation

`AlgebraOfGraphics.expectation` (function)

`expectation()`

Compute the expected value of the last argument conditioned on the preceding ones.
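One way to read this: for each distinct combination of the preceding arguments, average the last argument. The plain-Julia sketch below does this for a single conditioning variable, assuming the sample mean is used as the estimate; `conditional_mean` is a hypothetical helper, not the library's code.

```julia
# Mean of `vals` within each group of `groups` (illustrative helper).
function conditional_mean(groups, vals)
    sums = Dict{eltype(groups), Float64}()
    counts = Dict{eltype(groups), Int}()
    for (g, v) in zip(groups, vals)
        sums[g] = get(sums, g, 0.0) + v
        counts[g] = get(counts, g, 0) + 1
    end
    return Dict(g => sums[g] / counts[g] for g in keys(sums))
end

means = conditional_mean(["a", "b", "a", "b"], [1.0, 2.0, 3.0, 6.0])
@assert means["a"] == 2.0
```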

```
df = (x=rand(["a", "b", "c"], 100), y=rand(["a", "b", "c"], 100), z=rand(100), c=rand(["a", "b", "c"], 100))
specs = data(df) * mapping(:x, :z, layout=:c) * expectation()
draw(specs)
```

```
specs = data(df) * mapping(:x, :z, layout=:c, color=:y, dodge=:y) * expectation()
draw(specs)
```

```
specs = data(df) * mapping(:x, :y, :z, layout=:c) * expectation()
draw(specs)
```

## Linear

`AlgebraOfGraphics.linear` (function)

`linear(; interval=automatic, level=0.95, dropcollinear=false, npoints=200)`

Compute a linear fit of `y ~ 1 + x`. An optional named mapping `weights` determines the weights. Use `interval` to specify what type of interval the shaded band should represent, for a given coverage `level` (the default `0.95` corresponds to `alpha = 0.05`). Valid values of `interval` are `:confidence`, to delimit the uncertainty of the predicted relationship, and `:prediction`, to delimit estimated bounds for new data points. Use `interval = nothing` to only compute the line fit, without any uncertainty estimate. By default, this analysis errors on singular (collinear) data. To avoid that, it is possible to set `dropcollinear=true`. `npoints` is the number of points used by Makie to draw the shaded band.

Weighted data is supported via the keyword `weights` (passed to `mapping`).
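The model `y ~ 1 + x` amounts to an ordinary least-squares fit with an intercept and a slope. The sketch below reproduces that fit in plain Julia on made-up, noiseless data; it omits weights and intervals, and AlgebraOfGraphics may compute the fit differently internally.

```julia
using LinearAlgebra

# Hypothetical noiseless data lying exactly on y = 2 + 3x.
x = collect(1.0:5.0)
y = 2 .+ 3 .* x

X = [ones(length(x)) x]   # design matrix for y ~ 1 + x
β = X \ y                 # least-squares coefficients: (intercept, slope)

# Evaluate the fitted line on an evenly spaced grid, in the spirit of `npoints`.
npoints = 200
grid = range(extrema(x)...; length=npoints)
fitted = β[1] .+ β[2] .* grid

@assert β ≈ [2.0, 3.0]
```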

```
x = 1:0.05:10
a = rand(1:7, length(x))
y = 1.2 .* x .+ a .+ 0.5 .* randn.()
df = (; x, y, a)
specs = data(df) * mapping(:x, :y, color=:a => nonnumeric) * (linear() + visual(Scatter))
draw(specs)
```

## Smoothing

`AlgebraOfGraphics.smooth` (function)

`smooth(; span=0.75, degree=2, npoints=200)`

Fit a loess model. `span` is the degree of smoothing, typically in `[0,1]`. Smaller values result in smaller local context in fitting. `degree` is the polynomial degree used in the loess model. `npoints` is the number of points used by Makie to draw the line.
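To give a feel for how `span` and `degree` interact, here is a simplified loess-style sketch in plain Julia: a weighted polynomial fit around one evaluation point, using tricube weights over the nearest `span` fraction of the data. It is an assumption-laden illustration (no robustness iterations, no handling of duplicate `x` values), not the implementation used by AlgebraOfGraphics; `tricube` and `loess_sketch` are hypothetical names.

```julia
using LinearAlgebra

# Tricube weight function, a common choice in loess.
tricube(u) = abs(u) < 1 ? (1 - abs(u)^3)^3 : 0.0

# Local polynomial fit of the given degree, evaluated at x0.
function loess_sketch(x, y, x0; span=0.75, degree=2)
    n = length(x)
    q = clamp(ceil(Int, span * n), degree + 2, n)   # neighbours in the local window
    dists = abs.(x .- x0)
    h = sort(dists)[q]                              # window half-width
    w = tricube.(dists ./ h)
    X = [(xi - x0)^p for xi in x, p in 0:degree]    # local design matrix
    W = Diagonal(w)
    β = (X' * W * X) \ (X' * W * y)
    return β[1]                                     # fitted value at x0
end

x = collect(0:0.1:2)
y = x .^ 2
# A local quadratic fit reproduces exactly quadratic data.
@assert isapprox(loess_sketch(x, y, 1.0), 1.0; atol=1e-8)
```

A smaller `span` shrinks `q`, so fewer neighbours influence each fitted value and the curve follows the data more closely.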

```
x = 1:0.05:10
a = rand(1:7, length(x))
y = sin.(x) .+ a .+ 0.1 .* randn.()
df = (; x, y, a)
specs = data(df) * mapping(:x, :y, color=:a => nonnumeric) * (smooth() + visual(Scatter))
draw(specs)
```

## Contours

`AlgebraOfGraphics.contours` (function)

`contours(; levels=5, kwargs...)`

Create contour lines over the grid spanned over x and y by args 1 and 2 in the `mapping`, with height values z passed via arg 3.

You can pass the number of levels as an integer or a vector of levels. The levels are calculated across the whole z data if they are specified as an integer.

Note that `visual(Contour)` only works in a limited way with AlgebraOfGraphics since version 0.7, because the internal calculations it does are not compatible with the scale system. With `visual(Contour)`, you can only have categorically-colored contours (for example to visualize contours of multiple categories). Alternatively, if you set the `colormap` attribute, you can get continuously-colored contours but the levels will not be known to AlgebraOfGraphics, so they won't be synchronized across facets and there will not be a colorbar.

All other keyword arguments are forwarded as attributes to the underlying `Contour` plot.

```
x = repeat(1:10, 10)
y = repeat(11:20, inner = 10)
z = sqrt.(x .* y)
df = (; x, y, z)
specs = data(df) * mapping(:x, :y, :z) * contours(levels = 8)
draw(specs)
```

```
x = repeat(1:10, 10)
y = repeat(11:20, inner = 10)
z = sqrt.(x .* y)
df = (; x, y, z)
specs = data(df) * mapping(:x, :y, :z) * contours(levels = 8, labels = true)
draw(specs)
```

## Filled Contours

`AlgebraOfGraphics.filled_contours` (function)

`filled_contours(; bands=automatic, levels=automatic)`

Create filled contours over the grid spanned over x and y by args 1 and 2 in the `mapping`, with height values z passed via arg 3.

You can pass either the number of bands to `bands` or pass a vector of levels (the boundaries of the bands) to `levels`, but not both. The number of bands when `levels` is passed is `length(levels) - 1`. The levels are calculated across the whole z data if the number of `bands` is specified. If neither levels nor bands are specified, the default is `bands = 10`.
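The relationship between `levels` and the number of bands can be checked with a short calculation. The snippet below is plain Julia and only illustrates the counting; the variable names are illustrative.

```julia
levels = 3:2:15                       # band boundaries: 3, 5, 7, 9, 11, 13, 15
nbands = length(levels) - 1           # one band between each pair of neighbours

# Each band is delimited by two consecutive levels.
bands = collect(zip(levels[1:end-1], levels[2:end]))

@assert nbands == 6
@assert first(bands) == (3, 5)
```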

Note that `visual(Contourf)` does not work with AlgebraOfGraphics since version 0.7, because the internal binning it does is not compatible with the scale system.

```
x = repeat(1:10, 10)
y = repeat(11:20, inner = 10)
z = sqrt.(x .* y)
df = (; x, y, z)
specs = data(df) * mapping(:x, :y, :z) * filled_contours(levels = 3:2:15)
draw(specs)
```

*This page was generated using Literate.jl.*