Vega-Lite: Scales
(part 2)

Data 304

Scale Types

Vega-Lite supports the following scale types:

Continuous: continuous domain \(\to\) continuous range
- “linear”, “pow”, “sqrt”, “symlog”, “log”, “time”, “utc”
Discrete: discrete domain \(\to\)
- discrete range: “ordinal”, or
- continuous range: “band”, “point”
Discretizing: continuous domain \(\to\) discrete range
- “bin-ordinal”, “quantile”, “quantize”, “threshold”

Scale Types

domain	range	scale type
continuous	continuous	continuous (linear, pow, sqrt, symlog, log, time, utc)
continuous	discrete	discretizing (bin-ordinal, quantile, quantize, threshold)
discrete	continuous	discrete (point, band)
discrete	discrete	discrete (ordinal)

Default scale types

The default scale type depends on the data type and the encoding channel.

Binning

Binning is a transformation that puts quantitative values into “bins”.

This is familiar from histograms.
Binning can be used for other properties as well.

Creating bins (2 ways)

There are two ways to create bins in Vega-Lite:

transform

{
  ...
  "transform": [
    {"bin": ..., "field": ..., "as": ...} // Bin Transform
     ...
  ],
  ...
}

shortcut in encoding

"size": {"field": ..., "type": "quantitative", "bin": ...}

Binned size

'{
  "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
  "data": {"url": "https://cdn.jsdelivr.net/npm/vega-datasets@2.8.0/data/seattle-weather.csv"},  
  "width": 800, "height": 250,
  "title": "High temperatures in Seattle",
  "mark": {"type": "point"},
  "encoding": {
    "x": {"field": "date", "type": "temporal"},
    "y": {"field": "temp_max", "type": "quantitative", 
          "scale": {"domain": [0,30]}}, 
    "size": {"field": "precipitation", "type": "quantitative", "bin": true},
    "opacity": {"value": 0.7}
  }
}' |> as_vegaspec()

Controlling the bins

To get default bins, use "bin": true.

To customize the bins, supply a BinParams object in place of true:

Examples

  "bin": {"step": 5, "anchor": 0}

  "bin": {"maxbins": 15}

  "bin": {"steps": [1, 5, 10]}

  "bin": {"extent": [0, 10], "step": 2.5}

Give it a try

Create this graphic using bin defaults.
Then experiment with some of the bin options.

Binning color

Q. How do we create this plot?

"color": {"field": ..., "type": "quantitative", "bin": true}

Your turn

Choose a color scheme that makes the bins easier to see.
See if using circles (or fill) is better than using points.
What happens if you bin shape instead of color?

Comparing color schemes

Modifying the scale type

Q. How do you think we tell Vega-Lite to do this for size?

"size": {..., "scale": { "type": "threshold", 
                         "domain": [0.2, 0.5, 1], 
                         "range": [25, 50, 100, 200]}}

Continuous scales

"scale":{"type": "symlog", "constant": ...}

Your turn

Try some other scale types: “log”, “pow”, “sqrt”.
Why can’t we bind the scale type to a select input?
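
A minimal sketch of a symlog scale whose constant is bound to a slider. It assumes the constant property accepts an expression reference, as scale properties generally do in Vega-Lite v5. The parameter name k, the slider limits, and the choice of precipitation as the field are all illustrative.

```json
{
  "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
  "description": "Sketch: symlog y scale with a slider-controlled constant. Parameter k and its limits are illustrative.",
  "data": {"url": "https://cdn.jsdelivr.net/npm/vega-datasets@2.8.0/data/seattle-weather.csv"},
  "params": [
    {
      "name": "k",
      "value": 1,
      "bind": {"input": "range", "min": 0.1, "max": 10, "step": 0.1}
    }
  ],
  "width": 800,
  "height": 250,
  "mark": {"type": "point"},
  "encoding": {
    "x": {"field": "date", "type": "temporal"},
    "y": {
      "field": "precipitation",
      "type": "quantitative",
      "scale": {"type": "symlog", "constant": {"expr": "k"}}
    }
  }
}
```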

Ordinal scales

Ordinal scales have a discrete domain and a discrete range.

- essentially a look-up table mapping domain (data values) to range (visual values)
- default for color and shape with ordinal/nominal data
- main options: domain, range, scheme
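
As a sketch of the look-up-table idea, the spec below pairs each value of the weather field in the Seattle data with a hand-picked color. The particular colors are illustrative; the domain and range are matched by position.

```json
{
  "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
  "description": "Sketch: ordinal color scale with explicit domain and range. The colors are illustrative.",
  "data": {"url": "https://cdn.jsdelivr.net/npm/vega-datasets@2.8.0/data/seattle-weather.csv"},
  "width": 800,
  "height": 250,
  "mark": {"type": "point"},
  "encoding": {
    "x": {"field": "date", "type": "temporal"},
    "y": {"field": "temp_max", "type": "quantitative"},
    "color": {
      "field": "weather",
      "type": "nominal",
      "scale": {
        "domain": ["sun", "fog", "drizzle", "rain", "snow"],
        "range": ["#e7ba52", "#c7c7c7", "#aec7e8", "#1f77b4", "#9467bd"]
      }
    }
  }
}
```

To use a named scheme instead of listing colors, replace the range with something like "scheme": "dark2".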

But we can also use a continuous range with a discrete domain…

Band scales

Band scales are the default for nominal and ordinal fields on the position channels (x and y) of bar and rect marks.

Point scales

- point scale = band scale with bandwidth = 0
- default for nominal and ordinal fields on the position channels of other marks, and for size and opacity

Monarchs data

monarchs.json

[
{"name":"Elizabeth","start":1565,"end":1603,"index":0},
{"name":"James I","start":1603,"end":1625,"index":1},
{"name":"Charles I","start":1625,"end":1649,"index":2},
{"name":"Cromwell","start":1649,"end":1660,"commonwealth":true,"index":3},
{"name":"Charles II","start":1660,"end":1685,"index":4},
{"name":"James II","start":1685,"end":1689,"index":5},
{"name":"W&M","start":1689,"end":1702,"index":6},
{"name":"Anne","start":1702,"end":1714,"index":7},
{"name":"George I","start":1714,"end":1727,"index":8},
{"name":"George II","start":1727,"end":1760,"index":9},
{"name":"George III","start":1760,"end":1820,"index":10},
{"name":"George IV","start":1820,"end":1820,"index":11}
]

name	start	end	index	commonwealth
Elizabeth	1565	1603	0	NA
James I	1603	1625	1	NA
Charles I	1625	1649	2	NA
Cromwell	1649	1660	3	TRUE
Charles II	1660	1685	4	NA
James II	1685	1689	5	NA
W&M	1689	1702	6	NA
Anne	1702	1714	7	NA
George I	1714	1727	8	NA
George II	1727	1760	9	NA
George III	1760	1820	10	NA
George IV	1820	1820	11	NA

Setting padding for bars
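
A minimal sketch of one way to set the padding, using the first four monarchs as inline data. On a band scale, paddingInner sets the gap between bars and paddingOuter the gap at each end, both as a fraction of the step (0 to 1). The values 0.4 and 0.2 are illustrative, and "sort": null keeps the names in data order.

```json
{
  "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
  "description": "Sketch: band-scale padding for bars. Padding values are illustrative.",
  "data": {
    "values": [
      {"name": "Elizabeth", "start": 1565, "end": 1603, "index": 0},
      {"name": "James I", "start": 1603, "end": 1625, "index": 1},
      {"name": "Charles I", "start": 1625, "end": 1649, "index": 2},
      {"name": "Cromwell", "start": 1649, "end": 1660, "commonwealth": true, "index": 3}
    ]
  },
  "width": 300,
  "mark": "bar",
  "encoding": {
    "x": {
      "field": "name",
      "type": "nominal",
      "sort": null,
      "scale": {"paddingInner": 0.4, "paddingOuter": 0.2}
    },
    "y": {"field": "start", "type": "quantitative", "scale": {"zero": false}},
    "y2": {"field": "end"}
  }
}
```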

Your turn

What stories can you tell with these data? [monarchs.json]
How can you modify the graphic to tell the various stories (still using bars)?
Are there alternatives to bars that you should consider?

Sorting the scale range

  "encoding": {
    "x": {"field": "name", "type": "nominal",
          "sort": {"field": "reign", "order": "descending"},

Your turn

What other sorting might be interesting here (and better than alphabetical)?

Another use of sort

Jitter

Sometimes a little imprecision is better than being exact…

Q. How do we create this kind of plot? [This uses cars.json.]

Jitter with quantitative scales

Manually calculate a new field using random().

{ ...,
  "transform": [
    {"calculate": "datum.Cylinders + 0.5 * random() - 0.25", 
    "as": "jCylinders"}],
  ...
}
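
Filled out into a complete spec, the calculation above might look like the sketch below. Since random() returns a value in [0, 1), each point moves by an amount between -0.25 and 0.25. Putting Horsepower on x is an illustrative choice.

```json
{
  "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
  "description": "Sketch: manual jitter on a quantitative axis. The use of Horsepower for x is illustrative.",
  "data": {"url": "https://cdn.jsdelivr.net/npm/vega-datasets@2.8.0/data/cars.json"},
  "transform": [
    {
      "calculate": "datum.Cylinders + 0.5 * random() - 0.25",
      "as": "jCylinders"
    }
  ],
  "mark": {"type": "point"},
  "encoding": {
    "x": {"field": "Horsepower", "type": "quantitative"},
    "y": {"field": "jCylinders", "type": "quantitative", "title": "Cylinders"}
  }
}
```

Changing the multiplier (0.5) and subtracting half of it keeps the jitter centered while making it wider or narrower.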

Jitter with nominal scales

With nominal scales, we can use xOffset or yOffset encodings.

{ ..., 
  "transform": [{"calculate": "random()", "as": "random"}],
  ...,
  "encoding": { ..., 
    "y": {"field": "Cylinders", "type": "nominal"},
    "yOffset": {"field": "random", "type": "quantitative"}
  }
}

Q. How can we control how much jitter there is when using xOffset or yOffset?
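
One possible answer, offered as a sketch rather than the only way: the offset channel has its own scale, which maps its domain onto the width of each band. Giving that scale a domain wider than the random values should squeeze the points toward the middle of the band. The domain [-1, 2] is illustrative; with random values in [0, 1) it would use roughly the middle third of each band.

```json
{
  "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
  "description": "Sketch: narrower jitter by widening the offset scale domain. The domain [-1, 2] is illustrative.",
  "data": {"url": "https://cdn.jsdelivr.net/npm/vega-datasets@2.8.0/data/cars.json"},
  "transform": [
    {"calculate": "random()", "as": "random"}
  ],
  "height": {"step": 50},
  "mark": {"type": "point"},
  "encoding": {
    "x": {"field": "Horsepower", "type": "quantitative"},
    "y": {"field": "Cylinders", "type": "nominal"},
    "yOffset": {
      "field": "random",
      "type": "quantitative",
      "scale": {"domain": [-1, 2]}
    }
  }
}
```

Adjusting the band padding on the y scale is another knob worth trying.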

Turning off the scale

Q1. When might we not want to have a scale at all?

Q2. How do we achieve that?

A2.

  "scale": null

A1. When data values are also range values.

Example: colors – you might have literal colors as a column in your data
Example: random – you might generate random range values with calculate.
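
A minimal sketch of the literal-colors case, with made-up inline data. Each row carries its own color in a column named fill (an illustrative name), and "scale": null passes those values straight through to the marks.

```json
{
  "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
  "description": "Sketch: literal colors stored in the data, used with no scale. Data and field names are made up.",
  "data": {
    "values": [
      {"label": "A", "value": 3, "fill": "steelblue"},
      {"label": "B", "value": 5, "fill": "firebrick"},
      {"label": "C", "value": 2, "fill": "#2ca02c"}
    ]
  },
  "mark": "bar",
  "encoding": {
    "x": {"field": "label", "type": "nominal"},
    "y": {"field": "value", "type": "quantitative"},
    "color": {"field": "fill", "type": "nominal", "scale": null}
  }
}
```

With no scale there is also no legend for that channel, since there is no mapping to explain.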
