JSON has become a data standard
Vega-Lite specifications are JSON objects.
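A value can be a string (in double quotes, ""), a number, an object, an array, true, false, or null.
An object is an (unordered) list of key-value pairs, where each key is a string in double quotes ("").
An array is an ordered list of values.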
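As a small illustration of these value types, the object below uses a string, a number, true, false, null, an array, and a nested object. All keys and values are made up for this sketch and do not come from any data set used here.

```json
{
  "_note": "Keys and values here are invented for illustration only.",
  "name": "Ada",
  "height": 70.5,
  "enrolled": true,
  "graduated": false,
  "middle_name": null,
  "scores": [90, 85.5, 100],
  "advisor": {"name": "Grace", "office": 101}
}
```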
The standard way to convert CSV to JSON is as an array of objects where each object represents one row of the data.
[
{ "name": "John Calvin", "height": 73.5, "weight": 205, },
{ "name": "Thomas Hobbes", "height": 71.5, "weight": 185, },
]
Note: the dangling (trailing) commas shown here are not part of the JSON standard, but lenient parsers accept them, which makes editing slightly easier.
Data can be provided in several ways, including:
Included as JSON within the Vega-Lite specification
Imported as JSON or CSV file (from local file or from URL)
Python and R wrappers handle converting from data frames to something Vega-Lite can deal with.
Note: The “raw” data for a Vega-Lite graphic are sent to the browser.
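A minimal sketch of the first option, with the data included inline under "data" → "values". It reuses the two rows from the CSV-to-JSON example (written as strict JSON, without trailing commas). The choice of mark and the assignment of height and weight to the axes are assumptions made only for illustration.

```json
{
  "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
  "description": "Sketch: inline data; the mark and axis choices are assumptions for illustration.",
  "data": {
    "values": [
      {"name": "John Calvin", "height": 73.5, "weight": 205},
      {"name": "Thomas Hobbes", "height": 71.5, "weight": 185}
    ]
  },
  "mark": "point",
  "encoding": {
    "x": {"field": "height", "type": "quantitative"},
    "y": {"field": "weight", "type": "quantitative"}
  }
}
```

To load the data from a file or URL instead, replace "values" with "url", as the Gapminder examples in this section do.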
The Vega team has assembled some data sets for testing and examples.
Some data sets are in JSON format, some are in CSV format.
You can find out more about some of the data sets and where they came from here.
Vega-Lite specifications are JSON objects that describe what sort of graphic should be rendered.
Vega-Lite \(\to\) Vega \(\to\) HTML + Javascript (or PNG or SVG)
The Vega editor provides an online editor to create and render Vega and Vega-Lite graphics.
Complex Vega-Lite graphics are created by composing views.
We’ll start with the simplest case, a single view.
Later we will learn about composing views by layering, faceting, concatenating, and repeating.
We are required to include “$schema” and at least one of “mark”, “layer”, “facet”, “hconcat”, “vconcat”, “concat”, or “repeat”
All but “mark” are used for complex graphics, so let’s make our minimal example by specifying a mark.
'{
"$schema": "https://vega.github.io/schema/vega-lite/v5.json",
"width": 500, "height": 300,
"mark": "point",
"background": "skyblue"
}' |> vegawidget::as_vegaspec()
I added a background color ("background": "skyblue") so you can see that a graphic is being made. It just doesn’t have anything on it yet.
{
"$schema": "https://vega.github.io/schema/vega-lite/v5.json",
"data": {"url": "https://cdn.jsdelivr.net/npm/vega-datasets@2.8.0/data/gapminder.json"},
"mark": "point",
"background": "skyblue", "width": 100, "height": 100
}
'{
"$schema": "https://vega.github.io/schema/vega-lite/v5.json",
"data": {"url": "https://cdn.jsdelivr.net/npm/vega-datasets@2.8.0/data/gapminder.json"},
"mark": "point",
"background": "skyblue", "width": 100, "height": 100
}' |> as_vegaspec()
Why do you think the plot looks the way it does?
Try some of the other marks and see what you get.
What happens if you delete the width and height?
The encoding specifies how graphical properties are mapped and/or set.
{
...,
"mark": "point",
"encoding": {
"x": {"field": "fertility", "type": "quantitative"},
"y": {"field": "life_expect", "type": "quantitative"},
"color": {"value": "maroon"}
}
}
'
{
"$schema": "https://vega.github.io/schema/vega-lite/v5.json",
"data": {"url": "https://cdn.jsdelivr.net/npm/vega-datasets@2.8.0/data/gapminder.json"},
"height": 150, "width": 400,
"mark": "point",
"encoding": {
"x": {"field": "fertility", "type": "quantitative"},
"y": {"field": "life_expect", "type": "quantitative"},
"color": {"value": "maroon"}
}
}' |> as_vegaspec()
Change the color of the dots to some other color you like.
Encode fill instead of (or in addition to) color.
Make the dots (a little) larger using the size property.
Set fillOpacity to a number between 0 and 1. Experiment with some different values.
Map the dot size to pop (the population of the country).
What happens if you change the mark to something else? Try it and find out.
These data cover years from 1955 to 2005. Let’s look at just one year.
'
{
"$schema": "https://vega.github.io/schema/vega-lite/v5.json",
"height": 250, "width": 700,
"data": {
"url": "https://cdn.jsdelivr.net/npm/vega-datasets@2.8.0/data/gapminder.json"
},
"mark": "point",
"transform": [{"filter": "datum.year == 1955"}],
"encoding": {
"x": {"field": "fertility", "type": "quantitative"},
"y": {"field": "life_expect", "type": "quantitative"},
"size": {"field": "pop", "type": "Q"},
"fill": {"value": "maroon"},
"fillOpacity": {"value": 0.6}
}
}' |> as_vegaspec()
{
"params": [{
"name": "year",
"value": 1955,
"bind": {"input": "range", "min": 1955, "max": 2005, "step": 5},
}],
"transform": [{"filter": "datum.year == year"}],
...
}
'
{
"$schema": "https://vega.github.io/schema/vega-lite/v5.json",
"height": 250,
"width": 700,
"params": [
{
"name": "year",
"value": 1955,
"bind": {"input": "range", "min": 1955, "max": 2005, "step": 5}
}
],
"data": {
"url": "https://cdn.jsdelivr.net/npm/vega-datasets@2.8.0/data/gapminder.json"
},
"mark": "point",
"transform": [{"filter": "datum.year == year"}],
"encoding": {
"x": {"field": "fertility", "type": "quantitative"},
"y": {"field": "life_expect", "type": "quantitative"},
"size": {"field": "pop", "type": "Q"},
"fill": {"value": "maroon"},
"fillOpacity": {"value": 0.6}
}
}' |> as_vegaspec()
'
{
"$schema": "https://vega.github.io/schema/vega-lite/v5.json",
"height": 250,
"width": 700,
"params": [
{
"name": "year",
"value": 1955,
"bind": {"input": "range", "min": 1955, "max": 2005, "step": 5}
}
],
"data": {
"url": "https://cdn.jsdelivr.net/npm/vega-datasets@2.8.0/data/gapminder.json"
},
"mark": "point",
"transform": [{"filter": "datum.year == year"}],
"encoding": {
"x": {
"field": "fertility",
"type": "quantitative",
"scale": {"domain": [0, 9]}
},
"y": {
"field": "life_expect",
"type": "quantitative",
"scale": {"domain": [0, 100]}
},
"size": {"field": "pop", "type": "Q",
"scale": { "domain": [0, 1.5E9] }},
"fill": {"value": "maroon"},
"fillOpacity": {"value": 0.6}
}
}' |> as_vegaspec()
What component do we need to change?
Guess how that change might be coded.
Here’s how to modify the x-scale.
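A minimal sketch of that change: a "scale" with a fixed "domain" is added to the x channel. The bounds [0, 9] match the ones used in the full specification in this section, but they are a choice rather than a requirement. The rest of the spec is pared down to keep the example short.

```json
{
  "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
  "description": "Sketch: fixing the x-axis range with scale.domain; the bounds [0, 9] are a choice, not a requirement.",
  "data": {"url": "https://cdn.jsdelivr.net/npm/vega-datasets@2.8.0/data/gapminder.json"},
  "mark": "point",
  "encoding": {
    "x": {
      "field": "fertility",
      "type": "quantitative",
      "scale": {"domain": [0, 9]}
    },
    "y": {"field": "life_expect", "type": "quantitative"}
  }
}
```

The same pattern applies to the y and size channels. Fixing the domains keeps the axes from rescaling as the filtered year changes.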
Using a single year (or slider for year) …
Map cluster to the fill of the circles. (What "type" will you use? Options: “quantitative”, “temporal”, “ordinal”, “nominal”.)
Filter the data to look at just one country and then
Create a scatter plot that shows the country’s fertility and life expectancy each year.
Connect the dots so you can see the “trail” for this country. (We haven’t talked about layers yet, so you will have the trail only, no dots.)
Replace your filter with a selector widget that lets you interactively pick a country from a list of a few countries you are interested in. (Hint: {"input": "select", "options": [...]})
'
{
"$schema": "https://vega.github.io/schema/vega-lite/v5.json",
"height": 450,
"width": 700,
"params": [
{
"name": "country",
"value": "United States",
"bind": {"input": "select", "options": ["United States", "Canada", "Mexico", "China", "Nigeria", "Egypt", "South Korea"]}
}
],
"data": {
"url": "https://cdn.jsdelivr.net/npm/vega-datasets@2.8.0/data/gapminder.json"
},
"mark": "trail",
"transform": [{"filter": "datum.country == country"}],
"encoding": {
"x": {
"field": "fertility",
"type": "quantitative",
"scale": {"domain": [0,10]}
},
"y": {
"field": "life_expect",
"type": "quantitative",
"scale": {"domain": [30,100]}
},
"opacity": {"value": 0.6},
"size": {"field": "year"}
}
}' |> as_vegaspec()
'
{
"$schema": "https://vega.github.io/schema/vega-lite/v5.json",
"height": 450,
"width": 700,
"params": [
{
"name": "year",
"value": 1955,
"bind": {"input": "range", "min": 1955, "max": 2005, "step": 5}
}
],
"data": {
"url": "https://cdn.jsdelivr.net/npm/vega-datasets@2.8.0/data/gapminder.json"
},
"mark": "point",
"transform": [{"filter": "datum.year == year"}],
"encoding": {
"x": {
"field": "fertility",
"type": "quantitative",
"scale": {"domain": [0, 9]}
},
"y": {
"field": "life_expect",
"type": "quantitative",
"scale": {"domain": [0, 100]}
},
"size": {"field": "pop", "type": "Q", "scale": {"domain": [0, 1500000000]}},
"fill": {"field": "cluster", "type": "nominal"},
"stroke": {"field": "cluster", "type": "nominal"},
"fillOpacity": {"value": 0.6}
}
}' |> as_vegaspec()
Read Chapters 1 and 2 of Claus Wilke’s Fundamentals of Data Visualization.
Create at least two data graphics and submit the Vega-Lite specifications and some additional information using this form. For each plot,
Use one of the Vega data sets other than the Gapminder data. (You may use the same data or different data for different plots.)
Create a single view graphic that treats the data as glyph-ready and uses one of the following primitive mark types: area, bar, line, point, rect, text, trail.
You may use the filter transformation if you like, but you shouldn’t use any other data transformations.
Use a different mark type for each graphic (for some variety).
Map at least one property that is not positional.
Identify at least one place where you used something you read in Wilke’s book as you designed your graphic (or would have if you knew how). Do your best to be sure your plot would not be considered bad, ugly, or wrong by Wilke, at least to the extent that this is possible given what we know so far.
Bonus: Bind a slider or selector that lets you change some feature of the graphic.