Final Portfolio
Introduction
Your final task for the semester is to create a small portfolio of your work. You will have an opportunity to revise some of your previous work and to create some new graphics.
Due Date: Thursday, March 13 at noon.
Some general instructions about the portfolio website
Be sure to include all your code.
Feel free to use
#| code-fold: true
for some (or all) chunks. This will hide your code until the user clicks to open it.
Any data sets you use should be available via a URL or in a standard R package so I have access to the data. You can create a
data
directory and put CSV, JSON, or Excel data sets there if they are not already available via a URL elsewhere. Then use [text](url)
to include a link to the data in your document.
Also add
code-tools: true
to your YAML header. This will let me see the source document if I need to.
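If you are working in Python, the code-folding and data-link instructions above might look roughly like the cell body below. This is only a sketch: the helper names (rows_to_csv_text, markdown_link), the file name data/example.csv, and the two placeholder rows are all invented for illustration and are not part of the assignment.

```python
#| code-fold: true
# The line above is a Quarto cell option; to Python it is an ordinary comment.
import csv
import io


def rows_to_csv_text(rows, fieldnames):
    """Serialize a list of dicts to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def markdown_link(text, url):
    """Build a Markdown link of the form [text](url)."""
    return f"[{text}]({url})"


# Placeholder rows, invented purely for illustration (not real data).
example_rows = [
    {"label": "a", "value": 1},
    {"label": "b", "value": 2},
]
csv_text = rows_to_csv_text(example_rows, ["label", "value"])

# Hypothetical relative path to a file inside the data directory.
data_link = markdown_link("example data (CSV)", "data/example.csv")

print(csv_text)
print(data_link)
```

Saving csv_text to a file in the data directory and pasting the printed link into your document gives me a way to get to the same data you used.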
One of the goals for this project is to learn how to learn more. Another is to use a variety of graphical elements. So you may want to read Exercise 5 and Exercise 6 before starting the rest of the assignment.
Components of the portfolio
Exercise 1 (HW 4 revision) We’ve learned a lot about graphics since HW 4, so here’s your chance to improve upon what you did in HW 4 (and in our subsequent in-class discussions). We are going to focus on the genetics kit data. See HW 4 for a reminder about the data and the task you had to do then.
Here is a reminder of what genetic share means: 23andMe calls this “ancestry composition” and describes it like this:
Your Ancestry Composition report shows the percentage of your DNA that comes from each of 47 populations. We calculate your Ancestry Composition by comparing your genome to those of over 14,000 people with known ancestry. When a segment of your DNA closely matches the DNA from one of the 47 populations, we assign that ancestry to the corresponding segment of your DNA. We calculate the ancestry for individual segments of your genome separately, then add them together to compute your overall ancestry composition.
Scroll through the HW 4 Gallery to see the plots we created at that time. Find an example that has something you like about it and explain what you like.
Find an example that has something you don’t like, and explain what you don’t like about it.
Now create two graphics, one that helps compare the kits and one that helps compare twins.
- Give your graphics good titles.
- Use other principles of good visualization.
- Do not restrict your plot to just a small subset of the data.
For each graphic, include a paragraph that tells the story of your graphic.
Exercise 2 (Data and graphics challenge) Complete one of the data and graphics challenges that we didn’t get to in class.
Exercise 3 (A new challenge) Challenge is probably the wrong word because the data set is quite small and fairly simple. But we want to be able to make good graphics for simple data too!
The data come from the Demographic and Health Surveys for Tanzania. You can obtain data like these (with many more items, and down to individual and household level detail) for many countries and years at dhsprogram.com.
Here is a very small summary of some data from Tanzania:
Enter the data into Excel, a CSV, or JSON file.
You will have some decisions to make about things like variable names, etc. Be sure to include a link to the data set you create on your portfolio website. (See instructions at the top of the page.)
Notice that some of the surveys were conducted all in one year and some spanned two calendar years. How will you deal with that?
Use these data to create a visualization that tells a story.
Write a few sentences explaining the story told.
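One possible way to handle surveys that span two calendar years is to record a start year and an end year for each survey and derive a single plotting value from them. The sketch below assumes labels written like "2004" or "2004-05"; that label format, the function name parse_survey_label, and the column names are assumptions made for illustration, so adapt them to however you enter the data.

```python
import re

# Accepts "2004", "2004-05", "2004/05", or "2004-2005" (assumed label formats).
LABEL_PATTERN = re.compile(r"\s*(\d{4})(?:\s*[-/–]\s*(\d{4}|\d{2}))?\s*")


def parse_survey_label(label):
    """Split a survey label into start year, end year, and their average."""
    match = LABEL_PATTERN.fullmatch(label)
    if match is None:
        raise ValueError(f"Unrecognized survey label: {label!r}")

    start = int(match.group(1))
    end_text = match.group(2)
    if end_text is None:
        end = start                      # survey conducted within one year
    elif len(end_text) == 4:
        end = int(end_text)
    else:
        # Two-digit end year: borrow the century from the start year and
        # roll over when the label crosses a century (e.g. "1999-00").
        end = (start // 100) * 100 + int(end_text)
        if end < start:
            end += 100

    if end < start:
        raise ValueError(f"End year precedes start year in {label!r}")

    return {
        "survey": label.strip(),
        "start_year": start,
        "end_year": end,
        "mid_year": (start + end) / 2,   # average of the two year labels
    }


# Hypothetical labels, used only to show the output format.
for example_label in ["2004-05", "2010"]:
    print(parse_survey_label(example_label))
```

Averaging the two years is only one option. You could also treat the survey label as an ordered category, or keep both start_year and end_year and draw each survey as a span. Whichever you choose, say so in your write-up.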
Exercise 4 (Your masterpiece) OK. It doesn’t have to compete with the graphics cited by Tufte (2001), but this is your chance to impress. It can also be a chance to try some things you’ve been wanting to try or to include some elements that you need for Exercise 5.
Using a data set of your choosing, create a graphic that demonstrates your abilities to design and create a graphic that tells a compelling story.
Be sure to pick a data set that is rich enough to make the graphics task interesting. I recommend that you pick data related to something you are interested in.
Explain the choices you made when designing your graphic and relate them to principles of good graphics that we have learned or seen in this class. Mention alternatives to your graphic that you considered but did not opt to submit. (You don’t have to include your alternatives, but you may if that makes it easier to explain.)
Note: You may use examples you find online for inspiration and coding suggestions, but your graphic should not be a direct copy of an existing example.
Exercise 5 (Using your palette) The grammar of graphics gives us a palette of graphical elements with which to “paint” our graphic. The palette includes various marks, channels, composition, etc. One of the goals for your portfolio is that you demonstrate the ability to use a variety of these features and use them effectively. Look over your graphics, and identify a place where you used
- an encoding channel other than x or y
- layers
- facets
- concatenation or repeat
- non-default settings for a channel’s scale or guide
- tooltips
- another kind of interaction (panning/zooming, brushing, sliders, etc.)
Note: The intention here is that you use each of these at least once in your portfolio to demonstrate your ability to use a wide range of features from the grammar of graphics. Keep that in mind as you go through the exercises.
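To keep track of the list above as you work, you could audit the Vega-Lite JSON behind each graphic. The sketch below writes a small Vega-Lite specification as a plain Python dictionary (in practice you would build it with vegabrite or Altair) and checks which items on the list it uses. The function names, the checklist labels, and the placeholder data are all invented for illustration, and the detection rules are rough heuristics, not an official validator.

```python
import json

CHECKLIST = ("extra channel", "layers", "facets", "concatenation or repeat",
             "custom scale or guide", "tooltips", "other interaction")
POSITION_CHANNELS = {"x", "y", "x2", "y2"}
FACET_CHANNELS = {"row", "column", "facet"}
COMPOSITION_KEYS = {"concat", "hconcat", "vconcat", "repeat"}
GUIDE_KEYS = {"scale", "axis", "legend"}


def walk(node):
    """Yield every dict nested in a spec, skipping the embedded data."""
    if isinstance(node, dict):
        yield node
        for key, value in node.items():
            if key not in ("data", "datasets"):
                yield from walk(value)
    elif isinstance(node, list):
        for item in node:
            yield from walk(item)


def features_used(spec):
    """Return the checklist items that appear somewhere in the spec."""
    found = set()
    for node in walk(spec):
        if "layer" in node:
            found.add("layers")
        if "facet" in node:
            found.add("facets")
        if COMPOSITION_KEYS.intersection(node):
            found.add("concatenation or repeat")
        params = node.get("params")
        if isinstance(params, list):
            for param in params:
                if isinstance(param, dict) and ("select" in param or "bind" in param):
                    found.add("other interaction")
        mark = node.get("mark")
        if isinstance(mark, dict) and mark.get("tooltip"):
            found.add("tooltips")
        encoding = node.get("encoding")
        if isinstance(encoding, dict):
            for channel, definition in encoding.items():
                if channel == "tooltip":
                    if definition:
                        found.add("tooltips")
                elif channel in FACET_CHANNELS:
                    found.add("facets")
                elif channel not in POSITION_CHANNELS:
                    found.add("extra channel")
                if isinstance(definition, dict) and GUIDE_KEYS.intersection(definition):
                    found.add("custom scale or guide")
    return found


# Placeholder data and field names, invented for illustration only.
example_spec = {
    "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
    "data": {"values": [{"group": "A", "x": 1, "y": 2},
                        {"group": "B", "x": 2, "y": 3}]},
    "layer": [
        {   # points: color channel, tooltip, non-default x scale, pan/zoom
            "params": [{"name": "pan_zoom", "select": "interval", "bind": "scales"}],
            "mark": "point",
            "encoding": {
                "x": {"field": "x", "type": "quantitative", "scale": {"zero": False}},
                "y": {"field": "y", "type": "quantitative"},
                "color": {"field": "group", "type": "nominal"},
                "tooltip": [{"field": "group", "type": "nominal"},
                            {"field": "y", "type": "quantitative"}],
            },
        },
        {   # a second layer drawn over the same axes
            "mark": "line",
            "encoding": {
                "x": {"field": "x", "type": "quantitative"},
                "y": {"field": "y", "type": "quantitative"},
                "color": {"field": "group", "type": "nominal"},
            },
        },
    ],
}

spec_json = json.dumps(example_spec, indent=2)  # the JSON Vega-Lite would read
used = features_used(example_spec)
missing = [item for item in CHECKLIST if item not in used]
print("Still to demonstrate:", missing)
```

For this example the audit reports that facets and concatenation or repeat are still missing, so those would need to show up in another graphic in the portfolio.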
Exercise 6 (Keep learning) It isn’t possible to learn everything about data visualization in such a short course, so you will need to keep learning.
Cite 2 or 3 specific examples in the graphics in your portfolio where you used a feature of Vega-Lite/vegabrite/Altair/altair that we did not learn in class. This might be using a new kind of mark or transform, or a way to customize a feature of the graphic, or a way to use interaction, or…
Here are some resources for finding/exploring new features:
vegabrite website. The Reference and Example Gallery sections are very helpful. You may also find the Design section interesting/helpful.
Vega-Lite documentation and Example gallery. You can usually convert things over to vegabrite or Altair/altair pretty easily once you see how things work in the native Vega-Lite.
The Altair documentation. The User Guide and Examples sections are very helpful.
The Vega-Lite API Examples might be a source of inspiration. This uses the JavaScript API for Vega-Lite.
The Visualization Curriculum developed at the University of Washington by Jeffrey Heer, Dominik Moritz, Jake VanderPlas, and Brock Craft (using Python/Altair).
There are other sites out there that could serve as inspiration as well. If you find a good one, be sure to let me know about it.
You will also want to keep learning about principles of good graphics design. Cite 2 or 3 specific examples where you followed the advice in one of the resources below. For the Knaflic book, provide specific page numbers. For the Wilke book, provide at least section numbers. (Note: each section in that book has a URL which you could include as a link.)
Wilke (2019) contains lots of good information about creating good graphics. The chapter titles make it easy to find advice on particular issues. This will let you scan through chapters that address issues related to the graphics in your portfolio.
Knaflic (2015) is the book that preceded Knaflic (2020). It has more details about graphics principles and fewer exercises.