Data Visualization – Use R Code to create 2 Maps


Assignment 4, Part 1
Political Science 3780
Due: Friday, June 8 at 11:59 p.m.
in Assignment 4.1 Canvas dropbox
Submitting the Assignment
For this assignment, in addition to your R code, you will also submit two maps,
one for 2004 and one for 2014, to the Assignment 4.1 dropbox on Canvas. These
two plots should be submitted as separate files (.jpg or .pdf). An assignment
that includes only R code, but not separate files with
the final plots, will not receive any credit.
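As a reminder of how a plot ends up in its own file, here is a minimal sketch using base R's pdf() device. The file name and the placeholder plot are made up for illustration; your own code would draw the finished map between pdf() and dev.off().

```r
# Minimal sketch: write a single plot to a PDF file.
# "map_2004.pdf" is a placeholder name; tempdir() is used only so the
# example runs anywhere. Swap in your own folder when you do this for real.
pdf_path <- file.path(tempdir(), "map_2004.pdf")

pdf(pdf_path, width = 10, height = 7)   # width and height are in inches
plot(1:10, main = "Placeholder for the 2004 map")
dev.off()                               # closing the device writes the file

file.exists(pdf_path)
```

The jpeg() device works the same way, except that its width and height are in pixels by default. Forgetting dev.off() is the usual reason a saved file turns out empty or unreadable.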
Setup
Open up a new Editor window in R. As comments (that is, as
lines beginning with #), put your name, student number, and
section on the first three lines, so:
# Jane Doe
# doe.3
# 9:10 section
As usual, you will put your commands and any comments or
answers in this file. Save it often as you work on this exercise,
and when you’ve completed it just submit the final version as
your assignment.
Dreamland
Download the file labelled “CDCdata.csv” from Canvas. This
dataset contains county-level estimates of drug poisoning mortality
over time. Read the data into R. Matching on the fips
variable, merge it with the county name data from the maps
library in R. Then create two county-level maps of drug poisoning
mortality in the United States, one for 2004 and the other
for 2014. Be sure to add a legend to each map explaining what
the colors mean. Some hints:
1. In general, this assignment follows Lecture 14 fairly closely.
2. R sometimes reads data in as factors rather than text (before
R 4.0.0 this was the default for read.csv()). Factors
are vectors of integer values with corresponding sets of
character values to use when the factor is displayed. They
are also incredibly confusing because they look like text but
they don’t act like text. This dataset has variables that may
be read in as factors unless you use the stringsAsFactors=FALSE
argument in your read.csv() command.
I highly recommend doing so.
3. The variable of interest has the annoyingly long name
“Estimated Age-adjusted Death Rate, 11 Categories (in ranges)”.
It also, as the name implies,
contains ranges (0–2, 2.1–4, etc.) rather than actual numbers.
Keeping in mind that you’ll want to plot colors later,
you’ll want to create a new variable in the dataset that takes
a value of 1 when age-adjusted death rate is 0–2, 2 when
it’s 2.1–4, and so on.
4. The death rates are measured as deaths per 100,000 population.
5. You’ll need to merge the data with the county.fips dataset
that’s built into the maps library in order to get the map()
command to plot the data correctly. Lecture 14 covered
this process in depth.
6. You’ll need to specify a color scheme for your map, and
very few of the palettes in RColorBrewer can handle 11
colors, which is what you’ll need if you want to represent all
of the categories in the data. You probably want your color
scheme to be a gradient from a lighter color to a darker
color. The best way to do this is to pick a lighter color
and a darker color from an online color-to-hex converter
and then use colorRampPalette() to generate the
gradient from one to the other.
7. Because there are 11 categories, legends can get awkward.
You might want to use the ncol= argument in the
legend() command to break the legend up into multiple
columns.
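The data preparation in hints 2, 3, and 5 can be sketched roughly as follows. This is an illustration under assumptions, not the Lecture 14 code: the toy data frame stands in for CDCdata.csv, the column names year and rate_range are made up, and only three of the eleven range labels are shown (with plain hyphens, which may not match the file). Note also that read.csv() by default converts spaces and punctuation in column names to dots, so check names() on your data to see what the long variable is actually called.

```r
library(maps)        # provides the county.fips lookup table
data(county.fips)    # columns: fips and polyname ("state,county")

# Toy stand-in for the real data. In the assignment you would instead run
#   cdc <- read.csv("CDCdata.csv", stringsAsFactors = FALSE)
# The column names year and rate_range are assumptions for this sketch.
cdc <- data.frame(
  fips       = c(1001L, 1003L, 1005L, 1001L, 1003L, 1005L),
  year       = c(2004L, 2004L, 2004L, 2014L, 2014L, 2014L),
  rate_range = c("0-2", "2.1-4", "4.1-6", "2.1-4", "4.1-6", "4.1-6"),
  stringsAsFactors = FALSE
)

# Hint 3: turn each range label into its position in an ordered list.
# Only three labels are listed here (assumed spellings); the real
# variable has 11, so the vector needs to be extended accordingly.
rate_levels <- c("0-2", "2.1-4", "4.1-6")
cdc$rate_cat <- match(cdc$rate_range, rate_levels)

# Hint 5: keep one year, then merge with county.fips on fips.
# all.x = TRUE keeps every county polygon, even those without data.
cdc2004    <- cdc[cdc$year == 2004, ]
merged2004 <- merge(county.fips, cdc2004, by = "fips", all.x = TRUE)

head(merged2004[!is.na(merged2004$rate_cat), ])
```

Using match() against an ordered vector of labels avoids writing eleven separate ifelse() statements, and it returns NA for any label that is misspelled, which makes mistakes easy to spot.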
The final product for this assignment will be two maps of the
United States, one for drug poisoning mortality in 2004 and one
for drug poisoning mortality in 2014. Each map should color
each county by drug poisoning mortality rate for the relevant
year and should also include a legend linking the colors you use
to their corresponding death rate range.
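To show how the color gradient, the map, and the legend fit together, here is a rough sketch that uses randomly generated categories rather than the CDC data. The two hex codes, the toy data frame, and the legend labels are all placeholders chosen for illustration; the matching step follows the county example in the maps package documentation and may differ from what was shown in lecture.

```r
library(maps)
data(county.fips)

# An 11-step gradient from a light color to a dark one.
# The two hex codes are arbitrary picks, not required values.
n_cat <- 11
pal <- colorRampPalette(c("#FFF5EB", "#7F2704"))(n_cat)

# Hypothetical input: one category (1 to 11) per county FIPS code.
# Random values stand in for the recoded mortality data.
set.seed(1)
fips_codes <- unique(county.fips$fips)
toy <- data.frame(
  fips     = fips_codes,
  rate_cat = sample(n_cat, length(fips_codes), replace = TRUE)
)

# map() fills polygons in a fixed order, so look up each polygon's
# FIPS code, and then its category, in that same order.
poly_names <- map("county", plot = FALSE)$names
poly_fips  <- county.fips$fips[match(poly_names, county.fips$polyname)]
poly_cat   <- toy$rate_cat[match(poly_fips, toy$fips)]

# Draw filled counties, overlay state borders, then add the legend.
map("county", col = pal[poly_cat], fill = TRUE, lty = 0, resolution = 0)
map("state", add = TRUE, col = "white", lwd = 0.5)
title("Toy data, not real mortality rates")
legend("bottomright",
       legend = paste("Category", 1:n_cat),  # replace with the range labels
       fill = pal, ncol = 3, cex = 0.7, bty = "n")
```

Counties whose polygon name has no match end up with an NA color and are simply left unfilled, which is a quick visual check that the merge worked.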
