Alternatives to pie charts: coxcomb and waffle charts
Posted by R statistics for Political Science
Packages we will need:
library(tidyverse)
library(rnaturalearth)
library(countrycode)
library(peacesciencer)
library(ggthemes)
library(bbplot)
New post on R Functions and Packages for Political Science Analysis
First we can download the region data with information about the geography and income levels for each group, using the ne_countries() function from the rnaturalearth package.
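A minimal sketch of this step, assuming the small-scale map is sufficient and that the sf package is installed. The object name map_df is only illustrative, and the three columns kept here (name, un_a3, income_grp) are the ones this sketch assumes the later steps need.

```r
library(rnaturalearth)
library(dplyr)

# Download country polygons with their attributes (assumption: small scale is enough)
map_df <- ne_countries(scale = "small", returnclass = "sf") %>%
  sf::st_drop_geometry() %>%        # keep the attribute table, drop the shapes
  select(name, un_a3, income_grp)   # country name, UN code, income group

# Inspect which income categories the map data contains
count(map_df, income_grp)
```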
Next, we will download the National Material Capabilities (NMC) dataset. Its six variables, which attempt to operationalize a country's power, are military expenditure, military personnel, energy consumption, iron and steel production, urban population, and total population. The dataset serves as the basis for the most widely used indicator of national capability, CINC (Composite Index of National Capability), and covers the period 1816-2016.
To download them in one line of code, we use the create_stateyears() function from the peacesciencer package and pipe the result into add_nmc().
states <- create_stateyears(mry = FALSE) %>% add_nmc()
Next we add a UN location code so we can easily merge both datasets we downloaded!
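One way this step could look, shown on toy data rather than the real downloads. The tables toy_states and toy_map, the column name un_code, and the energy values are all made up for illustration; the sketch assumes countrycode() is used to translate country names into UN M49 numeric codes before joining.

```r
library(dplyr)
library(countrycode)

# Toy stand-ins for the two downloaded datasets (values are invented)
toy_states <- tibble(statenme = c("France", "Brazil", "Japan"),
                     year = 2016,
                     pec = c(10, 20, 30))
toy_map <- tibble(un_code = c(250, 76, 392),
                  income_grp = c("1. High income: OECD",
                                 "3. Upper middle income",
                                 "1. High income: OECD"))

# Translate country names into UN M49 codes, then merge on that code
toy_states_df <- toy_states %>%
  mutate(un_code = countrycode(statenme,
                               origin = "country.name",
                               destination = "un")) %>%
  left_join(toy_map, by = "un_code")
```

In the real data the UN code column in the map may be stored as text, so it may need converting to a number before the join.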
First, we will create one high income group. The map dataset has separate categories for OECD and non-OECD high income countries. We merge them with the ifelse() function within mutate().
Next we filter out any country whose income group is NA, just to keep the dataset cleaner.
We then group the dataset according to income group and sum all the primary energy consumption in each income group since 1900.
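These three steps can be sketched on a small made-up table. The data frame toy_df and its values are invented, and the sketch uses %in% in place of a nested ifelse(), which is an equivalent shortcut rather than the post's own wording.

```r
library(dplyr)

# Invented rows: two high income categories, one low income group, one NA
toy_df <- tibble(
  income_grp = c("1. High income: OECD", "2. High income: nonOECD",
                 "5. Low income", "5. Low income", NA),
  year = c(1950, 1950, 1950, 1890, 1950),
  pec  = c(100, 50, 10, 5, 7)
)

toy_summary <- toy_df %>%
  # Collapse both high income categories into a single label
  mutate(income_grp = ifelse(income_grp %in% c("1. High income: OECD",
                                               "2. High income: nonOECD"),
                             "1. High income", income_grp)) %>%
  filter(!is.na(income_grp)) %>%   # drop rows without an income group
  filter(year > 1899) %>%          # keep 1900 onwards
  group_by(income_grp) %>%
  summarise(sum_pec = sum(pec, na.rm = TRUE))
```

Here the 1890 row and the NA row are dropped, leaving one total per income group.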
When we get to the ggplotting, we want to order the income groups from biggest to smallest. To do this, we use the reorder() function, which sorts the levels of its first argument by the values in its second.
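A quick illustration of how reorder() behaves, using invented group names and values. The first argument is the variable whose levels get reordered and the second supplies the numbers to sort by; negating the second argument flips the order.

```r
# Made-up groups and totals
toy <- data.frame(income_grp = c("A", "B", "C"),
                  sum_pec = c(5, 20, 10))

# Ascending: levels sorted from the smallest total to the largest
asc_levels <- levels(reorder(toy$income_grp, toy$sum_pec))

# Descending: negate the sorting values
desc_levels <- levels(reorder(toy$income_grp, -toy$sum_pec))

asc_levels   # "A" "C" "B"
desc_levels  # "B" "C" "A"
```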
To create the coxcomb chart, a pie chart variant in which every wedge has the same angle and the radius shows the value, we need the geom_bar() and coord_polar() lines.
The coord_polar() function takes the following arguments:
theta - the variable we map the angle to (either x or y)
start - indicates the starting point from 12 o'clock in radians
direction - whether we plot the data clockwise (1) or anticlockwise (-1)
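We feed in a theta of "x", a starting point of 0 and a direction of -1. Because x holds one discrete entry per income group, every wedge gets the same angle and the bar height (the primary energy consumption total) becomes the radius.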
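To see what the theta argument changes, here is a small sketch on invented data. The names toy, toy_plot and pie_plot are illustrative only. The first plot maps the angle to x, as described above; the second maps it to y on a single stacked bar, which is a common way to get a conventional pie chart.

```r
library(ggplot2)

# Made-up groups and values
toy <- data.frame(group = c("A", "B", "C"),
                  value = c(5, 20, 10))

# theta = "x": equal-angle wedges whose radius follows the value (coxcomb)
toy_plot <- ggplot(toy, aes(x = group, y = value, fill = group)) +
  geom_bar(stat = "identity") +
  coord_polar(theta = "x", start = 0, direction = -1)

# theta = "y" on one stacked bar: wedge angle follows the value (classic pie)
pie_plot <- ggplot(toy, aes(x = "", y = value, fill = group)) +
  geom_bar(stat = "identity") +
  coord_polar(theta = "y", start = 0, direction = -1)
```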
Next we add nicer colours with hex values and label the legend in the scale_fill_manual() function.
I like using the fonts and size stylings in the bbc_style() theme.
Then we can delete some of the ticks and text from the plot to make it cleaner.
Last, we add our title and source!
states_df %>%
  # Collapse the two high income categories into one
  mutate(income_grp = ifelse(income_grp == "1. High income: OECD", "1. High income",
                      ifelse(income_grp == "2. High income: nonOECD", "1. High income", income_grp))) %>%
  filter(!is.na(income_grp)) %>%
  filter(year > 1899) %>%
  group_by(income_grp) %>%
  summarise(sum_pec = sum(pec, na.rm = TRUE)) %>%
  # reorder() takes the groups first and the values to sort by second
  ggplot(aes(x = reorder(income_grp, sum_pec), y = sum_pec, fill = as.factor(income_grp))) +
  geom_bar(stat = "identity") +
  coord_polar("x", start = 0, direction = -1) +
  ggthemes::theme_pander() +
  scale_fill_manual(
    values = c("#f94144", "#f9c74f", "#43aa8b", "#277da1"),
    labels = c("High Income", "Upper Middle Income", "Lower Middle Income", "Low Income"),
    name = "Income Level") +
  bbplot::bbc_style() +
  # Strip axis text, titles, ticks and grid lines
  theme(axis.text = element_blank(),
        axis.title.x = element_blank(),
        axis.title.y = element_blank(),
        axis.ticks = element_blank(),
        panel.grid = element_blank()) +
  ggtitle(label = "Primary Energy Consumption across income levels since 1900",
          subtitle = "Source: Correlates of War CINC")
We can compare this to the number of countries in each income group:
states_df %>%
  # Collapse the two high income categories into one
  mutate(income_grp = ifelse(income_grp == "1. High income: OECD", "1. High income",
                      ifelse(income_grp == "2. High income: nonOECD", "1. High income", income_grp))) %>%
  filter(!is.na(income_grp)) %>%
  filter(year == 2016) %>%
  count(income_grp) %>%
  # reorder() takes the groups first and the values to sort by second
  ggplot(aes(reorder(income_grp, n), n, fill = as.factor(income_grp))) +
  geom_bar(stat = "identity") +
  coord_polar("x", start = 0, direction = -1) +
  ggthemes::theme_pander() +
  scale_fill_manual(
    values = c("#f94144", "#f9c74f", "#43aa8b", "#277da1"),
    labels = c("High Income", "Upper Middle Income", "Lower Middle Income", "Low Income"),
    name = "Income Level") +
  bbplot::bbc_style() +
  # Strip axis text, titles, ticks and grid lines
  theme(axis.text = element_blank(),
        axis.title.x = element_blank(),
        axis.title.y = element_blank(),
        axis.ticks = element_blank(),
        panel.grid = element_blank()) +
  ggtitle(label = "Number of countries per income group")
Another variation is the waffle plot!
It is important that we install the development version from GitHub rather than the CRAN version. I made the mistake of installing the CRAN version and nothing worked.
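A hedged sketch of what this could look like. It assumes the package in question is waffle, installed from the hrbrmstr/waffle GitHub repository, and that the plot is drawn with its geom_waffle() layer; neither is confirmed by the text above. The counts in toy_counts are invented. The install lines are left commented out, and the plot is only built if geom_waffle() is actually available.

```r
# install.packages("remotes")
# remotes::install_github("hrbrmstr/waffle")   # assumed source of the development version
library(ggplot2)

# Made-up number of countries per income group
toy_counts <- data.frame(
  income_grp = c("High Income", "Upper Middle Income",
                 "Lower Middle Income", "Low Income"),
  n = c(12, 9, 8, 5)
)

# Only draw the plot when the installed waffle version exports geom_waffle()
if (requireNamespace("waffle", quietly = TRUE) &&
    "geom_waffle" %in% getNamespaceExports("waffle")) {
  waffle_plot <- ggplot(toy_counts, aes(fill = income_grp, values = n)) +
    waffle::geom_waffle(n_rows = 5, size = 0.5, colour = "white") +
    scale_fill_manual(values = c("#f94144", "#f9c74f", "#43aa8b", "#277da1"),
                      name = "Income Level") +
    coord_equal() +
    theme_void()
}
```

Each square stands for one country, so the grid shows counts directly instead of angles.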