Self-Assessment RAL R Courses

There are three classes at the Research Academy Leipzig: Introduction, Intermediate, and Advanced.

For people who have taken R classes at RAL before

If you took the previously offered “R Introduction” course and understood all of its content, I’d advise starting with the Intermediate course. If you took “R Extended” and understood all of its content, I’d advise starting with either Intermediate or Advanced. The first day of the Intermediate class repeats content from “R Extended”, but the second day covers new material.

For people who have not taken R classes at RAL before

If you have any questions, get in touch with RAL.

Quiz

Intro 2: Graphs

A. Recreate this plot:

mpg %>% ggplot(aes(hwy, displ)) + _____(method = lm, formula = 'y ~ x')

Solution:

mpg %>% ggplot(aes(hwy, displ)) + geom_smooth(method = lm, formula = 'y ~ x')

B. Recreate this plot:

diamonds %>% ____() + geom_bar(_____________)

Solution:

diamonds %>% ggplot() + geom_bar(aes(color, fill = cut))

Quiz

Intro 4: Quick Data Insights

Calculate the mean price for the different cuts in the diamonds data set

diamonds %>%
  ____(cut) %>%
  ____(mean_price = ____(___))

Solution:

diamonds %>%
  group_by(cut) %>%
  summarise(mean_price = mean(price))
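
To check the group_by() and summarise() pattern by hand, here is a minimal sketch that applies it to a small made-up table. The data and the names toy and toy_means are illustrative assumptions, not part of the course material.

```r
library(dplyr)

# Hypothetical toy data (not taken from the diamonds data set)
toy <- tibble(
  cut   = c("Fair", "Fair", "Ideal", "Ideal"),
  price = c(100, 300, 200, 600)
)

toy_means <- toy %>%
  group_by(cut) %>%                     # one group per distinct cut
  summarise(mean_price = mean(price))   # collapse each group to one row

toy_means
# Fair:  (100 + 300) / 2 = 200
# Ideal: (200 + 600) / 2 = 400
```

Because the toy table is small enough to average by hand, it is easy to confirm that each row of the result is the mean of one group.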

Intermediate 1: More Complex Transformations

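world_bank_pop contains the World Bank’s population data from 2000 to 2018. It has the following columns: country (three-letter country code), indicator (indicator code, such as SP.POP.TOTL), and one column per year.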

Restructure the data so that: 1. it only contains data for Germany; 2. the years are recorded in a single column called date; and 3. the indicators for the total urban population (SP.URB.TOTL) and the total population (SP.POP.TOTL) each have their own column. The resulting table should look like this:

Here is some code to get you started:

world_bank_pop %>%
  filter(country __ "DEU",
         indicator ____ c("SP.URB.TOTL", "SP.POP.TOTL")) %>%
  pivot_____(____("20"), names_to = "date") %>%
  pivot_____(names_from = indicator) %>%
  mutate(date = ymd(date, truncated = 2L))

Solution:

# Assumes the tidyverse and lubridate are loaded; ymd() comes from lubridate
world_bank_pop %>%
  filter(country == "DEU",
         indicator %in% c("SP.URB.TOTL", "SP.POP.TOTL")) %>%
  pivot_longer(contains("20"), names_to = "date") %>%
  pivot_wider(names_from = indicator) %>%
  mutate(date = ymd(date, truncated = 2L))
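
The two pivot steps are easier to follow on a tiny table. The sketch below uses a made-up table with the same layout as world_bank_pop. The country code "AAA", the numbers, and the names toy_pop, toy_long, and toy_wide are illustrative assumptions.

```r
library(tidyr)
library(tibble)

# Hypothetical toy table with the same shape as world_bank_pop
toy_pop <- tribble(
  ~country, ~indicator,    ~`2000`, ~`2001`,
  "AAA",    "SP.POP.TOTL", 100,     110,
  "AAA",    "SP.URB.TOTL", 40,      55
)

# Step 1: stack the year columns into one "date" column.
# The cell values land in a column called "value" (the default).
toy_long <- toy_pop %>%
  pivot_longer(contains("20"), names_to = "date")

# Step 2: spread the indicators into their own columns.
# values_from defaults to the "value" column created above.
toy_wide <- toy_long %>%
  pivot_wider(names_from = indicator)

toy_wide
```

After step 1 the table has one row per country, indicator, and year. After step 2 it has one row per country and year, with SP.POP.TOTL and SP.URB.TOTL side by side.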

Intermediate 2: Functions

You want to generalise what you did in the previous exercise so that it works for any country. Write a function that does this, then use it to generate a line plot of New Zealand’s (NZL) urban population percentage. Before you plot, please create a new variable called URB.PERC for this. The plot should look like this:

Code to get you started:

pop_extract_func <- function(___) {
  world_bank_pop %>%
    filter(country == mycountry,
           indicator %in% c("SP.URB.TOTL", "SP.POP.TOTL")) %>%
    pivot_longer(contains("20"), names_to = "date") %>%
    pivot_wider(names_from = indicator) %>%
    mutate(date = ymd(date, truncated = 2L))
}

pop_extract_func("NZL") %>%
  mutate(URB.PERC = _____) %>%
  ggplot() +
  geom_line(aes(____, ____))

Solution:

pop_extract_func <- function(mycountry) {
  world_bank_pop %>%
    filter(country == mycountry,
           indicator %in% c("SP.URB.TOTL", "SP.POP.TOTL")) %>%
    pivot_longer(contains("20"), names_to = "date") %>%
    pivot_wider(names_from = indicator) %>%
    mutate(date = ymd(date, truncated = 2L))
}

pop_extract_func("NZL") %>%
  mutate(URB.PERC = (SP.URB.TOTL / SP.POP.TOTL) * 100) %>%
  ggplot() +
  geom_line(aes(date, URB.PERC))