The following URL: https://rud.is/210/chicago.csv contains the (surprisingly relatively current) annual salaries of employees in the City of Chicago. This is public information. I did not “hack” it.
You will/may find the following functions of help:
From stringr
:
str_replace()
str_to_title()
From the tidyverse
:
read_csv()
mutate()
separate()
group_by()
select()
arrange()
desc()
count()
summarise()
filter()
slice()
ggplot()
geom_boxplot()
From base/“built-in” R:
as.numeric()
median()
min()
max()
head()
tail()
It’s a decent CSV file, but it needs some work. Here’s what you need to do:
employee_annual_salary
column needs to be turned into a numeric valuename
column should be split into last_name
and first_mid
columnsHere’s what you need to answer:
The blocks below have the answers shown (so you can check your work) but the code is hidden. Try to solve the problems independently or jointly before peeking (or ask me for help).
library(stringr)
library(tidyverse)
chicago <- read_csv("https://rud.is/210/chicago.csv")
glimpse(chicago)
## Observations: 32,062
## Variables: 4
## $ name <chr> "AARON, ELVIA J", "AARON, JEFFERY M", "AARON, KARINA", "AARON...
## $ position_title <chr> "WATER RATE TAKER", "POLICE OFFICER", "POLICE OFFICER", "CHIEF C...
## $ department <chr> "WATER MGMNT", "POLICE", "POLICE", "GENERAL SERVICES", "WATER MG...
## $ employee_annual_salary <chr> "$90744.00", "$84450.00", "$84450.00", "$89880.00", "$106836.00"...
chicago
## # A tibble: 32,062 × 4
## name position_title department employee_annual_salary
## <chr> <chr> <chr> <chr>
## 1 AARON, ELVIA J WATER RATE TAKER WATER MGMNT $90744.00
## 2 AARON, JEFFERY M POLICE OFFICER POLICE $84450.00
## 3 AARON, KARINA POLICE OFFICER POLICE $84450.00
## 4 AARON, KIMBERLEI R CHIEF CONTRACT EXPEDITER GENERAL SERVICES $89880.00
## 5 ABAD JR, VICENTE M CIVIL ENGINEER IV WATER MGMNT $106836.00
## 6 ABARCA, ANABEL ASST TO THE ALDERMAN CITY COUNCIL $70764.00
## 7 ABARCA, EMMANUEL GENERAL LABORER - DSS STREETS & SAN $41849.60
## 8 ABASCAL, REECE E TRAFFIC CONTROL AIDE-HOURLY OEMC $20051.20
## 9 ABBASI, CHRISTOPHER STAFF ASST TO THE ALDERMAN CITY COUNCIL $49452.00
## 10 ABBATACOLA, ROBERT J ELECTRICAL MECHANIC AVIATION $93600.00
## # ... with 32,052 more rows
mutate(chicago,
employee_annual_salary = str_replace(employee_annual_salary, "\\$", ""),
employee_annual_salary = as.numeric(employee_annual_salary),
name = str_to_title(name),
position_title = str_to_title(position_title),
department = str_to_title(department)) %>%
separate(name, into = c("last", "first_mid"), ",") -> chicago
scales::dollar(median(chicago$employee_annual_salary))
## [1] "$84,450"
group_by(chicago, department) %>%
summarise(median_salary = median(employee_annual_salary)) %>%
mutate(median_salary = scales::dollar(median_salary)) %>%
knitr::kable(align="lr")
department | median_salary |
---|---|
Admin Hearng | $68,028.00 |
Animal Contrl | $56,928.00 |
Aviation | $72,862.40 |
Board Of Election | $48,036.00 |
Board Of Ethics | $81,948.00 |
Budget & Mgmt | $89,340.00 |
Buildings | $97,920.00 |
Business Affairs | $75,960.00 |
City Clerk | $62,004.00 |
City Council | $52,950.00 |
Community Development | $85,764.00 |
Cultural Affairs | $84,168.00 |
Disabilities | $82,044.00 |
Doit | $96,066.00 |
Family & Support | $13,520.00 |
Finance | $68,028.00 |
Fire | $91,362.00 |
General Services | $93,600.00 |
Health | $81,948.00 |
Human Relations | $89,676.00 |
Human Resources | $73,170.00 |
Inspector Gen | $75,036.00 |
Ipra | $89,880.00 |
Law | $71,292.00 |
License Appl Comm | $71,292.00 |
Mayor’s Office | $74,250.00 |
Oemc | $20,051.20 |
Police | $87,384.00 |
Police Board | $79,974.00 |
Procurement | $75,960.00 |
Public Library | $57,696.00 |
Streets & San | $73,840.00 |
Transportn | $81,536.00 |
Treasurer | $88,626.00 |
Water Mgmnt | $82,576.00 |
filter(chicago, employee_annual_salary == max(employee_annual_salary)) %>%
mutate(employee_annual_salary = scales::dollar(employee_annual_salary)) %>%
knitr::kable(align="llllr")
last | first_mid | position_title | department | employee_annual_salary |
---|---|---|---|---|
Evans | Ginger S | Commissioner Of Aviation | Aviation | $300,000 |
group_by(chicago, department) %>%
filter(employee_annual_salary == max(employee_annual_salary)) %>%
select(last, first_mid, department, employee_annual_salary) %>%
arrange(desc(employee_annual_salary)) %>%
mutate(employee_annual_salary = scales::dollar(employee_annual_salary)) %>%
knitr::kable(align="lllr")
last | first_mid | department | employee_annual_salary |
---|---|---|---|
Evans | Ginger S | Aviation | $300,000 |
Emanuel | Rahm | Mayor’s Office | $216,210 |
Santiago | Jose A | Fire | $202,728 |
Escalante | John J | Police | $197,724 |
Morita | Julia Y | Health | $177,000 |
Reifman | David | Community Development | $175,020 |
Morrison Butler | Lisa V | Family & Support | $175,002 |
Patton | Stephen R | Law | $173,664 |
Holt | Alexandra D | Budget & Mgmt | $169,992 |
Powers | Thomas H | Water Mgmnt | $169,512 |
Scheinfeld | Rebekah C | Transportn | $169,500 |
Schenkel | Gary W | Oemc | $167,796 |
Rhee | Jamie L | Procurement | $167,220 |
Bannon | Brian A | Public Library | $167,004 |
Widawsky | Daniel J | Finance | $165,000 |
Fairley | Sharon R | Ipra | $161,856 |
Ferguson | Joseph M | Inspector Gen | $161,856 |
Kaiden | Marla M | City Council | $160,248 |
Frydland | Judith | Buildings | $157,092 |
Guerra | Maria | Business Affairs | $157,092 |
Reynolds | David J | General Services | $157,092 |
Williams | Charles L | Streets & San | $157,092 |
Jackowiak | Patricia | Admin Hearng | $156,420 |
Boone | Michelle T | Cultural Affairs | $155,040 |
Berman | Brenna M | Doit | $154,992 |
Choi | Soo | Human Resources | $151,572 |
Lafleur | Kristi L | Treasurer | $150,000 |
Tamley | Karen M | Disabilities | $138,420 |
Mendoza | Susana A | City Clerk | $133,545 |
Berlin | Steven | Board Of Ethics | $131,688 |
Bateman | Kelly Anne | Board Of Election | $129,840 |
Noriega | Mona | Human Relations | $125,004 |
Yamaji | Peter S | Animal Contrl | $123,144 |
Caproni | Max A | Police Board | $97,728 |
Guzman Flores | Michelle | License Appl Comm | $71,292 |
group_by(chicago, department) %>%
summarise(total_salary_budget = sum(employee_annual_salary)) %>%
arrange(desc(total_salary_budget)) %>%
mutate(total_salary_budget = scales::dollar(total_salary_budget)) %>%
knitr::kable(align="lr")
department | total_salary_budget |
---|---|
Police | $1,092,849,505 |
Fire | $458,980,209 |
Water Mgmnt | $157,442,278 |
Streets & San | $148,340,585 |
Aviation | $111,975,153 |
Transportn | $91,304,948 |
Oemc | $81,217,800 |
General Services | $77,608,339 |
Public Library | $54,108,881 |
Health | $44,006,571 |
Finance | $40,812,238 |
Law | $30,924,124 |
Family & Support | $26,367,348 |
Buildings | $25,234,200 |
City Council | $22,843,049 |
Community Development | $17,910,303 |
Business Affairs | $12,468,418 |
Doit | $9,866,184 |
Mayor’s Office | $7,531,149 |
Ipra | $7,385,256 |
Procurement | $6,639,596 |
Board Of Election | $6,572,532 |
City Clerk | $6,185,521 |
Cultural Affairs | $5,950,580 |
Human Resources | $5,313,360 |
Inspector Gen | $4,680,780 |
Animal Contrl | $3,683,492 |
Budget & Mgmt | $3,587,580 |
Admin Hearng | $2,996,364 |
Disabilities | $2,259,888 |
Treasurer | $1,991,577 |
Human Relations | $1,457,052 |
Board Of Ethics | $780,276 |
Police Board | $159,948 |
License Appl Comm | $71,292 |
count(chicago, department) %>%
arrange(n) %>%
mutate(n = scales::comma(n)) %>%
knitr::kable(align="lr")
department | n |
---|---|
License Appl Comm | 1 |
Police Board | 2 |
Board Of Ethics | 9 |
Human Relations | 16 |
Treasurer | 22 |
Disabilities | 28 |
Admin Hearng | 39 |
Budget & Mgmt | 39 |
Inspector Gen | 57 |
Animal Contrl | 67 |
Human Resources | 70 |
Cultural Affairs | 74 |
Ipra | 82 |
Procurement | 83 |
Mayor’s Office | 86 |
City Clerk | 94 |
Doit | 102 |
Board Of Election | 117 |
Business Affairs | 161 |
Community Development | 208 |
Buildings | 262 |
City Council | 396 |
Law | 414 |
Health | 529 |
Finance | 577 |
Family & Support | 654 |
General Services | 930 |
Public Library | 961 |
Transportn | 1,073 |
Aviation | 1,521 |
Water Mgmnt | 1,840 |
Oemc | 1,982 |
Streets & San | 2,152 |
Fire | 4,796 |
Police | 12,618 |
count(chicago, department) %>%
mutate(n = as.integer(2719000 / n)) %>%
arrange(n) %>%
mutate(n = scales::comma(n)) %>%
rename(number_of_citizens_per_employee=n) %>%
knitr::kable(align="lr")
department | number_of_citizens_per_employee |
---|---|
Police | 215 |
Fire | 566 |
Streets & San | 1,263 |
Oemc | 1,371 |
Water Mgmnt | 1,477 |
Aviation | 1,787 |
Transportn | 2,534 |
Public Library | 2,829 |
General Services | 2,923 |
Family & Support | 4,157 |
Finance | 4,712 |
Health | 5,139 |
Law | 6,567 |
City Council | 6,866 |
Buildings | 10,377 |
Community Development | 13,072 |
Business Affairs | 16,888 |
Board Of Election | 23,239 |
Doit | 26,656 |
City Clerk | 28,925 |
Mayor’s Office | 31,616 |
Procurement | 32,759 |
Ipra | 33,158 |
Cultural Affairs | 36,743 |
Human Resources | 38,842 |
Animal Contrl | 40,582 |
Inspector Gen | 47,701 |
Admin Hearng | 69,717 |
Budget & Mgmt | 69,717 |
Disabilities | 97,107 |
Treasurer | 123,590 |
Human Relations | 169,937 |
Board Of Ethics | 302,111 |
Police Board | 1,359,500 |
License Appl Comm | 2,719,000 |
ggplot(chicago, aes(department, employee_annual_salary, group=department)) +
geom_boxplot(fill="steelblue", color="#b2b2b2") +
hrbrthemes::scale_y_comma(limits=c(0,310000)) +
coord_flip() +
labs(x=NULL, y=NULL, title="City of Chicago Salary Distribution by Department") +
hrbrthemes::theme_ipsum_rc(grid="X")
This is a different way to view the salary distributions which we’ll be covering in an upcoming class.
ggplot(chicago, aes(department, employee_annual_salary, group=department)) +
ggbeeswarm::geom_quasirandom(shape=21, color="white", fill="steelblue", size=1.25, alpha=1/2, stroke=0.05) +
hrbrthemes::scale_y_comma(limits=c(0,310000)) +
coord_flip() +
labs(x=NULL, y=NULL, title="City of Chicago Salary Distribution by Department") +
hrbrthemes::theme_ipsum_rc(grid="X")