The data were collected from three high schools in the US. They record each student's performance on three continuous outcome variables (math, reading, and writing scores) together with five categorical predictors: race/ethnicity, parental level of education, gender, lunch type, and test preparation course.
The R code for data visualization, descriptive statistics, and multiple linear regression was written in R Markdown and knitted to HTML.
- Math: The student's score on a standardized mathematics test, a continuous variable
- Reading: The student's score on a standardized reading test, a continuous variable
- Writing: The student's score on a standardized writing test, a continuous variable
- Race/ethnicity: The student's racial or ethnic background (Asian, African-American, Hispanic, etc.)
- Parental level of education: The highest level of education attained by the student's parent(s) or guardian(s)
- Gender: The gender of the student (male/female)
- Lunch: Whether the student receives free or reduced-price lunch (yes/no)
- Test preparation course: Whether the student completed a test preparation course (yes/no)

| variable | level_1 | level_2 | level_3 | level_4 | level_5 | level_6 |
|---|---|---|---|---|---|---|
| race/ethnicity | group_a | group_b | group_c | group_d | group_e | |
| parental level of education | some high school | high school | some college | associate's degree | bachelor's degree | master's degree |
| gender | male | female | | | | |
| lunch | free/reduced | standard | | | | |
| test preparation course | completed | none | | | | |
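
As an illustration of how the levels in the table above could be encoded before modelling, here is a minimal R sketch. It is not the original analysis code: the column names (`race_ethnicity`, `parental_education`, `test_prep`, and so on) are assumed for the example, and the data frame is filled with synthetic placeholder values so the snippet runs on its own. Only the level labels come from the table.

```r
# Level sets copied from the table above.
# Column names are assumptions for this sketch, not taken from the original code.
levels_by_var <- list(
  race_ethnicity     = c("group_a", "group_b", "group_c", "group_d", "group_e"),
  parental_education = c("some high school", "high school", "some college",
                         "associate's degree", "bachelor's degree", "master's degree"),
  gender             = c("male", "female"),
  lunch              = c("free/reduced", "standard"),
  test_prep          = c("completed", "none")
)

# Synthetic placeholder data so the sketch is self-contained; NOT the study data.
set.seed(1)
n <- 200
students <- as.data.frame(
  lapply(levels_by_var, function(lv) sample(lv, n, replace = TRUE)),
  stringsAsFactors = FALSE
)
students$math <- round(rnorm(n, mean = 65, sd = 15))

# Convert each predictor to a factor with the level order shown in the table.
for (v in names(levels_by_var)) {
  students[[v]] <- factor(students[[v]], levels = levels_by_var[[v]])
}

# Multiple linear regression of one outcome (math) on the five predictors.
fit_math <- lm(
  math ~ race_ethnicity + parental_education + gender + lunch + test_prep,
  data = students
)
summary(fit_math)
```

With R's default treatment contrasts, the first level of each factor (for example `group_a` or `some high school`) acts as the reference category, so each remaining coefficient is a difference from that baseline. The same formula could be refitted with `reading` or `writing` as the response.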