Last month, I used IPEDS data from 2019 to illustrate a point. A friend asked me about 2020, so I decided to build a simple time series. For Black History Month, I am using the same large IPEDS dataset from my last post to illustrate the history of Black degree attainment in the USA. As you can see, although Black people with an American doctorate are extremely rare, our proportion of the PhDs earned at American universities has been climbing steadily since the start of this time series (2011). Meanwhile, the proportion of Black people earning an associate’s degree has been declining since 2015. It’s striking that the proportion of PhDs exceeded the proportion (not the total) of bachelor’s degrees in 2021 and 2022. I saw something different when I filtered for degrees earned in a Science, Technology, Engineering, or Math (STEM) category.
Although this data is interesting to look at, comes from a reputable source, and is pretty much in its raw shape (the only mathematical procedure applied was calculating a fraction, so nothing was done that could obscure reality), it’s an incomplete picture of societal progress. Caution should be exercised before drawing blanket conclusions.
*******************
I’ve included some code on my GitHub page for you R coders to use. It contains two functions and an example. The first function downloads data from the IPEDS website for the desired year range and saves it to your chosen directory. The second function compiles and parses all that data. The result is a huge R dataframe (11 years came to over 10 million rows!) where each row is a degree and the columns are various attributes of that degree: what the degree was, the race of the person who earned it, the (biological) sex of that person, the institution where it was earned, the level of the degree, etc.
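For readers who want a feel for the download-then-compile workflow before opening the repository, here is a minimal base R sketch. It is not the code from my GitHub page: the function names (`ipeds_url`, `download_ipeds`, `compile_ipeds`) are hypothetical, the URL pattern for the Completions files is an assumption that should be checked against the IPEDS site, and the compile step only stacks the raw files, without the reshaping into one row per degree described above.

```r
# Hypothetical helper: build the download URL for one year's Completions file.
# ASSUMPTION: this URL pattern may not match where IPEDS currently hosts the files.
ipeds_url <- function(year) {
  sprintf("https://nces.ed.gov/ipeds/datacenter/data/C%d_A.zip", year)
}

# Hypothetical helper: download the zip file for each year into dest_dir,
# skipping any file that is already there.
download_ipeds <- function(years, dest_dir) {
  dir.create(dest_dir, showWarnings = FALSE, recursive = TRUE)
  for (year in years) {
    dest <- file.path(dest_dir, basename(ipeds_url(year)))
    if (!file.exists(dest)) {
      download.file(ipeds_url(year), dest, mode = "wb")
    }
  }
  invisible(dest_dir)
}

# Hypothetical helper: read each downloaded file, tag rows with the year,
# and stack the years on the columns they all share.
compile_ipeds <- function(years, dest_dir) {
  per_year <- lapply(years, function(year) {
    zip_path <- file.path(dest_dir, basename(ipeds_url(year)))
    # Takes the first file in the archive; an archive may hold more than one.
    csv_name <- unzip(zip_path, list = TRUE)$Name[1]
    df <- read.csv(unz(zip_path, csv_name), stringsAsFactors = FALSE)
    names(df) <- toupper(names(df))
    df$YEAR <- year
    df
  })
  common <- Reduce(intersect, lapply(per_year, names))
  do.call(rbind, lapply(per_year, function(df) df[, common, drop = FALSE]))
}
```

Under those assumptions, a call such as `download_ipeds(2011:2021, "ipeds_raw")` followed by `degrees <- compile_ipeds(2011:2021, "ipeds_raw")` would give one stacked dataframe. Keeping only the shared columns is a design choice that avoids errors when the file layout changes between years, at the cost of dropping year-specific columns.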
You can ask, and illustrate the answers to, a TON of questions (which usually lead to MORE questions). I’ve fallen into the rabbit hole many mornings… I’ve included the code for this analysis of degrees earned by all people in the Black category as one example. However, the data for the White and Non-resident categories are also very striking!
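The fraction behind the plots can be sketched in a few lines of base R. This is an illustration under assumptions, not the analysis code from the repository: the toy counts are made-up placeholders rather than IPEDS values, `AWLEVEL` is assumed to be the award-level column, and `CTOTALT` is assumed to hold the total across all race categories.

```r
# Toy stand-in for the compiled data. The counts are made-up placeholders,
# NOT real IPEDS values. AWLEVEL and CTOTALT are assumed column names.
toy <- data.frame(
  YEAR    = c(2011, 2011, 2011, 2012, 2012, 2012),
  AWLEVEL = c(5, 5, 17, 5, 5, 17),
  CBKAAT  = c(10, 30, 2, 15, 35, 3),
  CTOTALT = c(100, 300, 40, 100, 400, 50)
)

# Sum the counts within each year and award level, then take the fraction.
sums <- aggregate(cbind(CBKAAT, CTOTALT) ~ YEAR + AWLEVEL, data = toy, FUN = sum)
sums$prop_black <- sums$CBKAAT / sums$CTOTALT

# Order by level, then year, so each level reads as a small time series.
sums <- sums[order(sums$AWLEVEL, sums$YEAR), ]
print(sums)
```

Swapping `CBKAAT` for another race column gives the same fraction for that category, which is one way to start comparing groups.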
Because the data includes biological sex, you could also compare the degree attainment of males vs females in each race category across years.
The last thing I have shared is an R object that has English names for many of the coded attributes. In the raw data, IPEDS codes each race attribute: for example, CBKAAT = Total Black people; CWHITT = Total White people; CAIANT = Total American Indian people; CASIAM = Asian males; etc. Degree levels are coded too: bachelor’s degrees are 05, associate’s degrees are 03, etc…
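To show how such a lookup can be used, here is a small sketch built on named character vectors. It is not the shared R object itself: the names `race_labels`, `level_labels`, and `decode` are hypothetical, and only a few of the codes mentioned above are included.

```r
# Hypothetical lookup tables covering a few of the codes mentioned above.
race_labels <- c(
  CBKAAT = "Total Black people",
  CWHITT = "Total White people",
  CASIAM = "Asian males"
)
level_labels <- c("03" = "Associate's", "05" = "Bachelor's")

# Hypothetical helper: translate codes to English names,
# leaving any code without a known label unchanged.
decode <- function(codes, labels) {
  out <- unname(labels[as.character(codes)])
  ifelse(is.na(out), as.character(codes), out)
}

decode(c("CBKAAT", "CASIAM", "CXXXXX"), race_labels)

# Numeric level codes are zero-padded first so they match the "03"/"05" keys.
decode(sprintf("%02d", c(3, 5)), level_labels)
```

Returning unknown codes unchanged, rather than `NA`, keeps plot labels readable even when the lookup is incomplete.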