
I Did a Boo Boo

Last night, I looked at a chart that had been tweeted out by Marco Learning, a terrific source for information about The College Board's AP Program.  It showed the percentage of all scores graded 4 and 5 over time by subject, and there were some glaring points: Lots of big increases in certain subjects that didn't seem to make sense.  Turns out, their data was correct.

Wanting to dive down a little deeper, I went to the College Board website to look at the data myself, and to "download" it for some additional analysis.  I put the word download in quotation marks on purpose.

I have a history with College Board, of course.  I used to download the very rich AP data by state, exam, and ethnicity they'd post on their site and put it into an interactive format that pulled out insight better than the large, text-exclusive spreadsheets they'd post.  Then--despite the organization's oft-cited commitment to transparency--they stopped.  

In an example of Newspeak worthy of the novel 1984 that they might want to use in a future AP English Literature Exam, College Board said they were going to implement a "streamlined" reporting protocol for the data.  Less data, and less insight, in other words, was better. (They also announced that their "Landscape" product was being pulled down while they were saying they were making it more transparent, by the way, and no high school person has access to it today.)

Anyway, this chart shows incorrect data for AP Psych, suggesting that the percentage of 4 and 5 scores increased by 42 percentage points between 2022 and 2024.  Let me explain how it found its way into my tweet, and the larger issues it points out.


You can still download summary data at the subject level (but not more detailed than that) on the College Board website, but it comes in a messy format that makes one think they don't really want you to do any analysis on it.  It has hidden rows, hidden columns, merged cells, and different formats by row that make anything other than tedious manual extraction almost impossible. It looks like this; the data are clearly intended for casual users who want a quick answer, and are not laid out in a way that makes in-depth study easy.



So, after getting frustrated wrangling this and admitting I'd been foiled by the data people on Vesey Street, I settled not for raw data, but for summaries on their website, on pages like this for 2024 and this for 2022.  I manually copied all the tables, pasted them into Excel, and then set about cleaning them up.  Even that was frustrating:  In some years, College Board calls its exam "AP English Language & Composition," while in other years, it's "AP English Language and Composition."  Similarly, it's either "AP 2D Art & Design" or "AP 2-D Art and Design." Some years, data are rounded to the nearest whole number; in others, to one decimal place. These are insignificant differences to human readers, but they're a big deal for computers.
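A minimal sketch of the kind of name cleanup this calls for, in Python. The function name and the three rules are illustrative assumptions that cover only the variants quoted above; they are not the spreadsheet steps actually used for this post.

```python
import re


def normalize_exam_name(name: str) -> str:
    """Collapse spelling variants of an exam name into one comparison key.

    Hypothetical helper: it handles only the '&' vs 'and' and
    '2-D' vs '2D' differences mentioned in the post, plus stray spaces.
    """
    s = name.strip()
    s = s.replace("&", "and")             # "Art & Design" -> "Art and Design"
    s = re.sub(r"\b(\d)-D\b", r"\1D", s)  # "2-D" -> "2D"
    s = re.sub(r"\s+", " ", s)            # collapse runs of whitespace
    return s.casefold()                   # ignore capitalization


variants = [
    "AP English Language & Composition",
    "AP English Language and Composition",
    "AP 2D Art & Design",
    "AP 2-D Art and Design",
]
keys = {normalize_exam_name(v) for v in variants}
print(len(keys))  # 2 distinct exams, not 4
```

Matching on a normalized key like this, rather than on the label as pasted, is what lets the same exam line up across years.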

All seemed to be going well, although the year-to-year changes in nomenclature and formatting seem capricious and undisciplined from a data standpoint, especially for an organization that prides itself on its research and analysis capabilities.

And, finally, on the 2024 link, above, guess what? AP Psych is listed twice: first under "History and Social Sciences" and then again under "Sciences."  So, AP Psych in 2024 (but not the other years) got counted twice.
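A check along these lines would have flagged the problem before any chart was built. This is a sketch under assumptions: the `(exam, section)` tuples stand in for whatever the copied tables look like after cleanup, and the rows below are made-up placeholders apart from the two section headings quoted above.

```python
from collections import Counter


def find_duplicate_exams(rows):
    """Return exam names that appear more than once in one year's rows.

    `rows` is an iterable of (exam_name, section_heading) pairs;
    this layout is an assumption for illustration.
    """
    counts = Counter(exam for exam, _section in rows)
    return sorted(exam for exam, n in counts.items() if n > 1)


# Placeholder rows shaped like the 2024 page described above.
rows_2024 = [
    ("AP Psychology", "History and Social Sciences"),
    ("AP United States History", "History and Social Sciences"),
    ("AP Biology", "Sciences"),
    ("AP Psychology", "Sciences"),
]
duplicates = find_duplicate_exams(rows_2024)
print(duplicates)  # ['AP Psychology']
```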

Had I been successful in just downloading and cleaning the numbers, this would not have happened, because I calculate percentages from the totals of the raw numbers.  But because I had to scrape this off a website, where all I had to work with was the published summaries, this error showed up.  I should have checked this a couple of ways before posting, but I didn't, and that's my fault.
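To see why working from raw counts is more forgiving, here is a small sketch with invented numbers (none of these figures are real AP data). It assumes the scraped percentages were combined by adding them up, which is one plausible way a double listing inflates a result; the post does not spell out the exact aggregation.

```python
def pct_from_counts(rows):
    """Percentage of high scores computed from summed raw counts."""
    high = sum(r["high"] for r in rows)
    total = sum(r["total"] for r in rows)
    return 100 * high / total


def pct_from_summed_percentages(rows):
    """Percentage obtained by adding up already-computed percentages."""
    return sum(r["pct"] for r in rows)


# One hypothetical exam: 50 of 100 scores are a 4 or 5.
row = {"high": 50, "total": 100, "pct": 50.0}
listed_twice = [row, row]  # the same exam appearing under two headings

print(pct_from_counts(listed_twice))              # 50.0
print(pct_from_summed_percentages(listed_twice))  # 100.0
```

With raw counts, a duplicated row doubles the numerator and the denominator together, so the ratio is unchanged; with pre-computed percentages there is no denominator left to absorb the duplicate.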

This would normally be where I'd call on College Board to make their data more accessible to the general public in the interest of transparency, but a) they don't listen, b) they don't give a crap about the members, and c) they just wait for people to forget how bad they are at the most simple things and keep paying their executives multi-million dollar salaries.  

And these are the people, I'd remind you, who are being asked to fix the FAFSA, and despite the massive conflict of interest it creates, gleefully and arrogantly agree to do so. 

All is good.  Carry on.  I'll post the complete data soon after I do more auditing.

