
NY City Public Schools, and what they might tell us about the SAT

Recently, I received a message from Akil Bello, who pointed out a data visualization he had seen.  It was originally posted to Reddit, but was later edited to eliminate the red-green palette that poses a barrier for people with color-blindness.  The story was here, using a more suitable blue-red scheme.

There's nothing really wrong with visualizing test scores, of course.  I do it all the time.  But many of the comments on Reddit suggest that somehow the tests have real meaning, as a single variable devoid of any context.  I don't think that's a good way to analyze data.

So I went to the NY City Department of Education to see what I could find.  There is a lot of good stuff there, so I pulled some of it down and began taking a look at it.  Here's what I found.

On the first chart, I wanted to see if the SAT could be described as an outcome of other variables, so I put the average SAT score on the y-axis and began with a simple measure: eighth-grade math and English scores on the x-axis. Hover over the regression line, and you'll see an r-squared of about .90.

Scientists would use the term "winner, winner, chicken dinner" when getting results like this.  It means, for all intents and purposes, that if you know a high school's mean 8th grade achievement scores, you can predict its SAT scores four years later with remarkable accuracy.  And--here's the interesting thing--the equation holds for virtually every single school.  There are few outliers.
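If you want to see what that r-squared calculation looks like under the hood, here's a minimal sketch of the regression behind the first chart.  The numbers below are invented for illustration--the real values come from the NYC DOE files--but the mechanics are the same: fit a line predicting a school's mean SAT score from its mean 8th-grade achievement score, then compute r-squared from the residuals.

```python
# A minimal regression sketch. The (grade8, sat) pairs are hypothetical
# illustrative values, not actual NYC DOE data.
import numpy as np

# Hypothetical school-level means: 8th-grade achievement score, SAT score.
grade8 = np.array([280.0, 295.0, 310.0, 325.0, 340.0, 355.0])
sat = np.array([870.0, 920.0, 990.0, 1060.0, 1150.0, 1230.0])

# Ordinary least squares fit: SAT = slope * grade8 + intercept.
slope, intercept = np.polyfit(grade8, sat, 1)
predicted = slope * grade8 + intercept

# r-squared: 1 - (residual sum of squares / total sum of squares).
ss_res = np.sum((sat - predicted) ** 2)
ss_tot = np.sum((sat - sat.mean()) ** 2)
r_squared = 1 - ss_res / ss_tot

print(f"SAT = {slope:.2f} * grade8 + {intercept:.1f}, r-squared = {r_squared:.3f}")
```

An r-squared near .90 means roughly 90% of the school-to-school variance in mean SAT scores is accounted for by the earlier achievement scores alone.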

Ponder that.

But critics of the SAT also say that the scores are reflective of other things, too: an accumulation of social capital, for instance.  So use the control at the bottom to change the value on the x-axis.  Try economic need index, or percentage of students in temporary housing, or percentage of the student body that is White or Asian. The line may go up (positive correlation) or down (negative), but you'll see that the schools with the highest scores tend to have the characteristics you'd expect.
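Swapping the x-axis variable amounts to computing a different pairwise correlation each time.  Here's a hedged sketch of that step; the column names and values are invented stand-ins for the fields in the DOE data, not the actual figures.

```python
# Pairwise correlations of candidate x-axis variables with mean SAT.
# All column names and values are hypothetical, for illustration only.
import pandas as pd

schools = pd.DataFrame({
    "mean_sat":            [870, 920, 990, 1060, 1150, 1230],
    "economic_need_index": [0.92, 0.85, 0.74, 0.61, 0.48, 0.33],
    "pct_temp_housing":    [18.0, 15.0, 11.0, 8.0, 5.0, 2.0],
})

# Pearson correlation of each predictor with the mean SAT score.
correlations = schools.corr()["mean_sat"].drop("mean_sat")
print(correlations)
```

With real data you'd expect the poverty-related measures to come out negative (higher need, lower scores), which is exactly the up-or-down slope the dashboard control shows you.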

Jump to the second tab.  This is more a response to the Reddit post: The top map shows the ZIP codes, with a bubble indicating the number of schools in each ZIP.  The bottom map shows every school arrayed on two poverty scales: Economic Need Index and Percent in Temporary Housing.  The color shows the mean SAT score in the school (Critical Reading plus Math, on a 1600-point scale).  Purple dots represent higher scores.

Use the ZIP highlighter, and you'll see the top map show only that bubble, and the bottom will show the schools in it.

Got the lesson?  Good.  Now think about why the colleges with high median test scores a) have them, b) tend to produce students with high GRE, MCAT, and LSAT scores, and c) point to excellent outcomes for their students.

And let me know what you think.

