When a transit trendline first seems to be going the wrong way, everyone wants an explanation, and most people want a quick one. Almost anyone a journalist asks will have a theory, which is often just a projection of their own tastes. (“I personally like x in transit, so ridership would go up if there were more x.”) Transit statistics, sadly, are often not presented in the best way to help journalists and citizens evaluate them — it’s something many folks, including me, are working on. So it’s tough for journalists to reach any conclusion, and tempting to go for a quick but misleading take.
We need to read the resulting articles with sympathy to the journalist’s challenge but also an application of common sense. So let’s have a look at Laura Nelson and Dan Weikel’s recent Los Angeles Times article, which discusses a recent moderate drop in transit ridership in greater Los Angeles.
Let me apologize to Nelson and Weikel for calling them out in particular, since my real point is to note how common these mistakes are in transit journalism. The article contains some very good points about what the agency needs to be doing (better and faster buses) which are well-supported by the data and which I’ve discussed at length here.
Still, the whole frame of the article is misleading, in two main ways:
- the use of data, which I’ll discuss in this post.
- the comparison of data to investments, which I’ll discuss in the next post.
Ridership Data is Noisy, so One or Two Data Points are Not a Trend
The chart in the article shows that ridership has been falling for one year, based on just one data point. (Later, in a tweet, Nelson told me she had two data points, with ridership down in both calendar 2014 and ’15, but that’s not in the article or the chart.)
Based on these one or two points, the authors propose a vast and ominous trend:
The Los Angeles County Metropolitan Transportation Authority, the region’s largest carrier, lost more than 10% of its boardings from 2006 to 2015, a decline that appears to be accelerating. Despite a $9-billion investment in new light rail and subway lines, Metro now has fewer boardings than it did three decades ago, when buses were the county’s only transit option.
Accelerating? You need many data points to support this claim, because you are saying not just which way the line is going but also how it’s curving. What the published chart shows is that:
- There’s a larger interesting story about the broad fall in ridership across the 90s and dramatic recovery across the 00s.
- Ridership has been generally flat since 2006, going up and down in about a 10% band, with no sign of strong movement in any direction.
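To make the point about "accelerating" concrete, here is a minimal Python sketch. The numbers are a made-up ridership index that I invented for illustration; they are not LA Metro figures. The sketch computes the year-to-year change (which way the line is going) and then the change in that change (how it is curving). Under these assumptions the curvature flips sign almost every year, which is why a year or two of data cannot tell you much about acceleration.

```python
def differences(values):
    """Return the change between each pair of consecutive values."""
    return [later - earlier for earlier, later in zip(values, values[1:])]


# Hypothetical ridership index (100 = first year). Invented for illustration only.
ridership_index = [100, 104, 98, 103, 99, 105, 101, 104, 100, 96]

# First differences: the direction of the line from one year to the next.
yearly_change = differences(ridership_index)

# Second differences: how that direction itself changes, i.e. the "curving".
curvature = differences(yearly_change)

# Count how often the curvature reverses sign between consecutive years.
sign_flips = sum(1 for a, b in zip(curvature, curvature[1:]) if a * b < 0)

print("yearly change:", yearly_change)
print("change in yearly change:", curvature)
print("sign flips in curvature:", sign_flips)
```

Note that a curvature estimate needs at least three data points to exist at all, and in a noisy series it needs many more before its sign means anything.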
Let’s acknowledge what a hard problem this is for journalists. Something seems to be happening in the transit data. People are asking about it. You work on a short turnaround. It’s very hard to take on the truth, which is that ridership data is very noisy, so it takes a long time to see a trend.
This, by the way, is why I’ve done so little commentary on trends since the August 2015 implementation of Houston’s new bus network, whose original design process I led. Ridership was down the first month, so people wrote ominous articles. Then it soared the next month, so people wrote triumphalist articles. Now it looks like it may be down a little (post on this soon). Ridership data is just very volatile, so a trend has to be going for a while before you can call it.
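One rough way to check whether a recent drop stands out from the noise is to compare it with the typical year-to-year swing and with a line fitted through the whole series. The Python sketch below does that on an invented ridership index (hypothetical numbers, not real agency data), using an ordinary least-squares slope that I chose for simplicity. It is only an illustration, not a formal statistical test.

```python
def least_squares_slope(values):
    """Slope of the best-fit straight line through values at x = 0, 1, 2, ..."""
    n = len(values)
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    return sxy / sxx


# Hypothetical ridership index (100 = first year). Invented for illustration only.
ridership_index = [100, 104, 98, 103, 99, 105, 101, 104, 100, 96]

changes = [later - earlier
           for earlier, later in zip(ridership_index, ridership_index[1:])]

latest_change = changes[-1]                                 # the newsworthy "drop"
typical_swing = sum(abs(c) for c in changes) / len(changes) # average size of a yearly move
slope = least_squares_slope(ridership_index)                # long-run direction per year

print(f"latest yearly change: {latest_change:+d}")
print(f"typical yearly swing: {typical_swing:.1f}")
print(f"fitted slope per year: {slope:+.2f}")
```

In this made-up series the last two years are both down by 4 points, yet that is no larger than the typical yearly swing, and the fitted slope across all ten years is only about a fifth of a point per year.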
Watch for the “Arbitrary Starting Year” Trick
When a journalist says some grand thing has been happening since year y, you should immediately ask: “why year y in particular?” Again, here’s how the article opens:
For almost a decade, transit ridership has declined across Southern California despite enormous and costly efforts by top transportation officials to entice people out of their cars and onto buses and trains.
Why “almost a decade”? Why not just a decade? Because if you compared 2015 to 2005 instead of 2006, ridership wouldn’t be down, and the authors wouldn’t have a story.
Sure, ridership is down 10% since 2006. But it’s up since 2011 and way up since 2004. Want to talk about the grand sweep of history? Nelson says that ridership is lower than it was 30 years ago, which sounds terrible, but it’s higher than it was 25 years ago! Thirty years ago, by the way, was fiscal year 1984-85, the year of the Olympics, so of course ridership was unusually high. [Update: Henry Fung, channeled by the excellent Ethan Elkind, has shown that it was fare cuts, not the Olympics, that made 1984-85 such a banner year.]
With a trendline like the one in the chart, you can say anything you want by comparing some past year to the present. Your conclusion is about the year you chose.
This “arbitrary starting year” trick is very common in misleading journalism. Be suspicious whenever a single past year is cited as a point of comparison.
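A simple defense is to compute the change from every possible starting year, not just the one the writer picked. Here is a minimal Python sketch of that check. The index values are hypothetical, invented only to mimic the general shape described above (a peak in 2006, a dip around 2011); they are not the actual figures from the chart.

```python
def change_to_end_year(index_by_year, end_year):
    """Percent change from each earlier year to end_year, keyed by starting year."""
    end_value = index_by_year[end_year]
    return {
        year: 100.0 * (end_value - value) / value
        for year, value in sorted(index_by_year.items())
        if year < end_year
    }


# Hypothetical ridership index by year. Invented for illustration only.
hypothetical_index = {
    2004: 88, 2005: 90, 2006: 100, 2007: 99, 2008: 98, 2009: 95,
    2010: 92, 2011: 88, 2012: 91, 2013: 94, 2014: 92, 2015: 90,
}

changes = change_to_end_year(hypothetical_index, 2015)
for year, pct in changes.items():
    print(f"since {year}: {pct:+.1f}%")
```

With these made-up numbers, the same 2015 figure is down 10% "since 2006", flat "since 2005", and up "since 2011". If the sign of the result depends on which starting year you choose, the headline is telling you about that year rather than about a trend.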
In Part 2, I’ll talk about the problem of comparing transit outcomes to transit investments.
I believe point 1 is the same as point 2; we’re picking an arbitrary end year to the data.
As has been pointed out by others, the Fiscal Year 1985 at SCRTD included a certain athletics event that occurred in Los Angeles in late summer 1984.
Interesting. Yeah, I think part of the problem with journalism is that you want to have a snappy headline. “Transit Ridership Goes up and Down” won’t do it. But if you look at that chart, especially if you extend it out to 1980, then that is exactly what has happened.
I do give credit for at least looking at overall transit ridership though. So many times I see reporters — and quite often agencies — falling over themselves because this light rail line or that streetcar has a large number of riders. That is the worst type of metric. If everyone just switches from a bus to a train and doesn’t get there any faster, that is hardly an improvement.
Thanks for writing this piece. It could be a great primer for journalists and others whose jobs force them or entice them to over-simplify complex data or to generate headlines. And I’m somewhat disheartened that some of the sites I tend to follow have also picked up on the story and reused it (e.g., https://nextcity.org/daily/entry/transit-ridership-southern-california-decline and http://www.planetizen.com/node/83509/los-angeles-transit-ridership-decline%E2%80%94are-rail-investments-working). At least the Planetizen article cast some doubt on the piece in the final paragraphs, but the headlines themselves are unfortunately what do the most damage.
And the great thing about the “messiness” of ridership is that methodologies often change over time as well. As technology improves (hopefully) the reliability and validity of ridership estimates improves — adding additional noise and difficulty in interpreting ridership trends.
But even if you include all the different operators that have come onto the scene since 1985, ridership on a per capita basis is still way down. Much of this probably has to do with how the LA urban area is designed, more complicated trip making patterns, lack of affordable housing in the denser areas and not enough transit operating money. Implementing the Proposed Bus Network Strategic Plan would be a step in the right direction.
The arbitrary starting time problem is like a 1-dimensional version of the modifiable areal unit problem wherein the significance level of a statistical test can be doctored (however inadvertently) by choosing a convenient region of space and applying the test to it. The bulk of research in ‘space syntax’, which is an academic version of ‘accessibility blobs’ suffers from this by computing accessibility metrics which rely on the entire network rather than just the portion of network around a particular point – the same statistical thinking can be applied by academics as by journalists!
I am not sure which academic works you are referring to, but space syntax analysis does not always rely on the entire network. Often it is actually done on a portion of a network around a particular point or area.
Wouldn’t Census Journey to Work Data be a better very long term indicator of public transit use? Boardings versus actual total users in a population are very different kinds of data. “Boardings” has a huge number of variables associated with it that no one outside of transit schedulers and wonks understands (service design, land use, employment, fares, linked versus unlinked trips, etc.). In many cases boardings are estimates based on formulas and not actual data, e.g. LA Metro has automated passenger counters (APC) on buses, but not on rail. Rail ridership is their best guess, just as bus ridership used to be, pre APC data. Is there a reason why we rarely see critiques of transit based on the Census?
Matt. Census Journey to Work data is, of course, just about commutes to work, so when it’s used it introduces its own distortions. It’s missing a huge range of what transit actually does: trips for school, errands, shopping, events, social, etc. Also, because Census transportation data is from only a sample of the population, it’s not very reliable at small zones, and that matters because all meaningful transit insights arise from understanding how it performs in specific local situations, not its average performance over a region, state, or country. Finally, the Census is always ultimately about what people say, and I would always prefer to watch what they do. The “huge number of variables” associated with direct counts of ridership are not just for schedulers and wonks. I specialize in helping people sort their way through them: https://humantransit.org/2015/07/mega-explainer-the-ridership-recipe.html
Actually, a more detailed analysis of the ridership trend line — and the factors behind the trends — shows a far more serious decline than the Times story describes; here’s my analysis:
The story is complex, but, the short version is, transit leadership is dedicated to expanding the rail system and, when they go too far — which is most of the time — transit usage falls. Conversely, far smaller funding of expanded and improved bus service has produced huge ridership increases.