Beyond “On-Time Performance”

A San Francisco reporter emailed me yesterday with this question, regarding the city’s main transit system, Muni:

As you know, Muni set a goal in 1999 when the [San Francisco Municipal Transportation Agency] board was formed, to have an 85 percent on-time performance standard. That was voted on in 1999 (Prop. E). Since then … the agency has yet to meet the goal or even get close to it. The highest it’s been was 75 percent a few months ago.  … I wanted to ask you if there is any danger in Muni being so focused on this one standard? Are performance metrics evolving and why are they evolving? What else should Muni be looking at as far as improving reliability?

There are two problems with the measures of “on-time performance” that prevail in the industry.

1.  They are not customer-centered.  A standard on-time performance measure shows the percentage of services that were on time, not the percentage of riders who were.  Because crowded services are more likely to be delayed, the percentage of customers who were served on time may be lower than the announced on-time performance figure.  An on-time performance figure weighted by ridership would give a clearer impression of the actual customer experience.  This requires counting the on-board load at every timepoint, which is difficult but getting easier.  Understandably, nobody wants to publish numbers that are even lower than the ones they report now, but the truth could be helpful in focusing attention on where the real problems are: on the busiest services.  (It would be trivially easy for the airline industry to do this, but they don’t. Perhaps they want you to think, falsely, that if the whole airline is 90% on time that means there’s a 90% chance that you will be on time.  If bigger, more crowded planes are more likely to be late, then that’s not true.)

2.  For high-frequency, high-volume services, actual frequency matters more.  Suppose that a transit line is supposed to run every 10 minutes, but every trip on the line is exactly 10 minutes late.  A typical on-time performance metric (e.g. the percentage of trips that are 0-5 minutes late) will declare this situation to be a total failure: 0% on-time performance.  But to the customer waiting at the stop, this situation is perfection, because a vehicle still shows up every 10 minutes.

For this reason, some big-city transit agencies use a “headway-maintenance” system, in which a transit operator’s job is not to run on time, but rather to run a specified number of minutes after the preceding vehicle on the line.  What should be reported in this case is not the time a bus came, but the actual elapsed time between consecutive trips.  GPS-based real-time location systems, which are becoming common, provide the information needed to do this.
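To make the contrast concrete, here is a rough sketch in Python of the two ways of scoring the same arrivals.  The function names, the 0-5 minute on-time window, and the sample times are all my own illustrative assumptions, not any agency’s actual definitions.  The average-wait formula is the standard one for riders who show up at random without consulting a schedule, which is itself an assumption that only fits high-frequency service.

```python
# Illustrative sketch only: names, window, and sample data are assumptions.
# All times are minutes after the start of the observation period.

def on_time_share(scheduled, actual, max_late=5):
    """Share of trips arriving 0 to max_late minutes after their scheduled time."""
    on_time = sum(1 for s, a in zip(scheduled, actual) if 0 <= a - s <= max_late)
    return on_time / len(scheduled)

def headways(arrivals):
    """Elapsed minutes between consecutive arrivals at one stop."""
    return [later - earlier for earlier, later in zip(arrivals, arrivals[1:])]

def average_wait(gaps):
    """Mean wait for riders arriving at random: sum(h^2) / (2 * sum(h))."""
    return sum(h * h for h in gaps) / (2 * sum(gaps))

scheduled = [0, 10, 20, 30, 40, 50]

# Every trip exactly 10 minutes late: terrible on paper, even gaps in practice.
uniformly_late = [t + 10 for t in scheduled]
print(on_time_share(scheduled, uniformly_late))   # 0.0
print(headways(uniformly_late))                   # [10, 10, 10, 10, 10]
print(average_wait(headways(uniformly_late)))     # 5.0

# Same number of trips over the same span, but bunched in pairs.
bunched = [10, 12, 30, 32, 50, 60]
print(headways(bunched))                          # [2, 18, 2, 18, 10]
print(average_wait(headways(bunched)))            # 7.56
```

The uniformly late line scores 0% on time yet gives the shortest possible average wait for its frequency, while the bunched line makes riders wait about half again as long.  Only a measure built on the gaps picks up that difference.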

It’s worth noting that these two objections to standard on-time performance probably push the reported rate in opposite directions.  An on-time performance measure weighted by ridership (responding to my point #1) would almost certainly yield a lower score.  But on high-frequency services (my point #2) I suspect that many services now being counted as late don’t seem that way to the passenger.  If the trip ahead was late as well, then the actual gap between trips — which is all that a customer notices on high-frequency service — may be more or less fine.  So a shift to headway maintenance might yield numbers that tell a more positive story about the actual customer experience.  In any case, once the transit agency is clearly doing what it can to optimize reliability, the case for other improvements — such as stronger transit priorities — will be far stronger.
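Here is a similarly rough Python sketch of the ridership-weighting idea from point #1.  The trip records and field names are invented for illustration; real numbers would come from passenger counters or fare data, and an agency would have to decide which riders count toward each trip.

```python
# Illustrative sketch only: the records and field names below are made up.

def trip_weighted_otp(trips):
    """Conventional figure: share of trips that ran on time."""
    return sum(1 for t in trips if t["on_time"]) / len(trips)

def rider_weighted_otp(trips):
    """Customer-centered figure: share of riders who were on an on-time trip."""
    total_riders = sum(t["riders"] for t in trips)
    on_time_riders = sum(t["riders"] for t in trips if t["on_time"])
    return on_time_riders / total_riders

# Hypothetical morning: three lightly loaded trips on time, one crowded trip late.
trips = [
    {"riders": 20, "on_time": True},
    {"riders": 25, "on_time": True},
    {"riders": 30, "on_time": True},
    {"riders": 75, "on_time": False},
]

print(trip_weighted_otp(trips))    # 0.75
print(rider_weighted_otp(trips))   # 0.5
```

In this made-up case the agency would report 75% on time, but only half of the riders actually had an on-time trip, because the one late trip was the crowded one.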

Changing these metrics is hard.  There’s a lot of room for debate about exactly how to calculate a revised measure, and that’s the easy part.  Headway maintenance, in particular, means changing job expectations and management practices.  It can affect how drivers are judged, so the unions may have a role.  All this takes a lot of courage and persistence, so it’s understandable that many agencies are thinking about the issue but are not yet ready to act.

The hardest part may be explaining to the public that the previous measures weren’t so good, because someone will ask: “If that’s true, why have you used the imperfect measure for so long?”  Fortunately, there’s a good answer to that: the quality of data.  GPS vehicle location systems have become routine in big North American systems only in the past decade, while smartcard and app-based ticketing technologies, which can pinpoint ridership by time and location, are still being rolled out.  Many of these systems have lots of bugs, but as they become more reliable, they will provide robust data that was never there before.  Transit managers can say, truthfully, that they’ve always understood the problem with on-time performance but never had the tools to monitor a more nuanced and accurate indicator.  But now they do, or they will soon.

So perhaps now is the time to update the idea of “on-time performance,” and the ways that agencies pursue it, to achieve a better focus on what matters to the customer.

[Updated July 8, 2023: This 2010 post needed very little editing to bring it current. The ability to measure headway reliability, and to relate reliability to the number of people affected, has improved enormously since I wrote this, although there are still many data challenges. Many more agencies have shifted to headway management, but some, like San Francisco, are stuck with on-time performance measures written into law even if they don’t really measure what matters.  This continues to be a lively and important topic.]

13 Responses to Beyond “On-Time Performance”

  1. Leigh Holcombe October 25, 2010 at 9:43 pm #

    Many municipal buses have onboard computers that let an operator know if s/he is “hot” (ahead of schedule) or “cold” (behind schedule). Because of the way operator performance is measured, they will often drive less safely if they are cold. I wonder if this might be too big a cost for punctuality.
    Interestingly, the biggest reason by far that buses are late is traffic – traffic largely caused by single occupant vehicles. In theory, the more successful a transit system becomes, the more of its own logistical problems are solved.

  2. Alon Levy October 25, 2010 at 9:50 pm #

    SBB has switched to a customer-centric definition of on-time performance. It counts customers as late if, due to train delay, a missed connection, or both, their trips have gone more than three minutes over schedule. By that new standard, the on-time performance is about 87%.
    (By the way, I heartily recommend you go to SBB’s English website, which features a lot of interesting information about the company and its practices. The only other non-English-speaking company I can think of that is this forthright in English is JR East.)

  3. Rob Fellows October 25, 2010 at 10:26 pm #

    There is a lot that could be done with location information that’s becoming more ubiquitous in modern transit agencies. For example, if a smart bus is given data about the location of its leader and follower, it wouldn’t be difficult to present a display for the operator showing how close he or she is to halfway between them. Similarly, headway-based signal priority requests could be turned on or off to optimize headway rather than schedule using the same information.
    The primary challenges are not technical, they’re philosophical. Can agencies adjust their thinking to prioritize headway vs. schedule management? Will they give operators the tools they need to manage their headways?

  4. Jarrett October 26, 2010 at 2:47 am #

    NOTE: This post was updated at this point in the comment thread, but all the preceding comments are still relevant to the revised post.

  5. Eric Fischer October 26, 2010 at 10:48 am #

    The fact that the definition of what it means for Muni to be on time was part of Proposition E is part of the problem — it means they can’t change from a schedule-based definition to something else without a vote of the public.
    On the other hand, the proposition says nothing about what they actually have to include in their published schedule, so I think it is kind of surprising that they haven’t altered the schedule to match reality instead of trying to make reality match the schedule.

  6. capt subway October 26, 2010 at 11:56 am #

    The OTP bugaboo has also run amok here in NYC on the NYCT subways. The former president of NYCT, Harold Roberts, was obsessed with OTP to a truly pathological degree and to the exclusion of almost everything else and wanted to achieve nothing less than 100% (LOL). He just didn’t understand that on a system such as the NY subway, with many lines operating with headways as short as 4 minutes all day long, OTP just isn’t a good indicator of service quality. Far more important are frequency, adequacy (except during the rush hours are there enough trains to comfortably carry all the passengers?) and even spacing between trains.
    The idiots in charge here in NYC actually eliminated service in the harebrained belief that too many trains were causing congestion on the line, which in turn caused lateness.
    Oh rest assured they also added huge amounts of running time to paper over any lateness. In sum: less is actually more, slower is actually faster.

  7. Steve October 26, 2010 at 12:26 pm #

    Jarrett,
    As a matter of fact Muni is doing these things.
    http://www.sfmta.com/cms/rstd/documents/6-26-09Item15FY10ServiceStandardsChangesPROPOSED.pdf

  8. capt subway October 26, 2010 at 1:07 pm #

    This is not surprising. But, without sounding patronizing or “elitist” this is the whole problem with the CA style of government by proposition. In this case most passengers do not understand the nuts-n-bolts of daily transit operations. So to most of them OTP sounds like a great way to measure performance. It’s an easily graspable concept. Try explaining through-put, even spacing, headways, service adequacy, vehicle requirements and turn-around time, crew requirements, etc. to the average voter.

  9. Jarrett October 26, 2010 at 1:31 pm #

    @ Steve. Thanks. I’ve added a sentence linking to the SFMTA policy. Keep us abreast of how it goes.

  10. Alon Levy October 26, 2010 at 7:45 pm #

    By the way, the aforementioned SBB website is here. Follow links to the passenger rail annual report, which talks about investment, on-time performance, and other issues.

  11. Chris October 27, 2010 at 10:14 am #

    I agree that for the customer waiting at the stop, a headway maintenance system would be beneficial. But a 10 minute frequency where all buses are 10 minutes late is not as good as one where they are on time – it is (presumably) taking the passengers longer to reach their destination than scheduled. And a pure headway maintenance system would penalize good bus drivers (and their passengers) who might have to drive especially slowly so as to keep their distance from a bus driver who drives especially slowly due to lack of effort or excessive socializing. A headway maintenance system might be OK if it worked with bus loading to allow heavily loaded buses to proceed and lightly loaded buses to hang back.

  12. Jarrett at HumanTransit.org October 27, 2010 at 12:10 pm #

    Chris.  I'd make a distinction between buses being late relative to a schedule, as opposed to taking longer than they should to complete a trip.  A headway maintenance system still focuses on fixing the latter problem, but quits worrying about the former one. 

  13. Scott October 27, 2010 at 4:36 pm #

    “…the actual gap between trips — which is all that a customer notices on high-frequency service…”
    This is only true if the customer a) does not need to make any transfers, b) does not need to be anywhere at a certain time, and/or c) is willing to leave far earlier than theoretically necessary.
    Several years ago, I had cause to go to another city in the Metro area every day for several weeks. This involved one or two buses, followed by a train, followed by another bus. Unless I left home a full hour earlier than called for by the theoretical service schedule, I would almost always be late, though by a different amount every time. Sometimes a scheduled run would simply never come at all, but mostly it was a matter of missing connections because a bus was running late. In the end, I actually found it more reliable and time-effective to simply bicycle the 15 miles!