Mass Transit Student Visualizations Fall 2018

My students took on various projects this fall, and I am very proud of what they worked on, so I asked their permission to share. Because they had a LOT of projects to do and to choose from, we spent less time perfecting graphics than we did telling stories with data. As a result, be gentle.

Full portfolios from contributors

Did I miss anybody? Let me know if I did.

Interactive graphic of revenue sources for LA transit providers (and controls), flypaper effect, displacement effect

Here’s the pretty picture, or ugly picture, depending on how you feel about colors and cumulative bar charts.

Wednesday I posted some comparisons of transit revenue data from the NTD for big, regional agencies. I got some questions and comments, one being a comment from brilliant friend Shane Phillips that LA Metro’s reliance on sales tax might not be as bad as I suggested, given that much of it is spent on capital improvements. I disputed this point, simply in that volatility can hurt your capital budgets, too, and I did raise the point that LA got much less from early Measure R implementation simply because it passed in the middle of the recession, and it took us a while to climb out. During those months, we got much less take than forecasts suggested. It took us a good three years to get to forecast levels, and while Metro can use the funds, those early low months are painful–especially when R was set to sunset on us, and we got little out of the first two to three years.

Another concern came up about whose revenues I aggregated; I only did single agencies. Brilliant friend Erik Griswald suggested that we should aggregate by geography rather than just agency. I don’t know about this. I see little evidence that different agencies within regions harmonize either service or budgets; agencies have separate budgets even if their service areas and spans overlap, and not all providers are full reporters for the NTD. Geography might be enlightening for things like comparing bases. You might compare total farebox take out of total travel base, for example, for bus-only companies and rail-only companies in the same general area. But those are vastly different services with buses doing feeder work that rail services do for a smaller number of trips. Even then, though, agencies operate different service spans and coverage areas, which means different revenue-mile and revenue-hour potential. Besides the conceptual problems, it’s also a mess given the way NTD reporting is done. You have to do the work to aggregate even by agency as they split by expenditure category and source. There is a lot of budget detail that agency aggregation hides; for instance, Miami Dade Transit gets general fund appropriations from both localities and the state. That’s really interesting information for budget nerds; it’s probably less interesting for people interested in transit just thinking where the money is coming from, which is what motivated me to start poking these data in the first place.

My own concern is a simple lack of confidence that I aggregated the categories properly. I think I did fine except for some chickenfeed categories, but nonetheless, I’d have to do a lot more checking work before I published these in a journal. I think I’d need a budget person to help sort it for sure.

Nonetheless, I figured I could do something with just LA providers, and it might be interesting. I know the providers a bit better here than anywhere else. I am still missing a bunch. I tried to get every agency that participates in TAP (I didn’t; some are under the “small operators” category, tho). It’s interesting but incomplete, with all the problems I figured I’d get once I tried to do this by geography. I’ve got the kitchen sink here institutionally and service-wise, as well, including a paratransit provider for illustration (Access).

Sorry for the truncated names, but R has strong preferences about size, and for the locals who are most likely to be interested, the names should be obvious enough.

A better person would have completely consistent colors between yesterday and today, but I am a sad and sloppy coder and I didn’t save plot coding; I did the best I could. I did somewhat alter the small categories around “other” because with the tiny agencies, those become more important. Nobody here is reporting any toll revenue, so that category is gone, but I find that confusing. I checked the NTD stuff three times–I thought Metro got some revenue from HOT lanes. If they do, I don’t see it. Maybe it’s funneled through the state funds, and that’s where it shows up in the budget.

Also gone is the income tax category, since we don’t have any with these agencies, but oooooooo how I wish California would make that an opt-in option on income tax forms. It’s not the same as a big-base dedicated tax, but I think lots of people in California would be willing to chuck in $20 for transit voluntarily, and that could add up to a decent amount.

As to the “active capital investment program” suggestion Shane raised…that might be true with Metro, but many of these little, LA-area agencies do not strike me as having particularly active capital investment programs. I could be wrong, I suppose. They have to buy vehicles, and their real estate is as expensive as anybody else’s here. My suspicion, however, is that they rely on the sales tax because it’s there, and it essentially displaces other potential funding sources; by comparison, small providers in other states where sales taxes aren’t an option go with appropriations from general funds or some state fund. So my point doesn’t hold, either; appropriations can be even more of a white-knuckle ride in terms of stability than sales taxes, and you have the extra headache of dedicating lobbyist time to that in addition to policy measures.

The availability of sales tax revenue probably has enabled smaller cities to stabilize funding for transit and use their other local funds for other purposes. On balance, I’m guessing that is probably to the good. Probably. Sales taxes are easy for consumers to pay (the retailer has the cost of the compliance), have a nice big base, and tax foreigners living abroad (but visiting us) and resident noncitizens.

One counter, which I suspect Metro would make: because Los Angeles raises local funds with the sales tax, these agencies attract more Federal dollars–the flypaper effect. That can be true, too, but the Feds do not necessarily care what public finance tool you are using, as long as the money is there. So LA could go with market-rate parking charges and dedicate that to a fund, and as long as it went to a dedicated fund, that would work the same way for flypaper effects.

I’d need to do more work to be able to show anything solid about flypaper or displacement. This is only one year of funding, too, and it may just be that agencies that are seeing bigger lumps of money from the Feds had holdover ARRA funds or got lucky during one grant cycle.

Fun note: What do Metrolink and MARTA have in common? They would miss parking charges if those go away.

I’m reaaaaaaalllly sick of these data; I’ve scratched this particular itch, which is a passive aggressive way of noting that anybody who wants to see more should go play with the data themselves. 😊😊😊

Interactive transit revenue data, visualized in different ways via Plotly

This week I wanted to challenge myself to learn how to use Plotly, as those always look so nice. The other thing I wanted was to clear up some questions I had with WMATA’s budget numbers. They do a very nice job telling a visual story, but I was a little flummoxed by the absence of a full revenue chart. I also wanted to learn more about ImageMagick colors, so there’s that, too.

It’s very difficult to compare transit agency budgets; not all use zero-based budgeting and they are quite different in terms of their scope and service spans. NY Metro is so much bigger than everybody else; very few agencies cross state and county lines. But I picked five relatively large, inter-jurisdictional direct service providers. These data are all from the NTD.

This first chart is a bit weird, but stick with me. I wanted to see who was getting what from where, and how much. I was rather interested in looking at the raw budget numbers by revenue category as a whole. I do like it, as it helps us see how important in the big transit picture each type of funding source is. We do tend to point out that fares do not cover costs, and yet this shows that fares are important, overall. They may be much less important in Los Angeles than sales taxes, but in other systems, fares are important. We can also see who is getting what from the Feds, at least this year, 2015.

This next one gets us to the proportions that I suspect most people care about, and the scratchy, uncomfortable feeling I have with the dependence that so many of our big transit agencies have on sales taxes. I get into a lot of trouble saying that I don’t love our transit sales tax measures. I get why LA people support them, but it’s hard for me to tolerate the whinging about small ridership gains when we don’t act more aggressively with gasoline taxes and tolls. There is really no reason why we shouldn’t be doing both tolls and a local option gasoline tax instead of sales tax after sales tax, other than the political difficulties of expecting drivers in LA to pay. If we can’t raise the costs of driving relative to taking transit, I think we are going to be frustrated by underperforming investments for a very long time.

And sales tax revenue is hella volatile and follows the business cycle, so that agencies relying on sales taxes for half their budget, like the LA Metro, get to have the floor fall out from under them every time there is a recession.

These are both public and in Plotly, so you can diddle with them, too. For those interested, these data went from Excel to R to Plotly with direct stream through R to the Plotly API.
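For anyone who wants to redo the step from raw revenue to proportions, here is a minimal sketch in Python rather than the R I actually used. The function name `revenue_shares` and the dollar figures are made up for illustration; they are not NTD numbers for any agency.

```python
def revenue_shares(revenue_by_source):
    """Convert raw dollar amounts per funding source into shares of the total."""
    total = sum(revenue_by_source.values())
    if total <= 0:
        raise ValueError("total revenue must be positive")
    return {source: amount / total for source, amount in revenue_by_source.items()}


# Placeholder figures in millions of dollars; NOT taken from the NTD.
example_agency = {"fares": 200.0, "sales_tax": 500.0, "federal": 200.0, "other": 100.0}

shares = revenue_shares(example_agency)
for source, share in shares.items():
    print(f"{source}: {share:.0%}")
```

Working in shares is what lets agencies of very different sizes sit on the same chart, at the cost of hiding how big each budget is.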

Visualizing LA Metro’s Ridership data, 2009 until 2016

Attention Conservation Notice: Here’s the animation I’ve been working on in order to understand the ridership changes at LA Metro. Beware that the y-axes are all different, and that distorts the amount of variation going on. I did this mostly as a timeline; YMMV.


The backstory:

About a week or so ago while I was gardening, it occurred to me: if we various transit nerds were seeing the same trend we have been seeing for the past few years with LA’s Metro ridership, only labeled for VMT instead, we’d be declaring that the market for vehicle travel was saturated.

Relax, I don’t think the transit market in LA is saturated. That was just me, getting grumpy with myself for being too lazy to examine my own biases.

But I sure would like to know what is going on. There are limits as to what you can explain with descriptive data analysis, but doing some critical visualizations put me a lot further along in my own understanding of what I think is going on, so that I thought I would share what I got.

Every so often, the Times reports on these numbers, and Laura Nelson wrote up a nice story a day or so ago about bus ridership loss, and Sahra Sulaiman wrote up this depressing (but important) piece for Streetblog.

We’ve invested a lot of money recently in Los Angeles, and with capital investments at the scale we have undertaken, the last thing we want to do is put more money into obtaining fewer rides. Yes, I know, decades of neglect, yada yada, but we should be seeing nice, big jumps with early investments–diminishing returns should show up later.

Credible explanations:

a) new rail supply is moving passengers from the bus to rail so that we are having fewer bus transfers and thus, lower counts;

b) retirements and aging have prompted less commuting by transit as well as car (egads, let’s hope not as that is a demand effect);

c) gasoline prices are low so that more people drive;

d) the introduction of Uber and Lyft (then Zimride, thanks for the info Kendra Levine) into the LA travel market means that people handle the last mile problem (or the entire trip) with those services instead of buses;

e) fare increases;

f) reduced overall bus supply;

g) the routes need to be reconfigured;

h) bus transit is an inferior good*, so that we saw the highest possible usage during the worst of the recession, falling off as price-sensitive consumers at the lowest incomes leave the systems for other means;

i) all that talk about fighting obesity and active transport hit home and more people started walking and biking;

j) fare increases have forced bus riders to ride less.

It certainly looks like the Uber/higher fares/low gas prices combo is not helping, at least in the above timeline.

What did I miss? Single factor explanations are not likely here.

Then, there are various assertions that aren’t credible.

a) that “There’s actually more ridership, you just don’t understand how to use the numbers” directed at the Times’ Laura Nelson. Nah. As I demonstrate below, you can beat up on the numbers, and you’ll still have a trend. She’s reporting this correctly as far as I can tell even if transit lovers don’t like the framing being anything less than the Times’ usual, breathlessly supportive pro-development tone.

b) A hardy perennial: you’re using the wrong measure, that’s not fair! (Ridership counts are an incomplete measure of transit output–it takes more work to move 10 passengers 10 miles than it does to take 10 passengers 1 mile–but you need way better data than Metro is putting out there for the rest of us if you want passenger-miles for the bus system, where the ridership loss is occurring. But I don’t buy this one. If that were the case, Metro would have an easy answer to the Times when the ridership story comes up. If anybody from Metro wants me to diddle with better numbers, they know where I am at. Holler at me; I’m just here not getting my book done and having a midlife crisis.)

c) The averages do not capture peak ridership well, and our trains are doing that for us! That could be. But we don’t (or shouldn’t) build to peaks for any capital investment. Building to peak is one reason our auto infrastructure is so over-capitalized.

d) But, but, if you go back multiple decades when Los Angeles had fewer people, you would see that we have more rides, not fewer. Moving the end points around on analyses is certainly a way to manipulate what you see for trends. But that didn’t convince me either: of course we have more rides taken now than we did in 1970. But I still don’t get why we’ve had the ups and downs we’ve had since 2009.

To create the graphic, I went to Metro’s Ridership statistics page and downloaded the data for all the years from 2009 to 2016. For the gas price data….eh, all I could find was a statewide average, but it should do well enough to indicate trends. These are data from the Energy Information Administration.

So all of these data have strong seasonality, and while you can see trends past the noise, it’s hard to figure out what is noise and what is trend. I used a Fourier transform process to detect seasonality, then decomposed each dataset into seasonality and trend. The data for the Metro 720 looks like this, where the top panel shows you the raw data, the second the seasonal variation, the third the deseasonalized trend, and the final panel the remainders.

Metro 720 Rapid Ridership, Jan 2009 to December 2016, Time Series Decomposition
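For the curious, here is a rough sketch of the general idea in plain Python. It is not the R code behind the charts. I am assuming a naive discrete Fourier transform to pick the dominant cycle length and a classical centered moving-average decomposition for the trend, which may differ from the routine that produced the panels. The function names and the synthetic series are made up for illustration.

```python
import cmath
import math


def dominant_period(series):
    """Guess the cycle length (in samples) from the largest DFT magnitude."""
    n = len(series)
    x_mean = (n - 1) / 2
    y_mean = sum(series) / n
    # Remove a least-squares line first so the trend does not swamp the cycle.
    slope = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(series)) / sum(
        (x - x_mean) ** 2 for x in range(n)
    )
    detrended = [y - (y_mean + slope * (x - x_mean)) for x, y in enumerate(series)]
    best_k, best_mag = 1, -1.0
    for k in range(1, n // 2 + 1):
        coef = sum(v * cmath.exp(-2j * math.pi * k * t / n) for t, v in enumerate(detrended))
        if abs(coef) > best_mag:
            best_k, best_mag = k, abs(coef)
    return round(n / best_k)


def decompose(series, period):
    """Additive split into trend, seasonal, and remainder (needs 2+ full cycles)."""
    n = len(series)
    half = period // 2
    trend = [None] * n
    for t in range(half, n - half):
        window = series[t - half : t + half + 1]
        total = sum(window)
        if period % 2 == 0:
            # Even period: half-weight the two end points of the window.
            total -= 0.5 * window[0] + 0.5 * window[-1]
        trend[t] = total / period
    # Average the detrended values at each position in the cycle.
    buckets = [[] for _ in range(period)]
    for t in range(n):
        if trend[t] is not None:
            buckets[t % period].append(series[t] - trend[t])
    means = [sum(b) / len(b) for b in buckets]
    center = sum(means) / period
    seasonal = [means[t % period] - center for t in range(n)]
    remainder = [
        None if trend[t] is None else series[t] - trend[t] - seasonal[t] for t in range(n)
    ]
    return trend, seasonal, remainder


# Synthetic monthly series (made up, not Metro data): a line plus a 12-month cycle.
synthetic = [100 + 0.5 * t + 10 * math.sin(2 * math.pi * t / 12) for t in range(96)]
period = dominant_period(synthetic)
trend, seasonal, remainder = decompose(synthetic, period)
```

The trend and remainder are undefined for the first and last half-cycle, which is the usual price of a centered moving average.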

I can show you a bunch of these, and they are all interesting, but here’s the bus and the rail overall. Rail, a strong upwards trend towards the end due to the Expo line and its extension, and buses, falling, long before the fare hikes in 2014 (but, I suspect, those fare hikes are not helping; the bus riders are going to be the most price sensitive customers due to their demographics).

Metro Rail Ridership, Jan 2009 to December 2016, Time Series Decomposition


Metro Bus Ridership, Jan 2009 to December 2016, Time Series Decomposition


Ok, yeah, let’s look at the red line. That’s a lot of ridership loss there I don’t understand:

Red Line Ridership, Jan 2009 to December 2016, Time Series Decomposition


Fuel prices Jan 2009 to December 2016, Time Series Decomposition


So here is a correlation matrix of the deseasonalized bus and rail ridership numbers plotted against gas prices. I could get fancy and start fitting the curve, but…it’s fairly clear that falling gas prices since 2012 track well with the ridership declines. But I do think we are seeing a rail supply effect here: rail patrons are less likely to be sensitive to fuel or fare prices than bus patrons, and we see that rail in general correlates positively with fuel prices, though not to the same degree as bus ridership. That’s kind of a weird result, but not really when you think that a big boost in ridership came from new supply, and supply to an area–downtown Santa Monica–where riders are going to be comparatively better off than the rest of the region.

There is evidence of reshuffling existing patrons–it would be nice to have the Santa Monica Big Blue Bus data to compare–as bus and rail ridership are negatively correlated at around -.40. We know that there are times where rail and bus act as complementary goods, and there are other times when they are substitutes. If this is evidence that some riders got moved from buses to trains…well, that’s not the wondrous “we solved traffic and air quality” victory lap we might want, but those folks are probably getting a more comfortable ride, and I’m ok with that.

Correlation Plot, Fuel Price and Ridership by Mode, 2009 to 2016
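A correlation matrix like this one is just pairwise Pearson correlations between the deseasonalized series. Here is a minimal sketch in plain Python, not the code behind the plot. The helper names and the toy numbers are made up for illustration and are not the Metro or fuel price data.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def correlation_matrix(columns):
    """Nested dict of pairwise correlations for a dict of named series."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# Toy trend values, invented for illustration only.
toy = {
    "bus": [10, 9, 8, 7, 6],
    "rail": [1, 2, 3, 4, 5],
    "fuel": [4.0, 3.8, 3.5, 3.1, 3.0],
}
matrix = correlation_matrix(toy)
for name, row in matrix.items():
    print(name, {other: round(r, 2) for other, r in row.items()})
```

Correlating the trend components rather than the raw series keeps shared seasonality from inflating the numbers, though it still says nothing about causation.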


That said, it’s not all reshuffling. At the end of the time period, we have about 70,000 more rides on an average weekday on the rail system–and remember, that’s even with some pretty big losses on the Red line (whyyyyy?) and Green line. But we’ve lost nearly 280,000 rides on the bus side, so reshuffling isn’t the whole story, either.
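The arithmetic behind that, as a quick sketch. It uses the two rounded figures above and one loud assumption: that every new rail ride is a former bus ride, which gives an upper bound on how much reshuffling could explain.

```python
# Rounded average-weekday figures quoted in the text ("about" and "nearly").
rail_gain = 70_000
bus_loss = 280_000

# Net change across bus and rail combined.
net_change = rail_gain - bus_loss

# ASSUMPTION: every new rail ride replaced exactly one bus ride.
# Under that assumption, this is the most reshuffling could account for.
max_share_reshuffled = rail_gain / bus_loss

print(f"net change: {net_change:,} rides per average weekday")
print(f"upper bound on bus loss explained by reshuffling: {max_share_reshuffled:.0%}")
```

Even on that generous assumption, about three quarters of the bus loss is left to explain.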

So here’s the correlation matrix between the routes I examined. There are network spillover effects in action, generally, with connections between lines being weak in instances you would expect (gold line, Metro 720: they don’t really feed each other) and very strong in instances where lines intersect. Interestingly, both the Green and the 720 are parallel E-W routes, but with a lot of real estate in between. The Expo Line, which runs parallel to them both, perhaps became the E-W route of choice, which might explain some of those losses. Maybe.

Correlation Plot, Routes by Ridership, 2009 to 2016


Falling gas prices and higher fares are probably not helping; lags in land use probably are not helping; Uber/Lyft are probably not helping; reshuffling is a potential explanation but not, really, a problem from my perspective.

Bah. I have a better handle on what is going on, but not a convincing story (at least not for me, not yet) for why. But the pictures are kind of cool.

*If you have ever attempted to lecture me about housing supply and demand on Twitter, but you yell and scream about this term, then go back to your micro 101 class. It’s a technical term that describes a good where consumption increases when incomes fall. I still love transit, but–again technically–the term applies empirically, so I use it.

Keywords in Journal of the American Planning Association Articles, 1975 until 2017

This week’s visualization is a bite in the ass; I’m ready to tear my hair out. Let’s just say this: my original plan was to see if I could find where the “sprawl” discourse really began in planning, at least in this journal….and I have some ideas, but none of the text analysis tools I used gave me a #@@$!! thing.

Since I have to go off to graduation this morning, and they really prefer that one both put on pants and brush the teeth, I’ll just throw what I have so far up here for discussion. For all the kvetching I’ve heard over the years that JAPA spends too much time on transportation, these data don’t show that.
