Baseball and Carbon Accounting, Part 2
Note: The following serves as part 2 of n on the subject. See Part 1 here.
Avoiding Runs… and Carbon… Against a Baseline
As we saw in Part 1, it’s relatively easy to count something that exists—runs allowed or carbon emissions. But in this case, we want to track the opposite: runs, or carbon, avoided. How do we do that?
We need some relevant baseline of runs allowed to compare against. As in baseball, the creation of that baseline is part art and part science... and part data science.
One starting point is “average”, specifically MLB average. It is relatable and easy to calculate. The league might have an ERA of 4.50 overall, and a given pitcher might have a 3.00 ERA in 180 innings, which amounts to a 30-run difference compared to league average (1.50 fewer runs per nine innings, across the equivalent of 20 nine-inning games). We could say that the pitcher prevented 30 runs compared to average, or that they were 30 runs above average.
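For anyone who wants to check the arithmetic, here is a minimal sketch in Python. The function name runs_vs_baseline is made up for illustration and is not an official stat; it simply scales the ERA gap, which is expressed per nine innings, by the innings pitched.

```python
def runs_vs_baseline(pitcher_era, baseline_era, innings):
    """Runs prevented relative to a baseline ERA (hypothetical helper)."""
    # ERA is earned runs per nine innings, so divide the gap by 9
    # to get runs per inning, then multiply by innings pitched.
    return (baseline_era - pitcher_era) / 9 * innings


# Numbers from the example above: 3.00 ERA vs. a 4.50 league ERA over 180 innings.
runs_saved = runs_vs_baseline(pitcher_era=3.00, baseline_era=4.50, innings=180)
print(f"{runs_saved:.0f} runs better than the baseline")
```

Swapping in a different baseline_era is all it takes to compare the same pitcher against a different baseline, which is why the choice of baseline matters so much.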
It turns out that this still isn’t sufficient for baseball player evaluation. An average player compared against league average will have a value of zero. But that’s not fair either, is it? If a league average player gets hurt, the team’s record is likely to suffer while receiving a lower level of performance from a replacement. The drop off tells us that the average player does in fact have positive value to the team.
Thus, the lower “replacement level” baseline is more commonly used to evaluate players. This allows for the estimation of “runs above replacement” and “wins above replacement” (WAR).
What is the optimal baseline for measuring carbon emissions, or rather, emissions reductions? The IPCC pathway associated with the Paris Agreement’s 1.5°C goal calls for roughly 45% reductions by 2030 relative to a 2010 baseline. Google’s baseline was set as 2019. Varying baselines might work for tracking an individual company’s progress year to year, but not beyond that. It’s a bit like saying Eddie Rosario in 2023 is worse than the 1930 league average but Ozzie Albies in 2023 is better than the 1968 league average. Without more information, we don’t know which is better in 2023 because they’re not on the same baseline, and we can’t draw conclusions about the strength of the 2023 Braves lineup either.
Splitting Credit
As with carbon emissions, our goal is to reduce run scoring as much as possible, and reaching zero is incredibly difficult. A pitcher can’t do it on their own. Even a perfect game is not entirely under their control. They typically allow at least a dozen balls in play of varying difficulty for the fielders, which have some chance of falling for a hit. All pitchers must share credit for run prevention with their fielders, with the possible exception of Steve Nebraska (and even he has the catcher to thank for receiving and framing all those strikes).
Likewise, reaching Net Zero Carbon is impossible without working up and down the supply chain. Even then, there’s probably some amount of unavoidable emissions especially in the short term, so you’ll need to look outside your own scope to remove carbon from elsewhere, either by boosting your offense or by making trades. (More on trades another time.)
Context
Even after accounting for the fielders and setting a useful baseline, we have more to consider to accurately assess an individual pitcher’s contributions.
Location
Runs are abundant in Colorado, where the air is thin and the outfield is spacious, so throwing a shutout there is even more impressive than in other parks.
Let’s say you are putting solar panels on your house. What are the carbon emissions of the panels (for simplicity, let’s assume zero) compared to the default “replacement” level source of electricity, from your utility? If you’re in the mid-Atlantic region, maybe it’s 450 grams of CO2 equivalent per kilowatt-hour. But if you’re in California, it might be 250 g/kWh. In Iceland, it’s 26 g/kWh. Even though the solar panels might be identical, the impact on emissions reductions can be substantially different.
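To put rough numbers on that, here is a small Python sketch. The grid intensities are the ones quoted above. The 10,000 kWh of annual panel output is a made-up figure for illustration, and the function name avoided_kg is hypothetical as well.

```python
# Grid intensities quoted above, in grams of CO2-equivalent per kWh.
GRID_INTENSITY_G_PER_KWH = {
    "Mid-Atlantic": 450,
    "California": 250,
    "Iceland": 26,
}

# Assumption for illustration only: the same panels produce this much everywhere.
ANNUAL_PANEL_OUTPUT_KWH = 10_000


def avoided_kg(annual_kwh, grid_g_per_kwh, panel_g_per_kwh=0.0):
    """Kilograms of CO2e avoided per year versus the local grid baseline."""
    # Panel emissions default to zero, matching the simplification above.
    return annual_kwh * (grid_g_per_kwh - panel_g_per_kwh) / 1000


for region, intensity in GRID_INTENSITY_G_PER_KWH.items():
    saved = avoided_kg(ANNUAL_PANEL_OUTPUT_KWH, intensity)
    print(f"{region}: {saved:,.0f} kg CO2e avoided per year")
```

Under these assumptions, the identical panels avoid 4,500 kg a year in the mid-Atlantic, 2,500 kg in California, and 260 kg in Iceland. Same player, different park.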
Role/Sector
We also need to consider the role being played. In 2022, the average first baseman hit .251 with offensive production 11% above league average. On the other hand, the average catcher hit .228 and produced runs at 12% below league average. Thus, a league average hitter playing catcher will be significantly more valuable than the same hitter as a first baseman.
Similarly, carbon emissions vary by industry, so eliminating carbon in a carbon-intensive industry like steel production or transportation might be more significant than in electric power, where solutions are already cost-effective and emissions have already been cut substantially. The “replacement level” is higher in steel and transportation, meaning that getting to net zero is of much greater value in terms of emissions reductions.
Timing
Run production has varied greatly over baseball history. Pitching a shutout in 1930 or 1999 meant preventing a lot more runs, compared to average or replacement level, than in 1906 or 1968. Likewise, getting to net zero in 2023 will require a lot more effort than it will in 2050. (Not to mention the cumulative effects of nearly three decades of eliminated carbon in the meantime!)
Timing is also relevant on a micro scale. Runs allowed in high leverage situations, especially late in the game, are disproportionately impactful on team wins and losses. MLB teams pay ace relievers handsomely to prevent runs when they count the most. Likewise, low emissions solutions are less valuable midday, when renewables are powering significant portions of the grid, than at night when coal and gas plants are still running strong. Companies like Google have recognized that there is an uneven impact in eliminating emissions at different hours, so they have targeted hourly electricity matching as part of their net zero pledge.
In baseball, we make adjustments for context such as location, role, and timing; similar adjustments are valuable for assessing the impact of carbon solutions.
Accounting Standards
Following the development of metrics to apportion credit for run creation and prevention, and the conceptual recognition of replacement level, Tom Tango laid out the theoretical framework for Wins Above Replacement (WAR).
However, it was just a framework, leaving the specific calculations to anyone with sufficient data and motivation to create them. Depending on your data source, how you choose to set the baselines of average and replacement level, how you choose to calculate run creation and prevention, and how you adjust for context, you could have wound up with significantly different WAR values. And, that’s exactly what happened.
Everyone and their brother built different WAR variations, with subtle but meaningful differences in their calculations. Major baseball stats websites FanGraphs, Baseball Prospectus, and Baseball-Reference all adopted their own versions. Pitchers in particular saw significant differences; a pitcher might have 7.0 WAR and lead the league on one site but 4.0 WAR on another. Writers often picked whichever answer was most convenient for their story at the moment, which made their analysis inconsistent and undermined their own evaluations long term.
Carbon accounting runs a very similar risk. Multiple methods exist, and there’s some fuzziness to how emissions are calculated. Companies can choose whichever framework they want, so some choose the method that presents them in the most favorable light.
Baseball offers reasons for optimism. The varying accounting methods have started to move closer together. Upon further study, some methodologies are shown to outperform others, and their proprietors tweak accordingly, leading to convergence. In the case of FanGraphs and Baseball-Reference, they outright agreed to set a common replacement level baseline. Raw data quality has improved (Statcast!), and the marginal differences between sources have diminished. In some cases, the league itself steps in to declare “official” statistics, though this has been rare since the recognition of the Save in 1969.
Standardized methods of accounting have led to greater adoption and recognition of their meaning. How can carbon accounting similarly move towards standardization to grow adoption and significance?
What Carbon Accounting Can Learn From Baseball
We’ve stumbled across some learnings from baseball analytics that have potential implications for carbon accounting. Let’s run through a few.
Accounting can shape reality, for better or worse. Traditional baseball accounting with Runs and RBIs shaped lineup and roster construction for a century. Bill James and his predecessors figured out the faults in this thinking, but it took 50 years for the industry itself to adjust.
Unfortunately, climate change isn’t giving us that kind of time to make mistakes. We’ve got to get this right, and now.
Double counting has drawbacks… and advantages. As noted, double counting Runs and RBIs led to suboptimal lineup construction. But, it did promote a diverse style of play, with speedsters, bat control experts, and sluggers all carving a niche in the game. Pitching staffs balanced control artists and flamethrowers. The sport of baseball remained healthy.
Now, run scoring and run prevention are optimized, but rosters are homogeneous and, to an increasing number of fans, the game has become boring. This forced the league to literally change the rules of the game to try to restore excitement.
Diversity of options also has benefits in combatting climate change. We need a lot of different solutions in different sectors. Green hydrogen might not power cars, but it might help with aviation, shipping, or manufacturing. Developing a diverse set of solutions also increases the odds that we’ll succeed if one of the presumed solutions doesn’t scale cost effectively or runs into other implementation challenges.
We also need everyone pulling in the same direction to combat climate change. We can’t put the onus totally on the Scope 1 polluters to change. We need to also motivate everyone upstream and downstream to transition to carbon free technologies. Double and triple counting emissions ensures that those up and down the supply chain are dragged into the fight as well.
Does that also complicate the accounting and subtract from its accuracy? Yes. But do those drawbacks outweigh the benefits?
Standardization can build confidence and momentum. Regardless of the quality of a metric itself, having an official definition or certification generates credibility and wider adoption. Certain carbon accounting methods have been established by international bodies and are more widely in use today as a result. Even if there are better accounting methods, do we have the will to establish new standards and spend time and money transitioning from the current methods?
There’s no perfect measurement, but some methods are better than others. This applies in baseball, where no one has found the best WAR calculation or any perfect metric for that matter. We’re still learning, and we’ll keep getting better. This is why I selected the quote from statistician George Box to kick off Part 1.
Maybe our current carbon accounting methods aren’t perfect. But are they good enough? Or, given where we stand, might they be our best option at this point?
Thanks for following along so far. I have found this helpful to better understand the challenges of carbon accounting. Maybe you took something from it as well?
I have neglected carbon offsets for now, and maybe we’ll circle back to that another time.