Here’s a Thought: Explaining Levels Of Pitching Statistics
July 14, 2009 by Nathaniel Stoltz
The pitcher pictured in this article, J.A. Happ, is 6-0 with a 2.91 ERA.
Great, right?
Wrong.
If you know about BABIP, strand rate, FIP, and where I’m about to go with this, you don’t need to read this. If you don’t know about them, please, let me inform you.
When I wrote this article, showing that Matt Cain hasn’t progressed in his career, and is simply a No. 3 starter, I seemed to create a divide. Some people understood what I was saying (and agreed), and some people didn’t (and disagreed).
And that’s fine. I write columns here because I know a lot about the game of baseball and how to evaluate talent, and I like to use my extensive knowledge base to educate everyone.
If that last paragraph comes off as arrogant, I’m sorry: I have just put a lot of time and effort into baseball analysis, and I like to think it pays off. I freely admit when I’m wrong, and I am by no means clairvoyant when it comes to baseball.
Anyway, back to the point. I was talking about Cain’s low BABIP, high strand rate, and good-but-not-great FIP to back up my argument.
If you don’t know what that means, allow me to explain. Just listen to what it means, understand it, and then do what you want with that information. If you don’t think FIP is a good stat, fine, but at least you know exactly what it means.
So, here we go. This is what I (and stats people) believe about pitching.
Most of you who don’t know much about stats look at win-loss record and ERA.
Win-loss record is a bad stat. Obviously, a pitcher who allows one run in a game where his team scored zero runs did a better job than a pitcher who allowed seven runs in a game where his team scored eight.
Now, you may be thinking “Well, those games don’t happen often.”
But they add up, and pitchers who deserve to be 13-10 quickly become 10-13. Pitchers who should be 15-8 become 10-11. It happens. Continuing with the Matt Cain theme, he was certainly better than 15-30 over 2007-2008. He also isn’t really a 10-2 pitcher this year. He’s more like a 15-11 pitcher in each of those years.
Several years back, many people realized this and started paying more attention to ERA.
ERA is a better stat than wins, because it removes some luck from pitching.
Removing luck from pitching is so important. You have to isolate what the pitcher is doing, not what the offense behind them or the defense behind them is doing. Wins still have that big component of how much run support a pitcher gets.
ERA removes that run support component, but it leaves much more luck than many of you might think.
Consider this rather extreme example:
Pitcher A throws ten straight pitches that are swung at, hit, and roll softly between the pitcher, catcher, and third baseman on the infield. All ten batters are safe, and the batting team has a huge inning.
Pitcher B throws ten straight pitches that are crushed deep to center, but die on the warning track and are outs. He throws 3 1/3 perfect innings.
Which pitcher did a better job?
Obviously, the first one: he induced weak contact and kept the ball on the ground.
Which one has the better ERA?
The second one: he didn’t allow anyone to reach base.
Again, things that extreme don’t happen often, but consider this more plausible example.
Team A has a shortstop who can move five feet to each side in the time a line drive gets from a batter’s bat to the edge of the infield.
Team B has a shortstop who can move eight feet to each side in that time.
A pitcher on Team A gives up 15 liners between five and eight feet of the shortstop over a season. A pitcher on Team B does the same. Team A’s shortstop misses all of them and they go for hits. Team B’s catches all of them and they go for outs.
Did Team B’s pitcher do anything better than Team A’s? No. But his ERA (and WHIP) will certainly look better, because 15 fewer guys got on base during the course of a year.
Add in subtle variances in batted-ball placement, fielder positioning, and fielder quality, and you can get 20 or 30 hits becoming outs over the course of a year, or vice versa.
So there’s obviously some luck that goes into balls in play, from a pitcher’s perspective. A pitcher can’t control where the ball is hit, where his defenders are positioned, and how good his defenders are.
So we know that.
Now, let’s step away from batted balls for a second and look at what defense doesn’t play a role in.
Three things can happen in baseball that defense isn’t involved in: walks, strikeouts, and home runs. Because there’s no batted-ball luck involved, these three things are sometimes referred to as the “Three True Outcomes.”
They are also referred to as the “peripheral statistics” of a pitcher.
If you ever hear me say “…’s peripherals are:” I’m discussing the pitcher’s walk rate (how many batters he walks per nine innings), strikeout rate (how many strikeouts he gets per nine innings), and homer rate (how many homers he allows in nine innings).
For a pitcher, a walk rate of three to four BB per nine innings pitched is about average, a strikeout rate of six to seven per nine innings is about average, and a homer rate of one HR per nine innings is about average.
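For anyone who wants to check a per-nine rate by hand, here is a minimal Python sketch. The function name and the season totals are made up for illustration, and it assumes innings are entered as a true decimal (a third of an inning is about .333, not the box-score ".1").

```python
def per_nine(count: float, innings: float) -> float:
    """Scale a counting stat (walks, strikeouts, homers) to a per-nine-innings rate."""
    if innings <= 0:
        raise ValueError("innings must be positive")
    return 9.0 * count / innings


# Hypothetical season line, not any real pitcher's totals.
innings, walks, strikeouts, homers = 180.0, 70, 140, 20

bb9 = per_nine(walks, innings)        # walks per nine innings
k9 = per_nine(strikeouts, innings)    # strikeouts per nine innings
hr9 = per_nine(homers, innings)       # homers per nine innings

print(f"BB/9 {bb9:.2f}, K/9 {k9:.2f}, HR/9 {hr9:.2f}")
```

That made-up pitcher comes out at 3.50 BB/9, 7.00 K/9, and 1.00 HR/9, which is right around average on all three.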
Going back to Matt Cain, he has a walk rate of 3.54 BB/9, a strikeout rate of 7.31 K/9, and a homer rate of .85 HR/9. That means he’s got average control and slightly above-average strikeout and homer ability. That’s not bad. That’s a good #3 starter. It’s not an ace, which is what I said before.
But without reigniting a war about that, let’s move on. (If you have something to say about Cain, please just add to the 54 comments and counting on that article rather than messing with the discussion on this one.)
So we know what the pitcher controls. Let’s go back to what he doesn’t control.
If you read my articles, or those of other sabermetrics (advanced baseball stats) guys, you probably see the term BABIP used a lot.
BABIP stands for batting average on balls in play. It’s largely luck (I’ll clarify the “largely” in a bit). We know it’s largely luck because statistical studies have shown that a pitcher’s BABIP in one year has almost no correlation to his BABIP of the next year.
So, why is J.A. Happ’s 6-0 record and 2.91 ERA not a good indicator of his pitching ability?
Because he has a .242 BABIP.
Nobody (except Chris Young, Mariano Rivera, and maybe one or two others) can keep a BABIP in that range. Last year, J.A. Happ’s BABIP was .283. In 2007, it was .389.
BABIPs tend to centralize around the .300-.310 mark. If a pitcher is in that range, it’s fairly likely that their ERA will fall pretty close to their true level of performance. If it’s way below that, their ERA will likely overrate them, and if it’s way above that, their ERA will underrate them.
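If you want to compute BABIP yourself, the sketch below uses the commonly published formula (non-homer hits divided by at-bats minus strikeouts minus homers plus sacrifice flies). The .300-.310 band comes from the paragraph above; the .030 cushion I use for "way below" and "way above" is purely my own assumption, as are the function names.

```python
def babip(hits, home_runs, at_bats, strikeouts, sac_flies=0):
    """Batting average on balls in play: non-homer hits over balls put in play."""
    balls_in_play = at_bats - strikeouts - home_runs + sac_flies
    if balls_in_play <= 0:
        raise ValueError("no balls in play")
    return (hits - home_runs) / balls_in_play


def era_read(babip_value, low=0.300, high=0.310, margin=0.030):
    """Rough read on ERA. `margin` is an illustrative cutoff, not an established one."""
    if babip_value < low - margin:
        return "ERA probably overrates him"
    if babip_value > high + margin:
        return "ERA probably underrates him"
    return "ERA is probably close to his true level"


print(era_read(0.242))  # a BABIP well below the usual band
print(era_read(0.305))  # a BABIP inside the usual band
```

Treat the verdict as a prompt to look closer, not as a final answer.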
You also see the stat “FIP” thrown around a lot in my articles. What is FIP?
FIP (or Fielding Independent Pitching) is essentially ERA without the BABIP luck issues. FIP looks at a pitcher’s peripheral numbers, assumes average luck on everything else, and then tells you what that pitcher’s ERA should be. J.A. Happ’s is 4.51. Matt Cain’s is 3.87.
So FIP is essentially just a better version of ERA.
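For the curious, here is a minimal sketch of the commonly published FIP formula. The constant is an assumption on my part: it changes a little every season (it’s set so that league FIP matches league ERA), and 3.20 is just a typical value. The season totals are hypothetical.

```python
def fip(home_runs, walks, strikeouts, innings, hit_by_pitch=0, constant=3.20):
    """Fielding Independent Pitching from the things defense can't touch.

    `constant` varies by season; 3.20 is an assumed, typical value.
    """
    if innings <= 0:
        raise ValueError("innings must be positive")
    raw = 13 * home_runs + 3 * (walks + hit_by_pitch) - 2 * strikeouts
    return raw / innings + constant


# Hypothetical season line: 20 HR, 70 BB, 140 K in 180 innings.
example_fip = fip(home_runs=20, walks=70, strikeouts=140, innings=180.0)
print(f"FIP: {example_fip:.2f}")
```

Because of the season constant and hit batters, a hand calculation like this will usually land near a published FIP rather than exactly on it.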
Now, there are plenty of stat guys who will tell you FIP is perfect.
It’s not.
Pitchers can control some things on balls in play. They can influence whether the ball is a ground ball, line drive, popup, or outfield fly.
What they can’t influence is what those do.
Batters hit in the .720 range on line drives. They hit about .260 on ground balls.
So if there was a pitcher who gave up only line drives, we would actually expect him to have a BABIP around .720.
A pitcher who only gives up ground balls would be expected to have a BABIP around .260.
For flies, the BABIP is in the .160-.170 range, and batters hit about .015 on popups.
Consider these two examples:
Pitcher A: 16.7% LD, 35.8% GB, 47.6% FB, 9.4% PO.
Pitcher B: 16.7% LD, 48% GB, 35.3% FB, 10.2% PO.
If we multiply out the expected BABIP figures, we get that Pitcher A’s BABIP should be about .293 and Pitcher B’s BABIP should be about .301.
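Here is a rough sketch of that multiplication in Python. The hit rates are the approximate ones quoted above, with .165 as my assumed midpoint of the .160-.170 range for flies, and I simply multiply each listed percentage by its rate and add them up. How popups overlap with flies isn’t spelled out above, so treat that as an assumption too.

```python
# Approximate batting averages by batted-ball type, taken from the text.
# The fly-ball value is the midpoint of the quoted .160-.170 range (assumption).
HIT_RATE = {"LD": 0.720, "GB": 0.260, "FB": 0.165, "PO": 0.015}


def expected_babip(shares):
    """Multiply each batted-ball share by its hit rate and sum the products."""
    return sum(shares[kind] * HIT_RATE[kind] for kind in HIT_RATE)


pitcher_a = {"LD": 0.167, "GB": 0.358, "FB": 0.476, "PO": 0.094}
pitcher_b = {"LD": 0.167, "GB": 0.480, "FB": 0.353, "PO": 0.102}

xbabip_a = expected_babip(pitcher_a)
xbabip_b = expected_babip(pitcher_b)
print(f"Pitcher A: {xbabip_a:.3f}")
print(f"Pitcher B: {xbabip_b:.3f}")
```

With these particular rates the results land close to the figures quoted above but not exactly on them, since a small change in the assumed fly-ball rate or in how popups are counted moves the answer by a few points.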
Pitcher A is Happ, who has a .242 BABIP. Pitcher B is Jon Lester, who has a .348 BABIP.
It’s no surprise, then, that Lester’s 3.22 FIP is so much better than his 3.87 ERA.
Note that Happ, because he allows more flies, which have a lower BABIP than grounders, does have a slightly lower Expected BABIP (or xBABIP). But note that it’s a .008 difference, not .106.
Almost entirely because of that balls-in-play luck, Happ’s ERA is .96 better than Lester’s, even though Lester’s FIP is 1.29 runs lower.
I brought up Chris Young and Mariano Rivera as low BABIP-ers earlier. Why? Young allows a ton of flyballs in huge Petco Park, and Rivera’s cutter induces a ton of popups.
If you don’t feel like multiplying a bunch of numbers out to find xBABIP, just use (LD percentage + .12). Because BABIP on liners is so much higher than BABIP on anything else, limiting line drives is key to getting good results on balls in play.
In scouting terms, this is the old “pitch to contact, away from the barrel of the bat.”
So don’t assume that just because a pitcher’s BABIP is .340, it’s going to come down. Check the line-drive rate, and if it’s up in the 21-plus percent range, the BABIP is legit: 21 percent plus .12 works out to an expected BABIP of .330.
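The quick formula is easy to put into code. In this sketch the line-drive rate goes in as a decimal (.21 for 21 percent), and the .020 tolerance for calling a BABIP "legit" is my own illustrative cutoff, not an established one.

```python
def quick_xbabip(line_drive_share):
    """Quick-and-easy expected BABIP: line-drive rate (as a decimal) plus .12."""
    return line_drive_share + 0.12


def babip_looks_legit(actual_babip, line_drive_share, tolerance=0.020):
    """True if actual BABIP sits within `tolerance` (assumed) of the quick estimate."""
    return abs(actual_babip - quick_xbabip(line_drive_share)) <= tolerance


print(babip_looks_legit(0.340, 0.21))    # high BABIP backed by lots of liners
print(babip_looks_legit(0.242, 0.167))   # low BABIP without a low line-drive rate
```

The first case comes back True and the second False: a .340 BABIP with a 21 percent line-drive rate is earned, while a .242 BABIP with a 16.7 percent line-drive rate is not.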
That’s really it as far as balls in play stats.
Glad it’s over?
We’re not done yet. There are a few other things I’d like to call attention to.
You may be wondering why groundball pitchers are held in high esteem by stat guys if flies have a much lower BABIP.
It’s simple: flies can go out of the yard.
However, the percentage of fly balls that leave the yard is another statistic, like BABIP, that’s been shown to be largely random. Homer-to-flyball ratio (or HR/FB) typically sits in the 7-13 percent range. If a pitcher has a HR/FB of 2 percent, he’s lucky. If he has a HR/FB of 20 percent, he’s unlucky.
You’re probably thinking “But I thought you said pitchers could control homers!”
Well, they usually can, because flyball percentage can stay fairly consistent, and homers correlate to that.
“But, isn’t FIP flawed then?”
Yes, yes it is.
Hence xFIP (or Expected FIP). This takes a pitcher’s flyball rate, assigns average HR/FB luck, and generates an expected home run rate. It then plugs that expected home run rate, along with walks and strikeouts, into the FIP calculation to produce xFIP.
J.A. Happ has a 9.4 percent HR/FB rate, just slightly below average. His FIP is 4.51. His xFIP is 4.77.
The extra .26 is a correction in HR/FB luck.
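To see how a correction like that arises, here is a sketch of xFIP next to plain FIP. The 10 percent league HR/FB rate is an assumption picked from inside the 7-13 percent range mentioned earlier, the 3.20 constant is an assumed typical value, and the season totals are hypothetical.

```python
def fip_from_counts(home_runs, walks, strikeouts, innings, constant=3.20):
    """Commonly published FIP formula; `constant` is an assumed typical value."""
    return (13 * home_runs + 3 * walks - 2 * strikeouts) / innings + constant


def xfip(fly_balls, walks, strikeouts, innings,
         league_hr_per_fb=0.10, constant=3.20):
    """FIP with actual homers swapped for fly balls times an assumed league HR/FB."""
    expected_home_runs = fly_balls * league_hr_per_fb
    return fip_from_counts(expected_home_runs, walks, strikeouts, innings, constant)


# Hypothetical pitcher: 20 homers on 250 fly balls (8 percent HR/FB, a bit lucky).
actual = fip_from_counts(home_runs=20, walks=70, strikeouts=140, innings=180.0)
expected = xfip(fly_balls=250, walks=70, strikeouts=140, innings=180.0)
print(f"FIP {actual:.2f}, xFIP {expected:.2f}, correction {expected - actual:+.2f}")
```

A pitcher whose HR/FB sits below the league rate gets an xFIP higher than his FIP, and one above it gets an xFIP lower than his FIP.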
There’s one other stat, like BABIP and HR/FB, that has no predictive value and is thus luck. I promise I’m done after this one.
It’s strand rate, or percentage of runners stranded. The average is about 72 percent. It’s true that better pitchers may have slightly higher strand rates and worse ones might have slightly lower ones, but if a pitcher’s strand rate isn’t between 69 percent and 75 percent, they are getting lucky (if it’s higher) or unlucky (if it’s lower).
J.A. Happ’s strand rate is 85.9 percent.
With a BABIP of .242 and a strand rate of 85.9 percent, it’s obvious why Happ has succeeded.
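If you’d like to compute strand rate yourself, the sketch below uses one commonly published formulation, which discounts homers because a batter who homers is never a runner who can be left on base. The 69-75 percent band comes from the text above; the function names and season totals are made up for illustration.

```python
def strand_rate(hits, walks, hit_by_pitch, runs, home_runs):
    """Share of baserunners left on base (one common published formulation)."""
    baserunners = hits + walks + hit_by_pitch
    denominator = baserunners - 1.4 * home_runs
    if denominator <= 0:
        raise ValueError("not enough baserunners to compute a strand rate")
    return (baserunners - runs) / denominator


def strand_luck(rate, low=0.69, high=0.75):
    """Label a strand rate using the 69-75 percent band from the text."""
    if rate > high:
        return "lucky"
    if rate < low:
        return "unlucky"
    return "about normal"


# Hypothetical season: 170 H, 60 BB, 5 HBP, 80 R, 20 HR.
rate = strand_rate(hits=170, walks=60, hit_by_pitch=5, runs=80, home_runs=20)
print(f"{rate:.1%} -> {strand_luck(rate)}")
print(strand_luck(0.859))  # a strand rate of 85.9 percent
```

The made-up season comes out just under 75 percent, so it reads as about normal, while an 85.9 percent strand rate is flagged as lucky.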
While Happ isn’t a bad pitcher, his peripherals are 6.31 K/9, 3.31 BB/9, and 1.14 HR/9. All three are about average, which leads to an average 4.51 FIP.
J.A. Happ’s 2.91 ERA stems from an average pitching performance coupled with exceptional luck on balls in play and with stranding runners. His 6-0 won-loss record, on top of that, stems from good run support from an excellent Phillies offense.
The stats tell us that Happ and Cain are overrated, and Lester is underrated.
So there you have it. That’s what “us stats guys” look at when evaluating a pitcher. I hope that those of you who didn’t know about this stuff have learned something, and that I presented this information in a clear, understandable fashion.