Cy Young Award in the Sabermetrics Era: A Study of Who Will Win in 2011, Part 2
March 17, 2011 by Todd Drager
This is Part 2 of a two-part series in which I will analyze a current Cy Young Predictor formula. I offer a replacement formula to account for the change in philosophy for the Cy Young voters with the growing influence of new-age statistics (sabermetrics) and use this new formula to project the Cy Young race in 2011 and beyond.
In Part 1, I looked into a widely accepted Cy Young Predictor formula and explained the flaws in it. (You can check out Part 1 here).
In Part 2, I will project pitching statistics for the top 12 National League pitchers.
NOTE: All of the Tables contain a lot of information and, as such, have been uploaded elsewhere and linked in here for better clarity.
PART 2: A Method to predict the Cy Young Award winner in any given year
We have already found a more accurate way to predict who will win the Cy Young Award based upon season statistics (see Part 1), but now let’s look into an accurate way to predict who will win the Cy Young before the season even starts.
The 2011 MLB season is quickly approaching, and there has been a ton of hype surrounding the Phillies’ pitching staff, but is it warranted?
They have all shown to be dominant pitchers in the past, but how likely is it that one of the four main Phillies’ starters will take home the Cy Young this year?
Who are the most likely candidates to challenge the Phillies’ aces for the crown in 2011?
To answer all of these questions, I will need to project 2011 pitching statistics based on prior years’ data.
There’s no simple way to do this. Every year is different, and you don’t know who has made improvements and who has struggled through the offseason.
The age factor is always a question too. Some young pitchers come in with high expectations and never break through, while others come out of nowhere and have dominant seasons.
Older pitchers have a lot more experience, yet their arm strength usually suffers late in their career.
A pitcher going to a new team, or a team making significant defensive improvements in the offseason, are both obstacles to projecting accurate pitching statistics.
But in general, barring any unforeseen injury, a pitcher’s statistics will be closely related to his statistics from past seasons.
There is a limit to how far back you can look, though.
Obviously, Cliff Lee’s rookie or sophomore season isn’t a fair comparison to his later years. It takes time for a pitcher to show his true colors and for him to either develop into a star, or fade into obscurity.
Most pitchers have bumps along the way, but looking at any three or four-year time period seems to tell a lot. A pitcher’s average statistics over one three-year period will often provide clues as to how his next year will be.
Why three years?
Well, three years seems to be a good median number of years to analyze. If we only look at the previous season, we won’t give ourselves enough information.
As an example, Zack Greinke won the Cy Young in 2009 with a 2.16 ERA and 1.07 WHIP. His numbers in 2008 were decent, but not Cy Young type numbers.
In 2010, his numbers again were really good, but not enough for the Cy Young. If we were to only gauge a player’s performance on the previous year, we would surely think Greinke could post Cy Young caliber numbers in 2010.
However, it is actually very rare for a pitcher to have similar seasons back to back.
On the flip side, if we were to look at Greinke’s entire career, we’d be pulling in information from when he was just starting out in the MLB and hadn’t yet developed into the dominant pitcher he turned out to be. Again, that wouldn’t be a fair analysis.
Greinke is just one example, but the trend holds true for the majority of cases.
The 2011 Cy Young Predictor
I’ve compiled a list of the top 12 starting pitchers over the past three seasons. Their statistics are shown in table 2a here:
The average number of games started over the three-year time period is boxed for each pitcher. The adjusted wins/losses columns will be discussed shortly.
First, let’s use these statistics from past years to find an accurate projection for the 2011 season in Cy Young Points (CYP).
The CYP (adjusted) column shows the score that pitcher’s statistics in that year would have earned under the ADJUSTED CYP formula found in Part 1:
Cy Young Points (CYP) = ((5*IP/9)-ER) + (K’s/5) + (SV*1.5) + (Shutouts*2) + ((W*3)-(L*2)) + (VB*5) + ((0.5*IP)-(IP*WHIP/3))
This is found by adding the individual CYP from each individual category shown in the previous columns.
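For anyone who wants to check the arithmetic, here is a minimal Python sketch of the ADJUSTED formula. The function name and argument names are my own choices for illustration; the sample line is Adam Wainwright's 2009 season as listed in Table 1a of Part 1.

```python
def cyp_adjusted(ip, er, k, sv, sho, w, l, vb, whip):
    """Adjusted Cy Young Points, built term by term from the formula above.

    vb is 1 if the pitcher's team won its division, otherwise 0.
    """
    era_term = (5 * ip / 9) - er
    k_term = k / 5
    sv_term = sv * 1.5
    sho_term = sho * 2
    wl_term = (w * 3) - (l * 2)
    vb_term = vb * 5
    whip_term = (0.5 * ip) - (ip * whip / 3)
    return era_term + k_term + sv_term + sho_term + wl_term + vb_term + whip_term


# Adam Wainwright, 2009 (stats as listed in Table 1a of Part 1)
wainwright_2009 = cyp_adjusted(
    ip=233, er=68, k=212, sv=0, sho=0, w=19, l=8, vb=1, whip=1.21
)
print(round(wainwright_2009, 2))
```

The result agrees with the 172.37 shown for Wainwright in the CYP(adjusted) column of Table 1a.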
The W/L ADJ is a simple adjustment to account for a pitcher changing teams between 2008 and 2011.
Take, for instance, Zack Greinke.
Basically, if Greinke had been on the Brewers for the past three years, how many wins would he have had, compared to how many wins he got with the Royals?
Clearly, the Brewers had a better offense, so Greinke would have most likely had a few more wins. This adjustment compares the team wins in that given year and, using that, shows what the CYP would have been.
Then the CYP(W/L adj) column shows the total CYP using the W/L adjusted values.
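The exact arithmetic behind the W/L adjustment is not spelled out here, so the following Python sketch is only one plausible reading of it. It assumes wins scale with the ratio of the new team's wins to the old team's wins and that the pitcher's total number of decisions stays the same. The function names are hypothetical, and the team win totals are placeholder inputs for illustration.

```python
def adjust_record(wins, losses, old_team_wins, new_team_wins):
    """Scale a pitcher's wins by the ratio of team wins (an assumption).

    Total decisions are held constant, so extra wins come out of losses.
    """
    decisions = wins + losses
    adj_wins = min(decisions, wins * new_team_wins / old_team_wins)
    adj_losses = decisions - adj_wins
    return adj_wins, adj_losses


def wl_points(wins, losses):
    """The W/L term of the ADJUSTED formula: (W*3) - (L*2)."""
    return (wins * 3) - (losses * 2)


# Greinke's 16-8 record from 2009; team win totals are placeholder inputs
adj_w, adj_l = adjust_record(wins=16, losses=8, old_team_wins=65, new_team_wins=80)
print(round(adj_w, 1), round(adj_l, 1))
print(round(wl_points(16, 8), 1), round(wl_points(adj_w, adj_l), 1))
```

Only the W/L term changes under this adjustment; every other term of the CYP stays as it was.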
The CYP(#G adj) is a simple adjustment to equalize the number of games started for all pitchers.
Some pitchers have started more games than others over the past three seasons. Some of that is due to the pitcher’s team being in a playoff race, and some is due to the pitcher himself having a dominant season.
In any given year, any pitcher shown could throw more innings, but it depends on if they are called upon to do so. It’s out of the pitcher’s control, so I assume each pitcher to start the SAME number of games to equalize each pitcher’s chances.
If you look at Table 2a, the boxed numbers show the average number of games started for each pitcher. Normalizing to the largest average will eliminate the advantage that some pitchers have had by simply starting more games.
NOTE: This does not equalize number of innings pitched, but only equalizes the number of starts. If a pitcher typically goes deep into games, he will still have a significant advantage over a pitcher who only tosses a few innings per start.
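A rough Python sketch of the games-started adjustment follows. It assumes the adjustment is a simple proportional scaling of a pitcher's CYP up to the largest average number of starts in the group. The function name, pitcher labels, and numbers are all hypothetical.

```python
def normalize_to_most_starts(cyp_by_pitcher, avg_starts_by_pitcher):
    """Scale each pitcher's CYP to the largest average games started.

    Assumes CYP grows in proportion to starts, which is a simplification.
    """
    target = max(avg_starts_by_pitcher.values())
    return {
        name: cyp * target / avg_starts_by_pitcher[name]
        for name, cyp in cyp_by_pitcher.items()
    }


# Hypothetical values, for illustration only
cyp = {"Pitcher A": 150.0, "Pitcher B": 160.0}
avg_starts = {"Pitcher A": 30.0, "Pitcher B": 33.0}

adjusted = normalize_to_most_starts(cyp, avg_starts)
print(adjusted)
```

The pitcher with the most starts is left unchanged, and everyone else is scaled up to match. Since the scaling is per start, a pitcher who goes deeper into games keeps his edge, as the note above points out.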
After those two simple adjustments, we can find the average CYP over the past three years, as shown in the far right column.
As you can see, I’ve taken a straight average over the past three years to find the CYP (ave). The reason I haven’t weighted more recent years more heavily is the one I mentioned earlier: it’s very rare for a pitcher to have similar seasons back-to-back.
An average over a three-year period is a much better indicator.
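The straight average, along with the spread that the probability step needs, can be sketched in a few lines of Python. The three season values are hypothetical, and using the sample standard deviation (rather than the population one) is my assumption, since the text does not say which was used.

```python
from statistics import mean, stdev


def three_year_summary(cyp_by_season):
    """Return the straight (unweighted) average and sample standard deviation."""
    return mean(cyp_by_season), stdev(cyp_by_season)


# Hypothetical adjusted CYP totals for one pitcher over three seasons
seasons = [120.0, 129.0, 138.0]
avg, spread = three_year_summary(seasons)
print(avg, spread)
```

With only three data points, the standard deviation is a rough estimate at best, which is worth keeping in mind when reading the probabilities that depend on it.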
Now comes the fun part.
Using all this data, we can find the probability of a pitcher having a dominant season, and thus infer his likeliness to take home the Cy Young award.
We have to find the probability that a pitcher will reach a high number of CYP in 2011. Oswalt may average 129 CYP, but will he be able to get enough CYP in 2011 to win the Cy Young Award? You won’t do that with only 129 CYP.
We can analyze this with cumulative probability.
Cumulative probability is the sum of the probabilities over a range of outcomes. Here, it is used to estimate the probability that a normally distributed score (the normal random variable) will be greater than or equal to a specific value.
If we set that value to a CYP of, say, 190, the result is the probability of a given pitcher reaching 190 CYP in 2011. I use 190 because, based on the past data, a pitcher who scores 190 CYP should take home the award.
Table 2c below shows the standard deviation, then the cumulative probability of each of the top 12 pitchers achieving 190 CYP in 2011.
Then, in the rightmost column, each pitcher’s chance of winning the award is broken down into a percentage of the sum.
A safety factor of 20 percent was left in to account for other starting pitchers not mentioned, and relief pitchers.
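Here is a minimal Python sketch of the probability step, assuming each pitcher's 2011 CYP follows a normal distribution with his three-year average and standard deviation. The function names, pitcher labels, and numbers are hypothetical, and treating the 20 percent safety factor as a fixed reserve taken off the top is one reading of the method, not necessarily the exact calculation behind Table 2c.

```python
from statistics import NormalDist


def prob_reaching(threshold, mean_cyp, sd_cyp):
    """P(CYP >= threshold) under a normal distribution (upper tail)."""
    return 1.0 - NormalDist(mu=mean_cyp, sigma=sd_cyp).cdf(threshold)


def award_shares(probs, reserve=0.20):
    """Split (1 - reserve) among pitchers in proportion to their probabilities."""
    total = sum(probs.values())
    return {name: (1.0 - reserve) * p / total for name, p in probs.items()}


# Hypothetical (average CYP, standard deviation) pairs, for illustration only
pitchers = {"Pitcher A": (170.0, 20.0), "Pitcher B": (150.0, 25.0)}

probs = {name: prob_reaching(190, m, s) for name, (m, s) in pitchers.items()}
shares = award_shares(probs)
for name in pitchers:
    print(name, round(probs[name], 4), round(shares[name], 4))
```

A pitcher with a lower average can still carry a meaningful share if his seasons vary a lot, because a wide spread puts more of his distribution above 190.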
As you can see, Roy Halladay has the best chance of achieving 190 CYP this year. Cliff Lee, Chris Carpenter and Tim Lincecum also have good odds, but the odds are significantly less than Halladay’s.
A breakdown by team is shown in Table 2d below.
TABLE 2d
Probability By Team | |
TEAM | PROBABILITY |
Phillies | 49.64% |
Cardinals | 11.32% |
Giants | 10.80% |
Brewers | 7.30% |
Rockies | 2.14% |
Padres | 1.58% |
Dodgers | 0.93% |
Marlins | 0.67% |
All Others | 15.63% |
TOTAL | 84.38% |
The Phillies’ top four starters combine for a predictable advantage in winning the Cy Young Award this year at 49.64 percent.
The Cardinals have the next best odds at 11.32 percent, but it’s worth noting that the analysis was also done with Wainwright in the rotation for the Cards.
The probability shifted from around 15 percent for the Cardinals, down to 11.32 percent without him.
Conclusion
The calculations presented in this study use several assumptions, and every year presents a new opportunity for ANY pitcher.
A new ace may come out of nowhere, much like Mat Latos did last year. A steady pitcher over the past few years could completely fall apart, a pitcher could get traded, or there could be significant injuries to anyone.
You never know what will happen, and that’s why we watch sports, right? If everything went as predicted every year, where would the fun be in cheering for the underdog?
This study is a snapshot of predictable statistics for the 2011 season and how the Cy Young voting will turn out come season’s end, based on things we know NOW.
The Phillies may have good odds to win the Cy Young, but will they? Or will a young stud come out of nowhere and challenge one of the perennials in the N.L.?
The odds indicate otherwise but, with so many variables, it’s impossible to really know.
When we sharpen our pencil and get down deep into the stats, we can know a little more and be more prepared to answer those questions.
Hopefully, this study puts you a little bit ahead of the other guy.
Written By Todd Drager
This article was originally published in clean and simple PDF form here.
Follow Todd on Twitter @7thandPattison
Cy Young Award in the Sabermetrics Era: A Study of Who Will Win in 2011 Part 1
March 7, 2011 by Todd Drager
This is a two-part series in which I will analyze a current Cy Young Predictor formula, offer a replacement formula to account for the change in philosophy for the Cy Young voters with the growing influence of new-age statistics (sabermetrics), and use this new formula to project the Cy Young race in 2011 and beyond.
Part 1 will look into a widely accepted Cy Young Predictor formula and explain the flaws in it. As voters are considering sabermetric statistics more and more, the Cy Young formula also needs to adapt to the new ways of thinking and voting. Part 1 will examine and break down a new formula and check it against past data for accuracy.
Part 2 will look to the 2011 Cy Young race in the National League. First an analysis will be done to determine how the top pitchers will fare in the 2011 season. Next, a projection will be done to determine how these pitchers will end up placing on the Cy Young ballot. A similar analysis will be done for the American League at a later date.
PART 1: A TILT TOWARDS ADVANCED STATISTICS AND WHAT IT MEANS TO THE CY YOUNG AWARD
INTRODUCTION
Baseball fans are getting smarter.
There’s been a change in the way we watch, discuss and analyze baseball in the past few years. A lot of that is due to fantasy geeks around the world and their constant striving to find an edge in the game. We have Daniel Okrent and the guys from the original Rotisserie League to thank for that. They brought the game into our lives, and now millions play it day in and day out.
Fantasy baseball is really a derivative of what Bill James was trying to do with sabermetrics. James began his work in the 70’s to try and find a better way of assigning value to players on the field and at the plate. His work won the approval of many, including Okrent. Without James developing the theory and Okrent bringing a “silly little game” to our lives, in all likelihood, baseball wouldn’t be nearly as popular as it is today, and it certainly wouldn’t be dissected as much.
James knew back in the 70’s that many highly regarded baseball statistics weren’t telling the whole story. One of them was the win/loss category. Pitchers can only do so much to win games, so if they don’t have a decent offense behind them their wins will be lower and their losses will be higher than a pitcher with the same arm on a team with a great offense. That seemed obvious to him, yet the mainstream media and baseball gurus around the league had been using certain barometers for good pitchers and bad pitchers for years, so while James had done some incredible work it took years for it to be truly recognized.
The tide has shifted lately though, and many are finally coming around to the advanced metrics James and others had been writing about for years. This is especially evident in the 2010 Cy Young voting. Felix Hernandez won the award with a startling 21 of 28 first place votes from the Baseball Writers’ Association of America even though he only won 13 games. If this had happened in the 70’s, he wouldn’t have even made the ballot.
THE EXISTING CY YOUNG PREDICTOR
James wrote a formula with Rob Neyer of ESPN to calculate a projected Cy Young winner prior to this shift in voting. His formula is as follows:
Cy Young Points (CYP) = ((5*IP/9)-ER) + (K’s/12) + (SV*2.5) + Shutouts + ((W*6)-(L*2)) + VB
(where VB is a Victory Bonus of 12 points awarded for leading your team to the division championship.)
Why he even bothered to write the formula is questionable in itself, as the Cy Young almost always went to the pitcher with the most wins prior to 2009. Period. But that’s beside the point. James’ formula worked great up until the past few years. But then a noticeable shift in voting occurred.
In 2008, James’ formula correctly selects Cliff Lee and Tim Lincecum.
In 2009, however, James’ formula selects Felix Hernandez and Adam Wainwright. Zack Greinke won in the A.L. and was ranked 2nd in James’ formula. Tim Lincecum won in the N.L. and was only ranked 4th in James’ formula.
In 2010, again we see the shift in voting. James’ formula selects Roy Halladay and CC Sabathia, but Felix The Kid ended up taking home the A.L. award. Felix was ranked 6th in James’ formula! Above him were CC, Price, Lester, Soriano, and Buchholz.
If the trend in voting continues down this path, it’s clear that James’ original formula needs to be modified to fit this new-age thinking. In the following study I’ll explain the flaws in the old formula and provide a new formula to account for the shift in Cy Young voting.
THE ADJUSTED CY YOUNG PREDICTOR
If we take apart James’ formula and break it into variables and constants we have this:
Cy Young Points (CYP) = ((A*IP/9)-ER) + (K’s/B) + (SV*C) + (Shutouts*D) + ((W*E)-(L*F)) + (VB*G)
(where the constants are A through G, and the variables are each pitcher’s individual stats)
As Cy Young voters become more and more accepting of sabermetric statistics, this formula seems to be leaving out some key data that voters look at. While I could make the case that voters should look at advanced statistics such as WAR, CERA, or DIPS, that isn’t yet a reality. Maybe in the coming years these advanced statistics will be looked at, but that time is not now. There is, however, one glaring piece of information left out of James’ formula that voters are clearly looking at now: WHIP ((walks + hits)/IP).
WHIP. It even sounds cool. It’s simple enough for anyone to understand, yet very telling of a pitcher’s dominance on the mound. With a quick glance at WHIP you can get a snapshot of the pitcher and understand how much luck was involved with his ERA and overall record. With Greinke and Hernandez winning the A.L. Cy Young the last two years yet not dominating the Win category, it’s clear the voters are looking into a category that those pitchers did well in. WHIP.
Greinke had a 1.073 WHIP in 2009 (good for 2nd in the A.L.) to go with his 2.16 ERA and 242 K’s, and Hernandez had a 1.06 WHIP in 2010 (2nd in the A.L.) to go along with his 2.27 ERA and 232 K’s. Neither pitcher was in the top 5 in Wins in the A.L., and in fact Hernandez only amassed 13 throughout the entire season.
If we incorporate WHIP into James’ equation and modify the constants we can find an equation much more suitable for the present day. The easiest way to explain how and why I made the changes is to show both the EXISTING equation and ADJUSTED equation, and then provide an explanation and commentary below. The basic equation, including WHIP, is:
Cy Young Points (CYP) = ((A*IP/9)-ER) + (K’s/B) + (SV*C) + (Shutouts*D) + ((W*E)-(L*F)) + (VB*G) + ((H*IP)-(IP*WHIP/J))
And the constants used in both James’ (EXIST) and my (ADJUSTED) study are as follows:
CONSTANT | EXIST | ADJUSTED |
A | 5 | 5 |
B | 12 | 5 |
C | 2.5 | 1.5 |
D | 1 | 2 |
E | 6 | 3 |
F | 2 | 2 |
G | 12 | 5 |
H | 0 | 0.5 |
J | 0 | 3 |
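To make the comparison concrete, here is a short Python sketch that evaluates the basic equation with either set of constants. The dictionary layout and function name are my own, for illustration. One assumption to flag: since H and J are both 0 in the EXIST column, the sketch treats that as "no WHIP term" and skips it rather than dividing by zero. The sample line is Tim Lincecum's 2009 season as listed in Table 1a.

```python
# Constants A through J from the table above
EXIST = {"A": 5, "B": 12, "C": 2.5, "D": 1, "E": 6, "F": 2, "G": 12, "H": 0, "J": 0}
ADJUSTED = {"A": 5, "B": 5, "C": 1.5, "D": 2, "E": 3, "F": 2, "G": 5, "H": 0.5, "J": 3}


def cyp(s, c):
    """Cy Young Points for stat line s using constant set c."""
    total = (c["A"] * s["IP"] / 9) - s["ER"]
    total += s["K"] / c["B"]
    total += s["SV"] * c["C"]
    total += s["SHO"] * c["D"]
    total += (s["W"] * c["E"]) - (s["L"] * c["F"])
    total += s["DC"] * c["G"]  # DC is 1 for a division champion, else 0
    if c["J"] != 0:  # assumption: J = 0 means the formula has no WHIP term
        total += (c["H"] * s["IP"]) - (s["IP"] * s["WHIP"] / c["J"])
    return total


# Tim Lincecum, 2009 (stats as listed in Table 1a)
lincecum_2009 = {"IP": 225.3, "ER": 62, "K": 261, "SV": 0, "SHO": 2,
                 "W": 15, "L": 7, "DC": 0, "WHIP": 1.05}

print(round(cyp(lincecum_2009, EXIST), 2))
print(round(cyp(lincecum_2009, ADJUSTED), 2))
```

Both values line up with Lincecum's row in Table 1a (162.92 under EXIST, 184.16 under ADJUSTED), with innings entered exactly as the table lists them.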
COMMENTARY
I came to these adjusted constants by analyzing the relative strength each individual constant would add to the overall total. To do this I analyzed the 2009 N.L. Cy Young race. Using James’ existing equation, the top 10 finishers should have been the following pitchers in the order shown below in Table 1a (with stats included). The CYP(exist) is the value calculated with the EXISTING equation and the CYP(adjusted) is the value calculated with the ADJUSTED equation.
TABLE 1a
2009 N.L. | ||||||||||||||||
RK | PLAYER | TEAM | G | GS | IP | ER | K | SV | SHO | W | L | ERA | DC | WHIP | CYP(exist) | CYP(adjusted) |
1 | Adam Wainwright | STL | 34 | 34 | 233 | 68 | 212 | 0 | 0 | 19 | 8 | 2.63 | 1 | 1.21 | 189.11 | 172.37 |
2 | Chris Carpenter | STL | 28 | 28 | 192.7 | 48 | 144 | 0 | 1 | 17 | 4 | 2.24 | 1 | 1.01 | 178.06 | 169.33 |
3 | Jonathan Broxton | LA | 73 | 0 | 76 | 22 | 114 | 36 | 0 | 7 | 2 | 2.61 | 1 | 0.96 | 169.72 | 132.70 |
4 | Tim Lincecum | SF | 32 | 32 | 225.3 | 62 | 261 | 0 | 2 | 15 | 7 | 2.48 | 0 | 1.05 | 162.92 | 184.16 |
5 | Heath Bell | SD | 68 | 0 | 69.7 | 21 | 79 | 42 | 0 | 6 | 4 | 2.71 | 0 | 1.12 | 157.31 | 115.35 |
6 | Ryan Franklin | STL | 62 | 0 | 61 | 13 | 44 | 38 | 0 | 4 | 3 | 1.92 | 1 | 1.2 | 149.56 | 103.79 |
7 | Javier Vazquez | ATL | 32 | 32 | 219.3 | 70 | 238 | 0 | 0 | 15 | 10 | 2.87 | 0 | 1.03 | 141.67 | 158.79 |
8 | Brian Wilson | SF | 68 | 0 | 72.3 | 22 | 83 | 38 | 0 | 5 | 6 | 2.74 | 0 | 1.2 | 138.08 | 102.00 |
9 | Josh Johnson | FLA | 33 | 33 | 209 | 75 | 191 | 0 | 0 | 15 | 5 | 3.23 | 0 | 1.16 | 137.03 | 138.00 |
10 | Jair Jurrjens | ATL | 34 | 34 | 215 | 62 | 152 | 0 | 0 | 14 | 10 | 2.6 | 0 | 1.21 | 134.11 | 130.63 |
The CYP(adjusted) components were separated into percentages of the sum in order to understand why a certain player received a certain score, i.e. answering the question, “what did they do well in”. Table 1b summarizes those findings.
TABLE 1b
RK | PLAYER | TEAM | ERA | K | SV | SH | W/L | DC | WHIP |
1 | Adam Wainwright | STL | 35.65% | 24.60% | 0.00% | 0.00% | 23.79% | 2.90% | 13.07% |
2 | Chris Carpenter | STL | 34.88% | 17.01% | 0.00% | 1.18% | 25.39% | 2.95% | 18.59% |
3 | Jonathan Broxton | LA | 15.24% | 17.18% | 40.69% | 0.00% | 12.81% | 3.77% | 10.31% |
4 | Tim Lincecum | SF | 34.30% | 28.34% | 0.00% | 2.17% | 16.83% | 0.00% | 18.35% |
5 | Heath Bell | SD | 15.36% | 13.70% | 54.62% | 0.00% | 8.67% | 0.00% | 7.65% |
6 | Ryan Franklin | STL | 20.13% | 8.48% | 54.92% | 0.00% | 5.78% | 4.82% | 5.88% |
7 | Javier Vazquez | ATL | 32.64% | 29.98% | 0.00% | 0.00% | 15.74% | 0.00% | 21.64% |
8 | Brian Wilson | SF | 17.81% | 16.28% | 55.88% | 0.00% | 2.94% | 0.00% | 7.09% |
9 | Josh Johnson | FLA | 29.79% | 27.68% | 0.00% | 0.00% | 25.36% | 0.00% | 17.16% |
10 | Jair Jurrjens | ATL | 43.98% | 23.27% | 0.00% | 0.00% | 16.84% | 0.00% | 15.91% |
AVERAGES: | SP | | 35.21% | 25.15% | 0.00% | 0.56% | 20.66% | 0.98% | 17.45% |
| RP | | 17.13% | 13.91% | 51.53% | 0.00% | 7.55% | 2.15% | 7.73% |
| SP(exist) | | 30-40% | 9-12% | 0.00% | 0-1% | 45-55% | 2-5% | 0.00% |
| RP(exist) | | 10-15% | 3-7% | 60-70% | 0.00% | 13-17% | 2-5% | 0.00% |
Look at the averages to make sense of it all.
As you can see by looking at the averages, with the ADJUSTED equation, the overall score depends on roughly 50% ERA + WHIP whereas the EXISTING equation would account for roughly 30-40% ERA + WHIP for Starting Pitchers (SP). Another big change is the dependence on Wins. In the EXISTING equation, wins accounted for roughly 45-55% of the total score, whereas in the ADJUSTED equation, Wins account for much less (an average of 20.66% in 2009). Strikeouts were also valued higher in the ADJUSTED equation, as the voters seem to value that more now too.
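The percentage breakdown can be reproduced with a small Python sketch like the one below. The function names are hypothetical; the component labels follow the column headings of Table 1b, and the sample line is Adam Wainwright's 2009 season from Table 1a.

```python
def adjusted_components(ip, er, k, sv, sho, w, l, dc, whip):
    """Each term of the ADJUSTED equation, keyed by its Table 1b column."""
    return {
        "ERA": (5 * ip / 9) - er,
        "K": k / 5,
        "SV": sv * 1.5,
        "SH": sho * 2,
        "W/L": (w * 3) - (l * 2),
        "DC": dc * 5,
        "WHIP": (0.5 * ip) - (ip * whip / 3),
    }


def component_shares(components):
    """Express each component as a percentage of the total score."""
    total = sum(components.values())
    return {name: 100 * value / total for name, value in components.items()}


# Adam Wainwright, 2009 (stats as listed in Table 1a)
parts = adjusted_components(ip=233, er=68, k=212, sv=0, sho=0, w=19, l=8, dc=1, whip=1.21)
shares = component_shares(parts)
for name, pct in shares.items():
    print(f"{name}: {pct:.2f}%")
```

For Wainwright this gives roughly 35.65% from the ERA term and 23.79% from W/L, in line with his row in Table 1b.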
To verify that this ADJUSTED equation would work for more than just one circumstance, it was tested on the past 2 years’ Cy Young races in both the N.L. and the A.L. The data is shown below:
TABLE 1c | ||||||||||||||||
2009 N.L. | ||||||||||||||||
RK | PLAYER | TEAM | G | GS | IP | ER | K | SV | SHO | W | L | ERA | DC | WHIP | CYP(exist) | CYP(adjusted) |
1 | Adam Wainwright | STL | 34 | 34 | 233 | 68 | 212 | 0 | 0 | 19 | 8 | 2.63 | 1 | 1.21 | 189.11 | 172.37 |
2 | Chris Carpenter | STL | 28 | 28 | 192.7 | 48 | 144 | 0 | 1 | 17 | 4 | 2.24 | 1 | 1.01 | 178.06 | 169.33 |
3 | Jonathan Broxton | LA | 73 | 0 | 76 | 22 | 114 | 36 | 0 | 7 | 2 | 2.61 | 1 | 0.96 | 169.72 | 132.70 |
4 | Tim Lincecum | SF | 32 | 32 | 225.3 | 62 | 261 | 0 | 2 | 15 | 7 | 2.48 | 0 | 1.05 | 162.92 | 184.16 |
5 | Heath Bell | SD | 68 | 0 | 69.7 | 21 | 79 | 42 | 0 | 6 | 4 | 2.71 | 0 | 1.12 | 157.31 | 115.35 |
6 | Ryan Franklin | STL | 62 | 0 | 61 | 13 | 44 | 38 | 0 | 4 | 3 | 1.92 | 1 | 1.2 | 149.56 | 103.79 |
7 | Javier Vazquez | ATL | 32 | 32 | 219.3 | 70 | 238 | 0 | 0 | 15 | 10 | 2.87 | 0 | 1.03 | 141.67 | 158.79 |
8 | Brian Wilson | SF | 68 | 0 | 72.3 | 22 | 83 | 38 | 0 | 5 | 6 | 2.74 | 0 | 1.2 | 138.08 | 102.00 |
9 | Josh Johnson | FLA | 33 | 33 | 209 | 75 | 191 | 0 | 0 | 15 | 5 | 3.23 | 0 | 1.16 | 137.03 | 138.00 |
10 | Jair Jurrjens | ATL | 34 | 34 | 215 | 62 | 152 | 0 | 0 | 14 | 10 | 2.6 | 0 | 1.21 | 134.11 | 130.63 |
TABLE 1d | ||||||||||||||||
2009 A.L. | ||||||||||||||||
RK | PLAYER | TEAM | G | GS | IP | ER | K | SV | SHO | W | L | ERA | DC | WHIP | CYP(exist) | CYP(adjusted) |
1 | Felix Hernandez | SEA | 34 | 34 | 238.7 | 66 | 217 | 0 | 1 | 19 | 5 | 2.49 | 0 | 1.14 | 189.69 | 187.66 |
2 | Zack Greinke | KC | 33 | 33 | 229.3 | 55 | 242 | 0 | 3 | 16 | 8 | 2.16 | 0 | 1.07 | 175.56 | 191.66 |
3 | CC Sabathia | NYY | 34 | 34 | 230 | 86 | 197 | 0 | 1 | 19 | 8 | 3.37 | 1 | 1.15 | 169.19 | 156.01 |
4 | Mariano Rivera | NYY | 66 | 0 | 66.3 | 13 | 72 | 44 | 0 | 3 | 3 | 1.76 | 1 | 0.90 | 163.83 | 125.49 |
5 | Roy Halladay | TOR | 32 | 32 | 239 | 74 | 208 | 0 | 4 | 17 | 10 | 2.79 | 0 | 1.13 | 162.11 | 168.85 |
6 | Justin Verlander | DET | 35 | 35 | 240 | 92 | 269 | 0 | 1 | 19 | 9 | 3.45 | 0 | 1.18 | 160.75 | 161.73 |
7 | Joe Nathan | MIN | 70 | 0 | 68.7 | 16 | 89 | 47 | 0 | 2 | 2 | 2.1 | 0 | 0.93 | 155.08 | 125.52 |
8 | Brian Fuentes | LAA | 65 | 0 | 55 | 24 | 46 | 48 | 0 | 2 | 2 | 3.93 | 1 | 1.40 | 150.39 | 96.59 |
9 | Jered Weaver | LAA | 33 | 33 | 211 | 88 | 174 | 0 | 2 | 16 | 8 | 3.75 | 1 | 1.24 | 137.72 | 123.31 |
10 | Josh Beckett | BOS | 32 | 32 | 212.1 | 91 | 199 | 0 | 2 | 17 | 6 | 3.86 | 0 | 1.19 | 135.42 | 131.55 |
TABLE 1e | ||||||||||||||||
2010 A.L. | ||||||||||||||||
RK | PLAYER | TEAM | G | GS | IP | ER | K | SV | SHO | W | L | ERA | DC | WHIP | CYP(exist) | CYP(adjusted) |
1 | CC Sabathia | NYY | 34 | 34 | 237.7 | 84 | 197 | 0 | 0 | 21 | 7 | 3.18 | 0 | 1.19 | 176.47 | 161.02 |
2 | David Price | TB | 32 | 31 | 208.7 | 63 | 188 | 0 | 1 | 19 | 6 | 2.72 | 0 | 1.19 | 171.61 | 159.11 |
3 | Jon Lester | BOS | 32 | 32 | 208 | 75 | 225 | 0 | 0 | 19 | 9 | 3.25 | 0 | 1.2 | 155.31 | 145.36 |
4 | Rafael Soriano | TB | 64 | 0 | 62.3 | 12 | 57 | 45 | 0 | 3 | 2 | 1.73 | 0 | 0.8 | 153.86 | 121.05 |
5 | Clay Buchholz | BOS | 28 | 28 | 173.7 | 45 | 120 | 0 | 1 | 17 | 7 | 2.33 | 0 | 1.2 | 150.50 | 131.87 |
6 | Felix Hernandez | SEA | 34 | 34 | 249.7 | 63 | 232 | 0 | 1 | 13 | 12 | 2.27 | 0 | 1.06 | 150.06 | 175.74 |
7 | Justin Verlander | DET | 33 | 33 | 224.3 | 84 | 219 | 0 | 0 | 18 | 9 | 3.37 | 0 | 1.16 | 148.86 | 145.83 |
8 | Trevor Cahill | OAK | 30 | 30 | 196.7 | 65 | 118 | 0 | 1 | 18 | 8 | 2.97 | 0 | 1.11 | 147.11 | 133.45 |
9 | Neftali Feliz | TEX | 70 | 0 | 69.3 | 21 | 71 | 40 | 0 | 4 | 3 | 2.73 | 0 | 0.88 | 141.42 | 112.02 |
10 | Joakim Soria | KC | 66 | 0 | 65.7 | 13 | 71 | 43 | 0 | 1 | 2 | 1.78 | 0 | 1.05 | 138.92 | 111.06 |
TABLE 1f | ||||||||||||||||
2010 N.L. | ||||||||||||||||
RK | PLAYER | TEAM | G | GS | IP | ER | K | SV | SHO | W | L | ERA | DC | WHIP | CYP(exist) | CYP(adjusted) |
1 | Roy Halladay | PHI | 33 | 33 | 250.7 | 68 | 219 | 0 | 4 | 21 | 10 | 2.44 | 1 | 1.04 | 211.53 | 209.52 |
2 | Adam Wainwright | STL | 33 | 33 | 230.3 | 62 | 213 | 0 | 2 | 20 | 11 | 2.42 | 0 | 1.05 | 183.69 | 185.09 |
3 | Heath Bell | SD | 67 | 0 | 70 | 15 | 86 | 47 | 0 | 6 | 1 | 1.93 | 0 | 1.2 | 182.56 | 134.59 |
4 | Ubaldo Jimenez | COL | 33 | 33 | 221.7 | 71 | 214 | 0 | 2 | 19 | 8 | 2.88 | 0 | 1.15 | 170.00 | 165.83 |
5 | Billy Wagner | ATL | 71 | 0 | 69.3 | 11 | 104 | 37 | 0 | 7 | 2 | 1.43 | 0 | 0.87 | 166.67 | 135.35 |
6 | Brian Wilson | SF | 70 | 0 | 74.7 | 15 | 93 | 48 | 0 | 3 | 3 | 1.81 | 0 | 1.18 | 166.25 | 128.07 |
7 | Tim Hudson | ATL | 34 | 34 | 228.7 | 72 | 139 | 0 | 0 | 17 | 9 | 2.83 | 0 | 1.15 | 150.64 | 142.54 |
8 | Francisco Cordero | CIN | 75 | 0 | 72.7 | 31 | 59 | 40 | 0 | 6 | 5 | 3.84 | 0 | 1.43 | 140.31 | 90.89 |
9 | Chris Carpenter | STL | 35 | 35 | 235 | 84 | 179 | 0 | 0 | 16 | 9 | 3.22 | 0 | 1.18 | 139.47 | 137.42 |
10 | Carlos Marmol | CHC | 77 | 0 | 77.7 | 22 | 138 | 38 | 0 | 2 | 3 | 2.55 | 0 | 1.18 | 133.67 | 114.05 |
CONCLUSION:
Using the adjusted constants, we have a new Cy Young Predictor formula as shown below:
Cy Young Points (CYP) = ((5*IP/9)-ER) + (K’s/5) + (SV*1.5) + (Shutouts*2) + ((W*3)-(L*2)) + (VB*5) + ((0.5*IP)-(IP*WHIP/3))
Looking at the results from Tables 1c-1f, it’s clear that this formula produces more accurate predictions in the sabermetrics age.
The Red lines in each chart indicate the Cy Young winner from that year. As you can see, the ADJUSTED equation correctly chooses the Cy Young winner from that year and league. Unfortunately, we still have a relatively small sample size with this new trend of voting, so the ADJUSTED equation doesn’t overemphasize the importance of WHIP or completely disregard the value in Wins.
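As a quick sanity check, the Python sketch below scores three of the ten pitchers from the 2010 A.L. table (Table 1e) with both equations and reports who comes out on top under each. The function and variable names are my own, and only three rows are included to keep the example short.

```python
def cyp_exist(p):
    """James' EXISTING equation (no WHIP term)."""
    return ((5 * p["IP"] / 9) - p["ER"] + p["K"] / 12 + p["SV"] * 2.5
            + p["SHO"] + (p["W"] * 6) - (p["L"] * 2) + p["DC"] * 12)


def cyp_adjusted(p):
    """The ADJUSTED equation, including the WHIP term."""
    return ((5 * p["IP"] / 9) - p["ER"] + p["K"] / 5 + p["SV"] * 1.5
            + p["SHO"] * 2 + (p["W"] * 3) - (p["L"] * 2) + p["DC"] * 5
            + (0.5 * p["IP"]) - (p["IP"] * p["WHIP"] / 3))


# Three rows from Table 1e (2010 A.L.)
al_2010 = {
    "CC Sabathia": {"IP": 237.7, "ER": 84, "K": 197, "SV": 0, "SHO": 0,
                    "W": 21, "L": 7, "DC": 0, "WHIP": 1.19},
    "David Price": {"IP": 208.7, "ER": 63, "K": 188, "SV": 0, "SHO": 1,
                    "W": 19, "L": 6, "DC": 0, "WHIP": 1.19},
    "Felix Hernandez": {"IP": 249.7, "ER": 63, "K": 232, "SV": 0, "SHO": 1,
                        "W": 13, "L": 12, "DC": 0, "WHIP": 1.06},
}

top_exist = max(al_2010, key=lambda name: cyp_exist(al_2010[name]))
top_adjusted = max(al_2010, key=lambda name: cyp_adjusted(al_2010[name]))
print("EXISTING pick:", top_exist)
print("ADJUSTED pick:", top_adjusted)
```

Among these three, the EXISTING equation favors Sabathia while the ADJUSTED equation favors Hernandez, who actually won the award.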
As years pass, even this equation will most likely need to be updated to account for the new trends in Cy Young voting. There may be an even greater importance placed on sabermetrics in the future. Only time will tell. For now, this ADJUSTED equation seems fairly accurate at predicting, with given data, who will win the Cy Young award.
In part 2, I will examine the 2011 N.L. Cy Young race with this ADJUSTED Cy Young Predictor and find each pitcher’s probability of winning the Cy Young. Will a Phillies pitcher take it home, or do the odds rest with another N.L. starter?
Written By Todd Drager
Republished With Permission From 7thAndPattison.com