About Class Balance.
Those of you who read my first ever blog post will remember that I gave an account of my life as a WoW player since Ulduar while simultaneously explaining the ups and downs of the Enhancement spec. At that time, I was a 5-year veteran of the Shaman class. Needless to say to anyone who’s raided as the Enhancement specialization since that era, we’ve seemingly had more downs than we have had ups. In fact, some might be tempted to just call the entire past three years one large valley in our spec’s representation and value in heroic raids. But just how bad was it?
Well, to be able to answer that question, we have to answer a few different ones. How do you determine how each class is doing relative to the others? Where do I find the information necessary to determine this? What, exactly, is “balance”, anyway? This is my take on it.
- Balance is when the numbers of each spec are so close that resources like SimulationCraft and raidbots.com show almost no disparity between the best and worst specs in WoW. This is the most visible form of balance. This is what I would call “e-peen-o-balance”, where your placement on the WoL or damage meters is directly in proportion to your superiority over other players, but it requires that everyone’s favored spec is among the top 5 on average. Makes sense, right?
- Balance is when you achieve physical and mental perfection and transcend your lesser, mortal form and attain that of a greater being. One with antlers, feathers, and great girth. One with a dance that mesmerizes, and a voice of such beauty, mere mortals are unable to fully comprehend it. This is what I would call “balance on hallucinogens”.
- Balance is when all of the members of each role in WoW are of equal value to their raid group in a progression situation. This is what I would call “effective balance”.
As hilarious as it would be, somehow I doubt the developers have goals for us all to ascend to the form of an owl beast, so that leaves the first and third options. While I make fun of option one, it’s the easiest to measure while having the most third-party support to do so, and therefore becomes the most important aspect of balance to the vast majority of players who are vocal about the topic. This is understandable, and in an ideal world, the numbers of each DPS class would be close enough for skill to be reflected on the meters and for mechanics to favor the strengths of each class and spec evenly across a raid zone.
Contrary to what many, maybe even most, players would consider to be the appropriate direction, Blizzard did not attempt to achieve “e-peen-o-balance” in classic WoW or Burning Crusade. In Wrath of the Lich King, their philosophy shifted a bit toward equality between specs on damage charts, and in Cataclysm and Mists of Pandaria that has been iterated on. Here’s a look at how players’ perception of class balance lines up with Blizzard’s design philosophy.
Class balance data, and how not to read it.
There are several common mistakes players make in Class Balance discussions.
First, please consider www.raidbots.com. The default page for raidbots.com (DPS Bot) compares the top 100 combat logs for each spec in World of Warcraft, and it retrieves this data from www.worldoflogs.com. It’s real data from real sources. You can configure what information the graph shows you with many different options.
Many players have linked this page directly as “proof” that their class is underpowered, or that some other class is overpowered. And that’s a problem.
The default option “Top 100” on raidbots.com uses an extremely small sample size. There are ten million people playing World of Warcraft … picking only 100 of them certainly won’t accurately represent the whole. There are more significant problems with it than that though, problems that are basic knowledge to anyone with a background in statistics, but may or may not be easily seen by those who don’t have such a background. The real reason the top 100 parses aren’t an accurate representation of balance is that they’re subject to three things that can skew the data – RNG that varies by class and spec, different numbers of skilled players playing each spec and class, and different returns for players of different skill levels.
Refer to this graph:
What does this graph mean?
Well, not a whole lot. I made up 3 generic bell curves and put them on the same graph. In other words, it’s fake data, and it in no way reflects reality … but hopefully some people can learn something from it.
The dark blue represents Shaman damage, the light blue represents the damage of a Mage, and the green represents Hunter damage.
As you can see in the graph, the Shaman and Mage lines peak at the same number – 20,000 DPS. That is the average damage per second these specs will manage on a boss fight. The Mage line is much wider, however, meaning that when the player is subject to bad RNG or is unskilled, his or her damage will be much lower as a Mage than as a Shaman, and likewise with high skill and/or good RNG, the Mage will be much higher. In other words, the Mage has a higher standard deviation, but the average works out the same. The Hunter averages slightly higher than the other two, but the Hunter’s standard deviation is also much lower than the Mage’s.
Here, I’ve highlighted what might be roughly the 95th percentile for each class in red.
So we’ve determined that, based on my graph of fake data, Shaman and Mage damage is equal at 20,000 DPS average, and Hunter damage averages 21,000. But if you use only the parses above the 95th percentile, the Mage is going to appear to be averaging over 25,000 DPS, the Hunter at 23,500 or so, and the Shaman at 22,500, which is obviously a very inaccurate representation of what this data suggests.
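To put rough numbers on that, here is a minimal Python sketch. It assumes each curve is a normal distribution, and the standard deviations (1,200, 2,500 and 1,200) are hypothetical values I picked to resemble the made-up graph; nothing here is measured data. The helper `top_tail_mean` is just an illustrative name.

```python
from statistics import NormalDist

# Hypothetical (mean, standard deviation) pairs resembling the fake curves.
SPECS = {
    "Shaman": NormalDist(mu=20_000, sigma=1_200),
    "Mage": NormalDist(mu=20_000, sigma=2_500),
    "Hunter": NormalDist(mu=21_000, sigma=1_200),
}

def top_tail_mean(dist, percentile=0.95):
    """Average of the values above the given percentile of a normal curve."""
    standard = NormalDist()
    z = standard.inv_cdf(percentile)
    # Mean of a normal distribution truncated below at z standard deviations.
    return dist.mean + dist.stdev * standard.pdf(z) / (1 - percentile)

tail_means = {name: top_tail_mean(dist) for name, dist in SPECS.items()}
for name, dist in SPECS.items():
    print(f"{name}: true mean {dist.mean:,.0f}, top 5% mean {tail_means[name]:,.0f}")
```

With these assumed spreads, the top 5% averages come out to roughly 22,500 for the Shaman, 25,200 for the Mage and 23,500 for the Hunter, even though the Shaman and Mage have identical true averages. The ordering in the tail is driven by the spread, not by the mean.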
Using a percentile instead of a set number of logs partially solves the problem of class and spec representation numbers, but not fully. Players tend to shift to specs with the perceived highest damage potential, which skews the data, and the other problems with RNG and skill factors remain.
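The representation problem with a fixed "top 100" can be sketched the same way. This is a toy simulation under stated assumptions: both specs draw from the exact same normal distribution (20,000 average, 2,000 standard deviation), and the only difference is how many parses exist for each. The population sizes, seeds and the `top_n_mean` helper are all made up for illustration.

```python
import random

def top_n_mean(population_size, n=100, mu=20_000, sigma=2_000, seed=0):
    """Average of the n highest parses drawn from one shared distribution."""
    rng = random.Random(seed)
    parses = [rng.gauss(mu, sigma) for _ in range(population_size)]
    top = sorted(parses, reverse=True)[:n]
    return sum(top) / n

# Same spec strength, very different numbers of players logging parses.
rare_spec = top_n_mean(population_size=1_000, seed=1)
popular_spec = top_n_mean(population_size=20_000, seed=2)

print(f"Top 100 of  1,000 parses: {rare_spec:,.0f}")
print(f"Top 100 of 20,000 parses: {popular_spec:,.0f}")
```

The popular spec's top 100 lands well above the rare spec's, purely because 100 parses out of 20,000 reach much further into the tail than 100 out of 1,000. A percentile cut removes this particular effect, which is why it only partially solves the problem.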
So neither a set number of data points (top 100) nor a set percentage of data points (95th percentile) tells the whole story. Does using the “All Parses” option do it? Well, it’s a little better than the previous two options, but it’s still subject to sample size variance. You also start getting into parses where players of certain specs are performing a vital role that’s far more important than just putting up numbers (such as Arcane Mages frequently being assigned to breaking Face Rage on Shannox, which I’ll go over later), as well as parses from players with ungemmed, unenchanted, and unreforged gear, poor specs, little to no knowledge of how to play, disconnects, deaths, lack of basic understanding of encounter mechanics and so on. Put simply, the PvE game is not, and should not be, balanced around these variables, but the “All Parses” option includes them. All of these options have different forms of sample bias.
Holy crap. That’s an awful lot of reasons raidbots.com isn’t a great source of information about class balance. Not knocking Seriallos at all – they’ve put a lot of work into an absolutely amazing tool for the WoW community! (Epeen Bot is amazing for seeing how you perform compared to other members of your spec, and Compare Bot is an excellent way to get a quick summary of a combat log parse for those unfamiliar with navigating World of Logs, or comparing two parses.) But even they understand class balance is more nuanced than DPS Bot rankings, as you can see from the comments at the bottom of the main page. The Spec Scores on raidbots.com under all parses are the most accurate representation of class DPS balance you’re going to find on the internet … despite not being very accurate at all.
But there’s more.
This part really depends on what you view as true balance. Typically, in World of Logs rankings, players aren’t attempting to kill a boss as efficiently as they can. They are attempting to top the charts, which often involves playing inefficiently, attacking targets that don’t need to die, coordinating with other raiders in the group to ensure higher numbers, or “abusing” certain mechanics to inflate their numbers.
When you’re first learning an encounter, doing these kinds of things not only doesn’t help, but will often hurt your chances of success or ability to learn the encounter. A common example in Firelands might’ve been putting DoTs or using cleave abilities on Riplimb and Rageface while fighting Shannox on Heroic. This damage does literally nothing to improve your chances of success or work toward a kill.
Another example is Heroic Madness of Deathwing. Early on, many groups abused “Spellweave” to aid in killing important targets such as Mutated Corruption or Corrupting Parasite. After many guilds became comfortable with the encounter, they would often assign one or two people to attacking the bloods the entire fight, and order everyone else to not do so. This allows the bloods to keep healing, so the one or two players attacking them can abuse Spellweave for very high DPS over the course of the fight … most of which did nothing. The world first kill of Madness had an Enhancement Shaman present in the lineup. Excaliber from KIN Raiders had the highest damage done of everyone in the raid. Does that automatically validate Enhancement as an amazing DPS spec? Well… no. Due to early strategies often not killing the bloods for quite some time, and the Flame Shock spreading mechanic of Lava Lash, a great deal of that damage was meaningless Spellweave damage that contributed very little real benefit. It just looked good on the charts, nothing more.
So should damage that doesn’t count when it comes to completing a fight count in balance discussions? You’re free to disagree or believe what you want, but damage that doesn’t matter quite frankly doesn’t matter. It, unfortunately, comes up all the time as valid damage despite the fact that it did nothing. That Shadow Priest who did 56,000 DPS on Shannox in Firelands very well may have done 25,000 of it to the dogs – targets that did not die or need to be damaged by the Priest, and while there was some benefit to a Shadow Priest keeping Shadow Word: Pain rolling on additional targets for more Shadow Orbs, Vampiric Touch on any more than your primary target would be nothing but wasted damage. Meanwhile, the Arcane Mage’s damage is very low due to dealing with breaking Face Rage, but his or her raid value is very high due to it being a vital role. Because Mages were nearly always the ones chosen for that particular job, their numbers as shown on DPS Bot were very, very low. It wasn’t an even remotely accurate representation of what Arcane Mages were capable of doing on that fight.
All damage done, be it effective damage furthering your progress on the target you’re working on or not, “counts” on World of Logs rankings, and that data is obviously also reflected on DPS Bot. Because it shows on these rankings, players frequently take it at face value. According to many forum-goers, the DPS(e) column on World of Logs is the one that matters. Despite the column quite literally standing for “Damage Per Second (effective)”, it really truly isn’t telling you the whole story, and it’s rarely 100% effective damage. When talking about class balance, this needs to be factored in.
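As a rough illustration of the gap between logged damage and damage that matters, here is a small Python sketch of the Shadow Priest example. The per-target split between Riplimb and Rageface is hypothetical (only the 56,000 total and the 25,000 on the dogs come from the example above), and treating Shannox as the only target that matters is this article's assumption, not something World of Logs reports.

```python
# Hypothetical per-target DPS breakdown for one Shadow Priest parse on Shannox.
dps_by_target = {"Shannox": 31_000, "Riplimb": 13_000, "Rageface": 12_000}

# Targets whose damage actually moves the kill forward (an assumption).
targets_that_matter = {"Shannox"}

def split_dps(dps_by_target, targets_that_matter):
    """Return (logged total, useful portion, padding) for one parse."""
    total = sum(dps_by_target.values())
    useful = sum(
        dps for target, dps in dps_by_target.items()
        if target in targets_that_matter
    )
    return total, useful, total - useful

total, useful, padding = split_dps(dps_by_target, targets_that_matter)
print(f"Logged: {total:,}  Useful: {useful:,}  Padding: {padding:,}")
```

The rankings would record this parse at 56,000, while only 31,000 of it went into the target that had to die.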
Similarly, healing is even less about the HPS you see on logs than damage is about DPS. Splash heals, heal over time effects, area heals – these are valuable tools to healers, but they also contribute to a lot of overhealing. The Shaman class has often been among the lowest in effective healing done in logs and on healing charts, but it’s really a much more complex issue than HPS on World of Logs. Classes that are heavy on Heal over Time effects, namely Restoration Druids, Holy Priests, and now Monks, tend to have very high HPS numbers. How much of that healing is truly effective healing? It’s not really a secret that HoTs do a lot of overhealing, but they also tend to contribute to other spells doing it as well. If I heal a target for 80,000 with Greater Healing Wave with 20,000 overheal, the Rejuvenation that ticked for 5000 half a second earlier may as well be considered as overhealing and its healing done added to the effective heal of the Greater Healing Wave… because that Rejuv didn’t need to be there to top that player off, but the Greater Healing Wave did. The problem is there’s no efficient way for World of Logs to tell you these things. There is also no efficient way for World of Logs to tell you when your heals have saved lives, and that’s the most important part. With the Restoration Shaman mastery being what it is, their lives saved are not very effectively reflected on the healing meters. Even if the meters show a Druid doing 20-30% more healing than you are as a Shaman, it doesn’t mean he or she is saving 20-30% more lives or being 20-30% more effective or valuable to your group.
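The Greater Healing Wave and Rejuvenation example can be written out as a small Python sketch. It reads the example as 80,000 effective healing plus 20,000 overheal (that reading is an assumption), and the `reattribute` function is a hypothetical way of doing the bookkeeping by hand, not anything World of Logs offers.

```python
def reattribute(hot_effective, direct_effective, direct_overheal):
    """Move HoT healing that the later direct heal would have covered anyway."""
    # The direct heal could have absorbed at most its own overheal amount.
    redundant = min(hot_effective, direct_overheal)
    return {
        "hot_effective": hot_effective - redundant,
        "direct_effective": direct_effective + redundant,
        "direct_overheal": direct_overheal - redundant,
    }

adjusted = reattribute(
    hot_effective=5_000,       # Rejuvenation tick half a second earlier
    direct_effective=80_000,   # Greater Healing Wave, effective portion
    direct_overheal=20_000,    # Greater Healing Wave, overheal portion
)
print(adjusted)
```

Under that reading, the Rejuvenation tick drops to 0 effective healing and the Greater Healing Wave rises to 85,000 with 15,000 overheal. The logs would credit the Druid with 5,000 that, in this scenario, changed nothing.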
The last “resource” that is often used is SimulationCraft. It is often cited in class balance discussions, but it also doesn’t make for a very strong argument for several reasons. First, the simulations most frequently cited are Patchwerk fights, and … those don’t exist very often. When they do, they are the outliers, not the norm. Between damage taken having an impact on damage done for some (Moonkins, Warriors, Shamans with Lightning Shield) and many boss abilities causing movement, target switching, AoE, vulnerability phases and so on, a Patchwerk sim just isn’t going to reflect real combat.
Even if each spec is modeled perfectly, a sim is still only as good as the user. The user needs to configure the simulation to fit the situation they are in to get accurate results – but even then, it’s only going to really be accurate if your spec is modeled very well (it varies from spec to spec) and if all players perform at the same level. Some specs punish mistakes more than others, and even the best players are going to mess something up sometimes.
Is DPS(e) on World of Logs the true end all be all of class balance, or should class balance be based on your value to a raid in a progression situation, which is much more nuanced and difficult to get a truly objective picture of? Let me know what you think.
“Different classes are different”
As we all know, classes are designed to feel and play differently. That includes nuances (players prefer to replace nuance with the term “clunky”) and each class has inherent advantages and disadvantages, but most of them have lessened in PvE settings over the years. For example, many classes have gained AoE capabilities or damage cooldowns that they didn’t originally have. So what are the inherent advantages and disadvantages classes can have? There are six important areas of damage dealing, and each class’s capability in these areas varies. I’ve included a rating for Enhancement’s performance in each area – 0 means we’re effectively incapable of it and it’s a huge disadvantage; 5 means it’s an area of strength for us.
Ramp up time: This can apply to single target and AoE damage. The damage of some classes just happens right away, while the damage of others takes some time to build up. Rogues have, historically, had a long ramp up period. Warriors have never had a meaningful ramp up period.
Enhancement has practically no ramp-up time on a single target. Lava Lash is not a truly large source of our damage, so Searing Flames not providing its full benefit isn’t a big deal for a single Lava Lash, and that’s only going to be a problem at the start of the fight. Otherwise, we have very little ramp up. Some complain about it occasionally, but all I have to say to that is “play any other melee spec in the game and you’ll see how good we really have it.” WotLK: 3/5, Cata: 4/5, Mists: 5/5.
Pets: This includes guardians and anything a player summons that attacks a target, like totems or Doomguard or Force of Nature. The advantages of permanent pets are few, but they exist. Pets can sometimes attack when a player cannot do so easily, such as during Morchok’s Black Blood of the Earth phase, but those situations are not common. While they sometimes have some AoE effects or other useful abilities, a pet is, essentially, a glorified DoT with legs and a one-target limit.
Enhancement relies heavily on Elementals, Searing Totem and, to a lesser extent, Feral Spirit. With that come the disadvantages of being a single target DoT with a fair amount of maintenance. Even so, it’s not often a very big disadvantage (Al’akir may be an exception) and can occasionally be an advantage. 3/5.
Damage Over Time effects: Some of these happen automatically, but the ones I’m referring to are those that are cast. Corruption, Flame Shock, diseases. These do low damage per second, but have very high damage for their cast/execution time. Applying them to multiple targets will eventually yield very high damage per second, but it comes at a cost of very long ramp up time. Similar to pets, they can continue during movement and when you can’t attack. They can also be abused for “free damage” in many situations, which looks good on damage charts but often has no positive impact on combat.
Enhancement has Flame Shock, and can choose the Primal Elementalist talent for Immolate while Fire Elemental is up. These aren’t powerful enough to provide a huge benefit, but Flame Shock will nearly always be on all nearby targets due to Lava Lash. WotLK and Cata: 2/5, Mists: 3/5.
Area damage: Area damage comes in many forms, from small numbers of targets like Cleave, Chain Lightning or Blade Flurry, to damaging everything like Blizzard, Rain of Fire, Fire Nova etc. Area damage comes at different costs, and for some classes, it can even be a single target gain to be able to deal AoE damage. Enhancement AoE has a long ramp up time, but it’s one of the few that can increase our damage to our primary target, and is very good over long periods of time.
This has long been a major complaint of Enhancement and Elemental Shamans, and it is evident the developers have listened due to the constant tweaks and three AoE redesigns since Ulduar. The current incarnation of our AoE damage continues to receive a lot of complaints, but it’s honestly among the very best. The ramp-up isn’t great, as it doesn’t start to take effect until you are able to apply Flame Shock, Lava Lash, then finally Fire Nova, and that can be an issue in short AoE bursts or if your target ends up being killed quickly for whatever reason. Honestly, though, if things are dying that fast, there’s no need for you to contribute and your group is better served by continuing to single target. Our AoE on 5 targets that live for more than a couple seconds is very good, and it only gets better as more targets and more time are added. Chain Lightning’s interactions with Echo make it a very powerful spell as well. WotLK and pre 4.3 Cata: 1/5, 4.3 and Mists: 4/5. The only reason I can’t give 5/5 is because of those times your Flame Shock target dies before you can spread it.
Burst damage: This is probably the single most important advantage a damage dealer can bring to the table. Having the ability to control when large parts of your damage get done is extremely valuable, as has been shown by numerous encounters over the past expansion, like Magmaw, Chimaeron, Maloriak, Ascendant Council, Cho’gall, Alysrazor, Ragnaros, Hagara, Spine of Deathwing and so on. It allows you to meet DPS checks in dangerous phases, it allows you to kill dangerous adds faster, it allows you to take great advantage of vulnerable points, and when it comes to the meters, it puts you on top at the start, so you are on top on short fights and still do respectable damage on the long ones. Bursts of damage are flexible, they are useful, and this is frequently one of the most important considerations for top tier progression.
Throughout Cataclysm, we had none. With the addition of Ascendance, Fire Elemental applying Searing Flames, Feral Spirit receiving a buff, Stormlash Totem being added, and Elemental Mastery being a talent choice, we’ve shifted from the very worst to the very best. Elemental may rival or be slightly better than Enhancement and some other specs can burst at insanely high levels, but Stormlash keeps Shamans ahead. WotLK: 3/5, Cata: 0/5, Mists: 5/5.
Mobility and Ranged damage: These are very important for most raid encounters, and have often left melee as less than desirable for cutting edge guilds. Ranged allows for fast target switching and the ability to damage targets from safe places, and one of the most important parts of playing a melee character is finding ways to stay on target as much as possible.
This is a weak point for nearly all melee. Only the specs with charge abilities (Warrior, Feral) are going to perform well with a lot of movement over longer distances. We have pretty decent and mobile ranged damage with Glyph of Unleash Lightning though, so it could be worse. You won’t see any melee above 3/5 here. WotLK: 1/5, Cata and Mists: 2/5.
Balance is not easy.
So, back to my point. I’ve written an awful lot and can finally give a subjective answer to the question. Yep, just my opinion, no facts. How bad was it for Enhancement in Wrath and Cataclysm?
As Vixsin stated well over a year ago, Shaman DPS spots had been disappearing in cutting edge guilds as our unique buffs were handed out to everyone else and we were given nothing to compensate. Every resource available has shown Shamans as among the lowest DPS in the game. Be it logs of world first kills, World of Logs rankings, data compiled by Raidbots, SimulationCraft results, or anything else, what we’ve seen has been depressing. It’s hard to believe that it’s really not the full story when nearly everything we see agrees that we weren’t very good. But it isn’t the full story. Enhancement certainly seemed underpowered to me through Trial of the Crusade, Icecrown Citadel, heroic tier 11 and Firelands. Blizzard even admitted after the fact that our damage was probably a little low through the latter half of Wrath… but not by much. Our damage on its own has been very close to fine, and for the vast majority of players, our deficiencies could be made up for by simply playing a little better. Our problem has merely been a lack of beneficial ways of dealing damage in cutting edge scenarios, such as burst damage, AoE, and so on. That problem has been solved for us in Mists of Pandaria.
As far as determining how well balanced classes are goes, I haven’t even begun to factor in some other very important aspects of balance. Utility is perhaps the most difficult and most variable from situation to situation. Group healing effects, damage reduction effects, immunities, movement abilities, kiting abilities, interrupt abilities… what is the value of each of these forms of utility in a PvE setting and how do you determine when classes are and are not balanced in this area? Many players don’t even consider these things – the new talent trees receive a lot of criticism because not every talent has a direct impact on your most basic primary role in a group, but having enough powerful utility abilities to take care of the many situations dungeon and raid encounters throw at you is often paramount to success. How much emphasis needs to be placed on balancing utility in a PvE setting?
So how does Blizzard do it? Honestly, exactly how they say they do it. They look at everything, all data from all credible sources, especially their own. They make informed decisions after truly analyzing the data that we as players only get to see some of. Popular raid compositions, PvP team compositions, representation in all areas, it’s all considered, and all of it has varying levels of impact on their choices made. The right moves to make are simply a lot less obvious than players like to believe. It’s very easy to find data that appears to support a point on the surface, but finding a true representation of balance isn’t so easy.
Class Balance is a difficult job. Yes, it’s really this complicated, and I’ve barely scratched the surface. When you factor in how the tiniest adjustments can have a wild unforeseen impact on PvP balance, the effects of gear, set bonuses and trinket effects on top of such a wide variance in player skill levels and the simple fact that the collective intelligence of 10 million players is much higher than that of 150 developers when it comes to discovering new tricks and new ways to use abilities that aren’t expected, yeah, class balance isn’t easy. Not only that, but having a truly meaningful discussion on class balance requires those involved to not only see the data that is available, but analyze it and understand what the data means. The data doesn’t always mean what it appears to say. And despite the perception of many that says otherwise, it really is very impressive just how good Blizzard has been at balance recently, and how much better they are getting as time goes by.