The Issue of "Perfect Balance" and the Misconception of "Part vs Part" Comparisons

So recently there have been a lot of discrepancies among different members on how certain Beyblades interact in certain matchups, which makes it difficult to determine what's viable in a given situation, or how a matchup will typically play out. While it's common knowledge that part performance will vary slightly between individual players (and consequently these players' opinions of the part in question), the degree of variation in performance for Burst parts is much larger than ever before. The two most common causes of these discrepancies are:
  • The differences in "Balance" for Burst parts/combos, which is just a particularly frustrating case of natural product variation and not really something that can be fixed.
  • Judging the viability/usefulness of parts based on "part vs part" matchups rather than based on combos or the concepts behind these combos, which, quite honestly, I think is a fundamentally wrong way of looking at the game.

Perfect Balance backstory, mostly for documentation's sake

...Which brings us to the current (November 5th, 2016) problem of Neptune, Wyvern, Deathscyther, Odin, and Dark Deathscyther. In the past, the established Stamina "food chain" was Deathscyther = Odin > Dark Deathscyther > Neptune > Wyvern. However, more recently balance has become a factor and thrown a wrench in the whole thing, causing drastically different results between players. A prime example of this is 1234beyblade's Wyvern, which has near-perfect balance and is able to outspin any of the other competitive Stamina Layers when they have average-to-above-average balance. Recently some other people have been using more balanced Wyverns as well and are generally able to match Dark Deathscyther in terms of Stamina. While we as a community may not have been aware of it at the time, I'm fairly certain balance was the source of the Deathscyther vs. Odin debate - the winner between the two was probably whichever one was blessed with better balance.

Meanwhile, both of my Wyverns lose to my Dark Deathscyther by a pretty large margin, which is the case for a few others as well. However, my D2 is able to outspin some Deathscyther combos, which isn't possible for others unless D2 is given a significantly stronger launch. My Neptune will usually either barely lose to my D2 or tie with it, while for most others D2 obliterates Neptune. While Unicorn isn't included on the list of conventional Stamina Layers, my Unicorn narrowly beats my Neptune and can beat my Deathscyther in certain matchups, while the majority of other players see Unicorn as garbage. See, the problem here is that balance is all relative - do my D2 and Neptune just have great balance, or do my Deathscyther and Wyverns just suck? Did the majority of competitive players get stuck with a bad Neptune that loses to not only D2, but Wyvern as well? Who is "normal", and who is the outlier?

This raises an even larger headache in the form of "potential stamina", meaning the maximum amount of Stamina a certain part can theoretically have if its balance is indeed perfect. While truly perfect balance is physically unattainable, in a world where all Burst parts had perfect balance and all players had equal strength launches, which parts would really rise to the top? The issue is basically the same as the free-spinning parts in Standard Spin-Equalizing matchups (with zero friction between the free-spinning components being the unattainable equivalent to perfect balance): we have a special note on our top-tier list that basically translates to "assume best possible F230 for F230 combos", and the differences between F230 performances were most likely a large component in the controversy surrounding its ban - for some people, it "just didn't work". So for the Burst tier list, do we do the same thing and "assume best possible Neptune/Wyvern/D2/etc." and have all of them listed as viable options for Stamina, or do we go by "majority results" (which is fundamentally what the top tier list is to begin with) and only have the traditionally higher-end Stamina layers like Deathscyther and Odin listed?



The other issue that's been grinding my gears lately (and is kind of difficult to explain, so bear with me) is the way people view matchups on a "part vs part" basis rather than "combo vs combo" or even a "concept vs concept" basis, and tend to pigeonhole the use cases for most parts into one specific matchup. Maybe this mindset developed due to the lack of formal combo testing threads lately, in which the OP is highly encouraged to not only post tests, but also explain their part choices, the concept behind the combo, and how it is intended to be played. While it's tough to find specific examples for a general mindset that members here tend to perpetuate, a good example off the top of my head is Valkyrie and Deathscyther.

Nearly everyone here will agree that Valkyrie is arguably the best Attack Layer in the game and counters Deathscyther very well, and it's also true that Stationary Valkyrie combos were originally developed as a "safer" way of defeating Stationary Deathscyther combos than using Valkyrie on Accel. However, these two statements have been distorted to the extreme, so that they now sound more along the lines of "Deathscyther's only counter is Valkyrie" and "Stationary Valkyrie only exists as a counter to Deathscyther". Neither of these statements is true, and the numerous offshoot sayings they have spawned ("O2 only beats Odin" is one I can think of off the top of my head) are reflective, in my opinion, of a flawed and rather nonsensical view of the game on a fundamental level. If I were to put it into a MFB context and say something like "Burn's only counter is Lightning" or "Lightning's only purpose competitively is to counter Burn" it would sound completely ridiculous. Plenty of other things are capable of beating Burn/Deathscyther, and Lightning/Valkyrie have myriad other uses aside from taking out Burn/Deathscyther combos.

What I think needs to be done to get out of this mindset is for people to step back and look at the combo as a whole, as the sum of all of its parts, and its intended performance/role/goal rather than just at each part individually. For example, let's take Valkyrie Armed Claw (shameless self combo plug, but it doesn't really change the point I'm trying to make):
  • Valkyrie is the best Layer for Bursting opponents regardless of what they are.
  • Armed is, as far as I am aware, the most Burst-resistant Disk in the game due to its light weight and wide vertical weight distribution (I'll bump my Burst Attack/Defense thread sometime soon with more on this theory), which minimizes the risk of the combo self-Bursting and allows you to launch the carp out of it while not particularly worrying if your opponent Weak Launches.
  • Claw's tip shape gives the combo little friction with the ground, which decreases the tendency to self-Burst, and Claw is also one of the heavier Drivers, which affects the vertical weight distribution. I use Claw over Revolve because Revolve precesses earlier into the match than Claw does, making for weird contact angles that don't focus as much force horizontally on twisting the opponent's Layer.
That reads a lot like the "parts choice" section of a testing thread, and while I did say previously that I wanted the community to get away from focusing on specific parts, it's more about how these parts help the combo do its job rather than how they specifically match up against another part in a different matchup with a bunch of other factors at play. This "job" that the combo is supposed to do is Burst Attack - its method of winning is by Bursting its opponent. This opponent does not have to be Deathscyther, either. It works, to a degree, against anything stationary. Obviously it's going to have an easier time against Deathscyther Gravity Defense than it would against Neptune Knuckle Yielding, but bringing back the MFB analogy, that's no different than Lightning CH120RF beating Burn AD145WD more soundly than it would Earth 100CS. The weaknesses of the combo must also be looked at from a conceptual point of view - more mobile opponents with a high Burst Defense (Orbit combos in particular) will counter VAC, as it is mostly unable to hit them and the hits that do land are less likely to make the opponent skip its teeth. For example, as a response to stationary Valkyrie, people started using mobile combos like Deathscyther Stallers to avoid contact with it.



I didn't really begin this post with a conclusion in mind; I basically just wanted to get both of these issues out on the table. But I guess the point that I'm trying to make is that it seems like people are focusing too much on just parts (Layers in particular) and mistakenly pinning a combo's success or failure on "how one Layer interacts with another" when it's possible that the other parts are affecting the outcome. If I could bring up one specific battle I've had that I think embodies this point completely, it would be my most recent match with Yami.


I guess I also just wanted to bring up the question of how we are going to handle balance issues with parts since they cause such widespread differences in results across the board.
Very good explanation! I definitely agree with you: seeing combos as a whole is a must. Just because, to take your example, Valkyrie is the best Attack Layer in the metagame doesn't mean the other parts are horrible in themselves. Looking at a custom as separate parts is puzzling; it only makes determining whether a combo is effective much harder. Knowing how each part performs, Bladers should of course check whether it all works together as a custom, using knowledge and proper testing. I also agree with you about the lack of combo testing threads; I feel like it might be one of the reasons for this situation. Great thread overall.
Well said, I totally agree. I would like to see more combo testing, not just more part testing.
Great post. I haven't taken the time to thoroughly test the balance of my parts but everyone was really surprised by my red Wyvern's stamina at the previous Toronto tournament. Unfortunately, I don't think there is much that we as an organization can do to mitigate the impact of this; we are somewhat at the whims of Takara-Tomy's product.

I agree with you that there's been a severe reductionism in the way combinations are discussed. While it's mostly just bad habits, I do think it's due in part to the reductive nature of the metagame right now. The MFB metagame did have a somewhat wider breadth of competitive variety at that point than Burst does now, which is why the analogies make MFB sound ridiculous. Or are you saying that we're just not seeing the potential variety in front of us?

I'm also a bit confused by your example matchup; I wouldn't normally expect VAC to do well against D2-anything. In my experience, people pretty much do exclusively use stationary-Valkyrie combos as Deathscyther counters. If you're saying you can consistently beat D2HD with VAC, that's really cool and it would be an outlier in reported experiences so far I think.
Synergy between parts seems to be much more important in Burst than in previous generations. Whether your goal is creating a "perfect balance" Beyblade, optimizing Burst Defense, or succeeding at one or more of the many strategies that Burst makes possible, it's more important than ever that your Beyblade is a single unit, not a stack of parts. The old "compare parts by swapping out one piece" approach is not necessarily a valid method of testing anymore, because the huge differences between Layers may make one part far more viable for that specific combo without meaning that it will work for all combos. Manufacturing variance and natural balance just throw another variable into the equation: as you've pointed out, the same parts do not necessarily grant the same results for everyone.
I find it funny that you bring this up, because I customize my combos as a whole, but by the time I get new Beys, something newer and more interesting usually comes out.

Could you show us a picture or describe this "Balancing Machine" that Wari Bey has?
Awesome post! I wasn't even aware of this phenomenon and now it makes me want to get multiples of the Beys I get. It seems like this would be a good recommendation for all Burst players due to the increased random factor.
(Nov. 08, 2016  7:50 PM)UGottaCetus Wrote: Awesome post! I wasn't even aware of this phenomenon and now it makes me want to get multiples of the Beys I get. It seems like this would be a good recommendation for all Burst players due to the increased random factor.

Or, to anyone sensible, it is a valid reason to just quit Beyblade Burst... Because even if you buy multiple copies and open them, only to find the right mould many boxes later, you cannot even try to sell your other copies because they are opened and people would understand that you are trying to get rid of the bad variations and keep only the good ones...
I will do what I can with what I have

How do you know a part is balanced when you don't have two of the same part to compare?
(Nov. 08, 2016  7:55 PM)Kai-V Wrote:
(Nov. 08, 2016  7:50 PM)UGottaCetus Wrote: Awesome post! I wasn't even aware of this phenomenon and now it makes me want to get multiples of the Beys I get. It seems like this would be a good recommendation for all Burst players due to the increased random factor.

Or, to anyone sensible, it is a valid reason to just quit Beyblade Burst... Because even if you buy multiple copies and open them, only to find the right mould many boxes later, you cannot even try to sell your other copies because they are opened and people would understand that you are trying to get rid of the bad variations and keep only the good ones...

This is not really that new. I bought 12 customize sets containing Virgo in MFB and the solo spin time difference between the best and the worst Virgo Wheel was over a minute. There were variances between each one of them. There's also a history of both visible mold changes and more imperceptible differences between releases, so it's not like manufacturing variance has never affected how people play before.

However what has changed here is that 1) the overall construction method is much looser than previous generations, exacerbating any vibrations caused by poor balance, and 2) many matches now are either mirror matches or decided by split-second out-spins.

Not to make excuses for Burst, which has been thus far the worst-engineered and least-durable Beyblade series by far. I just wanted to bring up that this is a factor that has been a part of Beyblade for a long time, and has likely influenced many more match outcomes than people realize.
At least Virgo was relatively quickly outclassed though, and there were alternatives. You could literally use Libra without having to worry about Virgo, and there were barely notable differences between Libras except the obvious moulds. Now if every single Burst part has balance variations, or at least the possibility of having them, the situation is way different.
(Nov. 08, 2016  8:49 PM)Kai-V Wrote: At least Virgo was relatively quickly outclassed though, and there were alternatives. You could literally use Libra without having to worry about Virgo, and there were barely notable differences between Libras except the obvious moulds. Now if every single Burst part has balance variations, or at least the possibility of having them, the situation is way different.

Is this true or just assumed? It seems unlikely that this was totally unique to Virgo. It's very rare to get the opportunity to compare many different copies of a part in similar condition against each other, so I wonder how often people were actually able to check for things like this. I think it's just more immediately obvious (and of more consequence) in Burst for the reasons I wrote above.

This is an actual question, so if there is testing or other posts about this topic, would love to check them out.
Hm, I am certain that there were no variations observed. Or, every time there were, it was really due to a different mould which was too subtle to be noticeable at first (Gravity), and the variation in performance could be reproduced by anyone who owned that alternative mould. Apart from tips, like people being unable to do well with RDF but others swearing by it for top-tier combinations, I never heard of anyone doubting the performance of a Metal Wheel in a situation where others found it amazing. Usually, it was amazing for everyone, no exception.

The closest similar context I could see is with Synchromes, since two Chrome Wheels could probably rattle if they did not fit well, but I do not remember there being such imbalance issues reported.
Beymazing post, friend! I completely agree that we should see a combo as a whole rather than as separate parts.
(Nov. 08, 2016  10:01 PM)Kai-V Wrote: Hm, I am certain that there were no variations observed. Or, every time there were, it was really due to a different mould which was too subtle to be noticeable at first (Gravity), and the variation in performance could be reproduced by anyone who owned that alternative mould. Apart from tips, like people being unable to do well with RDF but others swearing by it for top-tier combinations, I never heard of anyone doubting the performance of a Metal Wheel in a situation where others found it amazing. Usually, it was amazing for everyone, no exception.

Sure, but establishing whether a part is good or even great is separate from establishing which variances are possible. A bad Virgo still easily outperformed most Wheels at the time for Stamina (except Libra), so even though all players held it in universally high regard there were still huge variations possible. So while all players might have similar results using Libra, and all find it similarly great, that does not mean that all those Libras would perform equally if brought together.

Again though, just because there were variances before does not mean their impact isn't outsized now; it definitely is, especially in the case of Deathscyther, and for the other reasons I already mentioned. Less reliance on stamina-types is pretty much the only answer for this since totally consistent manufacturing isn't possible.
(Nov. 06, 2016  5:53 PM)Bey Brad Wrote: Great post. I haven't taken the time to thoroughly test the balance of my parts but everyone was really surprised by my red Wyvern's stamina at the previous Toronto tournament. Unfortunately, I don't think there is much that we as an organization can do to mitigate the impact of this; we are somewhat at the whims of Takara-Tomy's product.

I agree with you that there's been a severe reductionism in the way combinations are discussed. While it's mostly just bad habits, I do think it's due in part to the reductive nature of the metagame right now. The MFB metagame did have a somewhat wider breadth of competitive variety at that point than Burst does now, which is why the analogies make MFB sound ridiculous. Or are you saying that we're just not seeing the potential variety in front of us?

I'm also a bit confused by your example matchup; I wouldn't normally expect VAC to do well against D2-anything. In my experience, people pretty much do exclusively use stationary-Valkyrie combos as Deathscyther counters. If you're saying you can consistently beat D2HD with VAC, that's really cool and it would be an outlier in reported experiences so far I think.

It's not really about mitigating the variations in Takara-Tomy's product (as that isn't something we can do); it's more about how we are going to handle the vast range of results caused by these variations. It's just that the statement "Deathscyther is the best Stamina Layer, but Wyvern, Neptune, D2, and Unicorn can OS it if they have good balance" is really complicated and doesn't offer a real answer as to what the best Stamina Layer is. It's more like saying "Deathscyther, Wyvern, Neptune, or D2 might be the best Stamina Layer, try them all yourself and see what works for you", and while there have always been cases where certain parts will perform better for some users than others, the balance issues with Burst make this happen to a much larger degree than ever before.

I think that the potential for diversity and variety in the metagame is definitely there; it's just that, due to the lack of testing and the reductive nature of the game, it's not really being explored. People are quick to dismiss new parts in favor of the old without really examining them (V2 being the best example of this IMO), with few exceptions (D2, and to an extent Revolve and Orbit). There's a whole bunch of parts that have never even been informally tested and could potentially be useful; it just seems that people aren't willing to experiment.

I agree that stationary Valkyrie works best against Deathscyther, but that doesn't mean that it's wholly ineffective against other combos and should be relegated to only beating Deathscyther. The way I see it, people are looking at Valkyrie as a "Layer that counters Deathscyther" rather than "an Attack Type Layer".

(Nov. 06, 2016  7:06 PM)Achi-baba Wrote: Could you show us a picture or describe this "Balancing Machine" that Wari Bey has?

I don't know if there are any detailed pictures of it, but Kei mentions it quite a few times in his thread here (just ctrl+f 'balance' and 'device' and you should be able to find information about them relatively easily).

(Nov. 08, 2016  10:01 PM)Kai-V Wrote: Hm, I am certain that there were no variations observed. Or, every time there were, it was really due to a different mould which was too subtle to be noticeable at first (Gravity), and the variation in performance could be reproduced by anyone who owned that alternative mould. Apart from tips, like people being unable to do well with RDF but others swearing by it for top-tier combinations, I never heard of anyone doubting the performance of a Metal Wheel in a situation where others found it amazing. Usually, it was amazing for everyone, no exception.

The closest similar context I could see is with Synchromes, since two Chrome Wheels could probably rattle if they did not fit well, but I do not remember there being such imbalance issues reported.

I think that just because no variations were observed doesn't mean that they didn't exist. Even if balance wasn't explicitly acknowledged, the procedure for Stamina tests has you switch out the control parts between the two Beyblades to make sure that variations between them aren't the reasons behind your results.

"Perfect Balance" Synchromes are most definitely a thing, although they were never discussed in depth here as far as I am aware. I know Cake has a "Perfect Balance" Balro Bregirados pair that, according to him, spins a good three minutes longer than any other combination of duplicates he has of those wheels. TheBeyNinja also won third in a Standard tournament last May using a Revizer Killerken TR145WD with exceptional balance, which outspun several combos that it normally would have lost to like Duo B:D and Dragooon B:D.

As for doubting the performance of the Metal Wheel, there have been multiple pages of discussion in Limited threads regarding which select few Wheels would be considered Top-Tier: there were the obvious choices that just seem to work well for everyone like Wyvang, Omega, Lightning, and Dark Knight, and then a whole slew of others that worked incredibly well for some people, but not so well for others, like Screw, Pegasis, Vulcan, Beat, Phoenic, Quetzalcoatl, and Cosmic.
Quote: I think that the potential for diversity and variety in the metagame is definitely there; it's just that, due to the lack of testing and the reductive nature of the game, it's not really being explored. People are quick to dismiss new parts in favor of the old without really examining them (V2 being the best example of this IMO), with few exceptions (D2, and to an extent Revolve and Orbit). There's a whole bunch of parts that have never even been informally tested and could potentially be useful; it just seems that people aren't willing to experiment.

In my experience, my results with V2 have just been too mediocre or inconsistent to be worth writing about here, and I think a lot of people are in the same bucket. I think I was the first person to start talking about D2 at Beyblade North, after the main event — it's crazy that it wasn't really used at all at Beyblade North — and people caught onto that, like they did with OHD, because the results are just provably good. People coalesce around the combos that perform consistently well with minimal effort.

We have seen some use of V2, but I think most of the players who opt to not use V2 aren't using Valkyrie that much, either. It's just that neither of them breaks the threshold of power and versatility to be considered safe enough to play compared to the other options, most of the time, for most players. I'm one of those for now, although my inability to play at home against D2 for now has been hampering my ability to work on new strategies. (Thankfully my RBV4 just arrived!)

You mention that parts haven't been informally tested, but how would you know? There's been discussion here about the death of testing; in my case, I don't really know what to do with my observations that don't have formal testing to back them up. I don't want to share them here for fear of being misleading.

As a community there hasn't ever been much collaboration towards discovering new combos, because sharing uncertain hypotheses and incomplete results would be necessary for that. But is that the kind of thing we should be doing more of?
I do think that sharing hypotheses and incomplete results is the type of thing that we should be doing more of. I feel like testing is dead because a lot of people won't be bothered to go through all those rounds of formal testing. Talking about informal testing gets a discussion going. Based on those discussions, people try out different combos and see what they like and don't like. Honestly, from there the winning combinations thread will be all the results needed.

It doesn't make sense to me that it is absolutely necessary for someone to have done 20 rounds of testing before being able to discuss a combo they think is good.
I go for ten rounds with Burst these days, but really, you cannot just ignore the fact that fewer than that is extremely inaccurate. I have no issues with people sharing their informal observations. However, not only can we not trust everyone, as Kaneki's fraud demonstrated, but you should also at least tell me how many times and against which series of combinations you tested a part before telling me that it is just decent or that it has potential in Attack or something. Those are the key elements from formal testing that should at least be mentioned in informal comments/observations for anyone to be able to take them seriously. Of course there are people like Kei whom I trust when they post their observations without specifying all of that, but with anyone else, I really have no guarantee that you did not just test it against two customizations and with only one or two different setups before forming your opinion on it. And now, as this thread also highlights, what if the combination you tested a part in, which you do not even mention, was actually not balanced at all?
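
To put a rough number on how uncertain a small test is, here is a minimal Python sketch. It is only an illustration under stated assumptions: each round is treated as an independent win/loss with a fixed win chance (real matches also depend on launch strength, part wear, and balance), and the Wilson score interval is a standard statistical tool that nobody in this thread proposed. The function name `wilson_interval` and the sample results are made up for the example.

```python
import math


def wilson_interval(wins, rounds, z=1.96):
    """Approximate 95% Wilson score interval for a win rate.

    Assumption: every round is an independent win/loss with the same
    underlying win chance. z=1.96 corresponds to roughly 95% confidence.
    """
    if rounds <= 0:
        raise ValueError("rounds must be positive")
    if not 0 <= wins <= rounds:
        raise ValueError("wins must be between 0 and rounds")
    p = wins / rounds
    denom = 1 + z * z / rounds
    center = (p + z * z / (2 * rounds)) / denom
    half = z * math.sqrt(p * (1 - p) / rounds + z * z / (4 * rounds * rounds)) / denom
    return center - half, center + half


# Hypothetical results with a similar observed win rate but different test lengths.
for wins, rounds in [(2, 3), (7, 10), (14, 20)]:
    low, high = wilson_interval(wins, rounds)
    print(f"{wins}/{rounds} wins: plausible win rate roughly {low:.0%} to {high:.0%}")
```

Under these assumptions, a 7-3 result over ten rounds is still compatible with an even matchup, and going to twenty rounds narrows the range without making it tight. That is one way to see why stating the number of rounds and the opponents matters even for informal observations.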
I think ten rounds is fine, but automatically invalidating someone's informal observation because of a lack of formal testing kind of stifles the discussion of the observation. Is withholding informal observations for fear of being misleading really better than having the conversation that the post would spark?

Look at a part like Dark Deathscyther. That part saw tournament use before any formal testing indicated that it was that good, simply from people discussing their observations from informal/unranked matches.