PSA: Why You Should Care Less About Testing

It has come to my attention that the community in general has the wrong view on testing. To put it simply, testing should not be the be-all and end-all for parts. Parts are commonly dismissed on the strength of their initial tests, without any real depth or analysis behind the verdict, and overlooked parts have been a trend as a result. War Bear – an Attack Ring that has been seeing use in Dragon Saucer-based combinations – was shoved under the rug because it was tested for the wrong reasons. Another example is the Layer Wyvern: on release it was considered bad, but after tournament reports, people found it can be used efficiently in stalling combos.

What seems to be the problem is peer influence. A highly ranked or highly regarded player saying a part is bad carries far more weight than the same claim from someone less well known. What we, the community, should be doing is seeing each other as equals. Right now, it’s very hard to get any traction with your tests if you aren’t a reputable member. This has definitely been a thing for a while, but I feel like someone eventually has to say something about it. It’s true that people should give more weight to those who often bring good, valuable information to the table. However, that shouldn’t mean neglecting lower-profile users. Everyone’s opinion should be considered, and yes – sometimes people do have really strange or unorthodox ideas that don’t work at all when you think about them in terms of Beyblade physics. To put it simply: if someone has solid backing for why a part might work, you should definitely try it for yourself. As it stands, even a SUGGESTION to use a part in certain combinations gets shot down because a higher-profile member said it was bad a long time ago.

I’ve come to the conclusion that the WBO has lost sight of what these tests mean. Testing should be seen as a way to get a general, rough approximation of a part’s performance. Relying on tests alone is, in my eyes, playing Beyblade wrong, yet tests have gained more weight than they should have. Sure, testing is good for initial part releases, as there are no tournament results for them yet. But beyond that, you should really rely on tournament records, which most people know are the more reputable source anyway.

I have also observed that it’s hard to gain any traction or offer any insight without testing. It’s come to the point where you can’t even suggest parts to use, or say which combos have been working well for you, without tests involved. Besides, a mere 20 rounds won’t do much to determine a part’s competitive worth. Having said all of this, yes – I understand that single tests aren’t worth much on their own, and that real testing means finding patterns across multiple users’ results. That’s something almost everyone should pick up on.

As for the testing format, I feel it should be changed somewhat. How, I don’t know, but the current format does not work. Testing should involve more rounds, but with how quickly Burst products wear, it would be VERY discouraging to even suggest that. It’s fairly common for Burst parts to get only one test and have that be the end of it. Even then, people can often tell a part’s performance without public testing in the first place, as seen with Unicorn. Beyond that, though, testing itself is very difficult and has too many uncontrolled factors in each case to be considered particularly valid – which, again, is why it should be taken far less seriously than it is right now.

In conclusion, I’d like the community as a whole to take a step back and realize that we don’t need to treat testing as set in stone. This post isn’t about telling people to stop testing, so don’t get that idea. I just want everyone to look at testing in a different light. People often have wildly varying success with the same parts. Testing as a whole is too vague, and the current format for it is honestly not very good.
Yeah, I definitely agree that testing seems to be taken too seriously. Testing is, as you said, a vague idea of how a part might work in competition. But as we know, it is near impossible to fully remove all of the variables except the one you are testing, and even if you can, unforeseen variables get thrown in at a tournament. I think testing should only ever be viewed as a jumping-off point to give you some ideas for your own personal testing, and then even your own personal testing should be seen as a jumping-off point for tournament strategies.
While I fully agree that the current standard method of testing isn't perfect, I don't really think there is much we can do to change it. Requiring more rounds would unfortunately mean more wear and tear, time, and effort. I think if a proper amount were required, say 50 rounds, a lot fewer people would test at all. We really only have a handful of people who test as it is, and while tests should never ever be seen as the word of god, they do serve some purpose.
I think the best solution would be just for more people to test, and to test with 2 people if at all possible. That way we get to see results from a lot of different perspectives and add to the sample set (the quick sketch below shows how much that helps). A part should never be considered good or bad based on one 20-round test. Ever. A part shouldn't even fully be seen as good or bad from 5 tests by one person, or even a handful of people. Especially with Burst, there are tons of variables that could change the outcome of a battle.
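To put a rough number on the sample-set point, here's a quick back-of-the-envelope sketch. It treats battles like independent coin flips, which they obviously aren't, so treat the numbers as optimistic; the 60% win rate and the round counts are just made-up examples.

```python
# Approximate 95% margin of error on an observed win rate, assuming a
# naive binomial model (a simplification: independent, identical rounds).
import math

def margin_of_error(win_rate: float, rounds: int, z: float = 1.96) -> float:
    return z * math.sqrt(win_rate * (1 - win_rate) / rounds)

for n in (20, 50, 100):
    print(f"{n:>3} rounds at 60% observed: roughly +/- {margin_of_error(0.6, n):.0%}")
```

A single 20-round test leaves a margin of error of roughly twenty percentage points; pooling five testers' 20-round sets cuts it roughly in half. That's why more people testing the same matchup matters so much.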
Also, as you touched on, nobody should be afraid to test. The more tests we get, the better, whether you joined 10 years ago or yesterday. As long as you follow the standard procedure, your results are just as valuable as anyone's. And don't be afraid to revisit parts that others never tested or think are bad. It has happened many times that parts are overlooked. Just look at Plastics, for example: in my opinion, the best attack and defense combos were only found within this last year.
While I agree with your original assertion, I can't help but feel like you're overstating the issue. There is literally no other way for someone to prove that the combination they're proposing works than by putting it into practice, and as people typically avoid using unproven combinations during tournaments, testing is often the only option. Here on the WBO, people suggest new combinations to try on a daily basis—since most of us aren't going to jump at the opportunity to wear down our parts trying a combination that has never been tested before, the best way to catch someone's eye is by backing up a hypothesis with personal experience to provide a starting point for further examination. That is obviously not to say that 20 positive tests from one person determine whether or not a customization is top tier, and I highly doubt that this has ever been the case on this forum. When a combination is being considered for top-tier status, it is tested by a variety of experienced players under various conditions. While positive testing does not guarantee that a customization will work in a tournament atmosphere, due to a variety of personal and external factors, it is good for generally assessing the strengths and weaknesses of popular combinations.

On the issue of "peer influence," I'm not saying that what you're arguing isn't an issue on the forum, but I highly doubt that it's as problematic as you claim. From my experience on the WBO, particularly when I was involved in the competitive scene, there was never an atmosphere that discouraged members from trying out combinations with notoriously bad parts. Heck, I made testing threads for the combinations Lightning L Drago F:D and Screw Uranus 90 SF in order to prove those assumptions wrong. In fact, I would argue that members are just as inclined to test customizations with so-called "bad" parts simply for the fun of trying to find something that everyone else has overlooked.

As for customizations from non-reputable members being overlooked, I would say that is simply a matter of people just not having the time to look into every combination that is suggested. While I would agree that this is something that needs improvement, even when I was very active competitively, I still had threads and combinations that got little attention. My suggestion for members who are experiencing this would be not to take it personally. If you feel particularly strongly about work you've done, try reaching out to members of the community for advice or feedback—these requests will rarely go ignored.

Finally, in response to your comment on relying on tournament results first and foremost, I would like to build on my earlier point and say simply that tournament results, while a more accurate assessment of a combination in practice, also depend just as heavily on their user. Not only could a weak launch result in an unfair loss, but the very emotional condition of the player—nerves at a first tournament, for example—could impact their execution of a certain customization. Tournaments by nature take place in a less controlled environment than testing, and with matches decided by only a few rounds, they can easily produce wins by luck or issues relating to the opponent. To return to my earlier point, brand-new combinations themselves aren't likely to be tried out in a tournament setting, and there is no control over what the opponent is using.

Welp, I think that's all. Just wanted to contribute my two cents on the matter—hope this helps.
I think the problem lies with negativity. Someone saying "don't even bother [testing/using] X part, it's bad" stifles curiosity and experimentation. Oftentimes, strong combos and new ideas come from unprecedented or counterintuitive angles, and we need to provide an atmosphere that encourages people to test and post unique combos, theories, and uses for parts. The first wave of tests (or even hundreds of rounds of testing from dozens of members) may be insufficient simply because no one has looked at a part from the right angle.

It's true that nobody has the time to extensively test every combo that comes up, but please (especially in the case of the higher-profile, more reputable members) consider the effects of your decision not to test something that you could. Even just playing around with a combo for a few minutes is enough to change your implied tone from "Your work isn't worth my time" to "I've tried it, but it doesn't work as well for me". On a similar note, try to avoid creating or becoming part of a bandwagon, for any stance. The words of a few people, no matter how "important" or "popular" they may be on the site, are far less important and convincing than actual experience and test data. If you want to test, test. If you don't think it's worthwhile, don't. But please don't attempt to convince others that something isn't worth their own time; if Blader Jimmy wants to spend a few hours testing Poison as a top-tier Attack wheel, by all means let him. If his equipment and procedure are up to snuff, the results will speak for themselves. My point is, we need people willing to do tests, so don't discourage them from doing them.
So, let's zoom out a bit, because I fear this discussion could get lost in a flurry of arguing over small details.

I wrote about this a bit in the WBO Random Thoughts thread the other day, but I do agree that the community has lost the plot a bit in terms of what testing is supposed to accomplish in the first place.

Beyblade is a game full of countless, nearly immeasurable variables. Many people testing the same battle conditions can end up with completely different conclusions. Additionally, 20 rounds is not nearly enough to produce statistically significant results; however, it is enough to give us a rough approximation of a part's performance in many conditions. And it can be enough to take an idea from "pipe dream" to "relevant."
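To put some rough numbers on that (using a coin-flip approximation purely for illustration; real battles are messier, which only widens the uncertainty), even a lopsided result leaves a big range of plausible "true" win rates:

```python
# Wilson score interval for an observed win rate, at roughly 95% confidence.
# Assumes rounds are independent and identical, which real battles are not.
import math

def wilson_ci(wins: int, rounds: int, z: float = 1.96) -> tuple[float, float]:
    p = wins / rounds
    denom = 1 + z**2 / rounds
    center = (p + z**2 / (2 * rounds)) / denom
    half = z * math.sqrt(p * (1 - p) / rounds + z**2 / (4 * rounds**2)) / denom
    return center - half, center + half

low, high = wilson_ci(14, 20)  # a "strong" 70% result over 20 rounds
print(f"14/20 wins: the true rate could plausibly be anywhere from {low:.0%} to {high:.0%}")
```

A 14-out-of-20 result is consistent with anything from a coin flip to near-total dominance. A rough approximation, like I said.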

However, it's gotten to the point where testing has become the cornerstone of all Beyblade customization discussion. As Shirayuki mentioned, I don't even feel like I can discuss parts that I feel are viable or unviable, or share combos that I feel are working well for me, because the burden of evidence for even initiating a discussion is just too high. And again, in the case of parts like Oval, people are shamed for saying that they seem unusable (and yet tests showing the viability of terrible parts like these never surface despite non-stop complaints, probably because everyone who actually owns and uses them feels that they're useless and not worth testing).

There's also the issue of tests being taken as fact once they're posted. Wombat's thread on Disks and Burst Defense is an awesome read with some truly interesting data, but it's also a relatively small sample size and representative of just one player's solo experience. And yet I see comments all the time repeating these findings as if they were facts written in stone.

Unlike trading card games or video games, where gameplay "rules" (in this case, rules refers to the effects of gameplay elements, not rules of play) are written/designed by the game designers and the work comes from interpreting how those rules interact with one another, Beyblade's rules aren't arbitrarily defined; they are the consequence of interactions between physical objects rotating at high speeds, moving through three-dimensional space with various heights and angles. Even the designers have only the loosest understanding of the gameplay effects of each part, and no two battles are alike, nor can they be alike. Every single battle in Beyblade is different from every battle that's ever been played. The variables are, as I said before, countless and immeasurable.

And ultimately, tests don't reflect the reality of playing Beyblade competitively. As a small example, when facing 1234beyblade in a tournament, I knew that there was no way my launch power — and subsequently, movement speed — could match his without a handicap. This was pre-Xtreme, so he was using Accel. My only option was to use Assault as a gambit and hope I could KO him. This paid off multiple times. But this combo would never test well, and I would have never called it "competitive."

Despite all of this, my goal isn't to dismiss testing as irrelevant or worthless. It's been used many times to introduce the viability of new and surprising concepts to the community — before we had a real testing process, I did my own (undocumented) tests to check the viability of Libra CH120 RF. Because of the goals of this customization, being able to measure the win percentage was necessary. (Also, this combo didn't perform nearly as well in tournament play as it did during my tests — imagine that!)

And it's valuable in measuring baseline performance of parts and comparing parts within the same class (like opening Pandora's box on OHD). I use my own examples not because I think they're the best or most interesting ones, but to show that I'm not against testing (and of course, they're the examples I'm most familiar with).

Finally, it can be very helpful in determining which parts are viable to be used in a competitive context. However, there's a world of difference between determining what's viable and trying to whittle the game down into a very small, fixed list of combinations.

I just think it's necessary to remember that Beyblade is not really a game of two tops interacting in a fixed context. It's also about two players trying to outsmart and outplay each other, and that often leads to surprising results that are simply untestable. We need to re-recognize the value of discussing Beyblade on its subjective merits, not just its objective, measurable ones.

One final nitpick: this line about "Burst parts wearing quickly" needs to stop. It's almost a non-issue for parts other than Valkyrie, and it's clear that Takara-Tomy has figured the design out and we're not going to see a repeat of issues like that again.
It's funny because the last time an issue like this was brought up, it was in the exact opposite context: people took the word of an experienced member at face value without any testing to back it up. Even the fact that I refrained from responding to this thread until I could convey my thoughts in a coherent manner only supports your point about peer influence. We put so much effort into sounding like we know what we're talking about that people actually start to believe what we are telling them.

I think that what we need to do is find a reasonable balance between general impressions of a part, test results, and tournament performance. Testing, whether formal or informal, tends to be a way to get a feel for how a part performs, or alternatively how a matchup is likely to play out in a tournament setting. A lot of people have been saying that testing should be less of a basis for how a part performs and that we should return to a time when parts were evaluated solely on reputable users' impressions of them, yet a few posts later these same people dismiss parts that several reputable users have found useful through informal testing, just because of the lack of formal testing on them. Regardless, we can all agree that testing, formal or informal, is not 100% accurate to a tournament scenario, with the most common cause of testing inaccuracies being differences in launch power or technique between players.

Another reason test results don't correlate with tournament results is that each player has their own "style," for lack of a better word. For example, Wyvang is unquestionably the most powerful Attack type Wheel in MFB, but for whatever reason I cannot get it to work for me, and I prefer Flash. I'm pretty sure that every time I have used a Wyvang Attack custom in Standard/Limited in a tournament setting I have lost 0-3. With that being said, however, that doesn't mean you can dismiss valid test results as invalid just because they disagree with your opinion of the part by saying "oh well, that doesn't prove anything because it works/doesn't work for me". Especially with the debate over Armed brought up recently, I have been doubting the results I got in my thread since they seem to be in the minority compared to the ones everyone else has been getting (especially with Xcalibur Xtreme bursting things).

Regarding the Tier List and competitive combos, I disagree with the idea being thrown around recently that it should be based solely on tournament results. I'll use Wyvang as an example again: by this logic, MSF-H Wyvang Wyvang H145RF would not be considered a top-tier customization despite being able to consistently defeat literally every combo in the game except Scythe 85/90RS in the hands of a skilled and confident player, as tests have shown. However, I don't think the list should rely solely on tests either (the infamous Dragooon F230GCF never even got its own testing thread), and I do think that tournament usage needs to play some part in defining it as well. IIRC, the last time Diablo was used in a tournament was August 2014 in Mumbai, and realistically its applications against other competitive combos are pretty much limited to stopping Flash (inconsistently) and tall Duo setups, yet it is still top tier based on relatively outdated tests and the impressions of a few users. With that being said, that doesn't mean we need to discard decent combinations just because they aren't "top tier"; this is especially the case in Burst and Limited.

Conclusions have never been my strong suit, but I guess the best thing to say in response to this is to be open minded and to form your own opinions on parts or combos based on your better judgement.
I have been thinking that too, to be honest, but I didn't think it would get much attention.
Quote:For example, Wyvang is unquestionably the most powerful Attack type Wheel in MFB, but for whatever reason I cannot get it to work for me, and I prefer Flash. I'm pretty sure that every time I have used a Wyvang Attack custom in Standard/Limited in a tournament setting I have lost 0-3. With that being said, however, that doesn't mean you can dismiss valid test results as invalid just because they disagree with your opinion of the part by saying "oh well, that doesn't prove anything because it works/doesn't work for me". Especially with the debate over Armed brought up recently, I have been doubting the results I got in my thread since they seem to be in the minority compared to the ones everyone else has been getting (especially with Xcalibur Xtreme bursting things).

This is kind of what I'm getting at. Despite lots of "objective" tests showing that Wyvang is superior to Flash (I guess? I don't actually know MFB's metagame), you find better results with Flash. And now you're doubting your own results with Armed and Xcalibur because other people didn't share your experience.

But just because others didn't have the same experience as you doesn't make your experience invalid. It could be due to any number of variables that we have no way of knowing. So isn't it enough to say that both Wyvang and Flash — or V2 and Xcalibur, or whatever — are viable, and let individual players come to their own conclusions? We can post tests and share our results to help if we want, but at the end of the day it's impossible to guarantee that someone will replicate our findings, or that we can replicate theirs.

This is where I think the quest to define the metagame's competitive combos into a very strict, narrow list hurts discussion and gameplay. We end up with a dichotomy of right and wrong choices that ignores the very wide spectrum of subjective experience that Beyblade can produce.

Nobody is trying to say that testing is bad — more data is always better than less data. But we also need to understand the limitations of that data and the kinds of conclusions we can draw from it. When conversation about Beyblade revolves entirely around data that isn't actually that reliable (and to be clear, the nature of our tests and our own subjectivity make the data produced from them very unreliable), it stifles other kinds of conversations that we could be having.

In both cases (trusting a respected member's word at face value without tests, or trusting someone's tests without experiencing them for ourselves), the core issue is the same: thinking that Beyblade is a game that can be quantified by a single individual, and trusting their conclusion without experiencing it for ourselves. There will always be combos that float to the top of the metagame, but that doesn't mean we all need to reach a strict consensus.
(Apr. 14, 2016  9:06 PM)Bey Brad Wrote:
Quote:For example, Wyvang is unquestionably the most powerful Attack type Wheel in MFB, but for whatever reason I cannot get it to work for me, and I prefer Flash. I'm pretty sure that every time I have used a Wyvang Attack custom in Standard/Limited in a tournament setting I have lost 0-3. With that being said, however, that doesn't mean you can dismiss valid test results as invalid just because they disagree with your opinion of the part by saying "oh well, that doesn't prove anything because it works/doesn't work for me". Especially with the debate over Armed brought up recently, I have been doubting the results I got in my thread since they seem to be in the minority compared to the ones everyone else has been getting (especially with Xcalibur Xtreme bursting things).

This is kind of what I'm getting at. Despite lots of "objective" tests showing that Wyvang is superior to Flash (I guess? I don't actually know MFB's metagame), you find better results with Flash. And now you're doubting your own results with Armed and Xcalibur because other people didn't share your experience.

But just because others didn't have the same experience as you doesn't make your experience invalid. It could be due to any number of variables that we have no way of knowing. So isn't it enough to say that both Wyvang and Flash — or V2 and Xcalibur, or whatever — are viable, and let individual players come to their own conclusions? We can post tests and share our results to help if we want, but at the end of the day it's impossible to guarantee that someone will replicate our findings, or that we can replicate theirs.

This is where I think the quest to define the metagame's competitive combos into a very strict, narrow list hurts discussion and gameplay. We end up with a dichotomy of right and wrong choices that ignores the very wide spectrum of subjective experience that Beyblade can produce.

Nobody is trying to say that testing is bad — more data is always better than less data. But we also need to understand the limitations of that data and the kinds of conclusions we can draw from it. When conversation about Beyblade revolves entirely around data that isn't actually that reliable (and to be clear, the nature of our tests and our own subjectivity make the data produced from them very unreliable), it stifles other kinds of conversations that we could be having.

In both cases (trusting a respected member's word at face value without tests, or trusting someone's tests without experiencing them for ourselves), the core issue is the same: thinking that Beyblade is a game that can be quantified by a single individual, and trusting their conclusion without experiencing it for ourselves. There will always be combos that float to the top of the metagame, but that doesn't mean we all need to reach a strict consensus.

To add onto the discussion of tier lists: I definitely feel we should stick with the model Th!nk provided in his Plastics CC list. While it may not apply to every format, I definitely believe that a list of good parts that work in synergy to achieve the desired goal is a much better way to sort them. It allows more freedom in what counts as good, even if a part only has a random niche use. The benefit of this is that such lists are a lot more open, so they can apply to everyone's "style" and match up with tournament results and reports a lot more easily.
A friend of mine just suggested a method for testing Stamina/Defense with little to no human error. It could partially remedy the issue of varying test results, but I'll have to try it out and then describe it in more detail once I'm back home, because it involves rigging up a contraption for launching.
It is unfortunate to say that people listen way too much to the users who post tests. While tests are nice and definitely valuable, they've become much more of a resource than actual tournament experience. I agree that there are flaws to basing tier-list combos solely on tournaments, but the one major reason I like the idea is that fabricating results is almost completely eliminated; there is very little chance someone would lie about something that happens at an event. Nerves are a thing, and because of them, a combo that should consistently be beating another sometimes isn't. But I think the community we have right now is also smart enough to tell if something was a fluke or is really happening in tournaments elsewhere.

Another thing about testing at home is that not everybody is going to have the same launch power. A test from me, say, can easily differ from one by someone like Zoroaste because of how different our launches are.


I really do thank Shirayuki for posting about this, and I apologize if anything I said was repeated. We really needed to address this issue sooner rather than later. :)
I mean, we also need to use a combination of discretion and data along with tournament results. Didn't Ragnaruk Heavy Survive win a tournament before? lol
I think I scanned the posts more than thoroughly read them today, but there seem to be a few different aspects being touched upon: test results should not be absolute; personal testing summaries are also relevant (plus tournament-winning combinations); but also, test things on your own instead of just trusting what gets tested.

For certain, sure, I can agree with all of the points raised here; however, all of them come with a "but" condition:
- Absolutely, testing as it stands is far from perfect even on a statistical basis, but having none of it posted at all would just be a disaster too. Even though tests can be fabricated, the majority of Bladers will actually have conducted them, and if anything can be reproduced repeatedly, that is rather important even without all the tournament circumstances.
- OK, personal testing summaries are nice, but the people whose summaries I personally trust are very few. Who even tells me that you did more than five proper battles with this or that combination? "Wyvern is a bad Layer", but yo, to what extent did you really try it? You will ultimately have to provide statistics anyway, no matter what.
- Tournament results are great, but as was mentioned, our advice to other Bladers cannot stem only from those winning combinations, because some are odd, perhaps there were no true Attack types in the event, there are many factors we ignore, etc.


Since I have been testing more frequently these days, I did find that anything not getting an extreme result such as 1/10 or 9/10 is simply too much limbo. This is why I had gotten the urge to push only those tests to twenty rounds, closer to statistical relevancy, but even then, the result often ends up around forty percent, and that is not meaningfully different from twenty or seventy percent, really. That seems to be the whole margin of error.
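To illustrate, here is a toy simulation. The fifty-five percent "true" win rate is invented purely for demonstration, and the model ignores all of the circumstances that make real battles different:

```python
# Simulate many independent 20-round tests of a hypothetical part whose
# true win rate is 55%, and count how often a single test looks misleading.
import random

random.seed(0)
TRUE_RATE, ROUNDS, TRIALS = 0.55, 20, 10_000

rates = [sum(random.random() < TRUE_RATE for _ in range(ROUNDS)) / ROUNDS
         for _ in range(TRIALS)]

looks_bad = sum(r <= 0.40 for r in rates) / TRIALS
looks_great = sum(r >= 0.70 for r in rates) / TRIALS
print(f"Scored 40% or worse in {looks_bad:.0%} of twenty-round tests")
print(f"Scored 70% or better in {looks_great:.0%} of twenty-round tests")
```

The same middling part scores forty percent or worse in roughly one test out of eight, and seventy percent or better in roughly another one out of eight. Hence the limbo.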


In any case, I am quite sure that any important decision or whatever has always been based on all of the criteria above, not just test results. Therefore, I personally do not get the impression that this message is meant for me or the Committee. Were people in the community giving wrong advice to younger or newer Members while only looking at tests?
(Apr. 15, 2016  5:38 AM)Kai-V Wrote: In any case, I am quite sure that any important decision or whatever has always been based on all of the criteria above, not just test results. Therefore, I personally do not get the impression that this message is meant for me or the Committee. Were people in the community giving wrong advice to younger or newer Members while only looking at tests?

I think it's more about the shift in how the game is discussed at large; not necessarily about any kind of rule we need to enforce.
Haven't had a chance to read everything all the way through, but one thing that would be really great would be a testing template to make testing easier for newer users. Actually, I'll make one and have it posted shortly.
Yeah, that is a great idea. Personally, I copy and paste one of my old tests and then edit it when posting new ones, but a template would make typing up results a lot less daunting for everyone, especially new people. What would be really awesome would be a Beytest app that supported Burst Finishes.
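Something like this little sketch is all I'm imagining, by the way; the combo names and outcome codes here are made-up placeholders, not any real standard:

```python
# A toy tally of battle outcomes, including burst finishes, that prints a
# summary in a test-thread-ish format. Everything here is hypothetical.
from collections import Counter

def summarize(combo_a: str, combo_b: str, outcomes: list[str]) -> None:
    """outcomes: entries like 'A-KO', 'B-Burst', 'A-OS', or 'Draw'."""
    tally = Counter(outcomes)
    a_wins = sum(n for o, n in tally.items() if o.startswith("A-"))
    b_wins = sum(n for o, n in tally.items() if o.startswith("B-"))
    print(f"{combo_a} vs {combo_b} ({len(outcomes)} rounds)")
    for outcome, n in sorted(tally.items()):
        print(f"  {outcome}: {n}")
    if a_wins + b_wins:
        print(f"{combo_a} win rate: {a_wins / (a_wins + b_wins):.0%} (draws excluded)")

summarize("Valkyrie Heavy Claw", "Ragnaruk Heavy Survive",
          ["A-KO", "A-Burst", "B-OS", "A-KO", "Draw", "B-Burst"])
```

Even just pasting output like that into a post would beat retyping the format every time.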
I don't see how making testing easier really relates to what this thread is about. What did you take from it?
Quote:For example, Wyvang is unquestionably the most powerful Attack type Wheel in MFB, but for whatever reason I cannot get it to work for me, and I prefer Flash. I'm pretty sure that every time I have used a Wyvang Attack custom in Standard/Limited in a tournament setting I have lost 0-3. With that being said, however, that doesn't mean you can dismiss valid test results as invalid just because they disagree with your opinion of the part by saying "oh well, that doesn't prove anything because it works/doesn't work for me". Especially with the debate over Armed brought up recently, I have been doubting the results I got in my thread since they seem to be in the minority compared to the ones everyone else has been getting (especially with Xcalibur Xtreme bursting things).

I think I should touch on this as well!

I first wanted to mention that there are two known molds of Wyvang, which can drastically change your results depending on how you plan to use it. One has a longer feather, which produces better smash and is better suited for attack types; the other has a shorter feather more suited for BD145RDF. I have 10 Wyvang now, and 6 of the 8 I've opened have the bigger feather (two of the ten are unopened NIB, so I haven't measured them). The size of the feather affects the results. (image of molds here).

The bigger thing that people don't necessarily mention is that the hole and peg in every chrome wheel vary slightly depending on which mold you have, as well as on manufacturing tolerances. If you get the right combination of chrome wheels and facebolts, you can shift the wheels around, offset them even more, and get them to stay firmly in place. This exaggerates the contact point even further, which will yield better results. It's a similar idea, in reverse, to people lining up BalroBalroBD145MF evenly to improve its stamina.

Flash appears to have only one mold, so pretty much all copies will perform the same. (In fact, Flash probably performs better as it wears down, since the two bottom contact points become more blunt, vertical, and flush with the top portion, which reduces how much the lower part hits against E230's disk.) This further skews results, as Flash will inherently be more consistent no matter what, simply due to manufacturing tolerances.

It's these tiny, obscure factors that add to all the frustration around testing results: you end up relying on someone who owns that specific mold of a piece that was already expensive to buy, since you had to buy two of them...

The worst part about testing is that most tests are done by one player. It is absolutely easy to put a Beyblade into the stadium and shoot at it with another Beyblade; it's a sitting duck. This further skews the results.

(Apr. 14, 2016  9:06 PM)Bey Brad Wrote: This is kind of what I'm getting at. Despite lots of "objective" tests showing that Wyvang is superior to Flash (I guess? I don't actually know MFB's metagame), you find better results with Flash. And now you're doubting your own results with Armed and Xcalibur because other people didn't share your experience.

But just because others didn't have the same experience as you doesn't make your experience invalid. It could be due to any number of variables that we have no way of knowing. So isn't it enough to say that both Wyvang and Flash — or V2 and Xcalibur, or whatever — are viable, and let individual players come to their own conclusions? We can post tests and share our results to help if we want, but at the end of the day it's impossible to guarantee that someone will replicate our findings, or that we can replicate theirs.

This is where I think the quest to define the metagame's competitive combos into a very strict, narrow list hurts discussion and gameplay. We end up with a dichotomy of right and wrong choices that ignores the very wide spectrum of subjective experience that Beyblade can produce.

Nobody is trying to say that testing is bad — more data is always better than less data. But we also need to understand the limitations of that data and the kinds of conclusions we can draw from it. When conversation about Beyblade revolves entirely around data that isn't actually that reliable (and to be clear, the nature of our tests and our own subjectivity make the data produced from them very unreliable), it stifles other kinds of conversations that we could be having.

In both cases (trusting a respected member's word at face value without tests, or trusting someone's tests without experiencing them for ourselves), the core issue is the same: thinking that Beyblade is a game that can be quantified by a single individual, and trusting their conclusion without experiencing it for ourselves. There will always be combos that float to the top of the metagame, but that doesn't mean we all need to reach a strict consensus.

I think the problem I have with trusting a respected member's word is that it can completely shape the competitive lists. The combined work of Inguilt, The Black Dragon, and a few others created and completely skewed the late-MFB lists. Pretty much every combo thread they made instantly ended up on the list even though the combos weren't completely tested (although they are currently good, and they had potential upon reveal), while threads like Mitsu's suggestion for BD145GF are still struggling to be a thing. (thread here). In my tests, BD145GF is a solid enough performer to be considered an actual thing. But I guess that's the problem too: word of mouth and peer influence dictate that it isn't a thing, because they said so, it's "too uncontrollable," it "doesn't win against x, y, and z and therefore isn't good and shouldn't be considered a viable option." Meanwhile, "Duo_160PD is viable because ONE GUY said so," even though it has NEVER placed in tournament play and that unicorn of a tip, PD, has placed only 6 times EVER.

(Tournament results based on info up until July 19, 2015. I think it has actually placed once in the time since.) (source).

Peer influence, in my opinion, plays a strong role in tests as well as in the discussion that follows them.

It's the reason why Pegasus, Gryph, Begirados, and many other chrome wheels are pretty much considered extremely sub-par and useless. Nobody even thinks twice about using these over other options, even though I strongly feel that they still have niche potential and use.

I also strongly believe that narrowing lists greatly diminishes the spirit of Beyblade: its amazing ability to captivate players into experimenting and exploring new avenues of success.

If you keep telling people Wyvang is the best wheel, then by power of influence it becomes the best wheel, only because that's the only thing people are using. Meanwhile, something like the super-duper lowly "outclassed" Basalt on E230, on RDF, launched parallel to the ground, hits huge numbers against it, because Wyvang and a lot of other 145-height synchrome attackers/balance combos can't reach past the E230 disk to hit the narrow Basalt wheel.

And, to fully address the thread title:

I think we are unfortunately locked into believing and listening to "important people's tests" because it mostly works. But to throw a complete twist on it: why do we even have Mentors and people of higher standing? (I'm not talking about moderators, since we do need those.) Doesn't that exacerbate the issue? "Let's make this guy important, he has an italic name, he's special, everyone listen to him, he knows stuff. Don't listen to Joe Schmoe; he doesn't know anything, he's pretty regular, so his stuff isn't important."

There is literally an entire sub-forum dedicated to telling people that they aren't important. It's called the Mentor's Circle. Regular, "unimportant" users CANNOT post in it, even to contribute information, because they aren't deemed "knowledgeable enough" for it; because they simply cannot have the knowledge to compete with the Mentors, no matter how much experience they have or insight they may hold. There are many times where I read an entire thread before realizing it's in the Mentor's Circle, and I hit the button and cannot contribute. It's absolutely silly that multiple threads must be made for the general public just because a topic was posted in the Mentor's Circle first.

Isn't all of what I mentioned simply setting up this issue mentioned in the title?

Carry on~
While I do completely agree with juncction on every point (particularly that the popularity of advanced members skews the acceptance of test results (and opinions about anything, really) as interpreted by less popular members), I also have a word of caution:

It is an unfortunate symptom of the herd that people who feel less important will follow those they consider more important. This is almost never the fault of the people who are considered more important.

With regard to Mentors on the WBO, I think it is justified to reward consistently productive members with special titles and forum access (or with any other kind of reward, really). They earn those rewards. It is wholly the responsibility of any other individual to interpret the sentiments of those Mentors in a useful way, either by acting on their advice or not, depending on what is objectively more useful for the individual at the time.

I realise objectivity requires a certain amount of responsibility from the individual, which (I agree) is largely lacking throughout the world, and particularly among young people, through no fault of their own. But come on, people, this is the internet; you're not going to get stabbed for being objective. It's never too late to start.
Eh, people become Mentors not because they are popular, but because they have clearly demonstrated, online or at tournaments, that they have advanced knowledge of the Beyblade game. Anybody can become a Mentor if they just post more and demonstrate the same amount or type of knowledge. It is not meant to be a clique at all and I am not aware that anybody considered it like that from that group; it is just meant to be something to aspire to be a part of, and indeed to reward people who contribute a lot.
(Apr. 21, 2016  2:38 AM)Beylon Wrote: With regard to Mentors on the WBO, I think it is justified to reward consistently productive members with special titles and forum access (or with any other kind of reward, really). They earn those rewards. It is wholly the responsibility of any other individual to interpret the sentiments of those Mentors in a useful way, either by acting on their advice or not, depending on what is objectively more useful for the individual at the time.

(Apr. 21, 2016  3:27 AM)Kai-V Wrote: Eh, people become Mentors not because they are popular, but because they have clearly demonstrated, online or at tournaments, that they have advanced knowledge of the Beyblade game. Anybody can become a Mentor if they just post more and demonstrate the same amount or type of knowledge. It is not meant to be a clique at all and I am not aware that anybody considered it like that from that group; it is just meant to be something to aspire to be a part of, and indeed to reward people who contribute a lot.


Ah, I was a bit too harsh. I was definitely in a mood at the time of writing it... my apologies!

I guess to clarify, I do believe that Mentor is a tag that people deserve. It is an honor for most to receive it, because it shows that they put in the time to demonstrate their knowledge and their dedication to keeping this game amazing and fun. Without those people, the game honestly wouldn't be what it is today. Most of the knowledge I've gained about MFB came from the insights of those players alone, so it is without a doubt a tag well deserved.

The talk of Mentors was more to point out that the popular people others listened to on this forum were also, concurrently, Advanced Members at the time, and therefore what they said and suggested about parts and tests carried more weight.

I suppose also, I'm much happier with it being called "Mentors" rather than "Advanced Members." I think that was a step in the right direction.