In my opinion, just do what works best for you. I like to keep my testing style consistent because I'm trying to get a general idea of how a specific part performs compared to other specific parts. I prefer using the same setups on both combos and switching the parts halfway through the tests, because it helps eliminate variables and shows me where the part falls in terms of use compared to the others. If I put Part A and Part B on the same setups, switch them halfway through, and Part A loses to Part B by OS most of the time, that tells me Part B probably has more Stamina than Part A.
That doesn't necessarily mean Part B is always better than Part A. With some fine-tuning, I'm sure you could find a combo involving Part A that has a better shot against a combo involving Part B. That's also interesting to test and I'll try to get to it soon, but it's not what I'm looking for right now. I'm focusing on part vs. part for now because I want that information first. Once I finish with that, I'll get into trying different combos against each other.
EDIT: To clarify, my style basically compares two parts on a basic level. Just because one part beats another on the same setup doesn't mean the losing part is outright outclassed. It's just not as good as the other part at whatever I'm testing for. That lower-tier part could still be good at something else, and more testing would be needed to find that out. My style only gives the baseline information, which helps me figure out what to do next with the part.
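For anyone who wants to keep score the same way, here is a rough Python sketch of tallying a part vs. part test with the halfway swap. Everything in it is an assumption for illustration: the names `tally` and `win_rate`, the setup labels, and the sample rounds are made up and are not real test results.

```python
from collections import Counter

def tally(rounds):
    """Count wins per part and wins per (part, finish type).

    Each round is a tuple (setup_for_part_a, winner, finish), where
    winner is "A" or "B" and finish is a label such as "OS" or "KO".
    """
    wins = Counter()
    by_finish = Counter()
    for _setup_for_part_a, winner, finish in rounds:
        wins[winner] += 1
        by_finish[(winner, finish)] += 1
    return wins, by_finish

def win_rate(wins, part):
    """Fraction of all rounds won by the given part (0.0 if no rounds)."""
    total = sum(wins.values())
    return wins[part] / total if total else 0.0

# Hypothetical sample data: Part A sits on setup 1 for the first five
# rounds, then the parts are swapped so Part A sits on setup 2.
rounds = [
    ("setup1", "B", "OS"), ("setup1", "B", "OS"), ("setup1", "A", "OS"),
    ("setup1", "B", "OS"), ("setup1", "B", "OS"),
    ("setup2", "B", "OS"), ("setup2", "A", "OS"), ("setup2", "B", "OS"),
    ("setup2", "A", "KO"), ("setup2", "B", "OS"),
]

wins, by_finish = tally(rounds)
rate_b = win_rate(wins, "B")
print(f"Part B win rate: {rate_b:.0%}, wins by OS: {by_finish[('B', 'OS')]}")
```

If Part B keeps winning by OS on both halves of the swap, the setup is less likely to be the reason, which is the whole point of switching the parts halfway through.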