Mastering UX Benchmarking with the ULX® Benchmarking Score: A Comprehensive Guide

Start: 2024-03-01T00:00:00-05:00
End: 2024-03-01T02:00:00-05:00
Location: Online

Mar 01, 2024

The ULX® Benchmarking Score: A New Era In UX Benchmarking

Welcome to our webinar on mastering UX benchmarking with the ULX® Benchmarking Score.

In the ever-evolving technological landscape, understanding and improving user experience is paramount. The ULX® Benchmarking Score, developed by Userlytics, offers a groundbreaking approach to benchmarking the entire spectrum of user experience, going beyond traditional usability metrics.

This unique benchmarking tool gives you a 360-degree measurement of user experience by evaluating 18 key UX attributes across 8 critical constructs, including Usability, Appeal, Trust, and more. The ULX® Benchmarking Score gives you a holistic overview, enabling a comprehensive analysis and comparison of your digital assets against industry standards and competitors.

Beyond Usability: A Holistic UX Measurement

Discover how the ULX® Benchmarking Score empowers you to transcend conventional usability testing. Learn about its capacity to offer insights into every facet of the user experience, ensuring that your app or website is not only functional but also engaging and memorable. Our webinar delves into the methodology behind the ULX® Benchmarking Score, showcasing how it encapsulates the full user experience spectrum.

Who is it for?

This webinar is a great resource for product managers, UX designers, UX researchers, and digital marketers seeking to elevate their UX benchmarking practices. Gain in-depth knowledge of the ULX® Benchmarking Score’s framework, understand its application in real-world scenarios, and learn how to leverage this tool to identify and address UX challenges effectively.

Remember, optimizing the user experience is not just about fixing usability issues; it’s about creating an appealing and memorable journey for your users. The ULX® Benchmarking Score is your key to unlocking this potential.

Transcript

Okay, so we’re about to start, now that I see that some more of you are joining. First of all, thank you everyone for joining this webinar. My name is Sarita Saffon. I am a senior UX research consultant at Userlytics. And as you know, we will be presenting today in this webinar the ULX benchmarking score. But before we actually speak about that score, I wanted to go back a little bit and give a brief recap of what UX research is for those of you that are not very familiar with the field, but also to give you a bit of background on why at Userlytics we saw the need to create a new score or index in this realm of work. We’ll also go over what the ULX benchmarking score actually is, digging deep into how it was created and how Userlytics came up with this ULX score. And then we’ll present an example with an interesting and very captivating duel of websites so that you can see what is presented in a ULX benchmarking score study. And finally, we’ll leave a section to answer some questions that you may leave in the Q&A feature on Zoom, which my colleagues will be answering directly if possible, or that we will cover during this last section of the webinar.

So as mentioned, we need to back up a little bit and remember what UX research is and what this work has provided us throughout the years to really understand why at Userlytics we saw the need for a new UX research score.

So what is user research or UX research? There might be a million or maybe more definitions of what UX research is. However, I wanted to bring you just a few of them. One of them is given by the Interaction Design Foundation, which describes UX research as the systematic study of target users and their requirements to add realistic context and insights to design processes.

This might be a little bit difficult to understand or maybe too technical, not really down to earth. So I personally prefer a definition given by fellow researchers that involves a lot more of the human side of UX research, describing its purpose as improving customer satisfaction through the ease of use and the quality of the user’s interactions.

However, this definition still lacks a lot of variables that are included in the user experience. And other researchers actually show this when they add into the UX research definition, not only that it goes beyond the human computer interaction, but also that it includes aspects such as beauty, amusement, pleasure, satisfaction, and others.

And you might have heard of or even used multiple UX KPIs that actually evaluate and measure this user experience. Among them, you may have heard of or used the NPS and the SUS, as well as others measuring satisfaction, emotion, even hedonic values, customer effort, or technology acceptance.

But as you can see, all of these evaluate one or maybe two variables that are included in that user experience, but they’re not including them all. Because yes, it is great that your app is easy to use. But what is the point if nobody really wants to use it? It’s amazing that your platform design is meeting user expectations. But if they don’t trust the content, they’re not really going to have a very good user experience. And also, it is amazing that it’s engaging. But if they don’t actually get to the website, if it’s not top of mind when they need that service, they’re not going to actually enjoy that engaging website.

So this is why we created the Userlytics Experience benchmarking score, or as we abbreviated it, the ULX benchmarking score: to take into account all of those aspects that encompass the full user experience with a digital service or asset, going beyond measuring only one of its elements and measuring them all, which is what most of the other UX indexes lack.

The ULX score was built internally by the Userlytics UX consulting team, applying their long-standing research expertise, which has involved not only working with different industries, countries and brands, but also in different areas of research: academic, market research, or even brand research. With this expertise, we discerned which UX attributes are most impactful to measure and track in the user experience. To do so, we ran several closed pilots that were statistically tested and scrutinized, so that each pilot gave us a more precise iteration. Finally, this resulted in a very robust and potent research tool that not only has internal consistency and reliability, but for which we also used statistical tests to evaluate the weight of each of the constructs that compose the ULX score, that is, how much each of them really impacts the overall ULX score, so that it is well balanced and really represents the reality of the user experience.

But I want to stress that the ULX benchmarking score is not like any other index, because it is not only an index. It is a complete study where not only your digital asset is evaluated, but also those of two of your main competitors. So it provides a snapshot of how your site or app is performing on the several dimensions, thus serving as a precursor for your product’s improvements and further research, but it also compares it to two of your main competitors, therefore serving as a benchmark for how to develop and impact this over time.

The ULX score also differs from others because the 240-user sample, which is reachable thanks to the Userlytics panel, does more than just respond to a simple question. Participants actually use the Userlytics platform to try those three different assets, completing several tasks on them so that they can truly experience these assets as they would in real life, with real-life scenarios, and provide a rating based on informed experience.

On top of that, we realized that we should also take advantage of the Userlytics platform to provide even more in this study and get some qualitative feedback to complement the quantitative results that we get from the ULX score. Therefore, 10% of the sample goes through the questionnaire and the test in a non-moderated session, so that we at Userlytics can get videos and user verbatims to illustrate to you why those ULX metrics are coming up. What are the reasons why the users are giving those ratings in the ULX score? And therefore, what things can be improved and what research can be done in order to improve these ULX scores?

In the questionnaire, users rate each of those digital assets in 18 user experience-related attributes. And then these 18 attributes are divided into eight unique constructs that, as I said, we’ve statistically identified to be the main drivers of the user experience and that each of them have their own weight depending on the influence that they have on the overall ULX score. These are appeal, adequacy, distinction, usability, trust, performance, affinity, and appearance.

So here we can see how the different elements of those indexes that we saw before come all into play into one single score, but that will have a breakdown to see where your asset can work to improve on that overall user experience.

Because in the final report, you don’t just receive one single score, one ULX score for each of those different assets, but you also get the eight unique construct scores and the 18 attribute scores, coupled with qualitative insights that actually provide you with the whys behind those quantitative metrics that we see.

Also, we provide actionable recommendations and research roadmaps that will allow you to take action immediately, but also to plan that research roadmap in order to see exactly which areas you have to focus the research on.

Also, the ULX study allows you to decide what type of segmentation, what type of target, would be more interesting to see and to compare in order to see the difference between the ULX scores, maybe between ages or specific targets that are of particular interest to you.

Also by doing recurring studies, we can track and compare that evolution that your asset may have and the impact that the improvements that are done on the product have had on that overall user experience.

And obviously, we take care of all of the preparing of the study, the design, the recruitment, however we also meet with you in a kick-off call so that we can discuss all of the details, what assets to evaluate, who are those competitors that are important to be taken care of, and that we will be comparing, and who will be responding to your ULX study so that it suits exactly to your needs.

Because even if we do have to maintain a broad-based profile in order to achieve the big sample and the statistically significant sample, we can screen on general criteria that you would want the users answering the test to meet. So for example, if you would like people that do a lot of online shopping, or that play at least one sport, or that have traveled by plane this year, these are criteria that we can use to segment those users so that your specific target is answering our ULX study. We will check the incidence rate of the criteria with our operations team in order to see the viability of recruiting the sample to carry out the test.

And what would you invest to get all of this that I’ve just talked about? We have a standard price of 7,450 US dollars. However, as you may know, for the attendees of this webinar, we are offering a 50% discount. So if you are interested in this, please contact our team or your account manager and we can get started.

For those that may still have some doubts, we wanted to show you an example by running what we called a mini-ULX, which is basically the same study I presented to you before, but with the size reduced to a sample of 30 users, which does not actually provide a statistically conclusive result, but a tendency, and which will give you a sneak peek of the power of using the ULX benchmarking score. And to make things a little bit more interesting, we chose two competing websites that might even generate a bit of controversy, which we thought would be exciting for you to see in this webinar.

And that is a duel between the Democrat and the Republican Party websites. We wanted users from all political affiliations in the United States to test these two websites and provide us with their evaluation of both, to see if one of them provides a better experience, in which constructs one is better than the other, in which attributes of the ULX each performs better or worse, and also where they both could improve.

So the ULX score is calculated on a scale from zero to a hundred, where a higher score obviously means a more exceptional user experience. However, as with other benchmarking indexes, the score here is only as good or as bad as how it compares with the other assets that it is being compared with. In this case, we saw that the Democrat website obtained a score of 75, which may not be a very high score, and it shows that it has room for improvement. But we do see that it is higher than the Republican website rating, which got a 66 overall score. So we could already say that the Democrat website provided a better experience. However, in order to confirm that the two ratings truly differ from each other, we ran further statistical tests. And in this case, we actually discovered that even though we can see that difference between the websites’ ULX scores, this difference is not statistically significant.

What does this mean? It means that there is not actually a meaningful difference between the two sites. The differences are not likely due to a real effect, but could be due to random variability or chance. But we do have to take into consideration that the sample here is reduced. It is only 34 participants. And also that we would have to do a full ULX in order to see whether those differences are statistically significant or not.
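The webinar does not say which statistical test was used to compare the two sites. As a minimal sketch of the idea, and assuming each participant rated both websites, one common option is a paired sign-flip permutation test, which asks how often a mean difference at least as large as the observed one would appear if the site labels were interchangeable. The function name `paired_permutation_p_value`, the resample count, the seed, and the sample ratings below are all hypothetical and are not taken from the study.

```python
import random
from statistics import mean


def paired_permutation_p_value(scores_a, scores_b, n_resamples=10_000, seed=0):
    """Two-sided sign-flip permutation test on per-participant score differences."""
    if len(scores_a) != len(scores_b) or not scores_a:
        raise ValueError("need one score per site from every participant")
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    observed = abs(mean(diffs))
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    at_least_as_extreme = 0
    for _ in range(n_resamples):
        # Under the null hypothesis the site labels are interchangeable,
        # so each participant's difference is equally likely to be + or -.
        flipped = [d if rng.random() < 0.5 else -d for d in diffs]
        if abs(mean(flipped)) >= observed:
            at_least_as_extreme += 1
    # Add-one correction keeps the estimated p-value strictly above zero.
    return (at_least_as_extreme + 1) / (n_resamples + 1)


# Invented ratings for six hypothetical participants (not data from the study).
site_a = [78, 70, 81, 66, 74, 79]
site_b = [70, 72, 69, 65, 60, 71]
p_value = paired_permutation_p_value(site_a, site_b)
print(f"p = {p_value:.3f}; significant at 0.05: {p_value < 0.05}")
```

With a small sample, even a visible gap in mean scores can produce a p-value above the usual 0.05 threshold, which is the situation described for the mini-ULX.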

When taking a deeper look at the constructs that compose that ULX score, we can see that the Democrat website obtained higher ratings than the Republican website in all of the constructs, especially in terms of adequacy and appearance, with a difference of nine and 11 points respectively. However, once again, we have to indicate here that only the adequacy construct had a difference that was statistically significant. Therefore, once again, natural variability might be the reason why these differences appeared, rather than a consistent pattern or effect from the actual website experience. This could also be due to the sample’s composition, and therefore we would recommend, again, doing a full ULX sample to confirm this.

But what we should take from these results is which constructs the websites should work on, regardless of the difference between the two competitors, and where they can improve their user experience. For example, the Republican website shows very low scores on distinction, affinity, appeal, and trust. However, we also see that on the Democrat website, these same constructs are the ones that score the lowest. So here we see that maybe the problem is, in general, with political websites, with the industry as a whole, which in turn could also mean an opportunity for either of them to stand out if they work on those aspects and provide a better user experience.

Taking those constructs that the website need to improve is when we go even deeper, even smaller, even more detailed, and we study those attributes that are included in those constructs or that construct that we want to work with. And we review the qualitative information that the ULX benchmarking study provides us to see why those ratings are presented and what actionable insights we can take out of them.

So for example, here, we have one of the attributes that is involved in that distinction construct, which we saw that both of the websites need to work on the most. In this case, we present an example from the Democrat website, where we can see that there is actually a lot that they have to work on in terms of uniqueness and showing updated information. Today, there are so many websites, and users are constantly on multiple websites, so if they don’t see that a site is up to date, they don’t want to actually go into this website. If they don’t see the effort of the website trying to amaze them, they’re not going to be driven and motivated to even go into the website, let alone browse it and spend time on it. So this is one of those constructs that is very important to work on, and we can see here in the user verbatim why they are giving low scores to this type of attribute.

Another construct on which both websites don’t score very high, but especially the Republican website, is trust, which could be seen as a very crucial aspect for websites in the realm of politics, because at least part of their purpose is to convince constituents to join their party and also to donate. And as we can see here in this verbatim, when the user actually goes into the donation flow, they really don’t feel confident providing that information. The users really felt like their data is not secure on this website. Nowadays, being transparent about why and for what our data is being asked on a website is not only seen as a must, but its absence can be devastating for a website and have a direct effect on the brand perception and therefore on the website experience. So reducing as much as possible the information that is requested is vital. And if it has to be asked, it’s especially important to make very clear what the terms and the potential uses of the information are if users wish to provide it. You would think that this is already common knowledge and that everyone knows this, but apparently we see evidence here that this is not the case.

In addition to getting verbatims from the users, another benefit of having the qualitative sessions in the ULX benchmarking score, and especially on the Userlytics platform, is that it provides us with videos of the users, which sometimes reveal issues that the users do not remark on verbally, or do not even reflect in their evaluations, but that are useful for product designers to improve their websites and the users’ experience on them.

So for example, in this case, the ease of use attribute scores might not be very alarming. We see a 68 and a 75, which is not good, but it’s okay. However, when we review the navigation of the users on both of the websites, we can spot several issues, ranging from small things, such as, as we see on the left, a carousel that moves too fast and does not allow the users to actually read the information on it, or maybe just a very small typo on the website, which we can fix in one second. But it can also show us very big issues, like we can see in this highlight reel that we have created on the Userlytics platform to illustrate precisely those study facts.

Well, what we do, support tab. join them. Or we’re just going to scroll down on the page probably. Use me for this. I’m trying to. Well, what we do, support tab, or we’re just going to scroll down on the page probably. There. Connect. Well, I mean it should be right there, but that’s not letting me do anything. Take action maybe. Oh, if it’s near me, please let’s see. So, opportunities to volunteer. So, take action. And – I guess sign up. Okay, I’m looking for the platform. platform featured, are you America Connect co-chair? That’s interesting. Maybe it’s up here. That’s weird. Take action, vote news about our party. I don’t really see a party platform. It is who we are. Member gallery. There’s a platform. I don’t see any place. I don’t see any place that discusses the platform. Okay. Once you’ve located the portas platform, go ahead and copy and paste the first sentence. I, I don’t see it. I’m going to say maybe this is the platform. It’s the closest I can find. I’m going to create it here. I don’t really think that’s a platform. What’s where we stand? That would be the platform. Party platform right there. Boom. Now that, click right on the headline. Did you see that? No, that’s the way it should be done. Not like little tiny letters at the bottom with the color of the font that matches the whole thing below. Download the party platform. Preamble. Let’s see. Okay. Wow. So that’s that’s the preamble. That’s the preamble. Talk about ambling. Oh my god. This is like a wow. Oh boy. So there is nothing unlike the Republicans. The platform is created to uplift. I guess that’s it. The platform I thought I think that the the Republican message on what is the platform is much better than the Democrats here. The first sentence of the platform. I’m gonna have to do this, man. I’ve got to download the PDF. I just have to. Consider Democratic. Oh boy. Nobody’s gonna… 92 pages! Nobody is going to read this! Even the Congress people aren’t gonna read this. 
We’re not gonna click. Do I see anything? I’m confused. Please show where you would go to learn about the party’s platform. Speak your thoughts. Once you’ve seen the platform, okay, well that was what I was looking for. It’s not readily apparent to me what the platform is. Do I X out of this and see, okay, that makes a little bit more sense. Again, that wasn’t intuitive.

Okay, so we could see here a lot of examples of how the platform wasn’t actually very intuitive, how users have issues with the actual interaction with those different websites.

Okay, let me just go back to that presentation. Okay, here we go. Once again, the report would present this type of reels.

Those qualitative sessions allow us to show to you, to illustrate to you, not only the quantitative results, but also see firsthand how the users and why the users are presenting these type of evaluations.

The ULX score also allows us to divide the sample by segmentation groups of interest. For our example, we initially asked the users their political affiliation in order to analyze the results by each of them and see the impact it may have on their experience and evaluations. These results are obviously based on an even smaller sample, so they should be taken very cautiously, but they show us some initial tendencies.

As we can see in this left graph, there is a slightly larger group of Democrats than Republicans in the sample, but we also have a very interesting number of independent users in the test, which made it interesting to also see the differences in the ULX scores they provided.

And as expected and seen on the right, the users who associated themselves with the Democrat or the Republican political parties gave a higher overall ULX score to the website of the party they identified with. However, it is intriguing that the difference between the ULX scores in the Republican group is much smaller than the one in the Democrat group. And actually, when we tested the significance of this, we saw that there are statistically significant differences between the Democrat website and the Republican website scores only for the Democrat-affiliated group and not for the independent-affiliated or the Republican-affiliated group.

When comparing the scores across the segments, only the difference in score for the Republican website was statistically significant and only between those identified in the Democrat group and in the Republican group. The Democrat website scores, given by the different affiliation groups, were not statistically significant between any of the segments.

Regarding the constructs, on the Democrat website, the higher rating for most constructs was, as expected, provided by those users that identified with the Democrat Party. Nevertheless, it was the Republican users who rated the Democrat website better in terms of usability and appearance, and gave the same score in the performance construct. However, none of the differences observed were statistically significant between the Republicans and the Democrats, or the Republicans and the independents. The difference was only statistically significant in one of those constructs when looking at the Democrats versus the independents, where the Democrats gave their website a significantly better performance score than the independent group. But this again could be due to the sample size and would have to be confirmed in a full study. In general, what we can take from this construct table is that even for the users affiliated with this political party, the Democrat website has to work on many of the constructs to reach a better user experience. And again, the distinction construct is one of those two constructs that they would have to put more focus on.

Passing on to the Republican website and how the constructs were rated by the different affiliations, we could say that the influence that party identification has on the user experience on the websites is definitely confirmed. Once again, the Republican-identified group is the one that gives better evaluations to this website in all of the constructs than both of the other two affiliation groups. And in this case, many of those differences are actually statistically significant, even with the small sample size that we had. And even though we can also see the impact of the affiliation in that construct of trust that we explored before, we can see here that even the Republicans still provide ratings that could be much improved in terms of appeal, which refers, for example, to how engaging the website is; adequacy, which refers to having complete information, all of the information that the users need; affinity, which covers the recommendation and the top of mind that we mentioned before; and again, that distinction construct that we have seen is lacking in both of those political parties’ websites.

In a full ULX, we could see the detail of those qualitative sessions per affiliation group and maybe discern the actions that could be taken by both of the websites, taking into account the impact of the user’s identification, but also those of their own point.

So based on all of the findings, we could provide recommendations, both on actions that could be applied on the website at the moment that we present the report, to provide a better experience immediately, and on research activities that could be performed by the brand or by the team in order to dig deeper into those constructs that we have identified as needing more work, and to explore in more detail why they are underperforming and what can be done so that ratings increase. Here we have just a couple of brief examples based on our political party duel, but obviously in a ULX benchmarking score report, these areas would be covered in a lot more depth in order to provide you with more actionable insights.

Obviously, another recommendation would be to do several ULX studies in a recurring manner, for example, every year or every six months, depending on the product development pace, as this allows you to track the progress of the asset. It allows you to see what impact not only those previous insights, but all the work that you have done on the asset, has had on the overall ULX and on each of those different constructs and attributes. It can also give you that benchmark of how you’re doing versus your competitors, and therefore help you evaluate what other priorities you have to put in your roadmap. And also, as an additional benefit, and a small and last tip: after the first study, you actually get a better price on the recurring ones. So it’s definitely worth taking advantage of.

So thank you so much for joining, and I hope that I have covered everything, but we are going to answer some questions that you might have left. So I’m just going to go over the questions that have been left open and start answering them one by one.

So we have an anonymous attendee who asks how we evaluate each of those constructs, and what kind of questions and scales we are using. So as I mentioned, these are evaluated in a questionnaire where we have 18 attributes that are asked to the users. The questionnaire is presented after they actually do the tasks on each of those different websites, and they respond to one questionnaire for each of those digital assets. The scale is one that we have designed, and they respond to each of those 18 elements for each of those different assets.

So, a second person asks, how do you account for cultural bias in your global research for this benchmarking? So, actually, what we do inside the study to avoid some of the biases, for example, is present the different assets that we are evaluating in different orders, so that the results are not affected by presentation bias. In terms of the cultural bias that could be present here, we are not at the moment generating global benchmarks. We’re generating benchmarks specifically for the target or the audience that you’re more interested in. So as I was mentioning before, you can actually talk to us and meet with us to decide what target you want to base the study on. So for example, if your main market is in the UK, we can actually reduce the universe to only the UK, so that only people in that specific country that is of your interest reply to the study. We have not yet presented a global benchmark because it’s more of a customized study specific to each of those clients that want to compare their assets to their main competitors in those markets that they are more interested in.
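The answer above mentions presenting the assets in different orders but does not describe how orders are assigned. One common approach, shown here purely as an assumption, is full counterbalancing: cycle through every possible order so that each asset appears in each position equally often. The function name `counterbalanced_orders` and the asset labels are hypothetical.

```python
from collections import Counter
from itertools import permutations


def counterbalanced_orders(assets, n_participants):
    """Cycle through every possible presentation order, one per participant."""
    all_orders = list(permutations(assets))
    return [all_orders[i % len(all_orders)] for i in range(n_participants)]


# Hypothetical asset labels; 240 matches the full-study sample size
# mentioned in the webinar.
assets = ["your_site", "competitor_a", "competitor_b"]
orders = counterbalanced_orders(assets, 240)

# Each of the 6 possible orders is used equally often, so every asset
# appears in first, second, and third position the same number of times.
print(Counter(orders))
print(Counter(order[0] for order in orders))
```

A real study might instead assign orders at random per participant; the cycling scheme here simply makes the balance easy to verify.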

So a third question that we have here is, what are we considering as impactful in the user experience to weight the constructs in the score? What are we comparing and measuring the impact on the weight? So this is actually something that is done through statistical significance tests. It is not something that we provide subjectively. During all of those closed pilots that we did, we gathered all the information in order to use statistical tests that provide us with the weights that each of those different constructs has on the overall ULX score. So we are not deciding subjectively that this construct weighs more than another. It is something that results from statistical tests.

The following question is, how do you calculate these numbers for affinity, appeal, et cetera? So yes, as I mentioned at the beginning, we have those 18 attributes that constitute those eight constructs, and those eight constructs, again, constitute the overall ULX score. So basically, with those different statistical tests, we create those different constructs with different numbers of those 18 attributes that we present to the user and that they evaluate for each of those digital assets. Therefore, the score is measured based on the different groups of attributes that make up each of those constructs.
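As a rough illustration of the roll-up described in this answer, the following Python sketch averages attribute ratings into construct scores and then combines the constructs into one overall score using weights. The attribute-to-construct grouping, the 0 to 100 rating scale for attributes, the weight values, and the function names `construct_score` and `overall_score` are all assumptions made for illustration; the actual ULX groupings and weights result from Userlytics' statistical tests and are not given in this webinar.

```python
from statistics import mean


def construct_score(attribute_ratings):
    """Average the attribute ratings (assumed 0-100) that belong to one construct."""
    if not attribute_ratings:
        raise ValueError("a construct needs at least one attribute rating")
    return mean(attribute_ratings)


def overall_score(construct_scores, weights):
    """Weighted average of construct scores, dividing by the total weight."""
    total_weight = sum(weights[name] for name in construct_scores)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    weighted_sum = sum(score * weights[name] for name, score in construct_scores.items())
    return weighted_sum / total_weight


# Hypothetical ratings for two of the eight constructs (values are invented).
ratings_by_construct = {
    "usability": [80, 70, 90],  # three hypothetical attributes
    "trust": [60, 60],          # two hypothetical attributes
}
hypothetical_weights = {"usability": 3, "trust": 1}

scores = {name: construct_score(r) for name, r in ratings_by_construct.items()}
print(scores)
print(overall_score(scores, hypothetical_weights))
```

Because the weights are divided by their total, only their relative sizes matter: a construct with weight 3 pulls the overall score three times as strongly as one with weight 1.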

We have here another question from Derry. Thank you so much, Derry, for joining. If 68 and 75 is not good, I’d say it was average, what would constitute good then in this context? What are we aiming for? Yeah, thank you so much for this. As I was mentioning at the beginning, we do have that scale of zero to 100. So we would say that 100 is very good. And 68 and 75 is not good. It is, as you were saying, an average score. I would say that it’s not good. It’s just that it leaves room for improvement. There are things that should be worked on in order to increase that ULX score. What are we aiming for? We are always aiming to improve. I would say that we are, well, obviously we are aiming for that 100. I don’t know if perfection is ever achievable, but we always aim for that 100. And obviously, as I also mentioned at the beginning, the importance of this index also depends on the industry. Obviously, if we have a 75 and our competitors have a 50 and a 40, well, we’re doing great. But if we have a 75 and our competitors have 85 or 87, we’re not doing so well. So that is also something to take into consideration when we evaluate ourselves, and this is why we also include those two competitors, in order to show whether we’re doing well or not, but obviously always aiming to increase our own ULX score.

Okay, so the next question that we have here: how did you end up with 240 participants as the optimal number for statistical significance? Does it apply to any website under test? Yes, so the 240 was also something that we came up with after those several closed pilots. We actually started with a much higher sample size, and through these iterations of different pilots we arrived at an optimal number that also allows us to report some tendencies when we split the sample into segmentation groups. This is why, for segmentation, we can do maybe three or four different groups, but we cannot break it down into many, many groups, so that we don’t lose the validity of the actual results that we get. And does it apply to any website under test? The sample size is the same for every test. What matters here is having enough participants to provide the statistical significance that we get from the statistical tests we run on the data.
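To see why splitting 240 participants into many segments weakens the results, here is a small sketch based on the standard normal-approximation margin of error for a mean score. This is a textbook formula, not the method the speaker describes for arriving at 240, and the standard deviation of 15 points is an assumption made only for illustration.

```python
from math import sqrt


def margin_of_error(std_dev, n, z=1.96):
    """Approximate 95% margin of error for a mean (normal approximation)."""
    return z * std_dev / sqrt(n)


ASSUMED_STD_DEV = 15.0  # hypothetical spread of scores, in points
TOTAL_SAMPLE = 240      # full sample size mentioned in the webinar

# Split the full sample into equal segments and watch the margin grow.
for groups in (1, 2, 3, 4, 8):
    per_group = TOTAL_SAMPLE // groups
    moe = margin_of_error(ASSUMED_STD_DEV, per_group)
    print(f"{groups} group(s) of {per_group}: +/- {moe:.1f} points")
```

Under these assumptions, dividing the sample into four equal groups doubles the margin of error compared with the full sample, which is consistent with treating segment-level results as tendencies rather than conclusions.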

OK, so we have another question. While evaluating usability, do you take accessibility into account, and if yes, which disabilities do you test for? Yes, thank you so much for this. This is a very, very important subject. We do think that accessibility is something that should be taken care of in every single website, and that it does impact the user experience. However, in this case, we are not testing accessibility, because as UX consultants we actually run accessibility tests, and we know that they require a much more customized and detailed approach, specifically in the targets that you have to recruit for these types of studies. They are more of a custom project. But yes, I do agree with you that accessibility is part of this. We are trying to involve in the ULX score as many of the variables that are included in UX indexes as we can. In this case, this is one variable that we cannot include, precisely because of the things that we, as experts, know have to be taken into consideration in accessibility studies.

Okay, so the next one: what kind of test did you do to get statistical significance for a sample of 32 people? What assumptions did you have about the size of the population? Okay, so for statistical significance we used the t-test, Student’s t-test, and for the assumptions, we actually took the population of the United States, but obviously with a sample of 34 users. That’s why I mentioned at the beginning that we just wanted to give you some tendencies, a sneak peek of the different results that you could get in a ULX. That’s why we call it a mini-ULX, because it has to be made clear from the beginning that the conclusions we get from a mini-ULX cannot be generalized to the general population.
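For readers unfamiliar with the test named here, the following is a minimal sketch of the classical two-sample Student’s t statistic with pooled variance, using only the Python standard library. Which comparison was actually run in the mini-ULX (two-sample, paired, or one-sample) is not stated in the webinar, so the two-sample setup and the scores below are assumptions for illustration.

```python
from math import sqrt
from statistics import mean, variance


def student_t(sample_a, sample_b):
    """Return (t statistic, degrees of freedom) for two independent samples.

    Classical Student's t-test: assumes both groups share the same variance.
    """
    n_a, n_b = len(sample_a), len(sample_b)
    df = n_a + n_b - 2
    # Pooled estimate of the common variance of the two groups.
    pooled_var = (
        (n_a - 1) * variance(sample_a) + (n_b - 1) * variance(sample_b)
    ) / df
    std_err = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(sample_a) - mean(sample_b)) / std_err, df


# Made-up scores for two sites, for illustration only.
site_a = [70, 72, 74, 76, 78]
site_b = [60, 62, 64, 66, 68]
t_stat, df = student_t(site_a, site_b)
print(f"t = {t_stat:.2f} with {df} degrees of freedom")
```

The t statistic is then compared against the t distribution with the given degrees of freedom to obtain a p-value. In practice a library routine such as `scipy.stats.ttest_ind` does both steps.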

How long does a full study with 240 users take end to end? So, actually, the users fill out the questionnaire, the whole test, in 30 minutes. Those who go through the unmoderated qualitative sessions take a little bit longer because they talk through the different tasks, and this is why they give us a lot of information through those qualitative sessions. But normally it takes around 30 minutes from beginning to end, if that is the time that the person asking is referring to. If instead you are asking how long we take to do the whole study, from the kickoff call to presenting the results: normally, we would take one day to have this kickoff call, then maybe one week to create the study for you and launch it. Depending on the target, we would say that it takes another week to recruit all of the users, and around 10 to 15 days to generate the report for you and present it to the stakeholders that may be interested in the results.

OK, so passing on to the next question: are you able to run, I guess, competitor sites that are not public? Unfortunately not. This is a limitation of how we work with our panel: we have to be able to send a link to the users. For example, if there is an app that they can download, this is something that we can test. However, something that is private, or that they may need to pay for, would make it more difficult, and if it’s not public, this might not work. What we can do, for example, is test one prototype against another prototype, or different versions of a website, where you want to compare the user experience that people have with one version or another. This is something that we can compare by providing them with the links. But it would also have to be something that they can access within our platform.

OK, going on to the next. Would it be okay if we do not have 240 participants? Would the results still be valid? Okay, so as I was mentioning before, the 240 users is what we found gives you generalizable results, something that we can extrapolate to your target population. If the sample is smaller, it can be done, as you saw with the mini-ULX. However, each reduction lowers the validity of the results, so we would call them tendencies and not conclusive results that could be extrapolated to the general population. So in this case, we do not only recommend it: when we run the study, we enforce those 240 in order to provide you with really conclusive results, results that we can actually stand by when we present them to you.

Okay, so next question: if we work B2B and we have two main personas who use the site quite differently, do we need two separate tests with 480 participants, or would 240 still be okay? Okay, so this is actually a very interesting case, not one I had been presented with before, but I would guess, just off the top of my head, that it depends on how different their navigation is. If it’s just a task or something that they have to do differently, maybe we can do it with the 240 and present both of the tasks to both of the targets. Then, in the part of the report where we break down the different segments, we can divide the sample. But once again, we would be dividing the sample and giving you tendencies for one group and the other. In the other case, if the targets are very different, their navigation is very different, and even the tasks that they have to do on the website differ completely, it may be necessary to run two different ULX studies, one for each of those targets.

So we have Michelle here. Thank you, Michelle, for joining. Do you account for neurological and physical diversity in participants? Yes, so going back to that accessibility question, this is a bit related. Obviously in our panel we have users with various disabilities, but we do not distinguish between one and another. This is not something that we are including now. And the accessibility part, as I mentioned before, is not something that we are including in the study, because of the requirements that we have seen when doing accessibility studies as UX consultants. So it’s not something that we have included as part of the ULX evaluation. However, we cannot rule out that the users in our panel who actually respond to our test include people with any type of disability.

Okay, so going over to Arturo, thanks Arturo for your question. Could we use a narrow profile to test our website, like Arizona housewives between 45 and 48? Yeah, so thanks for this question. As I mentioned during the presentation, we do have to stay with a broad profile, not as broad as the general or global population. However, we cannot go as narrow as you’re mentioning here, because 240 people in Arizona who are housewives between 45 and 48 is not a profile with an incidence rate over 5%. That is our condition, just to allow us and our panel at Userlytics to achieve that 240 sample for you. This is why we also do that initial kickoff call with you, so that we can see what the priorities among the criteria are. Maybe we don’t do it only in Arizona but with housewives in general, aged 40 plus, across the whole United States, and then validate with our operations team that the incidence rate with those criteria allows us to achieve our 240 sample size.
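The feasibility check described here can be sketched as simple arithmetic. The 240 target and the 5% threshold come from the webinar; the function names, the example incidence rates, and the simplifying assumption that every qualifying panelist completes the study are hypothetical.

```python
from math import ceil

TARGET_SAMPLE = 240        # sample size required for a full study
MIN_INCIDENCE_RATE = 0.05  # profiles must have an incidence rate above 5%


def panelists_to_screen(incidence_rate, target=TARGET_SAMPLE):
    """Estimate how many panelists must be screened to reach the target.

    Simplification: assumes every panelist who qualifies also completes.
    """
    if not 0 < incidence_rate <= 1:
        raise ValueError("incidence_rate must be in the interval (0, 1]")
    return ceil(target / incidence_rate)


def is_feasible(incidence_rate):
    """Apply the 'over 5%' rule of thumb mentioned in the webinar."""
    return incidence_rate > MIN_INCIDENCE_RATE


# Made-up incidence rates for a narrow and a broadened profile.
for label, rate in (("narrow profile", 0.02), ("broadened profile", 0.25)):
    print(label, is_feasible(rate), panelists_to_screen(rate))
```

At the 5% threshold itself, reaching 240 completes would mean screening roughly 4,800 panelists, which shows why narrower profiles quickly become impractical.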

Okay, thanks so much. So we have a last question here. How do you conduct your benchmarking qualitative study, between subjects or within subjects? Why? Okay, so the first question, how do you conduct your benchmarking qualitative study? So basically, we don’t do two different studies. We just provide complementary qualitative information in order to support the quantitative metrics that the users provide us. So from the 240-user sample that we have, 10% of those users go through unmoderated sessions. They do exactly the same test that the rest of the users do, but we record them and they think out loud while they do the different tasks on those websites and while they answer the questionnaire, so that we can see more of the insights that are revealed by how they interact with the website and how they explain why they’re giving one score or another. So this is how we provide that qualitative insight, complementing the quantitative results that we also gather from the questionnaire.

Okay. Arturo, thanks for replying. Can this test be run in several countries, or is it limited to one single country? Great. Yes, this can be run in several countries. Again, we would have to validate with our operations team that the countries we include meet that incidence rate that I mentioned before, but it is possible to launch. If, for example, we have countries with different languages, the study would either have to be completed entirely in English, or we would have to translate it, which requires additional translation services. However, it is possible. We would always validate it with our operations team, and we would gather the information from maybe two or three countries as one single sample. Then, in the tendency analysis that we do for the segmentation groups, we would divide by country and give you the tendency results based on those divisions, always taking into account that it would be a tendency result rather than the conclusive result that we get from the full sample.

Great, we have another question here, I like this. Are you going to share the recording via... I know, yes. I think we are sharing this; let me just confirm with the team that this is something that we can do. But I’m sure that we can provide you with the recording. Yes, indeed, we are sharing this, along with the instructions to claim your 50% offer for the ULX score, so that you are more motivated to join the study. If you have any questions, you can also reach our team and your account manager, in case there are any additional doubts or if you want to meet us just to see if this is something that interests you.

Excellent. So I’m going to give it a couple of minutes just to see if anyone has any additional questions they want to share that I can reply to here.

Excellent. I think that there are no more doubts. I really hope that everyone got what they wished for in this webinar, and I’m really glad that everyone was involved. Thank you so much to everyone who asked questions, so that we could have this Q&A session to clarify the ULX score. And thanks, everyone, for joining and attending this webinar. We are very happy to be presenting this product to you and to always help you experience the best of Userlytics and help you improve your digital assets. Hope you have a great day or night, depending on where you are, and we’ll see you soon.

Sarita Saffon

Sarita Saffon is a UX Researcher with extensive experience in conducting end-to-end market, brand, consumer and design research, using quantitative, qualitative, and mixed methods. She has been working in the research field for over 7 years, both on the client and the consulting side, and in diverse industries from technology to consumer goods. Her goal is to be the voice of users inside businesses so that their viewpoint is taken into account. She achieves this by transforming their opinions and behaviors into actionable insights that help companies make user-centered and data-based decisions.

FAQs

What is usability benchmarking?

Usability benchmarking is the process of measuring a product or website’s usability performance over time or against industry standards. By tracking specific metrics, usability benchmarking helps identify areas for improvement and allows for objective comparison of user experience against competitors.

How does the ULX® Benchmarking Score differ from traditional usability testing?

The ULX® Benchmarking Score goes beyond traditional usability testing by evaluating 18 key UX attributes across 8 constructs, including Usability, Appeal, and Trust. Unlike standard usability tests, this score provides a comprehensive user experience assessment, offering a 360-degree measurement that addresses both functional and emotional aspects of UX.

What are some examples of usability benchmarking metrics?

Usability benchmarking often includes metrics like task success rate, error rate, and time-on-task. The ULX® Benchmarking Score expands on these by including additional metrics that evaluate user satisfaction, engagement, and perceived trust, enabling a well-rounded view of user experience.

How can UX benchmarking be conducted effectively?

To conduct UX benchmarking effectively, it’s essential to select relevant metrics, establish baseline scores, and compare data over time or against competitor benchmarks. The ULX® Benchmarking Score provides a structured approach by focusing on critical UX attributes and providing industry-standard comparisons.

What are the key areas for benchmarking in user experience?

In UX, the primary areas for benchmarking include Usability, Appeal, Trust, Accessibility, Engagement, Efficiency, and Loyalty. These areas help assess both the practical and emotional components of user experience, providing a holistic perspective on how well a product or website meets user needs.

Why is competitive benchmarking important in UX?

Competitive benchmarking allows companies to measure their UX performance against competitors, helping to identify strengths and weaknesses in their digital assets. By using tools like the ULX® Benchmarking Score, organizations can gain valuable insights into where they stand in the market and how to improve their user experience.

Where can I find resources on UX benchmarking frameworks and case studies?

For those looking to deepen their understanding, the ULX® Benchmarking Score webinar offers a practical introduction to UX benchmarking frameworks and real-world case studies. It’s an ideal resource for product managers, UX designers, and digital marketers.

Can the ULX® Benchmarking Score be used for competitive benchmarking?

Yes, the ULX® Benchmarking Score is specifically designed for competitive benchmarking, allowing you to measure and compare your UX performance against industry standards and competitor metrics. This tool provides actionable insights that can guide improvements and foster user loyalty.
