Upsiide Market Simulator vs MaxDiff

We recently released Market Simulator on the Upsiide platform. Market Simulator democratizes advanced analytics by automatically transforming Upsiide Idea Screen data into forecasts of share of choice, source of volume, and incrementality.

Market Simulator uses the same advanced Hierarchical Bayesian (HB) models that MaxDiff practitioners use. Some have asked how Market Simulator results compare to more traditional MaxDiff analysis.

Executive Summary

Upsiide’s Market Simulator achieves comparable or better share of choice results while using less respondent time than MaxDiff. Upsiide is 10%-40% faster for respondents to complete and offers the following additional benefits over MaxDiff:

  • Mobile first respondent interface
  • More intuitive & engaging respondent experience
  • Produces Idea Scores in absolute terms, not just relative terms (as with MaxDiff)
  • Access additional benefits of the Upsiide platform:
    • Produces additional data visualizations that support strategic decisions. The simple ranking that MaxDiff produces can lead to simplistic interpretation.
    • Fully automated dashboard reporting.
    • Access to questionnaire templates.
    • Access to audience marketplace for programmatic sample buying.

What is the respondent experience of MaxDiff and Upsiide Idea Screen?

MaxDiff presents multiple ideas simultaneously, asking the respondent to choose between those ideas. The methodology is not mobile-friendly because it requires significant scrolling.

Upsiide presents ideas one at a time. Respondents provide a positive (swipe right) or negative (swipe left) reaction to each idea. After reviewing all of the tested ideas this way, respondents are presented with pairs of the ideas they liked and asked to choose a favorite. These paired comparisons cover all of a respondent’s liked ideas.
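
For readers who think in code, a minimal sketch of that two-stage flow is shown below. It is purely illustrative: the function names, the likes_idea and prefers callbacks, and the choice to pair every liked idea with every other liked idea are our assumptions, not the Upsiide implementation.

```python
import itertools
import random

# Illustrative sketch of the two-stage Idea Screen flow described above.
# Names and the exhaustive pairing scheme are assumptions, not Upsiide's code.
def idea_screen(ideas, likes_idea, prefers):
    # Stage 1: each idea is shown once and swiped right (like) or left (dislike).
    liked = [idea for idea in ideas if likes_idea(idea)]
    # Stage 2: liked ideas are shown in pairs and the respondent picks a favorite.
    pairs = list(itertools.combinations(liked, 2))
    random.shuffle(pairs)
    choices = [(a, b, prefers(a, b)) for a, b in pairs]
    return liked, choices
```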

Idea Ranking vs. Idea Validation

Both Market Simulator and MaxDiff rank your ideas, providing a share of preference. However, MaxDiff gives no clear read on the independent strength of each tested idea. In practical terms, all of your tested ideas might be strong, or they might all be weak, and MaxDiff alone cannot tell you which.

Both Upsiide Market Simulator and MaxDiff will rank the ideas relative to each other. Only Upsiide Market Simulator additionally tells you, independent of this ranking, if the ideas are strong or weak. To give an example, Upsiide and MaxDiff could both rank the insects people most want to eat. Upsiide would additionally tell you that (for the most part) people don’t want to eat any of the insects. This kind of absolute read on the strength of your ideas is critical to making an effective decision.

Synthetic Data Analysis

The remainder of this white paper discusses our analyses comparing the performance of MaxDiff’s share of choice to Upsiide’s share of choice. We start with a technical analysis of synthetic datasets to see how well our models recover the preferences of created “robot” respondents. Next, we review four independent studies conducted as A/B tests, where each study was set up as both an Upsiide Idea Screen and a MaxDiff.

A typical machine learning (ML) exploratory analysis starts by evaluating your models on synthetic data to understand their statistical properties. With synthetic data, we create known preferences for “respondents” (a.k.a. digital robots), use those preferences to “answer” an Upsiide Idea Screen and a MaxDiff exercise, then model the data and compare the results against the known preferences. This analysis tells us the theoretical capacity of each approach.
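
As a rough illustration of this setup, the sketch below creates robot respondents with known utilities and has them answer randomly generated MaxDiff-style tasks under a logit choice rule. It is a simplified stand-in for the actual pipeline: the variable names are ours, the tasks are drawn at random rather than from a balanced experimental design, and the HB estimation step is not shown.

```python
import numpy as np

rng = np.random.default_rng(42)
n_respondents, n_ideas, n_tasks, per_screen = 1000, 20, 12, 5

# Known "true" preferences for each robot respondent.
true_utils = rng.normal(size=(n_respondents, n_ideas))

def answer_maxdiff(utils, rng):
    """Simulate one respondent's best/worst picks under a logit (Gumbel-noise) rule."""
    answers = []
    for _ in range(n_tasks):
        shown = rng.choice(n_ideas, size=per_screen, replace=False)
        noisy = utils[shown] + rng.gumbel(size=per_screen)   # logit-consistent noise
        answers.append((shown.tolist(), shown[np.argmax(noisy)], shown[np.argmin(noisy)]))
    return answers

# The utilities a model estimates from these answers can later be scored against true_utils.
synthetic_answers = [answer_maxdiff(true_utils[r], rng) for r in range(n_respondents)]
```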

MaxDiff studies require an experimental design that determines how many choice tasks each respondent will complete. If 20 ideas are being tested with five ideas shown per screen, it is common to use either 8 or 12 choice tasks; we label these the small and large MaxDiffs in this paper. These designs let each respondent see each idea two or three times, respectively, across the whole exercise. In general, as this number increases, the researcher collects more information from a respondent, and more information should, in theory, lead to more accurate models. Experimental designers must therefore balance survey length against collecting sufficient data to build quality models. In contrast, the Upsiide Idea Screen shows each idea only once, minimizing repetition and fatigue. We use synthetic data to compare Upsiide’s Idea Screen performance to both the small and large MaxDiff.
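
The "two or three times" figure follows directly from the design arithmetic. A quick check, using the 20-idea, 5-per-screen setup described above:

```python
# Exposures per idea = tasks * ideas per screen / total ideas.
n_ideas, per_screen = 20, 5
for n_tasks in (8, 12):            # the "small" and "large" MaxDiff designs
    print(n_tasks, "tasks ->", n_tasks * per_screen / n_ideas, "exposures per idea")
# 8 tasks -> 2.0 exposures per idea
# 12 tasks -> 3.0 exposures per idea
```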

The hit rate is simply the percentage of an individual’s choices that a model can accurately predict. Figure 1 shows the accuracy of Upsiide against the two sizes of MaxDiff. Here we see Upsiide does well but is slightly outperformed by MaxDiff. However, real-world factors such as respondent engagement, fatigue, and the intuitiveness of the exercise have important effects on real respondents that we can’t capture with synthetic respondents. Synthetic data can’t tell the full story on its own because real respondents aren’t perfectly rational robots! Still, this synthetic analysis indicates that Upsiide can potentially identify the same underlying preferences that MaxDiff is known for.
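
For concreteness, a minimal version of the hit-rate computation used throughout this paper might look like the sketch below. It is our illustration only; the estimated-utility matrix and the validation-question format are assumptions about how the data is stored, not a description of the production pipeline.

```python
import numpy as np

def hit_rate(est_utils, validation):
    """est_utils: (respondents x items) estimated utilities.
    validation: iterable of (respondent, item_a, item_b, chosen_item) held-out questions."""
    hits = [
        (a if est_utils[r, a] >= est_utils[r, b] else b) == chosen
        for r, a, b, chosen in validation
    ]
    # Share of held-out validation choices the model predicted correctly.
    return float(np.mean(hits))
```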

Figure 1: Accuracy of Model Predictions on External Head-to-Head Validation Questions

Before we move on, a quick note on statistical significance. In quantitative research, statistical significance tells us whether a result is likely due to chance or whether it would recur if the study were repeated. Because this analysis was done with synthetic data, we can easily run it across multiple seeds with thousands of simulated respondents. The differences shown in Figure 1 are statistically significant: on synthetic data, Upsiide is less accurate than MaxDiff.
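
To make the multi-seed idea concrete, one way to run that kind of check is sketched below. The arrays are randomly generated placeholders, not the results behind Figure 1; the point is simply that per-seed hit rates from the two approaches can be compared with a standard two-sample t-test.

```python
import numpy as np
from scipy import stats

# Placeholder per-seed hit rates (random, for illustration only).
rng = np.random.default_rng(7)
upsiide_by_seed = rng.normal(loc=0.75, scale=0.01, size=20)
maxdiff_by_seed = rng.normal(loc=0.77, scale=0.01, size=20)

# Two-sample t-test across seeds: is the accuracy gap larger than seed-to-seed noise?
res = stats.ttest_ind(upsiide_by_seed, maxdiff_by_seed)
print(f"t = {res.statistic:.2f}, p = {res.pvalue:.4g}")
```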

Testing Upsiide vs MaxDiff

The real-world test of Upsiide vs. MaxDiff performance comes from fielded studies with human respondents. Table 1 below outlines the base sizes across the four tests.

When we analyze model accuracy, we don’t know the true underlying preferences of our respondents, so we ask validation questions that we then try to predict with each model. We use hit rates, an intuitive metric that tells us how often a model predicted the correct answer to validation questions that were held out from modelling. Our validation questions were identical across MaxDiff and Upsiide, and each of the four studies included 5-6 head-to-head item comparisons.

| Study | Number of Items Tested | Upsiide Idea Screen | MaxDiff Exercise | Total |
| --- | --- | --- | --- | --- |
| Nutrition & Dietary Opinions (text) | 20 | N=400 | N=400 | N=800 |
| Burrito Toppings (text) | 25 | N=400 | N=400 | N=800 |
| Big 5 Personality Traits (text) | 20 | N=400 | N=400 | N=800 |
| Grocery Produce (images) | 30 | N=400 | N=400 | N=800 |
| Total | | N=1600 | N=1600 | N=3200 |

Table 1: Inputs and Sample Size by Cell

Results

We summarize the case study results in Figure 3 below. There is some random variation from study to study, but averaged across all four studies, Upsiide’s accuracy is equal to MaxDiff’s when the analysis is based on real-world data (as opposed to synthetic data).

Figure 4 shows that the results of the two exercises are highly correlated: between 91% and 94% of the variation is shared between them. In other words, the two exercises rank items very similarly and consistently.
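
The “share of variation” figure is the familiar R² from a simple scatterplot fit. Assuming it is computed on item-level preference shares, a minimal sketch looks like this (the inputs are placeholders, not the study data):

```python
import numpy as np

def shared_variation(upsiide_shares, maxdiff_shares):
    """R^2: squared Pearson correlation between the two sets of item-level shares."""
    r = np.corrcoef(upsiide_shares, maxdiff_shares)[0, 1]
    return r ** 2
```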

Lastly, Figure 5 shows the median survey length. Upsiide is consistently faster for respondents to complete, a result that is most pronounced when compared against the larger MaxDiff.


Figure 3: Summary of Accuracy


Figure 4: Scatterplots Comparing Upsiide and MaxDiff (Large) preferences


Figure 5: Summary of Time to Complete Survey

Conclusion

To summarize, when run on real-world data and compared to MaxDiff, Upsiide’s simulations:

  • Are equally accurate (a statistical tie)
  • Take significantly less respondent time (statistically faster)
  • Produce Idea Scores in absolute terms, not just relative terms
  • Are fully automated
  • Are mobile first
  • Are more intuitive & engaging

With the synthetic data analysis, we saw that Upsiide’s simulation performance was strong but below that of MaxDiff. However, with real people, who have real attention and engagement constraints, Upsiide’s intuitive approach shines through. In a real-world context, Upsiide’s Market Simulator predicted external validation questions with the same accuracy as MaxDiff models.

Writing engaging surveys is core to the DNA of Dig Insights, and we were awarded the Quest Award by Dynata for consistently high-quality respondent experiences. This Upsiide analysis shows how much there is to gain from designing an engaging survey experience.