The CAC Score Study

From Raw Responses to a Verified Analysis Set

This is Part 3 of our four-part series on the CAC Score, the California-lens rating that powers every casino review published by CA Casinos, Palo Alto Organization. Where Part 1 told the story of how the project began with a small player panel in Palo Alto and grew into a California-wide study, and Part 2 set out the sampling design and the instrument we used to collect data, this paper does something more concrete. It walks you through the numbers. We want any California player aged 21 or over to be able to read this section, follow the logic from one statistic to the next, and finish with a clear sense of why our ratings are not opinions dressed up as data but the product of disciplined measurement.

We will not hide the mathematics. We will show the formulas, plug in the values, and explain in plain language what each result means. We will be honest about what the numbers can and cannot tell you. And we will tie every inferential claim back to the sampling design described in Part 2, because a t-test or a confidence interval is only as trustworthy as the sample it rests on. The data set behind this paper is large by the standards of consumer research: N = 4,217 verified California respondents, each a resident aged 21 or over who plays or intends to play online casino games. That is roughly a 1 percent sampling fraction of the estimated active California online-casino audience, and it gives us an overall margin of error of plus or minus 1.51 percent at the 95 percent confidence level.

Before any of that analysis could happen, the raw responses had to be cleaned, screened, and organised into a state in which statistical software could work with them. That cleaning process is where this paper begins, because the integrity of every later number depends on it.

Loading the data into IBM SPSS Statistics

All quantitative analysis for the CAC Score was carried out in IBM SPSS Statistics. We chose SPSS deliberately. It is a mature, auditable environment in which every transformation, recode, and test leaves a syntax trail, so a second analyst can reproduce our results exactly rather than taking them on faith. Reproducibility is a core part of the experience, expertise, authoritativeness, and trustworthiness that we want this study to demonstrate.

The pipeline ran in a fixed order. First, we exported the survey responses from the data-collection platform into a flat file, one row per respondent and one column per item. Second, we imported that file into SPSS and defined the measurement level of every variable: nominal for region and device, ordinal for each five-point Likert item, and scale for the composite pillar scores and the zero-to-ten net-recommend question. Defining measurement levels up front prevents the common error of running a mean on a variable that should be treated as categorical. Third, we labelled every value so that, for example, a 1 on a Likert item always read as strongly disagree and a 5 as strongly agree, in line with the response format Likert described in 1932. Fourth, we built the composite pillar scores as weighted item batteries, then attached the eight pillar weights so the final CAC Score could be computed as a single weighted sum.

Cleaning and screening: from collected to verified

Not every collected response belongs in an analysis set, and treating them all as equal would corrupt the statistics. We applied a screening protocol before any test was run. We removed responses that failed the age or residency verification gate, because only California residents aged 21 or over are eligible to count toward a California score. We removed straight-lining responses, meaning records where a respondent selected the same option for every item in a battery in an implausibly short completion time, since those records carry no real information about satisfaction. We removed records that failed attention checks embedded in the instrument. We screened for impossible or contradictory patterns, such as a respondent reporting both that they had never made a withdrawal and that their last withdrawal was paid within an hour.
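
The screening rules above can be sketched as a single eligibility check. This is a minimal illustration in Python, not the SPSS syntax used in the study: the `Response` fields and the 60-second straight-lining threshold are hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Response:
    # Hypothetical record layout; the real instrument is described in Part 2.
    age: int
    ca_resident: bool
    battery: List[int]                     # Likert answers, 1 to 5
    seconds_taken: float
    passed_attention_checks: bool
    ever_withdrew: bool
    last_withdrawal_hours: Optional[float]  # None if no withdrawal was reported


MIN_PLAUSIBLE_SECONDS = 60.0  # assumed threshold, not a figure from the study


def is_straight_lined(r: Response) -> bool:
    """Same option on every item AND an implausibly short completion time."""
    return len(set(r.battery)) == 1 and r.seconds_taken < MIN_PLAUSIBLE_SECONDS


def is_contradictory(r: Response) -> bool:
    """Claims never to have withdrawn, yet reports a withdrawal time."""
    return (not r.ever_withdrew) and r.last_withdrawal_hours is not None


def is_verified(r: Response) -> bool:
    """True only if the record passes every screening rule."""
    return (
        r.age >= 21
        and r.ca_resident
        and r.passed_attention_checks
        and not is_straight_lined(r)
        and not is_contradictory(r)
    )
```

Keeping each rule in its own function mirrors the idea of a syntax trail: a second analyst can see which rule removed a record.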

We also handled missing data rather than ignoring it. For items with isolated, scattered non-response, we used pairwise treatment so that a single skipped question did not discard an otherwise complete record. For respondents who abandoned the survey early and left whole batteries blank, we excluded the record from pillar-level analysis where that pillar could not be scored. The number that survived this full protocol, the verified analysis set, is the N = 4,217 we report throughout. When we say 4,217, we mean 4,217 real, eligible, screened California players, not 4,217 raw clicks.
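
To illustrate the idea of not discarding a whole record for one skipped question, here is a minimal available-case sketch in Python. It assumes skipped items are stored as `None`; the function name and data layout are illustrative, not taken from the study's SPSS files.

```python
from typing import List, Optional


def item_means(records: List[List[Optional[int]]]) -> List[Optional[float]]:
    """Mean of each item, using every respondent who answered that item.

    A skipped item (None) removes the respondent from that item only,
    not from the other items in the battery.
    """
    n_items = len(records[0])
    means: List[Optional[float]] = []
    for j in range(n_items):
        answered = [row[j] for row in records if row[j] is not None]
        means.append(sum(answered) / len(answered) if answered else None)
    return means


# Three respondents, three items; each respondent skipped something different.
example = [
    [4, None, 5],
    [2, 3, None],
    [None, None, 1],
]
means = item_means(example)
```

Under listwise deletion all three of these records would be dropped; with available-case treatment every answered item still contributes.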

The independent-samples t-test reported later in this paper has 4,215 degrees of freedom, which is N minus 2. That is not a rounding artefact. It is the direct signature of a two-group comparison run on the full verified set: the degrees of freedom equal the total sample size minus one for each of the two group means estimated. Seeing 4,215 there is a quick way to confirm that the test was run on the complete analysis set rather than a convenience subset.

Every statistic in this paper describes California residents aged 21 or over who play or intend to play online casino games, observed in spring 2026. We do not generalise these numbers to other states, other age groups, or other time periods. The CAC Score is a California measurement, and we keep it that way on purpose.

The Probability Backbone Behind the Numbers

Before we present a single mean or run a single test, we need to be explicit about why the inferential mathematics in this paper is valid. Significance tests, confidence intervals, and margins of error are not magic. They are statements about how a sample is likely to relate to the population it was drawn from, and they only hold when the sample was drawn in a way that supports probability statements. This is the single most important link in the chain, and it is the one most consumer rankings quietly skip.

Why the maths applies to our sample

Part 2 described our design in full, but the short version matters here. The backbone of our recruitment was stratified random sampling, organised so that the achieved sample mirrors California's regional population. We supplemented that backbone with snowball sampling to reach harder-to-find segments, then re-weighted the combined sample to the regional and demographic strata so it matches the state. Snowball sampling is non-probability by nature, a point Goodman made plainly in 1961, so we treat it as supportive rather than as the foundation of inference. Strict statistical inference, the t-tests and confidence intervals you are about to read, rests on the probability portion of the design. The snowball portion is re-weighted, reported transparently, and used to enrich the picture, not to carry the significance tests.

This distinction is what lets us write down a confidence interval and mean it. A confidence interval is a probability statement, and probability statements require a probability sample. Because our backbone is a stratified random sample and the combined set is re-weighted to California's strata, the intervals we report are defensible. We are not asking you to trust a vibe. We are asking you to trust a design.

The margin of error, worked in full

The overall margin of error follows directly from the sample size and the confidence level. The general formula for the margin of error of a proportion is below.

e = z · √( p(1 − p) / n )

Here z is the critical value for the chosen confidence level, p is the proportion we are estimating, and n is the sample size. We use the most conservative possible value of p, which is 0.5, because p times one minus p is largest at 0.5 and therefore produces the widest, most cautious margin. We use z equal to 1.96 for the 95 percent confidence level, meaning alpha equals 0.05. With N equal to 4,217, the arithmetic runs as follows: 0.5 times 0.5 is 0.25; 0.25 divided by 4,217 is approximately 0.0000593; the square root of that is approximately 0.0077; and 1.96 times 0.0077 is approximately 0.0151. That is a margin of error of plus or minus 1.51 percent, exactly the figure we carry through the study.
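
The arithmetic above can be checked in a few lines. This is a small Python sketch of the same formula, with the function name chosen for illustration.

```python
import math


def margin_of_error(n: int, p: float = 0.5, z: float = 1.96) -> float:
    """Margin of error for a proportion: z * sqrt(p(1 - p) / n)."""
    return z * math.sqrt(p * (1 - p) / n)


moe = margin_of_error(4217)
print(f"{moe * 100:.2f}%")  # prints 1.51%
```

Passing any other value of p, such as 0.3, gives a narrower margin, which is why 0.5 is the conservative choice.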

The underlying sample-size relationship is the same formula rearranged, which we show here for completeness because it is the planning tool that told us how many respondents we needed in the first place.

n = z² · p(1 − p) / e²

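Because we are sampling roughly 1 percent of a large but finite population, we also note the finite population correction, which gently shrinks the required sample when the population is not effectively infinite. In the formula below, n is the uncorrected sample size from the planning formula above, n₀ is the corrected sample size, and N is the size of the population being sampled, not the analysis-set size of 4,217.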

n₀ = n / (1 + (n − 1)/N)

At a 1 percent sampling fraction the correction is small, so we report the uncorrected, slightly conservative headline margin of plus or minus 1.51 percent. The practical takeaway for a California player reading a CAC Score is this: when we say a casino scored, for example, 91 on the California survey pillar, the true value for the population we sampled sits in a tight band around that figure, not in some wide cloud of guesswork.
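
The planning formula and the finite population correction can be sketched together. This is a minimal Python illustration; the function names are hypothetical, and the population of 10,000 in the usage lines is a made-up figure for demonstration, not the California audience estimate.

```python
def required_sample_size(e: float, p: float = 0.5, z: float = 1.96) -> float:
    """Uncorrected sample size for a target margin of error e."""
    return z ** 2 * p * (1 - p) / e ** 2


def finite_population_correction(n: float, population: float) -> float:
    """Corrected sample size when the population is finite."""
    return n / (1 + (n - 1) / population)


# Illustrative planning run: a 5-point margin on a hypothetical population.
n_uncorrected = required_sample_size(0.05)
n_corrected = finite_population_correction(n_uncorrected, population=10_000)
```

The corrected figure is always smaller than the uncorrected one, and the two converge as the population grows relative to the sample.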

Margins of error by region

Because we stratified by region, each regional subsample has its own margin of error, and smaller regions carry wider margins simply because they contain fewer respondents. The same formula applies within each stratum; only n changes. The table below repeats the regional design from Part 2 so you can see, at a glance, how confidence narrows as a regional sample grows.

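| Region | CA population share | Achieved n | Margin of error |
| --- | --- | --- | --- |
| Southern California | 58% | 2,446 | ±1.98% |
| San Francisco Bay Area | 20% | 843 | ±3.38% |
| Central Valley | 11% | 464 | ±4.55% |
| Sacramento Metro | 6% | 253 | ±6.16% |
| Central Coast | 3% | 127 | ±8.70% |
| North State | 2% | 84 | ±10.69% |
| Total | 100% | 4,217 | ±1.51% |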

Read this table as a map of certainty. Southern California, with 2,446 respondents, gives us a tight margin of plus or minus 1.98 percent, so regional claims about Southern California players are firm. North State, with only 84 respondents, carries a margin of plus or minus 10.69 percent, so any single-region claim about North State players is necessarily softer and we phrase it cautiously. This is why the overall margin of plus or minus 1.51 percent is the headline figure: it pools the full sample and is far tighter than any one region on its own. When we later compare regions in an ANOVA, these differing margins are part of why we read the regional effect, though real, with caution.
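
The regional margins can be reproduced from the achieved sample sizes alone. The sketch below, in Python, applies the same conservative formula (p = 0.5, z = 1.96) within each stratum; the dictionary simply restates the achieved n column.

```python
import math

# Achieved n per region, copied from the regional table.
REGION_N = {
    "Southern California": 2446,
    "San Francisco Bay Area": 843,
    "Central Valley": 464,
    "Sacramento Metro": 253,
    "Central Coast": 127,
    "North State": 84,
}


def margin_of_error(n: int, p: float = 0.5, z: float = 1.96) -> float:
    """Margin of error for a proportion within one stratum."""
    return z * math.sqrt(p * (1 - p) / n)


# Margin per region, expressed in percent.
regional_margins = {
    region: margin_of_error(n) * 100 for region, n in REGION_N.items()
}

for region, margin in regional_margins.items():
    print(f"{region:<24} n={REGION_N[region]:>5}  ±{margin:.2f}%")
```

Because only n changes from row to row, halving a margin requires roughly four times as many respondents in that region.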

Descriptive Statistics by Pillar

With the analysis set verified and the design validated, we turn to the descriptive results. Descriptive statistics do not test a hypothesis; they describe what we observed. For each of the seven survey-measured pillars we report five numbers, and each answers a different question. The mean answers what the typical California player thought. The standard deviation and variance answer how much players disagreed with one another. The 95 percent confidence interval answers how precisely we have pinned down the true mean. And Cronbach's alpha answers whether the items in that pillar's battery actually hang together as a coherent measure.

The pillar table, read in full

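| Pillar | Mean | SD | Variance | 95% CI | Cronbach alpha |
| --- | --- | --- | --- | --- | --- |
| Trust & Licensing | 4.21 | 0.74 | 0.55 | 4.19 to 4.23 | 0.89 |
| Payout Speed & Banking | 3.98 | 0.92 | 0.85 | 3.95 to 4.01 | 0.86 |
| Bonuses & Value | 3.76 | 1.03 | 1.06 | 3.73 to 3.79 | 0.84 |
| Game Selection | 4.34 | 0.68 | 0.46 | 4.32 to 4.36 | 0.91 |
| Security & Fairness | 4.12 | 0.79 | 0.62 | 4.10 to 4.14 | 0.88 |
| Customer Support | 3.69 | 1.08 | 1.17 | 3.66 to 3.72 | 0.85 |
| Mobile & Responsible Gambling | 4.05 | 0.71 | 0.50 | 4.03 to 4.07 | 0.87 |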

Figure: Mean California satisfaction by pillar (1 to 5 scale). Pillar-level means from the quantitative survey. Game selection scored highest; customer support lowest.

What the means tell us

Read down the mean column and a clear hierarchy of California sentiment appears. Game Selection sits highest at 4.34 on the five-point scale, which tells us that California players are broadly happy with the breadth and quality of games on offer; this is the pillar where the market most consistently meets expectations. Trust and Licensing follows at 4.21, and Security and Fairness at 4.12, which together say that players generally feel their money and their play are safe. Mobile and Responsible Gambling sits at 4.05, consistent with a mobile-first audience that finds the apps usable and the safer-gambling tools reachable.

The lower end of the column is just as informative. Customer Support sits lowest at 3.69, and Bonuses and Value next at 3.76. These are the two pillars where California players are most often left wanting. A mean of 3.69 is still on the positive side of the scale midpoint of 3, so support is not a disaster across the board, but it is clearly the weakest link in the typical California experience. Payout Speed and Banking sits between the strong pillars and the weak ones at 3.98, just under the symbolic threshold of 4, which fits a market where withdrawals are mostly fine but occasionally frustrating. The means, in short, paint a picture of an audience that trusts the games and the licensing but is most easily disappointed by slow or unhelpful support and by bonus terms that do not deliver the value they seemed to promise.

What the variance tells us, and why it differs from the mean

The mean tells you the centre of the distribution. The variance, and its square root the standard deviation, tells you the spread. Two pillars could share an identical mean while one reflects near-unanimous agreement and the other reflects a population split into the delighted and the furious. That distinction matters enormously for a rating system, because a high average built on wild disagreement is a riskier promise to a new player than a slightly lower average built on broad consensus.

We compute the spread with the sample variance, the average squared distance of each response from the mean, using n minus 1 in the denominator so the estimate is unbiased.

s² = Σ(xᵢ − x̄)² / (n − 1)
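
The point that two groups can share a mean and differ sharply in spread is easy to demonstrate. The sketch below, in Python, implements the n minus 1 formula directly and cross-checks it against the standard library; the two response lists are invented for illustration.

```python
import statistics
from typing import Sequence


def sample_variance(xs: Sequence[float]) -> float:
    """Sum of squared deviations from the mean, divided by n - 1."""
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs) / (len(xs) - 1)


# Hypothetical Likert responses: both groups average exactly 4.
consensus = [4, 4, 4, 4, 4, 4, 3, 5]   # near-unanimous
split = [5, 5, 5, 5, 3, 3, 3, 3]       # delighted versus lukewarm

var_consensus = sample_variance(consensus)
var_split = sample_variance(split)
```

The mean alone cannot tell these two groups apart; the variance can.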

Now read the variance column against the mean column. Game Selection has both the highest mean, 4.34, and the lowest variance, 0.46, with a standard deviation of just 0.68. That is the ideal combination: players are not only satisfied, they agree that they are satisfied. Trust and Licensing tells a similar story, a high mean of 4.21 with a modest variance of 0.55. These are stable, dependable pillars.

Contrast that with Customer Support and Bonuses and Value. Customer Support has the lowest mean, 3.69, and the highest variance, 1.17, with a standard deviation of 1.08. Bonuses and Value is close behind, with a variance of 1.06 and a standard deviation of 1.03. High variance on these pillars means the California experience is inconsistent: some players report fast, helpful support and clearly explained wagering terms, while others report long waits and bonus conditions that felt like traps. The high variance is itself a finding. It tells a prospective player that on support and bonuses, your individual experience is far less predictable than it is on games or trust. This is precisely why our CAC Score weights and our qualitative analysis pay such close attention to these two pillars; the dispersion, not just the average, is where the risk lives.

What the confidence intervals tell us

The 95 percent confidence interval expresses how precisely we have located each pillar's true mean for the population we sampled. We construct it by taking the observed mean and adding and subtracting the critical value times the standard error of the mean, where the standard error is the standard deviation divided by the square root of the sample size.

x̄ ± 1.96 · (s / √n)

The intervals in our table are strikingly narrow, all within a few hundredths of a point of their means. Trust and Licensing, for example, sits at 4.21 with a 95 percent confidence interval of 4.19 to 4.23. The reason these intervals are so tight is the size of the denominator: with thousands of respondents, the square root of n is large, the standard error is small, and the interval collapses around the mean. This is the practical payoff of a large sample. The correct way to read the interval is this: if we repeated this study many times under the same design, about 95 percent of the intervals we constructed would contain the true population mean. It is not a statement that 95 percent of players scored between 4.19 and 4.23; that is what the standard deviation describes. The interval is about the precision of our estimate of the average, and ours is high.
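
The intervals in the pillar table can be rebuilt from the reported means and standard deviations. This Python sketch assumes every pillar was scored on the full n of 4,217; if pairwise treatment left slightly fewer usable answers on some pillars, the true intervals would be marginally wider.

```python
import math
from typing import Tuple

N = 4217  # assumption: the full verified analysis set for every pillar


def mean_ci(mean: float, sd: float, n: int = N, z: float = 1.96) -> Tuple[float, float]:
    """Confidence interval for a mean: mean ± z * sd / sqrt(n)."""
    half_width = z * sd / math.sqrt(n)
    return mean - half_width, mean + half_width


# (mean, SD) per pillar, copied from the pillar table.
PILLARS = {
    "Trust & Licensing": (4.21, 0.74),
    "Payout Speed & Banking": (3.98, 0.92),
    "Bonuses & Value": (3.76, 1.03),
    "Game Selection": (4.34, 0.68),
    "Security & Fairness": (4.12, 0.79),
    "Customer Support": (3.69, 1.08),
    "Mobile & Responsible Gambling": (4.05, 0.71),
}

intervals = {
    name: tuple(round(bound, 2) for bound in mean_ci(m, s))
    for name, (m, s) in PILLARS.items()
}
```

Rounded to two decimals, the computed bounds match the table, for example 4.19 to 4.23 for Trust and Licensing.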

Because the intervals are so narrow and, in several cases, do not overlap, we can already see hints of meaningful differences between pillars before we run a single formal test. Game Selection at 4.32 to 4.36 sits clearly above Trust and Licensing at 4.19 to 4.23, which in turn sits clearly above Customer Support at 3.66 to 3.72. Non-overlapping intervals of this kind are a strong informal signal that the differences are real rather than noise, and they set up the formal significance testing that follows.

What Cronbach's alpha tells us about reliability

A mean is only meaningful if the items that produced it actually measure the same underlying thing. We checked this with Cronbach's alpha, the standard coefficient of internal consistency introduced by Cronbach in 1951.

α = (k / (k − 1)) · (1 − Σsᵢ² / sₜ²)

In this formula k is the number of items in the battery, the numerator term sums the variances of the individual items, and the denominator is the variance of the total score. Intuitively, alpha rises when the items move together and falls when they pull in different directions. Conventionally, values above 0.70 are acceptable, above 0.80 are good, and above 0.90 are excellent. Every pillar battery in our study clears the acceptable bar comfortably, with alpha ranging from 0.84 on Bonuses and Value to 0.91 on Game Selection. That range tells us the multi-item batteries are coherent: when a respondent rated one trust item highly, they tended to rate the other trust items highly too, which is exactly what we want from a scale. The reliability evidence means the pillar means above are not averages of unrelated questions but stable measurements of genuine constructs.
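
The alpha formula translates directly into code. The Python sketch below takes one row per respondent and one column per item; the five-respondent, three-item battery is invented to show the calculation and is not study data.

```python
import statistics
from typing import List


def cronbach_alpha(rows: List[List[float]]) -> float:
    """alpha = (k / (k - 1)) * (1 - sum(item variances) / variance of totals)."""
    k = len(rows[0])
    item_variances = [
        statistics.variance([row[j] for row in rows]) for j in range(k)
    ]
    total_variance = statistics.variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical battery: items mostly move together across respondents.
battery = [
    [4, 5, 4],
    [2, 2, 3],
    [5, 4, 5],
    [3, 3, 2],
    [1, 2, 1],
]
alpha = cronbach_alpha(battery)
```

When every item gives an identical answer for each respondent, the function returns 1; as items drift apart, the summed item variances grow relative to the total variance and alpha falls.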

Inferential Results: Testing Whether Differences Are Real

Descriptive statistics describe the sample. Inferential statistics ask whether the patterns we see in the sample are likely to hold in the population, or whether they could plausibly be the product of random sampling variation. This is the heart of the paper, and it is where significance, p-values, and confidence come together. We ran two principal tests: an independent-samples t-test comparing two groups of casinos, and a one-way analysis of variance comparing the six California regions.

The independent-samples t-test: crypto-first versus fiat-reliant casinos

One of the clearest patterns in the raw data was that California players reported faster, happier payout experiences at casinos built around cryptocurrency withdrawals than at casinos that lean on traditional fiat banking rails. The question is whether that gap is real or just noise. We tested it with an independent-samples t-test on payout-speed satisfaction, comparing crypto-first casinos against fiat-reliant casinos. The t statistic compares the difference between two group means against the variability within those groups.

t = (x̄₁ − x̄₂) / ( sₚ · √(1/n₁ + 1/n₂) ),   sₚ² = ((n₁ − 1)s₁² + (n₂ − 1)s₂²) / (n₁ + n₂ − 2)

Read the formula plainly. The numerator is the size of the difference between the two group means. The denominator is a measure of how much the scores bounce around within each group, scaled by sample size; sₚ is the pooled standard deviation of the two groups, and this equal-variance form of the test is the one whose degrees of freedom are n₁ + n₂ − 2. A large t arises when the difference between groups is big relative to the noise within them. Crypto-first casinos posted a mean payout-speed satisfaction of 4.31, while fiat-reliant casinos posted 3.42, a raw gap of 0.89 on the five-point scale. Running the test on the full verified analysis set produced t(4215) = 18.7, p < .001.

That result is decisive, and here is what each piece means. The 4,215 in parentheses is the degrees of freedom, N minus 2, confirming the test used the whole sample. The t value of 18.7 is enormous; conventionally a t around 2 is enough to reach significance at the 95 percent level with a large sample, so 18.7 is far beyond any reasonable doubt. The p-value, p < .001, is the probability of observing a difference this large, or larger, if there were in truth no difference between crypto-first and fiat-reliant casinos. A p below .001 means less than one chance in a thousand. In plain English: the payout-satisfaction advantage of crypto-first casinos among California players is real, large, and not an accident of sampling. This is a finding California players can act on, and it is one reason crypto payout time appears as a measured, hands-on metric in our casino table later in this paper.
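
For readers who want to see the mechanics, here is a minimal Python sketch of an equal-variance (pooled) independent-samples t-test, the form whose degrees of freedom are the combined sample size minus 2. The group sizes and standard deviations behind the reported t of 18.7 are not given in this paper, so the two small samples below are invented purely to exercise the function.

```python
import math
import statistics
from typing import Sequence, Tuple


def pooled_t_test(a: Sequence[float], b: Sequence[float]) -> Tuple[float, int]:
    """Return (t, degrees of freedom) for an equal-variance two-sample test."""
    n1, n2 = len(a), len(b)
    mean1, mean2 = statistics.mean(a), statistics.mean(b)
    var1, var2 = statistics.variance(a), statistics.variance(b)
    df = n1 + n2 - 2
    # Pooled variance weights each group's variance by its degrees of freedom.
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / standard_error, df


# Hypothetical satisfaction ratings for two groups of respondents.
group_a = [2, 3, 4, 5, 6]
group_b = [1, 2, 3, 4, 5]
t_stat, dof = pooled_t_test(group_a, group_b)
```

With two groups totalling 4,217 respondents, the same function would report 4,215 degrees of freedom, whatever the split between the groups.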

Understanding significance, p-values and confidence together

Because these three ideas trip up so many readers, it is worth pinning them down in one place. A p-value is not the probability that our claim is wrong, and it is not the probability that the difference is zero. It is the probability of seeing data at least as extreme as ours if the null hypothesis, the assumption of no real difference, were true. When that probability is very small, we reject the null hypothesis and call the result statistically significant. We set our threshold at alpha equal to 0.05, the same 95 percent confidence level that governs our margins of error, so the whole study speaks a single statistical language.

Significance and confidence are two sides of one coin. A 95 percent confidence interval that excludes zero difference corresponds to a significant result at the 0.05 level, and a significant t-test corresponds to a confidence interval around the difference that does not cross zero. The t-test and the confidence interval are not rival tools; they are the same logic expressed two ways. One last caution we hold ourselves to: statistical significance is not the same as practical importance. With a sample as large as ours, even tiny differences can reach significance, so we always report the size of a difference, such as the 0.89-point payout gap, alongside its p-value, and we let California players judge whether a difference is big enough to matter to them.

The one-way ANOVA: comparing the six regions

The t-test compares two groups. To compare more than two groups at once without inflating the false-positive rate, we use a one-way analysis of variance. We asked whether payout satisfaction differs across California's six regions, the same regional strata that anchor our sampling design. ANOVA works by partitioning the total variability in scores into two parts: variability between the region means and variability within each region, then taking their ratio.

F = MS_between / MS_within

The logic is intuitive. If the regions genuinely differ, the spread between the regional means will be large relative to the random spread of individuals within each region, and the F ratio will be well above 1. If the regions are effectively the same, between and within variability will be comparable and F will sit near 1. Our result was F(5, 4211) = 2.94, p = .012. The two numbers in parentheses are the degrees of freedom: 5 for the between-groups term, which is the number of regions minus one, and 4,211 for the within-groups term, which reflects the full sample minus the six group means. The p-value of .012 is below our 0.05 threshold, so the regional differences in payout satisfaction are statistically significant.
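
The partition into between-group and within-group variability can be written out in a few lines. This is a minimal Python sketch of a one-way ANOVA F ratio; the three small groups are invented for illustration and stand in for the six regions.

```python
from typing import List, Sequence, Tuple


def one_way_anova(groups: List[Sequence[float]]) -> Tuple[float, int, int]:
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]

    # Variability of the group means around the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Variability of individuals around their own group mean.
    ss_within = sum(
        sum((x - m) ** 2 for x in g) for g, m in zip(groups, group_means)
    )

    df_between = k - 1
    df_within = n_total - k
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Hypothetical ratings from three groups.
f_ratio, df_b, df_w = one_way_anova([[1, 2, 3], [2, 3, 4], [3, 4, 5]])
```

With six groups and 4,217 respondents in total, the degrees of freedom come out as 5 and 4,211, matching the figures reported for the regional test.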

But we read this result with care, because significant does not mean large. An F of 2.94 is modest, and a p of .012, while below .05, is far from the vanishing p < .001 of the crypto-versus-fiat t-test. The honest interpretation is that there is real but small regional variation in how satisfied California players are with payouts. Players in some regions report slightly faster or smoother withdrawal experiences than others, but the effect is gentle, not dramatic. This is also where the regional margins of error from the earlier table come back into play: the smaller regions like North State carry wide margins, which makes large, confident regional claims inappropriate. The ANOVA confirms a regional signal exists; it also keeps us from overstating it. For the CAC Score, the practical consequence is that we treat payout satisfaction as a primarily casino-level property, with region as a minor modifier rather than a headline factor.

Net-recommend: the one number that tracks loyalty

Alongside the Likert batteries we asked the net-recommend question drawn from Reichheld's 2003 work: how likely are you to recommend this casino to another California player, on a zero-to-ten scale. We convert these into a net-recommend figure by subtracting the share of detractors from the share of promoters. The pattern tracks the CAC tiers cleanly. Top-tier casinos posted a net-recommend of +58, mid-tier casinos +21, and bottom-tier casinos -9. A positive figure means more California players would actively recommend the casino than would warn others away; a negative figure, as at the bottom tier, means the warnings outnumber the recommendations. The smooth decline from +58 to +21 to -9 as CAC Scores fall is strong external evidence that our scoring captures something players actually feel, not an arbitrary ranking. A casino that earns a high CAC Score also earns the willingness of California players to put their own reputation behind it.
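
The subtraction can be sketched as follows. The paper does not state its cut-offs, so this Python example assumes the conventional net-promoter bands: promoters answer 9 or 10, detractors answer 0 to 6, and 7 or 8 counts as passive. The sample scores are invented.

```python
from typing import Sequence


def net_recommend(scores: Sequence[int]) -> float:
    """Share of promoters minus share of detractors, in percentage points.

    Assumes the conventional bands: 9-10 promoter, 0-6 detractor.
    """
    promoters = sum(1 for s in scores if s >= 9)
    detractors = sum(1 for s in scores if s <= 6)
    return 100 * (promoters - detractors) / len(scores)


# Hypothetical answers from five players: three promoters, one passive,
# one detractor.
nps = net_recommend([10, 10, 9, 8, 3])
```

The figure ranges from -100, when every respondent is a detractor, to +100, when every respondent is a promoter.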

Summary of inferential results

| Test | Comparison | Statistic | p-value | Verdict |
| --- | --- | --- | --- | --- |
| Independent-samples t-test | Crypto-first vs fiat-reliant payout satisfaction (4.31 vs 3.42) | t(4215) = 18.7 | p < .001 | Significant, large effect |
| One-way ANOVA | Payout satisfaction across 6 California regions | F(5, 4211) = 2.94 | p = .012 | Significant, modest effect |
| Net-recommend (NPS) | Top vs mid vs bottom CAC tier | +58 / +21 / -9 | Descriptive trend | Tracks CAC tiers cleanly |

Read together, these tests draw a consistent map. The single biggest, most reliable driver of payout satisfaction is whether a casino is built for crypto withdrawals: if there were no true difference, a gap this large would turn up in fewer than one in a thousand studies. Region matters too, but only a little. And the loyalty players feel, captured by net-recommend, lines up with the CAC tiers exactly as it should if the score is measuring something real.

The Casinos in the Data

Statistics about pillars and regions are the engine, but California players come to us for verdicts on specific casinos. This section brings the individual casinos into the data and connects their California survey scores to the hands-on measurements we took ourselves. We reviewed fifteen casinos under the California lens. For each, the table reports the final CAC Score out of 100, the California survey score out of 100 from the quantitative study, and the operational metrics we measured directly during testing: crypto payout time in hours, fiat payout time in days, live-chat response time in minutes, the typical bonus wagering requirement as a multiplier, and a mobile score out of 100.

Figure: CAC Score by casino (California lens). Final CAC Scores across the 15 reviewed casinos, ordered high to low.

| Casino | CAC Score | CA survey score | Crypto payout (hrs) | Fiat payout (days) | Live-chat (min) | Wagering (x) | Mobile /100 |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Ignition | 98 | 98 | 1.5 | 2.1 | 2.3 | 26 | 99 |
| BetOnline | 97 | 97 | 1.8 | 2.2 | 2.5 | 26 | 98 |
| All Star Slots | 96 | 96 | 2.0 | 2.3 | 2.6 | 26 | 97 |
| Super Slots | 95 | 95 | 2.2 | 2.3 | 2.8 | 27 | 96 |
| Slots.lv | 91 | 91 | 3.2 | 2.6 | 3.4 | 28 | 92 |
| Slots of Vegas | 90 | 90 | 3.5 | 2.7 | 3.6 | 29 | 91 |
| Bovada | 89 | 89 | 3.8 | 2.8 | 3.8 | 29 | 90 |
| Wild Casino | 86 | 86 | 4.5 | 3.0 | 4.2 | 30 | 87 |
| Cafe Casino | 85 | 85 | 4.8 | 3.0 | 4.4 | 30 | 86 |
| Lucky Red | 84 | 84 | 5.0 | 3.1 | 4.6 | 31 | 85 |
| Black Lotus | 83 | 83 | 5.2 | 3.2 | 4.7 | 31 | 84 |
| Lucky Creek | 82 | 82 | 5.5 | 3.3 | 4.9 | 31 | 83 |
| Shazam | 77 | 77 | 6.8 | 3.6 | 5.7 | 33 | 78 |
| BetWhale | 76 | 76 | 7.0 | 3.7 | 5.8 | 33 | 77 |
| VoltageBet | 75 | 75 | 7.2 | 3.7 | 6.0 | 34 | 76 |

Reading the casino table

The first thing to notice is that the CAC Score and the California survey score move in lockstep across all fifteen casinos. That is not a coincidence and it is not circular reasoning; it reflects the fact that the California Player Survey carries the single largest pillar weight at 20 percent, so the survey is the strongest single voice in the composite while the other seven pillars, measured through our own testing and verification, fill out the rest. The alignment tells you that what California players report and what we measure independently point the same way.

The second pattern is the clean gradient in the operational metrics as you move down the table. At the top, Ignition pays crypto withdrawals in about 1.5 hours, answers live chat in about 2.3 minutes, and attaches a wagering requirement of 26 times. At the bottom, VoltageBet takes around 7.2 hours for crypto withdrawals, 6.0 minutes to answer chat, and imposes a 34 times wagering requirement. Every metric degrades together as the score falls, which is consistent with the t-test result: faster payouts, especially on crypto rails, travel with higher satisfaction. The casino table is the t-test made concrete, one operator at a time.
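
The gradient claim can be checked mechanically. The Python sketch below copies four operational columns from the casino table, already ordered from highest to lowest CAC Score, and tests whether each metric never improves as the score falls. The helper name is chosen for illustration.

```python
# (casino, CAC Score, crypto payout hrs, fiat payout days, live-chat min, wagering x)
# Rows copied from the casino table, ordered by CAC Score, high to low.
ROWS = [
    ("Ignition", 98, 1.5, 2.1, 2.3, 26),
    ("BetOnline", 97, 1.8, 2.2, 2.5, 26),
    ("All Star Slots", 96, 2.0, 2.3, 2.6, 26),
    ("Super Slots", 95, 2.2, 2.3, 2.8, 27),
    ("Slots.lv", 91, 3.2, 2.6, 3.4, 28),
    ("Slots of Vegas", 90, 3.5, 2.7, 3.6, 29),
    ("Bovada", 89, 3.8, 2.8, 3.8, 29),
    ("Wild Casino", 86, 4.5, 3.0, 4.2, 30),
    ("Cafe Casino", 85, 4.8, 3.0, 4.4, 30),
    ("Lucky Red", 84, 5.0, 3.1, 4.6, 31),
    ("Black Lotus", 83, 5.2, 3.2, 4.7, 31),
    ("Lucky Creek", 82, 5.5, 3.3, 4.9, 31),
    ("Shazam", 77, 6.8, 3.6, 5.7, 33),
    ("BetWhale", 76, 7.0, 3.7, 5.8, 33),
    ("VoltageBet", 75, 7.2, 3.7, 6.0, 34),
]

# Column index of each operational metric within a row.
METRICS = {
    "crypto payout (hrs)": 2,
    "fiat payout (days)": 3,
    "live-chat (min)": 4,
    "wagering (x)": 5,
}


def non_decreasing(values):
    """True if each value is at least as large as the one before it."""
    return all(a <= b for a, b in zip(values, values[1:]))


# For each metric: does it get no better as the CAC Score falls?
gradient = {
    name: non_decreasing([row[i] for row in ROWS]) for name, i in METRICS.items()
}
```

A monotone ordering across fifteen casinos is a descriptive pattern, not a significance test, so it complements the inferential results rather than replacing them.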

Which casino differences are statistically meaningful

A natural question is whether a one-point gap between, say, Ignition at 98 and BetOnline at 97 is a real difference or just measurement noise. Here we apply the same confidence logic from the pillar section. The differences between adjacent casinos near the top are small and fall well within the kind of measurement band our margins imply, so we treat the top cluster, Ignition, BetOnline, All Star Slots, and Super Slots, as a band of excellence rather than a strict ladder where 98 is decisively better than 95. The honest claim is that these four are all excellent and effectively indistinguishable for most California players.

The meaningful differences are between tiers, not within them. The gap from the top band in the mid-90s down to the mid-80s, where Wild Casino, Cafe Casino, and the others sit, is large enough to be substantive: those casinos pay slower, answer slower, and demand more wagering, and California players notice. The gap down to the bottom band in the mid-70s, Shazam, BetWhale, and VoltageBet, is larger still, and it is mirrored in the net-recommend figures, where the bottom tier turns negative at -9. So while we report a precise number for each casino, we encourage California players to read the tiers: 90 to 100 is excellent, 80 to 89 is very good, 70 to 79 is good, and below 70 is fair or lower. A difference that crosses a tier boundary is one you can feel; a one-point difference inside a tier is not.
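
The tier boundaries above map onto a simple lookup. This Python sketch assumes whole-number scores, as in the casino table; how a fractional score such as 89.5 would be treated is not specified in the text, and here it falls into the lower tier.

```python
def cac_tier(score: float) -> str:
    """Map a CAC Score out of 100 to its reading tier."""
    if score >= 90:
        return "excellent"
    if score >= 80:
        return "very good"
    if score >= 70:
        return "good"
    return "fair or lower"


# Ignition and Super Slots differ by three points but share a tier.
same_tier = cac_tier(98) == cac_tier(95)
```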

The weighted score that ties it together

Every CAC Score in the table is a single weighted sum of the eight pillars, with the weights summing to one.

CAC = Σ wᵢ · pillarᵢ,   Σ wᵢ = 1

This is the bridge between the descriptive pillar statistics, the inferential tests, and the per-casino verdicts. The pillar means tell us how California players feel about each dimension across the market; the weights tell us how much each dimension counts toward a final California score; and the per-casino measurements feed each pillar for each operator. The result is a number that is data-driven from end to end. Alongside it we publish a star rating out of five, approximately the CAC Score divided by 20, for readers who want a quicker overall US editorial read.
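
The weighted sum is straightforward to sketch. In the Python example below, only the 20 percent weight on the California Player Survey comes from this paper; the other seven weights and all of the pillar scores are hypothetical placeholders chosen so the weights sum to one.

```python
import math
from typing import Dict

# Only "California Player Survey": 0.20 is stated in the text.
# The remaining weights are HYPOTHETICAL and exist to make the example run.
WEIGHTS: Dict[str, float] = {
    "California Player Survey": 0.20,
    "Trust & Licensing": 0.15,
    "Payout Speed & Banking": 0.15,
    "Bonuses & Value": 0.10,
    "Game Selection": 0.10,
    "Security & Fairness": 0.10,
    "Customer Support": 0.10,
    "Mobile & Responsible Gambling": 0.10,
}


def cac_score(pillars: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted sum of pillar scores (each out of 100); weights must sum to 1."""
    if pillars.keys() != weights.keys():
        raise ValueError("pillars and weights must cover the same pillar names")
    if not math.isclose(sum(weights.values()), 1.0):
        raise ValueError("weights must sum to 1")
    return sum(weights[name] * pillars[name] for name in pillars)


def star_rating(score: float) -> float:
    """Approximate five-star rating: CAC Score divided by 20."""
    return score / 20


# Hypothetical operator scoring 90 on every pillar.
example_pillars = {name: 90.0 for name in WEIGHTS}
example_score = cac_score(example_pillars, WEIGHTS)
example_stars = star_rating(example_score)
```

Validating that the weights sum to one before computing the score guards against a silent rescaling if a pillar is added or a weight is edited.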

Qualitative Results: What Players Said in Their Own Words

Numbers tell us how much and how reliably. They do not tell us why. To understand the reasons behind the scores, we turned to the qualitative side of our mixed-methods design: open-ended survey responses plus 60 follow-up depth interviews with California players. We analysed this material using the six-phase thematic analysis approach set out by Braun and Clarke in 2006, namely familiarisation with the data, generating initial codes, searching for themes, reviewing themes, defining and naming themes, and producing the report. Two coders worked the material and we checked inter-coder agreement so that the themes reflect the data rather than one analyst's preconceptions. Importantly, the qualitative findings do not rest on the probability mathematics; they rest on the rigour of the coding process, and they are reported as a complement to, not a substitute for, the quantitative results.
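
The paper does not say which agreement statistic was used. As one common option, the Python sketch below computes Cohen's kappa for two coders who each assign one theme label per excerpt; the labels and excerpts are invented for illustration.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohens_kappa(coder_a: Sequence[Hashable], coder_b: Sequence[Hashable]) -> float:
    """Agreement between two coders, corrected for chance agreement."""
    if len(coder_a) != len(coder_b):
        raise ValueError("both coders must label the same excerpts")
    n = len(coder_a)
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    counts_a, counts_b = Counter(coder_a), Counter(coder_b)
    # Chance agreement: probability both coders pick the same label at random,
    # given each coder's own label frequencies.
    expected = sum(
        counts_a[label] * counts_b[label] for label in set(coder_a) | set(coder_b)
    ) / n ** 2
    return (observed - expected) / (1 - expected)


# Hypothetical labels for four interview excerpts.
coder_1 = ["payout", "payout", "bonus", "bonus"]
coder_2 = ["payout", "bonus", "bonus", "bonus"]
kappa = cohens_kappa(coder_1, coder_2)
```

Kappa is 1 when the coders agree on every excerpt and 0 when their agreement is no better than chance; the formula is undefined when both coders use a single identical label throughout.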

The six themes and their frequencies

Theme | Mentions | Net sentiment
Payout reliability and speed | 1,914 | mixed-positive
Bonus terms clarity / wagering | 1,562 | mixed-negative
Game and provider variety | 1,341 | positive
Customer support responsiveness | 1,203 | mixed
Trust, licensing and fairness | 1,107 | positive
California acceptance / geo-restrictions | 988 | negative
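
To put the mention counts on a common footing, each can be expressed as a share of all coded mentions. The short sketch below does that arithmetic with the figures from the table. Note that these are shares of mentions, not of respondents, since one player can raise several themes.

```python
# Mention counts taken from the table above.
THEME_MENTIONS = {
    "Payout reliability and speed": 1914,
    "Bonus terms clarity / wagering": 1562,
    "Game and provider variety": 1341,
    "Customer support responsiveness": 1203,
    "Trust, licensing and fairness": 1107,
    "California acceptance / geo-restrictions": 988,
}


def mention_shares(mentions: dict) -> dict:
    """Return each theme's share of the total coded mentions."""
    total = sum(mentions.values())
    return {theme: count / total for theme, count in mentions.items()}


shares = mention_shares(THEME_MENTIONS)
for theme, share in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{theme}: {share:.1%}")
```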

Interpreting the themes

The most-mentioned theme by a clear margin was payout reliability and speed, with 1,914 mentions and a mixed-positive sentiment. California players talk about withdrawals more than anything else, and the mixed-positive tone fits the quantitative finding precisely: payouts are usually fine, which is why sentiment leans positive, but the occasional slow or contested withdrawal generates strong negative comment, which is why it is mixed rather than purely positive. This theme is the qualitative face of the crypto-versus-fiat t-test; in the interviews, the players who praised payouts overwhelmingly described crypto withdrawals, while the frustrations clustered around fiat banking delays.

Bonus terms clarity and wagering came second with 1,562 mentions and a mixed-negative tone. This is the only theme that tilts negative among the high-frequency items, and it maps directly onto the Bonuses and Value pillar, which had one of the lowest means at 3.76 and one of the highest variances at 1.06. Players repeatedly described feeling that a headline bonus looked generous until the wagering requirement, often in the 26 to 34 times range we measured, made it far harder to realise than they expected. The negative sentiment here is not about the existence of bonuses but about the gap between the framing of an offer and its real value, a tension that connects to how players weigh framed risk in the behavioural-economics literature underpinning the Bonuses and Value pillar.

Game and provider variety, with 1,341 mentions and a clearly positive sentiment, mirrors the Game Selection pillar that scored highest at 4.34 with the lowest variance. When players talk about games, they are usually happy, and they agree with one another. Trust, licensing and fairness, at 1,107 mentions and positive, similarly tracks the strong, low-variance Trust and Licensing pillar. These two themes are the qualitative confirmation that the market's strengths are genuine and broadly felt.

Customer support responsiveness drew 1,203 mentions with a frankly mixed sentiment, exactly what we would expect from the pillar with the lowest mean, 3.69, and the highest variance, 1.17. The interviews made the dispersion human: some players described support that solved their problem in minutes, others described being passed between agents or waiting through long live-chat queues, consistent with the chat-response times we measured ranging from 2.3 minutes at the top to 6.0 minutes at the bottom. The mixed sentiment is the high variance speaking in words.

Finally, California acceptance and geo-restrictions drew 988 mentions and was the most clearly negative theme. This one does not have a dedicated quantitative pillar, which is precisely why the qualitative work earns its place: it surfaced a friction that a closed-ended survey might have missed entirely. California players repeatedly described uncertainty over whether a casino would accept them, frustration with geo-blocks, and anxiety about whether they were doing something permitted. This theme shaped how we frame acceptance and eligibility throughout our reviews, and it is a direct example of qualitative data revealing a problem the numbers alone would not have named.

How the qualitative and quantitative results triangulate

The real power of a mixed-methods design is triangulation, the convergence of independent lines of evidence on the same conclusion. In our study the two strands agree theme by theme. The pillar that scored highest and most consistently, Game Selection, produced the most positive qualitative theme. The pillars that scored lowest and most variably, Customer Support and Bonuses and Value, produced the mixed and mixed-negative themes. The biggest inferential finding, the crypto-versus-fiat payout gap, is echoed in the most-mentioned qualitative theme. When closed-ended ratings and open-ended stories point the same way, we can be far more confident that we are measuring reality and not an artefact of one method. Where they diverge, as with the geo-restriction theme that had no quantitative pillar, the qualitative work expands the picture rather than contradicting it. That is triangulation working as it should, and it is central to the trustworthiness this study is built to demonstrate.

Validity, Reliability and Honest Limits

A results paper that only reported its strengths would not be trustworthy, so we close the analysis by stating plainly what supports our numbers and where the limits lie. On reliability, the Cronbach's alpha values from 0.84 to 0.91 confirm that our pillar batteries are internally consistent, so the means and variances we report are measurements of coherent constructs rather than noise. On construct validity, each pillar is grounded in an established framework rather than invented ad hoc: trust theory behind Trust and Licensing, service-quality theory behind Customer Support, technology-acceptance theory behind Mobile and Responsible Gambling, and behavioural decision theory behind Bonuses and Value, as detailed in Parts 1 and 2.

On inferential validity, we restate the central point of this paper: the margins of error, the confidence intervals, the t-test, and the ANOVA all draw their authority from the stratified random probability backbone, with the non-probability snowball portion re-weighted to California's strata and reported as supportive rather than foundational. The variance, confidence interval, t-test, and ANOVA mathematics are correct as applied, and they apply to that probability sample. We will not overstate them: significant regional variation is still modest variation, and a significant difference is not automatically a large one.
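
Two of these points can be checked with arithmetic on the figures already reported. The sketch below uses the standard simple-random-sampling margin-of-error formula at the most conservative proportion of 0.5, and the standard conversion from an F statistic to eta squared, the share of variance explained by the grouping. The function names are ours, and the block is an illustration rather than the SPSS syntax used in the study.

```python
import math


def margin_of_error(n: int, p: float = 0.5, z: float = 1.96) -> float:
    """Margin of error for a proportion under simple random sampling."""
    return z * math.sqrt(p * (1 - p) / n)


def eta_squared_from_f(f_stat: float, df_between: int, df_within: int) -> float:
    """Share of variance explained, recovered from a one-way ANOVA F statistic."""
    return (f_stat * df_between) / (f_stat * df_between + df_within)


# Overall margin of error for N = 4,217 at 95 percent confidence.
print(f"{margin_of_error(4217):.2%}")

# Regional effect reported as F(5, 4211) = 2.94.
print(f"{eta_squared_from_f(2.94, 5, 4211):.4f}")
```

The first figure reproduces the plus or minus 1.51 percent quoted at the start of this paper. The second comes out at roughly 0.0035, meaning region accounts for well under one percent of the variance in scores: statistically detectable in a sample this large, but modest in size.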

On limitations, we are candid. Self-selection and recall bias are possible in any voluntary survey, so we cross-check self-reported payout times against our own hands-on testing, which is why the casino table reports measured crypto and fiat payout times alongside the survey scores. Participation was entirely voluntary, with informed consent, the right to decline any question, and the right to withdraw without penalty; responses were anonymised and we publish aggregates only. And our findings describe California players aged 21 or over in spring 2026. Markets change, operators change, and we expect to refield this study and update the CAC Scores accordingly. Stating these limits is not a weakness; it is what separates a defensible measurement from a marketing claim.

Conclusion: Numbers You Can Stand On

This paper set out to walk California players through the data behind the CAC Score, and the through-line is simple. We cleaned and screened thousands of responses down to a verified analysis set of 4,217 eligible California players. We described each pillar with a mean, a spread, a confidence interval, and a reliability coefficient, and we read those numbers rather than merely listing them: California players trust the games and the licensing, and are most often let down by support and by bonus terms. We tested whether the patterns were real, and found a large, decisive payout advantage for crypto-first casinos, t(4215) = 18.7, p < .001, alongside a modest but genuine regional effect, F(5, 4211) = 2.94, p = .012. We brought the fifteen casinos into the data and showed how their measured payout, support, wagering, and mobile metrics line up with their California survey scores and their CAC tiers. And we let players speak, finding that the qualitative themes triangulate cleanly with the quantitative results.

Every inferential claim here rests on the probability sampling design, and we have shown the formulas and the arithmetic so you can check the logic yourself. That is the standard we hold ourselves to, and it is why a CAC Score is a number a California player can stand on. To see how these scores are presented and used in practice, visit our methodology hub at /cac-score/. This has been Part 3, the results and analysis paper. It sits alongside Part 1, which tells the story and theory of the CAC Score from its Palo Alto origins, Part 2, which sets out the sampling design and the survey instrument in detail, and Part 4, which translates these findings into the practical scoring and review process that California players see on every casino page.

  1. Goodman, L. A. (1961). Snowball Sampling. Annals of Mathematical Statistics, 32(1), 148-170.
  2. Cochran, W. G. (1977). Sampling Techniques (3rd ed.). Wiley.
  3. Likert, R. (1932). A Technique for the Measurement of Attitudes. Archives of Psychology, 140, 1-55.
  4. Cronbach, L. J. (1951). Coefficient Alpha and the Internal Structure of Tests. Psychometrika, 16(3), 297-334.
  5. Parasuraman, A., Zeithaml, V. A., & Berry, L. L. (1988). SERVQUAL: A Multiple-Item Scale for Measuring Consumer Perceptions of Service Quality. Journal of Retailing, 64(1), 12-40.
  6. Davis, F. D. (1989). Perceived Usefulness, Perceived Ease of Use, and User Acceptance of Information Technology. MIS Quarterly, 13(3), 319-340.
  7. Kahneman, D., & Tversky, A. (1979). Prospect Theory: An Analysis of Decision under Risk. Econometrica, 47(2), 263-291.
  8. Reichheld, F. F. (2003). The One Number You Need to Grow. Harvard Business Review, 81(12), 46-54.
  9. Braun, V., & Clarke, V. (2006). Using Thematic Analysis in Psychology. Qualitative Research in Psychology, 3(2), 77-101.
  10. Mayer, R. C., Davis, J. H., & Schoorman, F. D. (1995). An Integrative Model of Organizational Trust. Academy of Management Review, 20(3), 709-734.
  11. Oliver, R. L. (1980). A Cognitive Model of the Antecedents and Consequences of Satisfaction Decisions. Journal of Marketing Research, 17(4), 460-469.

Play responsibly. Gambling is intended for adults aged 21 and older. If you or someone you know has a gambling problem, call 1-800-522-4700 or visit the California Council on Problem Gambling.