4.3 Technical excursus

Before we go into a more technical explanation of what went wrong in these two cases, let us first move from proportions to probabilities. The difference between a proportion and a probability is important here. Note that when Minister Shanmugam asserted that the REACH poll provided evidence that the Government’s “assessment of public sentiment turned out to be correct”, he was not suggesting that 680 Singaporeans form the whole Singapore public. The underlying assumption was that since most survey respondents (who were aware) supported the ban, it is likely that most Singaporeans (who are aware) will also support the ban. That is, he was using the proportion of supportive survey respondents (a description of the sample) to infer the probability (a hypothetical quantity) of any one Singaporean supporting the ban.

The difference between a probability and a proportion may be illustrated using a coin flip example. If I flip a fair coin 4 times, the proportion of heads may be 0, 0.25, 0.5, 0.75, or 1. However, since it is a fair coin, the probability of getting heads is, by definition, 0.5. So the proportion may or may not equal the probability. What we know is that the more times I flip the coin, the closer the proportion of heads tends to be to the true probability of getting heads. It is thus common to hear people say that the probability is the “long-run proportion of an event”. Below is some code (in R) for you to try out the coin flip example.

# Set the number of trials to 4. 
# You may change the number to see what happens.
n <- 4 
# Get the proportion of heads after flipping a fair coin n times. 
# Try this a few times.
sum(rbinom(n, 1, prob=0.5))/n
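
To see the “long-run proportion” idea at work, here is a small sketch that repeats the experiment with more and more flips. The numbers of flips are arbitrary choices of mine for illustration. The results will differ every time you run it, but the gap between the proportion and 0.5 should tend to shrink as the number of flips grows.

```r
# Assumption: these numbers of flips are arbitrary, chosen only for illustration.
flips <- c(4, 40, 400, 4000, 40000)

# For each number of flips, simulate a fair coin and record the proportion of heads.
props <- sapply(flips, function(n) mean(rbinom(n, size = 1, prob = 0.5)))

# Show each proportion alongside its distance from the true probability of 0.5.
data.frame(flips = flips, proportion = props, gap = abs(props - 0.5))
```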

We have now established that the main reason we are interested in proportions from a REACH poll is that they purport to tell us something about Singaporeans in general. That is, the REACH poll suggests that if we were to randomly pick a Singaporean from those who are aware of the ban, the probability of this person supporting the ban is about 0.64 (or 64%). The problem at hand then reduces to a trivial probability question, assuming that we all remember basic probability rules from secondary (primary?) school²⁰. If the REACH poll is indeed representative of all Singaporeans, then we have the following quantities:

\[ \begin{aligned} &\Pr(\text{Aware of Ban}) = 0.63 \\ &\Pr(\text{Not Aware of Ban}) = 1 - \Pr(\text{Aware of Ban}) = 0.37 \\ &\Pr(\text{Support Ban } | \text{ Aware of Ban}) = 0.64 \end{aligned} \]

\(\Pr(\text{Support Ban } | \text{ Aware of Ban})\) is a conditional probability, but the quantity that is being asserted in the news article is \(\Pr(\text{Support Ban})\), which is the total probability. Using the law of total probability, we know that:

\[ \begin{aligned} \Pr(\text{Support Ban}) &= \Pr(\text{Support Ban } | \text{ Aware of Ban})\cdot \Pr(\text{Aware of Ban}) \\ & \quad + \Pr(\text{Support Ban } | \text{ Not Aware of Ban})\cdot \Pr(\text{Not Aware of Ban}) \end{aligned} \]

Plugging in the numbers that we have,

\[ \begin{aligned} \Pr(\text{Support Ban}) &= 0.64 \cdot 0.63 + \Pr(\text{Support Ban } | \text{ Not Aware of Ban}) \cdot 0.37 \\ \end{aligned} \]

we see that \(\Pr(\text{Support Ban}) = 0.64\) if and only if \(\Pr(\text{Support Ban } | \text{ Not Aware of Ban})\) also equals \(0.64\). However, it is logically impossible to support a ban that one is not aware of, so \(\Pr(\text{Support Ban } | \text{ Not Aware of Ban})\) should equal zero, which gives \(\Pr(\text{Support Ban}) = 0.64 \cdot 0.63 \approx 0.40\). Similarly, for the Web-savvy Seniors example,
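
The arithmetic for the ban example can be checked in R. This is only a sketch: `total_support` is a helper name I made up for illustration, and the only inputs are the two poll figures quoted above.

```r
# Quantities quoted in the text, assuming the REACH poll is representative.
p_aware <- 0.63
p_support_given_aware <- 0.64

# Hypothetical helper (not from the poll): the law of total probability
# for the two groups "aware" and "not aware".
total_support <- function(p_support_given_unaware) {
  p_support_given_aware * p_aware +
    p_support_given_unaware * (1 - p_aware)
}

total_support(0)     # nobody who is unaware of the ban supports it
total_support(0.64)  # the only value that reproduces the headline figure
```

Try other values between 0 and 1 as the argument to see how much the total probability depends on the group that the headline figure leaves out.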

\[ \begin{aligned} \Pr(\text{Use Internet Daily}) &= \Pr(\text{Use Internet Daily } | \text{ Use Internet})\cdot \Pr(\text{Use Internet}) \\ & \quad + \Pr(\text{Use Internet Daily } | \text{ Don't Use Internet})\cdot \Pr(\text{Don't Use Internet}) \\ &= 0.78 \cdot 0.37 + \Pr(\text{Use Internet Daily } | \text{ Don't Use Internet})\cdot 0.63 \\ \end{aligned} \]

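where it is impossible to use the Internet daily without using the Internet at all, so \(\Pr(\text{Use Internet Daily } | \text{ Don't Use Internet})\) should be zero and \(\Pr(\text{Use Internet Daily}) = 0.78 \cdot 0.37 \approx 0.29\). In both cases, the total probabilities are substantially different from the conditional probabilities, and there is no reason to expect them to be the same.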
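
If you prefer simulation to algebra, the sketch below builds a made-up population of seniors from the two figures in the Web-savvy Seniors example and compares the conditional proportion with the total proportion. The population size and the seed are my own arbitrary choices, not part of the original survey.

```r
set.seed(1)   # arbitrary seed, only so that the run is reproducible
n <- 100000   # hypothetical number of simulated seniors

# 37% of seniors use the Internet (figure from the text).
uses_internet <- rbinom(n, size = 1, prob = 0.37) == 1

# Among users, 78% go online daily; non-users cannot go online daily.
uses_daily <- uses_internet & (rbinom(n, size = 1, prob = 0.78) == 1)

mean(uses_daily[uses_internet])  # conditional proportion, should be near 0.78
mean(uses_daily)                 # total proportion, should be near 0.78 * 0.37
```

The first number describes only the seniors who use the Internet, while the second describes all seniors, which is what a headline about seniors in general would need.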

²⁰ Or that we can Google it if not.