Monthly Archives: January 2017

Joe Average bowled in a recreational league on Tuesday nights from April through August. On Wednesday mornings, relying on his memory, Joe entered his three game scores in a log before breakfast. After breakfast it was his habit to read the box scores of those American League baseball games played on Tuesdays and reported in the Wednesday morning edition of USA Today. That typically consisted of seven games with six data each, namely the runs, hits and errors of the two teams. That was a total of forty-two baseball data each week, compared with Joe’s three bowling scores.

(E,J) is the number of Joe’s logged bowling scores that are erroneous.

E is the total number of erroneous data: Joe’s erroneous logged scores plus the erroneous reported AL box scores.

J is the total number of Joe’s logged scores, erroneous plus correct.

T is the grand total number of data: Joe’s logged bowling scores plus the reported AL box scores.

Given:

(E,J) / E = X = 2/3

E/T = Y = 1/300

J/T = Z = 1/15

What is the probability that a bowling score in Joe’s log is erroneous?

Answer:

Employing Bayes’ Theorem,

(E,J) / J = (X * Y) / Z

(E,J) / J = ((2/3) * (1/300)) / (1/15)

(E,J) / J = 1/30

The fraction of erroneous bowling scores in Joe’s log was 1/30, or approximately 0.0333.
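The arithmetic of the answer can be reproduced with exact fractions. The following is only a minimal sketch: the function name bayes_fraction and the variable names are illustrative assumptions, while the three given values X, Y and Z are those of the example.

```python
from fractions import Fraction

def bayes_fraction(x, y, z):
    """Return (X * Y) / Z, i.e. (E,J)/J computed from (E,J)/E, E/T and J/T."""
    return x * y / z

X = Fraction(2, 3)    # (E,J) / E, the share of all errors that are Joe's
Y = Fraction(1, 300)  # E / T, the share of all data that are erroneous
Z = Fraction(1, 15)   # J / T, the share of all data that are Joe's

error_fraction = bayes_fraction(X, Y, Z)
print(error_fraction)         # exact value as a fraction
print(float(error_fraction))  # decimal approximation
```

Using Fraction rather than floating point keeps the result exact, so it can be compared directly with 1/30.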

Comments:

If the time period consisted of twenty weeks, Joe would have recorded 3 * 20 = 60 scores, of which the fraction 0.0333, or 2 scores, were in error. If over the same time period USA Today reported 42 * 20 = 840 AL box score data, then the total data in the population would be 60 + 840 = 900, of which the fraction 0.00333, or 900 * 0.00333 = 3 data, were in error. Since 2 of the 3 errors were Joe’s, there was one error in the AL box scores reported by USA Today over the same time period.
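The twenty-week counts can be checked against the three given fractions. This is a sketch under the stated assumption of twenty weeks, three bowling scores and forty-two box score data per week; the variable names are illustrative only.

```python
from fractions import Fraction

weeks = 20
joe_scores = 3 * weeks         # J: Joe's logged scores
box_data = 7 * 6 * weeks       # reported AL box score data
total = joe_scores + box_data  # T: all data in the population

joe_errors = Fraction(1, 30) * joe_scores  # (E,J): Joe's erroneous scores
all_errors = Fraction(1, 300) * total      # E: all erroneous data
box_errors = all_errors - joe_errors       # erroneous box score data

# The counts reproduce the three given fractions X, Y and Z.
X = joe_errors / all_errors
Y = all_errors / total
Z = Fraction(joe_scores, total)
print(joe_scores, box_data, total, joe_errors, all_errors, box_errors)
print(X, Y, Z)
```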

By Bayes’ theorem, the probability of error in Joe’s logging of his bowling scores was calculated to be 3.333%.

Could we conclude that the box score data of the American League determined the probability of Joe’s making an error in his bowling log?

What would be the standard of comparison for determining the correctness or error of a datum in Joe’s bowling log? Could the standard of comparison inherently be the data of American League baseball box scores?

To apply Bayes’ theorem a population of data must be partitioned by two independent criteria. In the above example, one criterion partitioned the population into Joe’s data and non-Joe’s data. The other criterion partitioned the population into erroneous data and non-erroneous data.

What is often lost sight of in applying Bayes’ theorem is that the theorem does not treat subsets as antithetical to one another. Rather, it deals with subsets as compatible, as complementary in forming a whole. In the illustration, the baseball scores are not treated as baseball data, but as non-Joe’s data, the complement of Joe’s data.

In Proving History, page 50 ff, Richard Carrier partitions a population of data into historical reports from Source A and reports from non-Source A. Carrier’s other criterion partitions the population of reports into true reports and non-true reports. He then employs Bayes’ theorem to calculate the probability of true reports among all the reports of Source A. That is not what he indicates he has done. He indicates that what he has done is to evaluate the truth of a Source A report where the evaluation is based on the content of non-Source A reports. That would be comparable to claiming that a datum, in Joe’s bowling log, could be determined to be correct or erroneous based on the content of American League baseball box scores as reported in USA Today by employing Bayes’ theorem.

Both Bayes’ theorem and the reports of the American League box scores are pertinent to calculating the probability of errors in Joe’s bowling log. That probability is the fraction of his logged scores which are erroneous. The pertinence is due to the fact that both Bayes’ theorem and probability deal with complementary subsets. In this instance, the complementary subsets are: Some of Joe’s logged scores are erroneous. Some are non-erroneous.

Neither Bayes’ theorem nor the reports of the American League box scores are pertinent to determining whether any particular score in Joe’s log is erroneous or correct. That distinction is between antithetical propositions: This score is erroneous. This score is not erroneous.

Subsets subject to Bayes’ theorem may be nominally antithetical, such as true and non-true, and, in that sense, incompatible. Yet, relevant to Bayes’ theorem, such subsets are merely complementary and in that sense compatible. Their sum equals the entire set. It is their compatibility as complementary which renders the subsets subject to Bayes’ theorem.

Carrier in Proving History, p 50 ff, by conflating antithetical with different, while ignoring the complementarity of subsets, completely misrepresents Bayes’ theorem and its utility.

For an algebraic validation of Bayes’ theorem see the first five paragraphs of the essay.

On page 50 of Proving History, Richard Carrier states,

Notice that the bottom expression (the denominator) represents the sum total of all possibilities, and the top expression (the numerator) represents your theory (or whatever theory you are testing the merit of), so we have a standard calculation of odds: your theory in ratio to all theories.

Carrier is proposing that Bayes’ theorem can be used to determine the truth of your theory which is one among many theories. Carrier implicitly claims that Bayes’ theorem can be used to determine the truth of your theory according to the numerical value of the probability of your theory with respect to all theories, i.e. ‘your theory in ratio to all theories’.

If there are n theories of which yours is one, then the ratio of your theory to all theories is 1/n, but so too is the ratio of every other theory in the set of all theories. Consequently, such a probability is no indication of the truth or non-truth of your theory. If Carrier’s statement of what is calculated by Bayes’ theorem were true, then Bayes’ theorem would have no relevance to determining the truth of your theory.

What Probabilities of Your Theory(s) are Determinable by Bayes’ Theorem?

Probability is the ratio of a subset to a set. Thus, what we are asking is what ratios, within the context of Bayes’ theorem, have your theory(s) alone in the numerator and your theory(s) plus other theories in the denominator.

The population of elements to which Bayes’ theorem applies may be viewed as a surface over which the population density varies. A Bayesian population is divided into two portions by each of two independent criteria. One criterion may be viewed as dividing the population into two horizontal portions, while the other criterion divides it into two vertical portions. The result is the formation of four quadrants, which differ in population due to the non-uniformity of the population density.

The two portions formed by the horizontal division may be distinguished as the horizontal top row, HT, and the horizontal bottom row, HB. The two portions formed by the vertical division may be distinguished as the vertical left column, VL, and the vertical right column, VR. The two portions, HT and HB, add up to the total, T, as do the two portions, VL and VR. The quadrants are designated as Q1 through Q4. Each of the portions is the sum of two quadrants, e.g. HT = Q1 + Q2 and VL = Q1 + Q3.

Tabulation of a Bayesian Population

        | VL | VR | Total
HT      | Q1 | Q2 | HT
HB      | Q3 | Q4 | HB
Total   | VL | VR | T

In the illustrated Bayesian population, the column VR has the role of non-VL. Thus, rather than being one column, VR may be any number of columns whose sum is the complement of VL. Analogously, the row HB has the role of non-HT. Consequently, Bayes’ theorem is applicable to any number of rows and any number of columns, where the additional rows and columns are treated in their sums as non-HT and non-VL, i.e. as HB and VR, respectively.

Bayes’ theorem, in its algebraic expression, which focuses on Q1, is:

Q1/VL = ((Q1/HT) / (VL/T)) * (HT/T) Eq. 1

The two terms, HT, cancel out as do the two terms, T. This leaves the identity, Q1/VL ≡ Q1/VL, which proves the validity of Bayes’ theorem. In the application of Bayes’ theorem the numerical values of the numerators and the denominators of the fractions are not given. What is given are the numerical values of the three fractions on the right-hand side of the equation, which permit the calculation of the numerical value of Q1/VL as a fraction.
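Eq. 1 can be checked numerically on any tabulated population. The sketch below assumes, purely for illustration, the quadrant counts of the twenty-week bowling example, with HT as erroneous data and VL as Joe’s data; the variable names lhs and rhs are not from the essay.

```python
from fractions import Fraction

# Illustrative quadrant counts (an assumption, taken from the bowling example):
# Q1 = Joe's erroneous data, Q2 = non-Joe's erroneous data,
# Q3 = Joe's correct data,   Q4 = non-Joe's correct data.
Q1, Q2, Q3, Q4 = 2, 1, 58, 839

HT = Q1 + Q2           # top row
VL = Q1 + Q3           # left column
T = Q1 + Q2 + Q3 + Q4  # grand total

# Right-hand side of Eq. 1, built only from the three given fractions.
rhs = (Fraction(Q1, HT) / Fraction(VL, T)) * Fraction(HT, T)

# Left-hand side of Eq. 1, read directly from the tabulation.
lhs = Fraction(Q1, VL)

print(lhs, rhs)
```

Both sides reduce to the same fraction because HT and T cancel, which is the identity the equation expresses.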

In the context of the quotation of Carrier: HT are true theories and HB are non-true theories; VL are your theories and VR are non-your or others’ theories. Thus Q1/VL, which is calculated by Bayes’ theorem is the probability of your true theories in ratio to all of your theories. This is what Carrier falsely states is ‘your theory in ratio to all theories’. (I will substantiate that Carrier is referring to Q1/VL later in this essay.)

Let me first list the other probabilities of your theory(s) calculable using Bayes’ theorem, Eq. 1. We can solve Eq. 1 for three other probabilities of your true theories, and of your theories, besides Q1/VL. They are Q1/HT, VL/T and Q1/T.

Q1/HT = (Q1/VL) * ((VL/T) / (HT/T)) Eq. 2

VL/T = ((Q1/HT) / (Q1/VL)) * (HT/T) Eq. 3

Q1/T = (Q1/VL) * (VL/T) Eq. 4
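Eqs. 2, 3 and 4 are rearrangements of the same ratios, which can be confirmed on a few tabulations. This is a hedged sketch: the helper names ratios and check_eqs_2_to_4 and the quadrant counts are hypothetical, and Q1 is assumed to be nonzero so that Eq. 3 does not divide by zero.

```python
from fractions import Fraction

def ratios(q1, q2, q3, q4):
    """Return the ratios used in Eqs. 1 to 4 for one set of quadrant counts."""
    ht, vl, t = q1 + q2, q1 + q3, q1 + q2 + q3 + q4
    return {
        "Q1/VL": Fraction(q1, vl),
        "Q1/HT": Fraction(q1, ht),
        "VL/T": Fraction(vl, t),
        "HT/T": Fraction(ht, t),
        "Q1/T": Fraction(q1, t),
    }

def check_eqs_2_to_4(q1, q2, q3, q4):
    """Return whether Eq. 2, Eq. 3 and Eq. 4 each hold exactly."""
    r = ratios(q1, q2, q3, q4)
    eq2 = r["Q1/HT"] == r["Q1/VL"] * (r["VL/T"] / r["HT/T"])
    eq3 = r["VL/T"] == (r["Q1/HT"] / r["Q1/VL"]) * r["HT/T"]
    eq4 = r["Q1/T"] == r["Q1/VL"] * r["VL/T"]
    return eq2, eq3, eq4

# Hypothetical tabulations (Q1, Q2, Q3, Q4); Q1 must be nonzero for Eq. 3.
tables = [(2, 1, 58, 839), (5, 7, 11, 13), (1, 1, 1, 1)]
results = [check_eqs_2_to_4(*table) for table in tables]
print(results)
```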

To What Bayesian Ratio is Carrier Referring as ‘your theory in ratio to all theories’?

In Eq. 2, Q1/HT is the probability of your true theory(s) in ratio to all true theories. This probability is restricted to true theories. If this probability were what Carrier is referring to by ‘your theory in ratio to all theories’, he would be granting that your theory is true, and is not a ‘theory you are testing the merit of’.

In Eq. 3, VL/T is the probability of all of your theories, true and non-true, in ratio to all theories. This probability lumps both your true theories and your non-true theories together, so it could not be a test of the merit of your theory(s). For example, if you have ten theories, whether true or non-true, the fact that there are five or a million other theories has no relevance to the merit of your theory(s).

In Eq. 4, Q1/T is the probability of your true theory(s) in ratio to all theories. This ratio, which acknowledges the truth of your true theory, cannot be a test of the merit of your theory. Nevertheless, Q1/T appears close to ‘your theory in ratio to all theories’. It lacks only the word, true, after the word, your. However, as shown below, Carrier cannot be referring to Q1/T, but must be referring to Q1/VL.

Carrier in the Quote is Referring to Q1/VL

The common expression of Bayes’ theorem is Eq. 1, which calculates Q1/VL.

Q1/VL is the probability of your true theories in ratio to all of your theories. It is this which Carrier falsely labels ‘your theory in ratio to all theories’. Admittedly, Carrier’s expression, ‘your theory’ can be understood as your true theory(s), but it is obvious that by the words, ‘all theories’, Carrier means all theories and does not mean only all of your theories.

We must ask whether Carrier could have been referring to Q1/T, expressed in Eq. 4, rather than to Q1/VL, expressed in Eq. 1. That it is Q1/VL becomes apparent from his verbal presentation of Bayes’ theorem as,

[Image: Carrier’s verbal presentation of Bayes’ theorem]

Typically, Bayes’ theorem is expressed as Eq. 1. In Eq. 1, the denominator is VL/T. However, VL/T is often expressed as the sum,

VL/T = (Q1/HT) * (HT/T) + (Q3/HB) * (HB/T) Eq. 5

The denominator of Carrier’s verbalized version of the Bayesian equation is undeniably an attempt to express this sum.

The validity of Eq. 5 is apparent in that,

VL/T = Q1/T + Q3/T = (Q1 + Q3)/T, where Q1 + Q3 = VL
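Eq. 5, and its use as the denominator of Eq. 1, can be checked on a tabulation. The sketch below assumes hypothetical quadrant counts chosen only for illustration; the variable names are likewise assumptions.

```python
from fractions import Fraction

# Hypothetical quadrant counts, chosen only for illustration.
Q1, Q2, Q3, Q4 = 5, 7, 11, 13

HT, HB = Q1 + Q2, Q3 + Q4  # top and bottom rows
VL = Q1 + Q3               # left column
T = HT + HB                # grand total

# Eq. 5: VL/T expressed as a sum over the two rows.
first_term = Fraction(Q1, HT) * Fraction(HT, T)   # equals Q1/T
second_term = Fraction(Q3, HB) * Fraction(HB, T)  # equals Q3/T
vl_as_sum = first_term + second_term
vl_direct = Fraction(VL, T)

# Eq. 1 with the denominator VL/T written as the sum of Eq. 5.
q1_over_vl = first_term / vl_as_sum

print(vl_direct, vl_as_sum, q1_over_vl)
```

The first term of the sum is the numerator of Eq. 1, so the quotient is Q1/VL, the ratio of the quadrant to its own column.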

Because Carrier is attempting to verbalize the standard expression of Bayes’ theorem, i.e. Eq. 1, the denominator is VL/T. VL/T is the ratio of all your theories to all theories. It cannot be in any way construed to be simply ‘all theories’ as Carrier claims. VL/T is obviously a ratio, in which T is all theories.

If Carrier had meant to express, Q1/T, as in Eq. 4, by his verbalization, the term VL/T, expressed as a sum, would then be a direct factor as it is in Eq. 4. VL/T would not be in the denominator, i.e. an inverse factor, as it is in Carrier’s verbalization and as it is in Eq. 1.

There is another reason that it is apparent that Carrier’s verbalization is expressing Q1/VL as in Eq. 1. The numerator of Eq. 1 is (Q1/HT) * (HT/T). This is the first term of VL/T when VL/T is expressed as a sum as in Eq. 5. In his verbalization, Carrier acknowledges that the first term of the sum of his denominator is his numerator. Thus, Carrier’s verbalization is meant to express Eq. 1, where the denominator, VL/T, is not ‘all theories’, as Carrier claims. VL/T is the ratio of all your theories to all theories.

Also, it should be noted that the numerator of Bayes’ theorem, Eq. 1, which is (Q1/HT) * (HT/T), is Q1/T. Thus, the numerator of Bayes’ theorem is the probability of your true theories over all theories, and is not as Carrier claims simply ‘your theory’.

Conclusion

Carrier’s explanation of Bayes’ theorem on page 50 of Proving History as ‘your theory in ratio to all theories’ is completely erroneous.