# Bayes, Baseball and Bowling

Joe Average bowled in a recreational league on Tuesday nights from April through August. On Wednesday mornings, relying on his memory, Joe entered his three game scores in a log before breakfast. After breakfast it was his habit to read the box scores of the American League baseball games played on Tuesday, as reported in the Wednesday morning edition of USA Today. That typically consisted of seven games with six data each, namely the runs, hits and errors of the two teams. That made a total of forty-two baseball data, compared to the three data that were Joe’s bowling scores.

(E,J) is the number of Joe’s logged bowling scores that are erroneous.

E is the total number of erroneous scores: Joe’s erroneous logged scores plus the erroneous AL box scores as reported.

J is the total number of Joe’s logged scores, erroneous plus correct.

T is the grand total of scores: Joe’s logged bowling scores plus the reported AL box scores.

Given:

(E,J) / E = X = 2/3

E/T = Y = 1/300

J/T = Z = 1/15

What is the probability that a bowling score in Joe’s log is erroneous?

Employing Bayes’ Theorem,

(E,J) / J = (X * Y) / Z

(E,J) / J = ((2/3) * (1/300)) / (1/15)

(E,J) / J = 1/30

The fraction of erroneous bowling scores in Joe’s log was 1/30, or about 0.0333.
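The substitution above can be checked with exact rational arithmetic. The following is a minimal sketch, not part of the original argument; the function name `fraction_of_joes_scores_in_error` and the variable names are hypothetical, chosen only to mirror the symbols X, Y and Z defined earlier.

```python
from fractions import Fraction


def fraction_of_joes_scores_in_error(x, y, z):
    """Return (E,J)/J computed as (X * Y) / Z.

    x: (E,J)/E, the fraction of all erroneous data that are Joe's
    y: E/T, the fraction of all data that are erroneous
    z: J/T, the fraction of all data that are Joe's
    """
    return x * y / z


# The three given ratios from the text.
X = Fraction(2, 3)
Y = Fraction(1, 300)
Z = Fraction(1, 15)

result = fraction_of_joes_scores_in_error(X, Y, Z)
print(result)  # exact fraction
print(round(float(result), 4))  # decimal approximation
```

Using `Fraction` rather than floating point keeps the result exact, so it can be compared directly with the 1/30 stated in the text.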

If the time period consisted of twenty weeks, Joe would have recorded 60 scores, of which a fraction of 0.0333, or 2 scores, were in error. If, over the same period, USA Today reported 42 * 20 = 840 AL box score data, then the total data in the population would be 60 + 840 = 900, of which a fraction of 0.00333, or 900 * 0.00333 = 3 data, were in error. Since 2 of those 3 errors were Joe’s, there was one typo in the AL box scores reported by USA Today over the same period.
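The twenty-week head count can be reproduced, and the three given ratios recovered from it, with a short calculation. This is only an illustrative sketch under the assumptions stated in the text (twenty weeks, three bowling scores and forty-two box score data per week); all variable names are hypothetical.

```python
from fractions import Fraction

WEEKS = 20                      # assumed length of the period, as in the text

joe_total = 3 * WEEKS           # J: Joe's logged scores
box_total = 42 * WEEKS          # reported AL box score data
grand_total = joe_total + box_total  # T

all_errors = grand_total * Fraction(1, 300)  # E = Y * T
joe_errors = joe_total * Fraction(1, 30)     # (E,J) = J * 1/30
box_errors = all_errors - joe_errors         # errors that are not Joe's

# Recover the given ratios from the counts.
X = joe_errors / all_errors          # (E,J) / E
Y = all_errors / grand_total         # E / T
Z = Fraction(joe_total, grand_total) # J / T

print(joe_total, box_total, grand_total)
print(joe_errors, all_errors, box_errors)
print(X, Y, Z)
```

The counts come out as whole numbers only because the period was chosen to be twenty weeks; other period lengths would give the same ratios but fractional expected counts.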

By Bayes’ theorem, the probability of error in Joe’s logging of his bowling scores was calculated to be 3.333%.

Could we conclude that the box score data of the American League determined the probability of Joe’s making an error in his bowling log?

What would be the standard of comparison for determining the correctness or error of a datum in Joe’s bowling log? Could the standard of comparison inherently be the data of American League baseball box scores?

To apply Bayes’ theorem, a population of data must be partitioned by two distinct criteria. In the above example, one criterion partitioned the population into Joe’s data and non-Joe’s data. The other criterion partitioned the population into erroneous data and non-erroneous data.

What is often lost sight of in applying Bayes’ theorem is that the theorem does not treat subsets as antithetical to one another. Rather, it deals with subsets as compatible, as complementary in forming a whole. In the illustration, the baseball scores are not treated as baseball data, but as non-Joe’s data, the complement of Joe’s data.

In Proving History, page 50 ff, Richard Carrier partitions a population of data into historical reports from Source A and reports from non-Source A. Carrier’s other criterion partitions the population of reports into true reports and non-true reports. He then employs Bayes’ theorem to calculate the probability of true reports among all the reports of Source A. That is not what he indicates he has done. He indicates that what he has done is to evaluate the truth of a Source A report where the evaluation is based on the content of non-Source A reports. That would be comparable to claiming that a datum, in Joe’s bowling log, could be determined to be correct or erroneous based on the content of American League baseball box scores as reported in USA Today by employing Bayes’ theorem.

Both Bayes’ theorem and the reports of the American League box scores are pertinent to calculating the probability of errors in Joe’s bowling log. That probability is the fraction of his logged scores which are erroneous. The pertinence is due to the fact that both Bayes’ theorem and probability deal with complementary subsets. In this instance, the complementary subsets are: Some of Joe’s logged scores are erroneous. Some are non-erroneous.

Neither Bayes’ theorem nor the reports of the American League box scores are pertinent to determining whether any particular score in Joe’s log is erroneous or correct. That distinction is between antithetical propositions: This score is erroneous. This score is not erroneous.
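The difference between the fraction of erroneous scores and the status of any particular score can be illustrated with two made-up logs. This is a hypothetical sketch: the logs `log_a` and `log_b` and the helper `error_fraction` are invented for illustration, and each list entry simply flags whether a logged score is erroneous.

```python
from fractions import Fraction


def error_fraction(flags):
    """Fraction of entries flagged as erroneous (True)."""
    return Fraction(sum(flags), len(flags))


# Two hypothetical 60-score logs, each with exactly two erroneous entries,
# but with the errors falling on different scores.
log_a = [True, True] + [False] * 58   # first two scores are wrong
log_b = [False] * 58 + [True, True]   # last two scores are wrong

print(error_fraction(log_a), error_fraction(log_b))

# Same fraction, yet the logs disagree about the first score.
print(log_a[0], log_b[0])
```

Both logs yield the same fraction, so the fraction alone cannot say whether a given score, such as the first one, is erroneous or correct.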

Subsets subject to Bayes’ theorem may be nominally antithetical, such as true and non-true, and, in that sense, incompatible. Yet, relevant to Bayes’ theorem, such subsets are merely complementary and in that sense compatible. Their sum equals the entire set. It is their compatibility as complementary which renders the subsets subject to Bayes’ theorem.

Carrier, in Proving History, page 50 ff, by conflating antithetical with different while ignoring the complementarity of subsets, completely misrepresents Bayes’ theorem and its utility.

For an algebraic validation of Bayes’ theorem see the first five paragraphs of the essay.