Richard Dawkins has identified a single stage of Darwinian evolution, resulting in the evolution of a structure such as the mammalian eye, as absurd due to the improbability of such a one-off event (The God Delusion, p. 122). A one-off stage would require the existence of an immediate offspring with a fully developed mammalian eye as one of the many immediate mutants of a progenitor having no significant precursors to the mammalian eye. Dawkins notes that the problem of the improbability of Darwinian evolution in a one-off event is solved by replacing such a one-off stage of Darwinian evolution, terminated by natural selection, with a series of substages, each substage terminated by natural selection. By this process,

“natural selection . . . breaks the problem of improbability up into small pieces. Each of the small pieces is slightly improbable, but not prohibitively so.”

Critique

Improbability and probability are complements. Their sum is 1. Consequently, if the process of replacing a stage of probability with a series of substages breaks up the large improbability of the single stage into the smaller pieces of improbability of the individual substages, then it must be true that the process concomitantly breaks up the complementary small piece of probability of the single stage into the larger pieces of probability of the series of substages. That is absurd.

Dawkins uses an expression of the process of addition (break up into smaller pieces) to express a relationship of multiplication. That relationship of multiplication is that the product of a series of probabilities equals the probability of the series as a whole, which is the probability of the corresponding overall single stage of probability. Dawkins’ error requires that nonsense be true, i.e. the breaking up of a small piece into larger pieces.
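As a check on this multiplication relationship, here is a minimal Python sketch, using the 1/6-per-substage and 1/216 figures that appear later in this essay: the substage probabilities multiply to give the probability of the series as a whole; they are not fragments of a sum.

```python
from fractions import Fraction

# Sketch: the probability of the series as a whole is the PRODUCT of the
# substage probabilities, not a sum of "pieces".
p_substage = Fraction(1, 6)        # per-substage probability (Dawkins' dial example)
p_series = p_substage ** 3         # probability of the three-substage series as a whole
print(p_series)                    # 1/216, the single-stage probability
```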

Dawkins has not presented a solution to what he labels ‘the problem of improbability’. Actually, there is no problem of improbability. Every value of probability (or improbability) is equally valid over its continuous range of definition. If any value of probability is accepted as an explanation, every value over the entire continuous range of definition must be accepted as an explanation. Richard Dawkins cannot divide the range of a continuous variable into two discrete segments, such as ‘prohibitively improbable’ and ‘not prohibitively improbable’. In fact, he has rightly argued against any such division of the range of definition of a continuous variable. By his own argument he cannot accept values of probability in one segment of the continuous range as an explanation and reject values of probability over a different segment of that range as an explanation. He has not cogently defined a ‘problem of improbability’.

Replacing a single stage of Darwinian evolution with a series of substages has no effect on the probability of success of evolution through natural selection. Rather, it increases efficiency in the generation of mutations.

The Darwinian dial locks of Richard Dawkins are excellent for illustrating the role of gradualism in Darwinian evolution. However, Dawkins chose a numerical example (beginning at minute 4:25) that was too complicated for him to understand. Indeed, he fooled himself. He chose one lock of 3 dials of 6 positions each (3 mutation sites of 6 mutations each). He compared this to a series of three locks with each lock having one dial of six positions (3 independent mutation sites of 6 mutations each). The one lock of 3 dials was subjected to mutation and natural selection in a single stage. The series of three locks corresponded to Darwinian evolution in three substages.

Richard Dawkins would have understood his own illustration of Darwinian evolution if he had chosen locks of fewer positions per dial (fewer mutations per mutation site). A good choice would still be one lock of 3 mutation sites (dials), but of just 2 mutations (positions) each. This would be compared to a series of three locks (each one a single mutation site) as in his illustration, but of only 2 positions (2 mutations) each instead of 6 mutations each.

The single lock would define a total of 8 possible mutations in just one stage of natural selection:

0, 0, 0        0, 1, 0        1, 0, 0        1, 1, 0        0, 0, 1        0, 1, 1        1, 0, 1        1, 1, 1

The set of three independent locks would define a total of 6 possible mutations, 2 mutations in each of 3 substages of natural selection. In each substage only one lock is subject to mutation. The other two mutation sites (locks) are Not Affected within the substage:

Substage #1              Substage #2              Substage #3

NA, NA, 0                   NA, 0, NA                  0, NA, NA

NA, NA, 1                   NA, 1, NA                  1, NA, NA

It requires a maximum pool of 8 non-random mutations to ensure the presence in the pool of one copy of the mutation that can survive natural selection in the single stage, which affects all three mutation sites simultaneously. The probability of success of natural selection is 100%.

It requires a maximum pool of 2 non-random mutations to ensure the presence in the pool of one copy of the mutation that can survive natural selection in each of the three substages, where only one mutation site (lock) is affected in each substage. The probability of success of natural selection is 100% for each substage and 100% for the series of three substages.

Both processes are equal in the probability of success of natural selection. The series is more mutationally efficient in that it requires only six mutations in total, rather than 8 for the single stage.
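The following is a minimal Python sketch of the coin-lock comparison above; the combination (1, 1, 1) is an arbitrary stand-in for the mutation favored by natural selection.

```python
from itertools import product

# Single stage: all three sites mutate together; the pool that guarantees
# inclusion of any target is the full listing of 2**3 = 8 combinations.
single_stage_pool = list(product([0, 1], repeat=3))
target = (1, 1, 1)                       # arbitrary combination favored by selection
print(len(single_stage_pool))            # 8
print(target in single_stage_pool)       # True -> success guaranteed (probability 100%)

# Series of substages: each site mutates alone; each substage needs only its
# own 2-entry listing, so the guaranteed pool is 3 * 2 = 6 mutations in total.
substage_pools = [[0, 1], [0, 1], [0, 1]]
print(sum(len(pool) for pool in substage_pools))                  # 6
print(all(t in pool for t, pool in zip(target, substage_pools)))  # True -> also 100%
```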

The same situation of greater mutational efficiency of the series of substages compared to the single stage of Darwinian evolution is true for Dawkins’ example where each mutation site was subject to six mutations each rather than merely two. However, the numbers were too large for Dawkins to see that the contrast was that of mutational efficiency, not that of a difference in the probability of success of natural selection.

In explaining his example, Dawkins described the probability of success of the single stage as 1/216, thereby requiring a maximum of 216 “tries”. In contrast, he characterized the series of three substages as a probability of 1/6 per substage and a probability of 1/18 overall. He thereby claimed that the series of substages resulted in a greater probability of success of natural selection and thus solved ‘the problem of improbability’ of Darwinian evolution (The God Delusion, p 121). Without this solution, Darwinian evolution remains “absurd” in Dawkins’ judgment (The God Delusion, p 122).

Due to the relatively large numbers of possible mutations (216 compared to 8), Dawkins mistook the total of different non-random mutations for random mutations, that is, for “tries”. He also identified 216 as a minimum, rather than as the maximum, of non-random mutations in the pool subjected to natural selection that is required to ensure the inclusion of at least one copy of the mutation capable of surviving natural selection.

If Dawkins were really discussing random mutations or “tries”, rather than non-random mutations, it would take a minimum of one random mutation to achieve success in the single stage, but at a probability of 1/216 = 0.46%. Raising the number of “tries” to 216 random mutations would result in a probability of success of 63.3% in the single stage. The number 216, however, cannot be construed to be a minimum.
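A small Python sketch of this “tries” arithmetic:

```python
# One random try succeeds with probability 1/216; 216 independent random tries
# succeed with probability 1 - (215/216)**216.
p_single_try = 1 / 216
print(f"{p_single_try:.2%}")           # ~0.46%

p_216_tries = 1 - (215 / 216) ** 216
print(f"{p_216_tries:.1%}")            # ~63.3%
```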

For 2, 3, 4, 5, and 6 mutations per mutation site, the single stage affecting all three sites requires, respectively, a maximum of 8, 27, 64, 125, and 216 non-random mutations to ensure a probability of 100% success of natural selection. In contrast, the series of three substages requires a total of 6, 9, 12, 15, and 18 non-random mutations, respectively, to ensure a probability of 100% success of natural selection. This makes it quite clear that the series of substages is more efficient than the single stage, with sub-staging having no effect on the probability of success.
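A short Python sketch verifying these pool sizes (n^3 for the single stage versus 3n for the series, for n mutations per site):

```python
# Maximum guaranteed pool: n**3 non-random mutations for the single stage
# versus 3*n for the series of three substages.
for n in range(2, 7):
    print(n, n ** 3, 3 * n)
# 2 8 6 | 3 27 9 | 4 64 12 | 5 125 15 | 6 216 18
```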

If the time of generation of mutations were proportional to the number of mutations generated, then mutational efficiency could be expressed as temporal efficiency.

In discussing Darwinian locks in The God Delusion (p 122), Dawkins does this. He refers to the series of substages of Darwinian evolutionary locks as successful “in no time”, in contrast to the longer time required by the single stage of a Darwinian evolutionary lock. Yet he remains oblivious to the fact that he is citing a genuine difference in efficiency, mistaking it for a difference in the probability of success of natural selection, which is 100% for both the series of substages and the single stage.

The significance of ‘prior’ in the jargon of Bayes’ theorem is that of priority in acquiring knowledge. It is the distinction between an algebraic term that is given, i.e., a ‘prior’, and an algebraic term that is to be calculated. This renders the distinction trivial, because Bayes’ theorem consists of four algebraic terms, each of which can be expressed as the term to be calculated as a function of the other three as given, i.e., as ‘prior’.

Bayes’ theorem is an algebraic equation of the form

Y = X × (U/V)                      Bayes’ theorem equation

This, of course, is the classic equation of a straight line, where Y is a straight-line function of X, of which (U/V) is the slope of the line. Implicitly, X is similarly a straight-line function of Y, of which (V/U) is the slope of the line. Depending on context, the expression of Y as a straight-line function of X may be apropos, or the expression of X as a straight-line function of Y may be apropos, or neither may be appropriate.

Bayes’ theorem is of the form in which Y is expressed as a straight-line function of X. In Bayesian jargon, X is said to be the ‘prior’ of Y. In other words, X, which is the probability with respect to the general population, is known, and Y, which is the specific probability of a subset of the population, is calculated. Of course, the ratio (U/V) of the two probabilities, U and V, must also be known. Three probabilities must be known to calculate the fourth. Of course, to calculate the fourth, it suffices to know the ‘prior’ plus the ratio U/V, without knowledge of U and V individually.

Let us call all populations which have probability relationships identical to those to which Bayes’ theorem is applicable ‘Bayesian populations’. If for such a population we knew the value of (U/V) and the probability of a specific subset, Y, rather than the probability of the whole population, X, we could calculate the value of X. Even though the calculation is not technically that of Bayes’ theorem, we could employ Bayesian jargon and call the known value of the probability of the specific subpopulation the ‘prior’. This illustrates the triviality of the term ‘prior’ in Bayesian jargon. One non-Bayesian equation applicable to the Bayesian population is,

X = Y × (V/U)                       One non-theorem equation of a Bayesian population

The difference between the equation of Bayes’ theorem and the equation of this non-theorem is whether one knows (is given) as ‘prior’ the value of the probability of the general population, X, or the probability of a specific subset, Y, of the population. Indeed, depending on the particular population, one might more likely know the probability of the subset of the population rather than the probability of the general population.
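A minimal Python sketch of the two directions of this straight-line relationship, using the text’s symbols Y, X, U, and V; the numerical values are arbitrary illustrative choices, not taken from the text:

```python
def y_from_prior_x(x: float, u: float, v: float) -> float:
    """Bayes' theorem form: calculate Y, given X as the 'prior' plus the ratio U/V."""
    return x * (u / v)

def x_from_prior_y(y: float, u: float, v: float) -> float:
    """Non-theorem form for the same Bayesian population: calculate X, given Y as the 'prior'."""
    return y * (v / u)

# Round-trip check with arbitrary illustrative probabilities:
x, u, v = 0.30, 0.20, 0.25
y = y_from_prior_x(x, u, v)
print(round(y, 6))                        # 0.24
print(round(x_from_prior_y(y, u, v), 6))  # 0.3, recovering X
```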

An Illustration

In illustration, consider the population of students in a high school, where the property whose probability is of concern is blue eyes and the particular subset is freshmen. This is a Bayesian population. It is a population divided into two subsets by each of two properties: blue eyes and non-blue eyes, and freshmen and non-freshmen. The two properties are independent of one another. For both equations:

X is the probability of blue eyes in the general student population.

Y is the probability of blue eyes in a specific subset of the population, the subset composed of all freshmen.

U is the probability of blue-eyed freshmen in the subset of all blue-eyed students.

V is the probability of freshmen in the general student population.

For the equation of Bayes’ theorem, X is the ‘prior’, the ‘known’ as given, while Y is the unknown to be calculated. Thus, knowledge starts with the probability of blue eyes for the general population, i.e. the entire student body. The probability of blue eyes is then calculated for the specific subset, namely, the subset of freshmen.

For the non-theorem equation of the Bayesian population, Y is the ‘prior’, the ‘known’ as given, while X is the unknown to be calculated. Thus, knowledge starts with the probability of blue eyes for the specific population, the subset of freshmen, while the probability of blue eyes for the general student population is calculated.

For both equations, the ratio U/V is given. In the equation of the theorem, it appears as U/V and in the non-theorem equation as V/U. U is the probability of blue-eyed freshmen among all blue-eyed students. V is the probability of freshmen among all students.

A Non-application to a Bayesian Population

No one would apply Bayes’ theorem to the freshman-blue eyes illustration because of the obscurity of what must be known in order to apply it. Ranking X, Y, U, and V from least to most obscure yields:

V = the fraction of students, who are freshmen.

Y = Of freshmen, the fraction that have blue eyes.

X = Of all students, the fraction that have blue eyes.

U = Of students with blue eyes, the fraction who are freshmen.

In this example, Bayes’ theorem calculates Y, the next to the least obscure information in the list. Therefore, one would not use Bayes’ theorem. One would simply survey the freshman class to determine the fraction of freshmen with blue eyes. That would be much easier to ascertain than (1) X, the fraction of all students with blue eyes, plus (2) U, the fraction of freshmen in the set of all students with blue eyes. Such obscure information is required by Bayes’ theorem to calculate that which is less obscure.

An Application of Bayes’ Theorem

Let there be an inexpensive dermatological test for TB, which has no false negatives, but some false positives. The test would identify as positive all those with TB plus some without TB. Those few that tested positive could be given more expensive chest X-rays, that definitively distinguish between those who have TB and those who do not.

Suppose you test positive with the dermatological test. While waiting for the chest X-ray and its result, you would like to know the probability of your test’s being a true positive. You could employ Bayes’ theorem to calculate Y, if you knew the values of X, U, and V, where

Y = the fraction of those who test positive, who have TB

X = the fraction of the general population, who have TB

U = the fraction of those with TB, who test positive.

V = the fraction of the general population, who test positive.

A numerical example of this would be a population of which:

  • 2% had TB (X), all of whom tested positive (U = 1). There are no false negatives.
  • 4% of the population tests positive (V).

Employing Bayes’ theorem,

Y = X × (U/V) = 2% × (1/4%) = 0.5. The probability of having TB based on a positive dermatological test, in this example, would be 50%.
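A minimal Python sketch of this calculation, using the numbers given above:

```python
# X = 2% of the population have TB, U = 1 (no false negatives),
# V = 4% of the population test positive.
X = 0.02
U = 1.0
V = 0.04

Y = X * (U / V)       # probability of actually having TB, given a positive skin test
print(f"{Y:.0%}")     # 50%
```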

Another Non-Application of Bayes’ Theorem

The jargon of Bayes’ theorem emphasizes the knowledge of prior probabilities. Contrast the valid conjectures of the general probability of blue eyes among the students of two high schools, prior to any possible application of Bayes’ theorem to their freshman classes. One high school is in Sweden, the other in Kenya. Most likely the general probability of blue-eyed students for the Swedish school would be high and that for the Kenyan school would be low.

Would this be an example of Bayesian reasoning with respect to ‘priors’? Definitely not, for several reasons. One of these is that the comparison in the example is between two different populations, whereas Bayes’ theorem considers a population and a subset of that one population.

Conclusion

Bayes’ Theorem is clearly defined, but its jargon often renders its apparent application ambiguous and often erroneous. Also, Bayes’ theorem is often not useful because it requires more obscure information in order to calculate the less obscure.

Richard Dawkins claims that the solution to the ‘problem of improbability’ is gradualism. A large stage of probability is replaced by a series of substages of probability. Each substage is slightly improbable, but not prohibitively so (p 121, The God Delusion). In a lecture, “Climbing Mount Improbable” (beginning at min 4:25), Dawkins uses a set of three mutation sites of six mutations each to illustrate this. Mathematically, this is analogous to the mutations of three dice.

In Chapter 4 of The God Delusion, Dawkins uses this mathematical analysis as his central argument of ‘why there almost certainly is no God’. Obviously, God could not develop gradually. Thus, the mathematical solution to ‘the problem of improbability’ does not solve the improbability of God.

The Problem and Dawkins’ Numerical Solution

In a lecture, “Climbing Mount Improbable”, Dawkins notes that three mutation sites (e.g. three dice) of six mutations each (six faces), defines 6 x 6 x 6 = 216 possible mutations. If, instead of subjecting the three dice to mutation simultaneously, each is subjected individually, the result is a series of three substages of six mutations each. Dawkins characterizes the single stage as a probability of 1/216 and the three substages as a probability of 1/6 each.

In Dawkins’ jargon, the gradualism of the series breaks up the big improbability, 215/216, of the single stage into three smaller improbabilities of 5/6 per substage. But he is not that explicit. His illustrated comparison is between the 216 defined mutations of the single stage and the sum of 6 mutations per substage, for a total of 18 defined mutations for the series. His comparison is essentially of these two listings of total mutations, 216 vs. 18.

Dawkins’ Error

What Dawkins is comparing are not probabilities or improbabilities but total defined mutations. The total listing of defined mutations for the single, overall stage is 216. The total listing of defined mutations for the series of three substages is 18. BOTH lists contain the same three-digit sequence of interest. The single list runs from 111 to 666. The series of three substages yields three sub-lists of 1 to 6, 1 to 6, and 1 to 6. The digits of the three sub-lists correspond to the three digits of the single-stage listing. Dawkins’ comparison is not of probability or improbability, but of the total number of mutations in the lists, i.e. the one list of 216 compared to the sum of the three sub-lists, which in total is 18. Consequently, both the single stage and the series of substages represent a probability of success of 100% of containing the special 3-digit sequence. The two processes do differ. They differ in mutational efficiency, the series of three substages requiring fewer mutations by the mutational efficiency factor of 216/18 = 12.

The Same Analysis Using the Analogy of Three Coins

Let us make the problem simpler and more easily understood by illustrating it with the three mutation sites as three coins: P, a penny; N, a nickel; and D, a dime. Each site is of two mutations: heads and tails, or 1 and 2.

Instead of a list of 216 defined mutations, we have a list of 8 defined mutations for the three mutation sites subjected to mutation together. These 8 are: P1,N1,D1; P1,N1,D2; P1,N2,D1; P1,N2,D2; P2,N1,D1; P2,N1,D2 ; P2,N2,D1; P2,N2,D2.

Instead of a list of 18 defined mutations, we have a list of 6 defined mutations for the three mutation sites subjected to mutation individually in series. These are: P1,P2; N1,N2; D1,D2. Both lists contain every possible combination and both, therefore, have a probability of 100%. The series of three substages is more efficient in mutations at achieving the 100% probability than the single stage by a mutational efficiency factor of 8/6 = 1.33.

The gradualism of Dawkins’ solution does not increase the probability of success, thereby solving his ‘problem of improbability’. Dawkins’ gradualism increases mutational efficiency, while having no effect on the probability of success.

Dawkins’ argument is irrelevant to the existence of God. However, it renders Darwinian evolution ‘absurd’, which is the label Dawkins applied to Darwinian evolution in a single, overall stage (p 122, The God Delusion). Dawkins did not solve his ‘problem of improbability’, which, in his judgment, renders Darwinian evolution in a single, large stage ‘absurd’. The series of substages and the single stage have the same probability of success as one another, namely 100%. They differ in mutational efficiency. It is noteworthy that Dawkins uses nonrandom mutation in his comparison of a single stage vs. its series of substages.

The Mutational Efficiency of Gradualism in the Case of Random Mutation

Three Mutation Sites of Six Mutations Each (Dice)

With random mutation, the single overall stage would require 497 random mutations to achieve a probability of success of 90%. However, it would take only 19 random mutations per substage to achieve an overall probability of 90% for the series. That is a total of 57 random mutations for the series of substages. Thus the series is 497/57 = 8.7 times more mutationally efficient than the single stage, without any effect on the probability of success.

Three Mutation Sites of Two Mutations Each (Coins)

Five random mutations in each substage of the series would yield a probability of success per substage of 96.875% and an overall probability for the series of 90.9%. That would be a total of 15 random mutations. A probability of success of 90.9% for the single stage would require 18 random mutations. Thus the series is 18/15 = 1.2 times more mutationally efficient than the single stage, without any effect on the probability of success.
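A short Python sketch verifying the coin-case figures above:

```python
# 5 random mutations per substage, for each of 3 substages of 2 mutations per site.
p_per_substage = 1 - (1 / 2) ** 5
print(f"{p_per_substage:.3%}")        # 96.875%
print(f"{p_per_substage ** 3:.1%}")   # ~90.9% overall for the series (15 mutations total)

# 18 random mutations in the single stage of 8 possible combinations.
p_single_stage = 1 - (7 / 8) ** 18
print(f"{p_single_stage:.1%}")        # ~91.0%, essentially the same ~90.9% level
print(18 / 15)                        # 1.2, the mutational-efficiency factor
```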

Summary

In The God Delusion and in a lecture, “Climbing Mount Improbable”, Richard Dawkins claims that low values of probability pose a mathematical problem, which is solved by replacing a single stage of probability with a series of substages. The solution, which is that of gradualism, cannot apply to God, who cannot be subject to gradualism.

Dawkins uses a set of three mutation sites of six mutations each to illustrate his mathematical solution. What he has demonstrated is not, as he claims, an increase in the probability of success, but an increase in mutational efficiency.

This essay demonstrates the mutational efficiency of the series of substages. It also demonstrates that the gradualism of a series of substages has no effect on the probability of success. Dawkins’ claim is false. This is illustrated using 3 mutation sites of 6 mutations each (dice) and 3 mutation sites of 2 mutations each (coins).

Dawkins has not presented a God Delusion, but a Mathematical Self-Delusion, his personal error in mathematics, which is mistaking efficiency for probability.

Dawkins’ critics are right, but often for the wrong reasons. Often, their counter arguments are not only invalid, but less cogent than Dawkins’ false arguments with which they disagree. A case in point is the counter to Dawkins’ solution to ‘the problem of improbability’ in The God Delusion that Hahn and Wiker proposed in Answering the New Atheism.

Dawkins’ solution to ‘the problem of improbability’ is the central argument in The God Delusion (p 157) from which Dawkins concludes that there almost certainly is no God.

A Strawman

Dawkins rejects very low values of probability as explanatory and proposes that the replacement of a large, “one-off” stage of random mutation and natural selection by a series of substages, solves ‘the problem of improbability’. The series of substages increases the probability by breaking the improbability up into small pieces (TGD p 121). The example Dawkins cites is the development of the mammalian eye in one stage of Darwinian evolution, replaced by a series of substages (TGD p 122).

This proposed mathematical solution to ‘the problem of improbability’, namely, by gradual development, obviously cannot be applied to God.

In spite of Dawkins’ claim that he is presenting the solution of how to escape from chance (TGD p 120), Hahn and Wiker ignore that solution completely. They blatantly accuse Dawkins of unlimited faith in “Dawkins’ god, Chance” (ATNA Chapter 1). They argue against a strawman. Even so, Hahn and Wiker fail against that strawman.

Dawkins proposes that the solution to low values of probability is replacing the single stage of low probability with a series of substages, each of a high probability. In contrast, Hahn and Wiker propose that the solution to the low value of probability of a material event is the event’s being explained, not by a higher probability, but by its being due to an intelligent agent. They propose as an analogy the event of a perfect deal in bridge: its low probability would prompt the conclusion that someone stacked the deck (ATNA p 125).

Hahn and Wiker’s Almost Complete Lack of Understanding of Probability (Chance)

The probability of every deal in bridge is the same. Therefore, one could not conclude anything unusual about the ‘perfect deal’ based on its probability. If, in response to witnessing a perfect deal, one exclaimed, based on its probability, “Someone must have stacked the deck!”, he would have to make that exclamation after every deal in bridge.

Hahn and Wiker actually do come close to stating the definition of probability, namely, the ratio of a subset to a set. They note that the probability of rolling a five with a die is 1/6. That is the ratio of the subset of fives to the set of the six integers, 1 through 6. Rolling a die is a simulation of the static mathematical concept of probability illustrated by random selection.

But then Hahn and Wiker completely mess up. They define chance as “a secondary shadow of other beings and causes” (ATNA p 21). That is utterly meaningless. They arrive at this definition after claiming that the probability of 1/6 can be eliminated by removing the pips on the six faces of a die. In other words, a cube has six distinct faces only if they are labelled distinctly. In fact, it is the other way round. The faces of a cube can be labelled with six distinct labels because a cube has six distinct faces, whether labelled or not. But why let algebra and geometry stand in the way of critiquing Richard Dawkins’ published arguments?

Probability is not a property of anything material. It is a definition within the logic of mathematical sets. Humans typically have a set of five fingers per hand, but the concept of a set is fundamentally logical. Probability, the ratio of a subset to a set, is definitively logical.

There are valid critiques of Dawkins’ solution to ‘the problem of improbability’.

In his 1991 lecture, ‘Climbing Mount Improbable’, Richard Dawkins made two main errors. He counted a set of 7, namely 0 through 6, and came up with a total of 6. Each of the dials of the lock in the illustration has 7, not 6, stops or ‘mutations’ (a still frame of the dial set at minute 7:36 of the video is shown above). This error is a trivial error of practice, not an error of concept. The other error was conceptual. Dawkins analyzed the stops of each dial of the lock as a chance rather than as a choice. He has persisted in this error in The God Delusion (p 121-122), where it serves as his ‘solution’ to ‘the problem of improbability’.

In the 1991 lecture, Dawkins contrasted two variations. In one case, the three dials were interdependent in unlocking a single locking mechanism. In the other case, each dial independently controlled its own locking mechanism. All three had to be unlocked to achieve complete unlocking.

Dawkins considered: What is the minimum number of ‘mutations’ which must be listed to include every possible unlocking combination in the two different cases? That is, what is the minimum number of deliberate choices? (Let 6 be the number of stops or mutations per dial.) If one were to write them down, the first list would number 216 three-digit sequences, from 111 to 666. In the second case of three independent locks, the three-digit sequence would be covered by lists of 1 through 6, one list for each of the dials. That is a minimum total of 18 mutations, in contrast to a list of a minimum of 216 mutations to ensure 100% probability of unlocking. That is a mutational efficiency of 12 in favor of the independent dials of choice; the ratio is 216/18.

Dawkins’ conceptual error arose from the fact that he used the jargon of chance and probability in an analysis of the efficiency in the minimum number of mutations of the dials, i.e. the minimum number of deliberate choices. He calls the deliberate choices on the lock dial ‘tries’ and ‘random’, as if they were stops on a wheel of chance spun at a carnival. He contrasts a probability of 1/216 to three probabilities of 1/6. Due to such jargon, he reaches the false conclusion that the series of three independent lock dials presents an increase in the probability of success in unlocking when compared to the three interdependent locking dials.

His jargon did him in conceptually. He mistook dials of choice for wheels of chance and consequently mistook an increase in mutational efficiency for an increase in the probability of success. He claims to have smeared out the luck, getting it in dribs and drabs, rather than in one big dollop. In fact his entire analysis was of efficiency in the number of defined mutations and had nothing to do with probability.

The central theme of The God Delusion by Richard Dawkins depends upon the validity of his demonstration of a solution to a mathematical ‘problem of improbability’. He claims that a process of gradualism increases a value of probability from  a ‘prohibitive’ value close to zero to a ‘non-prohibitive’ value. According to Dawkins, this increase is the role of gradualism in Darwinian evolution. Replacing a single stage of Darwinian evolution with a series of substages “breaks the problem of improbability up into small pieces.” (p 121). There is no comparable mathematical solution to the improbability of God (p 113).

Dawkins’ arguments fail to support his central theme, namely: The mathematical solution to the ‘problem of improbability’ is the replacement of a single stage of low probability with a series of substages. In this essay I propose to demonstrate that Dawkins fails to understand the distinctions among three mathematical ratios relevant to his central theme. These three ratios are those of density, probability, and efficiency.

Density

A density is the ratio or concentration of a specific material uniformly distributed within a generic material. Physical units are part of the definition.

For example, 1 gram of salt per liter of water

For example, 1 green marble/5 marbles

When sets are defined by a density, all subsets of an identified population are of the same composition. With respect to density, e.g. that of one green marble per every 5 marbles, every subset of 5 marbles is identical in composition.

Probability

A probability is the numerical ratio of the elements of a specific subset to the elements of the generic set. There are no physical units.

For example, a probability of green marbles: 1/5

When a population of subsets is algorithmically defined as randomly generated on the basis of a probability, the population of subsets generated by the algorithm is a distribution of subsets which differ from one another in composition.

For example, the population of subsets of 5 marbles each, based on a probability of green marbles of 1/5, contains six types of subsets differing by composition. The six types contain 0 to 5 green marbles each. Further, the number of subsets of each of these 6 different compositions in the overall population are distributed, respectively: 1024, 1280, 640, 160, 20, and 1. These six different kinds of subsets total 5^5 = 3125 subsets, which is the minimum number of subsets required to characterize the random population defined by the algorithm. The probability of each of these six different subsets is its number in the minimum population divided by 3125. The minimum number of marbles required to represent the population distribution is 5^6 = 15625.

Numerical Distribution of the Six Different Kinds of Subsets of 5 Marbles Each,
Based on a Source Probability of Green Marbles of 1/5

Column A: Number of green marbles per random subset of 5 marbles
Column B: Power to which 4 is raised to give Column C
Column C: Number of different combinations of non-green marbles (4^B)
Column D: Formula for calculating Column E, based on green marbles
Column E: Number of different color patterns
Column F: C times E, the total number of random subsets of type A
Column G: Probability of a subset of type A, i.e. F/3125

A       B     C        D              E      F       G
0       5     1024     1              1      1024    0.3277
1       4     256      5/1            5      1280    0.4096
2       3     64       (5∙4)/2!       10     640     0.2048
3       2     16       (5∙4∙3)/3!     10     160     0.0512
4       1     4        5!/4!          5      20      0.0064
5       0     1        5!/5!          1      1       0.0003
Total                                        3125    1.0000

The sum of column F is 5^5 = 3125 subsets of 5 marbles each.

In contrast to column C  for non-green marbles, the number of different combinations of green marbles is 1 for each of the six different types of subsets.

The basis listed in column D is that of green marbles. The same results are had on the basis of non-green marbles.

In contrast to density, where the entire population of sets is modal, for a population of subsets of size X formed on the basis of a probability of 1/X, the fraction of the population which consists of modal subsets is [(X-1)^(X-1)]/[X^(X-1)]. In the above case, (4^4)/(5^4) = 0.4096. This fraction is a maximum of 0.5 for X = 2 and decreases with increasing X.

The probability of each of the six types of random subsets of 0 through 5 green marbles is its value in column F divided by 3125. Thus the probability of the modal subset, which contains 1 green marble, is 1280/3125 = 0.4096 or 41%. In contrast, based on a density of 1 green marble/5 marbles, the probability would be 100% that each subset of 5 marbles contained 1 green marble.
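A short Python sketch reproducing the table, using the binomial count C(5, k) × 4^(5−k) for the number of subsets containing k green marbles:

```python
from math import comb

# Subsets of 5 marbles generated with a green-marble probability of 1/5.
total = 5 ** 5                                    # 3125 equally likely subsets
for greens in range(6):
    count = comb(5, greens) * 4 ** (5 - greens)   # color patterns x non-green combinations
    print(greens, count, round(count / total, 4))
# 0 1024 0.3277 | 1 1280 0.4096 | 2 640 0.2048 | 3 160 0.0512 | 4 20 0.0064 | 5 1 0.0003
```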

Dawkins’ Mistaking Density for Probability

On pages 137-138 of The God Delusion, Dawkins takes one billion × one billion planets as a conservative estimate of the number of planets in the universe. He then calculates the number of earth-like planets in the universe, allegedly based on a probability of earth-like planets in the universe of 1/billion. His answer is one billion earth-like planets.

“This conclusion is so surprising, I’ll say it again. If the odds of life originating spontaneously on a planet were a billion to one against, nevertheless that stupefyingly improbable event would still happen on a billion planets.”

It is true that based on a density of 1 earth-like planet per 1 billion planets, then a set of one billion × one billion planets (that of our universe) would contain one billion earth-like planets. However, based on a probability of earth-like planets of 1/billion, then a random subset of one billion × one billion planets (such as our universe) might contain any number of earth-like planets from zero to one billion × one billion.

In the tabulated example above, a probability of green marbles of 1/5, defines a population distribution of 3125 subsets of 5 marbles each. The probability of a set of exactly one green marble is 41%; the probability of a set of 0 is 32.8%; and the probability of a set of 1 or more is 67.2%. In this example, the erroneous conclusion analogous to that of Dawkins would be that each of the 3125 different subsets, comprising the population distribution, would not differ from one another. Instead each subset of five marbles in the population would contain exactly 1 green marble and four non-green marbles. Dawkins mistakes density for probability.

Sample Size with Respect to Density and to Probability

In his example of earth-like planets, Dawkins considers a density, the numerical value of which is 1/X, and a sample size of X^2. The number of earth-like planets in the sample is N = Density × Sample Size, i.e. N = (1/X) × X^2 = X. The sample size of X^2 is fully adequate in this application and in every application of a density of 1/X. Not so for a probability of 1/X. Notice from the table above that a sample size of X^2 is a sample of 25 marbles, which number would be woefully inadequate to represent the population defined by a probability of green marbles of 1/5 and comprised of subsets of 5 marbles each. The minimum number of subsets required to characterize the population is not 5, but 3125 = 5^5. The corresponding minimum number of marbles to characterize the population is not 25, but 5^6 = 15625. A total of 25 marbles is too small a fraction of the minimum number of marbles required to be characteristic of the population. That fraction is 25/15625 = 1/625.

For a defining probability of 1/X and where the subset size of the defined population of subsets is X, the number of subsets required to characterize the population is X^X and the number of elements required is X × (X^X), or X^(X + 1).

Dawkins considers a subset size of X^2.

For a defining probability of 1/X and where the subset size of the defined population of subsets is X^2 elements, the number of subsets required to characterize the population is X^(X^2) and the total number of elements required is X × [X^(X^2)].

The sample size of X^2, as a fraction of the minimum number of elements required to represent the population, gets smaller and smaller with increasing X. For a defining probability of 1/X and where the subset size of the defined population of subsets is X^2 elements, that fraction is:

X^2/{X × [X^(X^2)]} = X/[X^(X^2)]

For X = 2, the fraction is 1/8

For X = 3, the fraction is 1/6561 = 1/[6.56 × 10^(3)]

For X = 4, the fraction is 1/[1.07 × 10^(9)]

For X = 5, the fraction is 1/[5.96 × 10^(16)]

For X = 10, the fraction is 1/(10^99)

For X = 10^2, the fraction is 1/(10^19,998)

For X = 10^3, the fraction is 1/(10^2,999,997)

For X = 10^4, the fraction is 1/(10^399,999,996)

For X = 10^5, the fraction is 1/(10^49,999,999,995)

For X = 10^6, the fraction is 1/(10^5,999,999,999,994)

For X = 10^7, the fraction is 1/(10^699,999,999,999,993)

For X = 10^8, the fraction is 1/(10^79,999,999,999,999,992)

For X = 10^9, the fraction is 1/(10^8,999,999,999,999,999,991)

The larger the value of X, the smaller the probability, 1/X, and even more so does X^2 become smaller as a fraction of the minimum number that is required to characterize the population. Thus, for a randomly generated population of subsets, each containing one billion × one billion elements, based on a probability of one per billion (this is Dawkins’ case of X = 10^9), the ratio of a sample size of one billion × one billion elements to the number required to characterize the population would be infinitesimal, roughly 1 part in 10 raised to the power of 9 billion × billion. These values illustrate the possible magnitude of the error of failing to distinguish probability from density.
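A short Python sketch of the text’s fraction, X^2/{X × [X^(X^2)]} = X^(1 − X^2), reported as a base-10 exponent because the values quickly become astronomically small:

```python
from math import log10

# log10 of the fraction X**(1 - X**2) for a few values of X from the list above.
for X in [2, 3, 4, 5, 10, 10**2, 10**3, 10**9]:
    exponent = (1 - X ** 2) * log10(X)
    print(X, exponent)
# X = 2 gives about -0.903 (i.e. 1/8); X = 10**9 gives about -9e18,
# i.e. 1 part in 10 raised to the power of roughly 9 billion x billion.
```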

Efficiency

Efficiency is the ratio of output to input.

The physical units are those of output to those of input.

A Game Show Illustration of Efficiency

Let the prop of a game show be two sets of two doors, one set red and one set green. Behind one door of each set is a prize. However, to win, the contestant must open both of the two doors behind which are the prizes.

If the contestant is allowed to select any number of combinations of two doors, one from each set, what is the minimum number of selections he would need to insure winning? The answer is four: Red1-Green1, Red1-Green2, Red2-Green1, and Red2-Green2.

If the contestant is allowed to select any number of doors to open from each of the two sets, what is the minimum number of selections he would need to insure winning? The answer is four, Red1, Red2, Green1, and Green2.

The efficiency of winning in both cases is (100% success)/(4 selections).

Suppose we add a third pair of doors, a pair of blue doors with a prize behind one of the blue doors. The contestant must now open each of the three doors behind which is a prize. Where each selection consisted of one door per set, the minimum number of selections of combinations would now be 2^3 = 8 selections. The efficiency of winning would be (100% success)/(8 selections). In contrast, the minimum number of selections for the three individual pairs of doors would be 6. The efficiency of winning would be (100% success)/(6 selections).

Dawkins’ Demonstration of Efficiency, Calling It Probability

On page 122 of The God Delusion, Dawkins proposes an analogous comparison of two selection processes or selection algorithms. He compares the efficiency of selection (mutational efficiency) in a single stage to that of a series of substages. He concludes that the series of substages is more efficient because, the fewer the selections required for 100% success, the less time is required. His expression of greater efficiency at a level of 100% success is “in no time” for the algorithm of greater mutational efficiency. Yet, he mistakes the greater efficiency of the series of substages as an increase in the probability of success, not realizing that both processes are equal in the probability of success, namely 100%.

Dawkins makes the same mistake in his 1991 lecture, ‘Climbing Mount Improbable’ (beginning at minute 4:25). He compares (1) the efficiency of a series of three substages of mutation and natural selection, where each substage is confined to one of three mutation sites of 6 mutations each, to (2) the mutational efficiency of the three mutation sites in a single stage. The mutational efficiencies are (100% success)/(a minimum of 18 nonrandom mutations) vs. (100% success)/(a minimum of 216 nonrandom mutations). Dawkins claims the comparison is between a probability of success of 1/18 for the series of three substages and a probability of success of 1/216 for the single stage. In fact, in his demonstration, both processes have a probability of success of 100%. They differ in the minimum number of nonrandom mutations required to achieve that same 100% probability of success. Thus, they differ in mutational efficiency, not in probability. The series of substages is more efficient than the single stage by a factor of 12: (100%/18 mutations) / (100%/216 mutations) = 12.

Dawkins doesn’t even understand his own parable of ‘Climbing Mount Improbable’. Using the label of “absurd”, Dawkins misidentifies the analogy of “leaping from the foot of the cliff to the top in a single bound” (p 122) as something contrasted with “evolution”, which in the analogy is “creep(ing) up the gentle slope to the summit”. What he actually identifies in his parable as “absurd” is not an alternative to evolution. From the preceding paragraph on page 121, it is evident that Darwinian evolution in a single stage (“a one-off event”) is that which in the analogy is “leaping from the foot of the cliff to the top in a single bound” and is “absurd”.

Dawkins’ parable of ‘Climbing Mount Improbable’ is the analogy: Darwinian evolution in a single stage (“a one-off event”) IS TO Darwinian evolution in a series of substages AS “leaping from the foot of the cliff to the top in a single bound” IS TO “creep(ing) up the gentle slope to the summit”. Dawkins makes the first comparison on page 121 of The God Delusion. The next paragraph makes the second comparison of the analogy as a parable. That paragraph begins, “In Climbing Mount Improbable, I expressed the point in a parable.”

Dawkins identifies Darwinian evolution in a single stage as “absurd” because of its low value of probability (high improbability). He thinks the series of substages results in increasing the probability of success of Darwinian evolution. In fact, it increases the mutational efficiency of Darwinian evolution, while having no effect on the probability of success. That leaves Darwinian evolution in a series of substages just as “improbable” and just as “absurd” as Darwinian evolution in a single stage or “one-off event” (p 121). They are equally probable (or improbable). Dawkins’ misidentification of efficiency as probability commits him to identifying Darwinian evolution as “absurd” whether in a “one-off” stage or in a comparable series of substages, because they are equally improbable. He is just unaware of his blanket identification of Darwinian evolution as “absurd”.

Conclusion

Richard Dawkins’ failure to understand the definitions of the three mathematical ratios of Density, Probability, and Efficiency completely destroys his central argument of The God Delusion. In The God Delusion, Richard Dawkins mistakes density for probability and mistakes mutational efficiency for probability. The central theme of The God Delusion is thus completely false. That theme is that there is a mathematical solution to the improbability of Darwinian evolution in a single overall stage, but there is no mathematical solution to the improbability of God. Dawkins mistakenly believes that he has demonstrated that the probability of success of Darwinian evolution of a single stage of mutation and natural selection is greatly increased by gradualism, i.e. the replacement of a single stage of Darwinian evolution by a series of substages.

He hasn’t demonstrated that. Dawkins has demonstrated that gradualism in Darwinian evolution increases mutational efficiency, while having no effect on the probability of success.

Fortunately, his mistaking efficiency for probability does not diminish the lucidity with which Richard Dawkins actually demonstrated the mutational efficiency of gradualism in Darwinian evolution. No one else has demonstrated or even mentioned this efficiency, which is the mathematical role of gradualism in the Darwinian algorithm of evolution. In this we are all beneficiaries of Richard Dawkins.

End Note

What does determine the probability of success in any stage of Darwinian evolution, where a stage consists of random mutation terminated by natural selection? It is the number of random mutations in that stage. As an illustration, consider Dawkins’ example of a single stage, namely, three mutation sites of six mutations each. For 216 random mutations, the probability of success of Darwinian evolution is:

P = 1 – (215/216)^(216) = 63.3%

For 497 random mutations, the probability of success of Darwinian evolution in a single stage is:

            P = 1 – (215/216)^(497) = 90.0%

For the series of three substages in Dawkins’ example, it would take 19 mutations in each substage, or a total of 57 mutations for the series, to achieve a probability of 90%. Thus, the series would be 497/57, or 8.7 times more mutationally efficient than the single stage of Darwinian evolution at the 90% level of the probability of success.
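A short Python sketch verifying the figures in this End Note:

```python
# Single stage: 3 sites of 6 mutations each, 216 possible combinations.
p_single_216 = 1 - (215 / 216) ** 216
p_single_497 = 1 - (215 / 216) ** 497
# Series: 19 random mutations in each of the 3 substages of 6 mutations each.
p_series_19 = (1 - (5 / 6) ** 19) ** 3

print(f"{p_single_216:.1%}")    # ~63.3%
print(f"{p_single_497:.1%}")    # ~90.0%
print(f"{p_series_19:.1%}")     # ~90.9%, with 57 mutations instead of 497
print(497 / 57)                 # ~8.7, the mutational-efficiency factor
```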

Richard Dawkins has single-handedly demonstrated that the role of gradualism in Darwinian evolution is to increase mutational efficiency.

Gradualism Versus Immediacy

Biological evolution is modification with descent. The modification with descent must be a process having a very low gradient, but yielding major changes by accumulation over many generations. The permanence of change is effected through natural selection which terminates each stage in a series of intermediate stages. The resulting modern biological forms of two divergent evolutionary lines may be dramatically different. For example, the modern pig and modern man are identified as the current forms of two divergent evolutionary lines having a shrew-like, common ancestor 85 million years ago. Each line is composed of thousands of evolutionary stages. Each stage is terminated by natural selection, resulting in an intermediate, now extinct.

Biological evolution is necessarily gradual. No one would propose the immediacy of evolution, e.g. that the shrew-like common ancestor of the modern pig and modern man could have directly given birth to a modern pig or to a modern man, or to both. Richard Dawkins states,

If a time machine could serve up to you your 200 million greats grandfather, you would eat him with sauce tartare and a slice of lemon. He was a fish. Yet you are connected to him by an unbroken line of intermediate ancestors, every one of whom belonged to the same species as its parents and its children.

This quotation is an illustration of a mathematical principle, which Dawkins identifies as essential to biological evolution, i.e. essential to modification with descent in which kind begets its own kind. The mathematical principle is: Any two values of a variable, which is continuous over its range of definition, differ from one another by degree, not kind. It should be noted that in the expression ‘same species’ in the above quotation, the word ‘species’ has the meaning of ‘nature’ in philosophy. It does not refer to species in the context of biological taxonomy. Thus, according to Dawkins, the modern pig (Sus scrofa) and modern man (Homo sapiens) differ from one another by degree, not by kind, i.e. they do not differ in species/nature. The word ‘species’ thus has two different meanings, one in evolution and one in taxonomy.

Serial Probability and Chance

According to Dawkins, the identity of an individual at conception is a matter of probability. It includes the probability that any one sperm, among the many in an ejaculation, would fertilize a human egg. Further, the probability is not in isolation. The true probability of the identity of an individual at conception is that of a series extending back through many human generations, such that any human fetus is properly identified as an astronomical improbability (“The Great Tim Tebow Fallacy” by Richard Dawkins, 2010, once accessible at the Internet address, http://newsweek.washingtonpost.com/onfaith/panelists/richard_dawkins/2010/02/the_great_tim_tebow_).

Dawkins acknowledges that some critics of Darwinian evolution, whom he identifies as creationists, have proposed that this same serial perspective must be applied to biological evolution, thereby identifying the probability of evolution as chance in Dawkins’ lexicon.

Dawkins argues that Darwinian evolution is not a matter of chance. The pig and the human did not evolve by chance. “Chance is not a solution, given the high levels of improbability we see in living organisms, and no sane biologist ever suggested that it was.” (p 119-120, The God Delusion) The solution to what appears to be chance, to what Dawkins identifies as the improbability of evolution, in a ‘one-off’ event or in a series in toto, is the ‘power of accumulation’. Mutations accumulate over a series of small stages of Darwinian evolution, where each small stage is terminated by natural selection. It is natural selection “which breaks the problem of improbability (of a series) up into small pieces. Each of the small pieces is slightly improbable, but not prohibitively so” (p 121, The God Delusion). Dawkins concedes that Darwinian evolution would have to be rejected if its probability is taken as the overall probability of the series, rather than as the probability of each discrete, incremental substage.

The Obvious Self-Contradiction

According to Dawkins, conception of an individual human may not be viewed as the termination of a discrete substage and its probability viewed as that of one sperm of all those in a single ejaculation. Such a view would be an instance of the tyranny of the discontinuous mind. No, the individual human produced at conception, is the end product of a long series of ancestral ejaculatory probabilities and, is thus, an astronomical improbability, the improbability of the series as a whole.

In direct contrast, according to Dawkins, the probability of biological evolution must be viewed as being terminated by natural selection in each and every discrete substage of mutation. Thus, the probability of each discrete substage is isolated from all the other substages in the evolutionary series of mutational descent.

Recognizing the probability of each substage of the evolutionary series as discrete is not an instance of Dawkins, himself, imposing the tyranny of the discontinuous mind. According to Dawkins, natural selection, by terminating each substage of mutation, actually divides the series of evolutionary probabilities into discrete probabilities. Those who oppose such discrete divisions, in the case of evolution, are not allied with Dawkins by opposing the tyranny of the discontinuous mind. In the instance of biological evolution, they oppose Dawkins’ discrete analysis of serial substages of evolution because they just don’t “understand the power of accumulation” (p 121, The God Delusion).

Dawkins’ two positions are self-contradictory. (1) He claims that the probability of an individual in generational descent at the time of his conception cannot be the probability of the one sperm (which fertilizes the egg) among the sperm in one ejaculation. Rather the probability of the individual at his conception is the arithmetical product of the probabilities of the entire series of conceptions of the individual’s ancestors. Yet, (2) he claims that the probability of survival by the mutant, surviving natural selection in one substage of evolutionary descent, is the probability of survival of that discrete substage alone, prior substages having no probabilistic relevance.

According to Dawkins, the serial probabilities of survival of the substages of evolutionary descent are probabilistically discrete, because each substage of evolution is terminated by natural selection. In contrast, fertilization allegedly does not terminate the probability of the conceived individual’s identity, assessed as the probability of one sperm of an ejaculation. Conception may not be viewed as terminating a discrete stage of probability. The ejaculatory probability of the individual’s identity at conception is the mathematical product of the series of probabilities of all his ancestors.

The Basis of Valid Perspectives: Probability Is a Static Concept, Not a Dynamic Material Property

Probability is a static concept in the mathematics of sets. It is the ratio of a subset to a set. When probability is applied to dynamic material processes, one is proposing an analogy based upon human ignorance of the dynamic material forces which definitively determine a material outcome. Probability is solely logical, not material. Material analogies of probability are visual aids to the understanding of logical principles. (Of course, the mathematics of probability may be applied in material cases to compensate for the human ignorance of the dynamic material forces involved. Probability characterizes human ignorance of the material forces, not the forces themselves.) Whether one chooses to consider a single probability in a specified series or the probability of the total series, the choice is one of perspective within the logic of mathematics, having nothing to do with material reality.

Whether one considers the probability of one sperm in an ejaculation resulting in conception or the probability of a series of such in an ancestral line, it is a matter of solely logical perspective. Similarly, considering the probability of a discrete substage of Darwinian mutation and natural selection or the probability of a series of such is not a matter of right or wrong. It is solely a matter of logical perspective.

Dawkins is right in that any two values of a variable, which is continuous over its range of definition, differ from one another by degree, not kind. Probability (as well as improbability) is such a variable. He is, therefore, wrong in distinguishing two kinds of improbability, namely (1) prohibitively improbable and (2) slightly improbable, but not prohibitively so (p 121, The God Delusion).

In contrast to probability, science is not an exercise in pure logic. It is the inference of mathematical relationships inherent in measurable material properties.

Dawkins’ Major Contribution to Our Understanding of Darwinian Evolution

Dawkins has erred in failing to recognize as merely a logical perspective the distinction between (1) the probability of a discrete substage in a series and (2) the overall probability of the series. However, this error is entirely irrelevant to his truly great contribution to our understanding of gradualism and mutational efficiency in Darwinian evolution. As inconsequential as his error happened to be, its setting was fortuitous. It was the setting which prompted his lucid and definitive demonstration that the role of gradualism in Darwinian evolution is to increase mutational efficiency.

In 2006, on page 122 of The God Delusion, Dawkins indicated that the role of gradualism is that of mutational efficiency in noting that replacing a single large stage of mutation and natural selection with a series of discrete substages would result in greater temporal efficiency in evolutionary success. He described this mutational efficiency as “in no time”, where time is directly proportional to the number of mutations, which is the measure of mutational efficiency.

In 1991, in a lecture entitled “Climbing Mount Improbable” (beginning at minute 4:25), Dawkins definitively illustrated the role of gradualism in Darwinian evolution, although not as clearly as one might wish. For single-handedly demonstrating that the role of gradualism is to increase mutational efficiency, Richard Dawkins has earned our respect and gratitude.

Summary

In his 2011 essay on the properties of continuous variables, Dawkins astutely enunciated the mathematical principle which falsifies his 2006 argument (p 121, The God Delusion). That argument was based on discrete mathematics, by which he falsely identified the role of gradualism in Darwinian evolution as increasing the probability of evolutionary success.

On page 121 of The God Delusion, published in 2006, Dawkins had made the false, indeed nonsensical, claim “. . . that natural selection breaks the problem of improbability up into small pieces. Each of the small pieces is slightly improbable, but not prohibitively so.” However, on the very next page (122 of The God Delusion), Dawkins correctly identified the role of gradualism in Darwinian evolution as increasing the efficiency of mutation. He referred to mutational efficiency in its temporal manifestation with the words, “in no time”.

In his 1991 lecture, “Climbing Mount Improbable”, Dawkins had presented a numerical example, by which he lucidly demonstrated that the role of gradualism is that of increasing mutational efficiency, while having no effect on the probability of evolutionary success. His numerical example was a set of 3 mutation sites of 6 mutations each versus a series of three discrete substages, with mutation in the substages sequentially restricted to one of the 3 mutation sites. The mutational efficiency factor in favor of gradualism was 12, i.e. 216/18, with no effect upon the probability of evolutionary success, which was 100% (The probability was 100% because mutation was non-random, rather than random, in the particular demonstration).
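
For anyone who wishes to reproduce that factor of 12, here is a minimal Python sketch. It assumes, as the dial-lock analogy suggests, that the relevant measure is the number of dial combinations that must be generated under each scheme; the counts themselves (216, 18, and their ratio 12) are those given above.

    DIALS = 3          # mutation sites (dials on the lock)
    POSITIONS = 6      # mutations per site (positions per dial)

    single_stage = POSITIONS ** DIALS   # all 3 dials must come right at once: 6**3 = 216
    gradual = DIALS * POSITIONS         # one dial per substage, fixed by selection: 3*6 = 18

    print(single_stage, gradual, single_stage // gradual)   # 216 18 12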

It had commonly been taken for granted that the role of gradualism in Darwinian evolution was to increase the probability of evolutionary success. No one, other than Richard Dawkins, has ever demonstrated the true role of gradualism in Darwinian evolution, which is to increase mutational efficiency, while having no effect upon the probability of evolutionary success. I thank Richard Dawkins for my having had the opportunity to learn the true role of gradualism in Darwinian evolution directly from him via public print and video.

This essay identifies Bayes’ theorem, illustrates it with pinocle, and critiques Raphael Lataster’s published understanding of it.

The subject of Bayes’ theorem is a set which contains two subsets that partially overlap one another. Given the ratio of each subset to the set and the ratio of the overlap to one of the subsets, Bayes’ theorem is the equation for the ratio of the overlap to the other subset.

A set of Playing Cards would be a Bayesian Set in light of the following information. The set contains two subsets, namely a subset of Clubs and a subset of Kings, which partially overlap one another. The Overlap is the subset of cards that are Clubs as well as Kings.

Given, the ratio of the subset of Clubs to the set of Playing Cards is 1/4
Given, the ratio of the subset of Kings to the set of Playing Cards is 1/6
Given, the ratio of the Overlap to the subset of Clubs is 1/6
Bayes’ theorem is:

Overlap/Kings = (Overlap/Clubs) × (Clubs/PlayingCards) × [1/(Kings/PlayingCards)]     Eq. 1

Overlap/Kings = (1/6) × (1/4) × [1/(1/6)] = 1/4                                                                  Eq. 2A

Notice that the actual size of the entire set or of any subset remains unknown within the context of Bayes’ theorem. Only fractions, i.e. ratios of subsets to sets, are given and calculated. You may have noted that the ratios given and calculated do not conform to a deck of standard playing cards. They do conform to a standard pinocle deck.
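
As a quick check on the three givens and on Eq. 2A, the following minimal Python sketch tallies a 48-card pinocle deck (two copies of 9, 10, J, Q, K and A in each of the four suits, a composition assumed here for completeness) and assembles Bayes’ theorem from the given ratios.

    from fractions import Fraction

    # Assumed pinocle deck: two copies of each rank in each suit, 48 cards in all.
    ranks = ['9', '10', 'J', 'Q', 'K', 'A']
    suits = ['Clubs', 'Diamonds', 'Hearts', 'Spades']
    deck = [(rank, suit) for rank in ranks for suit in suits for _ in range(2)]

    clubs = [card for card in deck if card[1] == 'Clubs']
    kings = [card for card in deck if card[0] == 'K']
    overlap = [card for card in deck if card[0] == 'K' and card[1] == 'Clubs']

    print(Fraction(len(clubs), len(deck)))     # 1/4  -> Clubs/PlayingCards
    print(Fraction(len(kings), len(deck)))     # 1/6  -> Kings/PlayingCards
    print(Fraction(len(overlap), len(clubs)))  # 1/6  -> Overlap/Clubs

    # Bayes' theorem, Eq. 1, assembled from the three given ratios:
    result = (Fraction(len(overlap), len(clubs))
              * Fraction(len(clubs), len(deck))
              / Fraction(len(kings), len(deck)))
    print(result)                              # 1/4  -> Overlap/Kings, matching Eq. 2A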

It may be said that, having been given the fraction of generic cards that are Clubs, Bayes’ theorem enables us to calculate the fraction specifically of Kings that are Clubs. However, each of these two fractions is the ratio of a subset to a set, which is the definition of probability. Consequently, in popular terms it may be said that, given the generic probability that ‘this’ playing card is a Club, Bayes’ theorem enables the calculation of the probability that ‘this’ playing card is a Club, given the additional specification that ‘this’ playing card is a King.

A Claim of Bayesian Reasoning

Raphael Lataster (p 286) has identified as Bayesian reasoning a rationale which has nothing to do with Bayes’ theorem and which is, as will be shown below, mathematically illogical.

He notes that the ratio of (the number of human deaths by angels, not reported in Acts) to (the number of all human deaths not reported in Acts) is virtually zero.

Lataster notes that consequently, the ratio of (the number of deaths by angels reported in Acts) to (the number of all human deaths not reported in Acts) must also be virtually zero. So far, so good. However, Lataster then concludes that the report of a death by an angel in Acts is false because, by Bayesian reasoning, such deaths are virtually zero. This falsity, known by Bayesian reasoning, is further confirmed by the fact that angels are mythical.

Lataster’s conclusion cannot proceed from Bayes’ theorem or Bayesian reasoning. What Bayes’ theorem would calculate is simply deaths by angels reported in Acts, as a fraction of the total deaths reported in Acts, without regard to the veracity of those reports. Indeed, this ratio is not necessarily even close to zero, based on the ratios which Lataster identifies as virtually zero.

Lataster’s Argument in Terms of Playing Cards

Given a Set of Playing Cards, but not the numerical ratios above, Lataster’s argument is: Given that the ratio of (Clubs,NotKings)/(Cards,NotKings) is virtually zero, we can surmise that the ratio of (Clubs that are Kings)/(Cards,NotKings) is also virtually zero, because (Clubs that are Kings) is a smaller set than (Clubs,NotKings), where the smaller set is the ‘Overlap’. However, Lataster does not stop there. He further concludes that it would be false to claim that the ‘Overlap’ contains even one card, because that set is virtually zero. Lataster’s argument takes what is true of a ratio and deduces that it must be true of the numerator alone. That is illogical.

Lataster’s ‘Added Given’ to a Deck of Cards

Lataster has added another given to the standard givens of Bayes’ theorem. It is that a specific ratio is virtually zero. To attain a set which conforms to Lataster’s ‘Added Given’, all we need do is add a trillion, trillion cards NotClubsNotKings to a pinocle deck. In the deck so augmented, the number of Clubs,NotKings is 10 and the number of cards NotClubs,NotKings is 30 plus one trillion, trillion, i.e. 30 + 10^24. The ratio (Clubs,NotKings)/(Cards,NotKings) is then 10/(40 + 10^24), which is virtually zero, in conformity with Lataster’s added given. Similarly, the ratio of (Clubs that are Kings)/(Cards,NotKings) is also virtually zero. Does this mean that the Overlap is virtually zero, which is analogous to Lataster’s claim (p 286)? The Overlap is the set of Cards that are both Clubs and Kings.

Adding one trillion, trillion cards NotClubsNotKings to a pinocle deck, we have as given: The ratio of the subset of Clubs to the set of Playing Cards is 12/(48 + 10^24); The ratio of the subset of Kings to the set of Playing Cards is 8/(48 + 10^24); and The ratio of the Overlap to the subset of Clubs is 1/6. Bayes’ theorem is:

Overlap/Kings = (Overlap/Clubs) × (Clubs/PlayingCards) × [1/(Kings/PlayingCards)]      Eq. 1

Overlap/Kings = (1/6) × [12/(48 + 10^24)] × [1/{8/(48 + 10^24)}] = 1/4                              Eq. 2B
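
The arithmetic of Eq. 2B can be checked the same way. The short sketch below works from counts alone, representing the trillion, trillion added cards as 10**24 and assuming the same pinocle composition as before (12 Clubs, 8 Kings, 2 Kings of Clubs per deck).

    from fractions import Fraction

    JUNK = 10**24                      # added cards, NotClubs and NotKings
    clubs, kings, overlap = 12, 8, 2   # counts in one pinocle deck (assumed composition)
    total = 48 + JUNK

    # Lataster's 'added given': this ratio is virtually zero.
    print(float(Fraction(clubs - overlap, total - kings)))   # roughly 1e-23

    # Bayes' theorem, Eq. 1, for the enlarged set:
    result = Fraction(overlap, clubs) * Fraction(clubs, total) / Fraction(kings, total)
    print(result)                                            # 1/4, matching Eq. 2B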

Assessment of the Latasterian Argument

Lataster’s argument (p 286) contradicts the premise of Bayes’ theorem that the set is divided into two subsets which are independent of one another. This is illustrated above by two numerical examples. In both examples, a deck of playing cards is divided into a subset of Clubs and a subset of Kings, which subsets partially overlap. The first example is a pinocle deck. In the second example, a trillion, trillion cards are added to the NotClubsNotKings subset, in accord with Lataster’s added given. The added given has no effect upon Bayes’ theorem, because in a set to which Bayes’ theorem is applicable the division is into subsets which are independent of one another. Lataster’s argument (p 286) is thus erroneous in principle: it contradicts the principle of Bayes’ theorem that the subsets defined are independent of one another.

Lataster’s argument has also been numerically illustrated to be erroneous in this essay: (1) using the subsets of cards of a pinocle deck (by Bayes’ theorem, Eq. 2A = 1/4) and (2) using a pinocle deck to which one trillion, trillion cards NotClubsNotKings have been added. This modification satisfies Lataster’s ‘added given’, which identifies a ratio as ‘virtually zero’. Nevertheless, by Bayes’ theorem the ratio of (Clubs that are Kings)/(Kings) does not change (Eq. 2B = 1/4).

The Latasterian conclusion, namely, that the subset (Clubs that are Kings) is zero, does not follow. In fact, that subset is not empty; in the pinocle illustration it contains 2 cards. It is the numerator of the ratio of (Clubs that are Kings)/(AllKings), which ratio, by Bayes’ theorem, is 1/4 in both illustrations (Eq. 2A and Eq. 2B).

It is true of the pinocle deck, to which a trillion, trillion NotClubs,NotKings are added, that the ratio of (Clubs,NotKings)/(CardsNotKings) is virtually zero, namely, 10/(40+10^24). It is also true that the ratio of (Clubs that are Kings)/(CardsNotKings), is virtually zero, namely, 2/(40+10^24).

Thus the two premises of Lataster are true of the set to which a trillion, trillion cards NotClubsNotKings are added. It is not that his premises are false. It is that his conclusion does not follow from his premises, and that both of his premises are irrelevant to Bayes’ theorem. The subset, NotClubsNotKings, is not itself a factor in Bayes’ theorem, Eq. 1, so its size, relatively small or relatively large, appears to be irrelevant. Further, the only term in Bayes’ theorem, Eq. 1, affected by the size of the subset, NotClubsNotKings, is the total set of Playing Cards. However, the set Playing Cards appears in both the numerator and the denominator of Eq. 1 and cancels out, rendering Overlap/Kings independent of the size of Playing Cards and of the size of NotClubsNotKings (Eq. 1).

If to one trillion, trillion NotClubsNotKings, we add one pinocle deck, the subset of (Clubs that are Kings) is actually 2, while the Bayes’ theorem ratio of Overlap/Kings is 1/4. If we were to add three pinocle decks to the one trillion, trillion NotClubsNotKings, the subset of (Clubs that are Kings) would be 6, while the Bayes’ theorem ratio of Overlap/Kings would still be 1/4. This illustrates the independence of the subset of Kings and the subset of NotKings. This independence is true of Bayesian sets. It is this independence that is denied by Lataster’s argument (p 286). He assumes that a characteristic within the subset, NotKings, must be characteristic of the subset, Kings.
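
The independence just described can be illustrated with a small sketch (same assumed pinocle counts as above): whether one or three pinocle decks are mixed with the trillion, trillion NotClubsNotKings cards, the Overlap grows from 2 to 6 while the Bayes’ theorem ratio Overlap/Kings stays at 1/4.

    from fractions import Fraction

    def overlap_over_kings(decks, junk):
        clubs, kings, overlap = 12 * decks, 8 * decks, 2 * decks  # assumed per-deck counts
        total = 48 * decks + junk
        # Bayes' theorem, Eq. 1
        return Fraction(overlap, clubs) * Fraction(clubs, total) / Fraction(kings, total)

    for decks in (1, 3):
        print(decks, 2 * decks, overlap_over_kings(decks, 10**24))  # overlap: 2 then 6; ratio: 1/4 both times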

Conclusion

Latasterian reasoning is illogical. The fact that a ratio is virtually zero does not mean that the numerator of the ratio is actually zero, which is what Lataster (p 286) illogically concludes. In the illustration of this essay, the ratio of (Clubs that are Kings)/(CardsNotKings) is virtually zero, namely, 2/(40 + 10^24). It would be illogical to conclude that the numerator must be actually zero. In the illustration the numerator is 2.

The subjects of Latasterian reasoning are two ratios, neither of which is any of the four ratios of Bayes’ theorem, Eq. 1. Thus, Latasterian reasoning is not Bayesian reasoning. It is irrelevant to Bayesian reasoning. In the illustration, the two ratios which Lataster subjects to his mode of reasoning are: (ClubsNotKings)/(CardsNotKings) and (Clubs that are Kings)/(CardsNotKings). Neither of these is any of the four ratios of Bayes’ theorem, Eq. 1. These two Latasterian ratios are irrelevant to Bayes’ theorem and Bayesian reasoning.

Each of the four ratios of Bayes’ theorem is a probability, i.e. it is the ratio of a subset to a set (Eq. 1). In contrast, only one of the two ratios of Latasterian reasoning is a probability. Of the two ratios subjected to Latasterian reasoning, only (ClubsNotKings)/(CardsNotKings) is a probability, i.e. the ratio of a subset to a set. The other ratio is (Clubs that are Kings)/(CardsNotKings). Its numerator is not a subset of its denominator within the context of the entire set of cards. Consequently, that ratio is not a probability.

It would be logical to identify angels and their activity as mythological, i.e. not true, and conclude thereby that a report of a death by an angel in Acts is false. In contrast, it is not logically possible to reach that conclusion based on Bayes’ theorem or Bayesian reasoning, where the entire set is human deaths, one subset is deaths by angels, and the other subset is deaths reported in Acts. The Overlap is deaths by angels, reported in Acts.

As a religious studies professional, Raphael Lataster is keenly interested in the rationale of probabilities (minute 30:45). His interest should lead eventually to his understanding of Bayes’ theorem, which is an equation of four ratios, each of which is a probability. A good starting point toward his, or anyone’s, understanding of Bayes’ theorem would be to compare it to Pythagoras’ theorem, with which we are all familiar. I wish him well in his pursuit of understanding Bayes’ theorem.

The logic of the syllogism and of Bayes’ theorem can be seen in light of a house and a condo, respectively.

The Logic of the Syllogism and a House for Sale

A house for sale would include the kitchen, while the kitchen would include each of its permanent elements, such as the cabinets, the flooring, and the sink. This is analogous to the syllogism, in which an element (the sink), as part of the content of a subset (the kitchen), is thereby an element of the set (the house).

The Logic of Bayes’ Theorem and a Condo Unit for Sale

To effect an analogy between Bayes’ theorem and a condo unit for sale, let us identify the condo unit for sale as one of the two condo units comprising area 3 of a condominium building, where each of the two units is defined as including the utility room of area 3, which is common to both units.

The real estate agent, a devotee of Bayes’ theorem, tells a prospective customer:

  • The condo unit for sale, 3A, is 15% of the entire condominium building’s floor space.
  • The other unit of area 3, 3B, is 20% of the entire condominium building’s floor space.
  • The common utility room of area 3 is 25% of unit 3B.
  • By Bayes’ theorem, the common utility room is 1/3 of unit 3A, which is for sale.
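
Treating the agent’s percentages as exact (an assumption, since the dialogue gives them only conversationally), the last bullet can be checked directly: the utility room is 25% of 3B, i.e. 0.25 × 0.20 = 0.05 of the building, and Bayes’ theorem then gives 0.05 / 0.15 = 1/3 of unit 3A. The same figures yield the probabilities of 1/10 and 2/3 quoted later in the dialogue, as the short Python sketch below shows.

    from fractions import Fraction

    unit_3a = Fraction(15, 100)            # unit 3A as a fraction of the building
    unit_3b = Fraction(20, 100)            # unit 3B as a fraction of the building
    utility_given_3b = Fraction(25, 100)   # utility room as a fraction of unit 3B

    utility = utility_given_3b * unit_3b   # 1/20 of the building (the 'Overlap')
    print(utility / unit_3a)               # 1/3  -> utility room as a fraction of unit 3A
    private_3a = unit_3a - utility
    print(private_3a)                      # 1/10 -> private quarters of 3A as a fraction of the building
    print(private_3a / unit_3a)            # 2/3  -> chance that a person in 3A is in its private quarters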

 

The customer says, “That’s fine, but what is the square footage of the private rooms of unit 3A?” The agent replies, “It’s twice the square footage of the common utility room.” The customer notes, “That’s no help. If you don’t know the square footage of area 3, do you know the entire square footage of the building?” The agent gives up on this customer, saying, “Picky, picky, picky. You’re one of those who think that truth is absolute, when in fact human knowledge is relative. You probably think of logic as embodied in the syllogism, when the fullness of logic is relative and is truly expressed by Bayes’ theorem. Logic consists in weighing probabilities, the ratios of subsets to sets.”

The agent then adopts a more conciliatory attitude, “You see, much of human knowledge is that which we use for decision making, which is often the weighing of probabilities afforded us by Bayes’ theorem. I know that for a person in the condo building, the likelihood that the person is in the private quarters of unit 3A is a probability of 1/10. This knowledge is implied by Bayes’ theorem. I also know that a person in condo unit 3A is likely to be in the private quarters thereof at a probability of 2/3. I could then weigh these probabilities in deciding whether unit 3A would suit my needs, if I bought it.” The prospective customer replies, “I understand, but such relative knowledge is insufficient for me. I would not know the size of the private area of 3A, e.g. it could be 200 sq. ft. or 2000 sq. ft. The relative knowledge that the private quarters are 2/3 of the whole unit is insufficient for me.”