The purpose of this paper is to identify an oversight and accompanying errors in the logic of the Bell theorem. Violation of Bell inequalities by experimental data results from misunderstanding the nature and processing of the data to be used. While quantum mechanics as a whole is not understood, and therefore admits various interpretations, the Bell inequality rests on mathematical logic alone. This has been unrecognized due to Bell’s derivation, but is not a matter of interpretation once pointed out.
The Bell inequality was originally derived as part of a theorem in statistics (see Appendix). However, the same inequality is derivable as a purely algebraic result that must be identically satisfied by three (or four) mutually cross-correlated data sets consisting of ±1’s, regardless of whether they are random or deterministic. The Bell inequality is thus independent of the statistical assumptions that Bell used and that have been assumed to be necessary to its derivation. Violation of the Bell inequality results from its misuse based on ignorance of its purely mathematical basis and, most surprisingly, from ignoring the established quantum mechanical principle of non-commutation.
When the inequality is applied to random processes, its basis in simultaneous cross-correlations limits the two-variable correlation functions that may occur in each triplet-of-variables or quadruplet-of-variables realization. From a mathematical perspective, there is no reason why the correlation functions among the different variable pairs need all be the same, and it will be found below that there is a simple reason why some should be qualitatively different.
The characteristics of a simple stochastic process known as wide-sense stationary (WSS) have been mistakenly assumed, without awareness even of the name, to characterize three measurements on two entangled particles involving non-commuting observables. That this error has not been recognized is possibly due to the stated belief that non-commutation is a purely quantum effect that should not be considered in the context of possible hidden variables in quantum mechanics. In fact, however, many classical processes are non-commutative, and a major example of quantum non-commutation, the Pauli spin matrices, originated in a classical representation of three-dimensional rotations by two-dimensional matrices. This, and the non-commutation of sequences of classical light polarization measurements, indicate that non-commutation is a fact that cannot be neglected in either classical or quantum physics. Indeed, while non-commutation is common in the classical world, the author is not aware of any books that treat it in the context of random variables until the recent monograph by Khrennikov. Finally, although encountered frequently in everyday life, non-commutation has no commonly recognized English-language term.
Bell based his theorem on the use of predicted quantum measurements (counterfactuals), which some believe to be inapplicable to quantum mechanics. However, quantum mechanics may be used to provide probabilistic predictions for physical processes. Measurements may then be carried out to confirm predicted correlations. As a result, the Bell inequalities may be applied to quantum counterfactuals that are subsequently measured, as will be illustrated below.
A common explanation for the violation of the Bell inequality is that due to non-locality, more than three variables (or four) are actually interacting to produce the Bell cosine correlation. Then, the inequality is judged to be intrinsically inapplicable to the real physical situation. However, due to the extreme generality of the Bell inequality as derived below, simple procedures ensure the number of data sets necessary for inequality satisfaction, although alteration of the form of correlation functions may occur.
Finally, it is quite common to consider a probability version of the Bell inequality that some seem to assume is logically independent of the correlation form that Bell derived. However, since correlations may be expressed in terms of the probabilities that yield them, it is not surprising to find that the probability form follows from the Bell inequality in correlations. Thus the probability or Wigner form of the Bell inequality is not logically independent of the correlational form and is satisfied by properly constructed quantum probabilities.
An earlier version of this paper was posted in the quant-ph archives.
2. Bell Inequality for Data Sets
The Bell inequality for data sets will now be derived, since it is of critical importance to understanding the Bell theorem. Data sets, as defined herein, exist if they can be written down. They may result from experimental observations, from theoretical predictions of experimental observations (counterfactuals), or from a combination of the two. The data may also be either random or deterministic. In constructing his theorem and inequality, Bell assumed three measurement readouts from two entangled particles in a Bell-correlation measurement apparatus (Figure 1). There appear to be only two ways to obtain such data. One conceptually simple way is to add detectors (Figure 2) beyond the two needed to observe the correlated count pairs produced by the source. Such an arrangement immediately implies different correlations among different variable pairs.
Bell specifically rejected this approach. He employed a second method using quantum probabilities to predict the result of an additional measurement at an alternative detector setting for a previously measured particle. However, only one measurement on each particle is commutative. In quantum mechanics, if two operators corresponding to measurement operations commute, they have a set of eigenstates or eigenfunctions in common. If they do not commute, their eigenfunctions are from different function sets, with members of either set expressible as a linear combination of those of the other (see basic quantum mechanics texts, e.g., Mandl). Thus, the probabilities for predicted correlations resulting from non-commuting alternative observables at different instrument settings should be different, and they are, as will be shown below (an experimental test of this has been suggested elsewhere).
The Bell inequality will now be shown to hold under far more general conditions than is apparent from Bell’s derivation as part of a statistics theorem.
Figure 1. Schematic of Bell experiment in which a source sends two particles to two detectors having angular settings a and b and/or counterfactual settings a' and b'. While one measurement operation on the A-side, e.g. at setting a, commutes with one on the B-side at b, any additional measurements at either a' or b' are non-commutative with prior measurements at a and b, respectively.
Figure 2. Schematic of a Multiple Stern-Gerlach apparatus. Arrows a, a', b, b' indicate the magnetic field directions encountered by pairs of particles emitted in opposite directions by the source. At each encounter with a magnetic field, the particle is deflected in one of two directions depending on whether its spin is +1/2 or −1/2. Each sequence of ±1’s corresponds to a unique output position so that knowledge of two spin measurements is yielded retrospectively for each particle.
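Assume that three data sets, random or deterministic, labeled a, b, and c have been obtained. The data set items are denoted by $a_i$, $b_i$, and $c_i$, with N items in each set. Each datum equals ±1. One may form the equation

$$a_i b_i - a_i c_i = a_i b_i \,(1 - b_i c_i), \tag{2.1}$$

which holds because $b_i^2 = 1$, and sum this equation over the N data triplets from the data sets. After dividing by N, one obtains

$$\frac{1}{N}\sum_{i=1}^{N} a_i b_i - \frac{1}{N}\sum_{i=1}^{N} a_i c_i = \frac{1}{N}\sum_{i=1}^{N} a_i b_i \,(1 - b_i c_i). \tag{2.2}$$

Taking absolute values of both sides,

$$\left|\frac{1}{N}\sum_{i=1}^{N} a_i b_i - \frac{1}{N}\sum_{i=1}^{N} a_i c_i\right| \le \frac{1}{N}\sum_{i=1}^{N} |a_i b_i| \,(1 - b_i c_i), \tag{2.3}$$

or, since $|a_i b_i| = 1$,

$$\left|\frac{1}{N}\sum_{i=1}^{N} a_i b_i - \frac{1}{N}\sum_{i=1}^{N} a_i c_i\right| \le 1 - \frac{1}{N}\sum_{i=1}^{N} b_i c_i. \tag{2.4}$$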
Inequality (2.4) has the same form as the three variable inequality derived by Bell for correlations but is expressed in a form directly applicable to laboratory data. The algebraic steps used in its construction are the same as those applied by Bell to previously averaged correlation functions (see Appendix).
The author unexpectedly discovered this result some time ago by asking the following question: If one performs a laboratory experiment for which the number of data items 3N is necessarily finite, to what extent do random fluctuations of the correlation estimates result in violation of the Bell inequality (2.4)? Surprisingly, the answer, as shown above, is that the inequality is precisely satisfied. No assumption has been made other than that the data can be written down. Further, as long as the data can be tagged with labels a, b, and c, the inequality is satisfied even if nonlocal pickup exists between detectors. In that event, the form of the correlations would be affected but not whether the inequality is satisfied. Indeed, in an extreme case, the data-averaged correlations might not converge to identifiable limits, but the Bell inequality would still be satisfied for any three identified data sets. No experimental loophole in this conclusion is apparent. The inequality still follows if some of the ±1’s are replaced by zeros.
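The algebraic character of inequality (2.4) can be illustrated numerically. The following is a minimal sketch, not part of the original derivation: the function names `correlation` and `bell_three_sides`, the seed, and the use of uniformly random ±1 data are assumptions chosen only for illustration. Any other way of filling the three lists with ±1's should give the same outcome.

```python
import random

def correlation(x, y):
    """Data correlation estimate: average of element-wise products."""
    return sum(xi * yi for xi, yi in zip(x, y)) / len(x)

def bell_three_sides(a, b, c):
    """Return (lhs, rhs) of the data inequality |<ab> - <ac>| <= 1 - <bc>."""
    lhs = abs(correlation(a, b) - correlation(a, c))
    rhs = 1 - correlation(b, c)
    return lhs, rhs

rng = random.Random(0)  # hypothetical seed, for repeatability only
N = 1000
worst_margin = float("inf")
for _ in range(200):
    # Arbitrary data triplets: no physical model is assumed here.
    a = [rng.choice((-1, 1)) for _ in range(N)]
    b = [rng.choice((-1, 1)) for _ in range(N)]
    c = [rng.choice((-1, 1)) for _ in range(N)]
    lhs, rhs = bell_three_sides(a, b, c)
    worst_margin = min(worst_margin, rhs - lhs)

# A small tolerance absorbs floating-point rounding in the averages.
print(worst_margin >= -1e-12)
```

The margin `rhs - lhs` never becomes negative beyond rounding error, because the three correlation estimates are computed from the same N triplets.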
If, however, the data derive from a random process, and the correlation estimates in inequality (2.4) converge to probabilistically computable correlations as $N \to \infty$, the resulting correlations, designated by $C(a,b)$, $C(a,c)$, and $C(b,c)$, must then satisfy an inequality of the same form as inequality (2.4):

$$|C(a,b) - C(a,c)| \le 1 - C(b,c), \tag{2.5a}$$

and the limit is statistical.
This is essentially the inequality derived by Bell using a stochastic process model (see Appendix) in which detectors at the same settings on opposite sides of a source of entangled particle pairs produce results of opposite sign (Figure 1). In Bell’s stochastic process model, left-hand-side detectors of Figure 1 were designated $A(a,\lambda)$ and right-hand-side detectors $B(a,\lambda) = -A(a,\lambda)$ so as to automatically agree with results of entanglement at equal settings. In deriving the right-hand-side correlation, Bell represented the product of both outputs using one stochastic-model function, $A$. However, the final correlation on the right-hand side is preceded by a plus sign that arises due to measurements being taken on opposite sides of the apparatus.
In some discussions below, measurement settings will be labeled to agree with the side on which the measurement occurs. That usage is consistent with Bell’s stated interpretation of the variables in the three variable inequality (Chapter 8 of Bell’s book) as predicted results: a measurement at setting a, followed by two alternative predictions of measurement results at b and b'. This will be treated in detail in Sec. 4.
In the optical case two polarizations occur. Counts of one polarization are labeled +1 and those of the orthogonal polarization are labeled −1. Using b' for the alternative measurement instead of c on the right-hand side, the inequality is written

$$|C_{ab}(a,b) - C_{ab'}(a,b')| \le 1 - C_{bb'}(b,b'). \tag{2.5b}$$
Now, all a-measurements occur on one side of an apparatus such as shown in Figure 1 and b-measurements on the other.
The subscripts on the correlations in inequality (2.5b) indicate that they do not all necessarily have the same functional form, as follows from the lack of conditions used in deriving inequality (2.4). This is directly relevant to the quantum mechanics (QM) case to which inequality (2.5b) is applied, and for which the correlations differ as a result of non-commutation of measurements.
It is obvious that the constant equal to 1 occurring in inequality (2.5b) results from the fact that the value that multiplies also multiplies , data triplet by data triplet. However, if data are obtained from an independent run for each correlated measurement pair as is common in practice, then six data sets instead of three are used, the condition under which the inequality was derived does not hold, and the inequality will in general be violated. Strangely, the use of independent runs has become accepted experimental practice. It is critically important to understand that while inequality (2.4) holds generally for any three arbitrary data sets, the Bell inequalities (2.5a, b) do not hold for arbitrary correlations. Since correlations must result from the convergence of the correlation estimates that satisfy inequality (2.4), they must satisfy inequalities (2.5a, b) and their functional forms are mutually constrained thereby. Arbitrary correlations not derivable as limiting forms of correlations of three concurrently existing data sets will not necessarily satisfy inequalities (2.5a, b). If inequalities (2.5a, b) are violated by assumed limiting forms, no data sets of triplets can exist that produce them.
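The difference between three cross-correlated data sets and six data sets from independent runs can be made concrete with a small deterministic example. This is an illustrative sketch only: the particular data values, the run labels, and the helper `correlation` are assumptions and do not represent any experiment discussed in the text.

```python
def correlation(x, y):
    """Data correlation estimate: average of element-wise products."""
    return sum(xi * yi for xi, yi in zip(x, y)) / len(x)

N = 8
# Six data sets: each run records one variable pair only (hypothetical data).
a1, b1 = [1] * N, [1] * N     # run 1: <ab> = +1
a2, c2 = [1] * N, [-1] * N    # run 2: <ac> = -1
b3, c3 = [1] * N, [1] * N     # run 3: <bc> = +1

lhs_six = abs(correlation(a1, b1) - correlation(a2, c2))
rhs_six = 1 - correlation(b3, c3)
print(lhs_six, rhs_six)       # 2.0 0.0 -> the inequality form is violated

# Three data sets: the same a, b, c values now form triplets, so <bc>
# is fixed by the data and cannot be chosen independently.
a, b, c = a1, b1, c2
lhs_three = abs(correlation(a, b) - correlation(a, c))
rhs_three = 1 - correlation(b, c)
print(lhs_three, rhs_three)   # 2.0 2.0 -> satisfied (with equality)
```

In the six-set case the third correlation is free to take a value that no triplet data could produce, which is why the constant 1 no longer bounds the combination.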
Similar assumptions to those used to derive inequality (2.5a) from inequality (2.4) can also be used to derive a four variable Bell inequality. Assuming that there exist four data sets of size N with members with each datum equal to ±1, then for each group of four data items from the four data sets, one has (by inspection)
(Inequality (2.6) also holds if zeros occur among the variables.) Summing over i from 1 to N in inequality (2.6), and dividing by N leads to
Since all experimental data sets are intrinsically finite, four data sets must satisfy inequality (2.7a), as three must satisfy inequality (2.4). Again assuming statistical convergence to limits as $N \to \infty$, a common form of Bell inequality used by experimentalists results:
As in the case of inequality (2.5b), the correlations may have different functional forms.
The difficulty of applying the three variable inequality (2.4) to an entangled pair of particles in which more than two measurements are non-commutative is amplified in the case of a four variable inequality. Note, as in the previous case, that while inequality (2.7a) must be identically satisfied by any four data sets that may be written down, inequality (2.7b) may be violated by assumed correlations. However, as in the three variables case, if it is violated it follows that no four data sets exist whose cross-correlations result in the assumed correlation functions.
Note that the notion of experimentally “testing” the above inequalities, in either three or four variables, involves a logical mischaracterization. Only the form of the several correlations that describe data from a given physical experiment may be tested, and not whether or not cross-correlations of the data satisfy the Bell inequalities. (In the laboratory counts are observed, and correlations computed from them.) Further, if variables are obtained from random realizations that yield measurements on only one pair per realization among the four correlated variables, the correlations will in general be different than if all four variable values are obtained per realization. A conceptually simple way to obtain four data outputs per realization is shown in Figure 2.
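The four variable data inequality can be checked by brute force. The sketch below assumes one common sign convention for the four-term combination, $ab + ab' + a'b - a'b'$, and assumes the labels a, a', b, b' for the four data sets; neither the convention nor the function name `four_term` is taken from the text, and other placements of the minus sign behave the same way.

```python
import random
from itertools import product

def four_term(a, a_p, b, b_p):
    """One realization of the assumed four-variable combination."""
    return a * b + a * b_p + a_p * b - a_p * b_p

# Exhaustive check over every quadruplet with values in {-1, 0, +1}.
max_single = max(abs(four_term(*q)) for q in product((-1, 0, 1), repeat=4))

# Averaging over N quadruplets cannot exceed the per-realization bound.
rng = random.Random(0)  # hypothetical seed, for repeatability only
N = 500
data = [tuple(rng.choice((-1, 1)) for _ in range(4)) for _ in range(N)]
average = sum(four_term(*q) for q in data) / N

print(max_single)         # 2
print(abs(average) <= 2)  # True
```

Since each realization is bounded by 2 in magnitude, the average over any number of quadruplets is bounded as well, with or without zeros among the data.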
3. How Independent Data-Pair Correlations May Violate Bell Inequalities
3.1. Bell Operationally Assumed Correlations That Are Wide-Sense Stationary
Inequalities in three and four variables were derived from Bell’s assumption of a stochastic process representation of quantum entanglement. Bell represented detector readouts with a function $A(a,\lambda)$, where a is an instrument setting and $\lambda$ denotes one or more random variables determining the resulting random values taken by $A$. In Bell’s representation, there is no implication that accessing a readout at a affects the probability of accessing a readout at another setting for a given realization. The multiple readouts of function $A$ and their associated probabilities are analogous to a set of commuting observables in QM. However, for the case of non-commuting QM observables that applies here, probabilities of specific readouts at successive instrument settings are conditionally dependent on readouts at prior settings. The conditional probabilities at alternative instrument settings have different values, whereas they would be expected to be the same if commutative states were involved.
Without stating it, Bell effectively and operationally assumed properties for the several quantum mechanical outputs with which he was concerned that correspond to a special, not universal, kind of random process defined as wide-sense stationary (WSS). Such a process is one in which the correlation of the readouts at any two instrument settings a and b is given by a function of the form $C(a-b)$, depending only on the difference of coordinate settings, for all setting pairs. Thus, in Bell’s notation

$$\int A(a,\lambda)\,B(b,\lambda)\,\rho(\lambda)\,d\lambda = C(a-b), \tag{3.1}$$

where $\rho(\lambda)$ is a probability distribution for $\lambda$, and a and b are any detector settings.
Bell used the correlation functional form computed from QM for commuting measurements on a pair of entangled spins, $-\cos(a-b)$, which suggests a WSS process. The measurements commute because they are carried out on two different particles and it does not matter which measurement occurs first. Bell computed QM correlations at a setting a for a first detector and two alternative settings b and b' for a second detector (Chap. 8 of Bell’s book). The predicted QM correlations for the setting pairs (a, b) and (a, b') are both given by the negative cosine of detector angular differences (suggesting WSS). However, if the resulting correlation at output settings b and b' is computed from the different QM probabilities that occur for each non-commuting variable and fixed value a (that Bell assumed), the result does not have the negative-cosine form, as will be shown below.
3.2. How Misinterpretation of the Bell Inequality Leads to Its Violation
Inequalities (2.5b) and (2.7b) result from the cross-correlations of three and four data sets, respectively. A triplet or quadruplet of data values must occur in one realization of the associated random process. Whether three or four variables, each equal to ±1, are cross-correlated determines the constant of the related inequality: 1 in the case of three variables, 2 in the case of four variables. However, if correlations are obtained from three or four variable pairs, with each pair acquired in an independent experimental run, the correlations will in general be quantitatively different from those that result from cross-correlated data triplets or quadruplets acquired in one run (as could be accomplished with the setup of Figure 2). Different measurement scenarios for the random variables will in general affect both correlations between variables and their corresponding probabilities.
Given measurement apparatus such as shown in Figure 1, two possible scenarios are identified for acquisition of three correlations. One may measure output pairs in independent runs at settings (a, b), (a, b'), and (b, b') (the third pair of settings on opposite sides of the apparatus), producing separate realizations of each variable pair. In that case, the correlation of each variable pair is given by the same function in the quantum situation under consideration, but the conditions that lead to Bell’s derivation of the inequality, as well as inequality (2.4), are violated.
A second scenario (that specified by Bell) is to predict the three outputs at settings a, b, and b' (with b and b' both on the right-hand side) for each random realization and calculate resulting correlations $C(a,b)$, $C(a,b')$, and $C(b,b')$ from QM probabilities for the variable pairs. Clearly the correlations and probabilities should be different from those obtained in the first scenario. Given that probabilities $P(b|a)$ and $P(b'|a)$ are known from QM for measurements at b and b' given setting a, one can immediately compute $P(b,b')$, allowing the evaluation of $C(b,b')$. Thus, the now connected correlations of outputs at setting pairs (a, b), (a, b'), and (b, b') may be determined. Hess has pointed out that similar facts and inequalities related to those of Bell have been known to mathematicians since Boole. Pure mathematics determines a third correlation when data for two correlations out of the three are specified.
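The closing remark, that data for two correlations fix the third, follows from the fact that each ±1 datum squares to 1. A brief sketch with hypothetical random data (the variable names and seed are illustrative assumptions):

```python
import random

rng = random.Random(1)  # hypothetical seed, for repeatability only
N = 20
a = [rng.choice((-1, 1)) for _ in range(N)]
b = [rng.choice((-1, 1)) for _ in range(N)]
c = [rng.choice((-1, 1)) for _ in range(N)]

# Keep only the pair products, as if just two correlations were recorded.
ab = [x * y for x, y in zip(a, b)]
ac = [x * z for x, z in zip(a, c)]

# Because a_i * a_i = 1, the third product follows with no further data.
bc_from_pairs = [p * q for p, q in zip(ab, ac)]
bc_direct = [y * z for y, z in zip(b, c)]

print(bc_from_pairs == bc_direct)  # True
```

The item-by-item products for the pair (b, c) are therefore not free once the products for (a, b) and (a, c) are given for the same realizations.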
3.3. WSS Correlations Satisfy the Bell Inequality But They Are Not Co-Sinusoidal
It is instructive to consider inequalities (2.7a, b) for a finite value of N in the special case of a WSS process and four data sets. The WSS properties have been assumed to represent entanglement by experimentalists and theoreticians alike, after Bell’s mistaken assumption of their universal applicability. One may write inequality (2.7a) in the form
where the functions are assumed to represent the limiting forms for the correlation estimates as $N \to \infty$. Since the inequality cannot be violated by data sets that are jointly present and cross-correlated, the fluctuation terms represent random differences from the probability-averaged correlation that lead to inequality satisfaction when the four variables’ values are present in each realization of the experiment.
By contrast if the data are taken in four independent runs using the same instrument settings, inequality (3.2) for the same WSS process becomes
where the subscripts 1 … 4 indicate the experimental run number used to compute the correlation statistical fluctuation, and the question marks indicate possible violation of the ±2 limits since the data are no longer cross-correlated.
Note: the cross-correlation of the data sets used in inequality (2.7a) is what causes that inequality to be identically satisfied and have a limiting magnitude of 2. If eight data sets and not four are used, the fluctuation terms plus the limiting correlations need no longer satisfy the inequality, even though the limiting correlations are given by the same function for the WSS process assumed. Given that the estimates statistically converge, the fluctuation terms are expected to become small as N becomes larger. Thus, although the inequality (3.3) would be violated, it would be violated by smaller and smaller values as N increases.
3.4. Quantum Mechanical Bell Correlations Cannot Represent a WSS Process
Bell effectively assumed that the random process applicable to a triplet of polarization or spin measurement correlations is WSS, as is also widely done in the four variable case of inequality (2.7b) by those interpreting experimental data in a way that violates inequality (2.7b). When the mathematical facts leading to inequalities (2.5) or (2.7b) are considered, however, it becomes clear that the measurement results in QM experiments do not represent a WSS process. If they did, violation of the corresponding Bell inequality would be expected to be small, i.e., of the order of four standard deviations, rather than 102 standard deviations as has been reported. Such inequality violations represent proof that the correlations of the process under consideration cannot all be co-sinusoidal and indeed they are not, as will be shown below. In the case of QM entanglement, only the measurements that commute between two particles are of this form.
4. The Wigner Inequality Results from a Bell Inequality If Probabilities Are Symmetric
A probability inequality known as the Wigner inequality is intimately related to the Bell inequality constraint on correlations. It relates the probabilities for pairs of +1 outcomes corresponding to Bell correlations at given instrument settings. The result is
where the first letter in each probability indicates the angular setting for the random variable on the left side of the Bell apparatus in Figure 1, and the second letter indicates the angular setting for the random variable on the right side. The subscripts + and − indicate whether the variables at settings a and b have values of +1 or −1. Inequality (4.1) follows from the Bell inequality for correlations $C$,

$$|C(a,b) - C(a,c)| \le 1 - C(b,c). \tag{4.2a}$$

If setting b of the right-hand-side correlation specifies instead the a-side setting of a Bell apparatus, the same numerical output occurs with a reversed sign under Bell’s random process model where $B(b,\lambda) = -A(b,\lambda)$. Then inequality (4.2a) becomes

$$|C(a,b) - C(a,c)| \le 1 + C(b,c). \tag{4.2b}$$

A physical process is now considered with probabilities having the same symmetry of occurrence for ±1’s as occurs in QM in the case of two entangled spins. The joint probabilities resulting from an entangled spin state are:
where a contracted notation indicates possible values of random variables b and a at settings and . The probabilities conditional on a are easily obtained since . The joint probabilities (4.3) thus have the following symmetry for ±1 occurrence:
The normalization condition is
so that is given by
after using Equation (4.5a). The results for and are obtained by renaming the variables in Equation (4.5c). The use of correlation (4.5c) in inequality (4.2b) with appropriate variables for different correlations produces:
or the Wigner inequality:
As is well known, inequality (4.6) is violated when the same quantum Bell-state probability is used for all terms.
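The violation just mentioned can be reproduced numerically. The sketch below makes several assumptions that are not quoted from the text: the standard singlet joint probability $P_{++} = \tfrac{1}{2}\sin^2(\theta/2)$ for angle difference $\theta$, the Wigner form $P_{++}(a,b) \le P_{++}(a,c) + P_{++}(c,b)$, and the relation $C = 4P_{++} - 1$ that follows when $P_{++} = P_{--}$, $P_{+-} = P_{-+}$, and the four probabilities sum to 1. The function names and the angles are illustrative choices.

```python
import math

def p_plus_plus(theta_1, theta_2):
    """Assumed singlet probability of a (+1, +1) pair at the two settings."""
    return 0.5 * math.sin((theta_1 - theta_2) / 2) ** 2

def correlation_from_p(p_pp):
    """Correlation implied by the +/- symmetry: C = 4 * P++ - 1."""
    return 4 * p_pp - 1

# Hypothetical settings, 60 degrees apart.
a, c, b = 0.0, math.radians(60), math.radians(120)

lhs = p_plus_plus(a, b)                      # 0.375
rhs = p_plus_plus(a, c) + p_plus_plus(c, b)  # 0.125 + 0.125
print(lhs > rhs)  # True: the same probability form in every term violates it

# Consistency check: the symmetry relation reproduces the negative cosine.
print(abs(correlation_from_p(p_plus_plus(a, b)) + math.cos(a - b)) < 1e-12)
```

The violation arises only because the same functional form is inserted for all three setting pairs, which is the assumption questioned in the text.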
To show that quantum mechanical probabilities are consistent with a probability form of the Bell inequality, it is simpler to use inequality (2.5b) since that form directly represents the physical situation considered by Bell. In Chapter 8 of Bell’s book, Speakable and unspeakable in quantum mechanics, Bell indicates that the result for the variable at b' is a predicted value on the same side of the apparatus as b.
The computation of in terms of (where both b-variables are now on the B-side of the apparatus) follows if the symmetry of Equations (4.4) holds for the probabilities of the observed and predicted variables:
The joint probability computed from probabilities (4.3) is
A similar calculation for yields the same result. Similarly, with result
Since the probabilities leading to have the same symmetries as those for and , and the probabilities for variable pairs are equal, they are used to compute from Equation (4.7a). However, although the specified symmetries are the same, the probabilities are very different from probabilities (4.3) that lead to the Bell correlation for the first measurement pair.
5. Quantum Counterfactual Probabilities and Correlations Satisfy the Bell Inequality
5.1. Satisfying the Bell Inequality
Inequality (4.6) is violated by quantum probabilities upon assuming that all corresponding correlations have the same form as those for the two commuting measurements. This occurs because the correlation on the right-hand-side of inequality (2.5a) is constrained by the left-hand correlations and whose existence requires data that determines the right-hand side. (Note: .) The correlation thereby determined cannot have the same form as the previous correlations if the latter have the Bell cosine form. Repeating inequality (2.5b):
Using the probabilities for and given above in Equations (4.7b) and (4.7c) the resulting correlation may be computed as 
The same result may also be computed by using conditional probabilities for data outputs at b and b' given outputs for a. This result could be tested experimentally, as has been suggested elsewhere, where an analogous result is also given in the four variable case. Using a contracted notation and the Bell cosine correlations, inequality (5.1a) becomes
after use of appropriate trigonometric identities. One may replace the difference of correlations on the left-hand-side of inequality (5.1d) by expressions in probabilities but the same result occurs in inequality (5.1e). Since
the Bell inequality (5.1a) is satisfied.
Thus, when correlations computed from probabilities resulting from QM are used, whether expressed in correlational or probability form, inequality (5.1a) is satisfied as demanded by basic mathematics. Deductions of non-locality or non-reality, if based on Bell inequality violation, no longer follow.
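A numerical check of this conclusion is sketched below under explicitly stated assumptions, none of which are quoted from the equations of the text: singlet conditional probabilities $P(b = a \mid a) = \sin^2(\theta/2)$, equal probabilities 1/2 for $a = \pm 1$, and conditional independence of the outcomes at b and b' given the outcome at a. Under these assumptions the third correlation works out to the product of the two cosines. The function names and grid size are illustrative.

```python
import math

def cond_prob(b_val, a_val, theta):
    """Assumed singlet P(b | a): equal results have probability sin^2(theta/2)."""
    same = math.sin(theta / 2) ** 2
    return same if b_val == a_val else 1 - same

def corr_b_bprime(theta_ab, theta_abp):
    """C(b, b') from conditional probabilities given a, with P(a) = 1/2."""
    total = 0.0
    for a_val in (-1, 1):
        for b_val in (-1, 1):
            for bp_val in (-1, 1):
                p = (0.5 * cond_prob(b_val, a_val, theta_ab)
                     * cond_prob(bp_val, a_val, theta_abp))
                total += b_val * bp_val * p
    return total

steps = 72
worst_margin = float("inf")
for i in range(steps):
    for j in range(steps):
        t1 = 2 * math.pi * i / steps   # angle difference for (a, b)
        t2 = 2 * math.pi * j / steps   # angle difference for (a, b')
        lhs = abs(-math.cos(t1) + math.cos(t2))
        rhs = 1 - corr_b_bprime(t1, t2)
        worst_margin = min(worst_margin, rhs - lhs)

# A small tolerance absorbs rounding at the points where equality holds.
print(worst_margin >= -1e-12)
```

With $x = \cos\theta_{ab}$ and $y = \cos\theta_{ab'}$, the check amounts to $|x - y| \le 1 - xy$, which holds because $(1 - xy)^2 - (x - y)^2 = (1 - x^2)(1 - y^2) \ge 0$.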
5.2. Dealing with Possible Pickup between Detectors
If measurements are made on two particles, one of the measurements occurs first, except in circumstances of infinite time precision. Assuming that A is measured before B or B’ by some time increment, any assumed pickup from other detectors by A is fixed when A is measured. If there is also pickup from detector A to B or A to B’, three data sets are still obtained. Thus the three variable inequality holds even for the corrupted data. Observed correlation functions could then be compared with QM predictions to determine if evidence of pickup in fact exists. The Bell inequality would still be satisfied by the data sets even if the correlation estimates failed to converge to identifiable functions.
The principal claim with which this article is concerned is that correlations of quantum mechanical laboratory data violate the Bell inequality. Since it has been shown that the same inequality holds identically for data sets as a fact of algebra without Bell’s assumptions, the notion that it is testable rests on a mathematical oversight. This has resulted in misuse of an inequality that must be identically satisfied when used correctly. What may be experimentally tested is not whether the Bell inequality is satisfied when correctly used, but the form of the multiple correlation functions realized from qualitatively different measurements. If correctly computed, correlations consistent with quantum mechanics do not all have the cosine form that Bell and others have assumed based on independent count-pair measurements. The Bell inequality constants result from data triplets and quadruplets obtained per realization, and not data pairs.
Understanding of the Bell inequality follows simply in the absence of logical errors. The three and four variable inequalities are identically satisfied by cross-correlations of finite quantum mechanical data sets of ±1’s as a fact of algebra. Their satisfaction by quantum measurements follows from a well-known quantum mechanical fact: performed measurements on spins or photons are non-commutative when more than one per particle of an entangled pair is invoked. When both facts are employed, the Bell inequality is satisfied without mysteries.
The Bell theorem has been interpreted to imply that one cannot construct a local probability model that accounts for quantum correlations without entanglement. If the logic of the theorem is flawed, however, it does not follow thereby that the converse is true. The question that immediately arises is: how much of conventional quantum mechanics is to be accounted for? Since a local algorithmic model for Bell correlations has been presented, the elimination of non-locality would seem to be an attainable goal. This is in agreement with the observation that the physical superposition of four waves that produce entanglement in a down-converter source no longer exists when the waves propagate to spatially separated detectors for count detection. In this case one may derive a local probability model resulting from the boundary conditions that exist at the source. This model appears to be consistent with both quantum electrodynamics and wave optics.
The literature relevant to Bell’s theorem has grown to a large size. A growing number of papers are reported to disagree with the Bell consensus. The author apologizes in advance to those not cited that agree with him, and to those not agreeing as well. However, this already quite long article is concerned with the logical components of this controversial topic, rather than a review of the literature. A more uniform coverage of recent contributions will have to wait until a later date.
The author is indebted to Joe Foreman for many conversations that motivated writing the present paper and to Michael Steiner for careful specification of some of the reasoning that has been applied to Bell inequality violation. Useful conversations with Armen Gulian have also aided the writing of this paper.
It is useful to derive Bell’s version of his inequality by applying an explicit probability average to the left hand side of Equation (2.2):
In (A1), $\rho(\lambda)$ is a probability density for a collection of random variables $\lambda$ that determine the values of the data items $a_i$, $b_i$, and $c_i$. Changing to Bell’s random process notation for which $a_i = A(a,\lambda)$ at setting a, $b_i = A(b,\lambda)$ at setting b, and $c_i = A(c,\lambda)$, each function determined by random variable $\lambda$, Equation (A1) becomes
where the probability averages are independent of subscript i since they are the same for each i. The three variables have random values determined by the collection of random parameters $\lambda$, and define a stochastic process. The right side of (A2) then becomes
Taking absolute values of both sides and bringing the absolute value inside the integral on the right leads to
where indicates the probability average of etc. Relations (A4) end in Bell’s inequality .
Bell’s notation suggests that any number of readouts may be obtained for a given realization of the process. It is consistent with a WSS process as described above. The interpretation used thus specifies a particularly simple random process that is by no means universal. The result (2.4) proves that the same relation holds independently of Bell’s stated assumptions in proving version (A4), and even if the data are deterministic.
 Sica, L. (2019) The Bell and Probability Inequalities Are Not Violated When Non-Commutation Is Applied According to Quantum Principles.