A Study on Computer Consciousness on Intuitive Geometry Based on Mathematics Experiments and Statistical Analysis

Show more

1. Introduction

Intuitive geometric knowledge is an origin of human civilization, just as shown by the Plimpton 322 tablet that people in the Old Babylonian period (between −1900 and −1600) already knew the rule of the right triangle *i.e*., the Pythagorean theorem, through various instances of right triangles, almost one thousand years before proof was given in Greek time. From the analogue view, the machine’s consciousness would be better built starting from recognizing geometric configurations, formating of geometric concepts and discovering geometric properties from observing sufficiently many examples of geometric configuration without human interference, and automated verification (or proof) of the observed geometric theorems. Indeed, machine proof of geometric theorems has been regarded as an essential subject of artificial intelligence research during the inception of artificial intelligence. In the past few decades, researchers have made significant progress in using computers to prove geometric theorems. The research work of computer proof of geometric theorems is mainly developed from the following three directions:

1) Algebraic calculation method based on coordinates;

2) Point elimination method based on geometric invariants;

3) Proving theorems by simulating human thinking the reasoning database search method.

The machine proof of geometric theorems originated in the 1950s. Tarski [1] proposed that most of the decision problems in elementary algebra and elementary geometry can be verified using an algebraic method. Among many implementations, great progress was made by Wu Wen-Tsün in the 1970s. Inspired by ancient Chinese mathematics, Wu proposed the algebraic method of geometric theorem machine proof, called the “Wu’s method” [2] [3] [4]. Its basic idea is to transform a geometric problem into a system of algebraic equations, and then verify (prove or disprove) the geometric theorem by calculating the relationship between the system of algebraic equations. Wu’s method has been successfully used for the mechanized proof of geometric theorems along with the rapid development of computer algebra systems like Reduce, Derive, Mathematica, Maple, and so on. Soon after Wu’s initial work, the Gröbner basis method, which was developed by Buchberger for processing polynomial system in the 1960s, has also been widely used in the field of geometric theorem proving [5] [6]. Both Wu’s method and Gröbner basis method are essentially verifying algebraic identities with some constrained variables and a set of polynomial constrained equations. Starting from the fact that the lower and upper bounds of a polynomial equation can be determined by its coefficients, Hong [7] proposed the “one-example illustration method” that can verify the correctness of a geometric theorem via a single instance of the related geometry statement. Furthermore, based on the following observation: if a multivariate polynomial has a value equal to zero on a sufficiently large grid, then this polynomial is always equal to zero, Zhang *et al.* proposed the “numerical parallel method” [8], which passed a certain scale of examples to verify whether the given polynomial is an identity. As Hong’s method usually involves constructing a very complicated example and computation of too large objects, Hong’s method had never been used in any non-trivial theorem, meanwhile, the parallel numerical method is the first numerical algorithm with practical and feasible significance in the machine proof of geometric theorems as it was easily implemented in portable computer (like CASIO’s PB700 or Sharp PC1500) at the end of 1980s.

When the above-mentioned algebraic methods are used to prove geometric theorems, they usually include large-scale complicated calculations involving polynomials, which geometric meaning generally can’t be understood by human, and for human it is also too difficult to check the correctness of the machine computation by manual method. Therefore, such proofs are called “human non-readable”. Zhang *et al. * [9] proposed to use the area method to prove the geometric theorem and realized the readable proof for the first time. Zhou *et al. * [10] introduced the Pythagorean difference to the proof process of Non-Euclidean geometric theorems. Similar to the area method and the Pythagorean difference method, a generalized vector method was suggested in [11]. These methods are collectively referred to as the “geometric invariant method” [12] [13].

Another category method, the “deductive reasoning method” based on database searching, which simulates the idea of human proof of geometric theorems, namely, using known hypotheses and standard axioms to perform inference searches on geometric propositions, can be traced to 1960. Gelernter *et al. * [14] proposed a method that combined the backstepping method with the depth-first search and implemented a program based on the backstepping method on the computer. Nevins [15] combined the forward and backstepping method to prove the geometric theorem. Zhang *et al.* [16] gave a more effective method based on a geometric deduction database system. Based on the idea of a structured database, the amount of calculation in the inference process was significantly reduced, and it proves that generate geometric propositions are generally readable. It worths indicating that together with dynamic geometry programs (like Geometer’s Sketchpad), the deductive reasoning method has been widely used for developing educational software in China. Nevertheless, there has no report on studies to promote computers to obtain graphical intuitive analysis capabilities for elemental geometry yet.

Considering that the intuitive knowledge of geometry played the essential role in the development of human intelligence—in both meaning of humankind and human individuals, it is natural to expect that computing machines that are able to see or understand certain geometry meaning, like three-point collineance, four points lie on the same circle, or square of one edge equals to the sum of squares of other two edges in certain triangles, would eventually lead to a higher stage of machine intelligence—the ASI (Artificial Super Intelligence), Sun studied recently in his Master thesis [17] the problem to train the intelligent agents such as computers to “observe” a large number of intuitive geometric configurations, to combine the powerful algebraic computing capabilities and data storage capabilities of machine, so to understand and master the intuitive geometrical analysis capabilities of humans in the long-term goal of AI. The work implemented a symbolic computation program with Maple software to mimic dynamic geometry for randomly generating geometric configuration in batch, and designed several statistical formulas to discover latent geometry relationships from suitable amount of graphic data, therefore, exhibited a potential probability of the conscious evolution of the computing machine species.

As an English translation of one part of the thesis, this paper focuses on establishing statistics of geometric relations in graphic data and establishing a quantitative method for comparing the similarities between the distributions of graphic data.

The rest of this paper is organized as follows. In Section 2, we introduce the geometric theorem machine proof methods related to the content of this paper. In Section 3, we propose the geometric relationship detection method based on distribution similarity. In Section 4, we conducted numerical experiments and compared the results under different observation error levels. In the final section, we draw a short conclusion.

2. Related Methods of Mechanical Geometry Theorem-Proving

2.1. Wu’s Method

Let *F* and *G* be two multivariate polynomials about the variable
$x$, the class of *F* is *k*, and the highest degree of *F* and *G* about
${x}_{k}$ are *d* and *s* respectively. Arrange *F* and *G* in descending order of the variable
${x}_{k}$ and write as follows form:

$\{\begin{array}{l}F={f}_{d}{x}_{k}^{d}+{f}_{d-1}{x}_{k}^{d-1}+\cdots +{f}_{0}\\ G={g}_{s}{x}_{k}^{s}+{g}_{s-1}{x}_{k}^{s-1}+\cdots +{g}_{0}\end{array}$ (1)

Then, there must be a non-negative integer *t* and polynomials *T* and *R*. The highest coefficient of *R* with respect to
${x}_{k}$ is less than *d* or
$R=0$, which satisfies:

${f}_{d}^{t}\times G=T\times F+R$ (2)

In Equation (2), *R* is the pseudo remainder of polynomial *G* with respect to polynomial *F*, denoted as
$prem\left(G\mathrm{,}F\right)=R$.

If the polynomial group
$TS={T}_{1},{T}_{2},\cdots ,{T}_{s}$ satisfies
$s=1$,
${T}_{1}\ne 0$ or
$\forall i<j$,
$class\left({T}_{i}\right)<class\left({T}_{j}\right)$, then the polynomial group *TS* of the following form is called a triangular polynomial group:

$\{\begin{array}{l}{T}_{1}\left({x}_{1}\mathrm{,}{x}_{2}\mathrm{,}\cdots \mathrm{,}{x}_{i1}\right)\\ {T}_{2}\left({x}_{1}\mathrm{,}{x}_{2}\mathrm{,}\cdots \mathrm{,}{x}_{i1}\mathrm{,}{x}_{i2}\right)\\ \cdots \\ {T}_{s}\left({x}_{1}\mathrm{,}{x}_{2}\mathrm{,}\cdots \mathrm{,}{x}_{i1}\mathrm{,}\cdots \mathrm{,}{x}_{is}\right)\end{array}$ (3)

Assuming that
$TS={T}_{1},{T}_{2},\cdots ,{T}_{s}$ is a triangular polynomial group, the remainder of polynomial *G* with respect to *TS* can be obtained by the following “continuous pseudo division”:

$\{\begin{array}{l}prem\left(G,{T}_{s}\right)={R}_{s}\\ prem\left({R}_{s},{T}_{s-1}\right)={R}_{s-1}\\ \cdots \\ prem\left({R}_{2},{T}_{1}\right)={R}_{1}\end{array}$ (4)

Let $R={R}_{1}$, and write Equations (4) as $prem\left(G\mathrm{,}TS\right)=R$. Further, the remainder formula Equation (2) can be extended to the following form:

${I}_{1}^{{t}_{1}}\times \cdots \times {I}_{s}^{{t}_{s}}\times G={\displaystyle \underset{i=1}{\overset{s}{\sum}}}\text{\hspace{0.05em}}\text{\hspace{0.05em}}{C}_{i}\times {T}_{i}+R$ (5)

Among them, ${I}_{i}$ and ${C}_{i}$ are the initial formula and polynomial of ${T}_{i}$ respectively.

The general procedure of Wu’s method to prove geometric theorem is as follows:

1) The geometric theorem is algebraized, the known assumptions are partially transformed into a polynomial group *H*, and the theorem’s conclusion is transformed into a polynomial *g*.

2) The polynomial group *H* is sorted according to the Wu-Ritt principle [2] [18], and the ascending
$CS=\left\{{f}_{1}=0,{f}_{2}=0,\cdots ,{f}_{s}=0\right\}$ is obtained.

3) Solve the theorem conclusion polynomial *g* and the continuous pseudo-division of ascending sequence, get
$prem\left(g\mathrm{,}CS\right)=R$, and judge whether the residue *R* is 0. If
$R=0$, according to Equation (5), it is easy to get the equation
${I}_{1}^{{t}_{1}}\times \cdots \times {I}_{s}^{{t}_{s}}\times g={C}_{1}{f}_{1}+{C}_{2}{f}_{2}+\cdots +{C}_{s}{f}_{s}$, and
$g=0$ can be derived from the non-degenerate condition
${I}_{i}^{{t}_{i}}\ne 0$ and
${f}_{i}=0$. From this, we can know that the theorem to be proved holds under non-degenerate conditions.

2.2. Numerical Parallel Method

The single-example illustration method has expanded a new idea for the machine proof of geometric theorems, but it has not been realized due to its high computational complexity. Zhang *et al.* [8] proposed a numerical parallel method inspired by Wu’s method.

Suppose the polynomial
$F\left({x}_{1}\mathrm{,}{x}_{2}\mathrm{,}\cdots \mathrm{,}{x}_{n}\right)\in \mathbb{K}\left[{x}_{1}\mathrm{,}{x}_{2}\mathrm{,}\cdots \mathrm{,}{x}_{n}\right]$, the highest degree of the polynomial *F* to the variable
${x}_{i}$ is less than or equal to
${d}_{i}\left(i=1,2,\cdots ,n\right)$, and
${S}_{i}\left(i=1,2,\cdots ,n\right)$ is an arbitrary subset of
${d}_{i}+1$ elements in the domain
$\mathbb{K}$. If the following Equation (6) holds,*F* is an identity that is always 0:

$\forall \left({x}_{1}^{*},{x}_{2}^{*},\cdots ,{x}_{n}^{*}\right)\in {S}_{1}\times {S}_{2}\times \cdots \times {S}_{n},F\left({x}_{1}^{*},{x}_{2}^{*},\cdots ,{x}_{n}^{*}\right)=0$ (6)

The conclusion can be drawn from the above: To verify whether an n-ary polynomial
$F\left({x}_{1}\mathrm{,}{x}_{2}\mathrm{,}\cdots \mathrm{,}{x}_{n}\right)$ with the highest degree of each variable of
${d}_{i}\left(i=1,2,\cdots ,n\right)$ is an identity that is always 0, only *N* different numerical examples need to be verified, where
$N=\left({d}_{1}+1\right)\times \left({d}_{2}+1\right)\times \cdots \times \left({d}_{n}+1\right)$.

The general procedure of Wu’s method to prove geometric theorem is as follows:

1) The geometric theorem is algebraized, the known assumptions are partially transformed into a polynomial group *H*, and the theorem’s conclusion is transformed into a polynomial *g*.

2) Solve the triangle polynomial set *TS* reduced by the polynomial set *H*. Solve the conclusion of the polynomial *g* for the remainder of *TS*,
$R=prem\left(g\mathrm{,}TS\right)$. Estimate the maximum degree
${d}_{i}\left(i=1,2,\cdots ,n\right)$ of the remainder *R* for the independent variable.

3) According to Equation (6), construct the set of instances to be tested and substitute these instances into *TS* one by one, solve the specific values of the constraint variables, and then substitute them into *g*. If
$g=0$, it indicates that the instance is consistent with the theorem; Otherwise, this geometric theorem is generally invalid.

3. Intuitive Geometry Based on Experimental Mathematics and Statistical Analysis

3.1. Data and Statistics

The algebraic methods such as Wu’s method, single-example illustration method, and numerical parallel method prove geometric theorems. It is necessary to algebraize the geometric theorems. We propose to calculate the numerical value of the geometric configuration instance without algebraic processing, so it needs to generate a large number of geometric configuration legends. Data can be generated by changing the free points in the geometric configuration. We use Maple to write a dynamic geometry subroutine module similar to the geometric sketchpad and super sketchpad to realize the data generation of the geometric configuration.

The algebraic method proves the geometric theorem, and a polynomial $f\left({x}_{1}\mathrm{,}{x}_{2}\mathrm{,}\cdots \mathrm{,}{x}_{n}\right)=0$ expresses the geometric relationship by selecting appropriate coordinates. Our Maple program simulates the intelligent subject to observe the geometric configuration intuitively, adding slight disturbances to the data and rounding the coordinates of the points. In this way, it is not possible to directly use the polynomial $f\left({x}_{1}\mathrm{,}{x}_{2}\mathrm{,}\cdots \mathrm{,}{x}_{n}\right)=0$ to express the geometric relationship. In analogy to the geometric predicate in the algebraic method, we have constructed relevant statistics to express the geometric relationship.

The construction of statistics satisfies the following three principles:

1) $f=0$, if and only if a particular geometric relationship is strictly valid numerically, the degree of approximate validity of a particular geometric relationship is measured by the degree of deviation from 0.

2) Statistics should eliminate the influence of dimensions.

3) For *N* samples Equation (7) that satisfy a particular geometric relationship, satisfy Equation (8)

${A}^{\left(i\right)}=\left({u}_{1}^{\left(i\right)},{v}_{1}^{\left(i\right)}\right),{B}^{\left(i\right)}=\left({u}_{2}^{\left(i\right)},{v}_{2}^{\left(i\right)}\right),{C}^{\left(i\right)}=\left({u}_{3}^{\left(i\right)},{v}_{3}^{\left(i\right)}\right),\cdots ,i=1,2,\cdots ,N$ (7)

$\underset{N\to \infty}{\mathrm{lim}}\frac{1}{N}{\displaystyle \underset{i=1}{\overset{N}{\sum}}}\text{\hspace{0.05em}}\text{\hspace{0.05em}}f\left({A}^{\left(i\right)},{B}^{\left(i\right)},{C}^{\left(i\right)},\cdots \right)\to 0$ (8)

We briefly explain the construction of statistics corresponding to commonly used geometric relations.

Measure equilateral triangle: use Equation (9) to measure.

$f=\underset{i}{\mathrm{arg}}\mathrm{max}\left(\left|i\right|-\frac{\pi}{3}\right),i=\angle A,\angle B,\angle C$ (9)

Measurement angle bisector: Measured by the difference between the two angles formed by dividing the angle by a straight line. Measure vertical (or parallel):

the value is measured by the difference between the angle of two vectors and $\frac{\pi}{2}$

(or 0), and the outer product of the vector measures the sign. The three points are measured in common: the value is measured by the farthest distance among the three points, and the directed area of a triangle measures the sign. Measure the collinearity of three points: use the smallest angle between the three points in the vector. Measure multiple points in a circle: fit a circle
${\left(x-{x}_{0}\right)}^{2}+{\left(y-{y}_{0}\right)}^{2}={r}_{0}^{2}$ closest to these *N* points by the least square method, and then use Equation (10) to measure.

$f=\left(\underset{{r}_{i}}{\mathrm{arg}}\mathrm{max}\left(\left|{r}_{i}-{r}_{0}\right|\right)\right)/{r}_{0}-1$ (10)

3.2. Distribution Similarity Geometric Relationship Detection

In this section, we propose a geometric relationship detection method based on distribution similarity. Before that, let me introduce the methods of measuring the similarity of distributions and the nonparametric test methods used in this paper.

Considering the similarity of two probability distributions, *P* and *Q*, Kullback-Leibler divergence (KL divergence) in Equation (11) and Jensen-Shannon divergence (JS divergence ) in Equation (12) can be used.

${D}_{KL}\left(P\left|\right|Q\right)={\displaystyle {\int}_{x}}\text{\hspace{0.05em}}\text{\hspace{0.05em}}p\left(x\right)\mathrm{log}\left(\frac{p\left(x\right)}{q\left(x\right)}\right)\text{d}x$ (11)

${D}_{JS}\left(P\left|\right|Q\right)=\frac{1}{2}{D}_{KL}\left(P\left|\right|M\right)+\frac{1}{2}{D}_{KL}\left(Q\left|\right|M\right),M=\frac{P+Q}{2}$ (12)

When the support sets of the two distributions, *P* and *Q*, do not overlap or the overlap is small, it is difficult for KL divergence and JS divergence to quantify the similarity between the distributions. In recent years, the similarity of the Wasserstein distance Equation (13) metric distribution has been widely used in machine learning. In this paper, we use Wasserstein distance to measure the similarity of distributions. In Equation (13),
$\gamma \left(x\mathrm{,}y\right)$ satisfies
${\int}_{{\mathbb{R}}^{d}}}\text{\hspace{0.05em}}\text{\hspace{0.05em}}\gamma \left(x,y\right)\text{d}y=P$ and
${\int}_{{\mathbb{R}}^{d}}}\text{\hspace{0.05em}}\text{\hspace{0.05em}}\gamma \left(x,y\right)\text{d}x=Q$. In general, it is tough to calculate the Wasserstein distance, but when the data dimension
$d=1$, the p-Wasserstein distance can be expressed as Equation (14). Among them,
${F}^{-1}$ and
${G}^{-1}$ are the quantile functions of *P* and *Q*, respectively.

${W}_{p}\left(P,Q\right)={\left(\underset{\gamma \in \Gamma \left(P,Q\right)}{\mathrm{inf}}{\displaystyle {\int}_{{\mathbb{R}}^{d}\times {\mathbb{R}}^{d}}}{\Vert x-y\Vert}^{p}\gamma \left(x,y\right)\right)}^{1/p}$ (13)

${W}_{p}\left(P\mathrm{,}Q\right)={\displaystyle {\int}_{0}^{1}}{\left|{F}^{-1}\left(t\right)-{G}^{-1}\left(t\right)\right|}^{p}\text{d}t$ (14)

In this way, the calculation of *p*-Wasserstein distance is simplified. In the previous section, we constructed the statistics of geometric relations and mapped the observation data to one dimension to use 1-Wasserstein distance Equation (15) to measure similarity.

${W}_{1}\left(P\mathrm{,}Q\right)={\displaystyle {\int}_{0}^{1}}\left|{F}^{-1}\left(t\right)-{G}^{-1}\left(t\right)\right|\text{d}t$ (15)

In statistics, hypothesis testing is often used in statistical inference, inferring hypotheses about the population based on empirical data. It can also be used to test whether two distributions come from the same distribution. In this paper, we used two non-parametric tests. One is the Kolmogorov-Smirnov (K-S) test, which uses the K-S statistic Equation (16) or Equation (17) to accept or reject the null hypothesis.

${D}_{n}=\underset{x}{\mathrm{sup}}\left|{F}_{n}\left(x\right)-F\left(x\right)\right|$ (16)

${D}_{n,m}=\underset{x}{\mathrm{sup}}\left|{F}_{1,n}\left(x\right)-{F}_{2,m}\left(x\right)\right|$ (17)

Another method is referred to as the permutation test based on the 2-Wasserstein distance used in Matsui *et al.* [19] and Schefzik *et al.* [20]. Considering the Wasserstein distance when
$d=1$ and
$p=2$, it can be decomposed into three parts [21] in Equation (18).

$\begin{array}{c}{W}_{2}\left(P\mathrm{,}Q\right)={\displaystyle {\int}_{0}^{1}}{\left|{F}^{-1}\left(t\right)-{G}^{-1}\left(t\right)\right|}^{2}\text{d}t\\ ={\left({\mu}_{p}-{\mu}_{q}\right)}^{2}+{\left({\sigma}_{p}-{\sigma}_{q}\right)}^{2}+2{\sigma}_{p}{\sigma}_{q}\left(1-{\rho}_{p\mathrm{,}q}\right)\end{array}$ (18)

Among them, the mean, variance and shape of the three items on the right are respectively distributed.
${\rho}_{p\mathrm{,}q}$ is the Pearson correlation coefficient of the corresponding point in the quantile map of *F* and *G*. Equation (18) can be approximated by the empirical estimation formula Equation (19).

${W}_{2}\left(\stackrel{^}{P}\mathrm{,}\stackrel{^}{Q}\right)={\displaystyle {\int}_{0}^{1}}{\left|{\stackrel{^}{F}}^{-1}\left(t\right)-{\stackrel{^}{G}}^{-1}\left(t\right)\right|}^{2}\text{d}t$ (19)

$\stackrel{^}{P}$ and
$\stackrel{^}{Q}$ are the distribution of the observation data
$X\left(i\right),Y\left(i\right),i=1,2,\cdots ,n$, and
${\stackrel{^}{F}}^{-1}\left(t\right)$ and
${\stackrel{^}{G}}^{-1}\left(t\right)$ are the quantile functions of the observation data. The null hypothesis is
${H}_{0}:P=Q$. Under the condition that the null hypothesis is established, the distribution functions *P* and *Q* are obtained by random replacement of samples. Calculate the distance
${d}_{i}^{\mathrm{*}}={W}_{2}\left({\stackrel{^}{P}}_{i}^{\mathrm{*}}\mathrm{,}{\stackrel{^}{Q}}_{i}^{\mathrm{*}}\right)$ according to Equation (18), mark
${D}_{i}^{\mathrm{*}}=\left({d}_{i\mathrm{,1}}^{\mathrm{*}}\mathrm{,}{d}_{i\mathrm{,2}}^{\mathrm{*}}\mathrm{,}\cdots \mathrm{,}{d}_{i\mathrm{,}B}^{\mathrm{*}}\right)$ as the distances between the two distributions after all possible permutations. Then the p-value can be calculated according to Equation (20), where
${d}_{i}={W}_{2}\left(\stackrel{^}{P}\mathrm{,}\stackrel{^}{Q}\right)$. The subset
${D}_{i,sub}^{*}=\left({\stackrel{\u02dc}{d}}_{i,1}^{*},{\stackrel{\u02dc}{d}}_{i,2}^{*},\cdots ,{\stackrel{\u02dc}{d}}_{i,{B}_{s}}^{*}\right),{B}_{s}<B$ of
${D}_{i}^{\mathrm{*}}$ can approximate Equation (20) to reduce the amount of calculation, and Equation (21) can be obtained.

$p=\frac{{\displaystyle {\sum}_{b=1}^{B}I\left({d}_{i,b}^{*}\ge {d}_{i}\right)}}{B}$ (20)

${p}_{s}=\frac{{\displaystyle {\sum}_{b=1}^{B}I\left({\stackrel{\u02dc}{d}}_{i,b}^{*}\ge {d}_{i}\right)}}{{B}_{s}}$ (21)

The geometric relationship detection method based on distribution similarity mainly includes the following steps:

Step 1: Call the Maple subroutine to generate the corresponding geometric configuration legend, and generate a large sample data according to the dynamic geometry.

Step 2: Randomly select a batch of samples, and measure the similarity according to Equation (15). Under fixed disturbance
$\delta $, construct the standard distribution
${P}_{\left(\delta \mathrm{,}f\right)}$ of each geometric relationship to measure the observation data distribution
${P}_{\left(emp\mathrm{,}f\right)}$. Among them, *f* is a statistic that measures geometric relations.

Step 3: In Step 2, reduce the search range according to the 1-Wasserstein distance, and perform a significance test on the remaining distributions with high similarity. When ${P}_{\left(\delta \mathrm{,}f\right)}$ and ${P}_{\left(emp\mathrm{,}f\right)}$ approximately obey the normal distribution, use the T-test and the F-test to test the position and scale parameters of the normal distribution, respectively. Otherwise, the permutation test is used for the non-parametric test. Use the K-S method to test whether ${P}_{\left(\delta \mathrm{,}f\right)}$ and ${P}_{\left(emp\mathrm{,}f\right)}$ approximately obey a normal distribution.

The complete process can be found in Algorithm 1.

3.3. Integral Coefficient Invariant Discovery

In Section 3.2, we propose a geometric relationship detection method based on distribution similarity to explore the deterministic vertical and collinear geometric relationships in geometric configurations. In this section, we explore the integer coefficient relationship between the lengths of geometric quantities. This type of relationship involves uncertain, unknown quantities and uncertain integer coefficients. Since most of the relations between geometric quantities are homogeneous relations of first and second order in plane geometry, we study the first and second integer coefficient relations between the length of line segments in geometric figures. Specifically, there is a vector $x=\left({x}_{1}\mathrm{,}{x}_{2}\mathrm{,}\cdots \mathrm{,}{x}_{n}\right)\in {\mathbb{R}}^{n}$, and a group of integers ${a}_{1}\mathrm{,}{a}_{2}\mathrm{,}\cdots \mathrm{,}{a}_{n}$ that is not all 0 is found to satisfy Equation (24).

${a}_{1}{x}_{1}+{a}_{2}{x}_{2}+\cdots +{a}_{n}{x}_{n}=0$ (22)

At present, the widely used integer relational algorithms are mainly the LLL algorithm proposed by Lenstra *et al.*, and the PSLQ algorithm proposed by Bailey *et al.* In addition, Feng *et al.* [22] researched a PSLQ algorithm with empirical data as input. Our data is observational data, and the accuracy of the data does not meet the requirements of these algorithms, so we try to solve it in the following way.

First, we converted the sample data, and the generated data is uniformly expressed as an array of points *P* Equation (23),

$P=\left[{p}_{1}:\left({x}_{1},{y}_{1}\right),{p}_{2}:\left({x}_{2},{y}_{2}\right),\cdots ,{p}_{n}:\left({x}_{n},{y}_{n}\right)\right]$ (23)

where the order of points
${p}_{i}$ in *P* is fixed. There is a line between any two points by default, and all the line segment lengths in the geometric figure *D* Equation (24) are obtained, where *d* is distance.

$D=\left[d\left({p}_{1},{p}_{2}\right),d\left({p}_{1},{p}_{3}\right),\cdots ,d\left({p}_{n-1},{p}_{n}\right)\right]$ (24)

Extending it to quadratic and the reciprocal of the geometric quantity get ${D}_{2}$ Equation (25) and ${D}_{-1}$ Equation (26).

${D}_{2}=\left[{\left(d\left({p}_{1}\mathrm{,}{p}_{2}\right)\right)}^{2}\mathrm{,}d\left({p}_{1}\mathrm{,}{p}_{2}\right)\ast d\left({p}_{1}\mathrm{,}{p}_{3}\right)\mathrm{,}\cdots \mathrm{,}{\left(d\left({p}_{n-1}\mathrm{,}{p}_{n}\right)\right)}^{2}\right]$ (25)

${D}_{-1}=\left[1/d\left({p}_{1},{p}_{2}\right),1/d\left({p}_{1},{p}_{3}\right),\cdots ,1/d\left({p}_{n-1},{p}_{n}\right)\right]$ (26)

Then, we randomly select no less than m converted samples and write them as
${A}_{m\times m}$ in the form of a matrix, and write the integer coefficient vector to be solved as
${x}_{m\times 1}$,
$x\in {\mathbb{Z}}^{m}\mathrm{,}x\ne 0$, where*m* is the number of elements in *D* or
${D}_{2}$. If the length of the line segment in the geometric configuration has an integral coefficient relationship, then there is such a
$x$ that satisfies
$Ax=0$. Due to the observation error of the sample data, the problem is transformed into Equation (27).

$\underset{x}{\mathrm{arg}}\mathrm{min}\frac{1}{m}{\Vert Ax\Vert}_{2}^{2}\mathrm{,}x\in {\mathbb{Z}}^{m}\mathrm{,}x\ne 0$ (27)

Usually, in plane geometry, the integer coefficients between geometric quantities are relatively small. Due to the existence of such prior knowledge, we add regular term constraints based on Equation (27) to obtain Equation (28).

$\underset{x}{\mathrm{arg}}\mathrm{min}\frac{1}{m}{\Vert Ax\Vert}_{2}^{2}+\lambda {\Vert x\Vert}_{2}\mathrm{,}x\in {\mathbb{Z}}^{m}\mathrm{,}x\ne 0$ (28)

In addition, in geometric figures, the integer coefficient relationships that exist are not unique. In Equation (28), integer coefficient relationships involving a small number of geometric quantities will conceal the relationship involving more integer coefficients involving geometric quantities. Consider decomposing Equation (28) into sub-problems Equation (29).

$\underset{x}{\mathrm{arg}}\mathrm{min}\frac{1}{m}{\Vert Ax\Vert}_{2}^{2}+\lambda {\Vert x\Vert}_{2}\mathrm{,}{x}_{i}\ge 1\left(1\le i\le m\right)\mathrm{,}x\in {\mathbb{Z}}^{m}$ (29)

Since the elements in *D*,
${D}_{-1}$, and
${D}_{2}$ are always non-negative, if
$x$ satisfies Equation (28), then at least two components of
$x$ are not 0 and are integers. We let a component
${x}_{i}$ of
$x$ add constraints to Equation (28) to get Equation (29) to search sequentially.

The solution of Equation (29) is complex. We use the Cplex solver developed by IBM to solve this problem. The algorithm steps are shown in Algorithm 2, where the *err* refers to the sum of the difference between the approximate integral relationship and the strict integral relationship and the regular term.

4. Numerical Experiment

In this section, we construct observation data of some geometric theorems and performed numerical experiments on this basis, including 12 geometric theorems such as Orthocenter theorem, Centroid theorem, Incenter theorem, Morley theorem, Euler Line, Five Circles theorem, Nine-point Circle theorem, Ptolemy’s theorem, corollary to Ptolemy’s theorem, Candy theorem, Pappus theorem, and Desargue theorem. We first carry out numerical experiments under $\delta =N\left(\mathrm{0,2}\right)$ disturbances and then carry out numerical experiments with different disturbance levels of $\delta =N\left(\mathrm{0,1}\right)$ and $\delta =N\left(\mathrm{0,3}\right)$, where $N\left(\mathrm{0,2}\right)$ is normal distribution.

Take the Orthocenter theorem to illustrate a feasible method of selecting thresholds to reduce the search space. Figure 1 is the Wasserstein distance of the geometric relationship of the three-point collinear relationship in the Orthocenter theorem. The Wasserstein distance is naturally divided into two categories. The right part does not satisfy the three-point collinear relationship, so the data of these combinations can be quickly excluded. In the Orthocenter theorem, the perpendicular, three-point common point relationship test is the same, and the hypothetical test of the result after rapid elimination is performed. The Orthocenter theorem, the empirical distribution of the perpendicular relationship

Figure 1. Wasserstein distance of the three-point collinear relationship in the Orthocenter theorem.

statistics of the three vertical lines, and the standard distribution of the perpendicular relationship statistics under $\delta =N\left(\mathrm{0,2}\right)$ disturbance are shown in Figure 2. A further hypothetical test is performed on the perpendicular relationship of the three vertical lines and the significance level a = 005. Because the empirical distribution and the standard distribution are approximately normal distributions, the parameter test is used directly. The results are shown in Table 1.

Since there are many collinear and perpendicular relationships, it is too verbose to list them all. Here are examples of each type of geometric relationship. The results are shown in Table 2, where 1-Wd means 1-Wasseratein distance.

We take Ptolemy’s theorem, the corollary of Ptolemy’s theorem, and Candy theorem as examples to carry out numerical experiments on the invariants of integral coefficients. The first is Ptolemy’s theorem, 100 samples are randomly selected to solve, and three effective solution vectors Equation (30) are obtained. These three solution vectors correspond to the same geometric relationship, which is the conclusion of Ptolemy’s theorem. We re-selected 100 samples for 10 experiments, and the errors obtained were 94.3702, 78.3523, 59.2743, 31.8584, 76.5764, 81.6923, 43.1427, 69.1004, 118.8367, 76.9757.

$\{\begin{array}{l}{x}_{1}=\left(\mathrm{0,0,0,0,0,1,0,0,0,}-\mathrm{1,0,0,1,0,0,0,0,0,0,0,0}\right)\\ {x}_{2}=\left(\mathrm{0,0,0,0,0,}-\mathrm{1,0,0,0,1,0,0,}-\mathrm{1,0,0,0,0,0,0,0,0}\right)\\ {x}_{3}=\left(\mathrm{0,0,0,0,0,1,0,0,0,}-\mathrm{1,0,0,1,0,0,0,0,0,0,0,0}\right)\end{array}$ (30)

The second is the corollary of Ptolemy’s theorem, where the result of the corollary is in an order relationship with the size of the geometric value, and the length of the line segment is sorted before starting the solution. Similarly, 100 samples are randomly selected and solved to obtain the solution vector Equation (31).

Figure 2. The distribution of the perpendicular relationship statistics of the vertical lines, the lower right corner subfigure is the standard distribution and the rest are the empirical distribution.

Table 1. Hypothesis test of perpendicular relationship ( $\alpha =0.05$ ).

Table 2. Geometric relationship detection results based on distribution similarity ( $\delta =N\left(0,2\right),\alpha =0.05$ ).

$\{\begin{array}{l}{x}_{1}=\left(\mathrm{1,1,0,0,0,}-1\right)\hfill \\ {x}_{2}=\left(\mathrm{1,1,0,0,0,}-1\right)\hfill \\ {x}_{3}=\left(\mathrm{0,0,1,}-\mathrm{1,0,0}\right)\hfill \\ {x}_{4}=\left(\mathrm{0,0,0,1,}-\mathrm{1,0}\right)\hfill \\ {x}_{5}=\left(\mathrm{0,0,0,}-\mathrm{1,1,0}\right)\hfill \\ {x}_{6}=\left(-\mathrm{1,}-\mathrm{1,0,0,0,1}\right)\hfill \end{array}$ (31)

Finally, in the numerical experiment of Candy theorem, the theorem’s conclusion could not be obtained. If we expand the conclusion of Candy theorem, we get a cubic relationship. It is tough to solve the cubic relationship. The dimension of the solution vector $x$ will be increased very largely, and the error will also accumulate due to the multiplication of each item.

The comparative experimental results of the other two groups of different disturbances can be seen in Table 3 and Table 4. The error of the integral coefficient invariant relationship in Table 4 is taken from the average of the results of 10 experiments, and “−” means that the result of the theorem is not obtained.

5. Conclusion

In this paper, we construct statistics that measure approximate geometric relationships for inaccurate observation data, map the observation data to one dimension through statistics. Using the distribution similarity of the Wasserstein distance metric, we propose a method for detecting geometric relationship similarities. The method has been successfully applied for checking the following geometric relations: 1) three lines intersect at one point; 2) three points lie on one line; 3) three points form one equilateral triangle; 4) two lines are parallel or perpendicular to each other; 5) one line bisects a given angle; 6) four points form a convex quadrilateral; and 7) four or more points lie on the same circle. We have also proposed a searching method to find linear or quadratic equations with integer coefficients between observed geometric quantities under certain prior conditions. The constrained searching method can be used to find linear or quadratic relation with integer coefficients between geometric quantities under a priori conditions. Numerical experiments show that the method proposed in this

Table 3. Geometric relationship detection results ( $\delta =N\left(0,1\right)$ and $\delta =N\left(0,3\right)$, $\alpha =0.05$ ).

Table 4. Integral coefficient relation error under different disturbance levels.

paper is adequate, which will help the machine to obtain intuitive analysis capabilities for geometric figures, which have practical significance and certain application prospects. Our experiment failed in finding a cubic relation in the CandyTheorem, namely, assume that *AB* is an arbitrary chord in the circle *O*, *P* is a point on *AB*,
$C\mathrm{,}D$ are arbitrary two points on the circle *O*,
$E\mathrm{,}F$ are intersection points of the circle *O* and line
$CP\mathrm{,}DP$, and
$G\mathrm{,}H$ are intersection points of the chord *AB* with
$CF\mathrm{,}DE$, respectively, then
$1/AP-1/BP=1/GP-1/HP$. We will try to overcome this difficulty by control error accumulation in numerical analysis. An interesting problem is to train computers to find latent inequalities from configuration data. A very simple and famous example of such kind is Euler’s Inequality, which states that the distance *d* between the incenter *r* and the circumcenter *R* of a triangle satisfies

${d}^{2}=R\left(R-2r\right),$

and therefore $R\ge 2r$. Since almost interesting theorems that involved equalities in Euclidean geometry have been well established in past three thousand years, a prospective application of machine intelligence in the future would be automated discovering of geometric inequalities through analyzing big data of geometric configurations.

References

[1] Tarski, A. (1951) A Decision Method for Elementary Algebra and Geometry. In: Johannes, K., Ed., Texts & Monographs in Symbolic Computation, Springer, Berlin, 24-84.

https://doi.org/10.1007/978-3-7091-9459-1_3

[2] Wu, W.-T. (2012) Mechanical Theorem Proving in Geometries: Basic Principles. Springer-Verlag Wien, Wien.

[3] Chou, S.C. (1988) An Introduction to Wu’s Method for Mechanical Theorem Proving in Geometry. Journal of Automated Reasoning, 4, 237-267.

https://doi.org/10.1007/BF00244942

[4] Chou, S.C. and Gao, X.S. (1990) Ritt-Wu’s Decomposition Algorithm and Geometry Theorem Proving. International Conference on Automated Deduction, Kaiserslautern, July 1990, 207-220.

https://doi.org/10.1007/3-540-52885-7_89

[5] Buchberger, B., Collins, G.E. and Kutzler, B. (1988) Algebraic Methods for Geometric Reasoning. Annual Review of Computer Science, 3, 85-119.

https://doi.org/10.1146/annurev.cs.03.060188.000505

[6] Kutzler, B. and Stifter, S. (1986) On the Application of Buchberger’s Algorithm to Automated Geometry Theorem Proving. Journal of Symbolic Computation, 2, 389-397.

https://doi.org/10.1016/S0747-7171(86)80006-2

[7] Hong, J.W. (1986) Can We Prove Geometric Theorems with Examples? Science in China Series A-Mathematics, Physics, Astronomy & Technological Science, 16, 234-242.

https://doi.org/10.1360/za1986-16-3-234

[8] Zhang, J.Z., Yang, L. and Deng, M. (1990) The Parallel Numerical Method of Mechanical Theorem Proving. Theoretical Computer Science, 74, 253-271.

https://doi.org/10.1016/0304-3975(90)90077-U

[9] Zhang, J.Z., Chou, S.C. and Gao, X.S. (1995) Automated Production of Traditional Proofs for Theorems in Euclidean Geometry I. The Hilbert Intersection Point Theorems. Annals of Mathematics & Artificial Intelligence, 13, 109-137.

https://doi.org/10.1007/BF01531326

[10] Chou, S.C., Gao, X.S. and Zhang, J.Z. (1993) Automated Production of Traditional Proofs for Constructive Geometry Theorems. Proceedings Eighth Annual IEEE Symposium on Logic in Computer Science, Montreal, 19-23 June 1993, 48-56.

https://doi.org/10.1109/LICS.1993.287601

[11] Chou, S.C., Gao, X.S. and Zhang, J.Z. (1993) Mechanical Geometry Theorem Proving by Vector Calculation. Proceedings of the 1993 International Symposium on Symbolic and Algebraic Computation, Kiev, Ukraine, August 1993, 284-291.

https://doi.org/10.1145/164081.164142

[12] Chou, S.C., Gao, X.S. and Zhang, J.Z. (1996) Automated Generation of Readable Proofs with Geometric Invariants. Journal of Automated Reasoning, 17, 349-370.

https://doi.org/10.1007/BF00283134

[13] Zhang, J.Z., Gao, X.S. and Chou, S.C. (2015) Invariance Method of Machine Proof for Geometric Theorems. Science Press, Beijing.

[14] Gelernter, H., Hansen, J., Loveland, D. (1960) Empirical Explorations of the Geometry-Theorem Proving Machine. Western Joint IRE-AIEE-ACM Computer Conference, San Francisco, California, 3-5 May 1960, 143-149.

https://doi.org/10.1145/1460361.1460381

[15] Nevins, A.J. (1975) Plane Geometry Theorem Proving Using Forward Chaining. Artificial Intelligence, 6, 1-23.

https://doi.org/10.1016/0004-3702(75)90013-2

[16] Chou, S.C., Gao, X.S. and Zhang, J.Z. (2000) A Deductive Database Approach to Automated Geometry Theorem Proving and Discovering. Journal of Automated Reasoning, 25, 219-246.

https://doi.org/10.1023/A:1006171315513

[17] Sun, X. (2021) Research on Intuitive Geometry Based on Computer Experimental Mathematics and Statistical Analysis. Master Thesis, Shanghai University, Shanghai.

[18] Wu, W.-T. (2003) Mathematics Mechanization. Science Press, Beijing.

[19] Matsui, Y., Mizuta, M., Ito, S., Miyano, S. and Shimamura, T. (2016) D3M: Detection of Differential Distributions of Methylation Levels. Bioinformatics, 32, 2248-2255.

https://doi.org/10.1093/bioinformatics/btw138

[20] Schefzik, R., Flesch, H. and Goncalves, A. (2021) Fast Identification of Differential Distributions in Single-Cell RNA-Sequencing Data with waddR. Bioinformatics.

https://doi.org/10.1093/bioinformatics/btab226

[21] Irpino, A. and Verde, R. (2015) Basic Statistics for Distributional Symbolic Variables: A New Metric-Based Approach. Advances in Data Analysis and Classification, 9, 143-175.

https://doi.org/10.1007/s11634-014-0176-4

[22] Feng, Y., Chen, J.W. and Wu, W.Y. (2019) The PSLQ Algorithm for Empirical Data. Mathematics of Computation, 88, 1479-1501.

https://doi.org/10.1090/mcom/3356