Received 3 June 2016; accepted 26 July 2016; published 29 July 2016
It is now well recognized that in finite element computations there are three independent parameters: characteristic length of the discretization h, degree of approximation p, and the order k of the scalar product space. h and p have been well known for quite some time but introduction of k as an additional independent parameter in finite element computations is rather recent. Surana et al.  -  have shown the order k of the approximation space to be an independent parameter in all finite element computational processes in addition to h and p, hence k-version of finite element method in addition to h- and p-versions. The order k of the approximation space ensures global differentiability of order over the whole discretization. The appropriate choice of k is essential in ensuring that 1) the desired physics is preserved in the computational process and 2) the integrals are Riemann in the entire finite element process so that the equivalence of BVP with the integral form is preserved and the errors in the calculated solution can be computed correctly without knowledge of the theoretical solution. We elaborate more on some of these aspects in the following.
If the differential operator contains highest order derivatives of the dependent variables of orders, then the approximation of the solutions of the BVP must at least be of class i.e. of global differentiability of order in order for this approximation to be admissible in the BVP in the pointwise sense. This requires that the order k of the approximation space must at least be i.e. is minimally conforming order of the approximation space. Clearly, the order k of the minimally conforming space is determined by the highest order of the derivatives of the dependent variable(s) in the BVP. When, all integrals over the discretization remain Riemann. When, the integrals over are in Lebesgue sense and the corresponding approximation of the solution over is not admissible in the BVP in the pointwise sense. When, the approximation of over is not admissible at all in the BVP. Choosing may be beneficial if the theoretical solution of the BVP is of higher order global differentiability than 2m as this choice incorporates higher order global differentiability aspects of in the computational process. Thus, now we have h-, p-, k-versions of the finite element processes and associated convergences and convergence rates.
The subject of a priori error estimation and a posteriori error estimation have been exhaustively studied and investigated with the objective that 1) perhaps a priori error estimates will help us in deciding the most prudent choices of h, p, and k so that the errors in the desired norms are reduced at the fastest rate during computations, 2) the a posteriori error estimates will guide us based on the current finite element solution in improving the accuracy of the subsequently computed solutions in the most prudent manner. The published literature on this subject is enormous and discussion of each writing on the subject in this paper is not feasible and is also of little benefit. Interested readers can refer to some selected publications  -  included here.
In the work presented in this paper, our objectives are:
a) To derive a priori error estimates for BVPs described by self adjoint, non-self adjoint, and nonlinear differential orperators when the theoretical solutions are analytic, thus establishing precise dependence of the chosen error norm on h, p, k, and the smoothness of the theoretical solution (for simplicity this is done using one dimensional BVPs).
b) To discuss the currently used a posteriori error estimation techniques, their shortcomings, and serious inadequacies when actual physics of the BVP is incorporated in the finite element computational process.
c) To demonstrate the need for a posteriori error computation and present a framework in which those computations can be performed without the knowledge of theoretical solutions.
d) To establish that higher order approximation spaces and variationally consistent integral forms are essential for incorporating the desired physics of the BVP in the computational process and to ensure that the resulting finite element computational processes are unconditionally stable so that error estimations remain meaningful.
e) To perform numerical studies using one dimensional boundary value problem described by self adjoint, non-self adjoint, and nonlinear differential operators and to demonstrate exceptionally good agreement of the computed convergence rates with those established theoretically.
f) To establish that the a priori error estimates derived in (a) also hold for 2D and 3D BVPs when the integral forms in those BVPs are variationally consistent.
2. Preliminaries: Convergence and Convergence Rates, Convergence Behavior of Computations, Error Estimation, and Error Computations
In this section, we present some preliminary material and concepts that are essential in error estimation and error computations. Many of these are well known but are included in the following for completeness and for the sake of coherent continuation to the new work in this paper.
2.1. Convergence and Convergence Rate
Convergence of a finite element solution implies behavior of the error in the finite element solution (measured in some norm) as a function of the degrees of freedom or the characteristic length of the discretization. When the theoretical solution is known, the error in the finite element solution in some norm (L2-norm, H1-norm, etc.) can be computed and therefore we can study its behavior as a function of the degrees of freedom. When the theoretical solution is not known, perhaps estimating the error in some norm in the computed solution is a viable option. However, we shall see in a later section that this option only works in a restricted range of the behavior of error norm versus dofs. The third option is that if we are using minimally conforming spaces then residual functional can be computed precisely as for minimally conforming spaces all integrals over the discretization of, the domain of definition of the BVP, are Riemann. Proximity of to zero is a measure of error due to the fact that when,. Thus, is in fact error measure in the solution over. This option can always be used for any applications as it does not require theoretical solution but necessitates the approximation to be in a space of order. In what follows we can use as a measure of error over, hence convergence of the computed solution to implies studying versus dofs as more degrees of freedom are added to the discretization. When, a predetermined tolerance of computed zero, we consider the finite element solution to be converged to the
theoretical solution. We consider versus dofs or, L2-norm of residual E.
We study versus dofs using log-log scale, or more precisely we study versus log(dofs). and log(dofs) or log-log scale are necessary as the range of I could be - and the range of dof could be - or higher.
2.2. Convergence Behavior of Computations
The material presented in this section is based on versus dof behavior, but the same concepts hold true for any other measure of error norm (i.e. can be replaced with any other error norm without affecting the basic behavior of the convergence graph). A typical convergence behavior of or versus log(dof) is shown in Figure 1. This graph is generated using 1D convection-diffusion equation (a second order ODE) with and least squares finite element formulation based on residual functional. The progressively graded discretizations are generated beginning with two elements using a constant geometric ratio of 1.5. The smallest element is located at. is used as it corresponds to the minimally conforming space. Minimum p-level of 5 (needed for) is considered for each progressively refined discretization. From Figure 1, we observe five distinct zones. In each one of these zones the behavior of versus dofs is unique and distinct. The behavior of versus log(dofs) shown in Figure 1 illustrates the varying rate of convergence of the finite element solution (the slope of the curve) with varying dofs. In the middle portion represented by almost a straight line behavior the slope is almost constant, indicating constant convergence rate. We discuss the details related to the varying slope of the curve, associated rate of convergence, and its significance in the following.
Figure 1. Typical convergence behavior of a finite element solution.
Pre-asymptotic range (AB): The range AB is called pre-asymptotic range. In this range as we move from location A toward location B additional degrees of freedom are added to the discretization but there is virtually no measurable reduction in the L2-norm of E. The accuracy of the computed solution in this range is very poor (due to of the). Due to poor accuracy of the solution, hence, the values for the elements are poor as well, hence these cannot be used to guide any form of adaptive refinement process. A posteriori error estimations in this range are not possible either as these require some regularity in the computed solution which is absent in in range AB. Thus, in this range adaptive processes are not possible as reliable indicators (either estimated or computed) based on are not possible.
Onset of asymptotic range (BC): The range BC is called onset of asymptotic range. In this range addition of degrees of freedom to the discretization results in measurable reduction in reflecting progressive improvement in accuracy of the computed solution from B to C. In this range values or any other possible element error indicators are more accurate than range AB. In this range adaptive processes in h, p, or hp can be utilized keeping in mind that as we move closer to C, the values of (or other indicators) for the elements of the discretization become more accurate, hence can be more effective in the adaptive process.
Asymptotic range (CD): In this range as more dofs are added to the discretization the improvement (reduction) in is most significant. This range on log-log scale is nearly linear, hence constant slope. Adaptive refinements in this range are most effective in reducing. We observe that between C and D there are several orders of magnitude reduction in the value of. Slope of the error norm versus dof graph in this range is called the asymptotic convergence rate of the finite element solution.
Onset of post-asymptotic range (DE): This range is almost reverse of the onset of asymptotic range. In this range reduction in progressively diminishes with the addition of degrees of freedom to the discretization indicating that substantial achievable reduction in has taken place up to point D. Computations in this range result in waste of significant resources (dofs) with very little gain in the objective of reducing.
Post-asymptotic range (EF): In this range in spite of the addition of dofs to the discretization no measurable reduction is observed in. This is generally due to the fact that within the accuracy of the computations (i.e. the word size on the computer we have reached a limit), hence the accuracy remains limited to the same number of decimal places in regardless of the increase in dofs.
2.3. Convergence Rates
In range AB, the slope is almost zero. From B to C the slope increases as more dofs are added to the discretization thereby progressively increasing convergence rate from B to C. From C to D, the asymptotic range, the slope of versus dofs is almost constant and the reduction in is most significant as more dofs are added. Thus, in the asymptotic range the convergence rate is the highest (due to highest slope of versus log(dof)) and is constant. In the onset of post-asymptotic range DE the convergence rate decreases and eventually becomes almost zero in the post-asymptotic range EF.
I) Behavior of versus dofs shown in Figure 1 is typical of other error norms as well, hence the discussion and conclusions related to Figure 1 are applicable in the convergence behavior study using any other desired error norm.
II) Pre-asymptotic range AB, onset of post-asymptotic range DE, and post-asymptotic range EF should be avoided as in these ranges solution accuracy improvement is poor.
III) In range AB values (or other measures) are not accurate enough to guide an adaptive process of any kind.
IV) Adaptive processes () can be initiated in the range BC as values in this range are reasonable measure of error. Adaptive processes become more and more effective when we initiate them as we approach from B to C. In the range BC the slope of versus dof increases from B to C indicating improving convergence rate and eventually achieves the highest convergence rate value at C which remains almost constant in the asymptotic range CD.
V) A priori and a posteriori error estimates are only valid in the asymptotic range due to the fact it is only in this range that computed has desired regularity and the convergence rate is the highest, hence worth estimating a priori. The error estimates (a priori and a posteriori) can neither be derived accurately nor can be used meaningfully in regions other than BC.
2.4. Error Estimation and Error Computation
There are two types of error estimations generally considered: a priori error estimation and a posteriori error estimation. A priori error estimation refers to establishing dependence of some error norm on h, p, k, and the regularity of the theoretical solution before the computations are performed so that we have knowledge of the precise nature of the functional dependence of error norm on h, p, k, and the regularity of the theoretical solution. A posteriori error estimation refers to error estimates derived using a computed solution with specific choices of h, p, and k. The sole purpose of a posteriori error estimation is to use current finite element solution to derive element indicators that can perhaps be used to guide an adaptive process. Both of the error estimations require some regularity of the computed solution which only exists in the asymptotic range (range CD, Figure 1). This is a very significant restriction on the use of these estimates. For example, a priori error estimate cannot be used to predict convergence rate in the ranges AB, BC, DE, and EF as this is specifically derived using the regularity of that only exists in the asymptotic range. Likewise a posteriori estimate cannot be used for adaptivity in any ranges except CD.
Another point to note is that a posteriori error estimates are generally derived such that they quantify the
weakness (es) in the finite element global approximation. Their derivations are largely based on
local approximations which result in interelement discontinuity of the derivatives normal to the interelement boundaries. This may be quantified by establishing bounds that can be used for adaptivity. However if we use of class thereby of class, then such bounds are meaningless. In k-version of finite element methods enabling higher order global differentiability approximations, majority of the a posteriori error estimates based on interelement discontinuity of the derivatives are not meaningful. With the use of higher order approximations, the integrals can be maintained Riemann, and are true measures of the error in the finite element solution for and and can indeed be used in adaptive processes. These aspects are discussed in more details in later sections.
3. Variationally Consistent (VC) and Variationally Inconsistent (VIC) Integral Forms
The differential operators appearing in the totality of all BVPs can be mathematically classified in three categories: self-adjoint, non-self-adjoint, and nonlinear differential operators. The finite element processes for these operators can be derived by constructing integral forms using methods of approximation such as: Galerkin method (GM), Petrov-Galerkin method (PGM), weighted residual method (WRM), Galerkin method with weak form (GM/WF), and least squares method or process (LSP). The unconditional stability of the resulting computational process or lack thereof can be established by making a correspondence of these integral forms to the elements of the calculus of variations  -   . The integral forms that result in unconditionally stable computational processes are termed variationally consistent (VC). The others are called variationally inconsistent (VIC). In VC integral forms the assembled coefficient matrices always remain positive-definite regardless of the admissible choices of h, p, and k whereas in VIC integral forms this can not always be ensured.
Definition 3.1 (consistent (VC) integral form of a BVP) A variationally consistent integral form corresponding to the BVP consists of
1) Existence of a functional corresponding to the BVP. This is generally by construction (or is assumed).
2) Necessary condition for the existence of an extremum of is given by. The integral form is used to determine. The Euler’s equation resulting from must be the BVP.
3), , (minimum, saddle point, maximum of) is the sufficient condition or extremum principle. Extremum principle ensures that a obtained from is unique. Extremum principle also establishes whether from minimizes or maximizes or yields a saddle point of.
When all these three elements are present in an integral formulation of the BVP, then the integral form (resulting from or otherwise) is called a variationally consistent integral form of the BVP (or simply VC integral process). VC integral form or process yields unique extremum of the functional corresponding to, hence a unique solution of the BVP (the Euler’s equation resulting from).
Definition 3.2 (inconsistent integral form (VIC) of a BVP) If an integral form of a BVP (resulting from or otherwise) is not variationally consistent, then it is variationally inconsistent. A variationally inconsistent integral form or process violates one or more of the three requirements needed for variational con- sistency of the integral form.
1) Thus, we see that a variationally consistent integral form of a BVP emerges as a method of obtaining a unique solution of the BVP.
2) The necessary condition (the integral form resulting from or otherwise) provides a system of algebraic equations from which the solution is determined.
3) The sufficient condition or unique extremum principle ensures that a obtained from the integral form (or otherwise) is unique, hence this yields a unique extremum of as well as a unique solution of the Euler’s equation which is the BVP under consideration.
4) Variationally consistent integral forms yield symmetric coefficient matrices in the algebraic systems and the coefficient matrices are positive-definite, hence have real, positive eigenvalues and real eigenvectors (basis). Such coefficient matrices are invertible, hence yield unique values of the unknowns in the corresponding algebraic systems.
5) When the integral form is variationally inconsistent, a unique extremum principle does not exist. In such cases the coefficient matrix in the algebraic system resulting from the integral form is not symmetric, hence is not ensured to be positive-definite. A unique solution of the unknowns in such algebraic systems is not ensured. A consequence of the non-positive-definite coefficient matrix in the algebraic system is that such coefficient matrices may have zero or negative eigenvalues or the eigenvalues and eigenvectors may be complex. In summary, variationally inconsistent integral forms must be avoided at all cost due to the fact that when using such integral forms a unique solution of the BVP is not ensured. In other words when obtaining solution of BVPs, variationally consistent integral forms are essential to ensure unique solutions of the BVPs.
6) The definition stated above can be applied to any BVP provided we can show existence of a functional corresponding to the BVP such that and are necessary and sufficient conditions for the existence of extremum of. A yielding unique extremum of is also a unique solution of.
7) We can show (see ref.  -   for details) that a) the integral forms resulting from GM/WF are VC only for self-adjoint differential operators when the bilinear functional is symmetric, b) the integral form resulting from LSP is VC for all three classes of differential operators, and c) integral forms resulting from the other methods of approximation (GM, PGM, WRM) for all three clases of differential operators are VIC.
8) We show that VC integral forms in designing finite element processes are essential for the derivations of the a priori error estimates.
9) In the following, we only consider GM/WF and LSP, keeping in mind that the integral form from GM/WF is VC only for self-adjoint differential operators and for LSP the integral forms are VC for all three classes of operators.
4. Variational Consistency of the Integral Form and the Best Approximation Property
In this section, we present some theorems and their proofs regarding GM/WF and LSP for the three classes of differential operators and establish best approximation property of GM/WF for self-adjoint operators and LSP for all three classes of operators.
4.1. Galerkin Method with Weak Form (GM/WF): Self-Adjoint Operators
In this section, we revisit main steps of GM/WF for self-adjoint operators. Let
be a boundary value problem in which the differential operator A is symmetric and its adjoint (i.e. the differential operator A is self adjoint). Based on fundamental lemma of calculus of variations we can write the following integral form   :
in which on if (given) on. v is called test function, hence is admissible in (2). When in (2), the integral form (2) is called integral form in Galerkin method. Since A is self adjoint, the BVP (1) only contains even order derivatives of. We transfer half of the differentiation from to v using integration by parts in the first term in (2) and collect those terms that contain both and v and define them collectively as and those that contain only v and define them as, hence we can write the following.
Each term in contains both and v but more importantly the orders of derivatives of and v in each term is same (i.e. is symmetric), thus
and since A is linear, is bilinear in and is linear in v. Hence in this case quadratic functional is possible and is given by
The integral form (3) is called weak form of (1). Due to the fact that (2) is integral form in Galerkin method, the weak form (3) is called integral form in Galerkin method with weak form (GM/WF). The quadratic functional has physical significance as explained in reference  . If (1) represents a BVP associated with
linear elasticity in solid mechanics, then is strain energy, is potential energy of loads and
is the total potential energy of the system described by (1).
Theorem 4.1. The weak form resulting from GM/WF for self adjoint differential operator A in in which is symmetric is variationally consistent.
Proof. Variational consistency of the weak form requires that there exist a functional such that gives the weak form, the Euler’s equation resulting from is the BVP, and yields unique extremum principle. Following Section 4.1 the existence of the functional is by construction (Equation (5))
If is differentiable in, then is a necessary condition for an extremum of. Using (due to GM/WF),
Since is symmetric, we obtain
The unique extremum principle (or sufficient condition) is given by
Hence, a unique extremum principle.
To show that the Euler’s equation resulting from the weak form is in fact the BVP, we just have to transfer differentiation back to f (or fh) from v in the weak form using integration by parts. This is rather straightforward. Thus, the weak form resulting from the GM/WF is variationally consistent. implies that a from the weak form minimizes,
Theorem 4.2. Let be a BVP in which A is self adjoint and let be weak form resulting from GM/WF in which and, then has best approximation property in -norm. That is, if, being theoretical solution, then
This implies that no element of V is a better approximation of than, the solution for the weak form when measured in as e is -orthogonal to every element v of V. This is called the best approximation property of GM/WF for self adjoint operators.
b) For any
Since we have
That is, error in in B-norm is the lowest compared to any other solution w. This completes the proofs of a) and b). □
4.2. GM/WF for Non-Self Adjoint and Non-Linear Operators
Theorem 4.3. Let in be a BVP in which A is a non-self adjoint differential operator. Let be all possible weak forms. Then all such integral forms are variationally inconsistent.
Proof. Let there exist a functional such that yield the weak form. Since A is non-self adjoint, is bilinear but not symmetric (i.e.), hence
is not possible because is not symmetric. Therefore, is not a unique extremum principle. Thus, the integral form with is VIC when the differential operator is non-self adjoint. □
Theorem 4.4. Let in be a BVP in which A is a non-linear differential operator and let be all possible weak forms of in. Then, all such integral forms or weak forms are variationally inconsistent.
Proof. Let there exist a functional such that yields the integral form. Since the differential operator A is non-linear, is linear in v but not linear in and is linear in v. Therefore, the second variation of I
is a function of due to the fact that is a non-linear function of. Thus, does not represent a unique extremum principle and, hence, the integral form or weak form is VIC. □
4.3. Least-Squares Method Based on Residual Functional: Self-Adjoint and Non-Self-Adjoint Operators
Theorem 4.5. The integral form in least-squares method based on residual functional is variationally consistent when the BVP is described by self adjoint differential operator.
Proof. Consider the BVP
in which A is self adjoint. Let be an approximation of over discretization of. Let
be residual function. We define residual functional
If is differentiable in then the necessary condition is given by.
is bilinear and symmetric and is linear.
Hence, the integral form resulting from is variationally consistent. □
Theorem 4.6. The integral form in least-squares method based on residual functional is variationally consistent when the BVP is described by non-self adjoint operator.
Proof. Since non-self adjoint operators are linear the proof of this theorem is same as that for self adjoint operators (Theorem 4.5) which are also linear. □
4.4. Least-Squares Method Based on Residual Functional for Non-Linear Operators
Theorem 7 Let in be a boundary value problem in which A is a non-linear differential oper-
ator. Let be approximation of in, discretization of and let be the re-
sidual function in. Then the integral form resulting from the first variation of the residual functional set to zero is VC provided and the system of non-linear algebraic equations resulting from are solved using Newton-Raphson or Newton’s linear method.
Proof. Since A is non-linear, E is a non-linear function of, hence is a function of.
If is differentiable in, then
Hence, is a necessary condition.
is not possible. Hence, we do not have a unique extremum principle. At this stage, the least-squares process is VIC. We rectify the situation in the following.
We note that based on the necessary condition must hold. Since is a non-linear function of, we must find a iteratively that satisfies. Let be an initial (or assumed) solution, then
Let be a change in such that
Expanding in Taylor series about and retaining only up to linear terms in (Newton- Raphson or Newton’s linear method)
Thus, in order for the coefficient matrix to be positive-definite,
This gives a unique extremum principle. The improved value of is given by
We choose such that. This is referred to as line search. With this approximation of, the integral form is variationally consistent. £
1) Justification for approximating is important to discuss.
2) We note that
Justification of is only necessary in the asymptotic range of convergence as the a priori error estimation only holds in this range, thus establishing best approximation property of LSM method in some norm is also only required in this range.
In the asymptotic range in the pointwise sense if the approximation spaces are minimally conforming to ensure that all integrals over are Riemann. When then, hence is valid. Further discussion on the validity of this approximation can be found in reference  .
Theorem 4.8. The integral form resulting from the least-squares method based on residual functional has best approximation property in L2-norm of E.
Proof. From Section 4.3, we have
For theoretical or exact solution, we have
Thus, or is orthogonal to (dual of). We note that
That is L2-norm of E obtained using fh is lowest out of all. Hence, LSP has best approximation property in L2-norm of E or. £
Theorem 4.9. A variationally consistent integral form has a best approximation property in some associated norm. Conversely, if an integral form has a best approximation property in some norm, then it is variationally consistent.
Proof. Proof of this theorem follows due to the fact that VC integral form in GM/WF has best approximation property in B-norm because is bilinear and symmetric. The integral form in the LSP is also VC but LSP has best approximation property in L2-norm of E. Both GM/WF and LSP are VC but have best approximation property in different norms. In both cases, VC integral form is not possible without best approximation property and the best approximation property is not possible without VC integral form. This is obviously due to the fact that they both require the functional to be bilinear and symmetric. As long as this holds, how is derived is not important. £
We note that
1) Since the integral forms for non-self adjoint and non-linear differential operators are VIC in GM/WF, the approximation from GM/WF does not have best approximation property in B-norm (Theorem 4.9).
2) Lack of best approximation property and lack of VC of the integral form resulting from GM/WF for non- self adjoint and non-linear differential operators are both obviously due to the fact that the functional in the weak forms is not symmetric.
3) In LSP for all classes of differential operator is minimized, therefore has best approximation property in E-norm
4) We note that variational consistency of the integral form holds for all choices of h, p, and k whereas the best approximation property only holds in the asymptotic range.
4.5. Integral Forms Based on Other Methods of Approximation
The integral forms used in finite element method based on Petrov-Galerkin method, Galerkin method, and weighted residual method are not considered as these always yield integral forms that are variationally inconsistent. Hence, when using these integral forms computations may not even be possible.
4.6. General Remarks
1) We have established that GM/WF yields VC integral form only for self adjoint operators when the functional in the integral form is symmetric and this method has best approximation property in B-norm.
2) LSP based on residual functional yields VC integral forms for self adjoint, non-self adjoint, and non-linear (in the asymptotic range) differential operators and has best approximation property in E-norm.
3) VC integral form implies best approximation property in some norm and vice versa.
4) Best approximation property is necessary in a priori error estimation (in the asymptotic range), as shown in subsequent sections.
5) In general, when using GM, PGM, WRM, etc. error estimation is not possible as in these methods the approximation of does not have best approximation property in any norm.
5. A Priori Error Estimates: GM/WF and LSP
We consider simple model problems to demonstrate the best approximation properties of GM/WF for self adjoint operators and LSP for linear operators and present derivations of the a priori error estimates and convergence rates when. These estimates are derived using model problems (as illustrations) and are then generalized for all BVPs.
5.1. Model Problem 1: GM/WF
Consider the following BVP:
GM/WF for (6) with BCs (7) gives
Let be the finite element approximation of, then we have
Using (8) and (9) and since, v in (9) is also in V and we have
Theorem 5.1. For any we have
we can choose as both, then
Using Cauchy-Schwarz inequality 
That is, in this case for the model problem (6) - (7) the derivative of has the best approximation property in L2-norm. Alternatively, has best approximation property in B-norm. This completes the proof. £
5.2. Model Problem 2: LSP
Consider the following BVP described by non-self adjoint differential operator.
LSP based on residual functional gives (for)
is approximation of over. This integral form is VC. Also for theoretical solution
Setting in (14)
Subtracting (13) from (15)
Using interpolant of (interpolant matches at end nodes);, let, then we have
We note that, hence
(due to (5.11)) (18)
Thus, (17) reduces to
Using Cauchy-Schwarz inequality
That is L2-norm of the derivative of error is bounded by the finite element interpolant. Using proposition 5.1 (shown subsequently) and (21), we can write
L2-norm of e; that is, for LSP is derived using Aubin-Nitsche trick (Oden and Carey  and Reddy  ). We consider details in the following.
Consider the same BVP (for),
Let. Assume that w is the solution of the second order differential equation
The finite element interpolant () satisfies
(using (5.19)) (26)
Using integration by parts and the fact that at and at and (orthogonal property)
Hence, (using Cauchy-Schwarz inequality)
We make the following remarks.
1) For a first order BVP, the rate of convergence of the L2-norm of the error in the finite element solution is proportional to and the rate of convergence of the L2-norm of the derivative of the error is proportional to h.
2) These estimates are same as those for a second order BVP when using GM/WF in which the integral form is variationally consistent.
1) The error estimates have been derived for a second order BVP using GM/WF in which the integral form is VC and the local approximation is linear () over an element. In case of LSP the BVP is first order ODE, the integral form is VC, and for local approximation.
2) We note that the integral forms in both cases are VC and contain only up to first order derivatives, hence the reason for same convergence rates of and even though in case of GM/WF the BVP is a second order ODE and in case of LSP it is only a first order ODE. This is rather significant to note that VC of the integral form and the highest order of the derivative in the integral form control the rates of convergence.
3) We need to extend these estimates for higher degree local approximation (i.e. p-level of “p”).
4) The order of approximation space k needs to be incorporated in the error estimates.
5.3. Proposition and Proof
Proposition 1 Let the theoretical solution of (6) - (7) be at least of class and let be approxi-
mation of over the discretization of in which is an element e. Let h be the characteristic length of such that. Let be interpolant of that agrees
with at the nodes [i.e.,]. Then
c) When (31) and (32) hold, the following hold
Proof. Consider linear and (i.e.).
For an element e let be the interpolation error between f and inter-
polant. Since vanishes at and of an element e, by virtue of Rolle’s theorem there ex- ists at least one point between and at which. Then for any x
Since is linear, implies that
Applying Cauchy-Schwarz inequality to (37) and using (38)
Hence for, we can write
This proves (32).
Applying Cauchy-Schwarz inequality
Substituting from (43) into (47)
Using (44), (49) reduces to
This proves (31):
Substituting from (40) into (51)
This proves (33).
Using Cauchy-Schwarz inequality
Substituting from (40)
Using (56) and (68), we have
This proves (35).
From theorem 5.1, we have
Hence using (74), (75), (33), and (34), we finally have
5.4. Proposition and Proof
Proposition 5.2. The derivation of the error estimates in proposition 1 are presented for model problem 1 using GM/WF in which the operator is self adjoint, hence the weak form is VC. In model problem 2 (Section 5.2) the differential operator is non-self adjoint and the error estimates are derived for LSP in which the integral form is also VC. In this section we consider a more general approach of deriving a priori error estimates for arbitrary degree of approximation p only based on the assumption that the integral form is VC.
If the integral form resulting from a method of approximation is VC, then the following hold.
In (81), is the highest order of the derivative in the differential operator A. The constants, , , and do not depend upon h and p.
Proof. Consider one dimensional BVP:
Let be discretization of in which is an element e. Let be finite element approximation of over such that in which is local approximation of over.
Let and be interpolants of of class over and such that at the nodes agrees with the theoretical solution. Thus, error estimation reduces to estimating error between and over an element of length. When is analytic, it can be expanded in Taylor series in over about some point j.
Consider a over of degree p resulting from a VC integral form (hence, ensuring well-behaved solution), then the local approximation at the same point j can also be written as (assuming agrees with up to degree of p),
Subtracting (84) from (83), we obtain
Using (83)-(90), it is rather straightforward to establish
and by induction
Using (90) and (92), we can establish that
1) The estimates in (92) and (93) apply to VC integral forms regardless of the method of approximation. Thus, these estimates hold for GM/WF for self adjoint operators and also hold for LSP for all three classes of differential operators.
2) The local approximations used are always of class.
3) The constants, , and do not depend on h and p.
4) The estimates (92) and (93) apply to all finite element processes in which the integral form is variationally consistent.
5) From (92) and (93), we note that progressively increasing order of derivatives of the finite element solution converge progressively slower. That is
and so on. Likewise
and so on. From (95) and (97), we note that convergence rate in H1-norm is controlled by the convergence rate of the seminorm (i.e. highest order derivative in). This property holds universally for all operators and integral forms as long as they are variationally consistent.
6) When examining, if the highest order of derivative in E is 2m, then we have
5.5. Convergence Rates
In this section, we present details of the convergence rates of various error norms for finite element solutions obtained using GM/WF for self adjoint operators when is symmetric and LSP for all three classes of differential operators. We recall that when the integral form has best approximation property in some norm, hence is variationally consistent, we have the following a priori error estimate (derived for 1D BVP, Equation (93)) in the asymptotic range:
Taking log of both sides
We note that (101) is the equation of a straight line (when we use equality) in xy-space in which m is the slope and C is the y-intercept. That is, if we plot versus on an xy-plot, then we obtain a straight
line whose slope is and intercept is. Slope is called the rate of conver-
gence of. Higher values of imply faster convergence of to measured in. Equation (101) can be expressed in terms of total degrees of freedom which is perhaps more appealing in applications as dofs are more easily accessible than characteristic length or size “h” of the discretization. As the discretization is refined, the characteristic length h reduces and the total dofs increase, thus dofs are inversely proportional to h,
Using in (100) and since we obtain
We keep in mind that dofs in (104) are purely due to uniform mesh refinement. Thus, in order to determine convergence rate of for finite element processes with VC integral forms we need to plot versus and determine the slope of this curve, which is the convergence rate in the asymptotic range. For a sequence of fixed discretizations, as p increases convergence rate increases linearly.
I) We note that requires knowledge of theoretical solution, which may not be possible to determine for a practical application.
II) When the approximation space is minimally conforming or of higher order (i.e. for integrals over to be Riemann or if the Lebesgue integrals over are accepta-
ble), then in which the residual function can be computed using over
using and taking log of both sides
The dofs in (106) are also due to uniform h-refinement. Since does not require theoretical solution,
it can be computed using. Equation (106) can be used for any application without the knowledge of
theoretical solution as long as the approximation space is minimally conforming or of higher order than minimally conforming.
5.6. Proposition and Proof
Proposition 5.3. When local approximation is of progressively higher order global differentiability, that is, in scalar product spaces for progressively increasing k, the accuracy of the finite element solution progressively improves. In this proposition we answer two important questions:
1) Dependence of the a priori error estimates derived so far for local approximations of class on the order of the space k; that is, if the local approximations are in space how do the a priori estimates change and the influence of k on convergence rate.
2) The influence of the order k of the approximation space on the accuracy of the finite element computation.
Of course (1) and (2) are interdependent because when we have determined (1), the assessment of accuracy may be inferred from it.
The following a priori error estimate derived for 1D BVPs using p-version local approximation can be extended when the local approximations are in spaces using the following two important considerations or properties of local approximations in spaces:
We consider a simple illustration of a 1D discretization using three node p-version hierarchical local approximation finite elements in space. Let m be the number of elements in the discretization, then the total degrees of freedom (dofs) are given by
p is the degree of local approximation (assumed same for all elements of the discretization). Let us choose a p-level, say nine (9) and a one hundred (100) element discretization, then using (108) we can determine total degrees of freedom for corresponding to the local approximations of class, , ∙∙∙,.
From Table 1, we observe that as k increases (i.e. progressively higher order local approximations) the total degrees of freedom are progressively reduced. This is a significant property of the higher order local approximations. From Table 1, we note that for, 901 dofs are reduced to 802 in the case of without much effect on accuracy of the solution. The same holds for progressively higher order local approximations, , and so on; that is, the dofs continue to reduce with progressively increasing order of space without much effect on the accuracy. This behavior of the solution accuracy (say in) holds regardless of the type of differential operator and regardless of the method of approximation used to oconstruct the integral form as long as the integral form is variationally consistent. Figure 2 shows typical plots of versus dofs at for solutions of classes, , and. Typical points A, B, C correspond to solutions of classes, , and for the same discretization and p-level (i.e. fixed h and p), with almost same value of but progressively reducing degrees of freedom. In view of the a priori error estimate (107) we can conclude that if h and p are fixed, then the dependence of the a priori estimate on k lies in, [i.e. in (107) and in (98)].
If we choose and if is the interpolant that agrees with at the inter-element
nodes of the discretization; that is, agree with corre-
sponding to local approximations of classes respectively, then in the consideration of the a priori error estimates we only need to consider () (i.e. interior of the element). This suggests that in a priori estimate in (98) (for example) only depends on k. That is, (98) holds when except that.
Table 1. Total dofs for a 100 element discretization at p = 9 for different values of the order of space k.
Figure 2. Typical versus dofs behavior for k = 1, 2, and 3.
1) From properties I and II it is clear that when, in the error estimate (98) the terms and likewise the terms in (107) remain unaffected. Only the coefficients and show mild dependence on k.
2) In view or properties I and II, we conclude that if solutions of a BVP are to be computed for a fixed number of degrees of freedom, then progressively more degrees of freedom can be added to solutions of class so that the total dofs in all classes of solutions are the same. We recall that with same values of h and p in the solutions of class (Figure 2) remains virtually the same for all classes, however the total dofs are progressively reduced.
The consequence of adding more dofs (through h-refinement) with progressively increasing order of space so that in each case the dofs match with C0 solutions is clearly improved accuracy of reflected by progressively reducing. Clearly, in doing so the convergence rate or is not affected. Thus, versus log(dofs) graphs for solutions of classes in the asymptotic range are parallel to each other but with progressively lower values of as shown in Figure 2. That is graph for C1 is below C0 and that of C2 is below C1 and so on, but they are all parallel.
5.7. General Remarks
2) We remark again that the rates only hold in the asymptotic range.
3) The integral forms must be VC so that the best approximation property of holds in some norm in order for these estimates to remain valid. The estimates derived here hold for: (a) GM/WF for self adjoint operators when the bilinear functional is symmetric and (b) for LSP based on residual functional for all three classes of differential operator.
4) In case of GM/WF for non-self adjoint and non-linear operators, the a priori estimates derived here do not hold. In case of such operators the functional generally consists of a symmetric part and a non-symmetric part. With sufficient mesh refinement if we can ensure that the behavior is dominated by the symmetric part, then the estimates derived here hold in the range of calculations when asymptotic range is realized. We illustrate this aspect through model problems presented in a later section.
6. Computations of a Priori Error Estimates and Convergence Rates
In this section, we present numerical studies related to the computation of a priori error estimates and convergence rates for BVPs described by self adjoint, non-self adjoint, and non-linear differential operators in which VC integral forms are constructed using GM/WF for BVP described by self adjoint differential operators and using LSP for BVPs described by all three classes of differential operators.
6.1. Model Problem 1: Self-Adjoint Operator, 1D Diffusion Equation
We consider the 1D steady-state diffusion equation.
If we choose, , , , then the theoretical solution or is given by
a) GM/WF: The differential operator is linear and. The integral form using GM/
WF is given by (over)
is bilinear and symmetric and is linear. The integral form (weak form) is VC due to the fact that
hence a solution from (113) minimizes.
b) LSP based on residual functional
I) LSP using higher order system (without auxiliary equation)
Using (109), referred to as the higher order differential equation or system, if we let be approximation of over then
Hence, the integral form (117) is variationally consistent.
II) LSP using first order system
Let, hence (109) can be written as a system of two first order equations.
LSP for (119) follows standard procedure. Let and be approximations of and, then
Hence the integral form (121) resulting from LSP is variationally consistent.
I) All other methods of approximation yield VIC integral forms, hence are not considered as in such cases the a priori error estimates and the convergence rates are not valid.
II) In the numerical studies, we consider GM/WF and LSP for higher order as well as first order system of differential equations describing BVPs.
In this section, we present numerical studies for the integral form (112) for, discretization of
. We consider uniform discretizations employing three node p-version hierarchical 1D elements with local approximations in scalar product space. We begin with two element uniform discretization and perform uniform mesh refinement containing 4, 8, 16, ...elements. Since in this model problem the theoretical solution is known, various error norms can be computed. We note from the description of the BVP (109) that in this case (highest order of the derivative in the BVP) and the integral form resulting from GM/WF contains only up to first order derivatives of the dependent variable and the test function. We consider computations using solutions of class, , and at different p-levels with uniform mesh refinements. Computed results for solution of class are shown in Figure 3. The integral form is VC and, the computed solution, has best approximation property in -norm.
In this case, the following a priori error estimates hold (Proposition 5.2):
For this BVP, and q depends on the type of norm. Figure 3 also shows the theoretical values of the convergence rates of various error norms for solutions of class at p-levels of 2 and 5. Graphs of the log of error norms versus log of dofs for these solutions are shown in Figure 3. We note that due to smoothness of the theoretical solution even the two element discretization yields the error norms in the asymptotic range; that is,
Figure 3. versus log(dofs) for solutions of class C0 (GM/WF, model problem 1, and 5).
pre-asymptotic and onset of asymptotic ranges in these solutions do not appear in Figure 3. All computations are in the asymptotic range, hence onset of post-asymptotic and post-asymptotic ranges are also absent. Calculated convergence rates are in perfect agreement with the theoretical convergence rates calculated using (123). We note that in and error norms the integrals over are Lebesgue, but the norms are well- behaved due to smoothness of.
Figure 4 shows plots of log of various norms and seminorms versus log of degrees of freedom at and 5 for solutions. The computed convergence rates of various error norms and comparison with the theoretical convergence rates obtained using (123) are shown in Figure 4. The agreement is perfect. Here we note that in computing and, the integrals over are Lebesgue but error norms are well-behaved due to smoothness of. Also, nearly all computations shown in Figure 4 are in the asymptotic range, except for the last point for at and at.
Log of various error norms and seminorms versus log of degrees of freedom for solutions of class at and 7 are shown in Figure 5. Since for, all integrals in all error norms are Riemann over the discretization. Computed error norms using (123) and comparison with the computed convergence rates of error norm are also shown in Figure 5. We observe perfect match between the theoretical values and the computed values. Except for the last point shown in Figure 5 for at, all other computed results are in the asymptotic range due to smoothness of.
Figure 6 shows plots of (or) versus log of dof for solutions of class, , and
() at. All three graphs of versus log of dofs for are parallel, confirming that the convergence rate of is independent of the order k of the approximation space. We note that graph for appears below and the graph for is below confirming that for given dofs, as the order k of space is increased, the error in the computed solution (measured in -norm) decreases without affecting the convergence rate.
The BVP in this model problem is described by a second-order differential operator (); hence, corresponds to minimally conforming space for which the integrals are always Riemann. However, due to smoothness of, when (solutions of class) in which case the integrals over are Lebesgue, the solution is expected to converge weakly to class. Next we consider solutions of class
Figure 4. versus log(dofs) for solutions of class (GM/WF, model problem 1, and 5).
Figure 5. versus log(dofs) for solutions of class (GM/WF, model problem 1, and 7).
Figure 6. versus log(dofs) for solutions of classes C0, C1, and C2 at (GM/WF, model problem 1).
with (minimum). Numerical solutions are computed for uniform mesh refinements beginning with a two-element uniform discretization. For each discretization we calculate and. Since the rate of convergence of is controlled by, we expect the convergence rates of and to be nearly same. We clearly see this in Figure 7. Graphs for and are parallel, confirming the same convergence rates of (or) for solutions of class () and class (). The convergence rate in the case of is, whereas in the case of is. Plots in Figure 7 confirm that rate of convergence of error norms is independent of the order of the approximation space.
6.1.2. LSP, Higher-Order System (No Auxiliary Equation)
In this study, we consider finite element formulation of model problem (109) using least-squares process based on residual functional. We consider solutions of class as well as. In case of solutions integrals over are Lebesgue whereas for solutions of class the integrals are Riemann. Figure 8 shows plots of and versus log of dofs for p-levels of 3 and 5 calculated using uniform mesh refinement. Calculated convergence rates of various error norms are also shown in Figure 8. The theoretical convergence rates of various error norms and a comparison with calculated convergence rates is also shown in Figure 8.
Agreement between theoretical and calculated values is excellent. Here also we observe absence of pre- asymptotic and onset of asymptotic ranges due to smoothness of the theoretical solution. Some graphs for significant refinement show appearance of post-asymptotic (or onset of post-asymptotic) range.
Similar studies for solutions of class are shown in Figure 9 for;,; and, norms at p-levels of 5 and 7. The computed convergence rates of the error norms are in perfect agreement with theoretical rates calculated using (), shown in Figure 9.
Graphs of and (or) versus log of dofs for solutions of class and obtained using uniform mesh refinement are shown in Figure 10. Calculated convergence rates are also shown in Figure 10. For error norm the theoretical rate is () whereas for it is (). The theoretical convergence rates are in perfect agreement with those calculated using graphs in Figure 10. We note that
Figure 7. or versus log(dofs) for solutions of classes C1 and C at (GM/WF, model problem 1).
Figure 8. versus log(dofs) for solutions of class C1 (LSP, model problem 1, and 5).
Figure 9. versus log(dofs) for solutions of class C (LSP, model problem 1, p = 5 and 7).
Figure 10. or versus log(dofs) for solutions of classes C1 and C2 at p = 5 (LSP, model problem 1).
and graphs for same p-level (5) are parallel to each other and graph is below, confirming that the convergence rates for and are same (i.e. independent of k), the order of space, but for the solution has better accuracy compared to.
Remarks. Numerical studies for LSP using auxiliary equation (i.e. a first-order system) are not presented for this model problem but will be presented for the next model problem, 1D convection-diffusion equation.
6.2. Model Problem 2: Non-Self-Adjoint Operator, 1D Convection-Diffusion Equation
We consider 1D convection-diffusion equation described by non-self adjoint operator for computing a priori error estimates and convergence rates and compare them with their theoretical values,
We consider. Theoretical solution of (124)-(125), finite element solution using GM/WF and LSP using higher order system (no auxiliary variables) and using first order system (using auxiliary variables) is
given in  . For this BVP, the operator is linear but, hence the in-
tegral form from GM/WF is VIC, but LSP for higher order as well as first order system of differential equations is VC.
a) GM/WF: The integral form of (124)-(125) is given by (for)
is bilinear but not symmetric and
does not yield a unique extremum principle. Hence, the integral form (127) is VIC.
b) LSP based on residual functional:
I) Higher order system (without auxiliary equation)
In this case we use (124) without introducing auxiliary equation, that is without reducing (124) into a first order system of equations. Let be approximation of over, then
Hence, the integral form (132) is VC.
II) First order system
Let, hence (124) can be written as
LSP for (134) follows standard procedure (parallel to Equations (119)-(122)). Details are straightforward. See  for many model problems of similar type.
1) Since GM/WF yields VIC integral form and does not have best approximation property as the operator A is not self adjoint, hence the a priori error estimates derived in earlier sections using best approximation property in B-norm do not hold in this case. Nonetheless we present numerical studies for GM/WF for this model problem to illustrate some important aspects of error norms in a later section.
2) Integral form derived using LSP is VC and has best approximation property in E-norm or, I being residual functional, hence the same a priori error estimates derived for LSP for self adjoint operators hold here as well.
6.2.1. LSP: First Order System
Domain is discretized using 3-node p-version 1D elements of higher order global differentiability into 2, 4, 6, ...element uniform meshes. The solutions are computed using finite element formulation based on LSP for first order system of equations. Solutions of classes, , and are considered at different p-levels. For this problem the a priori estimates (123) hold as well with due to the fact that it is a first order system of equations. LSP has best approximation property in E-norm and the integral form is VC,
First, we consider solutions of class at and 5 and with. Due to local approximation and the first order system, integrals over are Lebesgue but due to smoothness of weak convergence of computed to class is expected. Figure 11 shows plots of log of various error norms versus log of the dofs at and 5. Details of the studies are also given in Figure 11. Theoretical convergence rates are in perfect agreement with the calculated rates shown in Figure 11. As p-level is increased from 2 to 5 convergence rates also show increase by 3 at compared to those at. We clearly observe pre- asymptotic, onset of asymptotic, and asymptotic ranges in all cases. For also observe onset of post- asymptotic and post-asymptotic ranges. We note that even though LSP does not have best approximation property in B-norm but due to the fact that the integral form is VC, the convergence rate of LSP (135) is same as those of GM/ WF for self adjoint operators (123).
As p-level is increased convergence rate increases proportionately. Derivatives converge more slowly than functions, hence convergence rate of is one order lower than that of or. Since the convergence rate of is dominated by the first derivative, and have same convergence rates (also clear from (135)).
Solutions of class at and 5 are considered here. Results obtained using uniform mesh refinement are given in Figure 12. Plots of log of various error norms versus log of dofs and calculated convergence rates are shown in Figure 12 and are compared with theoretical convergence rates. Calculated and theoretical convergence rates are in perfect agreement. We note that when the convergence rates of error norms are independent of k (i.e. at), solutions of class and have same convergence rates for the same norm, confirming that convergence rates of the error norms are not a function of k, the order of the approximation space. Pre-asymptotic, onset of asymptotic, and asymptotic ranges are clearly observed in Figure 12.
Solutions of class at and 7 are considered next. Results obtained using uniform mesh refinement are shown in Figure 13 and are compared with the theoretical convergence rates obtained using (135). Once
Figure 11. versus log(dofs) for solutions of class C0 (LSP, model problem 2, first order system, and 5,).
Figure 12. versus log(dofs) for solutions of class C1 (LSP, model problem 2, first order system, and 5,).
Figure 13. versus log(dofs) for solutions of class C2 (LSP, model problem 2, first order system, and 7,).
again the agreement is perfect. Again, we note from Figure 12 and Figure 13 that at the convergence rates are independent of k. In this case, the integrals in the computations of the error norms are always Riemann.
Figure 14 shows plots of and versus log of dof for solutions of class and at. Since the differential operator has the highest derivative of order 2, the convergence rate of is expected to be same as that of or for both classes of solutions. This is confirmed in Figure 14. Convergence rate in case of and solutions are same (4 in this case), but solutions have better accuracy for a given dofs, confirming again that convergence rates of error norms or residual functional are not a function of the order k of the approximation space. Calculated rates are in perfect agreement with the theoretical rates. Figure 15 shows plots of log of versus log of dofs for solution of classes, , and at. We observe same convergence rates for, 2, and 3 but better accuracy of the solution with progressively increasing k. These rates for LSP match perfectly with GM/WF for self adjoint operators due to the fact that in both cases the integral forms are variationally consistent. This proves again that the best approximation property in B-norm is not a requirement for establishing convergence rate. It is the variational consistency of the integral form that matters. Clearly the LSP does not have best approximation property in B-norm, yet has same convergence rates as GM/WF for self adjoint operators due to the fact that in both cases the integral forms are variationally consistent.
Since the differential operator is non-self adjoint the GM/WF will yield VIC integral form in which is nonsymmetric and we lose the best approximation property in B-norm. Nonetheless we conduct some numerical experiments to monitor convergence rates of various error norms. First, we note that GM/WF in this model problem will yield the following element equations for an element e (when and). See reference  .
Figure 14. versus log(dofs) for solutions of classes, , and at (LSP, first order system, model problem 2).
Figure 15. or versus log(dofs) for solutions of classes and at (LSP, first order system, model problem 2).
and the assembled equations for discretization are
in which and, are due to assembly of and for. As shown in reference  , is due to convection term (i.e.) and is due to diffusion (i.e.); is
nonsymmetric with zeros on the diagonals after and BCs are imposed, thus if is large, the contribution of to is almost insignificant compared to the contribution of and the computations using (137) will fail. On the other hand if the discretization is sufficiently refined, the con-
tribution of to overshadows that of and the behavior will be dominated by (i.e.
term in the differential operator). When this happens the integral form from GM/WF will behave like a VC inte-
gral form as it is primarily due to term in the differential operator which is self adjoint, hence the
convergence rates of various error norms will be similar to GM/WF for self adjoint operator.
For numerical experiments, we consider and. For the solution gradients are more isolated near and are higher in magnitude compared to. We consider solutions of class at for both Peclet numbers. Progressively refined uniform discretizations are used for computing solutions and error norms. Figure 16 shows error norms versus dof plots for solutions of class, for. We note that due to smoothness of the solutions, the asymptotic range in which dominates is quickly achieved, and the computations succeed for meshes of 16 elements or more. In this range calculated convergence rates match perfectly with the theoretical rates for self adjoint operator. In this range the BVP re-
duces to as the contribution of term in this range is insignificant. For meshes with 16 ele-
ments or fewer the calculated solution from (137) does not satisfy (137) when substituted in them, implying lack of equilibrium due to spuriousness of the computed solution. Figure 17 shows similar graphs for. The computations fail for discretizations resulting in (meshes coarser than 256 elements) where equilibrium is not achieved, that is, calculated solution from (137) does not satisfy (137) when substituted into the equations. This is due to VIC nature of the integral form resulting from GM/WF. Correspondingly, the values of the error norms for the failed discretizations grow out of control. When (discretization contains 256 elements or more), contribution becomes insignificant and the BVP behaves like
, hence the asymptotic range is observed with calculated convergence rates of the indicated error
norms of 3.7, 2.9, 2 are achieved compared to their theoretical values of 4, 3, 2 for self adjoint operators, rather amazingly good performance for VIC integral form.
When performing the error computations for higher than 1000 with uniform mesh refinement of 2, 4, ...elements failure of computations occurs when dominates the total as expected.
6.2.3. LSP: Higher Order System (Without Auxiliary Equation)
In this study, we consider 1D convection-diffusion equation (124) without converting it to a system of first order equations through the use of auxiliary equation. In this case, is minimally conforming approximation space if the integrals over are to be Riemann. For the integrals over are Lebesgue and (solutions of class) is not admissible.
Error norms are computed for progressively refined uniform discretizations for (solutions of classes and) at p-levels of 3 and 5 for and and 7 for. Plots of error norms versus dof for
Figure 16. versus log(dofs) for solutions of class (GM/WF, model problem 2, ,).
Figure 17. versus log(dofs) for solutions of class (GM/WF, model problem 2, ,).
solutions of classes and and the calculated convergence rates and comparisons with the theoretical values are shown in Figure 18 and Figure 19. We note that the highest order of the derivative in the mathematical model () in this case is 2 as the convection-diffusion equation is not reduced to a first order system using auxiliary equations. Theoretical convergence rates are overall in good agreement with the calculated convergence rates confirming importance of the variational consistency of the integral form. In solutions of both classes, the convergence rate of is higher than predicted for. In this case whereas in case of first order system derived using auxiliary equation, thus the first order system has higher convergence rate of in the LSP.
Figure 20 shows plots of versus dofs and versus dofs for solutions of classes and at. Since the highest order derivative is two in the differential operator, the convergence rate of is same as that of. and solutions have same convergence rates but solutions have better accuracy for a given dofs, confirming that the convergence rates of error norm and residual functional are not a function of the order k of the approximation space. Thus, for higher order system we also observe that the rates for LSP match with GM/WF for self adjoint operators due to the fact that in both the integral forms are VC even though the two methods of approximation have best approximation property in different norms.
6.3. Model Problem 3: Non-Linear Operator, 1D Burgers Equation
We consider 1D Burgers equation described by a non-linear operator (see reference  ) to compute a priori error estimates and convergence rates of various error norms and compare them with their theoretical values,
For the studies presented in the following sections, a value of is used. Theoretical solution of
Figure 18. versus log(dofs) for solutions of class C1 (LSP, model problem 2, higher order system, and 5,).
Figure 19. versus log(dofs) for solutions of class C2 (LSP, model problem 2, higher order system, and 7,).
Figure 20. or versus log(dofs) for solutions of classes C1 and C2 at (LSP, higher order system, model problem 2).
(138) and (139) and finite element solution using GM/WF and LSP (higher order and first order systems) are given in reference  . Some details are given in the following as a review. It is shown  that in this case
which is a function of, hence non-linear. The GM/WF yields VIC integral form. The in-
tegral form from the LSP is VC with minor adjustments (see theorem 7) of little consequence but immense benefit as they yield variational consistency of the integral form.
a) GM/WF: The integral form of (138) and (139) over is given by
Functional is linear in v but not linear in and is obviously not symmetric.
is obviously not, , or , hence the integral form (141) is VIC.
b) LSP based on residual functional: These can be constructed in two alternate ways, as a higher order system (138) or by recasting (138) as a system of first order equations. [(I)]
I) Higher order system
and residual functional is given by
The necessary condition is satisfied by calculating a solution using Newton’s linear method. See reference  for full details. The integral form in this case is variationally consistent.
II) First order system
Let, then (138) reduces to
LSP for (147) is described in detail in reference  and is omitted here. This integral form is also VC.
6.3.1. LSP: Higher-Order System (Without Auxiliary Equation)
For this model problem we only present studies related to convergence rates of various error norms using (138) (i.e. without recasting it as a system of first order equations). As in other problems is discretized using uniform meshes of 2, 4, 8, ...3-node p-version higher order global differentiability elements and the solutions are computed using finite element formulations based on GM/WF and LSP. In case of LSP, since the integral form is VC the same convergence rate estimates hold as in (135):
In this BVP,. Since the differential operator has derivative of up to second order, the minimally conforming space in this case is for the integrals over to be Riemann and the integrals are in Lebesgue sense when. is not admissible. Figure 21 and Figure 22 show plots of various error norms versus dofs for solutions of class and as well as calculated and theoretical convergence rates.
First, we note from Figure 21 and Figure 22 large pre-asymptotic and onset of asymptotic ranges. The asymptotic range is rather limited, due to which accurate computation of convergence rates is difficult. Nonetheless we observe that for most error norms the theoretical and calculated convergence rates are in good agreement. Once again, we observe that due to VC integral form in LSP for nonlinear operators the convergence rate estimates for GM/WF for self adjoint operators and the same for LSP for linear operators hold here, again confirming the significance and importance of VC integral forms.
Figure 23 shows plots of versus dof and versus dof for solutions of class and at. Since the differential operator is second order operator, the convergence rate of is same as that of for both and local approximations. However, solutions have better accuracy for a given dofs. We clearly observe that the convergence rate is not a function of k, the order of approximation space. Calculated convergence rates of and are the same and are in exact agreement with the theoretical convergence rates.
Since the differential operator is non-linear the integral form from GM/WF is VIC. is not bilinear and is
Figure 21. versus log(dofs) for solutions of class C1 (LSP, model problem 3, higher order system, and 5,).
Figure 22. versus log(dofs) for solutions of class C2 (LSP, model problem 3, higher order system, and 7,).
Figure 23. or versus log(dofs) for solutions of classes C1 and C2 at (LSP, higher order system, model problem 3).
not symmetric, hence we lose best approximation property of the GM/WF in -norm. GM/WF will yield the following form of the assembled equations for (when and B) assuming uniform discretization ():
in which and. is symmetric. is due to and is due to term in the differential equation. Furthermore has zeros on the diagonal after
and boundary conditions are imposed, thus if is large, the contribution of to is almost insignificant and the computations using (149) will fail. On the other hand if the discretization is sufficiently refined then contribution of to overshadows that of and the solution behavior
will be dominated by (i.e. term in the differential equation). When this happens the integral form
from GM/WF will behave like a VC integral form and the convergence rates of various error norms will be same as those of GM/WF for self adjoint operator.
For numerical studies, we consider. Uniform mesh refinement is carried out for solutions of class at. Figure 24 shows plots of error norms versus dofs. We note that for discretizations coarser than 128 elements the error norms correspond to erroneous computed solutions in which equilibrium condition is violated for the assembled equations. For finer discretizations (128 elements or more) asymptotic range is observed. In this range discretization is sufficiently refined so that the integral form is dominated by the diffusion term. Calculated convergence rates (of;,;,) 3.7, 3, and 2 are in close agreement with the theoretical convergence rates 4, 3, 2. In this study for computations failed for discretizations coarser than 128 elements where equilibrium was not achieved.
Figure 24. versus log(dofs) for solutions of class C1 (GM/WF, model problem 3, p = 3,).
7. A Posteriori Error Estimation and Computation
7.1. A Posteriori Error Estimation
A posteriori error estimation refers to estimation of errors in the computed solution. The primary purpose is to be able to devise some element-wise measures as well as in the whole discretization that quantify the errors in the computed solution as well as provide some guidance on the portions of the domain where the computed solution needs to be improved. Based on these measures one could design mesh refinement, p-level change, etc. strategies that result in the desired accuracy of the computed solution. This process of changing h, p, and possibly k based on measures estimated using the computed solution is referred to as adaptive process (i.e. we adapt h, p, and k as dictated by the current state of the solution and a posteriori error estimators or indicators).
During the development of finite element technology and even now, solutions of class have been used predominantly. The local approximations of class result in interelement discontinuity of the derivatives normal to the interelement boundaries. When the solutions of the BVPs are smooth, these interelement jumps in the derivatives are reduced upon h, p refinements and we say solutions converge weakly to class. The a posteriori error estimations largely exploit the interelement discontinuities of the derivatives inherent in local approximations. We note the following.
1) When the local approximations are considered in higher order spaces, the a posteriori error estimates used currently that are derived based on local approximations are meaningless as for higher order global differentiability local approximations the interelement jumps in the derivatives of the solutions used currently do not exist.
2) The local approximations can only be used in a system of first order differential equations to calculate the residuals and residual functionals over as well as over, but only in Lebesgue sense. For higher order BVPs such computations are not possible with local approximations of class. Even though the residual functional over and are true measures of how well the local approximation satisfies the BVP, the emphasis has been largely on a posteriori error estimation, primarily due to the insistence on the use of local approximations.
3) Our view is that in a finite element computational framework the physics of the BVP must be preserved and in such a framework, once a finite element solution has been calculated, the computational framework must permit a posteriori computations of any desired measures otherwise the computational framework is deficient.
7.2. A Posteriori Error Computation
As mentioned in Section 7.1, the computational framework must be designed such that it permits a posteriori computations of all desired measures that are necessary and meaningful in adaptivity. Minimally conforming spaces play a crucial role in accomplishing this. We present details in the following. Let
be a boundary value problem in which the differential operator may be self adjoint, non-self adjoint, or non- linear. Let be the highest order of the derivative of in (150). Let and be approximations of over and. The approximation is assumed to be computed from any of the methods of approximation in which the integral forms may be VC or VIC. Let
The approximation space is minimally conforming ensuring that the integrals over are Riemann. Using (150) and (152), we can define residual functions
where n is the number of differential equations in (150). Let
We define residual functionals I and over and by
Since, we can write (using (155) and (156))
If is the theoretical solution then
over and each. Minimally conforming space ensures that integrals over are Riemann, hence proximity of to zero (theoretical value of functional I; that is,) is a measure of error in the solution over. When, , thus for each in, implying that differential Equation (150) are satisfied in the pointwise sense. Thus, the main steps in a posteriori error computation can be summarized in the following.
1) Choose minimally conforming space thereby ensuring integrals over in Riemann sense.
2) Regardless of the method of approximation to construct integral form in the finite element process, the following steps are possible and help in quantifying solution error. Calculate finite element solution and hence.
3) Calculate, for each element e with domain of the discretization.
4) Calculate for.
5) When (or lower), is reasonably converged to for the h, p, and k employed, hence no need for adaptive refinements.
6) When, we examine values for individual elements of to determine which elements have values larger than a certain threshold value. These elements can be considered for adaptive refinement (h or p or both) depending on the strategy adopted. Some of these are presented in the next section.
7) In this approach, a posteriori error estimations derived and used presently (of little value in higher order spaces) are eliminated altogether.
8) Errors in the computed solution are quantified without the knowledge of theoretical solution and there is built-in adaptivity due to for individual elements. The elements with values larger than a threshold value are candidates for refinement.
9) Adaptive processes based on values for elements of descretization are presented in the next section.
8. Summary and Conclusions
In this paper, we have considered a priori and a posteriori error estimations, a posteriori error computation, and convergence rates of the finite element computations for BVPs described by self-adjoint, non-self-adjoint, and nonlinear differential operators. Concepts of h-, p-, and k-versions and h-, p-, and k-convergences in finite element processes are presented and discussed. It is shown that a desired measure of error norm or residual functional versus degrees of freedom behavior has distinct features that can be classified as pre-asymptotic range, onset of asymptotic range, asymptotic range, onset of post-asymptotic range, and post-asymptotic range. The significance and importance of these ranges in finite element computations has been discussed and demonstrated through three model problems described by self adjoint, non-self adjoint, and non-linear differential operators.
The a priori estimates only hold in asymptotic range and their derivation in the currently published literature are only valid for self adjoint operators in GM/WF when functional is symmetric, thus GM/WF has best approximation property in B-norm. New work presented in this paper establishes correspondence between best approximation property of an integral form in some norm and the variational consistency of the integral form and demonstrates that when one exists the other is ensured. Thus, for establishing a priori error estimates, variational consistency becomes an essential property of the integral form. Of course best approximation property in some norm if it exists is equally good as best approximation property and variational consistency of integral form can not exist without each other, i.e. they co-exist. In case of GM/WF, VC integral form is possible for self adjoint operator and in case of LSP VC integral form is possible for all three classes of differential operators, hence a priori estimates for GM/WF for self adjoint operators and a priori estimates for LSP for all three classes of operators can be derived. The derivation of a priori error estimates presented in proposition 5.2 applies to GM/WF for self adjoint operators and in case of LSP for all three classes of operators as well as any other integral form resulting from a chosen method of approximation as long as the integral form is VC. Numerical studies for the model problems containing the three classes of operators confirm that when the integral form is VC, same a priori estimates and convergence rates hold. Thus, for the first time we have a priori error estimates for non-self adjoint and non-linear differential operators. Extensive numerical studies are presented for various p and k values for uniform h-refinements demonstrating that the theoretically derived convergence rates in a priori estimates are always in agreement with calculated values when the integral forms are VC. The a priori error estimates derived here also hold for 2D and 3D BVPs as long as the integral forms in these BVPs are variationally consistent. This can be confirmed numerically and is in agreement with published literature for self adjoint operators.
A posteriori error estimation based on the work presented here is viewed unnecessary when the approximation spaces are minimally conforming or of orders higher than minimally conforming due to the fact that when using such spaces a posteriori error computations of any desired quantity (for example and I) that can help guide adaptivity is possible. residual values for elements of are shown to be a perfect choice for adaptivity.
In short, VC integral form permits derivation of a priori error estimates and determination of convergence rates for all three classes of differential operators and use of minimally conforming spaces make a posteriori error estimation unnecessary and permit determination of desired a posteriori measures (such as and I) that can be used to quantify errors in the currently computed solution and to design adaptive processes (presented in a followup paper). The same estimates and convergence rates hold for 2D and 3D BVPs when the integral forms are VC. The details are somewhat involved and have been presented in published literature for self-adjoint operators.
The first and third authors are grateful for the support provided by their endowed professorships during the course of this research. The computational infrastructure provided by the Computational Mechanics Laboratory (CML) of the Mechanical Engineering department of the University of Kansas is gratefully acknowledged. The financial support provided to the second author by the Naval Air Warfare Center is greatly appreciated.