The objective of this paper is to discuss the importance of testing the adding up condition in a demand system as a gateway to the estimation of the corresponding expenditure-share specification.
To summarize the discussion elaborated further on, the advantages of a share format are that it saves degrees of freedom and mitigates error heteroskedasticity. The limitations are, perhaps, more eye-opening. A crucial issue is the impossibility of testing a null hypothesis such as the adding up condition, which is automatically satisfied in an expenditure-share format and induces the singularity of the error covariance matrix. In a share format, adding up, symmetry and homogeneity are hypotheses that cannot be tested independently. Testing the adding up condition is important because, often, the number of sample commodities is much smaller than the number of goods that compose a consumer’s basket. This hypothesis and the associated statistical test constitute the paper’s main focus.
The advantages of a quantity format are the possibility of testing the adding up condition, the zero-degree homogeneity assumption, and the symmetry and negative semidefiniteness of the Slutsky matrix as separate null hypotheses. The disadvantages are minimal and concern, possibly, the need for larger samples than in the case of a share format. This limitation may arise in very small samples.
Given the gamut of issues associated with the estimation and testing of consumer demand systems, we will narrow this preliminary discussion to specifications of share systems as they commonly appear in the literature. The pioneering paper by Sir Richard Stone presents a linear expenditure system (LES) of demand functions stated in expenditure format, where the dependent variable represents the expenditure on a given good. This specification is equivalent to a share format where the share is defined with respect to total expenditure. Stone’s LES empirical model includes all goods and services grouped in six categories of commodities for the years 1920 to 1938 in the United Kingdom. For the first time, the theoretical requirements of adding up, zero-degree homogeneity of demand functions and symmetry of the Slutsky matrix appear as restrictions in the empirical literature. Barten, who presented a linear demand system that is stated directly in share format, attempted to include all commodities in the consumer expenditure household survey kept in The Netherlands between 1921 and 1958. There followed other important papers by Barten, in share format, and by Pollak and Wales, in expenditure format. Hence, the tradition of estimating demand systems in expenditure-share format has a distinguished lineage.
In his influential paper that summarizes the empirical literature on consumer demand, Barten (p. 23) wrote: “The approach is essentially an empirical one, in the sense that one aims at the formulation of a system to be estimated using actual data. In view of the data limitations, one makes use of restrictions which, in part, are of a theoretical nature.” We interpret Barten’s words to mean that the data generating process (DGP) ought to assume center stage in an econometric specification of models that wish to represent the final decisions of consumer behavior. In econometrics, a DGP must be guided by economic theory but must also be adapted to describe the peculiarities of data collection, as Barten implicitly suggests.
In the case of consumer behavior, utility theory derives systems of demand functions in the format of quantity levels of various commodities as a function of their prices and income. Let $\mathbf{q}$ be an $n$-vector of quantity levels of commodities and services that represent all the goods’ choices available to a consumer. Let $\mathbf{p}$ be an $n$-vector of prices of those goods. Finally, let $m$ be the exogenous income available to the consumer for making her decisions. Then, utility theory derives a system of relations that are interpreted as Marshallian demand functions and a budget constraint
$$q_i = f_i(\mathbf{p}, m), \quad i = 1, \ldots, n \tag{1}$$
$$\mathbf{p}'\mathbf{q} = m \tag{2}$$
The system has $n$ unknown quantities, $q_1, \ldots, q_n$, and, therefore, one of the $n$ relations in Equation (1) is redundant and can be omitted in the solution of the remaining quantities. The quantity of the $n$-th good can be recovered from the budget constraint after replacing the $n-1$ quantities obtained from the solution of the remaining $n-1$ relations.
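The recovery of the last quantity from the budget constraint can be illustrated with a minimal numerical sketch. The function name and the numbers below are hypothetical and serve only to show the arithmetic; they do not come from the paper’s data.

```python
def residual_quantity(prices, income, first_quantities):
    """Recover the last good's quantity from the budget constraint.

    prices: prices of all goods (the last entry belongs to the omitted good).
    first_quantities: quantities of all goods except the last one.
    """
    spent = sum(p * q for p, q in zip(prices[:-1], first_quantities))
    return (income - spent) / prices[-1]


# Illustrative values (assumptions, not sample data).
prices = [2.0, 5.0, 4.0]
income = 100.0
q_first = [10.0, 8.0]

q_last = residual_quantity(prices, income, q_first)
quantities = q_first + [q_last]

# The full vector of quantities exhausts income by construction.
budget_gap = income - sum(p * q for p, q in zip(prices, quantities))
print(q_last, budget_gap)
```

The sketch relies on the budget constraint binding over all goods; when the sample covers only part of the basket, the same calculation would attribute all unobserved spending to the last sample good.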
In many cases, however, the DGP of consumer demand information, in any given sample, may not satisfy all the conditions stated above. Many empirical studies that estimate systems of demand functions exhibit a number of commodities that is much smaller than the number of all possible goods available for consumers’ decisions over a given time interval. In this case, the sample demand system is incomplete. It does not satisfy the adding up condition because income is the exogenous amount available for purchasing all the commodities of consumers’ choice, not only those in the sample.
It is well known that the hypotheses of separability and multistage budgeting were developed to justify the adoption of the features associated with the general theoretical scaffolding also in the case of a small number of commodities (or commodity aggregates). Accordingly, consumption decisions would occur in at least two stages. In the first stage, the consumer would allocate income among a number of commodity subsets. In the second stage, the consumer would proceed to maximize utility only with respect to the commodities belonging to one of those subsets subject to the previously determined portion of income for that category of goods. All this development is acceptable from a theoretical standpoint. In an empirical setting, however, these hypotheses often remain untested and untestable, given the available sample information. Put another way, the portion of income that, according to a two-stage approach of consumer decisions, would be allocated to a specific commodity subset in the first stage is never known or measurable, thus preventing the testing of the assumption that would require this level of income to be an exogenous piece of information. We emphasize, therefore, that to test the hypothesis that a group of commodities is separable from the rest of the consumer’s basket it is necessary to collect sample information on quantities and prices of all the goods.
As a consequence, in many empirical studies, the budget constraint in Equation (2) may not bind. Furthermore, information on total exogenous income is rarely collected. What Barten calls total expenditure is simply an accounting definition analogous to Equation (2) but generated as the sum of sample prices times quantities over the available commodities. Often, therefore, for econometric purposes, there are only as many independent equations similar to Equation (1) as there are sample commodities, while the analogous Equation (2) is not a constraint but simply an accounting relation with no sample information of its own that is independent of prices and quantities. Many empirical studies of demand published to date, however, have taken both Equations (1) and (2) as valid, regardless of the subset of commodities dealt with in the sample and without performing a statistical test of the adding up condition. This test appears to represent a crucial step for assessing the theoretical scaffolding leading to a share format: if constraint (2) is part of the hypothesis that the sample commodities constitute a proper subset of goods within a two-stage budgeting process, the test of the adding up condition is an indicator of whether that hypothesis may be supported by the sample data.
Referring to a stochastic specification of a demand system described by the theoretical scaffolding of Equations (1) and (2), the fundamental, empirical consequence of the assumptions and conclusions that are valid for the entire consumer’s basket is stated by Barten (p. 26) as: “However, Equation (2) implies a linear dependence of the joint distribution of the disturbances if m and p are exogenous. The theoretical covariance matrix is, therefore, singular. This problem is usually solved by deleting one equation from the system.”
This proposition was originally put forward in the late sixties and, since then, almost all the empirical studies of demand that appeared in the literature have adopted it regardless of the number of commodities involved and whether the available information constitutes an incomplete sample. Furthermore, the great majority of studies has gone another step and has specified demand systems in the format of expenditure shares. Deaton and Muellbauer, with their Almost Ideal Demand System (AIDS), have provided a remarkable impetus for the use of an expenditure-share format in empirical studies of demand.
Thus, in this cursory survey of empirical demand issues, we have identified two main topics of interest. The first topic deals with the question of whether the DGP of sample information on consumer behavior, as typically observed, statistically supports the application of the more general approach embedded in Equations (1) and (2), regardless of the size and completeness of the subset of commodities constituting the sample data. The second topic discusses the consequences of estimating demand systems in expenditure-share format rather than in a quantity format. In particular, given the absence of empirical information about two-stage budgeting and separability that characterizes many empirical demand studies, it is of interest to know whether the adding up condition holds for the sample at hand. As elaborated in more detail further on, this condition is crucial for concluding that the error covariance matrix is singular and, as a consequence, for admitting the deletion of an equation in the estimation of demand parameters without loss of information. The adding up condition, however, cannot be tested using an expenditure-share format of the demand system. This test must be performed using a quantity format.
The paper is organized in several sections. Section 2 presents a general discussion of estimating models (not necessarily models of consumer behavior) in a share format. Section 3 lays out the stochastic quantity model of demand functions based upon the AIDS specification of Deaton and Muellbauer, the most popular demand system to have appeared so far in the literature. Two more recent specifications will also be presented: the quadratic AIDS (QUAIDS) of Banks, Blundell and Lewbel, and the EASI (Exact Affine Stone Index) demand system of Lewbel and Pendakur. Section 4 describes a large sample of data used in the empirical analysis and presents the empirical results. Conclusions follow.
2. Models in Share Format
Any linear statistical model that is specified in share format, with an intercept in each equation and the same explanatory variables appearing in every equation, exhibits a unique property: the sum over equations of the error terms is equal to zero in each sample observation. Therefore, the error covariance matrix is singular. Furthermore, the sum over intercepts of the various equations is equal to 1 and the sum over rows of the coefficient matrix associated with explanatory variables is equal to zero without any a priori condition on parameters. Hence, the adding up property of shares holds automatically on the left and on the right side of the equality sign. This result is briefly mentioned in papers by Worswick and Champernowne, Barten, and Berndt and Savin. We offer an alternative derivation in the Appendix. Surprisingly, however, many demand studies that specify a share format declare that the adding up restrictions must be imposed on the model’s parameters. For example, Berndt and Savin (p. 938) write: “It is assumed that y satisfies the adding up conditions …”; Moschini (p. 351) writes: “… adding up … hold(s) if …”; Alston, Chalfant and Piggott (p. 74) write: “To satisfy … adding up … the following restrictions must hold …”; Fisher, Fleissig and Serletis (p. 62) write: “Adding up … restrictions require that …”; Cranfield, Eales, Hertel and Preckel (p. 357) write: “Adding up is imposed with …”; Barnett and Serletis (p. 213) write: “… the resulting theoretical restrictions are …”; Liu, Parton, Zhou and Cox (p. 488) write: “… to be consistent with the demand theory, the following restrictions must be adhered to: the adding up restriction …”. This oversight may have consequences for testing hypotheses.
Let $t = 1, \ldots, T$ indicate sample observations; $K$ the number of equations; $J$ the number of explanatory variables; $w_{kt}$ the share of the $k$-th equation in the $t$-th observation; $x_{jt}$ the $j$-th explanatory variable in the $t$-th observation; $\alpha_k$ the intercept in the $k$-th equation; $\beta_{kj}$ the $j$-th parameter in the $k$-th equation; $u_{kt}$ the disturbance term of the $k$-th equation in the $t$-th observation with expectation $E(u_{kt}) = 0$ and constant contemporaneous covariance matrix $\Sigma$. All explanatory variables appear in each equation. Then, a share model without theory is stated as
$$w_{kt} = \alpha_k + \sum_{j=1}^{J} \beta_{kj} x_{jt} + u_{kt}, \quad k = 1, \ldots, K \tag{3}$$
Summing over equations
$$\sum_{k=1}^{K} w_{kt} = 1 = \sum_{k=1}^{K} \alpha_k + \sum_{j=1}^{J} \Big( \sum_{k=1}^{K} \beta_{kj} \Big) x_{jt} + \sum_{k=1}^{K} u_{kt} \tag{4}$$
In the Appendix, it is shown that the sum over equations of relation (3) fulfills the adding up property automatically without imposing any a priori additional constraints on the parameters of the share-model specification of Equation (3). In other words,
$$\sum_{k=1}^{K} u_{kt} = 0 \tag{5}$$
The contemporaneous error terms are linearly dependent in each observation and the estimated error covariance matrix is singular. Therefore, any estimator that requires the inversion of the error covariance matrix is infeasible. Notice that
$$\sum_{k=1}^{K} \alpha_k = 1, \qquad \sum_{k=1}^{K} \beta_{kj} = 0, \quad j = 1, \ldots, J \tag{6}$$
without the necessity to impose these conditions as a priori restrictions. Hence, an equation can be deleted from system (3) and the estimates of the corresponding parameters can be recovered from relations (6).
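The automatic adding up property of a share system can be checked numerically. The sketch below is only an illustration under stated assumptions: it uses simulated data, equation-by-equation ordinary least squares (not the estimator used in this paper), and hypothetical helper names (`solve`, `ols`). Shares are built from arbitrary positive expenditures, so no adding up restriction is imposed on any parameter.

```python
import math
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(a)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ols(X, y):
    """OLS coefficients from the normal equations X'X b = X'y."""
    k = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)


rng = random.Random(0)
T, K = 200, 3

# Same regressors (intercept plus two variables) in every equation.
X = [[1.0, rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(T)]

# Arbitrary positive expenditures; shares sum to one by construction only.
spend = [[math.exp(0.3 * (k + 1) * row[1] - 0.2 * k * row[2] + rng.gauss(0, 0.5))
          for k in range(K)] for row in X]
shares = [[e / sum(row) for e in row] for row in spend]

# Unrestricted OLS, one equation at a time.
coef = [ols(X, [shares[t][k] for t in range(T)]) for k in range(K)]

intercept_sum = sum(c[0] for c in coef)
slope_sums = [sum(c[j] for c in coef) for j in (1, 2)]
resid_sums = [
    sum(shares[t][k] - sum(b * x for b, x in zip(coef[k], X[t])) for k in range(K))
    for t in range(T)
]
max_abs_resid_sum = max(abs(r) for r in resid_sums)

print(intercept_sum, slope_sums, max_abs_resid_sum)
```

Up to rounding error, the intercepts sum to one, the slopes sum to zero, and the residuals sum to zero in every observation, so the residual covariance matrix is singular even though nothing was restricted.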
The relationship between this discussion of a general share system such as Equation (3) and an expenditure-share system of demand functions, as usually stated in the literature, is straightforward. Many published demand studies specified in expenditure-share format, although they deal with a number of commodities smaller than the full consumer basket, have explicitly assumed and imposed adding up conditions by way of parameter restrictions analogous to relations (6). But since the adding up condition holds by necessity without the need to impose it a priori, this suggests that the share specification of any econometric model (and, equivalently, the expenditure specification of it) is like a straitjacket: once worn, it forces the error covariance matrix to be singular and the adding up condition to hold whether or not the DGP warrants it. An important corollary follows: the null hypothesis that the adding up condition holds cannot be tested under a share (expenditure) format of demand systems. In the absence of any sample information regarding two-stage budgeting, the test of the null hypothesis that the adding up condition holds corresponds to an indirect test of the assumption that the sample commodities constitute a proper subset of goods in a two-stage budgeting process of consumer behavior. To test this null hypothesis, however, only a quantity format specification of a demand system is available.
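To exemplify more directly that the above reasoning applies also to demand systems, we state the AIDS model of Deaton and Muellbauer in share format
$$w_{it} = \alpha_i + \sum_{j=1}^{K} \gamma_{ij} \log p_{jt} + \beta_i \log (m_t / P_t) \tag{7}$$
where $i, j = 1, \ldots, K$ and $t = 1, \ldots, T$. All logarithms in this paper are defined in base 10. There are $K$ commodities with $q_{it}$ and $p_{it}$ representing quantities and prices of the $t$-th sample observation while total expenditure is $m_t = \sum_{i=1}^{K} p_{it} q_{it}$ with shares computed as $w_{it} = p_{it} q_{it} / m_t$. Furthermore, the deflating price index is defined as
$$\log P_t = \alpha_0 + \sum_{k=1}^{K} \alpha_k \log p_{kt} + \frac{1}{2} \sum_{k=1}^{K} \sum_{j=1}^{K} \gamma_{kj} \log p_{kt} \log p_{jt} \tag{8}$$
although Deaton and Muellbauer suggested, and many empirical studies adopted their suggestion, that a Stone index could often suffice:
$$\log P_t^{*} = \sum_{k=1}^{K} w_{kt} \log p_{kt} \tag{9}$$
Furthermore, Deaton and Muellbauer specify and impose parameter restrictions that include adding up requirements, zero-degree homogeneity in prices and income of demand functions and symmetry of the Slutsky matrix
$$\sum_{i=1}^{K} \alpha_i = 1, \quad \sum_{i=1}^{K} \gamma_{ij} = 0, \quad \sum_{i=1}^{K} \beta_i = 0 \qquad \text{adding up} \tag{10}$$
$$\sum_{j=1}^{K} \gamma_{ij} = 0 \qquad \text{zero-degree homogeneity} \tag{11}$$
$$\gamma_{ij} = \gamma_{ji} \qquad \text{Slutsky symmetry} \tag{12}$$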
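The expenditure shares and the Stone index described above are simple accounting constructs, as the following minimal sketch suggests. The function names and the numbers are hypothetical; base-10 logarithms are used only because the text states that convention.

```python
import math


def expenditure_shares(prices, quantities):
    """Return (shares, total expenditure) for one observation.

    Total expenditure is the sum of price times quantity over the
    sample commodities only, so the shares sum to one by construction.
    """
    total = sum(p * q for p, q in zip(prices, quantities))
    return [p * q / total for p, q in zip(prices, quantities)], total


def stone_log_price_index(shares, prices):
    """Stone index: share-weighted sum of log prices (base 10)."""
    return sum(w * math.log10(p) for w, p in zip(shares, prices))


# Illustrative observation with three sample commodities (assumed values).
prices = [1.2, 0.8, 2.5]
quantities = [10.0, 25.0, 4.0]

shares, m = expenditure_shares(prices, quantities)
log_P = stone_log_price_index(shares, prices)
log_real_expenditure = math.log10(m) - log_P

print(shares, m, log_P, log_real_expenditure)
```

Because `m` is computed from the same prices and quantities that define the shares, it carries no sample information of its own, which is the accounting feature the text emphasizes.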
and write (p. 314): “Provided (10), (11), and (12) hold, Equation (7) represents a system of demand functions which add up to total expenditure, are homogeneous of degree zero in prices and total expenditure taken together, and which satisfy Slutsky symmetry.” But, as argued above, restrictions (10) are automatically satisfied in a share system regardless of either theory or other assumptions. They are satisfied automatically also when conditions (11) and (12) are imposed using either specification of the price index deflator. Hence, there is no need to state them as if they “ought to be imposed” for estimating a share model which represents a demand system.
Thus, the estimation of Equations (7) and (8) [or (9)] together with side conditions (11) and (12) represents a special case of estimating the share system (3). Barten (p. 16) stated: “… it is possible to delete one equation from the system without losing any information.”1 In light of the above discussion, this statement should be qualified to read: “When a share format is warranted, it is possible to delete one equation from the system without losing any information.”
1But Barten also wrote (p. 16): “However, it is quite arbitrary as to which equation should be dropped, and to avoid any asymmetry it seems more appropriate to estimate the system in its complete formulation.”
With respect to parameter “restrictions” (10) a crucial remark is in order. They imply that the general theoretical conclusions of consumer theory, which are valid for the full basket of commodities, have been adopted also for the case when the number of sample goods is smaller than the number of goods in that basket. Furthermore, the adding up hypothesis cannot be tested in an expenditure-share demand system. Hence, suppose that the adding up condition, tested in a quantity format model, does not hold. This means that the number of sample commodities is different from the number of goods constituting a proper subset, according to a two-stage budgeting criterion.
3. AIDS, QUAIDS and EASI Quantity Formats
Under the assumptions of an AIDS expenditure function, consumer utility theory generates a system of demand functions that assumes the following quantity format in a stochastic representation
where and is a disturbance term for the commodity in the observation with expectation and covariance matrix . According to Brown and Walker,  , the disturbance terms of commodities involved in the individual consumer’s decisions may depend on prices and total expenditure. To represent this assumption about heteroskedasticity the function multiplies the disturbance term with the objective of rendering the entire error term homoskedastic.
Model (13) can now be used to test a series of null hypotheses based upon restrictions (10), (11) and (12). The tests have the structure of a likelihood ratio which is distributed as a chi square with degrees of freedom equal to the number of restrictions. In particular, we are interested in testing the adding up hypothesis expressed by restrictions (10).
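The mechanics of such a likelihood ratio test can be sketched as follows. This is an illustration, not the paper’s computation: the log-likelihood values and the number of restrictions are hypothetical, and the chi-square tail probability is computed with a standard-library recursion for integer degrees of freedom rather than with a statistics package.

```python
import math


def lr_statistic(loglik_unrestricted, loglik_restricted):
    """Likelihood ratio statistic: twice the gap in maximized log-likelihoods."""
    return 2.0 * (loglik_unrestricted - loglik_restricted)


def chi2_sf(stat, df):
    """Upper tail probability of a chi-square variable with integer df.

    Uses the recursion Q(a + 1, x) = Q(a, x) + x**a * exp(-x) / Gamma(a + 1)
    on the regularized upper incomplete gamma function, started from
    Q(1, x) = exp(-x) for even df and Q(1/2, x) = erfc(sqrt(x)) for odd df.
    """
    if stat <= 0.0:
        return 1.0
    x = stat / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-x)
    else:
        a, q = 0.5, math.erfc(math.sqrt(x))
    while a < df / 2.0:
        q += math.exp(a * math.log(x) - x - math.lgamma(a + 1.0))
        a += 1.0
    return q


# Hypothetical maximized log-likelihoods and restriction count.
loglik_u = -1200.0   # unrestricted quantity-format model (assumed value)
loglik_r = -1215.5   # model with adding up restrictions imposed (assumed value)
n_restrictions = 7   # assumed number of restrictions

lr = lr_statistic(loglik_u, loglik_r)
p_value = chi2_sf(lr, n_restrictions)
print(lr, p_value)
```

A small tail probability leads to rejection of the restrictions. In an actual application the degrees of freedom must equal the number of independent restrictions being tested in the chosen specification.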
The QUAIDS specification in quantity format takes on the following expression (see  , p. 534):
In this case, the adding up hypothesis requires the same AIDS relations (10) with the addition that
The AIDS and QUAIDS Engel curves are linear and quadratic functions of total expenditure, respectively. On the contrary, the EASI specification admits Engel curves of more complex shape. In particular, they are not bound by rank restrictions originally presented by Gorman,  . To facilitate the comparison with the paper by Lewbel and Pendakur, we adopt their notation in expanded form (  , Equations (8) and (9), pages 833-834). Beside prices and expenditure, the model includes demographic variables denoted by the letter . Restating, for clarity, the range of the various indexes: observations are denoted by ; equations by ; demographic variables by ; ; the power of log expenditure by . The EASI demand system, then, takes on the following specification in quantity format
The variable is the logarithm of real expenditure defined as nominal log expenditure, , deflated by the Stone index and other price terms. The introduction of two-way interactions of the demographic variables with prices and total expenditure follows the specification of Lewbel and Pendakur  .
The adding up constraint of the EASI model is satisfied with the following parametric conditions
The flexibility of the EASI demand system is reflected in the number of parameters to be estimated. For example, with 5 commodities, 5 demographic variables and the exponent of the logarithm of real expenditure equal to 5, the number of parameters to be estimated is 255. A sample of 4847 observations was used.
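The parameter count of 255 can be reproduced with a short calculation. The breakdown of terms below is an assumption based on the structure of the approximate EASI model of Lewbel and Pendakur (polynomial in log real expenditure, demographic shifters, demographic interactions with expenditure and with prices, and a price-expenditure interaction); the function name is hypothetical.

```python
def easi_parameter_count(n_equations, n_demographics, max_power):
    """Count parameters in an unrestricted EASI-type system.

    Assumed per-equation terms:
      - powers 0..max_power of log real expenditure (power 0 is the intercept)
      - demographic shifters
      - demographics interacted with log real expenditure
      - log prices, alone and interacted with each demographic variable
      - log prices interacted with log real expenditure
    """
    per_equation = (
        (max_power + 1)
        + n_demographics
        + n_demographics
        + (n_demographics + 1) * n_equations
        + n_equations
    )
    return n_equations * per_equation


total = easi_parameter_count(n_equations=5, n_demographics=5, max_power=5)
print(total)
```

Under these assumptions each equation carries 51 parameters, and five unrestricted equations give the 255 parameters mentioned in the text.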
The rejection of the null hypothesis that the adding up restrictions hold would imply that an expenditure-share format of the demand system is unwarranted. In that case, the use of a share format of the demand system and the dropping of an equation for its estimation would correspond to a loss of information because the error covariance matrix of the quantity model is not singular.
4. Data and Results
The estimation and hypothesis testing of the adding up condition are applied to the Canadian Family Expenditure Survey used by Lewbel and Pendakur. The original sample is composed of 9 commodity categories: food-in, food-out, rent, clothing, household operation, household furnishing and equipment, transportation operation, recreation, and personal care. It includes 4847 observations on quantities and prices that are spread over a period from 1969 to 1996. It also comprises a series of 5 observable demographic characteristics: 1) the person’s age minus 40; 2) the sex dummy equal to one for men; 3) a car-non-owner dummy equal to one if real gasoline expenditures (at 1986 gasoline prices) are less than $50; 4) a social assistance dummy equal to one if government transfers are greater than 10 percent of gross income; and 5) a time variable equal to the calendar year minus 1986 (that is, equal to zero in 1986). These demographic variables are indicated as Z variables. For a more detailed description of the sample data see Lewbel and Pendakur (pp. 839-840).
We reiterate that the principal objective of this paper consists in testing the adding up hypothesis in the estimation of demand systems with an incomplete sample of consumer data because this event is the prevalent occurrence in the empirical literature. Again, an incomplete sample occurs when the commodity categories employed in the empirical estimation do not exhaust the commodities available to consumers’ choice. In the case of the Lewbel and Pendakur database, the presumption is that the 9 categories of goods do, indeed, form a complete sample. Therefore, in order to conform to the context of this paper, the following 5 categories were selected: food-in, rent, clothing, transportation operation and recreation.
For the empirical context described above, the crucial test deals with the adding up hypothesis that, as elaborated in previous sections, cannot be performed using an expenditure-share format of a demand system. Thus, it is the main contention of this paper that a share specification implies no loss of information only when the adding up hypothesis is not rejected by an appropriate statistical test.
For the AIDS model, this hypothesis requires testing the restrictions of Equation (10). For the QUAIDS model, the restrictions are stated in Equations (10) and (15). For the EASI model, the restrictions to test are specified in Equation (18). In all three cases, the parameters of the demographic Z variables require a zero sum over equations.
In all three specifications, the null hypothesis is rejected at a very high confidence level (see Table 1). It is important to remark that in all these 5-commodity systems of equations, the estimated error covariance matrix is not singular and, indeed, it is associated with a condition number of about 15.0, well below the empirical cutoff point of 30.0 suggested by Belsley, Kuh and Welsch as an indication of collinearity. This means that the estimated errors of the 5-commodity model are not linearly dependent and dropping one equation, as the estimation of a share model requires, amounts to forcing the original quantity model into a straitjacket, resulting in a loss of information.
The estimated parameters of the EASI model are reported in Table 2 for the unrestricted version of the demand system. Given the large number of estimated parameters (255) the relevant statistics are given in condensed form. One, two and three asterisks correspond to significance levels of 0.10, 0.05 and 0.01, respectively.
Table 1. Test results of the adding up hypothesis.
Table 2. FIML estimates of the EASI model.
*, **, *** correspond to 0.10, 0.05, 0.01 significance levels, respectively.
The parameter estimates of the powers of log real expenditure are highly significant and attest to the complex shape of the Engel curves for each of the five commodity categories. The price-slope coefficients are also highly significant. The parameters of the demographic variables and their interactions with total expenditure and prices make up the large body of estimates and suggest the plausibility of the two-way specification of interaction effects.
5. Conclusions
This paper’s motivation springs from the question of whether a share format of demand systems is warranted even in cases when the data sample deals with a rather small number of consumer goods, that is, when the sample is incomplete in the sense that the number of consumer goods is smaller (sometimes much smaller) than the number of commodity choices available to consumers. The adding up condition was identified as a crucial restriction that may not be attained when demand systems are incomplete. In such cases, the error covariance matrix of the empirical model (specified in quantity format) is not singular and a share format is unwarranted because dropping one equation, as customarily done in the estimation of share specifications, corresponds to losing sample information.
The estimation of a quantity format does not involve any additional difficulties over those encountered in the estimation of share formats. Quantity formats, furthermore, allow for testing of all the relevant hypotheses of consumer theory, including the adding up restrictions, a test that is precluded by share formats.
The empirical illustration of the research strategy discussed in the paper deals with a rich information base that may constitute the most articulated data sample on consumer choices available at present. The adding up hypothesis for the five commodities that were selected to represent an incomplete sample was rejected with a high degree of confidence in all three specifications of the demand system.
Appendix
The sum over equations of the intercepts in any seemingly unrelated equation system specified in share format (with the same explanatory variables entering every equation) is equal to one. The sum over equations of the slope coefficients is equal to zero. As a consequence, the sum over equations of the disturbance terms is equal to zero and the associated error covariance matrix is singular.
In order to simplify the notation the observation index is omitted. Let be the number of equations in share format. The number of shares is divided into a vector of the first shares and the last share . Disturbance terms are divided into a vector of the first terms and the last item. The number of intercepts is divided into a vector of the first intercepts and the last of them, . Let be a vector of explanatory variables that enter each share equation. The column of the matrix of unknown slope parameters is divided into a vector and the last slope parameter . The vector is a sum vector of unitary coefficients.
The K-equation system in share format can now be stated as
The pre-multiplication of system (A.1) by the sum vector results in
Given the share format, Equation (A.2) can be restated as
and rearranging Equation (A.4)
Comparing Equations (A.3) and (A.5), we conclude that