Kidney (renal) diseases and dialysis are among the most costly of diseases and are a worldwide burden. Between 8% and 10% of the adult population have some form of kidney damage, and every year millions die prematurely of complications related to chronic kidney disease (CKD). For example, it is reported that treatment of CKD is likely to exceed $48 billion per year in the US, that treatment for all current and new cases of kidney disease through 2020 could cost $12 billion in Australia, that the economic loss over the next decade in China could be $558 billion, and that the annual cost of dialysis in Uruguay is as much as $23 million, accounting for 30% of the budget of the National Resources Fund for specialized therapies. Since CKD is a very costly disease, accurate cost-benefit analysis is quite important. Liyanage et al. estimated that 2.618 million people worldwide received renal replacement therapy (RRT) in 2010, and that the number of people needing RRT would be between 4.902 and 9.701 million.
The Centers for Disease Control and Prevention estimated that 30 million people, or 15% of US adults, had CKD. CKD is classified into 5 stages, with the severity of the disease increasing with the stage. The US Renal Data System (USRDS) estimated that Medicare spending for recipients age 65 or older with CKD exceeded $50 billion, or 20% of all Medicare spending for that age group. USRDS also reported that the most severe or end stage of CKD (ESCKD; i.e., stage 5) cost $32.8 billion, or 7.2% of annual Medicare fee-for-service spending. They also suggested that 14.8% of US adults have CKD. Honeycutt et al. estimated the Medicare costs of CKD to be $1700, $3500 and $12,700 for stages 2, 3 and 4, respectively. They did not find a significant difference for stage 1 individuals.
Albertus et al. estimated that the lifetime risk of ESRD from birth was 4.0% for males and 2.8% for females in the U.S. in 2013. They also reported that the ESRD risk varied more than twofold among racial/ethnic groups. The cost of CKD in England in 2009-10 was estimated at £1.44 - £1.45 billion, that is, up to 1.3% of all National Health Service spending in that year. Damien et al. reported that costs of CKD patients with co-morbid conditions were higher than those of patients without such conditions. It has also been reported that CKD and its treatment have become a very serious problem and a large economic burden in India. Icks et al. estimated that the mean total dialysis-related cost in 2006 was 54,777 euros per patient year based on surveys in western Germany. Moreover, cardiac and vascular mortality rates of dialysis patients were several times higher than those of the general population. de Jager et al. agreed that dialysis patients had a generally increased risk of death, but denied excess cardiovascular mortality.
Obviously, reducing new cases of CKD and its complications, disability, death, and economic costs has become an important goal. Proper CKD management to prevent disease progression, minimize complications, and promote quality of life is another important goal. Efforts to control costs without sacrificing quality of care are also needed. Even small acute changes in kidney function can result in CKD, ESCKD and death. Manns reported that patients with renal recovery spent significantly fewer days in hospital and incurred lower hospital costs after discharge than those on dialysis. Therefore, proper treatments for kidney disease patients are critically important.
Various studies have been conducted on the effectiveness and costs of kidney disease treatments, such as sustained low-efficiency dialysis, continuous renal replacement, prolonged intermittent replacement, and intermittent hemodialysis therapies. Another important factor in preventing CKD and ESCKD is early detection and treatment for those at high risk. Vanholder pointed out the importance of primary prevention of CKD, demonstrating that it is more important than secondary prevention. Clark et al. reported the beneficial effects of increasing water intake on renal function in patients and those at risk of CKD, especially at the early stages of disease. Gulla et al. pointed out the necessity of timely referrals of patients with CKD, and of a system to promote them, because referrals were often made too late. Galbraith et al. reported the results of the See Kidney Targeted Screening Program for CKD in Canada. A total of 6329 Canadians participated in the program, and 5194 with at least one risk factor were screened. Among them, 18.8% had unrecognized CKD. (See also Komenda, Rigatto and Tangri.) Risk factors of CKD include obesity, hypertension, diabetes, kidney problems, membership of a high-risk ethnic group, vascular disease, family history, preeclampsia, and smoking. Luyckx and Brenner and Luyckx et al. highlighted the necessity of nutritional intervention and of regular monitoring of preterm and low-birth-weight infants throughout life to reduce the risk of kidney diseases.
In Japan, according to the Ministry of Health, Labour and Welfare, CKD patients represented 24,100 inpatients and 107,300 outpatients as of October 21-23, 2014, and 1.546 trillion yen, or 3.8% of total medical costs (40.8 trillion yen), was used for kidney failure (under the Japanese disease classification, “kidney failure” is used for kidney disease). The Japanese Society for Dialysis estimated that there were 324,986 chronic dialysis patients, or 2592 per million, in 2015. For treatment of ESKD, one of three methods is used: in-center dialysis, home dialysis, or transplantation. Although transplantation is both more effective and less costly than dialysis, use of transplantation and home dialysis is very low in Japan.
To evaluate total cost and economic burden, it is necessary to investigate kidney failure patients and to compare these patients with healthy individuals. Evaluating risk factors is also important for preventing kidney failure. It is therefore necessary to investigate a dataset including both healthy individuals and kidney disease patients. However, in most countries, it is very difficult and costly to obtain a large-scale dataset that includes many healthy individuals, because such people do not voluntarily go to hospitals or clinics.
In Japan, the health insurance societies, formed by private companies and central and local governments, pay the costs of yearly mandatory medical checkups (hereafter, checkups) for employees age 40 or older. (Since kidney failure and dialysis are reported in the same category at checkups, we use “kidney failure/dialysis” hereafter.) The monthly reports of medical payments, called “receipts”, are sent from medical institutions to the health insurance societies. These societies thus have both health and medical information on all members, including healthy individuals.
In this paper, we first analyze the total costs and economic burden of kidney failure/dialysis using a dataset containing 113,979 checkups and 3,172,066 receipts obtained from 48,022 individuals. The distribution of medical costs shows a heavy tail on the right side, and many zeroes are observed. Therefore, a model that combines the power transformation and the Tobit model is used. Then, risk factors for developing kidney failure/dialysis are investigated by the probit model.
2. Data and Methods
In this study, we analyzed an anonymized dataset combining checkups and receipts. First, we compared the distributions of medical costs for all cases and for kidney failure/dialysis cases. Since other variables might affect medical costs, we then evaluated their effects. For example, individuals with kidney diseases sometimes have complications such as cardiovascular diseases. If the treatment of complications were the major source of medical costs, kidney diseases themselves might not be costly, and it would be better to use medical resources for the prevention and treatment of the complications.
To remove the effects of other variables, we required a regression analysis. However, there are two problems in using standard regression analysis for the medical cost data. The first is that the data contain many zero values (about 20%). The second is that the distribution has a very heavy tail on the right side. We therefore used the power transformation Tobit model for the analysis. Finally, we used the probit model to analyze the risk factors for kidney failure/dialysis. For details regarding Tobit and probit models, see Amemiya.
The dataset contained information regarding 113,979 checkups obtained from 48,022 members of the society and all their receipts from fiscal year 2013 to fiscal year 2015 (i.e., April 2013 through March 2016). It was created with the cooperation of the health insurance society of one large Japanese corporation that has offices and operational centers throughout Japan. The receipts were classified into five categories: dental; inpatients of DPC hospitals (hereafter, DPC); outpatients and inpatients of non-DPC hospitals (hereafter, outpatient & non-DPC); care-giving; and pharmacies. Of these, we used DPC, outpatient & non-DPC, and pharmacy receipts for the analysis of kidney failure/dialysis. The total cost for these three categories is subsequently referred to as the “medical cost”. During the sample period, there were 15,652 DPC receipts, 1,986,494 outpatient & non-DPC receipts and 1,169,920 pharmacy receipts, for a total of 3,172,066 receipts. These receipts were added up, and the medical costs in each fiscal year were calculated. A total of 113,979 cases for which both the results of checkups and medical costs were available in the same fiscal year were used. Cases in which body mass index (BMI) or diastolic blood pressure (DBP) values were too large (over 100 and 300, respectively) were excluded, leaving 95,353 cases without missing values in any explanatory variable for use in the analyses of medical costs and risk factors.
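The aggregation step described above can be sketched as follows. This is a minimal illustration, not the authors' actual procedure: the record layout, the category labels and the helper names (`fiscal_year`, `annual_medical_costs`) are assumptions introduced for the example, and the receipts are made-up values.

```python
from collections import defaultdict
from datetime import date


def fiscal_year(d: date) -> int:
    """Japanese fiscal year: April of year Y through March of year Y+1 is FY Y."""
    return d.year if d.month >= 4 else d.year - 1


# Categories counted as "medical cost" in the text; dental and care-giving are excluded.
# The label strings themselves are hypothetical.
MEDICAL_CATEGORIES = {"DPC", "outpatient_non_DPC", "pharmacy"}


def annual_medical_costs(receipts):
    """Sum receipt points per (member, fiscal year) over the three categories."""
    totals = defaultdict(int)
    for member_id, month, category, points in receipts:
        if category in MEDICAL_CATEGORIES:
            totals[(member_id, fiscal_year(month))] += points
    return dict(totals)


# Hypothetical receipts: (member id, billing month, category, points)
receipts = [
    ("A", date(2013, 4, 1), "outpatient_non_DPC", 1200),
    ("A", date(2014, 3, 1), "pharmacy", 300),   # March 2014 still belongs to FY2013
    ("A", date(2014, 4, 1), "DPC", 50000),      # FY2014
    ("A", date(2014, 5, 1), "dental", 800),     # excluded category
    ("B", date(2016, 3, 1), "pharmacy", 450),   # FY2015
]
costs = annual_medical_costs(receipts)
print(costs)

# The receipt counts reported in the text add up to the stated total.
receipt_total = 15_652 + 1_986_494 + 1_169_920
print(receipt_total)
```

Members with a checkup but no receipts in a fiscal year would simply be absent from `costs`; in the analysis such cases correspond to a medical cost of zero.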
2.2. Power Transformation Tobit Model for Medical Costs
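As mentioned, medical costs contain many zero values. To deal with this, the Tobit model (also called the censored regression model or limited dependent variable model) is widely used. In the Tobit model, we observe the value of the dependent variable if the latent variable takes a positive value, and 0 if it takes 0 or a negative value. The model is given by

\[ y_i^* = x_i'\beta + u_i, \qquad y_i = y_i^* \ \text{if } y_i^* > 0 \quad \text{and} \quad y_i = 0 \ \text{if } y_i^* \le 0, \tag{1} \]

where \(x_i\) and \(\beta\) are vectors of explanatory variables and unknown parameters. \(y_i^*\) is not observable when its value is negative. The model is usually estimated by the maximum likelihood estimator (MLE) under the assumption that the error term \(u_i\) follows a normal distribution. However, in this case, the distribution was quite different from the normal distribution, so that we could not use the MLE directly. Sittig, Friedel and Wasem found that costs were not normally distributed, but were skewed to the right, as in this study. They used the log transformation to make the distribution close to the normal distribution. However, as medical costs contain zero values, we could not use the log transformation. Instead, we used the power transformation model given by

\[ y_i = z_i^{\alpha}, \tag{2} \]

where \(z_i\) is the medical cost, and \(\alpha\) is the transformation parameter. Combining these two models, we obtained a power transformation Tobit model, and the log of the likelihood became

\[ \log L(\theta) = \sum_{z_i = 0} \log\left[1 - \Phi\left(\frac{x_i'\beta}{\sigma}\right)\right] + \sum_{z_i > 0} \left[\log \phi\left(\frac{z_i^{\alpha} - x_i'\beta}{\sigma}\right) - \log \sigma + \log \alpha + (\alpha - 1)\log z_i\right], \]

where \(\phi\) and \(\Phi\) are the density and distribution functions of the standard normal distribution, \(\sigma^2\) is the variance of \(u_i\), and \(\theta' = (\alpha, \beta', \sigma^2)\). Let \(\hat{\theta}\) be the MLE that maximizes \(\log L(\theta)\), and \(\theta_0\) be the true parameter value. Then its asymptotic distribution is given by

\[ \sqrt{n}\,(\hat{\theta} - \theta_0) \rightarrow N\left[0, \left(-\operatorname{plim} \frac{1}{n} \left.\frac{\partial^2 \log L}{\partial\theta\,\partial\theta'}\right|_{\theta_0}\right)^{-1}\right]. \]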
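To make the likelihood concrete, the following is a minimal sketch of the log-likelihood of a power transformation Tobit model, assuming the latent index \(y_i^* = x_i'\beta + u_i\) with normal errors and the transformation \(z_i^{\alpha} = y_i^*\) for positive costs. The function name, the argument layout and the toy data are assumptions made for illustration; this is not the authors' estimation code.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()
_LOG_SQRT_2PI = 0.5 * math.log(2.0 * math.pi)


def power_tobit_loglik(alpha, beta, sigma, X, z):
    """Log-likelihood of a power transformation Tobit model.

    Latent index: y* = x'beta + u, with u ~ N(0, sigma^2).
    Observed cost z satisfies z**alpha = y* when y* > 0, and z = 0 otherwise.
    """
    total = 0.0
    for x_i, z_i in zip(X, z):
        xb = sum(b * v for b, v in zip(beta, x_i))
        if z_i == 0:
            # Censored case: P(y* <= 0) = Phi(-x'beta / sigma)
            total += math.log(_STD_NORMAL.cdf(-xb / sigma))
        else:
            resid = (z_i ** alpha - xb) / sigma
            # Log normal density of the transformed cost ...
            total += -_LOG_SQRT_2PI - 0.5 * resid * resid - math.log(sigma)
            # ... plus the log Jacobian of z -> z**alpha, i.e. alpha * z**(alpha - 1)
            total += math.log(alpha) + (alpha - 1.0) * math.log(z_i)
    return total


# Toy data (hypothetical): a constant and one regressor, three cases, one with zero cost.
X = [(1.0, 50.0), (1.0, 60.0), (1.0, 40.0)]
z = [0.0, 10000.0, 2500.0]
loglik = power_tobit_loglik(alpha=0.5, beta=(-50.0, 2.0), sigma=30.0, X=X, z=z)
print(loglik)
```

In practice this function would be maximized over \((\alpha, \beta, \sigma)\) with a numerical optimizer. With \(\alpha = 1\) the Jacobian terms vanish and the expression reduces to the standard Tobit log-likelihood.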
As explanatory variables that might affect medical costs, we chose the following variables: Age, Female, Height, BMI, SBP (systolic blood pressure), DBP (diastolic blood pressure), Eat_fast (1: eating faster than other people, 0: otherwise), Late_Supper (1: eating supper within two hours before bedtime three times or more in a week, 0: otherwise), After_supper (1: eating snacks after supper three times or more in a week, 0: otherwise), No_breakfast (1: not eating breakfast three times or more in a week, 0: otherwise), Exercise (1: doing exercise for 30 minutes or more two or more times a week for more than a year, 0: otherwise), Daily_activity (1: doing physical activities (walking or equivalent) for one hour or more daily, 0: otherwise), Walk_fast (1: walking faster than other people of a similar age and gender, 0: otherwise), Smoke (1: smoking, 0: otherwise), Alcohol_freq (0: not drinking alcoholic drinks, 1: sometimes, 2: every day), Alcohol_amount (0: not drinking, 1: drinking less than 180 ml of Japanese sake (about 15% alcohol) or equivalent alcohol in a day, 2: drinking 180 - 360 ml, 3: drinking 360 - 540 ml, 4: drinking 540 ml or more), Sleep (1: sleeping well, 0: otherwise), F2014 (1: fiscal year 2014, 0: otherwise), F2015 (1: fiscal year 2015, 0: otherwise), Kidney (1: with kidney failure/dialysis, 0: otherwise), Cerebrovascular (1: with cerebrovascular diseases, 0: otherwise), Cardiovascular (1: with cardiovascular diseases, 0: otherwise), Diabetes (1: with diabetes, 0: otherwise), and Anamnesis (1: with anamnesis, 0: otherwise).
Age, Female and Height represent basic characteristics of an individual. BMI represents obesity, and blood pressures are important factors affecting health conditions. Eat_fast, Late_Supper, After_supper, No_breakfast, Exercise, Daily_activity, Smoke, the variables related to drinking alcohol, and Sleep represent the person’s lifestyle. These variables are important when giving patients practical advice on how to improve their lifestyle to save medical costs. Cerebrovascular, Cardiovascular and Diabetes are costly diseases that may be related to kidney diseases.
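As a result, \(x_i'\beta\) in Equation (1) becomes (Model A):

\[
\begin{aligned}
x_i'\beta = {} & \beta_0 + \beta_1\,\text{Age} + \beta_2\,\text{Female} + \beta_3\,\text{Height} + \beta_4\,\text{BMI} + \beta_5\,\text{SBP} + \beta_6\,\text{DBP} \\
& + \beta_7\,\text{Eat\_fast} + \beta_8\,\text{Late\_Supper} + \beta_9\,\text{After\_supper} + \beta_{10}\,\text{No\_breakfast} + \beta_{11}\,\text{Exercise} \\
& + \beta_{12}\,\text{Daily\_activity} + \beta_{13}\,\text{Walk\_fast} + \beta_{14}\,\text{Smoke} + \beta_{15}\,\text{Alcohol\_freq} + \beta_{16}\,\text{Alcohol\_amount} \\
& + \beta_{17}\,\text{Sleep} + \beta_{18}\,\text{F2014} + \beta_{19}\,\text{F2015} + \beta_{20}\,\text{Kidney} + \beta_{21}\,\text{Cerebrovascular} \\
& + \beta_{22}\,\text{Cardiovascular} + \beta_{23}\,\text{Diabetes} + \beta_{24}\,\text{Anamnesis}.
\end{aligned}
\]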
2.3. Probit Model for the Risk Factors of Kidney Disease
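Next, we evaluated risk factors of kidney failure/dialysis by the probit model. Let \(d_i\) be a dummy variable that takes 1 if an individual has kidney failure/dialysis and 0 otherwise. The probit model is given by

\[ d_i = 1 \ \text{if } d_i^* > 0 \quad \text{and} \quad d_i = 0 \ \text{if } d_i^* \le 0, \qquad d_i^* = x_i'\gamma + \varepsilon_i, \qquad P[d_i = 1] = \Phi(x_i'\gamma), \tag{3} \]

where \(x_i\) and \(\gamma\) are here the vectors of explanatory variables and unknown parameters of the probit model, \(\varepsilon_i\) follows the standard normal distribution and \(\Phi\) is its distribution function. \(d_i^*\) is a latent variable and only its sign is observable. Basic characteristics of individuals, lifestyles, and factors determined to be relevant in previous studies were chosen as explanatory variables, so that the index \(x_i'\gamma\) in Equation (3) includes a constant term, Age, Female, Height, BMI, SBP, DBP, the lifestyle variables and Diabetes (Model B; see Table 4).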
As mentioned before, Cerebrovascular, Cardiovascular and Anamnesis are considered important risk factors. However, these disease variables raise endogeneity problems: kidney disease is itself an important anamnesis, and the causal relationships among kidney, cerebrovascular and cardiovascular diseases are unclear. Therefore, we excluded these variables and considered the reduced form rather than the structural form in this study.
3.1. Distribution of Medical Costs for All Cases
The distribution of medical costs for all cases is shown in Figure 1 and listed in Table 1. In Japan, medical costs are measured in points rather than yen, and medical institutions are paid 10 yen per point. The average and standard deviation (SD) were 13,672 and 39,576 points, respectively, and the total cost was 1558 million points. The distribution shows a very heavy tail on the right side. In 18.9% of cases, the medical costs were zero. On the other hand, 1.7% of cases used more than 100,000 points and accounted for 30.2% of the total cost. Moreover, 0.17% of cases used over 500,000 points, and their medical costs accounted for 7.8% of the total cost.
Table 1. Distribution of medical costs.
Figure 1. Distribution of medical costs.
3.2. Distribution of Medical Costs for Kidney Failure/Dialysis Cases
In 281 cases (0.25% of all cases), individuals were diagnosed with kidney failure/dialysis as an anamnesis, and their medical costs were 3.5% of total medical costs. The average and SD were 191,791 and 242,490 points, respectively. This means that their average medical cost was 14.5 times as large as that of those without kidney failure/dialysis. Moreover, among the cases where the medical costs were more than 100,000 and more than 500,000 points, kidney failure/dialysis cases accounted for 5.9% (=114/1945) and 32.1% (=61/190), respectively. Individuals underwent checkups in one to three years in our dataset. A total of 122 individuals were diagnosed with kidney failure/dialysis in one year, while 30 and 33 individuals were diagnosed with kidney failure/dialysis in two and three years, respectively. The average medical costs per fiscal year were 63,608, 206,935 and 340,576 points for those diagnosed with kidney failure/dialysis in one, two and three years, respectively.
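The shares and ratios quoted in this section follow from the reported counts and averages, and the arithmetic can be checked with a short script. It uses only the rounded figures given in the text, so the results are approximate; the variable names are chosen for illustration.

```python
# Figures reported in the text (points; 1 point = 10 yen)
n_cases = 113_979          # all cases
mean_all = 13_672          # average medical cost over all cases
n_kidney = 281             # kidney failure/dialysis cases
mean_kidney = 191_791      # average medical cost of those cases
YEN_PER_POINT = 10

total_points = n_cases * mean_all
kidney_points = n_kidney * mean_kidney

share_cases = n_kidney / n_cases              # fraction of cases
share_cost = kidney_points / total_points     # fraction of total cost

# Average cost of the remaining (non-kidney) cases and the resulting ratio
mean_other = (total_points - kidney_points) / (n_cases - n_kidney)
ratio = mean_kidney / mean_other

# 122, 30 and 33 individuals diagnosed in one, two and three years
cases_from_individuals = 122 * 1 + 30 * 2 + 33 * 3

print(f"total cost: {total_points / 1e6:.0f} million points "
      f"({total_points * YEN_PER_POINT / 1e9:.1f} billion yen)")
print(f"share of cases: {share_cases:.2%}, share of cost: {share_cost:.1%}")
print(f"cost ratio, kidney vs. others: {ratio:.1f}")
print(f"cases implied by individual counts: {cases_from_individuals}")
```

The individual counts reproduce the 281 cases, and the ratio is computed against the cases without kidney failure/dialysis rather than against the overall average.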
3.3. Estimation Results of the Power Transformation Tobit Model for Medical Costs
Table 2 presents a summary of explanatory variables, and Table 3 lists the estimation results for the medical costs (Model A). The estimate of the transformation parameter, \(\hat{\alpha}\), was 0.506. The estimates of Age, Female, Height and BMI were positive and significant at the 1% level; i.e., these variables make medical costs higher. The estimate of DBP was positive and significant at the 5% level, but the estimate of SBP was not significant. Among variables related to lifestyle, the estimates of Eat_fast and Late_Supper were positive and significant at the 1% and 5% levels, respectively. The estimates of No_breakfast, Daily_activity, Walk_fast, Smoke and Sleep were negative and significant at the 1% level, and that of After_supper was negative and significant at the 5% level. These variables were associated with lower medical costs. As for alcohol consumption, the estimate of Alcohol_freq was negative, but that of Alcohol_amount was positive and significant at the 1% level. This means that although a proper amount of alcohol consumption might reduce medical costs, too much increased costs. The estimate of F2015 was positive and significant at the 1% level. Concerning the disease variables, all estimates were positive and significant at the 1% level. The values were much larger than those of other variables. If an individual has an anamnesis, the medical costs become much higher, as expected. In particular, the estimate for Kidney was 233.7, much higher than even those of other diseases; this means that kidney failure/dialysis is a very costly disease even after eliminating the effects of other variables.
Table 2. Summary of explanatory variables.
Table 3. Results of estimation for medical costs: Model A.
*: significant at the 5% level, **: significant at the 1% level, SE: standard error.
3.4. Probit Model for Risk Factors of Kidney Failure/Dialysis
Table 4 presents the estimation results of the probit model for risk factors of kidney failure/dialysis (Model B). The estimate of Age was positive and significant at the 1% level, but those of Female, Height and BMI were not significant. The estimates of SBP and DBP were positive and negative, respectively, and significant at the 1% level; this implies that not only high SBP (SBP hypertension) but also low DBP (DBP hypotension) is an important risk factor. Among lifestyle variables, the estimates of Walk_fast and Alcohol_freq were significant, but those of the other variables were not significant at the 5% level. The estimate of Diabetes was significant at the 1% level. This means that diabetes is a major risk factor for kidney failure/dialysis. These results are largely consistent with the findings of previous studies.
Table 4. Results of estimation for risk factors: Model B.
*: significant at the 5% level, **: significant at the 1% level, SE: standard error.
The results of our analyses suggested that kidney failure/dialysis is a very costly disease, especially when it progresses (i.e., becomes CKD). The average medical cost per fiscal year for those diagnosed with kidney failure/dialysis in one, two and three years was 4.8, 15.6 and 25.6 times, respectively, as much as that of individuals without kidney failure. As a result, the 99 cases of the 33 individuals diagnosed with kidney failure/dialysis in all three years, less than 0.1% of all cases, used 2.2% of total medical costs. Therefore, prevention and treatment at an early stage of kidney failure/dialysis, especially to avoid CKD, are a very important issue for the Japanese medical system.
The estimation results of Model A suggested that kidney failure/dialysis is a truly costly disease even when the effects of other variables are eliminated. Since the power transformation Tobit model is nonlinear, the average cost cannot be read directly from the estimates, and we calculated it by computer simulation with 10,000 trials. For comparison of medical costs, we considered as a base case a male of age 50, height 170 cm, BMI 24, SBP 125 mmHg and DBP 80 mmHg, with the values of all other variables set to zero. The average annual medical cost of this individual was about 9200 points. If he had kidney failure/dialysis, however, the cost increased to 103,390 points, or 11.3 times the base case. The medical cost of an individual with both kidney failure/dialysis and diabetes became 147,960 points, or 16.2 times the base case.
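A simulation of this kind can be sketched as follows, assuming that each trial draws a normal error, forms the latent index and maps positive values back to the cost scale through the inverse power transformation \(z = (y^*)^{1/\alpha}\). The values of `ALPHA` and `KIDNEY_COEF` are the estimates reported in the text; `BASE_INDEX` and `SIGMA` are hypothetical placeholders, because the base-case index and the error standard deviation are not given in the text, so the printed costs are illustrative only.

```python
import random


def simulate_mean_cost(index, sigma, alpha, trials=10_000, seed=0):
    """Monte Carlo estimate of E[z], where z = max(y*, 0) ** (1 / alpha)
    and y* = index + u with u ~ N(0, sigma^2)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        latent = index + sigma * rng.gauss(0.0, 1.0)
        if latent > 0:
            # Invert the power transformation; non-positive draws give zero cost.
            total += latent ** (1.0 / alpha)
    return total / trials


ALPHA = 0.506         # reported estimate of the transformation parameter
KIDNEY_COEF = 233.7   # reported estimate for Kidney
BASE_INDEX = 70.0     # hypothetical value of x'beta for the base case
SIGMA = 60.0          # hypothetical error standard deviation

base_cost = simulate_mean_cost(BASE_INDEX, SIGMA, ALPHA)
kidney_cost = simulate_mean_cost(BASE_INDEX + KIDNEY_COEF, SIGMA, ALPHA)
print(f"base case: {base_cost:.0f} points")
print(f"with kidney failure/dialysis: {kidney_cost:.0f} points "
      f"({kidney_cost / base_cost:.1f} times the base case)")
```

Using the same seed for both scenarios applies common random numbers, so the difference between the two averages reflects the shift in the index rather than sampling noise.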
Next, we calculated the probability of an individual developing kidney failure/dialysis using the results of Model B and the formula \(P[d_i = 1] = \Phi(x_i'\hat{\gamma})\), where \(\Phi\) is the standard normal distribution function and \(x_i'\hat{\gamma}\) is the estimated index of the probit model. For the base case, the probability of an individual developing kidney failure/dialysis was 0.23%. If this individual had diabetes, the probability increased to 0.61%, or 2.7 times the base case. Nawata and Kimura found that diabetes was a far more costly disease than thought. One reason might be that diabetes increases the risk of kidney failure/dialysis and CKD. If an individual has SBP hypertension of 140, 160 and 180 mmHg (these values correspond to the SBP criteria for grades 1, 2 and 3 hypertension), the probabilities of kidney disease become 0.35%, 0.63% and 1.08%, respectively. Meanwhile, an individual with DBP hypotension of 60 mmHg has an increased probability of 0.41%. Those with both diabetes and blood pressure problems have a much higher risk. The probabilities for diabetic individuals with SBP of 140, 160 and 180 mmHg become 0.91%, 1.52% and 2.46%, respectively. The probability for a diabetic individual with DBP hypotension of 60 mmHg becomes 1.03%. This means that the risk of a diabetic individual with SBP hypertension of 180 mmHg or with DBP hypotension of 60 mmHg is 10.5 and 4.6 times, respectively, that of the base case. For prevention and treatment at an early stage of kidney disease, screening out risky individuals is essential. These results could help identify such high-risk individuals.
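Because the probit probability is the standard normal distribution function evaluated at a linear index, the reported probabilities can be mapped back to the index scale, where the effects of the risk factors add up. The sketch below does this with the probabilities quoted in the text; it does not use the Table 4 coefficients, and the assumption that Model B contains no interaction terms is ours. The function and variable names are chosen for illustration.

```python
from statistics import NormalDist

std_normal = NormalDist()


def probit_index(probability):
    """Index value x'gamma implied by a probit probability."""
    return std_normal.inv_cdf(probability)


def probability_from_index(index):
    """Probit probability Phi(x'gamma) for a given index value."""
    return std_normal.cdf(index)


# Probabilities reported in the text
p_base = 0.0023       # base case
p_diabetes = 0.0061   # base case with diabetes
p_sbp180 = 0.0108     # base case with SBP of 180 mmHg

base_index = probit_index(p_base)
diabetes_shift = probit_index(p_diabetes) - base_index
sbp180_shift = probit_index(p_sbp180) - base_index

# On the index scale the two shifts add (assuming no interaction terms).
p_both = probability_from_index(base_index + diabetes_shift + sbp180_shift)

print(f"implied index shift for diabetes: {diabetes_shift:.3f}")
print(f"implied index shift for SBP 180 mmHg: {sbp180_shift:.3f}")
print(f"implied probability with both: {p_both:.2%}")
```

The implied probability for a diabetic individual with SBP of 180 mmHg comes out close to the 2.46% reported above, the small gap being due to rounding of the published probabilities. The combined risk is larger than the sum of the two separate increases because the normal distribution function is convex in this range.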
In this paper, we evaluated the medical costs and probability of kidney diseases (kidney failure/dialysis). In terms of total medical costs, the distribution shows a very heavy tail on the right side. In a small number (0.25%) of cases, individuals were diagnosed with kidney diseases, and their medical costs totaled 3.5% of all medical costs. The average medical cost of an individual with kidney disease was 14.5 times that of individuals without kidney disease. Moreover, if the disease progressed to CKD, the medical cost increased substantially. We then used the power transformation Tobit model to eliminate the effects of other variables that might affect the medical costs. Even after controlling for various characteristics, lifestyles and medical histories of individuals, the conclusion was the same; that is, kidney diseases are truly very costly diseases. Finally, risk factors were evaluated using the probit model. The important risk factors for kidney diseases are diabetes and blood pressure problems (not only SBP hypertension, but also DBP hypotension). In particular, an individual with both diabetes and blood pressure problems has a very high probability of developing kidney diseases. These results could help medical personnel to identify high-risk individuals and provide them with sound advice and/or treat them at an early stage of kidney disease.
We note two limitations of this study. First, our dataset was observational and covered only 3 years. Many individuals with kidney diseases were already receiving medical treatments that might have affected the values of explanatory variables, and individuals with severe kidney diseases might have had to leave the company and thus the health insurance society. Second, the number of individuals with kidney diseases was relatively small. It will be necessary to analyze a larger and longer-range dataset from various insurance societies. We are currently negotiating with various health insurance societies to provide us with such datasets. In Japan, medical costs are determined by the government, and the same payment system is used regardless of region, with a few exceptions. Moreover, an individual can freely choose hospitals and clinics. Analyses of regional effects and of the characteristics of hospitals and clinics are very important. We will also need to analyze other costly diseases such as heart and brain diseases. These are subjects to be studied in future.
This study was supported by a Grant-in-Aid for Scientific Research, “Analyses of Medical Checkup Data and Possibility of Controlling Medical Expenses (Grant Number: 17H22509),” from the Japan Society for the Promotion of Science, and by a research grant, “Exploring Inhibition of Medical Expenditure Expansion and Health-oriented Business Management Based on Evidence-based Medicine,” from the Research Institute of Economy, Trade and Industry (RIETI). The dataset was anonymized at the health insurance society. This study was approved by the Institutional Review Boards of the University of Tokyo (number: KE17-10). The authors would like to thank the health insurance society for their sincere cooperation in providing us with the data. We would also like to thank an anonymous referee for his/her helpful comments and suggestions.