It has long been recognised, since the work of Zipf, that the city size distribution follows a very simple law, which is attributed to the generic principle of least effort in human behavior. Consider the number of cities having a population size between v and v + dv, and the associated cumulative distribution, that is, the fraction of cities with population size larger than v.
Zipf found empirically that this cumulative distribution decays like $v^{-\alpha}$ with $\alpha \approx 1$, especially when focusing on large cities, that is, when v is big. As is well known in the city size distribution literature, this empirical law is quite close to reality for most societies across time. However, existing empirical evidence suggests that Zipf’s law is not always observable, even for the upper-tail cities of a territory. The controversy over the empirical findings may be due to sample selection biases, methodological weaknesses or data limitations. The hypothesis of Zipf’s law is more likely to be rejected for the entire city size distribution; hence, alternative distributions may be suggested in such cases.
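As a rough illustration of how a Zipf-type tail can be checked numerically, the sketch below draws synthetic city sizes from an exact Pareto law and recovers the tail exponent with the Hill estimator. The synthetic sample, the exponent name `alpha`, the minimum size `v_min` and the cutoff `k` are all assumptions made for this illustration; none of them comes from the data analysed in this paper.

```python
import math
import random


def sample_pareto(n, alpha, v_min, rng):
    """Inverse-transform sampling from P(V > v) = (v_min / v)**alpha."""
    # 1 - rng.random() lies in (0, 1], which avoids raising zero to a negative power.
    return [v_min * (1.0 - rng.random()) ** (-1.0 / alpha) for _ in range(n)]


def hill_estimator(sizes, k):
    """Hill estimate of the tail exponent from the k largest observations."""
    top = sorted(sizes, reverse=True)[: k + 1]
    threshold = top[k]  # the (k+1)-th largest size acts as the tail cutoff
    return k / sum(math.log(v / threshold) for v in top[:k])


rng = random.Random(42)  # fixed seed so the illustration is reproducible
cities = sample_pareto(20000, alpha=1.0, v_min=1.0e4, rng=rng)
alpha_hat = hill_estimator(cities, k=2000)
print(f"estimated tail exponent: {alpha_hat:.3f}")
```

On real data the estimate can depend noticeably on the cutoff `k`, which is one way the sample selection issues mentioned above enter the analysis.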
Moreover, although very accurate in recovering the city size distribution, these contributions do not, in general, explain the mechanism by which this behavior forms in time. A theoretical derivation of Zipf’s law for cities has therefore been the object of many studies and, more recently, of the kinetic modeling of Gualandi and Toscani, in which Boltzmann-type and Fokker-Planck-type equations for the size distribution of cities are obtained by introducing interactions based on some migration rule among cities. This model demonstrated that the city size distribution is kinetically related to factors such as the rate or the tendency of migration of the inhabitants. As noticed there, the reasons behind migration are very complex, and it is quite difficult to select one reason or another as dominant. Different choices of the parameters of the kinetic interactions may explain the origin of different effects, or even a mixture of effects, which give in the limit a distribution that can be closer to a Pareto or Zipf law, a lognormal density, or others. In any case, the kinetic modeling considered in this framework is useful to clarify how various distributions form from different microscopic interactions.
We further mention that kinetic models were originally used to describe the dynamics of rarefied gases, by constructing a Boltzmann-type equation to analyse the effects of the discrete structure of gas molecules. In recent years, various kinetic models have been developed to study social and economic interactions in multi-agent systems, for example the statistical description of wealth distribution, opinion formation, knowledge formation, belief formation, criminality and so on. Note that a multi-agent system is composed of “agents” rather than particles; a kinetic model is used to describe the collective behaviour of the individuals in such a system.
At the kinetic level, in the model of Gualandi and Toscani the Boltzmann collision operator was chosen to be of Maxwellian type, as in classical kinetic theory; that is, the collision kernel is a constant that does not depend on the “relative velocity of the molecules”. In the context of city size distribution, the Maxwellian hypothesis corresponds to the strong assumption that the migration rate between agents (cities) does not depend on the number of inhabitants, so that a constant collision kernel is used. This simplifies a sophisticated problem so that it can be handled more easily from the mathematical point of view. To extend that kinetic formulation, in the present study we introduce a variable collision kernel in the underlying kinetic equation of Boltzmann type, as done for wealth distribution by Furioli et al.
The rest of the paper is organised as follows. Section 2 presents the detailed kinetic modeling of the problem. Section 3 derives the quasi-invariant limit, showing the asymptotic procedure leading from the kinetic description of Boltzmann type to the Fokker-Planck equation, whose equilibrium belongs to the class of generalized Gamma distributions. Finally, Section 4 carries out some numerical tests to validate the model. The test is based on a collection of 332 cities (prefecture-level administrative regions) in China taken from the 2019 Statistical Yearbook, which is fitted well by the generalized Gamma distribution.
2. Kinetic Modeling of City Size Distribution
To study the evolution of the city size distribution by kinetic models, one first needs to specify the microscopic “collision” rules that describe the change in the population of a city. Consider a multi-agent system in which all agents (cities) are assumed to be indistinguishable. A city’s state at any instant of time is completely characterized by its number of inhabitants v. To avoid inessential difficulties, we simply assume that v is a non-negative real number, although it is clear that v is a natural number. Consequently, the distribution of the multi-agent system, that is, the city size distribution, can be fully characterized by an unknown probability density function of the size v and the time t.
Following Gualandi and Toscani, we assume that the number of residents of a city essentially increases with the inflow of immigrants and decreases with the outflow of emigrants. At the same time, owing to uncontrollable factors, the population also changes for other, uncertain reasons and shows random fluctuations. Hence, the microscopic variation of the city size v is the result of three different contributions (an internal mechanism, an external mechanism, and random fluctuations), which combine in the interaction rule (2.1) and involve the following quantities:
• the pre- and post-interaction city sizes, that is, the number of inhabitants of a city before and after a microscopic interaction process, respectively;
• the environmental contribution, that is, the amount of population which can migrate towards a city from the environment (the multi-agent system). This value is usually sampled from a given distribution function, which characterizes the environment itself;
• the random fluctuation, a random variable with zero mean and bounded variance, with the variance taken suitably small so that the post-interaction size remains positive;
• the internal and external rates, that is, the rates of variation of the city size v due to the internal and external mechanisms, respectively. A more precise description of them is given below.
Internal mechanism. For the rate E related to the internal mechanism, we use the concept of “value function” introduced by Kahneman and Tversky in prospect theory: losses weigh heavier than gains in the change of the value function, that is, the value function is concave in the domain of gains, convex in the domain of losses, and considerably steeper for losses than for gains. In the interaction (2.1), the rate E plays the role of the value function, and can be taken of the form
in which one parameter defines an ideal city size, a small positive parameter represents the strength of the interaction, and the remaining parameters quantify the intensity of the migration rates near the ideal city size. For more explanation on the choice of the value function, we refer to Gualandi and Toscani. To simplify the model, we first take the ideal size of all cities in the whole system to be a given value, although for different countries this value may depend on history, political system, cultural background, or other factors. Clearly E is bounded; it is negative when the city size v is below the ideal size, and positive in the opposite situation. Hence, this quantity describes the tendency of the population to reach the ideal size whenever the city size differs from it, the reason being that people prefer to live in a city of the ideal population.
External mechanism. A non-negative function can be used to measure the immigration rate. For simplicity, following Gualandi and Toscani, a non-negative constant is chosen in the present study. A more general choice can be
for some positive parameters characterizing the intensity of the immigration rate.
With the interaction rule (2.1), the time evolution of the city size density satisfies a linear Boltzmann-like equation, which can be written in weak form, for all smooth functions (the observable quantities),
In (2.3), the expectation symbol denotes mathematical expectation with respect to the random variable in (2.1), and the collision kernel assigns to each interaction a certain probability of occurring. Note that in the model of Gualandi and Toscani the simplification of Maxwell molecules, leading to a constant interaction kernel, was assumed. To extend this, we notice that the distribution of city size in a country is closely related to the national conditions of the country, and to many factors such as the speed of urban economic development and the conditions of urban construction and development.
According to the National Bureau of Statistics (NBS) of China, the population of China’s cities has been constantly expanding over the past 70 years, with large, medium-sized and small cities distributed across the country. Small cities attract people because of the low threshold of Hukou (household registration). In pursuit of a better quality of life, many people are willing to live in big cities such as Beijing or Shanghai, which offer more employment opportunities, better income and higher education. However, big cities are already very crowded, and in recent years, with China’s urbanisation, medium-sized cities have also become attractive for their new opportunities and living conditions. Migration between cities is more frequent than ever. There is strong evidence that population mobility is greater in cities with a large population, such as Beijing and Shanghai, than in smaller cities such as Yibin and Xining. On the other hand, for geographic and historical reasons, the number of cities (prefecture-level administrative regions) is relatively fixed, so interactions for v small or near zero should be excluded. Thus, the collision frequency may be proportional to the city size v. To account for this behavior, it seems natural to consider a variable collision kernel of the form
for suitable constants. This kernel assigns a high probability of interaction to cities with a large population, and a low probability of interaction to cities with population v close to zero. Taking this new assumption into account, we consider in the following that the city size density satisfies the linear kinetic model
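The effect of a size-dependent kernel can be mimicked in a Monte Carlo setting by acceptance-rejection: a city is drawn uniformly, and its interaction is accepted with probability proportional to the kernel. The sketch below assumes a power-law kernel `kappa * v**beta` with hypothetical values `kappa = 1` and `beta = 1`, and a synthetic lognormal population; the exact kernel form, the constants and the data are assumptions made here for illustration only.

```python
import random


def kernel(v, kappa=1.0, beta=1.0):
    """Assumed power-law collision kernel K(v) = kappa * v**beta."""
    return kappa * v ** beta


def pick_interacting_city(sizes, k_max, rng):
    """Acceptance-rejection: accept a uniformly drawn city with probability K(v) / k_max."""
    while True:
        v = rng.choice(sizes)
        if rng.random() < kernel(v) / k_max:
            return v


rng = random.Random(0)  # fixed seed for reproducibility
# Synthetic city sizes in arbitrary units (not the Yearbook data).
sizes = [rng.lognormvariate(0.0, 1.0) for _ in range(2000)]
k_max = max(kernel(v) for v in sizes)

picked = [pick_interacting_city(sizes, k_max, rng) for _ in range(2000)]
mean_all = sum(sizes) / len(sizes)
mean_picked = sum(picked) / len(picked)
print(f"mean size of all cities:         {mean_all:.3f}")
print(f"mean size of interacting cities: {mean_picked:.3f}")
```

Under these assumptions the interacting cities are on average larger than the population as a whole, which is the qualitative behavior the variable kernel is meant to capture; with a constant kernel every draw would be accepted and the two means would coincide up to sampling noise.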
3. Quasi-Invariant Limit: The Fokker-Planck Equation
In order to describe the development of the city size distribution more accurately and intuitively, we carry out the quasi-invariant limit. In this section, we illustrate the main steps leading from Equation (2.5) to its Fokker-Planck limit. To avoid inessential difficulties, we assume that the environmental distribution has a certain number of bounded moments; more precisely,
meanwhile, we introduce the notation
Taking the test function identically equal to one in (2.5) shows that the kinetic equation preserves mass.
For the quasi-invariant limit, one assumes that a single interaction determines only an extremely small change of the value v. Therefore, a small parameter is introduced and we consider the scaling
Under this scaling, an interaction produces only a very small change in the population size of a city. Obviously, the conservation of “mass” of the system still holds under the scaling. To observe the evolution of the mean value, take the test function equal to v in (2.5), which gives
Next, denote , then
Now, we can resort to a scaling of time to observe an evolution of the average value that is independent of the small parameter. After rescaling time and the density accordingly, the evolution of the average value satisfies
Since (3.5) means that the second term vanishes as the small parameter tends to zero, one obtains in the limit a closed form for the evolution of the mean value.
It can be observed that the evolution of the mean does not depend on the small parameter. Since, for small values of this parameter, the microscopic interactions produce a very small change of the value v, a finite variation of the mean can be observed only if the agents in the system undergo a huge number of interactions in a fixed period of time, which the time scaling provides. Similarly, with this scaling, one obtains in the limit a closed form for the evolution of the second moment
The above analysis can be used to justify the passage from the kinetic model (2.5) to its continuous counterpart given by a Fokker-Planck type equation. Given a smooth test function, let us expand its post-interaction value in Taylor series around the pre-interaction size v. First, by the scaling (3.3), it holds
Therefore, in terms of powers of the small parameter, we easily obtain the expression
where the remainder term vanishes as the small parameter tends to zero. Therefore, in this limit, as a consequence of the scaling (3.3), the weak form of the kinetic model (2.5) is well approximated by the weak form of a linear Fokker-Planck equation
Provided the boundary terms produced by the integration by parts vanish, Equation (3.12) coincides with the weak form of the Fokker-Planck equation
Without loss of generality, we will simplify Equation (3.13) by assuming
Thus, the resulting Fokker-Planck equation takes the form
As discussed at length in the literature on Fokker-Planck equations for socio-economic phenomena, the appropriate boundary conditions that guarantee mass conservation are the so-called no-flux boundary conditions given by
With these no-flux boundary conditions, we can obtain the explicit stationary solution of the Fokker-Planck Equation (3.13) by solving the ordinary differential equation of first order
Solving (3.17) by separation of variables gives, as its unique solution, the function
where the positive constant C has been chosen to normalize the equilibrium distribution. It is not difficult to discover that tends to 0 as and . In other words, the city population size distribution obtained under the Non-Maxwellian collision does not exist with too little or too much population, which is more consistent with the real situation, and is close to the generalized Gamma density as .
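For readers who wish to experiment with the limiting density, the following sketch evaluates a generalized Gamma density in one common parameterization, $f(v) = \frac{p}{a^{d}\,\Gamma(d/p)}\, v^{d-1} e^{-(v/a)^{p}}$, and checks numerically that it is normalized and vanishes at both ends of the size range. The parameter names `a`, `d`, `p` and their values are assumptions chosen for illustration and are not the fitted parameters of this paper; in this parameterization, vanishing near v = 0 requires d > 1.

```python
import math


def gen_gamma_pdf(v, a, d, p):
    """Generalized Gamma density with scale a and shape parameters d, p (all positive)."""
    if v <= 0.0:
        return 0.0
    # Work in log space to avoid overflow for large arguments.
    log_norm = math.log(p) - d * math.log(a) - math.lgamma(d / p)
    return math.exp(log_norm + (d - 1.0) * math.log(v) - (v / a) ** p)


def trapezoid(f, lo, hi, n):
    """Composite trapezoidal rule with n equal subintervals."""
    h = (hi - lo) / n
    total = 0.5 * (f(lo) + f(hi))
    for i in range(1, n):
        total += f(lo + i * h)
    return total * h


a, d, p = 2.0, 3.0, 1.5  # hypothetical parameters, not fitted values
mass = trapezoid(lambda v: gen_gamma_pdf(v, a, d, p), 0.0, 40.0, 40000)
print(f"total mass on [0, 40]: {mass:.6f}")
print(f"density near zero:     {gen_gamma_pdf(1e-6, a, d, p):.3e}")
print(f"density at v = 40:     {gen_gamma_pdf(40.0, a, d, p):.3e}")
```

The normalizing factor computed in `log_norm` plays the role of the constant C above, under the assumed parameterization.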
4. Numerical Tests
In this section, we use statistical data to verify the validity of the model. We chose data from the Statistical Yearbook 2019 (27 provinces and 4 municipalities directly under the Central Government) released by the National Bureau of Statistics of China, which reports the 2018 population of 332 cities (prefecture-level cities). The histogram of the city size distribution is shown in Figure 1.
From this histogram, we notice that cities with a population of 1 million to 2 million form the majority. Above 3 million inhabitants, the number of cities decreases as the population increases, and the rate of decrease changes from a sharp decline to a slow convergence to zero. To fit the data with our model, we take the set of parameters
The equilibrium distributions of both the Maxwellian model and our non-Maxwellian model are shown in Figure 2. This result shows that the non-Maxwellian model fits the city size distribution of China better than the Maxwellian model.
Figure 1. Probability distribution histogram of city sizes in China.
Figure 2. Theoretical steady-state distribution and the real data.
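One simple way to quantify a statement such as "one model fits better than another" is to compare log-likelihoods on the same sample. The sketch below does this on synthetic data drawn from a Gamma law (a special case of the generalized Gamma family), comparing the generating density with a lognormal density fitted by maximum likelihood. The data, the parameter values, and the use of a lognormal as the competing density are assumptions for illustration only; they are neither the Yearbook data nor the two models shown in Figure 2.

```python
import math
import random


def gamma_logpdf(v, shape, scale):
    """Log-density of the Gamma law with the given shape and scale."""
    return ((shape - 1.0) * math.log(v) - v / scale
            - math.lgamma(shape) - shape * math.log(scale))


def lognormal_logpdf(v, mu, sigma):
    """Log-density of the lognormal law with log-mean mu and log-std sigma."""
    z = (math.log(v) - mu) / sigma
    return -math.log(v * sigma * math.sqrt(2.0 * math.pi)) - 0.5 * z * z


rng = random.Random(7)  # fixed seed for reproducibility
shape, scale = 2.0, 1.5  # hypothetical parameters of the synthetic sample
data = [rng.gammavariate(shape, scale) for _ in range(5000)]

# Maximum-likelihood lognormal fit: mean and standard deviation of log(data).
logs = [math.log(v) for v in data]
mu = sum(logs) / len(logs)
sigma = math.sqrt(sum((x - mu) ** 2 for x in logs) / len(logs))

ll_gamma = sum(gamma_logpdf(v, shape, scale) for v in data)
ll_lognormal = sum(lognormal_logpdf(v, mu, sigma) for v in data)
print(f"log-likelihood, Gamma:     {ll_gamma:.1f}")
print(f"log-likelihood, lognormal: {ll_lognormal:.1f}")
```

A higher log-likelihood indicates a better description of the sample; when the competing models have different numbers of free parameters, a penalized criterion such as AIC would be a more cautious choice.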
5. Conclusion and Perspectives
In this paper, we introduced a non-Maxwellian kinetic model, in which a variable collision kernel is used in the underlying kinetic equation of Boltzmann type, to explain the evolution of city size in China. By resorting to the well-known quasi-invariant asymptotics, we obtained a kinetic Fokker-Planck counterpart whose steady state, the equilibrium city size distribution, is a generalized Gamma distribution. The numerical test shows a good fit of the generalized Gamma distribution to the city size distribution of China. However, the role of each parameter, for example the ideal city size, is not yet fully understood. It would also be interesting to investigate the trend of the city size distribution under the fast urbanisation of China in recent and coming years.
The research is partially supported by the National Science Foundation of China (Grant Nos. 11871335 and 11971008).
 Gualandi, S. and Toscani, G. (2019) Size Distribution of Cities: A Kinetic Explanation. Physica A: Statistical Mechanics and Its Applications, 524, 221-234.
 Arshad, S., Hu, S. and Ashraf, B.N. (2018) Zipf’s Law and City Size Distribution: A Survey of the Literature and Future Research Agenda. Physica A: Statistical Mechanics and Its Applications, 492, 75-92.
 Benguigui, L. and Blumenfeld-Lieberthal, E. (2007) Beyond the Power Law—A New Approach to Analyse City Size Distributions. Computers, Environment and Urban Systems, 31, 648-666.
 Gangopadhyay, K. and Basu, B. (2009) City Size Distributions for India and China. Physica A: Statistical Mechanics and Its Applications, 388, 2682-2688.
 Malevergne, Y., Pisarenko, V. and Sornette, D. (2011) Testing the Pareto against the Lognormal Distributions with the Uniformly Most Powerful Unbiased Test Applied to the Distribution of Cities. Physical Review E, 83, Article ID: 036111.
 Rozenfeld, H., Rybski, D., Gabaix, X. and Makse, H. (2011) The Area and Population of Cities: New Insights from a Different Perspective on Cities. American Economic Review, 101, 2205-2225.
 Ghosh, A., Chatterjee, A., Chakrabarti, A.S. and Chakrabarti, B.K. (2014) Zipf’s Law in City Size from a Resource Utilization Model. Physical Review E, 90, Article ID: 042815.
 Zanette, D.H. and Manrubia, S.C. (1997) Role of Intermittency in Urban Development: A Model of Large-Scale City Formation. Physical Review Letters, 79, 523-526.
 Gualandi, S. and Toscani, G. (2019) Human Behavior and Lognormal Distribution: A Kinetic Description. Mathematical Models and Methods in Applied Sciences, 29, 717-753.
 Chatterjee, A., Chakrabarti, B.K. and Manna, S.S. (2004) Pareto Law in a Kinetic Model of Market with Random Saving Propensity. Physica A: Statistical Mechanics and its Applications, 335, 155-163.
 Düring, B., Matthes, D. and Toscani, G. (2008) Kinetic Equations Modelling Wealth Redistribution: A Comparison of Approaches. Physical Review E, 78, Article ID: 056103.
 Boudin, L., Mercier, A. and Salvarani, F. (2012) Conciliatory and Contradictory Dynamics in Opinion Formation. Physica A: Statistical Mechanics and Its Applications, 391, 5672-5684.
 Pareschi, L. and Toscani, G. (2014) Wealth Distribution and Collective Knowledge. A Boltzmann Approach. Philosophical Transactions of The Royal Society A, 372, Article No. 0396.
 Brugna, C. and Toscani, G. (2015) Kinetic Models of Opinion Formation in the Presence of Personal Conviction. Physical Review E, 92, Article ID: 052818.
 Bellomo, N., Colasuonno, F., Knopoff, D. and Soler, J. (2015) From a Systems Theory of Sociology to Modeling the Onset and Evolution of Criminality. Networks & Heterogeneous Media, 10, 421-441.
 Furioli, G., Pulvirenti, A., Terraneo, E. and Toscani, G. (2019) Non-Maxwellian Kinetic Equations Modeling the Evolution of Wealth Distribution. Mathematical Models and Methods in Applied Sciences, 30, 685-725.
 Furioli, G., Pulvirenti, A., Terraneo, E. and Toscani, G. (2017) Fokker-Planck Equations in the Modelling of Socio-Economic Phenomena. Mathematical Models and Methods in Applied Sciences, 27, 115-158.