JMF Vol. 9 No. 4, November 2019
A New Way to Compute the Probability of Informed Trading
Abstract: Volume-Synchronized Probability of Informed Trading (VPIN) is a tool designed to predict extreme events such as flash crashes in high-frequency trading. Its aim is to estimate the Probability of Informed Trading (PIN), which was built from a probabilistic framework. Concerns have been raised about its theoretical foundations and its reliability. More precisely, it has been shown that, theoretically, the VPIN does not approximate the PIN, as the PIN was built within a time-clock framework and the VPIN within a volume-clock one. From a practical point of view, the VPIN has been found to be sensitive to the starting point of the computation on a data set and to different parameters, such as the classification rule. In this paper, in order to improve the PIN theoretical framework, we first analyze the theoretical foundations of the PIN and VPIN models to get a better view of the subtleties of their assumptions. This then makes it possible to point out some flaws in the formula used to approximate the PIN and to propose another, exact way to compute the PIN. All results are illustrated with simulations.

1. Introduction

The amount of trading data has exploded in finance thanks to the continuing progress of high-frequency techniques. It forces practitioners to use more and more state-of-the-art algorithms to deal with this overwhelming amount of information. Computers and algorithms are more and more efficient, but decision making still depends on both the quantity and the quality of information. Thus, errors and speculations that can make the financial market toxic, i.e. conducive to crashes, are still possible. Past examples, such as the “Flash Crash” of May 6, 2010, have shown that algorithmic trading has introduced new kinds of crashes characterized by their suddenness. Such quick crashes seem dangerous because of a kind of inherent unpredictability. However, a theoretical framework to model this new phenomenon exists.

Easley, Engle, O’Hara and Wu [1] designed a model of the high-frequency financial market based on flows of informed and uninformed traders. In this model, informed traders are aware of the future evolution of the price and thus of which decision to take (buy or sell). The authors showed that information is a key determinant of the bid-ask spread: they demonstrate that, within their theoretical framework, the spread is proportional to the probability of being informed. They named this key parameter the Probability of Informed Trading (PIN). A high value of the PIN is an indicator of the level of toxicity of a high-frequency trading market, as it would mean that the market relies on too many informed traders. Later, Easley, Lopez de Prado and O’Hara [2] [3] designed a tool, named Volume-Synchronized Probability of Informed Trading (VPIN), intended to approximate the PIN. It appeared that it could predict the “Flash Crash” of May 6, 2010 a few hours before it happened [4]. A number of papers have been written on it [5] [6] [7], and it has been proposed to use it for regulation through a VPIN contract [4] [8]. However, critics pointed out some flaws, questioning its reliability. For example, Andersen and Bondarenko have shown [9] that the VPIN is quite sensitive to the point of a data set at which one starts computing it, which calls the quality of VPIN predictions into question. Moreover, they have also shown that the VPIN is sensitive to other parameters, such as the trade classification rule used [10], or how one defines the average daily volume of trades [11]. Changing the classification rule may drastically change the VPIN behavior [12]. Tomas Pöppe, Sebastian Moos and Dirk Schiereck have arrived at the same conclusions with a different approach: using a different classification rule can change the predictive power of the VPIN for a crash (in their paper, of a German blue-chip stock) [13].
Besides, controlling for ex-ante parameters seems to give poorer prediction quality [10] [11]. This point has also been checked by D. Abad, M. Massot and R. Pascual [12]. When one controls for ex-ante realized volatility and trading intensity, as T. G. Andersen and O. Bondarenko did [11], prediction quality seems to vanish. More deeply, they have also underlined that it is not obvious how one should define a VPIN prediction, analyzing more precisely toxic and non-toxic halts, as well as toxic events. Furthermore, Torben G. Andersen and Oleg Bondarenko interpret the VPIN as being too sensitive to trading intensity. They have also explained that the VPIN metric is sometimes unexpectedly correlated with other usual ones (such as VIX or RV) [9] [10]. Moreover, it has been shown [14] [15] that the VPIN does not approximate the PIN, as the PIN was built on a time-clock theoretical framework and the VPIN on a volume-clock paradigm. In this study, we propose another way to estimate the PIN within its original time-clock framework.

The purpose of this paper is to improve the PIN theoretical framework. Some concerns have been raised about its theoretical foundations. For this reason we assess, step by step, the theoretical ideas of the PIN model. More precisely, we first make explicit the whole theoretical framework of the PIN and VPIN models to get a better view of the subtleties of their assumptions. This then makes it possible to point out some approximation errors in the formula used to approximate the PIN and to propose another, exact way to compute the PIN. In the following, we first recall the PIN model (Section 2). Second, after introducing the original ideas of the VPIN, we analyze the original first-order approximation and then recall the difference between the time-clock and volume-clock paradigms (Section 3). Finally, we suggest another way to compute the PIN (Section 4).

2. The PIN Model

2.1. The Time-Clock Framework

The Probability of Informed Trading (PIN) is computed on a simple model of information among traders [16]. Let us describe it with the tree of Figure 1, originally designed in [16]. Suppose that, prior to the beginning of any trading day, Nature determines whether an information event relevant to the value of the asset will occur. Suppose information events are independently distributed and occur with a Bernoulli probability $\alpha$, which can be seen on the first two branches on the left-hand side of the tree. These events are good news with a Bernoulli probability $1-\delta$ (i.e. signal High), or bad news with probability $\delta$ (i.e. signal Low). After the end of trading on any day, and before Nature moves again, the full information value of the asset is realized. Hence, for any of the three leaves of the tree in Figure 1, an informed trader would know which action to take. Trade arises from both informed traders (those who have seen a signal) and uninformed traders. On any day, arrivals of uninformed buyers and uninformed sellers are described by independent Poisson processes, each of intensity $\epsilon$, while on information days informed traders arrive according to an independent Poisson process of intensity $\mu$. Individuals trade a single risky asset and money with a market maker over $i = 1, \ldots, I$ trading days. Within any trading day, time is continuous and indexed by $t \in [0, T]$. For $t \in [0, T]$ and a given trading day, let $S_t$ and $B_t$ be the events that, respectively, a sell order and a buy order arrive at time $t$. Let $P_t = (P_t(n), P_t(b), P_t(g))$ be the market maker’s prior belief about the events “no news” (n), “bad news” (b) and “good news” (g) at time $t$¹. Within this model we compute the spread at time $t$, $\Sigma_t = a_t - b_t$, where $a_t$ and $b_t$ are the ask and the bid at time $t$ (respectively the minimum price a seller is willing to receive and the maximum price a buyer is willing to pay). Within this framework $b_t$ is the expectation of the asset value, denoted $V_t$, conditional on the history prior to $t$ and on the sell order $S_t$. Similarly, $a_t$ is the expectation of $V_t$ conditional on the history prior to $t$ and on the buy order $B_t$. Let us denote by $\overline{V}$, $V^*$ and $\underline{V}$ the value of the asset under good news, no news and bad news respectively. We have of course the following inequalities: $\underline{V} \le V^* \le \overline{V}$.

Figure 1. A tree summarizing the theoretical framework.

2.2. Computation of the Spread

We now make the content of [3] more explicit. Let us compute the bid; the ask follows exactly the same idea²:

$$b_t = E(V_t \mid t, S_t).$$

It can be rewritten as follows, using the different possible events of the tree:

$$b_t = E(V_t \mid t, S_t, n)\,P_t(n \mid S_t) + E(V_t \mid t, S_t, g)\,P_t(g \mid S_t) + E(V_t \mid t, S_t, b)\,P_t(b \mid S_t) = V^* P_t(n \mid S_t) + \overline{V} P_t(g \mid S_t) + \underline{V} P_t(b \mid S_t).$$

Let us compute the first term $P_t(n \mid S_t)$; the others follow the same idea. Using Bayes’ rule one finds:

$$P_t(n \mid S_t) = \frac{P_t(S_t \mid n)\,P_t(n)}{P_t(S_t)},$$

so, by decomposing the denominator:

$$P_t(n \mid S_t) = \frac{P_t(S_t \mid n)\,P_t(n)}{P_t(S_t \mid n)\,P_t(n) + P_t(S_t \mid g)\,P_t(g) + P_t(S_t \mid b)\,P_t(b)}.$$

Let us have a look at the term $P_t(S_t \mid n)$, which is the probability at $t$ that there will be a sell order at $t$ given no news. $P_t(S_t \mid n)$ is a transition rate. To compute it, one must first calculate the transition probability over a strictly positive time length, say $h$. Formally, if $N_t$ denotes the number of jumps of the corresponding Poisson process up to $t$ given no news, we know that $N_t$ is Poisson distributed with parameter $\epsilon t$. For any $h$ strictly positive and small enough, we look at the limit of the ratio $P(N_t - N_{t-h} \ge 1 \mid n)/h$ when $h$ goes to zero while remaining strictly positive, which defines the transition rate. At first order in $h$, one finds:

$$P(N_t - N_{t-h} \ge 1 \mid n) = 1 - e^{-\epsilon h} = \epsilon h + o(h).$$

Dividing by $h$, one indeed recovers the intensity of the Poisson process, which is a special case of a Markov jump process. Applying the same reasoning to the other cases (“bad event”, “good event”), we finally have:

$$P_t(S_t \mid n)\,P_t(n) + P_t(S_t \mid g)\,P_t(g) + P_t(S_t \mid b)\,P_t(b) = P_t(n)\,\epsilon + P_t(g)\,\epsilon + P_t(b)\,(\mu + \epsilon).$$

As the probabilities multiplying $\epsilon$ sum to one, we get the following expression:

$$P_t(S_t \mid n)\,P_t(n) + P_t(S_t \mid g)\,P_t(g) + P_t(S_t \mid b)\,P_t(b) = \epsilon + P_t(b)\,\mu.$$

Finally, the bid has this expression:

$$b_t = \frac{P_t(n)\,\epsilon\,V^* + P_t(b)\,(\epsilon + \mu)\,\underline{V} + P_t(g)\,\epsilon\,\overline{V}}{\epsilon + P_t(b)\,\mu}.$$

With the same reasoning, the ask has this expression:

$$a_t = \frac{P_t(n)\,\epsilon\,V^* + P_t(b)\,\epsilon\,\underline{V} + P_t(g)\,(\epsilon + \mu)\,\overline{V}}{\epsilon + P_t(g)\,\mu}.$$

One may simplify these expressions a little, as the expectation of $V_t$ has the following form:

$$E(V_t) = V^* P_t(n) + \underline{V} P_t(b) + \overline{V} P_t(g).$$

We find:

$$b_t = \frac{\mu\,\underline{V}\,P_t(b)}{\epsilon + P_t(b)\,\mu} + \frac{\epsilon}{\epsilon + P_t(b)\,\mu}\,E(V_t),$$

and:

$$a_t = \frac{\mu\,\overline{V}\,P_t(g)}{\epsilon + P_t(g)\,\mu} + \frac{\epsilon}{\epsilon + P_t(g)\,\mu}\,E(V_t).$$

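So the spread equals:

$$\Sigma_t = a_t - b_t = E(V_t)\left(\frac{\epsilon}{\epsilon + P_t(g)\,\mu} - \frac{\epsilon}{\epsilon + P_t(b)\,\mu}\right) + \frac{\mu\,\overline{V}\,P_t(g)}{\epsilon + P_t(g)\,\mu} - \frac{\mu\,\underline{V}\,P_t(b)}{\epsilon + P_t(b)\,\mu}.$$

In the special case where $P_t(g) = P_t(b)$, one finds the following simple form:

$$\Sigma_t = \frac{\mu\,P_t(g)}{\epsilon + P_t(g)\,\mu}\,(\overline{V} - \underline{V}) = \frac{\mu\,(1 - P_t(n))}{2\epsilon + (1 - P_t(n))\,\mu}\,(\overline{V} - \underline{V}).$$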

If we make the hypothesis that $P_t(n) = P_0(n) = 1 - \alpha$ is constant, then we have the following:

$$\Sigma_t = \frac{\mu\alpha}{2\epsilon + \alpha\mu}\,(\overline{V} - \underline{V}) = \mathrm{PIN}\,(\overline{V} - \underline{V}).$$

Thus, with the assumptions $P_t(g) = P_t(b)$, i.e. $\delta = 1 - \delta = \tfrac{1}{2}$, and $P_t(n) = P_0(n) = 1 - \alpha$, the PIN equals:

$$\mathrm{PIN} = \frac{\mu\alpha}{2\epsilon + \mu\alpha}.$$

We will keep the same hypothesis for the rest of the paper.
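As a numerical sanity check of the formulas above, the following minimal sketch computes the bid, the ask and the spread from the market maker's beliefs and compares the spread with $\mathrm{PIN}\,(\overline{V} - \underline{V})$ in the symmetric case. The function names and the parameter values are illustrative assumptions, not taken from the paper.

```python
def bid_ask(p_n, p_g, p_b, eps, mu, v_low, v_mid, v_high):
    """Bid and ask of Section 2.2 for beliefs (p_n, p_g, p_b)."""
    bid = (p_n * eps * v_mid + p_b * (eps + mu) * v_low + p_g * eps * v_high) / (eps + p_b * mu)
    ask = (p_n * eps * v_mid + p_b * eps * v_low + p_g * (eps + mu) * v_high) / (eps + p_g * mu)
    return bid, ask


def pin(alpha, mu, eps):
    """PIN under the assumptions delta = 1/2 and P_t(n) = 1 - alpha."""
    return alpha * mu / (alpha * mu + 2 * eps)


# Illustrative values only (assumed for this example).
alpha, mu, eps = 0.4, 30.0, 50.0
v_low, v_mid, v_high = 90.0, 100.0, 110.0

# Symmetric beliefs: P(n) = 1 - alpha, P(g) = P(b) = alpha / 2.
bid, ask = bid_ask(1 - alpha, alpha / 2, alpha / 2, eps, mu, v_low, v_mid, v_high)
spread = ask - bid
pin_value = pin(alpha, mu, eps)
print(spread, pin_value * (v_high - v_low))
```

In the symmetric case the two printed numbers should coincide whatever the value of $V^*$, since the $E(V_t)$ terms cancel in the spread.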

3. Analysis of the First Order Approximate within the Time-Clock Framework

The idea behind the VPIN is to find an easy way to compute the above expression of the PIN using a volume-clock paradigm. More precisely, it aims at finding a way to easily compute the numerator $\alpha\mu$ and the denominator $\alpha\mu + 2\epsilon$. The key heuristic behind the VPIN is to approximate $\alpha\mu$ by taking advantage of a supposedly convenient property of the expectation of the absolute difference between Poisson random variables within a volume-clock framework, i.e. $E(|X - Y|)$, where $X$ and $Y$ are Poisson variables. We will see that this heuristic does not really lead to the expected conclusion. More precisely, in the first subsection we will see which idea has been used to approximate the PIN within a time-clock framework. Secondly, we will see that the first-order approximations used are not correct, as the framework does not satisfy a required hypothesis, and we analyze more precisely the first-order approximations which can be made in the time-clock framework. In the third subsection, we describe the volume-clock framework and explain why its hypotheses lead to different results compared to the time-clock framework. Finally, we illustrate our results with simulations.

3.1. The Design of a New Heuristic

In this subsection we see which idea has been used to approximate the PIN within a time-clock framework. We refer now to the related work of Easley et al. [1]. Within the previous framework, the probability of observing $y_t = (S, B)$, i.e. $S$ sells and $B$ buys, on day $t$ of length one is:

$$P(y_t = (S, B)) = (1-\delta)\,\alpha\,e^{-(\mu + 2\epsilon)}\,\frac{(\mu+\epsilon)^B\,\epsilon^S}{B!\,S!} + (1-\alpha)\,e^{-2\epsilon}\,\frac{\epsilon^{B+S}}{B!\,S!} + \alpha\,\delta\,e^{-(\mu + 2\epsilon)}\,\frac{(\mu+\epsilon)^S\,\epsilon^B}{B!\,S!}.$$

So, if one denotes by $TT = S + B$ the total number of trades for this day, one finds, conditioning on all the possibilities of the model:

$$E(TT) = \alpha(1-\delta)\,E(TT \mid g) + \alpha\delta\,E(TT \mid b) + (1-\alpha)\,E(TT \mid n).$$

$S$ and $B$ are independent Poisson processes, so in each case one can sum their respective intensities to obtain new Poisson processes. Thus:

$$E(TT) = \alpha(1-\delta)\,(\epsilon + \mu + \epsilon) + \alpha\delta\,(\mu + \epsilon + \epsilon) + (1-\alpha)\,(\epsilon + \epsilon),$$

i.e.

$$E(TT) = \alpha\mu + 2\epsilon.$$

Note the following:

· Remark 1: the time period is fixed, thus $S$ and $B$ can take any non-negative integer values, which would not be the case if $S + B$ were fixed.

· Remark 2: intensities are rates, thus the equation has a meaning because one implicitly multiplies it by one (trading day).
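The identity $E(TT) = \alpha\mu + 2\epsilon$ can be checked by simulating trading days of length one under the time-clock model. The sketch below is only illustrative: the sampler is Knuth's multiplication method (adequate for moderate intensities), and the function names, the seed and the parameter values are assumptions made for this example.

```python
import math
import random


def poisson(lam, rng):
    """Draw a Poisson(lam) variate with Knuth's multiplication method."""
    limit = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def simulate_day(alpha, delta, mu, eps, rng):
    """Return (sells, buys) for one trading day of length one."""
    sell_rate = buy_rate = eps          # uninformed flow on both sides
    if rng.random() < alpha:            # an information event occurs
        if rng.random() < delta:        # bad news: informed traders sell
            sell_rate += mu
        else:                           # good news: informed traders buy
            buy_rate += mu
    return poisson(sell_rate, rng), poisson(buy_rate, rng)


# Illustrative values only (assumed for this example).
alpha, delta, mu, eps = 0.4, 0.5, 15.0, 20.0
rng = random.Random(0)
n_days = 20000

total_trades = 0
for _ in range(n_days):
    sells, buys = simulate_day(alpha, delta, mu, eps, rng)
    total_trades += sells + buys

mean_tt = total_trades / n_days
expected_tt = alpha * mu + 2 * eps
print(mean_tt, expected_tt)
```

Because the day length is fixed, the total $S + B$ is random here, which is exactly the point of Remark 1.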

The authors propose to compute the expectation of the absolute value of the random variable $K = S - B$ with an approximation. This is the intuition behind the computation of the VPIN. They refer to the paper of Katti [17] but do not give any explicit calculation. They assert that $E(|K|) = \alpha\mu$ thanks to a first-order approximation, without explaining what this means. Let us first describe the content of this reference and its assumptions. Then let us describe which computations are involved within this time-clock framework.

3.1.1. Katti’s Reference Assumptions

The reference proposes several ways to compute the expectation of the absolute difference of two random variables that follow the same discrete positive distribution, possibly with different parameters. The case of Poisson variables is treated. Let us describe the beginning of Katti’s paper [17]. Let $X_1$ and $X_2$ be two Poisson random variables of intensity $\lambda_1$ and $\lambda_2$. We would like to compute $\Delta_1 = E|X_1 - X_2|$. One can write the following:

$$\Delta_1 = \sum_{i,k} k\,P(X_2 - X_1 = k \mid X_1 = i)\,P(X_1 = i) + \sum_{i,k} k\,P(X_1 - X_2 = k \mid X_2 = i)\,P(X_2 = i) = \sum_{i,k} k\,P_i^1\,P_{i+k}^2 + \sum_{i,k} k\,P_i^2\,P_{i+k}^1,$$

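where $i$ runs over $0, 1, 2, \ldots$ and $k$ over $1, 2, 3, \ldots$, and $P_i^1 = e^{-\lambda_1}\frac{\lambda_1^i}{i!}$ and $P_i^2 = e^{-\lambda_2}\frac{\lambda_2^i}{i!}$. Then, one can expand it as follows:

$$\Delta_1 = e^{-\lambda_1 - \lambda_2}\left(\sum_{i=0}^{\infty}\sum_{k=0}^{\infty} k\,\frac{(\lambda_1\lambda_2)^i\,\lambda_2^k}{i!\,(i+k)!} + \sum_{i=0}^{\infty}\sum_{k=0}^{\infty} k\,\frac{(\lambda_1\lambda_2)^i\,\lambda_1^k}{i!\,(i+k)!}\right) = e^{-\lambda_1 - \lambda_2}\,(A_1 + B_1),$$

with $A_1$ and $B_1$ the two sums. In order to simplify the calculation, the author uses a trick and makes the following assumption: $\lambda_1\lambda_2 = \nu$, where $\nu$ is a constant that no longer depends on $\lambda_2$ or $\lambda_1$. It thus implies a relation between the two variables (for example $\lambda_1 = 1/\lambda_2$). Thanks to this assumption he can write:

$$A_1 = \left(\lambda_2\frac{\partial}{\partial\lambda_2}\right)\left(\sum_{i=0}^{\infty}\sum_{k=0}^{\infty}\frac{\nu^i\,\lambda_2^k}{i!\,(i+k)!}\right) = \left(\lambda_2\frac{\partial}{\partial\lambda_2}\right)A_0,$$

say, with:

$$A_0 = \sum_{i=0}^{\infty}\frac{\nu^i}{(i!)^2}\;{}_1F_1(1;\,i+1;\,\lambda_2),$$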

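where ${}_1F_1(\alpha;\,\gamma;\,x)$ is a confluent hypergeometric function. Applying the operator $\left(\lambda_2\frac{\partial}{\partial\lambda_2}\right)$ finally leads to:

$$A_1 = \left(\lambda_2 - \frac{\nu}{\lambda_2}\right)A_0 + \nu\,e^{-2\sqrt{\nu}}\;{}_1F_1\!\left(\tfrac{3}{2};\,3;\,4\sqrt{\nu}\right) + \frac{\nu}{\lambda_2}\,e^{-2\sqrt{\nu}}\;{}_1F_1\!\left(\tfrac{1}{2};\,1;\,4\sqrt{\nu}\right).$$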

The particular case $\lambda_1 = \lambda_2 = \lambda$ cannot be treated with this trick, because it would require two equal numbers to be linked by an inverse relation so that their product is independent of $\lambda_2$. With $\lambda_1 = \lambda_2 = \sqrt{\nu}$, $\nu$ is no longer a constant independent of the main parameters $\lambda_1$ and $\lambda_2$, so applying the operator does not give the previous results. One may use here another reference, cited by Katti [18]. We will detail the same ideas later for our precise VPIN framework. In any case, this case leads to the following:

$$\Delta_1 = 2\lambda\,e^{-2\lambda}\,\bigl(I_0(2\lambda) + I_1(2\lambda)\bigr),$$

where $I_n(x) = \sum_{i=0}^{\infty}\frac{(x/2)^{n+2i}}{i!\,(n+i)!}$ is a modified Bessel function of the first kind.
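The equal-intensity result can be verified numerically. The following sketch, under the assumption that a truncated power series is accurate enough for the small arguments used, evaluates $I_0$ and $I_1$ from the series above and compares the closed form with a direct double summation of $|i - j|$ against the two Poisson distributions. The helper names and the value of $\lambda$ are illustrative.

```python
import math


def bessel_i(n, x, terms=200):
    """Modified Bessel function of the first kind, integer order n, via its power series."""
    half = x / 2.0
    total, term = 0.0, half ** n / math.factorial(n)
    for i in range(terms):
        total += term
        # ratio of consecutive series terms: (x/2)^2 / ((i+1)(n+i+1))
        term *= half * half / ((i + 1) * (n + i + 1))
    return total


def poisson_pmf(k, lam):
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))


def mean_abs_diff_direct(lam, cutoff=80):
    """E|X1 - X2| for two independent Poisson(lam), truncated at `cutoff`."""
    pmf = [poisson_pmf(k, lam) for k in range(cutoff)]
    return sum(pmf[i] * pmf[j] * abs(i - j) for i in range(cutoff) for j in range(cutoff))


lam = 3.0  # illustrative value
closed_form = 2 * lam * math.exp(-2 * lam) * (bessel_i(0, 2 * lam) + bessel_i(1, 2 * lam))
direct = mean_abs_diff_direct(lam)
print(closed_form, direct)
```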

3.1.2. How to Use as Far as Possible References’ Work to Approximate the VPIN in a Time-Clock Framework

First, let us place ourselves in the context where only differences of Poisson processes are involved. It is quite simple: one just has to condition the expectation $E(|K|)$ on each case:

$$E(|K|) = \alpha(1-\delta)\,E(|K| \mid g) + \alpha\delta\,E(|K| \mid b) + (1-\alpha)\,E(|K| \mid n).$$

Then, recall that $K = S - B$. Under the model assumptions, $S$ and $B$ are Poisson processes describing the number of sells and buys in one trading day. We only need two different kinds of Poisson processes to describe the mixture of Poisson processes resulting from informed and uninformed traders in each case (“good event”, “bad event” and “no event”). Let us denote them by $X_\epsilon^S \sim \mathcal{P}(\epsilon)$, $X_\epsilon^B \sim \mathcal{P}(\epsilon)$, $Y_\mu^B \sim \mathcal{P}(\mu)$ and $Y_\mu^S \sim \mathcal{P}(\mu)$, with $S$ and $B$ labelling sells and buys. One then finds:

$$E(|K|) = \alpha(1-\delta)\,E\bigl(|X_\epsilon^S - (Y_\mu^B + X_\epsilon^B)|\bigr) + \alpha\delta\,E\bigl(|Y_\mu^B + X_\epsilon^B - X_\epsilon^S|\bigr) + (1-\alpha)\,E\bigl(|X_\epsilon^S - X_\epsilon^B|\bigr).$$

As all the Poisson processes are independent, one can sum them to produce new Poisson processes, as follows³:

$$E(|K|) = \alpha(1-\delta)\,E\bigl(|X_\epsilon - Y_{\mu+\epsilon}|\bigr) + \alpha\delta\,E\bigl(|Y_{\mu+\epsilon} - X_\epsilon|\bigr) + (1-\alpha)\,E\bigl(|X_\epsilon^{(1)} - X_\epsilon^{(2)}|\bigr).$$

One can thus sum the first two terms and obtain the following:

$$E(|K|) = \alpha\,E\bigl(|X_\epsilon - Y_{\mu+\epsilon}|\bigr) + (1-\alpha)\,E\bigl(|X_\epsilon^{(1)} - X_\epsilon^{(2)}|\bigr).$$

One has to treat finally two different cases:

· different intensities: first term

· same intensities: second term

3.2. How to Reach a First Order Approximate

In this subsection we will first see that the main assumption required by Katti’s result does not hold here, so that this result cannot be used to approximate the PIN. Therefore, to approximate the PIN following the authors’ intuition, we then describe the following two steps:

· one way to reach the exact value of the numerator consists in using Ramasubban’s ideas [18],

· the first-order asymptotic analysis involves separate cases, in order to study the sensitivity of the approximation to the parameter values.

3.2.1. Katti’s Assumptions Are Not Met in the New Setting

We have seen that Katti’s reference uses the assumption that the Poisson intensities are linked by a relation of the form $\lambda_1 = \nu/\lambda_2$, where $\nu$ is independent of these parameters. Here the respective parameters would be $\mu + \epsilon$ and $\epsilon$. The product $\epsilon(\epsilon + \mu)$ clearly has no reason to be constant. One could construct some contrived cases, but the model is not meant to be limited to them (indeed, one may for example want to fit the PIN parameters by maximum likelihood, as in [1]). Thus the assumptions are not met and the reference [17] cannot be invoked to say that $E|K| \approx \alpha\mu$ at first order, as was done in [1] for example.

3.2.2. Computation of E ( | K | )

Let us nevertheless carry out the calculations to compute $E(|S - B|)$. We follow the same natural ideas as T. A. Ramasubban, whose paper treats only the case of equal Poisson intensities [18]. We begin with:

$$\Delta_1 = \alpha\,E\bigl(|X_\epsilon - Y_{\mu+\epsilon}|\bigr) + (1-\alpha)\,E\bigl(|X_\epsilon^1 - X_\epsilon^2|\bigr).$$

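Let us start with the easier calculation: the case where the Poisson intensities are equal.

$$E\bigl(|X_\epsilon^1 - X_\epsilon^2|\bigr) = \sum_{i,j} P(X_\epsilon^1 = i)\,P(X_\epsilon^2 = j)\,|i - j| = 2\sum_{i\in\mathbb{N}}\sum_{j=0}^{i} P(X_\epsilon^1 = i)\,P(X_\epsilon^2 = j)\,(i - j) = 2e^{-2\epsilon}\sum_{i\in\mathbb{N}}\sum_{j=0}^{i}(i - j)\,\frac{\epsilon^i}{i!}\,\frac{\epsilon^j}{j!} = 2\epsilon\,e^{-2\epsilon}\left(\sum_{i\in\mathbb{N}^*}\sum_{j=0}^{i}\frac{\epsilon^{i-1}}{(i-1)!}\,\frac{\epsilon^j}{j!} - \sum_{i\in\mathbb{N}}\sum_{j=1}^{i}\frac{\epsilon^i}{i!}\,\frac{\epsilon^{j-1}}{(j-1)!}\right).$$

All the sums exist separately, so we can split them into two different ones:

$$E\bigl(|X_\epsilon^1 - X_\epsilon^2|\bigr) = 2\epsilon\,e^{-2\epsilon}\left(\sum_{i\in\mathbb{N}}\sum_{j=0}^{i+1}\frac{\epsilon^i}{i!}\,\frac{\epsilon^j}{j!} - \sum_{i\in\mathbb{N}^*}\sum_{j=0}^{i-1}\frac{\epsilon^i}{i!}\,\frac{\epsilon^j}{j!}\right) = 2\epsilon\,e^{-2\epsilon}\left(\sum_{i\in\mathbb{N}}\frac{\epsilon^i}{i!}\left(\frac{\epsilon^{i+1}}{(i+1)!} + \frac{\epsilon^i}{i!}\right)\right) = 2\epsilon\,e^{-2\epsilon}\left(\sum_{i\in\mathbb{N}}\frac{\epsilon^{2i+1}}{i!\,(i+1)!} + \sum_{i\in\mathbb{N}}\frac{\epsilon^{2i}}{(i!)^2}\right).$$

One recognizes here modified Bessel functions of the first kind: for an integer $n$ and a scalar $x$, $I_n(x) = \sum_{i\in\mathbb{N}}\frac{(x/2)^{2i+n}}{i!\,(n+i)!}$. Here we obtain:

$$E\bigl(|X_\epsilon^1 - X_\epsilon^2|\bigr) = 2\epsilon\,e^{-2\epsilon}\,\bigl(I_0(2\epsilon) + I_1(2\epsilon)\bigr),$$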

which is the result of Ramasubban’s quoted paper. The computation with different intensities follows the same idea, except that the symmetry of the two initial sums is broken, so we have to compute them separately.

$$E|X_\epsilon - Y_{\mu+\epsilon}| = \sum_{i\in\mathbb{N}}\sum_{j=0}^{i} P(X_\epsilon = i)\,P(Y_{\epsilon+\mu} = j)\,(i - j) + \sum_{i\in\mathbb{N}}\sum_{j=0}^{i} P(X_\epsilon = j)\,P(Y_{\epsilon+\mu} = i)\,(i - j).$$

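Let us calculate the first sum and then the second:

$$\sum_{i\in\mathbb{N}}\sum_{j=0}^{i} P(X_\epsilon = i)\,P(Y_{\epsilon+\mu} = j)\,(i - j) = e^{-2\epsilon-\mu}\left(\sum_{i=1}^{+\infty}\sum_{j=0}^{i}\frac{\epsilon^i}{(i-1)!}\,\frac{(\epsilon+\mu)^j}{j!} - \sum_{i=1}^{+\infty}\sum_{j=1}^{i}\frac{\epsilon^i}{i!}\,\frac{(\epsilon+\mu)^j}{(j-1)!}\right) = e^{-2\epsilon-\mu}\left(\sum_{i=0}^{+\infty}\sum_{j=0}^{i+1}\frac{\epsilon^{i+1}}{i!}\,\frac{(\epsilon+\mu)^j}{j!} - \sum_{i=1}^{+\infty}\sum_{j=0}^{i-1}\frac{\epsilon^i}{i!}\,\frac{(\epsilon+\mu)^{j+1}}{j!}\right),$$

which separates as follows, as all sums exist separately:

$$\sum_{i\in\mathbb{N}}\sum_{j=0}^{i} P(X_\epsilon = i)\,P(Y_{\epsilon+\mu} = j)\,(i - j) = e^{-2\epsilon-\mu}\,\epsilon\sum_{i=0}^{+\infty}\frac{\epsilon^i}{i!}\left(\frac{(\epsilon+\mu)^{i+1}}{(i+1)!} + \frac{(\epsilon+\mu)^i}{i!}\right) - e^{-2\epsilon-\mu}\,\mu\sum_{i=0}^{+\infty}\sum_{j=0}^{i}\frac{\epsilon^{i+1}}{(i+1)!}\,\frac{(\epsilon+\mu)^j}{j!}.$$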

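Replacing the first sum of the right-hand side by modified Bessel functions of the first kind, we finally find:

$$\sum_{i\in\mathbb{N}}\sum_{j=0}^{i} P(X_\epsilon = i)\,P(Y_{\epsilon+\mu} = j)\,(i - j) = e^{-2\epsilon-\mu}\sqrt{\epsilon(\epsilon+\mu)}\;I_1\!\left(2\sqrt{\epsilon(\epsilon+\mu)}\right) + e^{-2\epsilon-\mu}\,\epsilon\,I_0\!\left(2\sqrt{\epsilon(\epsilon+\mu)}\right) - e^{-2\epsilon-\mu}\,\mu\sum_{i=0}^{+\infty}\sum_{j=0}^{i}\frac{\epsilon^{i+1}}{(i+1)!}\,\frac{(\epsilon+\mu)^j}{j!}.$$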

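For the second sum, an equivalent calculation gives the following:

$$\sum_{i\in\mathbb{N}}\sum_{j=0}^{i} P(X_\epsilon = j)\,P(Y_{\epsilon+\mu} = i)\,(i - j) = e^{-2\epsilon-\mu}\,\epsilon\sum_{i=0}^{+\infty}\frac{(\epsilon+\mu)^i}{i!}\left(\frac{\epsilon^{i+1}}{(i+1)!} + \frac{\epsilon^i}{i!}\right) + e^{-2\epsilon-\mu}\,\mu\sum_{i=0}^{+\infty}\frac{(\epsilon+\mu)^i}{i!}\sum_{j=0}^{i+1}\frac{\epsilon^j}{j!},$$

thus:

$$\sum_{i\in\mathbb{N}}\sum_{j=0}^{i} P(X_\epsilon = j)\,P(Y_{\epsilon+\mu} = i)\,(i - j) = e^{-2\epsilon-\mu}\,\frac{\epsilon^2}{\sqrt{\epsilon(\epsilon+\mu)}}\;I_1\!\left(2\sqrt{\epsilon(\epsilon+\mu)}\right) + e^{-2\epsilon-\mu}\,\epsilon\,I_0\!\left(2\sqrt{\epsilon(\epsilon+\mu)}\right) + e^{-2\epsilon-\mu}\,\mu\sum_{i=0}^{+\infty}\frac{(\epsilon+\mu)^i}{i!}\sum_{j=0}^{i+1}\frac{\epsilon^j}{j!}.$$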

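If we put all the terms together we find:

$$E|X_\epsilon - Y_{\epsilon+\mu}| = e^{-2\epsilon-\mu}\left(\frac{2\epsilon^2 + \epsilon\mu}{\sqrt{\epsilon(\epsilon+\mu)}}\;I_1\!\left(2\sqrt{\epsilon(\epsilon+\mu)}\right) + 2\epsilon\,I_0\!\left(2\sqrt{\epsilon(\epsilon+\mu)}\right) - \mu\sum_{i=0}^{+\infty}\sum_{j=0}^{i}\frac{\epsilon^{i+1}}{(i+1)!}\,\frac{(\epsilon+\mu)^j}{j!} + \mu\sum_{i=0}^{+\infty}\frac{(\epsilon+\mu)^i}{i!}\sum_{j=0}^{i+1}\frac{\epsilon^j}{j!}\right).$$

Rearranging the last two sums of the right-hand side of the equality, we finally get:

$$E|X_\epsilon - Y_{\epsilon+\mu}| = e^{-2\epsilon-\mu}\left(\frac{2\epsilon^2 + \epsilon\mu}{\sqrt{\epsilon(\epsilon+\mu)}}\;I_1\!\left(2\sqrt{\epsilon(\epsilon+\mu)}\right) + 2\epsilon\,I_0\!\left(2\sqrt{\epsilon(\epsilon+\mu)}\right)\right) + \mu\sum_{i=0}^{+\infty} P(Y_{\epsilon+\mu} = i)\,\bigl(P(X_\epsilon \le i+1) - P(X_\epsilon \ge i+1)\bigr).$$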

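Thus $E|K|$ equals:

$$E|K| = 2\epsilon(1-\alpha)\,e^{-2\epsilon}\,\bigl[I_0(2\epsilon) + I_1(2\epsilon)\bigr] + \alpha\,e^{-2\epsilon-\mu}\left[\frac{2\epsilon^2 + \epsilon\mu}{\sqrt{\epsilon(\epsilon+\mu)}}\;I_1\!\left(2\sqrt{\epsilon(\epsilon+\mu)}\right) + 2\epsilon\,I_0\!\left(2\sqrt{\epsilon(\epsilon+\mu)}\right)\right] + \alpha\mu\sum_{i=0}^{+\infty} P(Y_{\epsilon+\mu} = i)\,\bigl(P(X_\epsilon \le i+1) - P(X_\epsilon \ge i+1)\bigr).$$

With an arbitrary time length $t$ for a trading period, we find:

$$E|K| = 2\epsilon t(1-\alpha)\,e^{-2\epsilon t}\,\bigl[I_0(2\epsilon t) + I_1(2\epsilon t)\bigr] + \alpha\,e^{-2\epsilon t-\mu t}\left[\frac{2(\epsilon t)^2 + \epsilon\mu t^2}{\sqrt{\epsilon(\epsilon+\mu)}\,t}\;I_1\!\left(2\sqrt{\epsilon(\epsilon+\mu)}\,t\right) + 2\epsilon t\,I_0\!\left(2\sqrt{\epsilon(\epsilon+\mu)}\,t\right)\right] + \alpha\mu t\sum_{i=0}^{+\infty} P(Y_{(\epsilon+\mu)t} = i)\,\bigl(P(X_{\epsilon t} \le i+1) - P(X_{\epsilon t} \ge i+1)\bigr).$$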
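The closed-form expression of $E|K|$ for a period of length one can be cross-checked against a brute-force summation over the Poisson mixture. The sketch below is a minimal illustration: the series-based Bessel evaluation, the truncation level `cutoff`, the helper names and the parameter values are all assumptions made for this example.

```python
import math


def bessel_i(n, x, terms=200):
    """Modified Bessel function of the first kind (integer order) by power series."""
    half = x / 2.0
    total, term = 0.0, half ** n / math.factorial(n)
    for i in range(terms):
        total += term
        term *= half * half / ((i + 1) * (n + i + 1))
    return total


def poisson_pmf(k, lam):
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))


def expected_abs_k_closed(alpha, mu, eps, cutoff=80):
    """E|K| from the Bessel expression plus the weighted tail-difference sum."""
    a, b = eps, eps + mu
    root = math.sqrt(a * b)
    no_news = 2 * a * math.exp(-2 * a) * (bessel_i(0, 2 * a) + bessel_i(1, 2 * a))
    bessel_part = math.exp(-a - b) * (
        (2 * a * a + a * mu) / root * bessel_i(1, 2 * root) + 2 * a * bessel_i(0, 2 * root)
    )
    pmf_x = [poisson_pmf(k, a) for k in range(cutoff + 1)]
    pmf_y = [poisson_pmf(k, b) for k in range(cutoff)]
    weighted, cdf = 0.0, pmf_x[0]                # cdf = P(X <= 0)
    for i in range(cutoff):
        cdf += pmf_x[i + 1]                      # now P(X <= i + 1)
        upper = 1.0 - (cdf - pmf_x[i + 1])       # P(X >= i + 1) = 1 - P(X <= i)
        weighted += pmf_y[i] * (cdf - upper)
    return (1 - alpha) * no_news + alpha * (bessel_part + mu * weighted)


def expected_abs_k_direct(alpha, mu, eps, cutoff=80):
    """E|K| by direct double summation over the two Poisson laws."""
    pmf_x = [poisson_pmf(k, eps) for k in range(cutoff)]
    pmf_y = [poisson_pmf(k, eps + mu) for k in range(cutoff)]
    event = sum(px * py * abs(i - j) for i, px in enumerate(pmf_x) for j, py in enumerate(pmf_y))
    quiet = sum(px * pz * abs(i - j) for i, px in enumerate(pmf_x) for j, pz in enumerate(pmf_x))
    return alpha * event + (1 - alpha) * quiet


alpha, mu, eps = 0.4, 3.0, 2.0   # small illustrative intensities
closed = expected_abs_k_closed(alpha, mu, eps)
direct = expected_abs_k_direct(alpha, mu, eps)
print(closed, direct, alpha * mu)
```

Small intensities are used on purpose so that the truncated sums are essentially exact; the last printed value, $\alpha\mu$, is shown only for comparison with the first-order claim.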

3.2.3. Analysis of the First Order Approximate

Recall that $\epsilon$ and $\mu$ are the rates of uninformed and informed traders per day (in the original PIN model). Thus, these parameters are fairly large numbers: this is the first intuition behind the first-order approximation. Moreover, Hankel [19] derived an asymptotic expansion of the modified Bessel function of the first kind as follows:

$$I_\alpha(z) \sim \frac{e^z}{\sqrt{2\pi z}}\left(1 - \frac{4\alpha^2 - 1}{8z} + \frac{(4\alpha^2 - 1)(4\alpha^2 - 9)}{2!\,(8z)^2} - \frac{(4\alpha^2 - 1)(4\alpha^2 - 9)(4\alpha^2 - 25)}{3!\,(8z)^3} + \cdots\right)$$

for $|z| \gg 1$ and $|\arg z| < \frac{\pi}{2}$.
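To get a feeling for the accuracy of this expansion, the sketch below compares its first four terms with a direct series evaluation of $I_0$ and $I_1$ at a moderately large argument, and then compares the no-news term $2\epsilon e^{-2\epsilon}(I_0(2\epsilon) + I_1(2\epsilon))$ with its leading-order value $2\sqrt{\epsilon/\pi}$. The function names, the number of terms kept and the numerical values are illustrative assumptions.

```python
import math


def bessel_i(order, x, terms=300):
    """Modified Bessel function of the first kind (integer order) by power series."""
    half = x / 2.0
    total, term = 0.0, half ** order / math.factorial(order)
    for i in range(terms):
        total += term
        term *= half * half / ((i + 1) * (order + i + 1))
    return total


def hankel_expansion(order, z, n_terms=4):
    """First n_terms of the large-z expansion of I_order(z)."""
    m = 4 * order * order
    total, term = 0.0, 1.0
    for k in range(n_terms):
        total += term
        # next term: multiply by -(4*order^2 - (2k+1)^2) / ((k+1) * 8z)
        term *= -(m - (2 * k + 1) ** 2) / ((k + 1) * 8 * z)
    return math.exp(z) / math.sqrt(2 * math.pi * z) * total


z = 30.0
rel_errors = {n: abs(hankel_expansion(n, z) / bessel_i(n, z) - 1.0) for n in (0, 1)}

eps = 15.0  # illustrative intensity
exact_no_news = 2 * eps * math.exp(-2 * eps) * (bessel_i(0, 2 * eps) + bessel_i(1, 2 * eps))
leading_no_news = 2 * math.sqrt(eps / math.pi)
print(rel_errors, exact_no_news, leading_no_news)
```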

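We first apply this expansion to $E|K|$ under the conditions $\mu \gg 1$ and $\epsilon \gg 1$, as we consider that there are a lot of informed and uninformed traders per day (compared to 1). We find the following:

$$E|K| \sim 2\sqrt{\frac{\epsilon}{\pi}}\,(1-\alpha) + \frac{\alpha}{2\sqrt{\pi}}\,\frac{\left(\epsilon + \sqrt{\epsilon(\epsilon+\mu)}\right)^2}{\bigl(\epsilon(\epsilon+\mu)\bigr)^{3/4}}\;e^{2\sqrt{\epsilon(\epsilon+\mu)} - 2\epsilon - \mu} + \alpha\mu\sum_{i=0}^{+\infty} P(Y_{\epsilon+\mu} = i)\,\bigl(P(X_\epsilon \le i+1) - P(X_\epsilon \ge i+1)\bigr).$$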

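Let us now distinguish these three cases:

· $\mu$ and $\epsilon$ are of the same order,

· $\mu = o(\epsilon)$,

· $\epsilon = o(\mu)$.

If $\mu$ and $\epsilon$ are of the same order,

in this case $2\sqrt{\epsilon(\epsilon+\mu)} < 2\epsilon + \mu$, thus one can neglect the corresponding exponential term. We obtain:

$$E|K| \sim 2\sqrt{\frac{\epsilon}{\pi}}\,(1-\alpha) + \alpha\mu\sum_{i=0}^{+\infty} P(Y_{\epsilon+\mu} = i)\,\bigl(P(X_\epsilon \le i+1) - P(X_\epsilon \ge i+1)\bigr).$$

Thus $\mathrm{PIN} \sim \frac{\alpha\mu}{\alpha\mu + 2\epsilon}$ and

$$\frac{E|S - B|}{E(S + B)} \sim \frac{2\sqrt{\frac{\epsilon}{\pi}}\,(1-\alpha) + \alpha\mu\sum_{i=0}^{+\infty} P(Y_{\epsilon+\mu} = i)\,\bigl(P(X_\epsilon \le i+1) - P(X_\epsilon \ge i+1)\bigr)}{\alpha\mu + 2\epsilon}.$$

If $\sqrt{\epsilon} \ll \mu$, then it reduces to:

$$E|K| \sim \alpha\mu\sum_{i=0}^{+\infty} P(Y_{\epsilon+\mu} = i)\,\bigl(P(X_\epsilon \le i+1) - P(X_\epsilon \ge i+1)\bigr).$$

Thus:

$$\frac{E|S - B|}{E(S + B)} \sim \frac{\alpha\mu}{\alpha\mu + 2\epsilon}\sum_{i=0}^{+\infty} P(Y_{\epsilon+\mu} = i)\,\bigl(P(X_\epsilon \le i+1) - P(X_\epsilon \ge i+1)\bigr).$$

If $\mu = o(\epsilon)$,

we find:

$$\frac{E|S - B|}{E(S + B)} \sim \frac{2\sqrt{\frac{\epsilon}{\pi}} + \alpha\mu\sum_{i=0}^{+\infty} P(Y_{\epsilon+\mu} = i)\,\bigl(P(X_\epsilon \le i+1) - P(X_\epsilon \ge i+1)\bigr)}{2\epsilon}.$$

And: $\mathrm{PIN} \sim \frac{\alpha\mu}{2\epsilon} \ll 1$.

If $\sqrt{\epsilon} \gg \mu$, then:

$$\frac{E|S - B|}{E(S + B)} \sim \frac{1}{\sqrt{\epsilon\pi}}.$$

If $\epsilon = o(\mu)$,

we find:

$$\frac{E|S - B|}{E(S + B)} \sim \sum_{i=0}^{+\infty} P(Y_{\epsilon+\mu} = i)\,\bigl(P(X_\epsilon \le i+1) - P(X_\epsilon \ge i+1)\bigr),$$

and $\mathrm{PIN} \sim 1$.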

Thus, we can see that the first-order approximation depends strongly on:

· the respective values of $\mu$ and $\epsilon$,

· and, in a lot of cases, the average, weighted by a given Poisson distribution, of the difference between the cumulative distribution functions taken from opposite tails of another Poisson distribution, i.e. $\sum_{i=0}^{+\infty} P(Y_{\epsilon+\mu} = i)\,\bigl(P(X_\epsilon \le i+1) - P(X_\epsilon \ge i+1)\bigr)$.

The first-order approximation $E|S - B| \sim \alpha\mu$ proposed in [1] is not incorrect, as we will see in the simulations, but it is sometimes imprecise.
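The dependence on the respective sizes of $\mu$ and $\epsilon$ can be illustrated by computing $E|S - B|$ by brute force over the Poisson mixture and dividing it by $\alpha\mu$. This is a sketch under assumed parameter values chosen to represent the three regimes; the regime labels, helper names and truncation level are hypothetical.

```python
import math


def poisson_pmf_table(lam, cutoff):
    """Poisson(lam) probabilities for k = 0, ..., cutoff - 1, computed in logs."""
    return [math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1)) for k in range(cutoff)]


def mean_abs_diff(pmf_a, pmf_b):
    return sum(pa * pb * abs(i - j) for i, pa in enumerate(pmf_a) for j, pb in enumerate(pmf_b))


def expected_abs_imbalance(alpha, mu, eps, cutoff=300):
    """E|S - B| over a day of length one, by direct summation."""
    quiet_side = poisson_pmf_table(eps, cutoff)
    informed_side = poisson_pmf_table(eps + mu, cutoff)
    return (alpha * mean_abs_diff(quiet_side, informed_side)
            + (1 - alpha) * mean_abs_diff(quiet_side, quiet_side))


alpha = 0.4
regimes = {                      # (mu, eps), illustrative values only
    "eps = o(mu)": (60.0, 2.0),
    "same order": (40.0, 40.0),
    "mu = o(eps)": (5.0, 100.0),
}
ratios = {}
for name, (mu, eps) in regimes.items():
    ratios[name] = expected_abs_imbalance(alpha, mu, eps) / (alpha * mu)
    print(name, ratios[name])
```

A ratio close to one means that $\alpha\mu$ is a good proxy for $E|S - B|$ at these values; a ratio far from one signals that the uninformed order flow dominates the imbalance.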

3.3. The Volume-Clock Paradigm: The Implicit Change of Model Assumptions

In this subsection, we describe the volume-clock framework and explain why its hypotheses lead to a different value of the PIN compared to the time-clock framework. More precisely, we first describe the new assumptions. Secondly, we carry out the computations within this new framework, which lead to a new value of the PIN.

3.3.1. The New Assumptions

In [3] D. Easley, M. de Prado and M. O’Hara describe a new model to compute easily the VPIN, and therefore the PIN, using the previous results:

· $E(|S - B|) = \alpha\mu$, supposedly at first order,

· $E(S + B) = \alpha\mu + 2\epsilon$.

They introduce the paradigm of volume clock and time bars. Let us first describe it and see that the assumptions are implicitly changed, but that this change is ignored. The idea is quite simple. Consider trades described by a time series of prices, say $p_t$, labelled with time $t$. First, they package trades into objects called “bars” that have a fixed time length, i.e. they aggregate the time series into, for example, one-minute time bars. This is equivalent to a sampling of the time series. Each bar is a kind of new trade, with several rules to infer its price. Second, they aggregate these time bars to form “buckets” of fixed volume. Say these buckets have a volume $V$.

· Remark 1: nothing ensures that buckets will have a fixed volume. Indeed, each time bar is sensitive to trading intensity. The last time bar can often be too big to be aggregated into a fixed-size bucket. This means that if one wants to force the bucket size to be constant, then a lot of time bars will not be of one-minute length. If, on the contrary, one wants to keep the time length constant, a lot of buckets might not be of constant volume.

Suppose anyway that everything is ideal and that each bucket is of constant volume. The authors denote by $\tau$ the label of a bucket of volume $V$, $V_\tau$, and by $V_\tau^S$ and $V_\tau^B$ respectively the total number of sells and buys that occurred in this bucket. They then refer to the result of their previous work [1]: $E(|V_\tau^S - V_\tau^B|) \approx \alpha\mu$. But even setting aside the fact that this result does not hold, as previously shown, one must note the following:

· First: here the bucket is constant in volume, thus the time needed to fill it is random. It is a really strong hypothesis, as we then have $V_\tau^S + V_\tau^B = V$ almost surely.

· Second: they indeed use the result to say that, as $V_\tau^S + V_\tau^B = V$, the expectation equals $V = 2\epsilon + \alpha\mu$.

· Finally, one should remark that this equality lacks a time factor, as we are talking about rates of traders. In the first model, the time was one day, and within the time-clock framework one would implicitly multiply the rates by one day. Here, in the volume-clock framework, one no longer controls time. One should take into account the bucket filling time, which is a new random variable. At first glance, the expression is not homogeneous and, even if right, it is far from being trivial.

Indeed the authors state: “recall that we divide the trading day into equal-sized volume buckets and treat each volume bucket as equivalent to a period for information arrival”. This is misleading. Recall that in the initial model time is fixed (one day) and thus volume is random. Here one has the opposite: volume is fixed and time is thus random. Let us detail the calculation with the new assumptions. To do so, let us make the new implicit framework more precise.

3.3.2. A New Computation of $E(|V_\tau^S - V_\tau^B| \mid V)$

In fact we now want to compute $\Delta_1 = E(|V_\tau^S - V_\tau^B| \mid V)$, as the bucket volume is fixed. Denote by $t - t'$ the filling time of the bucket $\tau$ and then note the following:

$$\Delta_1 = E\bigl(|S_t^{tot} - S_{t'}^{tot} - (B_t^{tot} - B_{t'}^{tot})| \,\big|\, V\bigr),$$

with $S_t^{tot}$, $S_{t'}^{tot}$, $B_t^{tot}$ and $B_{t'}^{tot}$ the Poisson processes of the total sells up to $t$ and $t'$ and the total buys up to $t$ and $t'$. One has in distribution the following:

· $S_t^{tot} - S_{t'}^{tot} = N_{S,\,t-t'}^{tot}$

· $B_t^{tot} - B_{t'}^{tot} = N_{B,\,t-t'}^{tot}$

where $N_{S,\,t-t'}^{tot}$ and $N_{B,\,t-t'}^{tot}$ are Poisson processes describing the total sells and buys in the bucket labelled $\tau$. The variables being independent, we can thus write the following:

$$\Delta_1 = E\bigl(|N_{S,\,t-t'}^{tot} - N_{B,\,t-t'}^{tot}| \,\big|\, V\bigr).$$

One must note that there is still the constraint of the volume of a bucket:

$$N_{S,\,t-t'}^{tot} + N_{B,\,t-t'}^{tot} = V.$$

Thus, imposing one value imposes the other. Let us calculate $\Delta_1$. First, one can condition on the events “good event” (g), “bad event” (b) and “no event” (n):

$$\Delta_1 = \alpha(1-\delta)\,E\bigl(|N_{S,\,t-t'}^{tot} - N_{B,\,t-t'}^{tot}| \,\big|\, g, V\bigr) + \alpha\delta\,E\bigl(|N_{S,\,t-t'}^{tot} - N_{B,\,t-t'}^{tot}| \,\big|\, b, V\bigr) + (1-\alpha)\,E\bigl(|N_{S,\,t-t'}^{tot} - N_{B,\,t-t'}^{tot}| \,\big|\, n, V\bigr).$$

On each event, one knows the distribution of $N_{S,\,t-t'}^{tot}$ and $N_{B,\,t-t'}^{tot}$. One can then rewrite it in the following way⁴:

$$\Delta_1 = \alpha(1-\delta)\,E\bigl(|N_{\epsilon(t-t')}^{tot,1} - N_{(\epsilon+\mu)(t-t')}^{tot,1}| \,\big|\, V\bigr) + \alpha\delta\,E\bigl(|N_{(\mu+\epsilon)(t-t')}^{tot,2} - N_{\epsilon(t-t')}^{tot,2}| \,\big|\, V\bigr) + (1-\alpha)\,E\bigl(|N_{\epsilon(t-t')}^{tot,3} - N_{\epsilon(t-t')}^{tot,4}| \,\big|\, V\bigr).$$

The first two terms, corresponding to “good” or “bad events”, are equal in distribution, which is why we have:

$$\Delta_1 = \alpha\,E\bigl(|N_{\epsilon(t-t')}^{tot,1} - N_{(\epsilon+\mu)(t-t')}^{tot}| \,\big|\, V\bigr) + (1-\alpha)\,E\bigl(|N_{\epsilon(t-t')}^{tot,2} - N_{\epsilon(t-t')}^{tot,3}| \,\big|\, V\bigr).$$

Before going further, let us derive the joint probability density function of, for example, the sells, the buys and the corresponding bucket filling time $t - t'$ in the case of a bad event. Let us denote it by $f(S, B, t-t' \mid V, b)$. We now summarize the ideas of the proof of Kin and Le [14]. Remark first the following:

$$f(S, B, t-t' \mid V, b) = f(S, B \mid V, t-t', b)\,f(t-t' \mid V, b),$$

and as $f(V \mid t-t', b)$ follows a Poisson law of intensity $(t-t')(2\epsilon+\mu)$, then $f(t-t' \mid V, b)$ classically follows an Erlang law with the following parameters: $\Gamma(t-t';\,V,\,2\epsilon+\mu)$. Second, as $S + B = V$ almost surely, we have the following equalities:

$$f(S, B \mid t-t', V, b) = f(S \mid t-t', V, b) = f(B \mid t-t', V, b),$$

and:

$$f(S, V \mid t-t', b) = f(B \mid V, t-t', b)\,f(V \mid t-t', b).$$

We know f ( V | t t , b ) ~ P ( ( 2 ϵ + μ ) ( t t ) ) and considering for example a continuous bounded function g, one can guess easily f ( B | V , t t , b ) computing E ( g ( S , V | t t , b ) ) using that E ( g ( S , V | t t , b ) ) = E ( g ( S , S + B | t t , b ) ) . We find a binomial law B ( S ; V , ϵ 2 ϵ + μ ) i.e.:

f ( S | V , t t , b ) = f ( B | V , t t , b ) = B ( S ; V , ϵ 2 ϵ + μ ) = B ( B ; V , ϵ + μ 2 ϵ + μ ) .

So finally:

f ( S , B , t t | V , b ) = B ( S ; V , ϵ 2 ϵ + μ ) Γ ( V ,2 ϵ + μ ) .

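The "no event" case is similar, with weight $1-\alpha$ and a binomial parameter equal to $\frac{1}{2}$. We thus find the following:

$$f(S, B, t-t' \mid V) = (1-\alpha)\, \mathcal{B}\Big(S; V, \frac{1}{2}\Big)\, \Gamma(t-t'; V, 2\epsilon) + \alpha\, \mathcal{B}\Big(S; V, \frac{\epsilon}{2\epsilon+\mu}\Big)\, \Gamma(t-t'; V, 2\epsilon+\mu),$$

and after integrating over the random variable $t-t'$:

$$f(S, B \mid V) = (1-\alpha)\, \mathcal{B}\Big(S; V, \frac{1}{2}\Big) + \alpha\, \mathcal{B}\Big(S; V, \frac{\epsilon}{2\epsilon+\mu}\Big).$$

Taking this joint probability into account, we are in fact computing the following expectations of two random variables, say $X$ and $Y$:

$$\Delta_1 = \alpha\, E(|V - 2X|) + (1-\alpha)\, E(|V - 2Y|)$$

with $Y \sim \mathcal{B}\big(V, \frac{1}{2}\big)$ and $X \sim \mathcal{B}\big(V, \frac{\epsilon}{2\epsilon+\mu}\big)$.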

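Moreover, if $x$ follows the binomial distribution with p.d.f. $\mathcal{B}(x; m, p)$, then using Jensen's inequality for the concave function $y \mapsto \sqrt{y}$ we have:

· $\dfrac{E(|m - 2x|)}{m} = \dfrac{E\big([(m-2x)^2]^{\frac{1}{2}}\big)}{m} \le \sqrt{(2p-1)^2 + \dfrac{4p(1-p)}{m}} \approx |2p-1|$ for large enough $m$ and $p$ differing from $\frac{1}{2}$,

· $\dfrac{E(|m - 2x|)}{m} \ge \dfrac{|E(m - 2x)|}{m} = |2p-1|$.

Thus, for large enough $V$ (the second term vanishes because $p = \frac{1}{2}$ in the "no event" case):

$$\frac{\Delta_1}{V} \approx \alpha \left| \frac{2\epsilon}{2\epsilon+\mu} - 1 \right| + 0,$$

i.e.

$$\frac{\Delta_1}{V} \approx \frac{\alpha\mu}{2\epsilon+\mu}.$$

Thus the VPIN metric approximates the following for large enough $n$, as shown by Ke and Lin [14]:

$$\mathrm{VPIN} = \frac{\sum_{\tau=1}^{n} |V_\tau^S - V_\tau^B|}{nV} \approx \frac{E\big(|V_\tau^S - V_\tau^B| \;\big|\; V\big)}{V} \approx \frac{\alpha\mu}{2\epsilon+\mu},$$

which indeed differs from $\mathrm{PIN} = \dfrac{\alpha\mu}{\alpha\mu+2\epsilon}$.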
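The gap between the volume-clock limit and the PIN can be checked numerically without any simulation. The sketch below is only an illustration under the binomial model derived above: it sums the binomial probability function exactly, and the function names (`expected_abs_imbalance`, `volume_clock_ratio`, `vpin_limit`, `pin`) and the parameter values in the usage note are hypothetical choices, not taken from the paper.

```python
import math


def expected_abs_imbalance(V, p):
    """Exact E|V - 2X| for X ~ Binomial(V, p), with 0 < p < 1."""
    total = 0.0
    for k in range(V + 1):
        # log of the binomial pmf, computed with lgamma for numerical stability
        log_pmf = (math.lgamma(V + 1) - math.lgamma(k + 1) - math.lgamma(V - k + 1)
                   + k * math.log(p) + (V - k) * math.log(1.0 - p))
        total += abs(V - 2 * k) * math.exp(log_pmf)
    return total


def volume_clock_ratio(alpha, eps, mu, V):
    """Delta_1 / V = [alpha E|V - 2X| + (1 - alpha) E|V - 2Y|] / V."""
    p_event = eps / (2.0 * eps + mu)  # binomial parameter when an event occurs
    return (alpha * expected_abs_imbalance(V, p_event)
            + (1.0 - alpha) * expected_abs_imbalance(V, 0.5)) / V


def vpin_limit(alpha, eps, mu):
    """Large-V limit of Delta_1 / V in the volume-clock framework."""
    return alpha * mu / (2.0 * eps + mu)


def pin(alpha, eps, mu):
    """PIN of the time-clock model."""
    return alpha * mu / (alpha * mu + 2.0 * eps)
```

For instance, with the illustrative values `alpha=0.4`, `eps=100`, `mu=300` and `V=2000`, `volume_clock_ratio` stays slightly above `vpin_limit` (0.24), as the Jensen lower bound requires, and well below `pin` (0.375).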

3.4. Some Simulation Verification

We present here some simulation checks. First we present the framework and the experiment tested. Second, we present the results.

3.4.1. Framework and Experiment Tested

For the purpose of illustration, we compare the empirical form of $\dfrac{E(|B - S|)}{E(S + B)}$ with the PIN and the asymptotic limit (note 5) found within the time-clock framework for different cases of $\epsilon$ and $\mu$. This is easy to do: since all the parameters of the model are controlled ex ante, one just has to generate the appropriate Poisson processes to obtain all the values. We illustrate the results with three examples:

· $\epsilon = o(\mu)$ and $\mu$ of the same order as $\epsilon$: we took $\epsilon = 100$ and $\mu \in \{10000, 20000, 30000\}$,

· $\mu = o(\epsilon)$ and $\epsilon$ of the same order as $\mu$: we took $\mu = 100$ and $\epsilon \in \{10000, 20000, 30000\}$,

· $\epsilon$ of the same order as $\mu$: we took (note 6) $\epsilon = 10000$ and $\mu \in \{10000, 2000, 30000\}$.

Remarks:

· We compute 20 values for each choice of $\epsilon$ and $\mu$ in the three cases above,

· For each of the 20 values, the empirical expectations are computed as an average of 10,000 values,

· To compute the infinite sum over $i$ that appears in the asymptotic limit (note 5), given the values of $\epsilon$ and $\epsilon + \mu$, we truncated the sum at $i = 100000$, beyond which the probability values become negligible.
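The time-clock experiment described above can be sketched with the standard library only. This is a scaled-down illustration, not the code used for the figures: the Poisson sampler counts unit-rate exponential arrivals (its cost grows with the rate, so it only suits small intensities), a period of unit length is assumed, and the names `poisson`, `simulate_period` and `empirical_ratio` are hypothetical.

```python
import random


def poisson(rng, lam):
    """Draw a Poisson(lam) variate by counting unit-rate arrivals before time lam."""
    count = 0
    elapsed = rng.expovariate(1.0)
    while elapsed < lam:
        count += 1
        elapsed += rng.expovariate(1.0)
    return count


def simulate_period(rng, alpha, delta, eps, mu):
    """Return (sells, buys) for one period of unit length under the PIN model."""
    sell_rate, buy_rate = eps, eps
    if rng.random() < alpha:          # an information event occurs
        if rng.random() < delta:      # bad event: informed traders sell
            sell_rate += mu
        else:                         # good event: informed traders buy
            buy_rate += mu
    return poisson(rng, sell_rate), poisson(rng, buy_rate)


def empirical_ratio(alpha, delta, eps, mu, n_periods, seed=0):
    """Empirical E|B - S| / E(S + B) over n_periods simulated periods."""
    rng = random.Random(seed)
    abs_imbalance = 0
    total_trades = 0
    for _ in range(n_periods):
        sells, buys = simulate_period(rng, alpha, delta, eps, mu)
        abs_imbalance += abs(buys - sells)
        total_trades += sells + buys
    return abs_imbalance / total_trades
```

With small illustrative rates such as `eps=20` and `mu=60` (chosen for speed, not taken from the paper), the ratio returned by `empirical_ratio` can be compared directly with the PIN $\alpha\mu/(\alpha\mu+2\epsilon)$.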

3.4.2. Results

In each case, we first plot the empirical numerator $E(|S - B|)$, $\alpha\mu$, and the asymptotic limit found (Figure 2, Figure 4 and Figure 6). Second, we plot $\dfrac{E(|B - S|)}{E(S + B)}$, the PIN (i.e. $\dfrac{\alpha\mu}{\alpha\mu + 2\epsilon}$) and the asymptotic limit divided by $\alpha\mu + 2\epsilon$ (Figure 3, Figure 5 and Figure 7).

Case 1: $\epsilon = 100$, $\mu \in \{10000, 20000, 30000\}$

In Figure 2, the first order and asymptotic estimations are very close.

Case 2: $\mu = 100$, $\epsilon \in \{10000, 20000, 30000\}$

In Figure 4 and Figure 5, one can see the difference better when $\mu$ is no longer changed.

Case 3: $\epsilon = 10000$, $\mu \in \{10000, 2000, 30000\}$

This last case, in Figure 6 and Figure 7, illustrates a market where the numbers of informed and uninformed traders are of the same order.

Figure 2. Empirical, asymptotic and first order numerators.

Figure 3. Empirical, asymptotic and first order approximations of the PIN.

Figure 4. Empirical, asymptotic and first order numerators.

Figure 5. Empirical, asymptotic and first order approximations of the PIN.

Figure 6. Empirical, asymptotic and first order numerators.

Figure 7. Empirical, asymptotic and first order approximations of the PIN.

4. Another Suggestion to Compute the PIN

In this section, we propose another way to compute the PIN. Indeed, as seen in the last section, the first order approximation of the PIN within the time-clock framework is not always precise and its theoretical foundation is not correct. Furthermore, the limit we propose is only asymptotic and not easy to compute. Hence we propose an exact formula to compute the PIN in the time-clock framework. More precisely, in the first subsection we describe how to compute exactly the numerator $\alpha\mu$ and then the PIN. Secondly, we describe how one can design numerically at least one methodology to compute the PIN. Finally, we present some simulation checks of our results.

4.1. One PIN Upgrade

In this subsection, we detail how to compute the PIN exactly. Recall that the probability of obtaining $S$ sells and $B$ buys during a period of length $t$ is:

$$P(z_t = (S, B)) = (1-\delta)\,\alpha\, e^{-(\mu+2\epsilon)t}\, \frac{((\mu+\epsilon)t)^B (\epsilon t)^S}{B!\,S!} + (1-\alpha)\, e^{-2\epsilon t}\, \frac{(\epsilon t)^{B+S}}{B!\,S!} + \alpha\delta\, e^{-(\mu+2\epsilon)t}\, \frac{((\mu+\epsilon)t)^S (\epsilon t)^B}{B!\,S!}.$$

Recall that to compute the PIN we make the assumption $\delta = \frac{1}{2}$, thus we have:

$$P(z_t = (S, B)) = \frac{\alpha}{2}\, e^{-(\mu+2\epsilon)t}\, \frac{((\mu+\epsilon)t)^B (\epsilon t)^S}{B!\,S!} + \frac{\alpha}{2}\, e^{-(\mu+2\epsilon)t}\, \frac{((\mu+\epsilon)t)^S (\epsilon t)^B}{B!\,S!} + (1-\alpha)\, e^{-2\epsilon t}\, \frac{(\epsilon t)^{B+S}}{B!\,S!}.$$

So, if one denotes by $TT = S + B$ the total number of trades for this period, we find:

$$E(TT) = \frac{\alpha}{2}(\epsilon + \mu + \epsilon)\,t + \frac{\alpha}{2}(\mu + \epsilon + \epsilon)\,t + (1-\alpha)(\epsilon + \epsilon)\,t = (\alpha\mu + 2\epsilon)\,t,$$

and we even have:

$$E(S) = E(B) = \Big(\epsilon + \frac{\alpha\mu}{2}\Big)\,t = \frac{E(TT)}{2}.$$

So to estimate the PIN denominator, one can first use, for an arbitrary time period, an average of $S$, $B$ or $TT$. Let's work with $S$ and take a time period of length $t$. Let's estimate the numerator $\frac{\alpha\mu t}{2}$. To do this, we first make explicit the marginal probability function of obtaining $S$ sells in a time period of length $t$, and secondly we compute its first three moments. Thirdly we explain how to compute $\alpha$ and hence the numerator, which finally leads to a new PIN formula.

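4.1.1. Marginal Function

The probability of obtaining $S$ sells during a time period of length $t$ is the following:

$$P(z_t = S) = \frac{\alpha}{2}\, e^{-\epsilon t}\, \frac{(\epsilon t)^S}{S!} + (1-\alpha)\, e^{-\epsilon t}\, \frac{(\epsilon t)^S}{S!} + \frac{\alpha}{2}\, e^{-(\mu+\epsilon)t}\, \frac{((\epsilon+\mu)t)^S}{S!} = \Big(1 - \frac{\alpha}{2}\Big)\, e^{-\epsilon t}\, \frac{(\epsilon t)^S}{S!} + \frac{\alpha}{2}\, e^{-(\mu+\epsilon)t}\, \frac{((\epsilon+\mu)t)^S}{S!}.$$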

4.1.2. Computation of First Three Moments

Let's compute the moment-generating function of this process. We will estimate the numerator using relations between moments. Let $u$ be a real value, let $V_S$ be the random variable representing the volume of sells and $t$ the associated fixed time period. We have:

$$E(e^{V_S u}) = \Big(1 - \frac{\alpha}{2}\Big)\, e^{\epsilon t (e^u - 1)} + \frac{\alpha}{2}\, e^{(\epsilon+\mu) t (e^u - 1)}.$$

Let's compute the first three moments of $V_S$:

· First moment:

$$E(V_S\, e^{V_S u}) = \Big(1 - \frac{\alpha}{2}\Big)\, \epsilon t\, e^u\, e^{\epsilon t (e^u - 1)} + \frac{\alpha}{2}\, (\epsilon+\mu) t\, e^u\, e^{(\epsilon+\mu) t (e^u - 1)},$$

so:

$$E(V_S) = \Big(\epsilon + \frac{\alpha\mu}{2}\Big)\, t.$$

· Second moment:

$$E(V_S^2\, e^{V_S u}) = \Big(1 - \frac{\alpha}{2}\Big) \big[\epsilon t\, e^u + (\epsilon t)^2 e^{2u}\big]\, e^{\epsilon t (e^u - 1)} + \frac{\alpha}{2} \big[(\epsilon+\mu) t\, e^u + ((\epsilon+\mu) t)^2 e^{2u}\big]\, e^{(\epsilon+\mu) t (e^u - 1)},$$

so:

$$E(V_S^2) = \Big(\epsilon + \frac{\alpha\mu}{2}\Big)\, t + \Big(\alpha\mu\epsilon + \epsilon^2 + \frac{\alpha\mu^2}{2}\Big)\, t^2,$$

i.e. we have the classic decomposition:

$$E(V_S^2) = E(V_S) + E(V_S (V_S - 1)).$$

· Third moment:

$$E(V_S^3\, e^{V_S u}) = \Big(1 - \frac{\alpha}{2}\Big) \big[\epsilon t\, e^u + (\epsilon t)^2 e^{2u} + 2 (\epsilon t)^2 e^{2u} + (\epsilon t)^3 e^{3u}\big]\, e^{\epsilon t (e^u - 1)} + \frac{\alpha}{2} \big[(\epsilon+\mu) t\, e^u + ((\epsilon+\mu) t)^2 e^{2u} + 2 ((\epsilon+\mu) t)^2 e^{2u} + ((\epsilon+\mu) t)^3 e^{3u}\big]\, e^{(\epsilon+\mu) t (e^u - 1)},$$

so:

$$E(V_S^3) = \Big(\epsilon + \frac{\alpha\mu}{2}\Big)\, t + 3 \Big(\alpha\mu\epsilon + \epsilon^2 + \frac{\alpha\mu^2}{2}\Big)\, t^2 + \Big(\epsilon^3 + \frac{3}{2}\alpha\mu\epsilon^2 + \frac{3}{2}\alpha\epsilon\mu^2 + \frac{\mu^3}{2}\alpha\Big)\, t^3,$$

i.e. we just wrote:

$$E(V_S^3) = E(V_S) + 3\, E(V_S (V_S - 1)) + E(V_S (V_S - 1)(V_S - 2)).$$
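The three moments can be cross-checked numerically. The sketch below relies on the standard fact that the $k$-th factorial moment of a Poisson variable of mean $\lambda$ is $\lambda^k$, applied to the two-component mixture of the marginal function; the function names and the truncation bound `n_max` are illustrative assumptions.

```python
import math


def factorial_moment(k, alpha, eps, mu, t):
    """E[V(V-1)...(V-k+1)] for the sell count V_S: a Poisson mixture with
    weight 1 - alpha/2 on mean eps*t and weight alpha/2 on mean (eps+mu)*t."""
    return (1 - alpha / 2) * (eps * t) ** k + (alpha / 2) * ((eps + mu) * t) ** k


def raw_moments(alpha, eps, mu, t):
    """E[V], E[V^2], E[V^3] rebuilt from the factorial moments."""
    f1, f2, f3 = (factorial_moment(k, alpha, eps, mu, t) for k in (1, 2, 3))
    return f1, f1 + f2, f1 + 3 * f2 + f3


def closed_form_moments(alpha, eps, mu, t):
    """The same three moments written as polynomials in t, as in the text."""
    m1 = (eps + alpha * mu / 2) * t
    q2 = (alpha * mu * eps + eps ** 2 + alpha * mu ** 2 / 2) * t ** 2
    q3 = (eps ** 3 + 1.5 * alpha * mu * eps ** 2
          + 1.5 * alpha * eps * mu ** 2 + alpha * mu ** 3 / 2) * t ** 3
    return m1, m1 + q2, m1 + 3 * q2 + q3


def brute_force_raw_moment(k, alpha, eps, mu, t, n_max=400):
    """E[V^k] by direct summation of the marginal probability function,
    truncated at n_max (adequate only for small intensities)."""
    total = 0.0
    for s in range(n_max + 1):
        pmf = 0.0
        for weight, lam in ((1 - alpha / 2, eps * t), (alpha / 2, (eps + mu) * t)):
            pmf += weight * math.exp(-lam + s * math.log(lam) - math.lgamma(s + 1))
        total += s ** k * pmf
    return total
```

The three routes (factorial moments, polynomials in $t$, direct summation) should agree up to floating-point error for any admissible parameters with small enough intensities.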

4.1.3. Estimation of α

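Note the following:

$$E(V_S (V_S - 1)) - (E(V_S))^2 = \Big(\alpha\mu\epsilon + \epsilon^2 + \frac{\alpha\mu^2}{2} - \Big(\epsilon + \frac{\alpha\mu}{2}\Big)^2\Big)\, t^2 = \Big(\frac{\mu^2\alpha}{2} - \frac{\alpha^2\mu^2}{4}\Big)\, t^2 = \Big(\frac{\alpha\mu t}{2}\Big)^2 \Big[\frac{2-\alpha}{\alpha}\Big].$$

Then, with the same idea, let's compute the following:

$$\begin{aligned}
E(V_S (V_S - 1)(V_S - 2)) - (E(V_S))^3
&= \Big[\Big(\epsilon^3 + \frac{3}{2}\alpha\mu\epsilon^2 + \frac{3}{2}\alpha\epsilon\mu^2 + \frac{\mu^3}{2}\alpha\Big) - \Big(\epsilon + \frac{\alpha\mu}{2}\Big)^3\Big]\, t^3 \\
&= 3\epsilon t \Big(\frac{\alpha\mu t}{2}\Big)^2 \Big[\frac{2-\alpha}{\alpha}\Big] + \Big(\frac{\alpha\mu t}{2}\Big)^3 \Big[\frac{4-\alpha^2}{\alpha^2}\Big] \\
&= 3\epsilon t \Big(\frac{\alpha\mu t}{2}\Big)^2 \Big[\frac{2-\alpha}{\alpha}\Big] + \Big(\frac{\alpha\mu t}{2}\Big)^2\, \frac{2-\alpha}{\alpha}\; \frac{\alpha\mu t}{2}\; \frac{\alpha+2}{\alpha},
\end{aligned}$$

and we know that $\epsilon t = E(V_S) - \frac{\alpha\mu t}{2}$ and that $\big(\frac{\alpha\mu t}{2}\big)^2\, \frac{2-\alpha}{\alpha} = E(V_S (V_S - 1)) - (E(V_S))^2$, so:

$$E(V_S (V_S - 1)(V_S - 2)) - (E(V_S))^3 = \Big(E(V_S (V_S - 1)) - (E(V_S))^2\Big) \Big(3 E(V_S) - 3\,\frac{\alpha\mu t}{2} + \frac{\mu t}{2}(2 + \alpha)\Big),$$

i.e.

$$\frac{E(V_S (V_S - 1)(V_S - 2)) - (E(V_S))^3}{E(V_S (V_S - 1)) - (E(V_S))^2} = 3 E(V_S) + \frac{\alpha\mu t}{2}\; \frac{2(1-\alpha)}{\alpha}.$$

Using the first formula again, we can then replace $\frac{\alpha\mu t}{2}$ by $\sqrt{\dfrac{E(V_S (V_S - 1)) - (E(V_S))^2}{\frac{2-\alpha}{\alpha}}}$:

$$\frac{E(V_S (V_S - 1)(V_S - 2)) - (E(V_S))^3}{E(V_S (V_S - 1)) - (E(V_S))^2} = 3 E(V_S) + \sqrt{\frac{E(V_S (V_S - 1)) - (E(V_S))^2}{\frac{2-\alpha}{\alpha}}}\; \frac{2(1-\alpha)}{\alpha},$$

so:

$$\frac{\dfrac{E(V_S (V_S - 1)(V_S - 2)) - (E(V_S))^3}{E(V_S (V_S - 1)) - (E(V_S))^2} - 3 E(V_S)}{\sqrt{E(V_S (V_S - 1)) - (E(V_S))^2}} = \frac{2(1-\alpha)}{\sqrt{\alpha(2-\alpha)}}.$$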

If we rearrange the numerator and the denominator on the left hand side of the equation, we note the following:

· Remark 1:

$$E(V_S (V_S - 1)) - (E(V_S))^2 = \mathrm{Var}(V_S) - E(V_S),$$

· Remark 2:

$$E(V_S (V_S - 1)(V_S - 2)) - (E(V_S))^3 - 3 E(V_S) E(V_S^2) + 3 E(V_S)^2 + 3 (E(V_S))^3 = E\big((V_S - E(V_S))^3\big) - 3\, \mathrm{Var}(V_S) + 2 E(V_S).$$

Thus:

$$\frac{E\big((V_S - E(V_S))^3\big) - 3\, \mathrm{Var}(V_S) + 2 E(V_S)}{\big(\mathrm{Var}(V_S) - E(V_S)\big)^{\frac{3}{2}}} = \frac{2(1-\alpha)}{\sqrt{\alpha(2-\alpha)}}.$$

Introducing the skewness $\gamma$ and the following notations: $\sigma = \sqrt{\mathrm{Var}(V_S)}$, $\gamma = E\Big(\Big(\dfrac{V_S - E(V_S)}{\sigma}\Big)^3\Big)$ and $m = E(V_S)$, we finally obtain:

$$\frac{\gamma\sigma^3 - 3\sigma^2 + 2m}{(\sigma^2 - m)^{\frac{3}{2}}} = \frac{2(1-\alpha)}{\sqrt{\alpha(2-\alpha)}}.$$

Skewness, standard deviation and expectation are measured from data. To estimate $\alpha$ we thus just have to solve the following second order equation in $\alpha$, obtained by squaring both sides:

$$\alpha^2 - 2\alpha + \frac{4}{4 + \dfrac{(\gamma\sigma^3 - 3\sigma^2 + 2m)^2}{(\sigma^2 - m)^3}} = 0.$$

The discriminant is positive: $\Delta = 4 \Big(1 - \dfrac{4}{4 + \frac{(\gamma\sigma^3 - 3\sigma^2 + 2m)^2}{(\sigma^2 - m)^3}}\Big)$. As $\alpha$ is a probability, we finally find:

$$\alpha = 1 - \sqrt{1 - \frac{4}{4 + \dfrac{(\gamma\sigma^3 - 3\sigma^2 + 2m)^2}{(\sigma^2 - m)^3}}},$$

which is indeed between 0 and 1.
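As a sanity check of this inversion, one can compute the exact mean, standard deviation and skewness of $V_S$ from known parameters and verify that the formula returns the original $\alpha$. The sketch below is an illustration only; `sell_moments` and `alpha_from_moments` are hypothetical names, and the code assumes $\mu > 0$ and $0 < \alpha \le 1$ so that $\sigma^2 - m > 0$.

```python
import math


def sell_moments(alpha, eps, mu, t):
    """Exact mean m, standard deviation sigma and skewness gamma of V_S
    under the Poisson mixture of the marginal function."""
    lam0, lam1, w = eps * t, (eps + mu) * t, alpha / 2
    m = (1 - w) * lam0 + w * lam1
    f2 = (1 - w) * lam0 ** 2 + w * lam1 ** 2   # E[V(V-1)]
    f3 = (1 - w) * lam0 ** 3 + w * lam1 ** 3   # E[V(V-1)(V-2)]
    e2 = m + f2                                # E[V^2]
    e3 = m + 3 * f2 + f3                       # E[V^3]
    var = e2 - m ** 2
    third_central = e3 - 3 * m * e2 + 2 * m ** 3
    sigma = math.sqrt(var)
    return m, sigma, third_central / sigma ** 3


def alpha_from_moments(m, sigma, gamma):
    """Root in [0, 1] of the second order equation in alpha."""
    d = sigma ** 2 - m
    n = gamma * sigma ** 3 - 3 * sigma ** 2 + 2 * m
    return 1 - math.sqrt(1 - 4 / (4 + n ** 2 / d ** 3))
```

For example, `alpha_from_moments(*sell_moments(0.4, 20.0, 60.0, 1.0))` should return 0.4 up to floating-point error.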

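4.1.4. Estimation of $\frac{\alpha\mu t}{2}$

We know that $E(V_S (V_S - 1)) - (E(V_S))^2 = \big(\frac{\alpha\mu t}{2}\big)^2 \big[\frac{2-\alpha}{\alpha}\big]$, so let's replace the $\alpha$ in the bracket on the right hand side of the equality (not the one multiplying $\mu t$) by the previous expression. We then estimate $\frac{\alpha\mu t}{2}$. Since $E(V_S (V_S - 1)) - (E(V_S))^2 = \sigma^2 - m$ with the previous notations, we finally obtain:

$$\frac{\alpha\mu t}{2} = \sqrt{\frac{1 - \sqrt{1 - \dfrac{4}{4 + \frac{(\gamma\sigma^3 - 3\sigma^2 + 2m)^2}{(\sigma^2 - m)^3}}}}{1 + \sqrt{1 - \dfrac{4}{4 + \frac{(\gamma\sigma^3 - 3\sigma^2 + 2m)^2}{(\sigma^2 - m)^3}}}}\; (\sigma^2 - m)}$$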

4.1.5. A New PIN Formula

Finally we obtain the following equivalent exact formula:

$$\mathrm{PIN} = \frac{\frac{\alpha\mu t}{2}}{\epsilon t + \frac{\alpha\mu t}{2}},$$

i.e.:

$$\mathrm{PIN} = \frac{1}{m} \sqrt{\frac{1 - \sqrt{1 - \dfrac{4}{4 + \frac{(\gamma\sigma^3 - 3\sigma^2 + 2m)^2}{(\sigma^2 - m)^3}}}}{1 + \sqrt{1 - \dfrac{4}{4 + \frac{(\gamma\sigma^3 - 3\sigma^2 + 2m)^2}{(\sigma^2 - m)^3}}}}\; (\sigma^2 - m)},$$

or, after simplifying a bit:

$$\mathrm{PIN} = \frac{2}{m}\; \frac{(\sigma^2 - m)^2}{\sqrt{4 (\sigma^2 - m)^3 + (\gamma\sigma^3 - 3\sigma^2 + 2m)^2} + \gamma\sigma^3 - 3\sigma^2 + 2m}.$$

One then just has to estimate $m$, $\sigma$ and $\gamma$ over an arbitrary time length $t$ to estimate the PIN. The difficulty then lies in estimating, over this time period, the volume of trades in each direction. We describe below a possible framework to compute this number. One can verify numerically that these two formulas give exactly the same PIN values.
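The numerical agreement of the two forms can be illustrated as follows. This is a minimal sketch with hypothetical function names; it assumes $\sigma^2 - m > 0$ and $\gamma\sigma^3 - 3\sigma^2 + 2m \ge 0$, which hold under the model for $\mu > 0$ and $0 < \alpha \le 1$.

```python
import math


def pin_nested(m, sigma, gamma):
    """First form: (1/m) * sqrt((1 - s) / (1 + s) * (sigma^2 - m))."""
    d = sigma ** 2 - m
    n = gamma * sigma ** 3 - 3 * sigma ** 2 + 2 * m
    s = math.sqrt(1 - 4 / (4 + n ** 2 / d ** 3))
    return math.sqrt((1 - s) / (1 + s) * d) / m


def pin_simplified(m, sigma, gamma):
    """Second form: (2/m) * d^2 / (sqrt(4 d^3 + n^2) + n)."""
    d = sigma ** 2 - m
    n = gamma * sigma ** 3 - 3 * sigma ** 2 + 2 * m
    return 2 * d ** 2 / (m * (math.sqrt(4 * d ** 3 + n ** 2) + n))
```

As a worked case, take the illustrative parameters $\alpha = 0.4$, $\epsilon = 20$, $\mu = 60$, $t = 1$: the exact moments are $m = 32$, $\sigma^2 = 608$ and a third central moment of $22496$, so that $\gamma = 22496 / 608^{3/2}$, and both functions should return the true PIN $24/64 = 0.375$.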

4.2. A New Framework to Compute the PIN

In this subsection we explain how at least one framework can be designed to compute the PIN. We would like to compute the PIN from a time, say $t$, over a period of length, say $\eta$, i.e. from $t$ to $t + \eta$. With the previous framework, we obviously have:

$$\mathrm{PIN}_{t,t+\eta} = \frac{\frac{\alpha_{t,t+\eta}\, \mu_{t,t+\eta}\, t}{2}}{\epsilon_{t,t+\eta}\, t + \frac{\alpha_{t,t+\eta}\, \mu_{t,t+\eta}\, t}{2}},$$

as all the numbers $\alpha$, $\mu$ and $\epsilon$ are defined for this time $t$ and period $\eta$. And we also have:

$$\mathrm{PIN}_{t,t+\eta} = \frac{2}{m}\; \frac{(\sigma^2 - m)^2}{\sqrt{4 (\sigma^2 - m)^3 + (\gamma\sigma^3 - 3\sigma^2 + 2m)^2} + \gamma\sigma^3 - 3\sigma^2 + 2m},$$

where $m$, $\gamma$ and $\sigma$ are calculated for the volume of sells $V_{S,t,t+\eta}$ between $t$ and $t + \eta$.

Thus two things must be implemented to estimate the PIN well:

· the empirical averages implicitly behind $m$, $\sigma$ and $\gamma$: we will have to make some hypotheses on the time series of volumes in order to use classic theorems,

· the volume of sells: one needs a classifier model to infer, over a given amount of time, the number of sells within the total volume.

Estimation of m, σ and γ

We would like to use the law of large numbers, which basically requires independent and identically distributed random variables. Denote by $N_{S,t}$ the Poisson process of sells at time $t$ (i.e. the number of sells up to $t$). Then we have:

$$V_{S,t,t+\eta} = N_{S,t+\eta} - N_{S,t}.$$

According to the model, Nature chooses the parameters at each time period $\eta$, independently from one period to the next. So $(V_{S,t,t+\eta})_t$ is a sequence of independent random variables (over successive non-overlapping periods). But nothing guarantees that the $V_{S,t,t+\eta}$ are identically distributed. Indeed, Nature's choices will not necessarily be the same, and neither will $\alpha_{t,t+\eta}$, $\gamma_{t,t+\eta}$ and $\sigma_{t,t+\eta}$. To handle this, one can do the following. We need a statistically significant mean. Within the time period $[t, t+\eta[$ (note 7), Nature's choice is the same, so considering $n$ intervals of length $\frac{\eta}{n}$ within $[t, t+\eta[$, the random variables $V_i = V_{S,\, t+(i-1)\frac{\eta}{n},\, t+i\frac{\eta}{n}}$, $i = 1, \dots, n$, are then independent and identically distributed. For $n$ high enough the following approximations hold:

· $m \approx \bar{V} = \dfrac{1}{n} \sum_{i=1}^{n} V_i$,

· $\sigma \approx \hat{\sigma} = \sqrt{\dfrac{1}{n-1} \sum_{i=1}^{n} \big(V_i - \bar{V}\big)^2}$,

· $\gamma \approx \dfrac{1}{n} \sum_{i=1}^{n} \Big(\dfrac{V_i - \bar{V}}{\hat{\sigma}}\Big)^3$.
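The three empirical estimators can be written in a few lines. The sketch below assumes the sub-interval sell volumes are already available as a list (producing them requires a trade classification step not covered here); the name `sample_moments` is hypothetical, and the sample mean is used inside the standard deviation, with the $n-1$ normalization.

```python
import math


def sample_moments(volumes):
    """Empirical mean, standard deviation (n - 1 normalization) and skewness
    of the sell volumes observed on the n sub-intervals."""
    n = len(volumes)
    mean = sum(volumes) / n
    sigma = math.sqrt(sum((v - mean) ** 2 for v in volumes) / (n - 1))
    gamma = sum(((v - mean) / sigma) ** 3 for v in volumes) / n
    return mean, sigma, gamma
```

For the toy list `[1, 2, 3, 4, 10]`, the mean is 4, the squared deviations sum to 50 and the cubed deviations sum to 180, which gives $\hat{\sigma} = \sqrt{12.5}$ and a skewness of $180 / (5 \times 12.5^{3/2})$.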

Thus the choices to make here are:

· the time length η ,

· the number n of sub-intervals to have a precise average.

To reduce the standard deviation of $\mathrm{PIN}_{t,t+\eta}$, one direct way is to average the PIN estimated using the volume of sells (denoted $\mathrm{PIN}^S_{t,t+\eta}$) and the PIN estimated using the volume of buys (denoted $\mathrm{PIN}^V_{t,t+\eta}$). Indeed, the previous calculations are exactly the same if one uses the volume of buys instead of the volume of sells. And within the PIN framework $V_S$ and $V_B$ are independent random variables. So:

$$\mathrm{PIN}_{t,t+\eta} = \frac{1}{2} \big(\mathrm{PIN}^S_{t,t+\eta} + \mathrm{PIN}^V_{t,t+\eta}\big),$$

and so an estimate of $\mathrm{PIN}^S_{t,t+\eta}$ is only a function of the process $(V_S)$, and $\mathrm{PIN}^V_{t,t+\eta}$ is the same function applied to the process $(V_B)$. Thus, these two estimates being independent, if one denotes by $\sigma_{\mathrm{PIN}^S}$ the standard deviation obtained using only the process $V_S$, the standard deviation $\sigma_{\mathrm{PIN}}$ obtained with both processes equals:

$$\sigma_{\mathrm{PIN}} = \frac{\sigma_{\mathrm{PIN}^S}}{\sqrt{2}}.$$
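The factor $\sqrt{2}$ follows from a one-line variance computation, sketched here under the additional assumption (implied by the symmetry $\delta = \frac{1}{2}$, but not stated explicitly above) that the sell-based and buy-based estimates have the same variance:

```latex
\operatorname{Var}\big(\mathrm{PIN}_{t,t+\eta}\big)
  = \frac{1}{4}\Big(\operatorname{Var}\big(\mathrm{PIN}^S_{t,t+\eta}\big)
                  + \operatorname{Var}\big(\mathrm{PIN}^V_{t,t+\eta}\big)\Big)
  = \frac{1}{4}\cdot 2\,\sigma_{\mathrm{PIN}^S}^2
  = \frac{\sigma_{\mathrm{PIN}^S}^2}{2}
\quad\Longrightarrow\quad
\sigma_{\mathrm{PIN}} = \frac{\sigma_{\mathrm{PIN}^S}}{\sqrt{2}}.
```

The first equality uses the independence of the two estimates, so that no covariance term appears.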

4.3. Some Simulation Verification


Finally, we present some simulation checks. First we describe the framework. Second we present the results. The parameter values tested are exactly the same as in the previous framework, as we would like to compare the previous results with the values of our new formula. The only difference, which slightly changes our framework, is that computing the new formula requires more samples. We detail this now.

4.3.1. Framework and Experiment Tested

For the purpose of illustration, we compare the empirical form $\dfrac{E(|B - S|)}{E(S + B)}$ with the PIN and the new formula (note 8) found within the time-clock framework for different cases of $\epsilon$ and $\mu$. This is easy to do: since all the parameters of the model are controlled ex ante, one just has to generate the appropriate Poisson processes to obtain all the values. We illustrate the results with three examples:

· $\epsilon = o(\mu)$ and $\mu$ of the same order as $\epsilon$: we took $\epsilon = 100$ and $\mu \in \{10000, 20000, 30000\}$,

· $\mu = o(\epsilon)$ and $\epsilon$ of the same order as $\mu$: we took $\mu = 100$ and $\epsilon \in \{10000, 20000, 30000\}$,

· $\epsilon$ of the same order as $\mu$: we took (note 9) $\epsilon = 10000$ and $\mu \in \{10000, 2000, 30000\}$.

Remarks:

· We compute 20 values for each choice of $\epsilon$ and $\mu$ in the three cases above,

· For each of the 20 values, for a choice of $\epsilon$ and $\mu$, we generate 1,000,000 values of the Poisson processes, which we divide into 100 consecutive intervals of 10,000 values. For each of the 100 intervals we compute empirical averages to approximate the mean $m$, the standard deviation $\sigma$ and the skewness $\gamma$. We then compute an approximation of the PIN as an average of these 100 values (note 10).
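The block-averaging procedure described in the last remark can be sketched as follows. This is a scaled-down illustration, not the code behind the figures: it uses small intensities and few blocks because the standard-library Poisson sampler below costs time proportional to the rate, it assumes a unit time period and $\delta = \frac{1}{2}$, and all function names are hypothetical.

```python
import math
import random


def poisson(rng, lam):
    """Draw a Poisson(lam) variate by counting unit-rate arrivals before time lam."""
    count = 0
    elapsed = rng.expovariate(1.0)
    while elapsed < lam:
        count += 1
        elapsed += rng.expovariate(1.0)
    return count


def draw_sells(rng, alpha, eps, mu):
    """One sell volume: rate eps + mu with probability alpha/2 (bad event), else eps."""
    rate = eps + mu if rng.random() < alpha / 2 else eps
    return poisson(rng, rate)


def npin(volumes):
    """New PIN formula (simplified form) evaluated on empirical moments."""
    n = len(volumes)
    m = sum(volumes) / n
    var = sum((v - m) ** 2 for v in volumes) / (n - 1)
    third_central = sum((v - m) ** 3 for v in volumes) / n   # gamma * sigma^3
    d = var - m
    num = third_central - 3 * var + 2 * m
    return 2 * d ** 2 / (m * (math.sqrt(4 * d ** 3 + num ** 2) + num))


def simulate_npin(alpha, eps, mu, n_blocks, block_size, seed=0):
    """Average of the new PIN formula over consecutive blocks of simulated sells."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_blocks):
        block = [draw_sells(rng, alpha, eps, mu) for _ in range(block_size)]
        estimates.append(npin(block))
    return sum(estimates) / n_blocks
```

With illustrative values such as `alpha=0.4`, `eps=20`, `mu=60`, the returned average can be compared with the true PIN $24/64 = 0.375$. Very small blocks may give an empirical $\sigma^2 - m$ that is not positive, in which case the formula is not applicable.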

4.3.2. Results

In each case (Figure 8, Figure 9 and Figure 10), we plot $\dfrac{E(|B - S|)}{E(S + B)}$ (VPIN), the PIN ($\dfrac{\alpha\mu}{\alpha\mu + 2\epsilon}$) and the new PIN value (labelled NPIN).

Case 1: $\epsilon = 100$, $\mu \in \{10000, 20000, 30000\}$

In Figure 8, the new formula (NPIN) and the PIN are very close.

Case 2: $\mu = 100$, $\epsilon \in \{10000, 20000, 30000\}$

Here, in Figure 9, one can see the difference better when $\mu$ is no longer changed.

Case 3: $\epsilon = 10000$, $\mu \in \{10000, 2000, 30000\}$

This last case, in Figure 10, illustrates a market where the numbers of informed and uninformed traders are of the same order. The VPIN slightly over-estimates the true PIN value.

In every case one sees that the estimate given by the new formula is closer to the PIN than the VPIN one. We have also checked that the new PIN formula equals the true PIN formula for any parameters $\epsilon$, $\mu$ and $\alpha$ of the model.

5. Conclusions

In this last section, we first present a general summary of our findings. Then we propose suggestions for further research on this topic.

Figure 8. Old (VPIN) and new approximation (NPIN) of the PIN.

Figure 9. Old (VPIN) and new approximation (NPIN) of the PIN.

Figure 10. Old (VPIN) and new approximation (NPIN) of the PIN.

In this study we have analyzed the theoretical foundation of the PIN model and we have shown that its time-clock framework makes it hard to apply the original VPIN heuristic to estimate the probability of informed trading. Indeed, the first order asymptotic is not simple to estimate, either theoretically or in practice. That is why we propose another way to estimate the PIN, which is theoretically exact and hence more precise than the asymptotic formula, as confirmed by our first tests. Moreover, the study recalls and highlights the difference between the volume-clock and time-clock paradigms, which lead to different formulas for the PIN, and whose respective hypotheses therefore cannot be used simultaneously to approximate the PIN.

Here are some ideas to further study this precise subject:

· test and compare the performance of the new formula within the time-clock framework with real trading data: find local optima parameters (n, η , trade classification algorithm, …) to maximize prediction quality,

· analyze and assess stability of the new formula and compare it to other ones.

Acknowledgements

We thank the Editor and the referee for their comments. Useful guidance and discussions in the LBNL team are gratefully acknowledged.

NOTES

1We summarize here the theoretic framework as described in [16]. Formally, considering the random variables corresponding to order arrival of sells and buys St and Bt we associate the canonical respective filtrations to define later conditioned expectations. They are still noted as the events “St” and “Bt”.

2We use the same notations as the author, distinguishing the events "t" and "$S_t$".

3The S and B labels no longer matter; to differentiate the Poisson processes of the last expectation we have thus just used the labels one and two to distinguish the "no event" case.

4The labels 1, 2, 3, … are used to indicate that these are the same distributions, but still different random variables.

5 E | K | ~ 2 ϵ π ( 1 α ) + α 2 π ( ϵ + ϵ ( ϵ + μ ) ) 2 ( ϵ ( ϵ + μ ) ) 3 4 e 2 ϵ ( ϵ + μ ) 2 ϵ μ + α μ i = 0 + P ( Y ϵ + μ = i ) ( P ( X ϵ i + 1 ) P ( X ϵ i + 1 ) ) . .

6This case is trickier: the asymptotic limit is actually closer to the empirical value than the first order approximation proposed by the authors, but the trend is not obvious and needs more study. We present here the good case that works fine. Further study may be needed.

7Let’s suppose that choices are made in this time interval, to not bother about possibly overlapping Nature’s choice.

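8For symmetry reasons, we use in these simulations the first form of the new formula of Section 4.1.5, i.e. $\mathrm{PIN} = \frac{1}{m}\sqrt{\frac{1 - s}{1 + s}\,(\sigma^2 - m)}$ with $s = \sqrt{1 - \frac{4}{4 + (\gamma\sigma^3 - 3\sigma^2 + 2m)^2 / (\sigma^2 - m)^3}}$.

9This case is trickier: the asymptotic limit is actually closer to the empirical value than the first order approximation proposed by the authors, but the trend is not obvious and needs more study. We present here the good case that works fine. Further study may be needed.

10This double average equals the traditional VPIN formula as the values are consecutive.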

Cite this paper: Bambade, A. (2019) A New Way to Compute the Probability of Informed Trading. Journal of Mathematical Finance, 9, 637-666. doi: 10.4236/jmf.2019.94032.
References

[1]   Easley, D., Engle, R.F., O’Hara, M. and Wu, L. (2008) Time-Varying Arrival Rates of Informed and Uninformed Trades. Journal of Financial Econometrics, 6, 171-207.
https://doi.org/10.1093/jjfinec/nbn003

[2]   Easley, D., de Prado, M.L. and O’Hara, M. (2012) The Volume Clock: Insights into the High Frequency Paradigm. Journal of Portfolio Management, 39, 19-29.

[3]   Easley, D., de Prado, M.L. and O’Hara, M. (2012) Flow Toxicity and Liquidity in a High Frequency World. Review of Financial Studies, 25, 1457-1493.
https://doi.org/10.1093/rfs/hhs053

[4]   Easley, D., de Prado, M.L. and O’Hara, M. (2011) The Microstructure of the “Flash Crash”: Flow Toxicity, Liquidity, Crashes and the Probability of Informed Trading. The Journal of Portfolio Management, 37, 118-128.
https://doi.org/10.2139/ssrn.1695041

[5]   Zheng, Y.M. (2017) VPIN and the China’s Circuit-Breaker. International Journal of Economics and Finance, 9, 126.
https://doi.org/10.5539/ijef.v9n12p126

[6]   Abad, D. (2011) From PIN to VPIN: An Introduction to Order Flow Toxicity. The Spanish Review of Financial Economics, 6, 8-13.
https://doi.org/10.1016/j.srfe.2012.10.002

[7]   Wu, K.S., et al. (2013) A Big Data Approach to Analyzing Market Volatility. Algorithmic Finance, 2, 241-267.

[8]   Easley, D., de Prado, M.L. and O’Hara, M. (2011) The Exchange of Flow Toxicity. The Journal of Trading, 6, 8-13.
https://doi.org/10.3905/jot.2011.6.2.008

[9]   Andersen, T.G. and Bondarenko, O. (2014) VPIN and the Flash Crash. The Journal of Financial Markets, 17, 1-46.
https://doi.org/10.1016/j.finmar.2013.05.005

[10]   Andersen, T.G. and Bondarenko, O. (2014) Reflecting on the VPIN Dispute. Journal of Financial Markets, 17, 53-64.
https://doi.org/10.1016/j.finmar.2013.08.002

[11]   Andersen, T.G. and Bondarenko, O. (2015) Assessing Measures of Order Flow Toxicity and Early Warning Signals for Market Turbulence. Review of Finance, 19, 1-54.
https://doi.org/10.1093/rof/rfu041

[12]   Abad, D., Massot, M. and Pascual, R. (2017) Evaluating VPIN as a Trigger for Single-Stock Circuit Breakers. Journal of Banking and Finance, 86, 21-36.
https://doi.org/10.1016/j.jbankfin.2017.08.009

[13]   Pöppe, T., Moss, S. and Schiereck, D. (2016) The Sensitivity of VPIN to the Choice of Trade Classification Algorithm. Journal of Banking and Finance, 73, 165-181.
https://doi.org/10.1016/j.jbankfin.2016.08.006

[14]   Ke, W.-C. and Lin, H.-W.W. (2017) An Improved Version of the Volume-Synchronized Probability of Informed Trading. Critical Finance Review, 6, 357-376.

[15]   Easley, D., de Prado, M.L. and O’Hara, M. (2017) An Improved Version of the Volume-Synchronized Probability of Informed Trading (VPIN): A Comment. Critical Finance Review, 6, 377-379.

[16]   Easley, D., Kiefer, N.M., O’Hara, M. and Paperman, J.B. (1996) Liquidity, Information, and Infrequently Traded Stocks. Journal of Finance, 51, 1405-1436.
https://doi.org/10.2307/2329399

[17]   Katti, S.K. (1959) Moments of the Absolute Difference and Absolute Deviation of Discrete Distributions. The Annals of Mathematical Statistics, 31, 78-85.
https://doi.org/10.1214/aoms/1177705989

[18]   Ramasubban, T.A. (1959) The Mean Difference and the Mean Deviation of Some Discontinuous Distributions. Biometrika, 45, 549-556.
https://doi.org/10.1093/biomet/45.3-4.549

[19]   Abramowitz, M. and Stegun, I.A. (1964) Handbook of Mathematical Functions. National Bureau of Standards Applied Mathematics Series 55, 377-378.

 
 