Optimal Demand-Side Management for Smart Micro Grid with Storage


1. Introduction

Demand-Side Management (DSM), the mechanism for managing the demand side in the next generation of the grid [1], seeks to address problems such as efficient energy usage, improvement of the demand profile, reduction of the operating cost, shifting of energy consumption to reduce the PAR, and balancing of power supply and demand [2]. Several previous works have studied how to implement DSM programs and how to motivate users to participate in them. Recently, research has concentrated on pricing mechanisms, principally ToU pricing, CPP [3], and RTP schemes [4], which give consumers economic incentives to schedule their energy use efficiently and obtain financial benefits. In that context, the authors in [5] have proposed various electricity market models.

Research on Bayesian games has gone well beyond the basic distinction between games with complete and incomplete information. The authors of [6] explore a Bayesian game-theoretic framework for multiple energy producers competing in an energy market, in which each producer (player) optimizes its own objective function given the utility demand. The authors in [7] developed a scheduling strategy for DSM based on a noncooperative game with incomplete information: each residential user does not know the instantaneous energy consumption of the other users, but the future overall consumption of all users is available as statistical information. The authors in [8] considered a game with incomplete information in which real-time information may not reach its destination reliably because of packet loss. In that scenario, the grid agent and the customer agents are the players, and each estimates the real-time demand and price from its probabilistic belief about the others. These earlier game-theoretic approaches to DSM did not take into account the presence and influence of storage in the smart micro-grid.

The rest of the manuscript is organized as follows. The MG model considered in our work is described in System Model. A novel DSM strategy based on game theory is developed in Demand-Side Management based on Bayesian Game Theory. Some performance results are illustrated in Numerical Results, where its use in the management of the recharge of PHEVs in a MG is analyzed.

2. System Model

In this study we consider a low-voltage MG consisting of $N=\left|\mathcal{N}\right|$ residential users, indexed by $n\in \mathcal{N}\triangleq \left\{1,\cdots ,n,\cdots ,N\right\}$ and equipped with RE (e.g., solar PV panels). Users are connected to each other and to the public utility via a power line. Residential consumers gathered in the MG community share their surplus energy by storing it in a shared ESU managed by a controller, and act as a single entity when interacting with the public utility. Each user has two types of power loads: ULs and SLs. ULs are appliances that can be turned on at arbitrary instants of the day, i.e. their energy consumption schedule is strictly constrained; that category contains appliances such as the refrigerator-freezer, heating, electric stove and lighting [9]. SLs are appliances whose activation can be flexibly scheduled within a specified interval of time during the day.

Furthermore, each household in the MG is assumed to be equipped with a SM which controls and monitors the energy sharing and the electricity consumption. Each household’s SM also exchanges information with the other SMs via a data network, including the RE forecasts, the energy prices and the customers’ demands at every instant, and can obtain information about the energy available in the storage unit. We assume that the communication between the MG and the power utility is supervised by a MSM, i.e. an upgraded SM adapted for operating at high power and serving as the intermediate link between the NAN and the main grid global network (BN). The architecture of the proposed MG is shown in Figure 1.

At every instant of time t, each household $n\in \mathcal{N}$ is characterized by the following power quantities: the renewable power produced by its own RESs and the power demand of its appliances (SLs and ULs). The real-time power exchanged by each customer with the MG is evaluated as follows:

${l}_{n}\left(t\right)={l}_{n}^{\left(r\right)}\left(t\right)+{l}_{n}^{\left(s\right)}\left(t\right)$ . (1)

Equation (1) gives the instantaneous power ${l}_{n}\left(t\right)$ exchanged by user n with the MG at time instant t. It is the sum of two terms: ${l}_{n}^{\left(r\right)}\left(t\right)$ , which accounts for both the power from the user’s RESs and the power absorbed by its ULs, and ${l}_{n}^{\left(s\right)}\left(t\right)$ , which depends on the activation of its SLs. The user’s power can be positive (if the user is absorbing power from the MG) or negative (if the user is supplying power to the MG), and it is constrained by the following inequality:

${P}_{g,\mathrm{max}}^{\left(n\right)}<{l}_{n}\text{\hspace{0.05em}}\left(t\right)<{L}_{a,\mathrm{max}}^{\left(n\right)}$ (2)

where
${P}_{g,\mathrm{max}}^{\left(n\right)}$ is the maximum power generated by the n^{th} user’s renewable resources and
${L}_{a,\mathrm{max}}^{\left(n\right)}$ is the maximum power consumed by the same user.

The power exchanged by the battery is modeled as follows. The battery’s controller provides real-time monitoring of the power ${p}_{b}\left(t\right)$ exchanged by the battery with the MG at time instant t; this quantity is positive (negative) if the battery is charging from (discharging to) the MG and satisfies the following inequality at any instant of time t:

${P}_{dch,\mathrm{max}}^{\left(b\right)}<{p}_{b}\text{\hspace{0.05em}}\left(t\right)<{P}_{ch,\mathrm{max}}^{\left(b\right)}$ (3)

where ${P}_{dch,\mathrm{max}}^{\left(b\right)}$ and ${P}_{ch,\mathrm{max}}^{\left(b\right)}$ represent respectively the maximum power that can be discharged from the battery and the maximum power that can be drawn to charge it.

Figure 1. Architecture of the MG [10].

The overall power monitored by the MSM (see Figure 1 for details) is derived as follows. Let ${l}_{T}\left(t\right)$ be that power at time instant t; it is expressed as follows:

${l}_{T}\left(t\right)={\displaystyle \underset{n=1}{\overset{N}{\sum}}}{l}_{n}\left(t\right)+{p}_{b}\left(t\right)$ . (4)

That overall power is bounded by two quantities: ${P}_{pu}^{\left(inj\right)}$ , the (negative) maximum power that can be injected into the main grid, and ${P}_{pu}^{\left(abs\right)}$ , the (positive) maximum power that can be absorbed from the public utility. These bounds lead to the following inequality:

${P}_{pu}^{\left(inj\right)}<{l}_{T}\left(t\right)<{P}_{pu}^{\left(abs\right)}$ . (5)

The maximum power that can be injected into the public utility is a negative value, equal to the sum of the maximum power generated by all RESs and the maximum power that the battery can deliver when discharging; it is expressed as follows:

${P}_{pu}^{\left(inj\right)}\triangleq {\displaystyle \underset{n=1}{\overset{N}{\sum}}}\text{\hspace{0.05em}}{P}_{g,\mathrm{max}}^{\left(n\right)}+{P}_{dch,\mathrm{max}}^{\left(b\right)}<0$ . (6)
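The power balance of Equations (1)-(6) can be illustrated with a minimal Python sketch. The class `UserLimits`, the helper names and all numerical values (in kW) are assumptions introduced here for illustration only; they are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class UserLimits:
    p_g_max: float  # most negative power the user can supply (<= 0), lower bound in Eq. (2)
    l_a_max: float  # largest power the user can absorb (> 0), upper bound in Eq. (2)


def user_power(l_r: float, l_s: float) -> float:
    """Eq. (1): net power exchanged by one user with the MG."""
    return l_r + l_s


def total_power(user_powers: Sequence[float], p_b: float) -> float:
    """Eq. (4): overall power monitored by the MSM."""
    return sum(user_powers) + p_b


def injection_limit(limits: Sequence[UserLimits], p_dch_max: float) -> float:
    """Eq. (6): (negative) maximum power that can be injected into the main grid."""
    return sum(u.p_g_max for u in limits) + p_dch_max


def within(lower: float, value: float, upper: float) -> bool:
    """Strict double inequality used in Eqs. (2), (3) and (5)."""
    return lower < value < upper


# Illustrative values in kW (assumed).
limits = [UserLimits(-3.0, 6.0), UserLimits(-2.0, 5.0)]
powers = [user_power(-1.0, 2.5), user_power(0.5, 0.0)]
p_b = -1.0        # battery discharging to the MG
p_dch_max = -4.0  # lower bound of Eq. (3)
p_pu_abs = 10.0   # assumed upper bound of Eq. (5)

l_T = total_power(powers, p_b)
p_pu_inj = injection_limit(limits, p_dch_max)
feasible = (
    all(within(u.p_g_max, p, u.l_a_max) for u, p in zip(limits, powers))
    and within(p_pu_inj, l_T, p_pu_abs)
)
```

With these assumed numbers the first user absorbs 1.5 kW, the second 0.5 kW, and the battery supplies 1 kW, so the MSM sees a net absorption from the public utility that lies inside the bounds of Equation (5).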

3. Demand-Side Management Based on Bayesian Game Theory

In this Section a brief description of our game model is provided and, on the basis of this model, a mixed strategy for the activation of SLs is developed.

3.1. Rules and Description of the Game

The SM installed at each prosumer’s premises is considered a player (player 1 in the following) that behaves in a selfish and rational manner and is capable of turning the load on or off. Furthermore, since this SM competes with the rest of the MG community for the available energy resources, we model the other $N-1$ prosumers as a single aggregated opponent, called player 2, and the shared battery as player 3, because it competes with all the prosumers when charging and discharging.

The power flow for player 2 is defined as the difference between the overall power available in the MG and power for player 1 and player 3; it can be expressed as follows:

${l}_{-n}\left(t\right)\triangleq {l}_{T}\left(t\right)-{l}_{n}\left(t\right)-{p}_{b}\left(t\right)={\displaystyle \underset{\begin{array}{c}m=1\\ m\ne n\end{array}}{\overset{N}{\sum}}}{l}_{m}\left(t\right)$ (7)

which satisfies the following inequality:

${P}_{g,\mathrm{max}}^{\left(-n\right)}<{l}_{-n}\left(t\right)<{L}_{a,\mathrm{max}}^{\left(-n\right)}$ (8)

where ${L}_{a,\mathrm{max}}^{\left(-n\right)}>0$ and ${P}_{g,\mathrm{max}}^{\left(-n\right)}\le 0$ are respectively the maximum powers absorbed and generated by player 2. We note that ${l}_{-n}\left(t\right)>0$ when player 2 is absorbing power from the MG and ${l}_{-n}\left(t\right)<0$ when player 2 is providing power to the MG.

These MG parameters lead to a simplified three-player game instead of a complicated game of $N+1$ players. Our three-player game model can be used to describe the interactions between each player and the rest of the MG community.

From every player’s point of view, there is a payoff associated with each of its actions. The evaluation of the payoffs gives a full description of the game. For our work we will assume that:

• For prosumers, the action of keeping SLs off is associated with a payoff equal to 0 for the corresponding prosumer, regardless of the power absorbed or generated by the other prosumers and the battery.

• The activation of SLs will entail a variation in the payoff for the corresponding prosumer because it will change the operating conditions of the MG, which means that the associated payoff $E{P}_{n}$ will depend on the expected (statistical) future consumption/generation of the whole MG community.

• The payoff of the battery (player 3) will depend on its charging and discharging efficiency. In other words, its payoff will decrease as the number of charging cycles grows; we assume that the battery’s life is reduced as a function of its charging cycles, which in turn affects its capacity.

The derivation of the expected payoffs $E{P}_{n}$ in the following sections of this work will take into account:

1) A pricing model for the power shared between the players and the MG, meaning that each power exchange is paid for or rewarded with a certain amount of monetary units.

2) Specific statistical information available at each prosumer’s SM and the battery’s controller.

3.2. Economic Model of the Smart Micro-Grid

The pricing model takes into account any power exchange between players and the MG. A provision of service, accounted by a power exchange between player and the MG, involves a variation in the total amount of virtual currency owned by the corresponding player. In our work, such variation depends on the MG’s condition and the cost function.

We assume in the following that this variation depends on the operating condition of the MG, represented by a state variable that can take two values. Those values describe the normal (briefly, state 0) and stress (briefly, state 1) operating conditions. To characterize those states, we consider a positive power threshold ${P}_{c}$ that satisfies the following inequalities:

${l}_{T}\left(h\right)\le {P}_{c}\text{\hspace{1em}}\text{for the normal state}$ (9)

${P}_{c}<{l}_{T}\left(h\right)<{P}_{pu}^{\left(abs\right)}\text{\hspace{1em}}\text{for the stress state}$ . (10)

The normal state represents the regular operating condition of the MG, whereas the stress state corresponds to a high power consumption that may entail a risk of blackout.

3.3. Cost Function

The cost function is defined as follows:

$\begin{array}{c}C\left({l}_{n},{l}_{-n},{p}_{b}\right)\triangleq -{k}_{A}\left({l}_{T}\right)\cdot \mathrm{max}\left({l}_{n},0\right)-{k}_{G}\left({l}_{T}\right)\cdot \mathrm{min}\left({l}_{n},0\right)\\ \text{\hspace{0.17em}}\text{\hspace{0.17em}}+{k}_{F}\left({l}_{n},{l}_{-n},{p}_{b}\right)\cdot g\left({l}_{n},{l}_{-n},{p}_{b}\right).\end{array}$ (11)

The cost function expresses, for given players’ powers, the cost (if negative) or reward (if positive) for the considered player. The first term of Equation (11) represents the cost associated with the power absorbed by player 1 from the MG, the second represents the gain coming from the power supplied to the MG, and the third is a fairness term that accounts for the instantaneous power exchange of player 1 relative to that of player 2 and player 3. The coefficients ${k}_{A}$ , ${k}_{G}$ and ${k}_{F}$ are weight functions that can be adjusted by the MSM in order to influence the behavior of the MG community. The time dependence of the powers has been omitted to ease the reading.

In our work, the weight functions appearing in the right hand side of Equation (11) are given by the following expression:

${k}_{X}\left({l}_{T}\right)\triangleq \left\{\begin{array}{ll}{k}_{X}^{\left(j\right)}\hfill & \text{for}\text{\hspace{0.17em}}{l}_{T}\le {P}_{c}\hfill \\ {k}_{X}^{\left(j\right)}+{k}_{X}^{\left(j\right)}\left({l}_{T}-{P}_{c}\right)/{P}_{c}\hfill & \text{for}\text{\hspace{0.17em}}{l}_{T}>{P}_{c}\hfill \end{array}\right.$ . (12)

In Equation (12), X can take two values, A (absorbing) and G (generating), and j can also take two values, 0 (normal state) and 1 (stress state).
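As an illustration, Equations (11) and (12) can be evaluated with the short Python sketch below. It reads Equation (12) literally, using a single coefficient per weight function, and it takes the fairness term ${k}_{F}\cdot g$ as an externally supplied number because neither ${k}_{F}$ nor $g$ is specified in this section. The function names and the numerical values are assumptions made for the example.

```python
def weight(k: float, l_T: float, p_c: float) -> float:
    """Eq. (12): constant weight up to the threshold P_c, linearly increasing above it."""
    if l_T <= p_c:
        return k
    return k + k * (l_T - p_c) / p_c


def cost(l_n: float, l_minus_n: float, p_b: float,
         k_a: float, k_g: float, p_c: float,
         fairness: float = 0.0) -> float:
    """Eq. (11): negative values are costs, positive values are rewards.

    `fairness` stands in for the unspecified term k_F * g (assumed 0 by default).
    """
    l_T = l_n + l_minus_n + p_b  # overall MG power, from Eq. (7)
    absorbed = max(l_n, 0.0)     # power drawn by player 1 from the MG
    supplied = min(l_n, 0.0)     # power supplied by player 1 to the MG (<= 0)
    return -weight(k_a, l_T, p_c) * absorbed - weight(k_g, l_T, p_c) * supplied + fairness


# Assumed coefficients: k_A = 0.2, k_G = 0.1 (monetary units per kW), P_c = 8 kW.
normal_absorbing = cost(2.0, 4.0, 1.0, k_a=0.2, k_g=0.1, p_c=8.0)   # l_T = 7 <= P_c
normal_supplying = cost(-2.0, 4.0, 1.0, k_a=0.2, k_g=0.1, p_c=8.0)  # player 1 is rewarded
stress_absorbing = cost(2.0, 8.0, 2.0, k_a=0.2, k_g=0.1, p_c=8.0)   # l_T = 12 > P_c
```

In this sketch the same 2 kW absorption costs more in the stress state than in the normal state, which is the behavior the weight functions are meant to induce.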

3.4. Statistical Information Evaluated by Each Player

Since we work within Bayesian game theory, we assumed in item 2) of Section 3.1 that statistical information is available at every SM and at the battery controller. Player 1 is therefore provided with three different probability density functions (pdfs), ${f}_{{l}_{T}^{\left(j\right)}}\left(x;\tau \right)$ , ${f}_{{l}_{n}^{\left(r\right)}}\left(x;\tau \right)$ and ${f}_{{p}_{b}}\left(x;\tau \right)$ , which are related to the overall power available in the MG, the n^{th} prosumer’s behavior and the battery’s behavior respectively. In short, the statistical knowledge of the controller and of the n^{th} SM about the complete MG can be summarized as follows:

1) The first order probability density function (pdf) ${f}_{{l}_{T}^{\left(j\right)}}\left(x;\tau \right)$ with $\left(\tau >t\right)$ which refers to the overall power absorbed by the MG or supplied to the public utility without taking into account DSM.

2) The first order pdf ${f}_{{l}_{n}^{\left(r\right)}}\left(x;\tau \right)$ of the instantaneous portion ${l}_{n}^{\left(r\right)}\left(t\right)$ of ${l}_{n}\left(t\right)$ (see Equation (1)).

3) The first order pdf ${f}_{{p}_{b}}\left(x\mathrm{;}\tau \right)$ of the instantaneous battery power level ${p}_{b}\left(t\right)$ (see Equation (3)).

In order to derive the payoff function $E{P}_{n}$ , the prosumer’s SM needs to know the joint pdf ${f}_{{l}_{n}^{\left(r\right)},{p}_{b},{l}_{-n}^{\left(j\right)}}\left(x,y,z;\tau \right)$ . Because player 2 aggregates many prosumers, the statistical behavior of ${l}_{-n}^{\left(j\right)}$ may differ from that of player 1 and player 3 in terms of power consumption and generation. We will assume in the following that the joint pdf can be factored, i.e. that the three quantities are statistically independent:

${f}_{{l}_{n}^{\left(r\right)},{p}_{b},{l}_{-n}^{\left(j\right)}}\left(x,y,z;\tau \right)={f}_{{l}_{n}^{\left(r\right)}}\left(x;\tau \right){f}_{{p}_{b}}\left(y;\tau \right){f}_{{l}_{-n}^{\left(j\right)}}\left(z;\tau \right)$ . (13)

It is interesting to mention that in order to estimate the above indicated pdfs, specific learning algorithms have to be developed for a real world implementation of the suggested strategy.

Firstly, the pdf ${f}_{{l}_{T}^{\left(j\right)}}\left(x;\tau \right)$ can be evaluated by the MSM and its representation then distributed to all players. To achieve that, the MSM must be provided with exact knowledge of the past consumption/generation of the players. After receiving the necessary data about the users and the weather predictions, the MSM exploits them using machine learning tools such as regression models to forecast the statistical behavior of the MG. The approximation ${f}_{{l}_{T}^{\left(j\right)}}\left(x;\tau \right)\cong {f}_{{l}_{-n}^{\left(j\right)}}\left(x;\tau \right)$ can be adopted when the number N of prosumers is large.

Secondly, the pdf ${f}_{{l}_{n}^{\left(r\right)}}\left(x;\tau \right)$ can be approximated by each prosumer’s SM using machine learning tools on the basis of its real-time energy consumption data stored over a number of different days.

Lastly, the pdf ${f}_{{p}_{b}}\left(x;\tau \right)$ can be estimated by the battery controller, which senses the battery power level in real time and is able to predict the statistical charging/discharging behavior of the battery.
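The text leaves the choice of learning algorithm open. As one simple stand-in, not the method of the paper, a normalized histogram built from samples recorded at the same time of day on different days gives a piecewise-constant estimate of a pdf such as ${f}_{{l}_{n}^{\left(r\right)}}\left(x;\tau \right)$ for a fixed $\tau $ . The function name, the bin layout and the sample values below are assumptions.

```python
from typing import List, Sequence, Tuple


def histogram_pdf(samples: Sequence[float], lo: float, hi: float,
                  n_bins: int) -> Tuple[List[float], float]:
    """Piecewise-constant density estimate on [lo, hi).

    Returns (density per bin, bin width). Samples outside [lo, hi) are ignored.
    """
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    kept = 0
    for x in samples:
        if lo <= x < hi:
            # Clamp the index to guard against floating-point rounding at the upper edge.
            idx = min(int((x - lo) / width), n_bins - 1)
            counts[idx] += 1
            kept += 1
    if kept == 0:
        raise ValueError("no samples inside [lo, hi)")
    # Dividing by (kept * width) makes the estimate integrate to 1 over [lo, hi).
    return [c / (kept * width) for c in counts], width


# Assumed measurements (kW) of l_n^(r) at one time of day over six days.
density, width = histogram_pdf([-1.0, -0.5, 0.2, 0.4, 1.5, 2.5], lo=-2.0, hi=4.0, n_bins=3)
```

In practice the bounds `lo` and `hi` would naturally be the limits of Equation (2), and a smoother estimator or a regression model could replace the histogram.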

3.5. Derivation of the Expected Payoff

Knowing the cost function $C\left({l}_{n},{l}_{-n},{p}_{b}\right)$ given in Equation (11) and the statistical information previously described, the expected payoff related to the switching on (briefly, ON) of SLs can be calculated as follows. We first define the expected overall cost $E{C}_{n}\left({l}_{n}^{\left(s\right)};{t}_{sl,0}^{\left(n\right)},{t}_{sl,1}^{\left(n\right)}\right)$ charged to player 1 for its power flow in the time slot $\left[{t}_{sl,0}^{\left(n\right)},{t}_{sl,1}^{\left(n\right)}\right]$ , with ${t}_{sl,1}^{\left(n\right)}={t}_{sl,0}^{\left(n\right)}+{T}_{sl}^{\left(n\right)}$ . It is the integral over that interval of the cost function averaged with respect to ${l}_{n}^{\left(r\right)}$ , ${l}_{-n}^{\left(j\right)}$ and ${p}_{b}$ , and is given by the following equation (note that ${l}_{n}\left(t\right)<{P}_{pu}^{\left(abs\right)}-{l}_{-n}\left(t\right)-{p}_{b}\left(t\right)$ ; see Equations (5) and (7)):

$\begin{array}{l}E{C}_{n}\left({l}_{n}^{\left(s\right)};{t}_{sl,0}^{\left(n\right)},{t}_{sl,1}^{\left(n\right)}\right)\\ \triangleq {\displaystyle {\int}_{\tau ={t}_{sl,0}^{\left(n\right)}}^{{t}_{sl,1}^{\left(n\right)}}}{\displaystyle {\int}_{{x}_{2}={P}_{g,\mathrm{max}}^{\left(-n\right)}}^{{P}_{pu}^{\left(abs\right)}-{L}_{a,\mathrm{max}}^{\left(n\right)}-{P}_{ch,\mathrm{max}}^{\left(b\right)}}}{\displaystyle {\int}_{{x}_{1}={P}_{g,\mathrm{max}}^{\left(n\right)}}^{\mathrm{min}\left({L}_{a,\mathrm{max}}^{\left(n\right)},{P}_{pu}^{\left(abs\right)}-{x}_{2}-{P}_{ch,\mathrm{max}}^{\left(b\right)}\right)}}{\displaystyle {\int}_{{x}_{3}={P}_{dch,\mathrm{max}}^{\left(b\right)}}^{{P}_{ch,\mathrm{max}}^{\left(b\right)}}}C\left({l}_{n},{l}_{-n},{p}_{b}\right)\\ \text{\hspace{0.17em}}\text{\hspace{0.17em}}\cdot {f}_{{l}_{n}^{\left(r\right)},{p}_{b},{l}_{-n}^{\left(j\right)}}\left({x}_{1}-{l}_{n}^{\left(s\right)}\left(\tau \right),{x}_{3},{x}_{2};\tau \right)\text{d}{x}_{1}\text{d}{x}_{2}\text{d}{x}_{3}\text{d}\tau \end{array}$ (14)

Next, the expected payoff $E{P}_{n}$ associated with the ON action of player 1 is defined as the difference between the expected cost related to the activation of the considered load at $t={t}_{sl,0}^{\left(n\right)}$ and that associated with keeping it off, which leads to:

$\begin{array}{l}E{P}_{n}\left({l}_{n}^{\left(s\right)+},{l}_{n}^{\left(s\right)-};{t}_{sl,0}^{\left(n\right)},{t}_{sl,1}^{\left(n\right)}\right)\\ \triangleq E{C}_{n}\left({l}_{n}^{\left(s\right)+};{t}_{sl,0}^{\left(n\right)},{t}_{sl,1}^{\left(n\right)}\right)-E{C}_{n}\left({l}_{n}^{\left(s\right)-};{t}_{sl,0}^{\left(n\right)},{t}_{sl,1}^{\left(n\right)}\right)\end{array}$ (15)

The functions ${l}_{n}^{\left(s\right)+}$ and ${l}_{n}^{\left(s\right)-}$ denote ${l}_{n}^{\left(s\right)}\left(t\right)$ (see Equation (1)) under the ON and OFF actions respectively. A simplified expression for $E{P}_{n}\left({l}_{n}^{\left(s\right)+},{l}_{n}^{\left(s\right)-};{t}_{sl,0}^{\left(n\right)},{t}_{sl,1}^{\left(n\right)}\right)$ in Equation (15) can be found as follows. We first substitute Equation (13) into Equation (14) and get:

$\begin{array}{l}E{C}_{n}\left({l}_{n}^{\left(s\right)};{t}_{sl,0}^{\left(n\right)},{t}_{sl,1}^{\left(n\right)}\right)\\ \cong {\displaystyle {\int}_{\tau ={t}_{sl,0}^{\left(n\right)}}^{{t}_{sl,1}^{\left(n\right)}}}{\displaystyle {\int}_{{x}_{2}={P}_{g,\mathrm{max}}^{\left(-n\right)}}^{{P}_{pu}^{\left(abs\right)}-{L}_{a,\mathrm{max}}^{\left(n\right)}-{P}_{ch,\mathrm{max}}^{\left(b\right)}}}{f}_{{l}_{-n}^{\left(j\right)}}\left({x}_{2};\tau \right)\cdot {\displaystyle {\int}_{{x}_{3}={P}_{dch,\mathrm{max}}^{\left(b\right)}}^{{P}_{ch,\mathrm{max}}^{\left(b\right)}}}{f}_{{p}_{b}}\left({x}_{3};\tau \right)\\ \text{\hspace{0.17em}}\text{\hspace{0.17em}}\cdot {\displaystyle {\int}_{{x}_{1}={P}_{g,\mathrm{max}}^{\left(n\right)}}^{\mathrm{min}\left({L}_{a,\mathrm{max}}^{\left(n\right)},{P}_{pu}^{\left(abs\right)}-{x}_{2}-{P}_{ch,\mathrm{max}}^{\left(b\right)}\right)}}C\left({l}_{n},{l}_{-n},{p}_{b}\right)\cdot {f}_{{l}_{n}^{\left(r\right)}}\left({x}_{1}-{l}_{n}^{\left(s\right)}\left(\tau \right);\tau \right)\text{d}{x}_{1}\text{d}{x}_{2}\text{d}{x}_{3}\text{d}\tau \end{array}$ (16)

We can further simplify the equation by replacing the upper limit $\mathrm{min}\left({L}_{a,\mathrm{max}}^{\left(n\right)},{P}_{pu}^{\left(abs\right)}-{x}_{2}-{P}_{ch,\mathrm{max}}^{\left(b\right)}\right)$ of the integral over ${x}_{1}$ on the right-hand side of Equation (16) with ${L}_{a,\mathrm{max}}^{\left(n\right)}$ ; this simplification is justified by the fact that the integrand takes negligible values in the interval that is thereby added to the integration domain, which yields:

$\begin{array}{l}E{C}_{n}\left({l}_{n}^{\left(s\right)};{t}_{sl,0}^{\left(n\right)},{t}_{sl,1}^{\left(n\right)}\right)\\ \cong {\displaystyle {\int}_{\tau ={t}_{sl,0}^{\left(n\right)}}^{{t}_{sl,1}^{\left(n\right)}}}{\displaystyle {\int}_{{x}_{2}={P}_{g,\mathrm{max}}^{\left(-n\right)}}^{{P}_{pu}^{\left(abs\right)}-{L}_{a,\mathrm{max}}^{\left(n\right)}-{P}_{ch,\mathrm{max}}^{\left(b\right)}}}{f}_{{l}_{-n}^{\left(j\right)}}\left({x}_{2};\tau \right){\displaystyle {\int}_{{x}_{3}={P}_{dch,\mathrm{max}}^{\left(b\right)}}^{{P}_{ch,\mathrm{max}}^{\left(b\right)}}}{f}_{{p}_{b}}\left({x}_{3};\tau \right)\\ \text{\hspace{0.17em}}\text{\hspace{0.17em}}\cdot {\displaystyle {\int}_{{x}_{1}={P}_{g,\mathrm{max}}^{\left(n\right)}}^{{L}_{a,\mathrm{max}}^{\left(n\right)}}}C\left({l}_{n},{l}_{-n},{p}_{b}\right){f}_{{l}_{n}^{\left(r\right)}}\left({x}_{1}-{l}_{n}^{\left(s\right)}\left(\tau \right);\tau \right)\text{d}{x}_{1}\text{d}{x}_{2}\text{d}{x}_{3}\text{d}\tau \end{array}$ (17)

We finally replace Equation (17) in Equation (15) which leads to the following equation after some manipulations:

$E{P}_{n}\left({l}_{n}^{\left(s\right)+},{l}_{n}^{\left(s\right)-};{t}_{sl,0}^{\left(n\right)},{t}_{sl,1}^{\left(n\right)}\right)\cong {\displaystyle {\int}_{{l}_{-n}={P}_{g,\mathrm{max}}^{\left(-n\right)}}^{{P}_{pu}^{\left(abs\right)}-{L}_{a,\mathrm{max}}^{\left(n\right)}-{P}_{ch,\mathrm{max}}^{\left(b\right)}}}\beta \left({l}_{-n},{l}_{n}^{\left(s\right)+},{l}_{n}^{\left(s\right)-};{t}_{sl,0}^{\left(n\right)},{t}_{sl,1}^{\left(n\right)}\right)\text{d}{l}_{-n}$ (18)

where:

$\begin{array}{l}\beta \left({l}_{-n},{l}_{n}^{\left(s\right)+},{l}_{n}^{\left(s\right)-};{t}_{sl,0}^{\left(n\right)},{t}_{sl,1}^{\left(n\right)}\right)\\ ={\displaystyle {\int}_{\tau ={t}_{sl,0}^{\left(n\right)}}^{{t}_{sl,1}^{\left(n\right)}}}{f}_{{l}_{-n}^{\left(j\right)}}\left({l}_{-n};\tau \right){\displaystyle {\int}_{y={P}_{dch,\mathrm{max}}^{\left(b\right)}}^{{P}_{ch,\mathrm{max}}^{\left(b\right)}}}{f}_{{p}_{b}}\left(y;\tau \right){\displaystyle {\int}_{x={P}_{g,\mathrm{max}}^{\left(n\right)}}^{{L}_{a,\mathrm{max}}^{\left(n\right)}}}C\left({l}_{n},{l}_{-n},{p}_{b}\right)\\ \text{\hspace{0.17em}}\text{\hspace{0.17em}}\cdot \left[{f}_{{l}_{n}^{\left(r\right)}}\left(x-{l}_{n}^{\left(s\right)+};\tau \right)-{f}_{{l}_{n}^{\left(r\right)}}\left(x-{l}_{n}^{\left(s\right)-};\tau \right)\right]\text{d}x\text{d}y\text{d}\tau \end{array}$ (19)
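The expected payoff of Equation (15) can also be approximated numerically. The Python sketch below is a Monte Carlo estimate under the independence assumption of Equation (13): the three powers are drawn independently at each time step, and the cost difference between the ON and OFF profiles is averaged and integrated over the slot. Several simplifications are assumptions made here, not part of the paper: Gaussian marginals, no truncation to the integration limits of Equations (14)-(19), the fairness term of Equation (11) set to zero, and all numerical values.

```python
import random
from typing import Callable

Draw = Callable[[float], float]     # tau -> one random sample of a power at time tau
Profile = Callable[[float], float]  # tau -> shiftable-load power l_n^(s)(tau)


def weight(k: float, l_T: float, p_c: float) -> float:
    """Eq. (12), read literally with a single coefficient k."""
    return k if l_T <= p_c else k + k * (l_T - p_c) / p_c


def cost(l_n: float, l_minus_n: float, p_b: float,
         k_a: float = 0.2, k_g: float = 0.1, p_c: float = 8.0) -> float:
    """Eq. (11) without the fairness term (k_F and g are not specified here)."""
    l_T = l_n + l_minus_n + p_b
    return -weight(k_a, l_T, p_c) * max(l_n, 0.0) - weight(k_g, l_T, p_c) * min(l_n, 0.0)


def expected_payoff(l_s_on: Profile, l_s_off: Profile, t0: float, t1: float,
                    draw_l_r: Draw, draw_l_minus_n: Draw, draw_p_b: Draw,
                    dt: float = 300.0, n_samples: int = 500) -> float:
    """Monte Carlo estimate of EP_n = EC_n(ON) - EC_n(OFF), Eq. (15)."""
    steps = int(round((t1 - t0) / dt))
    total = 0.0
    for i in range(steps):
        tau = t0 + (i + 0.5) * dt  # midpoint rule for the time integral
        acc = 0.0
        for _ in range(n_samples):
            # Independent draws, as allowed by the factorization in Eq. (13).
            l_r, l_m, p_b = draw_l_r(tau), draw_l_minus_n(tau), draw_p_b(tau)
            # The same draws are used for ON and OFF to reduce the estimator variance.
            acc += cost(l_r + l_s_on(tau), l_m, p_b) - cost(l_r + l_s_off(tau), l_m, p_b)
        total += dt * acc / n_samples
    return total


rng = random.Random(0)  # fixed seed for reproducibility
payoff = expected_payoff(
    l_s_on=lambda tau: 2.0,   # assumed 2 kW shiftable load when ON
    l_s_off=lambda tau: 0.0,  # load kept OFF
    t0=0.0, t1=3600.0,
    draw_l_r=lambda tau: rng.gauss(1.0, 0.3),
    draw_l_minus_n=lambda tau: rng.gauss(4.0, 1.0),
    draw_p_b=lambda tau: rng.gauss(0.0, 0.5),
)
```

With these assumed inputs the estimate is negative, since switching on a load that only absorbs power can only add cost; comparing it across candidate start times is what matters for scheduling.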

The function $\beta $ in Equation (19) can be interpreted as an expected cost density because it indicates how the overall expected payoff $E{P}_{n}\left({l}_{n}^{\left(s\right)+},{l}_{n}^{\left(s\right)-};{t}_{sl,0}^{\left(n\right)},{t}_{sl,1}^{\left(n\right)}\right)$ of Equation (18) is distributed over the ${l}_{-n}$ axis in the considered time interval $\left[{t}_{sl,0}^{\left(n\right)},{t}_{sl,1}^{\left(n\right)}\right]$ . As in the work of [7], we use a generalized expression of $\beta \left({l}_{-n},{l}_{n}^{\left(s\right)+},{l}_{n}^{\left(s\right)-};{t}_{sl,0}^{\left(n\right)},{t}_{sl,1}^{\left(n\right)}\right)$ that includes a discount factor $\omega $ ( $0<\omega <1$ ). This choice is justified by the following facts:

1) The game is replayed by player 1 every ${T}_{s}$ s until the ${n}_{sl}^{th}$ SL is activated or the maximum activation time limit is reached;

2) For each shiftable load, the activation interval spans ${N}_{l}^{\left(n\right)}$ slots, which means that the activation time interval for the considered load is ${T}_{sl}^{\left(n\right)}={N}_{l}^{\left(n\right)}{T}_{s}$ ;

3) The cost density function $\beta \left({l}_{-n},{l}_{n}^{\left(s\right)+},{l}_{n}^{\left(s\right)-};{t}_{sl,0}^{\left(n\right)},{t}_{sl,1}^{\left(n\right)}\right)$ of Equation (19) can be formulated as the sum of ${N}_{l}^{\left(n\right)}$ expressions, each related to a different time slot. To each time slot we assign a weight factor that decreases exponentially with the slot index [11].

The new expression of the expected cost density taking into account the above considerations can be expressed as follows:

$\tilde{\beta}\left({l}_{-n},{l}_{n}^{\left(s\right)+},{l}_{n}^{\left(s\right)-};{t}_{sl,0}^{\left(n\right)},{t}_{sl,1}^{\left(n\right)}\right)=\frac{1-\omega}{1-{\omega}^{{N}_{l}^{\left(n\right)}}}{\displaystyle \underset{z=0}{\overset{{N}_{l}^{\left(n\right)}-1}{\sum}}}{\omega}^{z}\cdot {\beta}_{z}\left({l}_{-n},{l}_{n}^{\left(s\right)+},{l}_{n}^{\left(s\right)-};{t}_{sl,0}^{\left(n\right)}\right)$ (20)

where:

${\beta}_{z}\left({l}_{-n},{l}_{n}^{\left(s\right)+},{l}_{n}^{\left(s\right)-};{t}_{sl,0}^{\left(n\right)}\right)\triangleq \beta \left({l}_{-n},{l}_{n}^{\left(s\right)+},{l}_{n}^{\left(s\right)-};{t}_{sl,0}^{\left(n\right)}+z{T}_{s},{t}_{sl,0}^{\left(n\right)}+\left(z+1\right){T}_{s}\right)$ . (21)
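The normalization factor in Equation (20) makes the slot weights sum to one, since $\sum_{z=0}^{N-1}{\omega}^{z}=\left(1-{\omega}^{N}\right)/\left(1-\omega \right)$ . A small Python sketch of the weighting follows; the function names and the per-slot values are assumptions chosen for illustration.

```python
from typing import List, Sequence


def discount_weights(omega: float, n_slots: int) -> List[float]:
    """Normalized, exponentially decreasing slot weights of Eq. (20)."""
    if not 0.0 < omega < 1.0:
        raise ValueError("omega must lie strictly between 0 and 1")
    if n_slots < 1:
        raise ValueError("at least one slot is required")
    norm = (1.0 - omega) / (1.0 - omega ** n_slots)
    return [norm * omega ** z for z in range(n_slots)]


def discounted_density(beta_slots: Sequence[float], omega: float) -> float:
    """Eq. (20): weighted sum of the per-slot densities beta_z of Eq. (21)."""
    weights = discount_weights(omega, len(beta_slots))
    return sum(w * b for w, b in zip(weights, beta_slots))


# Assumed example: three slots, omega = 0.5 gives weights 4/7, 2/7 and 1/7.
w = discount_weights(0.5, 3)
beta_tilde = discounted_density([7.0, 0.0, 0.0], 0.5)
```

Earlier slots dominate the weighted sum, so the near-term cost density counts more than the cost density towards the end of the activation interval.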

In our considered game, player 1 attempts to maximize his own expected payoff $E{P}_{n}$ . For that reason, the optimal pure strategy can be formulated as follows:

${\hat{t}}_{sl,0}^{\left(n\right)}=\underset{{\tilde{t}}_{sl,0}^{\left(n\right)}\in {S}_{0}^{n}}{\mathrm{arg}\,\mathrm{max}}E{P}_{n}\left({l}_{n}^{\left(s\right)+},{l}_{n}^{\left(s\right)-};{\tilde{t}}_{sl,0}^{\left(n\right)},{\tilde{t}}_{sl,0}^{\left(n\right)}+{T}_{sl}^{\left(n\right)}\right)$ (22)

where ${S}_{0}^{n}=\left\{{t}_{p}|{t}_{p}={t}_{sl,0}^{\left(n\right)}+p{T}_{s};\text{\hspace{0.17em}}p=0,1,2,3,\cdots \right\}$ represents all possible instants at which loads can be activated. Note that the expected payoff $E{P}_{n}\left({l}_{n}^{\left(s\right)+},{l}_{n}^{\left(s\right)-};{t}_{sl,0}^{\left(n\right)},{t}_{sl,1}^{\left(n\right)}\right)$ depends mainly on the power consumption of the other players, which makes it difficult to derive the equilibrium point for the optimal strategy (22).

In our work, we did not adopt the strategy given by Equation (22) for the following reasons:

1) As stated before, the MSM estimates and periodically broadcasts an update of two probability density functions: one related to the overall power in the MG, ${f}_{{l}_{-n}^{\left(j\right)}}\left(y;\tau \right)$ , and one related to the battery state (charging or discharging) and power level, ${f}_{{p}_{b}}\left(z;\tau \right)$ . In the same way, the probability density function ${f}_{{l}_{n}^{\left(r\right)}}\left(x;\tau \right)$ related to the power consumption of a prosumer is estimated by the n^{th} SM at least on a daily basis. As a result, the expected payoff appearing on the right-hand side of Equation (22) may take different values when computed at different instants of time and needs to be recalculated whenever an update of these pdfs is broadcast.

2) When multiple SLs are simultaneously activated by the n^{th} prosumer, they need to be properly and efficiently scheduled.

These remarks, together with previous work on load management, have led us to develop a mixed strategy, which is described in the following section.

3.6. Mixed Strategy for the Game

In the proposed game, player 1 replays the game at instants ${t}_{p}={t}_{sl,0}^{\left(n\right)}+p{T}_{s}$ , with $p=0,1,\cdots ,{K}_{n}-1$ , until he chooses to turn the load ON or the maximum number of activation trials ( ${K}_{n}$ ) is reached. The action taken in the p^{th} attempt is drawn at random, with probabilities ${P}_{on}^{\left(n\right)}\left[p\right]$ and $1-{P}_{on}^{\left(n\right)}\left[p\right]$ for the ON and OFF actions respectively; that is, ${P}_{on}^{\left(n\right)}\left[p\right]$ is the activation probability for the n^{th} prosumer in the p^{th} attempt. We need to highlight that:

• Given the vector of activation probabilities ${P}_{on}^{\left(n\right)}\left[p\right]$ , where $p=0,1,\cdots ,{K}_{n}-1$ , the probability ${P}_{s}^{\left(n\right)}$ that the ON action is chosen within ${K}_{n}$ trials is expressed as follows:

${P}_{s}^{\left(n\right)}={P}_{on}^{\left(n\right)}\left[0\right]+{\displaystyle \underset{l=1}{\overset{{K}_{n}-1}{\sum}}}{P}_{on}^{\left(n\right)}\left[l\right]{\displaystyle \underset{k=0}{\overset{l-1}{\prod}}}\text{\hspace{0.05em}}\left(1-{P}_{on}^{\left(n\right)}\left[k\right]\right)$ . (23)

• If the activation probability ${P}_{on}^{\left(n\right)}\left[p\right]$ remains constant over the ${K}_{n}$ trials (i.e. ${P}_{on}^{\left(n\right)}\left[p\right]={P}_{on}^{\left(n\right)}$ for every $p=0,1,\cdots ,{K}_{n}-1$ ), the probability of success in Equation (23) can be written as follows after factoring:

${P}_{s}^{\left(n\right)}=1-{\left(1-{P}_{on}^{\left(n\right)}\right)}^{{K}_{n}}$ . (24)

Solving Equation (24) for the activation probability gives:

$P_{on}^{(n)} = 1 - \left(1 - P_s^{(n)}\right)^{1/K_n} \cong \frac{P_s^{(n)}}{K_n}$ , (25)

which is the activation probability to be selected at each trial in order to obtain a probability of success equal to $P_s^{(n)}$ ; the approximation holds when $P_s^{(n)}$ is small.
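Equations (23)-(25) can be checked numerically with a short sketch. The values of `K_n` and `P_s` below are hypothetical and are not taken from the simulations; the function names are ours.

```python
from math import prod, isclose

def success_probability(p_on):
    """Eq. (23): probability that ON is chosen within len(p_on) trials."""
    total = 0.0
    for l, p in enumerate(p_on):
        # ON is chosen at trial l only if every earlier trial ended with OFF
        total += p * prod(1.0 - q for q in p_on[:l])
    return total

def constant_activation_probability(p_s, k_n):
    """Exact expression in Eq. (25) for a constant per-trial probability."""
    return 1.0 - (1.0 - p_s) ** (1.0 / k_n)

K_n = 8      # hypothetical maximum number of trials
P_s = 0.2    # hypothetical target probability of success

p_exact = constant_activation_probability(P_s, K_n)
p_approx = P_s / K_n                       # first-order approximation in Eq. (25)

ps_exact = success_probability([p_exact] * K_n)    # recovers P_s, as in Eq. (24)
ps_approx = success_probability([p_approx] * K_n)  # slightly below P_s

print(p_exact, p_approx)
print(ps_exact, ps_approx)
```

Since $P_s^{(n)}/K_n$ never exceeds the exact value in Equation (25), using the approximation yields a probability of success somewhat lower than the target; the gap shrinks as $P_s^{(n)}$ decreases.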

The objective of our mixed strategy is to adjust the activation probabilities $P_{on}^{(n)}[p]$ , $p = 0, 1, \cdots, K_n - 1$ , so as to minimize, on average over the set of prosumers, the reduction in the expected utility $EP_n\left(l_n^{(s)+}, l_n^{(s)-}; t_{sl,0}^{(n)}, t_{sl,1}^{(n)}\right)$ evaluated on the basis of the expected cost density (19). To derive this strategy, we first define the daily average (where $t_b$ represents the beginning of the considered day and $T_D = 86400\,\text{s}$ its duration):

$\bar{\beta}\left(l_{-n}, l_n^{(s)+}, l_n^{(s)-}\right) \triangleq \frac{1}{T_D} \int_{\tau = t_b}^{t_b + T_D} \tilde{\beta}\left(l_{-n}, l_n^{(s)+}, l_n^{(s)-}; \tau, \tau + T_{sl}^{(n)}\right) \text{d}\tau$ (26)

of the expected cost density $\tilde{\beta}\left(l_{-n}, l_n^{(s)+}, l_n^{(s)-}; t_{sl,0}^{(n)}, t_{sl,1}^{(n)}\right)$ (19), and the function $\phi\left(l_{-n}, l_n^{(s)+}, l_n^{(s)-}; t_{sl,0}^{(n)}, t_{sl,1}^{(n)}\right)$ , which represents, for a given $l_{-n}$ , the deviation of the expected cost density from its daily average and can be expressed as follows:

$\phi\left(l_{-n}, l_n^{(s)+}, l_n^{(s)-}; t_{sl,0}^{(n)}, t_{sl,1}^{(n)}\right) \triangleq \tilde{\beta}\left(l_{-n}, l_n^{(s)+}, l_n^{(s)-}; t_{sl,0}^{(n)}, t_{sl,1}^{(n)}\right) - \bar{\beta}\left(l_{-n}, l_n^{(s)+}, l_n^{(s)-}\right)$ . (27)

Then, the deviation $\Delta E{P}_{n}\left({l}_{n}^{\left(s\right)+}\text{\hspace{0.05em}},{l}_{n}^{\left(s\right)-}\text{\hspace{0.05em}};{t}_{sl,0}^{\left(n\right)},{t}_{sl,1}^{\left(n\right)}\right)$ of the expected payoff $E{P}_{n}\left({l}_{n}^{\left(s\right)+}\text{\hspace{0.05em}},{l}_{n}^{\left(s\right)-}\text{\hspace{0.05em}};{t}_{sl,0}^{\left(n\right)},{t}_{sl,1}^{\left(n\right)}\right)$ from its daily average in the considered time interval $\left({t}_{sl\mathrm{,0}}^{\left(n\right)}\text{\hspace{0.05em}}\mathrm{,}{t}_{sl\mathrm{,1}}^{\left(n\right)}\right)$ is given by the following equation:

$\begin{array}{l}\Delta EP_n\left(l_n^{(s)+}, l_n^{(s)-}; t_{sl,0}^{(n)}, t_{sl,1}^{(n)}\right) \\ \triangleq EP_n\left(l_n^{(s)+}, l_n^{(s)-}; t_{sl,0}^{(n)}, t_{sl,1}^{(n)}\right) - \overline{EP_n}\left(l_n^{(s)+}, l_n^{(s)-}\right) \\ = \int_{l_{-n} = P_{g,\max}^{(-n)}}^{P_{pu}^{(abs)} - L_{a,\max}^{(n)} - P_{ch,\max}^{(b)}} \phi\left(l_{-n}, l_n^{(s)+}, l_n^{(s)-}; t_{sl,0}^{(n)}, t_{sl,1}^{(n)}\right) \text{d}l_{-n}\end{array}$ (28)

where

$\overline{EP_n}\left(l_n^{(s)+}, l_n^{(s)-}\right) = \int_{l_{-n} = P_{g,\max}^{(-n)}}^{P_{pu}^{(abs)} - L_{a,\max}^{(n)} - P_{ch,\max}^{(b)}} \bar{\beta}\left(l_{-n}, l_n^{(s)+}, l_n^{(s)-}\right) \text{d}l_{-n}$ . (29)

The integration domain of the integral appearing in the right-hand side of Equation (28), $\Lambda^{(n)} = \left[P_{g,\max}^{(-n)}, P_{pu}^{(abs)} - L_{a,\max}^{(n)} - P_{ch,\max}^{(b)}\right]$ , can be divided into two parts, $\Sigma^{(n)}$ and its complement $\overline{\Sigma^{(n)}}$ , where $\Sigma^{(n)}$ is given by:

$\Sigma^{(n)} \triangleq \left\{l_{-n} \mid l_{-n} \in \Lambda^{(n)}, \phi\left(l_{-n}, l_n^{(s)+}, l_n^{(s)-}; t_{sl,0}^{(n)}, t_{sl,1}^{(n)}\right) < 0\right\}$ . (30)

The expected payoff’s deviation from its average in Equation (28) can now be rewritten as follows:

$\begin{array}{l}\Delta EP_n\left(l_n^{(s)+}, l_n^{(s)-}; t_{sl,0}^{(n)}, t_{sl,1}^{(n)}\right) \\ = \int_{\overline{\Sigma^{(n)}}} \phi\left(l_{-n}, l_n^{(s)+}, l_n^{(s)-}; t_{sl,0}^{(n)}, t_{sl,1}^{(n)}\right) \text{d}l_{-n} - \int_{\Sigma^{(n)}} \left|\phi\left(l_{-n}, l_n^{(s)+}, l_n^{(s)-}; t_{sl,0}^{(n)}, t_{sl,1}^{(n)}\right)\right| \text{d}l_{-n}.\end{array}$ (31)

We need to specify that the first term in the right-hand side of Equation (31) is a positive expression describing a reward in monetary units, whereas the second one, $-\int_{\Sigma^{(n)}} \left|\phi\left(l_{-n}, l_n^{(s)+}, l_n^{(s)-}; t_{sl,0}^{(n)}, t_{sl,1}^{(n)}\right)\right| \text{d}l_{-n}$ , is a negative term representing a cost or loss of monetary units.
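The decomposition in Equations (26)-(31) can be illustrated numerically. The sketch below is only indicative: the cost density `beta_tilde`, the integration limits and the activation instant are hypothetical placeholders (they are not the expressions of Equation (19) nor the bounds of $\Lambda^{(n)}$), and the dependence on $l_n^{(s)+}$ , $l_n^{(s)-}$ and on the end of the activation interval is dropped for brevity.

```python
import math

T_D = 86400.0  # day duration in seconds, as in Eq. (26)

def beta_tilde(l_other, tau):
    """Hypothetical expected cost density; NOT the expression of Eq. (19)."""
    target = 200.0 + 100.0 * math.sin(2.0 * math.pi * tau / T_D)
    return 1e-4 * (l_other - target) ** 2

def midpoint_integral(f, a, b, steps):
    """Plain midpoint rule, sufficient for a smooth toy integrand."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))

def beta_bar(l_other, t_b=0.0):
    """Eq. (26): daily average of the cost density for a fixed l_other."""
    return midpoint_integral(lambda tau: beta_tilde(l_other, tau), t_b, t_b + T_D, 500) / T_D

def phi(l_other, t0):
    """Eq. (27): deviation of the cost density from its daily average."""
    return beta_tilde(l_other, t0) - beta_bar(l_other)

def split_deviation(t0, l_min, l_max, steps=400):
    """Eq. (31): integrate phi separately where phi >= 0 and where phi < 0."""
    h = (l_max - l_min) / steps
    reward, loss = 0.0, 0.0
    for i in range(steps):
        value = phi(l_min + (i + 0.5) * h, t0)
        if value < 0.0:
            loss += abs(value) * h    # contribution of Sigma
        else:
            reward += value * h       # contribution of the complement of Sigma
    return reward, loss

reward, loss = split_deviation(t0=T_D / 4.0, l_min=0.0, l_max=400.0)
delta_ep = reward - loss              # right-hand side of Eq. (31)
print(reward, loss, delta_ep)
```

With this toy density, the sign of `phi` changes inside the integration interval, so both terms of Equation (31) are non-zero and their difference coincides with the single integral of Equation (28).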

The equilibrium point for our game model can be defined as a reference power level $\bar{P}_r$ for the overall power flow $l_T(t)$ (1) and, consequently, for $l_{-n}(t)$ (7), under the assumption $l_{-n}(t) \cong l_T(t)$ . We then partition the integration domain $\Sigma^{(n)}$ (30) into two different sets given by:

$\Sigma_+^{(n)} = \left\{l_{-n} \mid l_{-n} \in \Sigma^{(n)}, l_{-n} > \bar{P}_r\right\}$ (32)

and

$\Sigma_-^{(n)} = \left\{l_{-n} \mid l_{-n} \in \Sigma^{(n)}, l_{-n} < \bar{P}_r\right\}$ (33)

and the error signal is defined by:

$\begin{array}{c} e_n[p] \triangleq \int_{l_{-n} \in \Sigma_+^{(n)}} \left|\phi\left(l_{-n}, l_n^{(s)+}, l_n^{(s)-}; t_{sl,0}^{(n)}, t_{sl,1}^{(n)}\right)\right| \text{d}l_{-n} \\ - \int_{l_{-n} \in \Sigma_-^{(n)}} \left|\phi\left(l_{-n}, l_n^{(s)+}, l_n^{(s)-}; t_{sl,0}^{(n)}, t_{sl,1}^{(n)}\right)\right| \text{d}l_{-n} \end{array}$ (34)

It is important to note that the two integrals appearing in the right-hand side of Equation (34) represent the areas of specific regions under the function $\phi\left(l_{-n}, l_n^{(s)+}, l_n^{(s)-}; t_{sl,0}^{(n)}, t_{sl,1}^{(n)}\right)$ (27), as illustrated in Figure 2, which refers to a scenario of the MG described in Numerical Results (DSM and no battery, slot = 25), where a positive value of $e_n[p]$ is obtained since the area associated with the domain $\Sigma_-^{(n)}$ (33) (blue region) is bigger than that associated with $\Sigma_+^{(n)}$ (32) (red region).

3.7. Discussion on the Error Signal

From Equation (34), it is important to highlight that if the error signal $e_n[p]$ is positive, i.e. the first term in the right-hand side of Equation (34) is greater than the second, as happens when $p_{-n}$ (the power for player 2) or the overall power flow is lower than the threshold, then player 1 should be encouraged to increase the activation probability $P_{on}^{(n)}[p]$ of his SLs, and should be discouraged when $e_n[p]$ is negative. To achieve this, we need a strategy that adapts $P_{on}^{(n)}[p]$ (with $p = 0, 1, 2, \cdots, K_n - 1$ ) on the basis of the signal $e_n[p]$ ; the resulting probability should increase monotonically with this signal. We adopted, for its simplicity, the following formula:

$P_{on}^{(n)}[p] = \bar{P}_n + \gamma_n \tilde{e}_n[p]$ (35)

Figure 2. Representation of the function $\phi\left(l_{-n}, l_n^{(s)+}, l_n^{(s)-}; t_{sl,0}^{(n)}, t_{sl,1}^{(n)}\right)$ versus $p_{-n}$ for the recharge of a PHEV in a specific time interval.

Figure 3 shows the variation of the activation probability as a function of the error signal $e_n[p]$ (34). We notice that this probability drops to zero when the error signal is sufficiently negative, which corresponds to an MG power consumption higher than the reference power $\bar{P}_r$ .

The parameters appearing in the right-hand side of Equation (35) are defined as follows: $\bar{P}_n$ represents a reference probability level, $\gamma_n$ is a real positive parameter and $\tilde{e}_n[p]$ is defined as:

$\tilde{e}_n[p] \triangleq \Phi_n\left(e_n[p]\right)$ (36)

where:

$\Phi_n(e) \triangleq \left\{\begin{array}{ll} -\bar{P}_n \gamma_n^{-1} & \text{for } e < -\bar{P}_n \gamma_n^{-1} \\ e & \text{for } -\bar{P}_n \gamma_n^{-1} \le e \le \left(1 - \bar{P}_n\right) \gamma_n^{-1} \\ \left(1 - \bar{P}_n\right) \gamma_n^{-1} & \text{for } e > \left(1 - \bar{P}_n\right) \gamma_n^{-1} \end{array}\right.$ (37)

Equation (37) represents a prosumer-dependent clipping function, which limits $P_{on}^{(n)}[p]$ , evaluated using Equation (35), to the range [0, 1]. The computation of Equation (35) requires the knowledge of the parameters $\bar{P}_n$ and $\gamma_n$ . In our work, the value suggested by (25) has been selected for $\bar{P}_n$ :

$\bar{P}_n = \frac{P_s^{(n)}}{K_n}$ (38)

which means that the value assigned to $\bar{P}_n$ is the value that each element of the sequence $P_{on}^{(n)}[p]$ should take on if all the activation trials made by the n^{th} prosumer were equally likely.

Figure 3. Variation of the activation probability $P_{on}^{(n)}[p]$ (35) with respect to the error signal $e_n[p]$ (34).

On the other hand, the evaluation of $\gamma_n$ follows an optimization approach based on the following considerations. After substituting (36) into (35) and (35) into (23), we get the following expression:

$\begin{array}{l} P_s^{(n)} = f_s\left(\bar{P}_n, \gamma_n\right) = \bar{P}_n + \gamma_n \Phi_n\left(e_n[0]\right) \\ \quad + \sum_{l=1}^{K_n - 1} \left[\bar{P}_n + \gamma_n \Phi_n\left(e_n[l]\right)\right] \cdot \prod_{k=0}^{l-1} \left(1 - \bar{P}_n - \gamma_n \Phi_n\left(e_n[k]\right)\right) \end{array}$ (39)

We can notice from Equation (39) that the probability of success $P_s^{(n)}$ depends nonlinearly on the parameter $\gamma_n$ . For a given value of $P_s^{(n)}$ and a specific sequence $\left\{e_n[l]\right\}$ , the condition $f_s\left(P_s^{(n)} K_n^{-1}, \gamma_n\right) = P_s^{(n)}$ must be satisfied. After selecting $\bar{P}_n$ according to Equation (38), the value of $\gamma_n$ satisfying this condition can be found graphically or evaluated by means of a simple direct search method, and is then used to set the weight of $e_n[p]$ in the evaluation of $P_{on}^{(n)}[p]$ on the basis of Equation (35).
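A minimal sketch of Equations (35)-(39) is given below. It assumes that the direct search is a plain grid search over $\gamma_n$ and that the condition to be met is that $f_s$ , evaluated with $\bar{P}_n = P_s^{(n)}/K_n$ , reproduces the target $P_s^{(n)}$ ; the error sequence, the target probability and the search range are hypothetical.

```python
def clipped_error(e, p_bar, gamma):
    """Eq. (37): clip e so that p_bar + gamma * e stays within [0, 1]."""
    lower = -p_bar / gamma
    upper = (1.0 - p_bar) / gamma
    return min(max(e, lower), upper)

def activation_probability(e, p_bar, gamma):
    """Eqs. (35)-(36): activation probability for a given error signal."""
    return p_bar + gamma * clipped_error(e, p_bar, gamma)

def f_s(p_bar, gamma, errors):
    """Eq. (39): probability of success over len(errors) trials."""
    total, all_off = 0.0, 1.0
    for e in errors:
        p_on = activation_probability(e, p_bar, gamma)
        total += p_on * all_off      # ON at this trial, OFF at all earlier ones
        all_off *= 1.0 - p_on
    return total

def search_gamma(p_s, errors, gamma_max=1.0, steps=10000):
    """Grid search (our assumption) for gamma such that f_s(p_s / K_n, gamma) is closest to p_s."""
    p_bar = p_s / len(errors)        # Eq. (38)
    best_gamma, best_gap = None, float("inf")
    for i in range(1, steps + 1):
        gamma = gamma_max * i / steps
        gap = abs(f_s(p_bar, gamma, errors) - p_s)
        if gap < best_gap:
            best_gamma, best_gap = gamma, gap
    return best_gamma, best_gap

errors = [0.5, 1.0, -0.2, 0.8]       # hypothetical sequence {e_n[l]}, K_n = 4
gamma_n, gap = search_gamma(0.4, errors)
print(gamma_n, gap)
```

For an error sequence that is mostly positive, as in this example, $f_s$ increases with $\gamma_n$ , so a single crossing with the target exists; for other sequences the search may only return the closest value, which is why the residual `gap` is reported together with `gamma_n`.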

4. Steps of the MG Bayesian Game Strategy

The implementation of our strategy follows several steps, which are summarized in Figure 4.

When using the proposed strategy, the following considerations should be taken into account:

1) In the MG, the SLs owned by prosumers can be identified according to their type, where a type is characterized by two values (absorbed power, absorption duration). When the n^{th} prosumer wants to activate $N_{sl}^{(n)}$ SLs of different types in the same time interval, his SM evaluates distinct probabilities for them. Turning on a specific SL changes the absorbed power $l_n^{(s)}(t)$ ; hence, the probabilities of the other SLs waiting for activation must be recalculated.

Figure 4. Steps for executing the Bayesian DSM strategy.

2) Each prosumer repeats the game every $T_s$ until he opts for the ON action or the maximum number of trials is reached. The slot duration $T_s$ is proportional to the overall power. Thus, if the OFF action has been selected in a certain repetition of the game, the next attempt should occur when the overall power $l_T(t)$ has undergone a significant change.

3) The energy available in the MG is shared between the three players on the basis of a probabilistic mechanism. At some instants, the power scheduled for the SLs may not be sufficient. In that case, the MSM is supposed to broadcast a disconnection message to a specific portion (if not all) of the active SLs to avoid overload risks.

4) The reference power $\bar{P}_r$ , which is selected by the MSM and broadcast to all prosumers, plays an important role because a change in its value modifies the equilibrium of the MG. In practice, the value of this threshold is fixed on the basis of the expected consumption over the whole day.
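The repetition of the game by a single prosumer can be sketched as follows. This is a simplified illustration under our own assumptions: the activation vector is given in advance (in the strategy it would be recomputed from the error signal at each trial), the slot duration is fixed, and the function names are hypothetical.

```python
import random

def play_activation_game(p_on, rng):
    """Replay the ON/OFF game once per slot.

    Returns the index of the trial in which ON is drawn,
    or None if the maximum number of trials is reached.
    """
    for trial, probability in enumerate(p_on):
        if rng.random() < probability:
            return trial
    return None

def activation_instant(trial, t_sl0, t_s):
    """Instant t_p = t_sl0 + p * T_s at which the load is switched ON."""
    return None if trial is None else t_sl0 + trial * t_s

rng = random.Random(0)                 # fixed seed for reproducibility
p_on = [0.05, 0.10, 0.20, 0.40]        # hypothetical activation vector, K_n = 4
trials = [play_activation_game(p_on, rng) for _ in range(20000)]
activated = sum(t is not None for t in trials) / len(trials)
print(activated)
```

The empirical fraction of activated loads approaches the probability of success of Equation (23) evaluated for the same activation vector.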

5. Numerical Results

This section describes the results obtained by simulating the DSM strategy for a MG comprising $N=100$ residential prosumers sharing a battery capable of supplying 300 kW for 1 hour. An analysis of the results, evaluating the effectiveness of the Bayesian DSM strategy, is also presented.

5.1. Assumptions for the Simulation

The following assumptions have been made in all our simulations:

1) Each prosumer in the MG has a contract that limits the maximum absorbed power to $P_{a,\max}^{(n)} = 6\,\text{kW}$ , of which 3.6 kW are reserved for charging his electric vehicle (PHEV). The PHEV is considered as the only SL of each prosumer. Furthermore, the prosumer’s PV panels are able to generate up to $-P_{g,\max}^{(n)} = 3\,\text{kW}$ . In order to account for the daily fluctuations in the power generated by the PV panels, we have superposed a zero-mean Gaussian random process on the average power generated by those renewable sources.

2) We have considered 15 appliances for each prosumer, each characterized by a probability mass function (pmf). For a specific appliance, the pmf represents its activation probability over a single day and depends on the prosumer’s behavior. An example of the daily power consumption of a random prosumer is shown in Figure 5 with a 1-hour step size. The pdf $f_{l_n^{(r)}}(x; \tau)$ has been approximated with a Gaussian model, since it gives an accurate representation of this pdf. The parameters (mean and variance) of that pdf are perfectly known by the n^{th} prosumer’s SM.

3) The Gaussian approximation has also been used for the pdf $f_{l_T^{(i)}}(x; \tau)$ related to the overall power of the MG in the absence of the DSM. As previously mentioned, the approximation $f_{l_{-n}^{(n)}}(x; \tau) \cong f_{l_T^{(i)}}(x; \tau)$ has been used, and its parameters (mean value and variance) are perfectly known by all prosumers.

4) The parameters presented in Table 1 have been used in all simulations of the proposed DSM strategy.

5) The MG load demand has been observed for three days, corresponding to 72 hours, i.e. 288 slots with a slot duration of 15 minutes. The activation requests of the SLs (only the PHEVs, for simplicity) have been concentrated in the second day, after a steady-state condition has been reached, in order to evaluate the effectiveness of our DSM strategy under different load demands.

Figure 5. Representation of a specific realization for the daily power consumption associated with 15 home appliances (unshiftable loads) owned by a MG prosumer.

Table 1. Values of the main parameters characterizing the considered MG and the proposed DSM strategy.

6) The energy storage unit (the battery in our case) has been sized according to the total load demand. In our simulation, the battery has been sized to supply half of the load demand (see Table 1) for one hour, without taking into account power fluctuations or voltage drops.
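Some of the assumptions above can be made concrete with a short sketch. The average PV profile (a half-sine between 06:00 and 18:00), the noise standard deviation and the clipping of the samples are hypothetical choices of ours; generated power is handled here as a positive magnitude, whereas the text uses a negative sign for it. The battery figure is reproduced under the assumption that the load demand is read as the sum of the contractual maxima, which Table 1 may define differently.

```python
import math
import random

SLOT_MINUTES = 15
HOURS_OBSERVED = 72
n_slots = HOURS_OBSERVED * 60 // SLOT_MINUTES     # 288 slots over three days

N_PROSUMERS = 100
P_A_MAX_KW = 6.0
# Assumption: half of the sum of the contractual maxima, supplied for one hour
battery_power_kw = 0.5 * N_PROSUMERS * P_A_MAX_KW

def average_pv_power(slot, p_g_max=3.0):
    """Hypothetical average PV profile in kW: half-sine by day, zero at night."""
    hour = (slot * SLOT_MINUTES / 60.0) % 24.0
    if 6.0 <= hour <= 18.0:
        return p_g_max * math.sin(math.pi * (hour - 6.0) / 12.0)
    return 0.0

def pv_realization(rng, sigma=0.2, p_g_max=3.0):
    """Average profile plus a zero-mean Gaussian fluctuation, kept within [0, p_g_max]."""
    samples = []
    for slot in range(n_slots):
        mean = average_pv_power(slot, p_g_max)
        noisy = mean + rng.gauss(0.0, sigma) if mean > 0.0 else 0.0
        samples.append(min(max(noisy, 0.0), p_g_max))
    return samples

pv = pv_realization(random.Random(42))
print(n_slots, battery_power_kw, max(pv))
```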

5.2. Overall Powers Comparison

A sample function of the overall power, absorbed by the MG from the public utility if positive or generated by the MG and injected into the grid if negative, is represented in Figures 6-8 for the considered three days. Three cases, corresponding to the operating conditions of the MG without the battery, with the battery charging and with the battery discharging, have been considered. In all three cases, the performance of the proposed DSM strategy has been analyzed for SLs (PHEVs).

In particular, Figure 6 illustrates the overall power ${l}_{T}\left(t\right)$ (blue curve), absorbed (if positive) by the MG prosumers from the public utility or injected (if negative) by the MG into the utility itself, and the overall power ${l}_{\text{PHEV}}\left(t\right)$ (red curve) absorbed by the PHEVs, in the absence (upper picture) and in the presence (lower picture) of the proposed DSM strategy.

Similarly, in Figure 7, the simulation is carried out on the MG with the battery charging from the total power available in the MG. The overall power ${l}_{T}\left(t\right)$ in the MG (blue curve) is again compared to the overall power absorbed by the PHEVs (red curve) in the absence (upper figure) and in the presence (lower figure) of the developed DSM strategy.

Finally, the micro grid was simulated taking into account the discharge of the battery. Figure 8 illustrates the operation of the MG in the absence of the DSM (upper figure) and in the presence of the DSM (lower figure). The overall power consumed or delivered by the micro grid (blue curve) and the overall power absorbed by the PHEVs are shown.

Figure 6. MG overall power ${l}_{T}\left(t\right)$ and overall power ${l}_{\text{PHEV}}\left(t\right)$ absorbed by PHEVs for three days without battery.

Figure 7. MG overall power ${l}_{T}\left(t\right)$ and overall power ${l}_{\text{PHEV}}\left(t\right)$ absorbed by PHEVs for three days with the battery charging.

These results show that the scheduling of PHEVs may substantially lower the peaks in the load demand due to SLs.

Figure 8. MG overall power ${l}_{T}\left(t\right)$ and overall power ${l}_{\text{PHEV}}\left(t\right)$ absorbed by PHEVs for three days with the battery discharging.

Figure 9. Pdf ${f}_{\text{PAR}}\left(x\right)$ of the percentage improvement in the MG PAR.

As we can see, the operation of the MG in the presence of the battery shows a significant reduction of the peaks in the load demand, thanks to the power compensation provided by the battery. This conclusion is also supported by Figure 9, which shows the pdf ${f}_{\text{PAR}}\left(x\right)$ of the percentage improvement in the MG PAR due to the DSM and the battery compared to the case where neither is used. In fact, these results show that the combined use of DSM and a shared battery in the MG yields an average improvement of between 40% and 42% in the MG PAR. Note that this improvement is substantially better than that provided by the Bayesian game-theoretic strategy developed in [7] , where a 34% improvement of the PAR has been reached.
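For reference, the percentage improvement in the PAR can be computed as sketched below. The two power profiles are hypothetical and serve only to show the calculation; the PAR is taken as the ratio between the peak and the average of the sampled power, and the improvement as the relative reduction with respect to the unmanaged case, which is our assumption about how the percentage is defined.

```python
def peak_to_average_ratio(power):
    """PAR of a sampled power profile with equally spaced samples and positive average."""
    average = sum(power) / len(power)
    return max(power) / average

def par_improvement_percent(power_reference, power_managed):
    """Relative PAR reduction (in percent) of the managed profile."""
    par_ref = peak_to_average_ratio(power_reference)
    par_new = peak_to_average_ratio(power_managed)
    return 100.0 * (par_ref - par_new) / par_ref

without_dsm = [100.0, 120.0, 400.0, 180.0]   # hypothetical profile in kW
with_dsm = [150.0, 200.0, 250.0, 200.0]      # same energy, flatter profile

print(peak_to_average_ratio(without_dsm))            # 2.0
print(peak_to_average_ratio(with_dsm))               # 1.25
print(par_improvement_percent(without_dsm, with_dsm))  # 37.5
```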

5.3. Evaluation of the Expected Payoff $E{P}_{n}$

The expected payoff $E{P}_{n}$ (15) evaluated for the activation of the PHEV (SL) owned by a prosumer may vary from one prosumer to another, depending on whether the DSM and the battery are present. Figures 10-12 show, for each prosumer, the gap between the expected payoff obtained with the DSM (blue squares) and that obtained without it (red squares), for the MG without the battery, with the battery charging and with the battery discharging, respectively. Note that the line that connects the two squares referring to a specific user is blue (red) if the first value is greater (smaller) than the second one.

In particular, Figure 10 shows the gap between the realizations of the mentioned expected payoffs $\left\{E{P}_{n},n=1,2,\cdots ,100\right\}$ evaluated for the activation of the PHEV owned by the n^{th} prosumer without considering the battery.

The realization of the values of the above-mentioned expected payoffs $\left\{E{P}_{n},n=1,2,\cdots ,100\right\}$ when the battery is charging is shown in Figure 11.

The last case in the evaluation of the expected payoffs $\left\{E{P}_{n},n=1,2,\cdots ,100\right\}$ related to the activation of the PHEV, simulated with the battery discharging, is exemplified in Figure 12.

Figure 10. Expected payoffs $E{P}_{n},n=1,2,\cdots ,100$ evaluated for the activation of the PHEVs in the absence of the battery.

Figure 11. Expected payoffs $E{P}_{n},n=1,2,\cdots ,100$ evaluated for the activation of the PHEVs with the battery charging.

Figure 12. Expected payoffs $E{P}_{n},n=1,2,\cdots ,100$ evaluated for the activation of the PHEVs with the battery discharging.

These results show that, in the MG community, 70, 64 and 73 prosumers benefit from the DSM strategy in terms of monetary units when the MG operates without the battery, with the battery charging and with the battery discharging, respectively. Further simulations have shown that the average expected payoff for the PHEV activation is:

1) −0.87 mu in the presence of the DSM and −1.46 mu in its absence, without the battery;

2) −0.57 mu in the presence of the DSM and −0.66 mu in its absence, with the battery charging;

3) −2.23 mu in the presence of the DSM and −4.44 mu in its absence, with the battery discharging.
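The gains implied by these averages can be worked out directly from the figures quoted above. In the sketch below, the relative gain is expressed with respect to the magnitude of the payoff obtained without the DSM, which is a presentation choice of ours.

```python
# (average payoff with DSM, average payoff without DSM), in monetary units (mu)
average_payoff = {
    "no battery": (-0.87, -1.46),
    "battery charging": (-0.57, -0.66),
    "battery discharging": (-2.23, -4.44),
}
# Number of prosumers (out of N = 100) benefiting from the DSM in each case
benefiting = {"no battery": 70, "battery charging": 64, "battery discharging": 73}
N = 100

def payoff_gain(with_dsm, without_dsm):
    """Absolute gain in mu and gain relative to the no-DSM payoff magnitude (percent)."""
    gain = with_dsm - without_dsm
    return gain, 100.0 * gain / abs(without_dsm)

gains = {case: payoff_gain(*values) for case, values in average_payoff.items()}

for case, (gain, relative) in gains.items():
    share = 100.0 * benefiting[case] / N
    print(f"{case}: gain = {gain:.2f} mu ({relative:.1f} %), {share:.0f} % of prosumers benefit")
```

The average gain is therefore 0.59 mu (about 40.4%) without the battery, 0.09 mu (about 13.6%) with the battery charging and 2.21 mu (about 49.8%) with the battery discharging.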

6. Conclusion and Future Work

In this paper, a game-theoretic DSM strategy relying on statistical information about prosumer consumption, the charging/discharging of a shared battery and the overall consumption of a MG has been developed. The proposed strategy helps to mitigate fluctuations in the load demand when applied to a MG with SLs and preserves the privacy of users. Numerical results obtained by applying the strategy to a MG with a shared battery in a multi-user scenario show a significant reduction in the MG PAR for the management of the recharge of PHEVs, considered as the shiftable loads owned by each user. Furthermore, the strategy yields a substantial improvement in the expected payoff for the activation of those SLs when the community storage is contributing to the power supply in the MG. Future work concerns the management of the community storage by autonomously scheduling the charging and discharging of the energy storage units in a MG.

References

[1] Logenthiran, T., Srinivasan, D. and Shun, T.Z. (2012) Demand Side Management in Smart Grid using Heuristic Optimization. IEEE Transactions on Smart Grid, 3, 1244-1252.

https://doi.org/10.1109/TSG.2012.2195686

[2] Gao, B., Zhang, W., Tang, Y., Hu, M., Zhu, M. and Zhan, H. (2014) Game-Theoretic Energy Management for Residential Users with Dischargeable Plug-in Electric Vehicles. Energies, 7, 7499-7518.

https://doi.org/10.3390/en7117499

[3] Aghaei, J. and Alizadeh, M.I. (2013) Critical Peak Pricing with Load Control Demand Response Program in Unit Commitment Problem. IET Generation, Transmission & Distribution, 7, 681-690.

https://doi.org/10.1049/iet-gtd.2012.0739

[4] Yu, R., Yang, W. and Rahardja, S. (2012) A Statistical Demand-Price Model with Its Application in Optimal Real-Time Price. IEEE Transactions on Smart Grid, 3, 1734-1742.

https://doi.org/10.1109/TSG.2012.2217400

[5] Lui, T.J., Stirling, W. and Marcy, H.O. (2010) Get Smart. IEEE Power and Energy Magazine, 8, 66-78.

[6] Kamgarpour, M. and Tembine, H. (2013) A Bayesian Mean Field Game Approach to Supply Demand Analysis of the Smart Grid. 1st International Black Sea Conference on Communications and Networking, Batumi, 3-5 July 2013, 211-215.

[7] Misra, S., Bera, S., Ojha, T. and Zhou, L. (2015) ENTICE: Agent-Based Energy Trading with Incomplete Information in the Smart Grid. Journal of Network and Computer Applications, 55, 202-212.

https://doi.org/10.1016/j.jnca.2015.05.008

[8] Mohsenian-Rad, A.-H., Wong, V.W.S., Jatskevich, J., Schober, R. and Leon-Garcia, A. (2010) Autonomous Demand Side Management Based on Game-Theoretic Energy Consumption Scheduling for the Future Smart Grid. IEEE Transactions on Smart Grid, 1, 320-331.

https://doi.org/10.1109/TSG.2010.2089069

[9] AlSkaif, T. (2016) Energy Sharing in Smart Grids: A Game Theory Approach [Electronic Resource].

[10] Sola, M. and Vitetta, G.M. (2016) A Bayesian Demand-Side Management Strategy for Smart Micro-Grid. Technology and Economics of Smart Grids and Sustainable Energy, 1, 8.

https://doi.org/10.1007/s40866-016-0008-z

[11] Fudenberg, D. and Levine, D.K. (1998) The Theory of Learning in Games. The MIT Press, Cambridge.