Although the internet was originally intended for non-time-critical transport, there is a growing interest in adding real-time traffic to the traditional non-time-critical bulk traffic. Real-time traffic is characterized by bounds on some performance metrics (such as delay, jitter or packet loss probability). Voice over IP (VoIP) and Internet Protocol TV (IPTV) are examples of real-time traffic. Because of these performance bounds, real-time traffic requires preferential service during transport.
The strategy for mixing real-time and bulk traffic is to use, at the nodes of the network, separate queues for different classes of traffic, so that the real-time traffic can get the service it requires. Priority queueing is the simplest mechanism that provides preferential service to some classes of traffic; in priority queueing, lower priority traffic can be serviced only when all queues of higher priority classes are empty. Such a policy works well when the traffic is not very intensive, but it can block lower priority traffic for extended periods of time if the traffic in higher priority classes becomes intensive. Therefore a number of modifications of (strict) priority queueing have been proposed to avoid such blocking and to guarantee some level of service for lower priority classes independently of the traffic in higher priority classes. Weighted priority queueing is one such modification: it assigns fractions of the bandwidth to traffic classes according to class weights.
Modern communication networks are complex structures which, for modeling, require a flexible formalism that can easily handle concurrent activities as well as synchronization of the different events and processes that occur in such networks. Petri nets are such formal models. As formal models, Petri nets are bipartite directed graphs in which the two types of vertices represent, in a very general sense, conditions and events. An event can occur only when all conditions associated with it (represented by arcs directed to the event) are satisfied. An occurrence of an event usually satisfies some other conditions, indicated by arcs directed from the event. So, an occurrence of one event can cause other events to occur, and so on.
In inhibitor Petri nets, in addition to directed arcs, inhibitor arcs provide a “test if zero” condition which does not exist in “standard” Petri nets. Inhibitor arcs are needed for modeling priority mechanisms.
In order to study performance aspects of systems modeled by Petri nets, the durations of modeled activities must also be taken into account. This can be done in different ways, resulting in different types of temporal nets. In timed Petri nets, occurrence times are associated with events, and the events occur in real time (as opposed to instantaneous occurrences in other models). For timed nets with constant or exponentially distributed occurrence times, the state graph of a net is a Markov chain (or an embedded Markov chain), in which the stationary probabilities of states can be determined by standard methods. These stationary probabilities are used for the derivation of many performance characteristics of the model.
Timed Petri nets are used in this paper to develop models of weighted priority queueing, and performance characteristics of simple queueing systems are then obtained by discrete-event simulation of the developed models.
Section 2 recalls basic concepts of Petri nets and timed Petri nets. Section 3 describes the net model of weighted priority queueing while Section 4 uses the developed model to analyze the performance of simple weighted priority queueing systems. Section 5 concludes the paper.
2. Petri Nets and Timed Petri Nets
Petri nets are formal models of systems that exhibit concurrent activities. Computer systems, communication networks, manufacturing systems and transportation systems are examples of such systems. Concurrent activities are represented in Petri nets by tokens which can move within a (static) graph-like structure of the net. More formally, a marked inhibitor place/transition Petri net is defined as a pair $\mathcal{M} = (\mathcal{N}, m_0)$, where the structure $\mathcal{N} = (P, T, A, H)$ is a bipartite directed graph, with the two types of vertices being a set of places $P$ and a set of transitions $T$, and a set of directed arcs $A$ which connect places with transitions and transitions with places, $A \subseteq P \times T \cup T \times P$, while $H$ is a set of inhibitor arcs which connect places with transitions, $H \subseteq P \times T$; usually $A \cap H = \emptyset$. Finally, $m_0$ is the initial marking function which assigns nonnegative numbers of tokens to places of the net, $m_0 : P \rightarrow \{0, 1, 2, \ldots\}$. Places which are assigned nonzero numbers of tokens by a marking function $m$ are called marked places, while places with zero tokens are called unmarked places. Marked nets can be equivalently defined as $\mathcal{M} = (P, T, A, H, m_0)$.
In Petri nets the distribution of tokens over places changes by occurrences (or firings) of transitions. A transition $t$ is enabled by a marking function $m$ if all places connected to $t$ by directed arcs are marked and all places connected to $t$ by inhibitor arcs are unmarked. When an enabled transition $t$ occurs (or fires), one token is removed from each place connected to $t$ by a directed arc and one token is deposited to each place connected to $t$ by an outgoing arc. An occurrence of a transition creates a new marking function, a new set of enabled transitions, and so on. The set of all marking functions that can be created starting from the initial marking is called the reachability set of a net. This set can be finite or infinite.
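The enabling and firing rules above, including the “test if zero” behavior of inhibitor arcs, can be sketched in a few lines of Python. This is a minimal illustration with hypothetical place and transition names, not the tool used in the paper:

```python
# Minimal sketch of inhibitor-net semantics (hypothetical names): a
# transition is enabled when every input place is marked and every place
# connected to it by an inhibitor arc is unmarked.

def enabled(t, marking, inputs, inhibitors):
    """True if transition t is enabled under the given marking."""
    return (all(marking[p] > 0 for p in inputs.get(t, ())) and
            all(marking[p] == 0 for p in inhibitors.get(t, ())))

def fire(t, marking, inputs, outputs):
    """Fire t: remove one token per input arc, add one per output arc."""
    m = dict(marking)
    for p in inputs.get(t, ()):
        m[p] -= 1
    for p in outputs.get(t, ()):
        m[p] += 1
    return m

# t1 moves a token from p1 to p2, but only while p3 is empty ("test if zero").
inputs = {"t1": ["p1"]}
outputs = {"t1": ["p2"]}
inhibitors = {"t1": ["p3"]}
m0 = {"p1": 1, "p2": 0, "p3": 0}
m1 = fire("t1", m0, inputs, outputs)        # m1 marks p2 instead of p1
blocked = {"p1": 1, "p2": 0, "p3": 1}       # marking p3 disables t1
print(enabled("t1", m0, inputs, inhibitors),
      enabled("t1", blocked, inputs, inhibitors))  # → True False
```

Repeated application of `fire` to every enabled transition generates the reachability set mentioned above.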
A place is shared if it is connected to more than one transition. A shared place p is free-choice if the sets of places connected by directed arcs and inhibitor arcs to all transitions sharing p are identical. All transitions sharing a free-choice place constitute a free-choice class of transitions. For each marking function, either all transitions in each free-choice class are enabled or none of these transitions is enabled. It is assumed that a choice of an occurring transition in each free-choice class is random and can be described by probabilities associated with transitions. A shared place which is not free-choice is a conflict place and transitions sharing it are conflicting transitions.
Temporal behavior can be introduced in Petri nets in several ways, resulting in different classes of Petri nets “with time”. In timed nets, occurrence times are associated with transitions, and transition occurrences are real-time events (as opposed to instantaneous occurrences in other models); so, tokens are removed from input places at the beginning of the occurrence period, and they are deposited to the output places at the end of this period. All occurrences of enabled transitions are initiated in the same instants of time in which the transitions become enabled (although some enabled transitions may not initiate their occurrences). If, during the occurrence period of a transition, the transition becomes enabled again, a new, independent occurrence can be initiated, which will overlap with the other occurrence(s). There is no limit on the number of simultaneous occurrences of the same transition (sometimes this is called infinite occurrence semantics). Similarly, if a transition is enabled “several times” (i.e., it remains enabled after initiating an occurrence), it may start several independent occurrences in the same time instant.
Formally, a timed Petri net is a triple $\mathcal{T} = (\mathcal{M}, c, f)$, where $\mathcal{M}$ is a marked net, $c$ is a choice function which assigns probabilities to transitions in free-choice classes and relative frequencies of occurrences to conflicting transitions, $c : T \rightarrow [0, 1]$, and $f$ is a timing function which assigns an (average) occurrence time to each transition of the net, $f : T \rightarrow \mathbf{R}^{+}$, where $\mathbf{R}^{+}$ is the set of nonnegative real numbers.
The occurrence times of transitions can be either deterministic or stochastic (i.e., described by some probability distribution function); in the first case, the corresponding timed nets are referred to as D-timed nets, and in the second, for the (negative) exponential distribution of firing times, the nets are called M-timed nets (Markovian nets). In both cases, the concepts of state and state transitions have been formally defined and used in the derivation of different performance characteristics of the model. In simulation applications, other distributions can also be used; for example, the uniform distribution (U-timed nets) is sometimes a convenient option. In timed Petri nets, different distributions can be associated with different transitions in the same model, providing the flexibility that is used in the simulation examples that follow.
In timed nets, it is convenient to allow some events to occur “immediately”, i.e., in zero time; all transitions with zero occurrence times are called immediate (while the others are called timed). Since the immediate transitions have no tangible effects on the (timed) behavior of the model, it is convenient to split the set of transitions into two parts, the set of immediate and the set of timed transitions, and to first perform all occurrences of the (enabled) immediate transitions, and then (still in the same time instant), when no more immediate transitions are enabled, to start the occurrences of the (enabled) timed transitions. It should be noted that this convention effectively introduces a priority of immediate transitions over the timed ones, so conflicts between immediate and timed transitions are not allowed in timed nets. A detailed characterization of the behavior of timed nets with immediate and timed transitions is given in the literature.
3. Weighted Priority Queueing
In priority queueing, separate queues are used for packets of different classes of traffic (different priorities). Packets for transmission (over the shared communication channel) are always selected starting from the (nonempty) queues of highest priority. Consequently, packets from lower priority queues are selected only if all higher priority queues are empty. This can block the lower priority classes of traffic for extended periods of time if the traffic is intense.
Weighted priority scheduling limits the number of consecutive packets of the same class that can be transmitted over the channel; when the scheduler reaches this limit, it switches to the next nonempty priority queue and follows the same rule. These limits are called weights, and are denoted $w_1, w_2, \ldots, w_k$. With $k$ classes of traffic, if there are sufficient numbers of packets in all classes, the scheduler selects $w_1$ packets of class 1, then $w_2$ packets of class 2, …, then $w_k$ packets of class $k$, and again $w_1$ packets of class 1, and so on. Consequently, in such a situation (i.e., for a sufficient supply of packets in all classes), the channel is shared by the packets of all priority classes, and the proportions are:
$$u_i = \frac{w_i / r_i}{\sum_{j=1}^{k} w_j / r_j}, \quad i = 1, 2, \ldots, k,$$
where $r_i$ is the transmission rate for packets of class $i$. If the transmission rates are the same for packets of all classes (as is assumed for simplicity in the illustrating examples), the proportions become:
$$u_i = \frac{w_i}{\sum_{j=1}^{k} w_j}, \quad i = 1, 2, \ldots, k.$$
For an example with 3 priority classes and weights equal to 4, 2 and 1 for classes 1, 2 and 3, respectively, these “utilization bounds” are equal to 4/7, 2/7 and 1/7.
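These bounds follow directly from the formulas above. The following Python helper (an illustrative sketch, not part of the paper's model) computes them for arbitrary weights and, optionally, unequal transmission rates:

```python
# Utilization bounds under saturation: class i's share of the channel is
# (w_i/r_i) / sum_j(w_j/r_j); equal rates reduce this to w_i / sum(w).

def utilization_bounds(weights, rates=None):
    if rates is None:
        rates = [1.0] * len(weights)     # equal rates, as in the examples
    shares = [w / r for w, r in zip(weights, rates)]
    total = sum(shares)
    return [s / total for s in shares]

bounds = utilization_bounds([4, 2, 1])
print([round(b, 3) for b in bounds])     # → [0.571, 0.286, 0.143]
```

For weights 4-2-1 this reproduces the 4/7, 2/7 and 1/7 bounds quoted above.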
A Petri net model of weighted priority scheduling for three classes of packets with weights 4, 2 and 1 is shown in Figure 1. The model is composed of three identical interconnected sections corresponding to the three priority classes.
The main elements of the model are the three queues, represented by one queue place for each of the traffic classes 1, 2 and 3, and three timed transitions modeling the transmission of selected packets through the communication channel. The three classes of packets are generated (independently) by three source transitions with their associated places. The occurrence times of the source transitions determine the arrival rates for queues 1, 2 and 3, respectively.
The scheduling is based on a repeated selection of queues in order of priorities (first class 1, then class 2, and so on) for the transmission of queued packets. This selection operation is represented by a loop of places and transitions through the three sections of the model. There is a single “control token” in this loop (shown in Figure 1). The position of this token indicates the queue that is currently used for transmission of packets; a token in the initial place of the loop indicates that no queue is selected.
Let the initial place of the loop be marked. If all three queues are empty, the next packet arriving to one of the queues enables the corresponding selection transition, the control token is moved to the place corresponding to the nonempty queue, and an occurrence of the selection transition removes a token from the queue and forwards it for transmission. At the same time, one token is moved from the weight place of that section to its counting place. When the channel becomes available for transmission (which is indicated by an occurrence of the transmission transition), the control token is returned to the selection place. Now there are three possibilities:
- if the queue is nonempty and the weight place is not empty, another token is selected from the queue and forwarded for transmission;
- if the queue is empty, an occurrence of the corresponding transition moves the control token to the restoring place of the section;
- if the weight place is empty, an occurrence of another transition also moves the control token to the restoring place.
A token in the restoring place moves (by repeated transition occurrences) all tokens from the counting place back to the weight place, and when the counting place becomes empty, an occurrence of another transition moves the control token to the section of the next class. If the queue of this class is empty, further transition occurrences move the control token to subsequent classes until the initial place of the loop is reached, and then the highest priority nonempty class is selected by an occurrence of one of the selection transitions.
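The discipline realized by this loop can also be sketched directly as a weighted round-robin scheduler. The Python fragment below is an illustration of the scheduling rule, not the Petri net model itself; it serves at most `weights[i]` consecutive packets from class `i`, then advances to the next class, wrapping back to the highest priority after the last one:

```python
from collections import deque

def weighted_priority_schedule(queues, weights, n_slots):
    """Serve up to n_slots packets from per-class FIFO queues under
    weighted priority: at most weights[i] consecutive packets per visit
    to class i, then move on, wrapping to the highest priority class."""
    order = []
    i, served = 0, 0
    while n_slots > 0 and any(queues):
        if queues[i] and served < weights[i]:
            order.append(queues[i].popleft())
            served += 1
            n_slots -= 1
        else:
            # queue empty or weight exhausted: advance to the next class
            i, served = (i + 1) % len(queues), 0
    return order

qs = [deque("1" * 8), deque("2" * 8), deque("3" * 8)]
print("".join(weighted_priority_schedule(qs, [4, 2, 1], 14)))
# → 11112231111223 (the 4-2-1 cycle repeats while all queues are nonempty)
```

With saturated queues the output shows exactly the 4-2-1 service pattern described in Section 3; when a queue runs empty, the remaining classes absorb its unused slots.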
Figure 1. Petri net model of weighted priority queueing with three priority classes, infinite queues and weights 4-2-1.
The (finite) capacity of the queue is represented by the initial marking of a capacity place (shown in Figure 2 as K). When a packet is generated and the queue is not full, i.e., the capacity place is marked, an occurrence of the enqueueing transition inserts the packet in the queue. If, however, the queue is full, the capacity place is unmarked, the inhibitor arc enables the dropping transition and the packet is dropped.
Finally, when a packet is selected for transmission and is removed from the queue, each occurrence of the transmission transition returns a token to the capacity place, indicating that the queue can store another packet.
4. Performance Characteristics
The model shown in Figure 1 (three classes of traffic, weights 4-2-1) is used for performance analysis of weighted priority queueing. The utilizations of the shared communication channel as functions of the traffic intensity of class 1 (the highest priority), $\lambda_1$, with constant traffic intensities for classes 2 and 3, $\lambda_2 = 0.5$ and $\lambda_3 = 0.25$, are shown in Figure 3.
For $\lambda_1 \le 0.25$, channel utilizations for classes 2 and 3 are constant at the levels of 0.5 and 0.25, respectively (all service rates are equal to 1 for simplicity, so the utilizations are equal to the traffic intensities, and the arrival rates are also equal to the traffic intensities); for class 1, the utilization changes linearly with $\lambda_1$. It should be noted that the traffic intensities $\lambda_2$ and $\lambda_3$ are significantly greater than the performance levels guaranteed by the weights 4-2-1 (equal to 2/7 and 1/7 for classes 2 and 3, respectively). For $\lambda_1 > 0.25$, the channel becomes fully utilized, so further increases of $\lambda_1$ result in decreasing utilizations of the channel for classes 2 and 3, until the levels guaranteed by the weights are reached (these levels are 2/7 or 0.286 and 1/7 or 0.143). This occurs at $\lambda_1 = 4/7$, or 0.571.
For $\lambda_1 > 0.25$, queues 2 and 3 are nonstationary because their arrival rates are greater than their departure rates. Similarly, for $\lambda_1 > 4/7$, queue 1 is nonstationary. In practical queueing systems the capacities of queues are finite, so the nonstationary regions correspond to dropping of some arriving packets because they cannot be queued.
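The breakpoints discussed above follow from simple arithmetic, which the short Python check below makes explicit (unit service rates, as assumed in the text):

```python
# The channel saturates when the total offered load reaches 1, and once it
# is full the weights cap each class at w_i / sum(w).

w = [4, 2, 1]
lam2, lam3 = 0.5, 0.25                    # constant class-2 and class-3 loads
saturation = 1.0 - lam2 - lam3            # class-1 load at which the channel fills
guaranteed = [wi / sum(w) for wi in w]    # shares enforced by the weights
print(saturation)                         # → 0.25
print([round(g, 3) for g in guaranteed])  # → [0.571, 0.286, 0.143]
```

Saturation at 0.25 and the guaranteed shares 4/7, 2/7 and 1/7 match the transitions visible in Figure 3.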
If, however, the (constant) traffic intensities $\lambda_2$ and $\lambda_3$ do not exceed the levels of traffic determined by the weights, the behavior of the queueing system is different, as shown in Figure 5.
In this case queue 1 becomes nonstationary only when $\lambda_1$ reaches the channel capacity not used by classes 2 and 3. Moreover, the waiting times for classes 2 and 3 depend rather insignificantly on the traffic of class 1, as shown in Figure 6.
Figure 2. Petri net model for class 1 of weighted priority queueing with a finite queue and weight 4.
Figure 3. Channel utilizations as functions of $\lambda_1$ with $\lambda_2 = 0.5$ and $\lambda_3 = 0.25$ for weighted priority queueing with infinite queues and weights 4-2-1.
Figure 4. Average waiting times as functions of $\lambda_1$ with $\lambda_2 = 0.5$ and $\lambda_3 = 0.25$ for weighted priority queueing with infinite queues and weights 4-2-1.
Figure 5. Channel utilizations as functions of $\lambda_1$, with constant $\lambda_2$ and $\lambda_3$ below the levels guaranteed by the weights, for weighted priority queueing with infinite queues and weights 4-2-1.
Figure 6. Average waiting times as functions of $\lambda_1$, with constant $\lambda_2$ and $\lambda_3$ below the levels guaranteed by the weights, for weighted priority queueing with infinite queues and weights 4-2-1.
When the capacity of a queue is finite, packets which arrive when the queue is full are dropped as they cannot be queued. The percentage of dropped packets is an important metric of the system. Figure 7 shows the fraction of packets which are dropped in weighted priority queueing with weights 4-2-1 and queue length equal to 5, as functions of the traffic intensity $\lambda_1$ with $\lambda_2 = 0.5$ and $\lambda_3 = 0.25$.
Figure 7 shows that the fraction of dropped packets increases for $\lambda_1 > 0.25$ and, for classes 2 and 3, reaches the level of 45% for $\lambda_1$ close to 0.6. This should not be surprising because in the same range of values of $\lambda_1$ the utilization of the shared channel decreases from 0.5 to 0.286 for class 2 and from 0.25 to 0.143 for class 3 (as shown in Figure 3). This decrease results in dropping about 45% of packets (practically the same for classes 2 and 3).
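This drop level can be checked with simple arithmetic. The sketch below (illustrative only) computes the fraction of arrivals that cannot be served once each class is squeezed from its offered load down to its guaranteed share:

```python
# Fraction of arrivals that must be dropped when a class's throughput is
# reduced from its offered load to its guaranteed share of the channel.

def drop_fraction(offered, served):
    return (offered - served) / offered

print(round(drop_fraction(0.5, 2 / 7), 3))   # class 2 → 0.429
print(round(drop_fraction(0.25, 1 / 7), 3))  # class 3 → 0.429
# About 43% for both classes, close to the ~45% observed in the simulation
# (which also reflects finite queues and stochastic arrivals).
```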
Figure 7. Fraction of dropped packets as functions of $\lambda_1$ with $\lambda_2 = 0.5$ and $\lambda_3 = 0.25$ for weighted priority queueing with queue length = 5 and weights 4-2-1.
Figure 8. Average waiting times as functions of $\lambda_1$ with $\lambda_2 = 0.5$ and $\lambda_3 = 0.25$ for weighted priority queueing with queue length = 5 and weights 4-2-1.
Figure 9. Average queue lengths as functions of $\lambda_1$ with $\lambda_2 = 0.5$ and $\lambda_3 = 0.25$ for weighted priority queueing with queue length = 5 and weights 4-2-1.
Results shown in Figure 7, Figure 8 and Figure 9 are related to each other. For weights 4-2-1 and high-intensity traffic, each scheduling cycle includes 4 packets from class 1, 2 packets from class 2 and just 1 packet from class 3. Each packet served from class 3 is thus accompanied by 6 other packets, so if the average length of queue 3 is $n$, the average waiting time for class 3 is expected to be about $7n$. For the average queue lengths shown in Figure 9, this results in an average waiting time for class 3 that is close to 30 (as shown in Figure 8). For class 2, two packets are served in each scheduling cycle, so its average waiting time is one half of that for class 3 (the average queue lengths are practically the same for classes 2 and 3, as shown in Figure 9).
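The $7n$ argument above can be written as a small calculation. In the sketch below, the queue length 4.3 is only an illustrative assumption chosen to match a class-3 wait near 30; it is not a value taken from the paper:

```python
# With weights 4-2-1 and unit service times, one scheduling cycle serves
# 4 + 2 + 1 = 7 packets, so a class with w slots per cycle and average
# queue length n waits roughly n * 7 / w service times.

def approx_wait(queue_len, weights, cls):
    cycle = sum(weights)                 # packets served per full cycle
    return queue_len * cycle / weights[cls]

print(approx_wait(4.3, [4, 2, 1], 2))  # class 3: about 30
print(approx_wait(4.3, [4, 2, 1], 1))  # class 2: about 15, half of class 3
```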
It should be observed that, from a performance point of view, it is not beneficial to have long queues for packets waiting for service. For high-intensity traffic these queues will be practically full, and then the average waiting time will simply increase proportionally with the queue length. Figure 10 and Figure 11 show the average queue lengths and the average waiting times for the case when all queue lengths are equal to 10.
Finally, Figure 12 and Figure 13 show the fraction of dropped packets and the average waiting times for the case when the traffic intensities do not exceed the levels determined by the weights, as in Figure 6.
For class 1, the increase of the fraction of dropped packets is caused by queue 1 becoming full; all arriving packets which cannot be queued are dropped.
For classes 2 and 3, the fraction of dropped packets is very small and the average waiting times are also rather small.
Figure 10. Average queue lengths as functions of $\lambda_1$ with $\lambda_2 = 0.5$ and $\lambda_3 = 0.25$ for weighted priority queueing with queue length = 10 and weights 4-2-1.
Figure 11. Average waiting times as functions of $\lambda_1$ with $\lambda_2 = 0.5$ and $\lambda_3 = 0.25$ for weighted priority queueing with queue length = 10 and weights 4-2-1.
Figure 12. Fraction of dropped packets as functions of $\lambda_1$, with constant $\lambda_2$ and $\lambda_3$ below the levels guaranteed by the weights, for weighted priority queueing with queue length = 5 and weights 4-2-1.
Figure 13. Average waiting times as functions of $\lambda_1$, with constant $\lambda_2$ and $\lambda_3$ below the levels guaranteed by the weights, for weighted priority queueing with queue length = 5 and weights 4-2-1.
5. Concluding Remarks
Efficient use of modern networks requires detailed knowledge of network characteristics, traffic statistics, transmission media types, and so on. Some of this information can be obtained by measurements performed under real traffic, but other information can only be provided by detailed models, verified by comparisons with measurement data. On the basis of these characteristics, specific methods can be developed to determine the optimal numbers of links, the transmission capacity of links, the management strategy for resources shared among traffic classes, and others.
The goal of this paper is to provide insight into the behavior of weighted priority queueing, a modification of (strict) priority queueing that eliminates the blocking of lower priority traffic that is typical for priority-based traffic management schemes. The paper shows that when the weights match the characteristics of lower priority traffic, the performance provided by the analyzed scheme is actually quite good. However, since in real communication networks the characteristics often change, a dynamic weight selection method may be needed for adjusting the performance to the changing character of the traffic. Some ideas for such dynamic weighted queueing can be found in the literature on adaptive and dynamic weighted scheduling.
Weighted priority queueing exhibits several similarities to weighted fair queueing but seems to be simpler to implement. An in-depth comparison of these queueing methods is needed for a better understanding of their relative strengths and weaknesses.
 Georges, J.-P., Divoux, T. and Rondeau, E. (2005) Strict Priority versus Weighted Fair Queueing in Switched Ethernet Networks for Time Critical Applications. 19th IEEE International Parallel and Distributed Processing Symposium, Denver, 4-8 April 2005, 141-148.
 Dekeris, B., Adomkus, T. and Budnikas, A. (2006) Analysis of QoS Assurance Using Weighted Fair Queuing (WFQ) Scheduling Discipline with Low Latency Queue (LLQ). 28th International Conference on Information Technology Interfaces, Cavtat/Dubrovnik, 19-22 June 2006, 507-512.
 Zuberek, W.M. (1991) Timed Petri Nets—Definitions, Properties and Applications. Microelectronics and Reliability (Special Issue on Petri Nets and Related Graph Models), 31, 627-644.
 Zuberek, W.M. (1986) M-Timed Petri Nets, Priorities, Preemptions, and Performance Evaluation of Systems. In: Advances in Petri Nets 1985, Springer-Verlag, Berlin, 478-498.
 Wang, H., Shen, C. and Shin, K. (2001) Adaptive Weighted Packet Scheduling for Premium Service. IEEE International Conference on Communications. Conference Record, Helsinki, 11-14 June 2001, 1846-1850.
 Panza, G., Graziolli, M. and Sidoti, F. (2005) Design and Analysis of a Dynamic Weighted Fair Queueing (WFQ) Scheduler. 15th IST Mobile and Wireless Communications Summit, Dresden, 19-23 July 2005, 134-138.
 Quadros, G., Alves, A., Monteiro, E. and Boavida, F. (2000) How Unfair Can Weighted Fair Queuing Be? Fifth IEEE Symposium on Computers and Communications (ISCC 2000), Antibes-Juan Les Pins, 3-6 July 2000, 779-784.