Many machine scheduling problems practically require setup times prior to the processing of jobs. It is common that the setup times of one job are dependent on the last job that has been previously processed on the same machine. For instance, in the package printing industry, the time taken to prepare the ink colors for a task is dependent on the colors that have been used on the last printing task. In this paper, we consider an identical parallel machine scheduling problem with jobs’ sequence-dependent setup times and release dates. The problem’s objective is to minimize the total completion time. This problem is denoted by the three fields notation  as .
Parallel machine scheduling with job setup times problem is widely studied in the literature. Readers can refer to the survey of Allahverdi  for a broader family of similar problems. For that reason, we limit our literature review to the problem with time criteria objectives. To the best of our knowledge, the first study of this problem dates back to 1993 by Guinet . Nessah et al.  proved that the problem is NP-Hard in the strong sense. They solved the problem by an efficient Branch-and-Bound algorithm with a strong dominance property. To solve the problem with the total weighted completion time minimization objective, Fowler et al.  developed a hybridized genetic algorithm with dispatching rules. To solve the problem with the makespan minimization objective, Montoya-Torres et al.  established a randomized search heuristic. Lin et al.  proposed a genetic algorithm to solve the minimization of the maximum lateness. However, those mentioned works have not introduced any mathematical formulation of the problem with the min-sum criterion.
As far as we know, Kurz and Askin  were the first to construct an integer programming for the problem. However, their formulation, which is graph-based, contains one non-linear constraint to ensure the completion times of two consecutive jobs. Anderson et al.  developed a network-based mixed-integer-linear programming (MILP) to the earliness/tardiness minimization problem. Nonetheless, the formulation included the big-M whose value is known to be very impacting on the model solving time. One can find an extensive analysis of the formulation of a similar scheduling problem on a single machine environment in the paper of Nogueira et al. . This work aimed to structure many families of formulations of the single-machine environment. The paper proposed new upper bounds for all formulations and compared their performances by thorough numerical tests. According to our knowledge, a similar analysis is still a void in the literature for identical parallel machine scheduling.
For this reason, we extend the works of Nessah et al.  and Nogueira et al.  to analyze the performances of different exact approaches/MILP formulations on the problem. The current paper is organized as follows:
● In Section 2, we introduce two MILP formulations: graph-based formulation and sequence-based formulation. We establish tight upper bounds for the completion times of jobs for each formulation.
● In Section 3, we apply the test protocol introduced by Nessah et al.  to 1) quantify how three factors of the data structure (the arrival density, the setup time relative length and the number of machines) would impact the tractability of resolutions methods and 2) compare the performance of resolution methods for each instance structure.
● In Section 4, we draw conclusions from the numerical tests and present four perspectives of future researches.
2. Mathematical Formulation
We consider a problem where one has to schedule n jobs on m identical machines. m should be strictly inferior to n unless the problem becomes trivial. Each job j has a release date when it is ready to process and a sequence-dependent setup time if it is preceded immediately by job i on the same machine. We consider three following assumptions:
● The setting up of a job can be done before the release date of this job.
● The first job to be processed in each machine does not need to be setup because it is preceded by no other job.
● All the values of the setup time must satisfy the triangle inequality for all job i, j, k. This assumption is important to avoid the insertion of a job k with and for some jobs i and j.
Sets and parameters
● : set of jobs.
● : set of machines.
● : release date of jobj.
● : processing time of job j.
● : Setup time if job j follows job i to be processed on the same machine.
2.1. A Graph-Based Formulation
We introduce a dummy job, Job 0, which has processing time equal to zero ( ) and it is available at the beginning of the scheduling horizon ( ). The setup times required all jobs preceding task 0 are also zero. To limit the number of identical machines we must have a constraint that assures at most m jobs can connect to job 0.
● : set of jobs including imaginary job 0.
● : the completion time of job j.
● : the sequence decision variable. if job j processes right after job i on the same machine (denoted by ); otherwise .
Figure 1. An example of the graph-based formulation with 10 jobs and 3 machines.
with is an upper bound of the completion time of task i.
Constraint (2) makes sure that a job j must start after its release date ( ). Constraint (3) ensures the equation to be valid only if job i precede job j, i.e. . Otherwise, because . By constraint (4), at most m jobs can follow job 0 because of m available machines. Constraint (5) ensures that each job, except job 0, can have only one precedence. Constraint (6) limits each job, except job 0, to have at most one follower. Constraint (7) forbids a job to follow itself.
We develop an upper bound for the completion time of job j for the identical machine scheduling environment, based on the work of Nogueira et al.  for the single machine environment.
Theorem 1. There is an optimal schedule such that the number of jobs scheduled on any machine is less than or equal to (Yalaoui and Chu  ).
Proposition 2. is an upper bound of job j’s completion time according to theorem 1, where:
● is the jth sorted element of in a non-decreasing order.
Proof. We consider firstly a schedule with one machine where each task has the maximum setup time for any following job: .
Let j be the last job to be scheduled on this machine. Let us assume that job k precedes job j. Hence, .
Furthermore, then .
We consider secondly a schedule with two machines, according to theorem 1, there is at least one task, denotedl, to be scheduled on the second machine. The improved job j’s completion time bound is then:
We consider a general case of m machine. By applying the same process, one can have a strict bound of the completion time of job j:
In addition, , we can rewrite the upper bound as:
2.2. Sequence-Based Formulations
The intuition behind the sequence formulation is to have a global job sequence cut into job sequences of machines. First, we use the variables to make a jobs sequence: if then job j occupies the kth position on the scheduling sequence, otherwise . Second, we cut the sequence into parallel machines by using variable . If then the job at the position k will start to process on a new machine.
We take the same example introduced previously with 10 jobs and 3 machines to illustrate the assignment of variables for the sequence-based formulation. The value assigned to the variables is shown in Table 1.
The sequencing of jobs on machines is as follow:
● Machine 1: .
● Machine 2: .
● Machine 3: .
Auxiliary variable denotes the sequencing of two jobs i and j. If then job j follows job i and takes the kth position of the scheduling sequence. The completion time of the job assigned to position k is denoted by .
Table 1. The value of variables and .
● : jobs’ position.
● : the completion time of job at position k.
● : is equal to 1 if job j occupies position k.
● : is equal to 1 if job at the position k will start to process on a new machine.
● : an auxiliary variable, is equal to 1 if job j follows job i at kth position.
where is an upper bound of the completion time of the job that takes the kth position, and .
Constraints (14) and (15) limit one job at one position and vice-versa. Constraint (16) ensures a job j to start after its release date when it is assigned to kth position. Constraint (17) triggers the variable to be greater or equal to 1 if job i is in the position and job j is in the position k, otherwise is unconstrained. Constraint (18) computes the completion time of the job taking the kth slot. Constraint (19) limits the number of machines to m. The last constraints initialize and bound the decision variables.
It is important to note that the integral constraint of variable can be relaxed, i.e. , without violating the integrality of the solution .
We introduce in the following proposition an upper bound for . This is an adaptation from the completion time upper bound for a single machine scheduling, which is introduced by Nogueira et al. , to identical parallel machines scheduling.
Proposition 3. is a valid upper
bound of the completion time of job at the kth position, , for any schedule with respect to theorem 1 where:
● is the jth sorted element of in a non-decreasing order.
Proof. Completion time of the job at the first position cannot exceed . The job in the second position cannot complete after , and so on, until we reach the job at the position. Completion time of this job is bounded the same way as the job at position. Because from the position 1 to there would be at least one job that has (by theorem 1). Since and then . In the same way, from the position 1 to there would be at least jobs that have .
Table 2 resumes the number of constraints and variables of the two formulations.
3. Numerical Tests
To conduct the numerical test, we use the same benchmark introduced by . As an attempt to parameterize the instances, we use three factors: jobs’ arrival density (JAD), jobs’ setup time relative length (SRL) and the number of machines (m).
The objective of the numerical tests is two-fold. First, we try to observe the sensibility of each solving method to the corresponding data structure. Second, we tend to examine the impacts of the data structure to the tractability of the problem.
The following subsection will mention the random generators of the instances.
3.1. Numerical Tests Protocol
The number of jobs generated is and the number of machines generated is .
Job’s release date is generated randomly according to the uniform distribution
which takes the value between . Jobs arrival density is scaled by
Table 2. The number of constraints and variables.
the value of which takes value from the set . When , jobs arrival would be densest and earliest. In the contrary, when , jobs arrival would distribute more evenly along the scheduling horizon.
The processing times of jobs are uniformly generated in . The setup time is equal to , with a randomly generated in . Table 3 resumes the parameters of the test generation.
For each data structure , we generate 10 instances, hence, each problem size has 250 instances generated. The total number of instances tested is 6000 instances.
We compare the CPLEX solving MILP formulations with an exact method: the Branch-and-Bound algorithm developed by Nessah et al. .
The MILP formulations are solved by IBM ILOG CPLEX 12.8. The Branch-and-Bound algorithm is coded in C++. The computer runs on Window 7 professional 64bits (Dell Optiplex 9020, CPU Intel Core i5-4690 3.5GHz and 8192MB RAM). We report in this section the following KPIs (Key Performance Indicators):
● Time (in second): solving time, which is limited to 3600 seconds.
● LPGap (in percentage): the LP relaxation gap. For the solution obtained by solving MILP formulations, LPGap is the gap between the best objective of the integral solution and LP relaxation one, i.e. when all the integrity constraint are removed. When Branch-and-Bound algorithm solves the instance, LPGap is the gap between the best solution objective and the best lower bound.
● MLBGap (the maximal lower bound, in percentage): the gap between the objective value found by each method to the best lower bound found by all the method. The MLBGap value can be considered as the worst possible gap to the optimal solution of each instance tested. This KPI is used to compare the quality of the actual solution found by a method to others.
● Opt: percentage of instances solved to optimality.
Figure 2 illustrates how to calculate the LPGap and MLBGap.
Table 3. Random instances generation’s parameters.
Figure 2. An example of MLBGap and LPGap.
3.2. Results Analysis
3.2.1. Computational Times (CPUTimes)
Impacting factor: Number of machines
Table 4 shows the results of average computational times of each method according to the variation of machines numbers m. Adding the more machine takes the Branch-and-Bound (BnB) largely more time to solve: when the number of machines goes from 2 to 5 the average CPUTimes increase nearly seven times. The solving time of the SF formulation increases also with the number of machines, but more slowly in comparison to the Branch-and-Bound algorithm. In the contrary, adding more machines reduces the solving times of the graph-based MILP by 32%.
Figure 3 plots the average CPUTimes of each value of m in function of n numbers of jobs. The solving time three solving methods discriminate when the number of jobs is greater than or equal to 15. While the CPUTimes of CPLEX for GF formulation increase in a stable way, the solving time of the other two methods increases in an exponential fashion.
Impacting factor: Jobs’ arrival density
From the data of Table 4, one can see the non-negligent impact of the distribution of the release times of jobs to the solving times. For the two sparsest arrival rate ( and ), GF has outstanding solving time. Figure 4 plots the average CPUTimes of each scale factor in function of n numbers of jobs. Data from this figure show us that the solving time of GF formulation seems to be independent of the problem size when the release dates are sparse. On the contrary,GF formulation is very sensible to the dense release time when the resolution time of CPLEX for this formulation worsens exponentially. BnB algorithm is also sensitive to this factor: the computational times are reduced by half when goes from 0.6 to 3.0. The computational time increase with in and decreases with in . The global trend for all solving methods is that it takes a shorter time to solve an instance with a sparse release date.
Impacting factor: Jobs’ setup times relative length
Table 4 presents the average computational times and the setup time scale factor ranges. Globally, the solving time tends to increase when the relative length of jobs’ setup time increase. The least sensitive method to this factor is GF
Table 4. The average computational times (in second) of instances grouped by the number of machines m, the arrival density scale α and setup time scale a. The best value found by a method is in bold.
Figure 3. The average value of CPUTime according to the number of machines m and number of jobs n.
Figure 4. The average value of CPUTime according to arrival density scale parameter α and number of jobs n.
MILP and the most sensitive method to this factor is BnB. Concerning the computational time, the job setup times relative lengths are less impacting than the other two factors. Figure 5 plots the average CPUTimes of each scale factor a in function of n numbers of jobs.
Figure 5. The average value of CPUTime according to setup time scale factor a and number of jobs n.
Table 5 shows the correlations between the computational times and the number of machines, the release date density, and the setup time scale.
3.2.2. Linear Relaxation Gap (LPGaps)
Impacting factor: Number of machines
Table 6 show the result on the average LPGap of each method to different numbers of machines. When the number of machines increases, the LPGap of GF is reduced three times while the LPGap of SF and BnB increase slightly.
Figure 6 details the impact of the number of machines to the average LPGap of each method in the function of n jobs. The LPGap curve of the Branch-and-Bound increases in a very stable way when the number of jobs rises up when the number of machines is small ( ) the LPGap of the BnB algorithm nearly zero regardless of the number of jobs.
Impacting factor: Jobs’ arrival density
Table 6 presents the average LPGaps of each solving methods in function of the arrival density scale parameter . One can observe an important impact of to the quality of the solution found all three methods, especially the solving of formulation GF. The critical value of is 1.5, the LPGaps deteriorates when .
Figure 7 adds a dimension which is the problem size, to the observation. The quality of the solution found by CPLEX solving GF differs greatly since n reaches 15, while for the two other methods, the threshold of n which differs the solution quality is 25.
Impacting factor: Jobs’ setup relative length
Table 6 presents the average LPGap of each solving methods in function of the jobs setup time scale parameter a. It seems that this parameter controls very weakly the KPI LPGap of the three methods. One can notice only a very slight increment of the LPGap in all methods to when the setup times increase.
In Figure 8, with more details on the instance size, one can observe a slight impact of the setup time length to the solving of SF formulation when n reaches 25. For the two other methods, the impact of the setup times length to the LPGap is not clear.
Altogether, Branch-and-Bound has tightest LPGap. The all-average gap is equal to 0.6% and the standard deviation is 1.7%. GF has the largest LPGap
Table 5. Correlation between the computational times and m, α and a.
Table 6. TheLPGap (in percentage) corresponding to the impacting factors (number of machines, arrival density scale and setup time scale) and their respective values.
Figure 6. The average value of LPGap according to number of machines m and number of jobs n.
Figure 7. The average value of LPGap according to arrival density scale parameter α and number of jobs n.
Figure 8. The average value of LPGap according to setup time scale factor a and number of jobs n.
when the release dates are dense where while yielding excellent results at other value of . This fact underlines our previous observation about the sensitivity of GF to the job availability distribution. Table 7 states the correlation between the LPGap and the impacting factors.
3.2.3. Gap to the Best-Known Lower Bound
The quality of the solution found by each method/formulation can be evaluated by the gap of the best-known lower bound (MLBGap) to the actual solution. This gap can be considered as the maximum possible gap to the optimal solution.
Figures 9-11 shows the impact of the scale factors on three paradigms over the MLBGap. When having the worst LPGap, GF has the best MLBGap in all instance size and in any scale factors. Inversely, the Branch-and-Bound yields the worst MLBGap while having the best LPGap. This observation underlines again the fact that the lower-bound of the Branch-and-Bound calculated by the mSPRT heuristic  is tight. Since the calculation of this lower bound is dependent to the number of machines, the MLBGap is also impacted by the number of machines. The empirical data shows that the release date density is very impacting to the value of MLBGap. Scale factor α diversifies the performance of all paradigms when , especially the Branch-and-Bound.
The global results on MLBGap of the three methods according to the setup times length scale factors are shown in Table 8. The correlation between the MLBGap and the impacting factors is shown in Table 9.
3.2.4. Percentage of Instances Being Solved to Optimality
For this KPI, we tend to examine the tractability of an instance according to its structure, i.e. quantify how easy an instance can be solved according to a specific structure. First, we introduce the graphs of the evolution of the percentages of the instances being solved to optimality in function of the problem size, and, corresponding to the value of m, α and a. As for the other KPI which are previously introduced, those graphs give us an idea on how well each solving method cope with the instance in an specific structure. Second, we introduce an cross-table analysis, with combined two factors. This analysis would help us to have an insight on how easy an instance can be solved with a specific pair of configurations.
Table 7. Correlation between the gap to lower bound and m, α and .
Table 8. The average MLBGap (in percentage) corresponding to the impacting factors (number of machines, arrival density scale and setup time scale) and their respective values.
Table 9. Correlation between the MLBGap and the impacting factors.
Figure 9. The average value of MLBGap according to number of machines m and number of jobs n.
Figure 10. The average value of MLBGap according to arrival density scale parameter α and number of jobs n.
Figure 11. The average value of MLBGap according to setup time scale factor a and number of jobs n.
Figures 12-14 show the percentages of instances that are solved optimally according to each scale factor.
Cross-Table 10 presents the percentage of instances being solved to optimality corresponding to each pair of the release date density and the setup time length scale. CPLEX solves the instance easily to optimal by GF formulation when α is greater than or equal to 1.5, regardless of the value of setup-times. For the Branch-and-Bound algorithm, both values of α and a harm the number of instances being solved to optimality. The relation between the number of instance solved and the structure α and a is more random when CPLEX solves instances by SF formulation. When we combined all methods, the number of optimal solutions increase with α and decrease with a.
Cross-Table 11 presents the percentage of instances being solved to optimality corresponding to each pair of the setup time length scale and the number of machines. From the data, we can remark a clear frontier between easy-to-solve and hard-to-solve instances classified by the number of machine for the Branch-and-Bound algorithm. Combining all methods, an instance is easier to solve with less number of machines and smaller setup times.
Cross-Table 12 presents the percentage of instances being solved to optimality corresponding to each pair of the jobs release dates density and the number of machines. From the data, one can observe that both the number of machines and the distribution of jobs arrival date have the discriminating power to separate easy-to-solve instance to hard-to-solve instances. However, since CPLEX solves instances with evenly distributed arrival dates outstandingly good with GF,
Figure 12. The percentage of instances solved to optimality according to number of machines m and number of jobs n.
Figure 13. The percentage of instances solved to optimality according to arrival density scale parameter α and number of jobs n.
Figure 14. The percentage of instances solved to optimality according to setup time scale factor a and number of jobs n.
Table 10. The percentage of instances solved to optimality according to [A, B] and α.
Table 11. The percentage of instances solved to optimality according to [A, B] and m.
Table 12. The percentage of instances solved to optimality according to α and m.
the release date density has more discriminating power. When combined all solving methods, more than 94% of the instances are solved to optimality with sparse jobs availability ( ). When is less than 1.5, then, instances with less number of machines are easier to be solved to optimality.
4. Conclusions and Perspective
First, we present in this paper two MILP formulations to solve the parallel machine scheduling with task release dates and sequence-dependent setup times to minimize total completion time. We provide also two upper bounds for the job’s completion times corresponding to each formulation.
Second, by a thorough analysis of the numerical tests, we observe a considerable impact of the job arrival rate, and a non-negligible impact of the number of machines to the quality of the solution found and the time-performance of the solving algorithm. The cross-table analysis at the end of the numerical tests section could establish a base for further classification methods.
We developed four perspectives on future research of this problem. First, we may notice the difference between the LPGap and the actual gap of each instance. Thus, a better estimation of the lower bound can be very helpful to improve the computational effort. Second, we tend to test the performances of GF, SF, and Branch-and-Bound with other objective functions such as makespan minimization and total weighted completion time minimization. Third, we notice that the MILP formulation is very sensitive to the upper bound of the completion time so a tighter upper bound could be helpful. Finally, one can observe that the performance of the different methods is input data characteristics (jobs’ arrival density scale parameter and setup time scale factors) dependent. Further work is to develop a new method corresponding to our previous conclusion.
This research is supported by Chaire Connected Innovation tenured by Prof. Farouk Yalaoui.
 Graham, R.L., Lawler, E.L., Lenstra, J.K. and Kan, A.R. (1979) Optimization and Approximation in Deterministic Sequencing and Scheduling: A Survey. Annals of Discrete Mathematics, 5, 287-326.
 Allahverdi, A. (2015) The Third Comprehensive Survey on Scheduling Problems with Setup Times/Costs. European Journal of Operational Research, 246, 345-378.
 Guinet, A. (1993) Scheduling Sequence-Dependent Jobs on Identical Parallel Machines to Minimize Completion Time Criteria. The International Journal of Production Research, 31, 1579-1594.
 Fowler, J.W., Horng, S.M. and Cochran, J.K. (2003) A Hybridized Genetic Algorithm to Solve Parallel Machine Scheduling Problems with Sequence Dependent Setups. International Journal of Industrial Engineering: Theory Applications and Practice, 10, 232-243.
 Montoya-Torres, J.R., Soto-Ferrari, M., Gonzalez-Solano, F. and Alfonso-Lizarazo, E.H. (2009) Machine Scheduling with Sequence-Dependent Setup Times Using a Randomized Search Heuristic. 2009 International Conference on Computers & Industrial Engineering, Troyes, 6-9 July 2009, 28-33.
 Lin, S.-W., Lee, Z.-J., Ying, K.-C. and Lu, C.-C. (2011) Minimization of Maximum Lateness on Parallel Machines with Sequence-Dependent Setup Times and Job Release Dates. Computers & Operations Research, 38, 809-815.
 Kurz, M. and Askin, R. (2001) Heuristic Scheduling of Parallel Machines with Sequence-Dependent Set-Up Times. International Journal of Production Research, 39, 3747-3769.
 Anderson, B.E., Blocher, J.D., Bretthauer, K.M. and Venkataramanan, M.A. (2013) An Efficient Network-Based Formulation for Sequence Dependent Setup Scheduling on Parallel Identical Machines. Mathematical and Computer Modelling, 57, 483-493.
 Nogueira, T.H., De Carvalho, C. and Ravetti, M.G. (2014) Analysis of Mixed Integer Programming Formulations for Single Machine Scheduling Problems with Sequence Dependent Setup Times and Release Dates. Optimization Online, 39.