Samples, or forensic evidence, are normally collected from crime scenes, victims and suspects in criminal cases and then submitted to the Forensic Science Laboratory (FSL). Each submitted case is treated separately, and its documentation and management are referred to as a case file. Sample influx into the FSL is defined as the number of samples per case file; the samples in a case file are normally analyzed together and reported to the client as a single report. Because of the uncertainty inherent in the search for criminals, the FSL normally receives an exceedingly large number of samples from the investigation teams, leading to a high workload for analysts, extended turnaround time (TAT), increased backlogs, and increased analysis costs. The number of samples received is sometimes in excess of the minimum required to answer the case, although sometimes even all the samples together are unable to answer the case satisfactorily.
Turnaround time (TAT) for the FSL, defined as the time taken to analyze a case file from the moment it is received within the FSL until the analysis report is released and collected by the customer or client, depends strongly on sample influx. The effect of high sample influx on TAT is of paramount importance to the FSL and its stakeholders. As the samples are received in large numbers, they may become backlogged if analysis is not completed on time. The average time that it takes the FBI Laboratory to provide DNA testing results to contributors, for instance, is lengthy, ranging from approximately 150 days to over 600 days.
A growing sample influx increases the turnaround time, since the number of staff remains the same while the number of samples increases. The demand for forensic casework will therefore outpace the FSL’s capacity to work on forensic cases. There is a need for re-sampling to reduce cost and increase the FSL’s productivity, which in turn affects the stakeholders (especially law enforcement), who become frustrated by extended TAT. Another reason for establishing a re-sampling regime is that courts slow down hearings while waiting for laboratory results, so that suspects are held longer awaiting trial, and are sometimes wrongly held while waiting for the laboratory results that would exonerate them. The cost to families who wait for death certificates, unable to do anything with the deceased’s assets or to obtain insurance payments, also necessitates faster analytical reports from the FSL. It is evident that higher sample influx leads to backlog in the Forensic Science Laboratory due to extended TAT, as discussed earlier. Moreover, the analysis cost increases in terms of consumables, staff time and machine operating time, given the limited capacity of forensic science laboratories. Thus, some case files are not completed on time, leading to a growing backlog and added stress.
Assessing the possibility of taking samples from the arriving sample influx (re-sampling) is part of business modeling, as it involves changing existing structures to pursue a new business opportunity for the FSL. This will pave the way towards saving analysis cost and time by analyzing fewer samples while still producing case reports that are useful for judicial purposes. It also transforms the FSL business from one opportunity to another. The business model changes will require the key stakeholders to collaborate and to understand the key aspects of the changes, since changing the number of samples per case file will bring mutual benefits to both criminal investigators and the FSL. Since other stakeholders need to collaborate on these changes, an inter-organizational perspective on sample influx management focuses on the value network dimension of the business model.
The analysis of sample influx and the minimization of samples will improve the performance of the FSL in terms of reduced workload and cost, pricing methods and revenue structure, the so-called value finance dimension of the business model. Moreover, the business model applicable to this analysis belongs to the so-called platform family, since it requires a networked business model involving the FSL and other government organizations (police force, prosecution, judiciary or court system, etc.). That is, a platform requires changes in the system to be built with both producers and consumers in mind. The outcome of such business model changes will be to create and push out value, in this case revenue to be consumed by the model users (the FSL staff). To allow other networked organizations to interact as participants in this model, training will be inevitable for the benefit of the FSL.
This study uses statistical methods to analyse the sample influx data collected in two consecutive years, 2014 and 2015, and estimates the costs incurred. After establishing the costs based on the number of samples received, the model suggests non-statistical re-sampling plans, adopted from the literature, that are applicable to the samples received by the FSL, and estimates the resulting saving to the FSL.
In this study, the populations from which samples are taken consist of seized material submitted to the FSL as a single case file. The number of samples for each seizure varies randomly from one case file to another and between forensic science disciplines. Sample influx data from three forensic science disciplines were analyzed (biology/DNA, chemistry and toxicology). Various arbitrary sampling methods that have been suggested in the literature, are often used in practice, and are reported to work well in many situations were tested in this study.
2. Literature Review
The higher sample influx leads to a high workload for the FSL scientists, and hence backlogs are created. Backlogs continue to be a problem in laboratories because the demand for testing services is increasing faster than the capacity of the laboratories to process cases. For DNA applications, for instance, there are two types of samples submitted to the FSL: casework samples (collected from crime scenes) and convicted offender and arrestee samples (easier and faster to analyze). The former type involves a large number of samples because it involves searching for a match, whereas collecting reference samples, or collecting directly from individuals, requires fewer samples and hence a lower sample influx. The current of forensic analysis is able to assist criminal investigations within the first 1 to 2 weeks when the most investigative resources are consumed. The lack of accurate forecasting of demand for FSL analysis, due to the stochastic nature of sample influx, has created a vicious cycle of forensic backlogs in the FSL. Without an accurate forecast, as more laboratory capacity is slowly added, the increase in output is overtaken by new demand. As a result, DNA cases, for instance, spend the majority of their time waiting behind other cases rather than undergoing analysis.
Forensic DNA can be obtained from crime scenes or evidentiary items such as envelopes, clothing, and drinking glasses and compared to samples collected from known persons in an attempt to identify a perpetrator to a crime  . All these items are received at the FSL as evidence, and when they fall under one request or case file, they form sample influx. A single forensic case can contain multiple pieces of evidence, each of which may in turn yield several samples. For example, in a sexual assault case, DNA evidence left behind by a perpetrator may be collected from the victim’s body, clothing, and the physical location where the assault occurred. In addition to collecting forensic DNA evidence from crime scenes, evidentiary items, or victims, DNA samples can be collected from persons who have been charged or convicted of certain crimes  . As a result of the need for a match, forensic DNA evidence increases sample influx and cost of analysis and reporting.
Sampling is defined as the action or decision of taking a part of a substance, material or product for testing in order to reach a conclusion, make an inference about, and report on the whole. Sampling is only used when there is a reasonable assumption of homogeneity in the whole population. For exhibits or evidence arriving at the sample receiving office (SRO)... that consists of a multi-unit population (e.g., tablets, baggies, bindles), a re-sampling plan is a statistically valid approach to determine the number of sub-items that must be tested in order to make an inference about the whole population. Thus, re-sampling is more feasible for forensic chemistry than for forensic biology and toxicology.
The basis of re-sampling is that the composition found in the samples taken reflects, in principle, the composition of the whole lot. As a consequence, only a fraction of the total packages in a seizure of drugs of abuse needs to be investigated. Re-sampling is an intentional choice to refrain from doing things to (unnecessary or impossible) perfection, for reasons of efficiency and cost effectiveness. As an example, if one sample out of a population of 10 is taken and its analysis shows cocaine, the hypothesis that this is the only item containing cocaine is much less likely (10%) than the hypothesis that the majority of the 10 items contain cocaine (more than 50%).
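One way to read the percentages in this example is as the chance that a single randomly drawn unit tests positive under each hypothesis. The short Python sketch below illustrates that reading; the function name is hypothetical and the interpretation is an assumption, not a procedure taken from the cited literature.

```python
from fractions import Fraction


def prob_single_draw_positive(population_size: int, positives: int) -> Fraction:
    """Probability that one unit drawn at random from the population is positive."""
    if not 0 <= positives <= population_size:
        raise ValueError("positives must lie between 0 and population_size")
    return Fraction(positives, population_size)


N = 10  # population size used in the example above
# Compare "only one unit contains cocaine" with "a majority contains cocaine".
for k in (1, 6, 10):
    p = prob_single_draw_positive(N, k)
    print(f"{k} of {N} positive: P(single draw is positive) = {float(p):.0%}")
```

Under the "only one positive" hypothesis the observed result has a 10% chance of occurring, whereas any majority hypothesis gives it a chance above 50%, which is why the observation favours the latter.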
In forensic science, a sampling strategy is fully dependent on the question, and thus on the problem that has to be solved, in this case a forensic investigation problem. There may be different needs for the prosecution of possession, production, or trafficking. The question usually arises from national law or national policy, or sometimes directly from the opinion of the prosecutor or the police officers.
Sampling plans fall into two main categories: statistical and non-statistical. Statistical sampling plans include hypergeometric, Bayesian and other probability-based approaches. Non-statistical sampling plans (also called arbitrary sampling), on the other hand, include taking the square root of the number of samples received at the FSL, management directives and judicial requirements. The hypergeometric sampling plan, for example, is a statistically based plan that allows forensic scientists to analyze a portion of a population and make a statistical inference about the whole population, for instance by stating that the material was analyzed with a statistical sampling plan that demonstrates 95% confidence that at least 90% of the material contains the identified controlled substance(s). This study focuses on non-statistical or arbitrary sampling plans. Since these sampling schemes have no statistical foundation, they may lead to a very large sample size in the case of large seizures. Application of such sampling plans aims at saving analysis cost and time.
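The difference between an arbitrary plan and a hypergeometric plan can be illustrated numerically. The sketch below is not taken from the study: the function names are hypothetical, and the hypergeometric calculation assumes the common zero-failure formulation, in which every sampled unit must test positive to support the claim that at least 90% of the population is positive at 95% confidence.

```python
import math


def sqrt_rule(population_size: int) -> int:
    """Arbitrary plan: sample the square root of the population, rounded up."""
    return math.ceil(math.sqrt(population_size))


def hypergeometric_sample_size(population_size: int,
                               min_fraction: float = 0.90,
                               confidence: float = 0.95) -> int:
    """Smallest n such that, if all n sampled units are positive, a population
    with fewer than min_fraction positives would produce that result with
    probability at most 1 - confidence (assumed zero-failure formulation)."""
    N = population_size
    # Largest number of positives that still contradicts the claim.
    k_bad = math.ceil(min_fraction * N) - 1
    for n in range(1, N + 1):
        # P(all n draws positive | only k_bad positives in the population);
        # math.comb returns 0 when n > k_bad.
        p_all_positive = math.comb(k_bad, n) / math.comb(N, n)
        if p_all_positive <= 1 - confidence:
            return n
    return N


for N in (10, 100, 1000, 10000):
    print(N, sqrt_rule(N), hypergeometric_sample_size(N))
```

Running the loop shows how the square-root rule keeps growing with the size of the seizure, while the hypergeometric sample size levels off, which is the behaviour described in the paragraph above.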
Exhibits of illicit drugs in a large number of containers are frequently submitted to the FSL. The forensic chemist often needs to select randomly and then examine a number of these containers to provide information regarding the composition of the overall exhibit which is sufficient to support the requirements of the criminal justice system  . However, such resampling must have scientific basis. The sample influx into FSL increases the workload and demands requiring proper prioritization and need for addressing additional laboratory costs and backlog issues (also a number of uncompleted requests or services). Efforts have been made in the past to establish sampling protocols which will reduce time and backlogs.
The sample flow into the FSL is a significant performance factor, adding stress on the ability of the scientists to meet their daily workload. It is important to identify, among the forensic chemistry, toxicology and forensic DNA disciplines, the types of case files whose requests exceed staffing capabilities. Based on the Tanzanian case and experience, it is the non-DNA laboratories (for instance, forensic chemistry) that expect demand to increase, rather than remain constant, due to extreme increases in sample influx. Sampling guidelines for such categories are important where large numbers of relatively homogeneous material are available. These cases are characterized by differences in materials, amounts, package types and sizes, and sometimes by different suspects. Before applying such protocols, it is important to understand the nature of the samples that arrive at the FSL and the problems that the samples or evidence are meant to solve. Figure 1 summarizes the major forensic functions of the FSL under study as received in Y2015.
Figure 1. Major forensic functions performed by FSL in 2015 by type of jurisdiction based on sample receiving officers (SRO) data sheet.
Forensic scientists shall evaluate which items to analyze in a case based on several factors, including the nature of the potential charge(s), the location of the item, and the nature of the item, i.e., biohazards, chemical hazards and insufficient sample. The sample influx from the investigation teams determines the population or multiple unit items (MUI) to sample from. Thus, scientists must carefully inspect each of the units in the item, as well as any contents, for homogeneity in size, weight, color, packaging, markings, labeling, indications of tampering and other characteristics.
While increased sample influx creates demand for additional staff, it is in most cases impossible to add staff because of space limitations, limited funds to hire additional analysts, the workload of analysts in other areas of the FSL, the specialized skills required for forensic sample analysis (e.g., biology/DNA), and the procedures required for the forensic sample chain of custody.
In developing countries like Tanzania, analysts of required capability are not readily available in the labor market. However, literature shows that an increase in the number of available analysts alone will not solve all the problems associated with increase in sample influx or demand for services while other factors, including availability of equipment, reagents, standards, etc., will also create stress on the other side    .
Moreover, hiring additional analysts without hiring additional supervisors only leads to an increase in inefficiency. The question remains how the FSL can increase efficiency apart from obtaining funding to hire additional analysts and supervisors. The crime rate of the community served by the FSL and the performance of the police investigation unit are manifested in the number of case files and, to a smaller extent, in the sample influx at the SRO. The sample influx mainly depends on the investigators’ knowledge and skills in sampling at crime scenes. There are also no standards regarding the number of FSL analysts and supervisors necessary to meet the needs of a particular community with a particular crime rate, so as to avoid the piling up of unattended or unanalyzed evidence in laboratories.
One of the commonly identified causes of a laboratory’s receipt of large sample influx is the failure of stakeholders to understand the limitations of FSL’s disciplines. Also, lack of knowledge among stakeholders on sampling techniques is one of the major causes of the higher sample influxes  . Thus, training is needed on what is involved in conducting particular types of analysis and the time required to perform it. Moreover, training of investigation officers on sampling techniques is essential.
Some of the sample influx and workload issues stem from a lack of communication between investigating officers, attorneys and analysts. While the workload piles up in the FSL, laboratory directors are not told when defendants have pleaded guilty or when prosecutors have decided not to prosecute a case, so that the analysis could be skipped. The laboratories are also not told when task forces are being contemplated, so that they could prepare for the increased workload. Furthermore, laboratories are often asked to handle “rush requests”, which tend to disrupt the day-to-day operations of the laboratory.
High sample influx puts demand on staff requirement and time for accessioning. The process of receiving, sorting and labeling samples becomes demanding when there are a large number of samples per case files and when such case files are received at high rate. It should be noted that the sample receiving personnel are also responsible for accurate distribution of the samples to the correct laboratory managers for testing. Under extreme cases of large number of case files tied with high sample influx (such as a case of forensic chemistry), a burden for distribution to the respective laboratory managers increases.
There are three main sampling techniques: representative sampling, arbitrary sampling, and statistical sampling. A representative sampling procedure can be performed on a population of units with sufficiently similar characteristics (e.g., size, color), and can be applied to drugs of abuse. The decision on how to perform it is left to the analyst, which creates a need to call forensic science laboratory analysts to the crime scene when such substances are seized by police investigators. An example illustrates the importance of similar external characteristics. For a group of heroin street doses packed in similar packaging, a single sampling rule can be applied to the population. If a large number of street doses with different external characteristics are seized, they have to be separated into as many groups as there are dissimilarities. Each group is then considered as a whole population and sampled on its own.
During re-sampling, it is essential that the following principles are maintained: the properties of the sample are a true reflection of the properties of the population from which the samples were taken; and each unit in the population has an equal chance of being selected. For simplicity, after having observed that the external characteristics are the same, all the units can be put in a “black box”, such as a plastic bag, and a sample can be chosen randomly. This is applicable in cases like the seizure of a thousand heroin street doses in similar external packages, or of a thousand tablets. The black box eliminates the chance of the sampler consciously selecting specific items from the population.
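The equal-chance principle behind the "black box" can be mimicked in software when units are labelled rather than physically mixed. The following is a minimal sketch, assuming the units carry hypothetical integer labels and that the sample size has already been fixed by whichever plan is in use; it is an illustration, not the procedure used in the FSL.

```python
import random


def black_box_sample(unit_ids, sample_size, seed=None):
    """Draw sample_size distinct units so that every unit has an equal chance
    of selection (sampling without replacement)."""
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    return rng.sample(list(unit_ids), sample_size)


# Hypothetical seizure: 1000 similar-looking street doses labelled 1..1000.
doses = range(1, 1001)
selected = black_box_sample(doses, sample_size=32, seed=2015)
print(sorted(selected))
```

Recording the seed alongside the case file would allow the selection to be reproduced later, although whether that is acceptable in a given jurisdiction is outside the scope of this sketch.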
Many forensic science laboratories have substantial backlogs of evidence not yet tested or otherwise processed like rape cases and DNA testing cases  . Clearing these backlogs is a major concern and goal of laboratory directors. Backlogs are such a problem for so many forensic scientists that their vision does not extend beyond clearing the backlogs. A survey conducted by USA Today in 1996 asked laboratory directors how they deal in the short run with their mushrooming caseloads with limited budgets and staff. Emerging coping strategies included: 1) prioritizing cases, that is, the most serious cases and cases with set court dates are worked first, and, 2) random sampling, which is a widely accepted approach in which laboratories test only a portion of confiscated drugs. However, many laboratories don’t encourage random sampling and some jurisdictions prohibit it, forcing analysts to spend countless extra hours doing analyses. The purpose of sample minimization by re-sampling is to utilize scientific methods to attain economic balance between service delivery and cost of analysis. Value finance, looking at information related to costing, pricing methods and revenue structure should allow FSL to generate revenue and sustain its profit stream over time. Component considerations should be focused on how FSL does business. This looks at totality of how FSL selects its customers, defines and differentiates its services portfolio, defines the tasks it will perform itself; and, configures its resources, creates utility for customers and captures profits   . Strategic outcome is networking the key interdependent systems that create and sustain the forensic science business or services.
3.1. Study Area and Data Collection Method
The sample influx data from two consecutive years, 2014 and 2015, was used to accomplish this study. The data was collected from a database of documented and computerized case file data. For each case file, the following information was tabulated: the FSL discipline in which the analysis was conducted, the number of samples per case file, the average analysis cost per sample (based on the price list), the laboratory number and the date of registration. Each analysis request from a requesting authority (law enforcement agencies) was assessed and the data entered into a database, part of which is shown in Table 1, based on the data collected for the year 2015. This paper focuses on the stochastic nature of the data in the 5th column (sample influx), while Figure 1 focuses on the 6th column (requested analysis). The span of sample characteristics is portrayed in the last column (description of samples).
The data was made available after a management effort towards improved documentation which started in 2013. Out of 628 case files received by the FSL in 2014, 526 case files were analyzed in the three FSL disciplines, which form the major part of this analysis. In 2015, however, the number of case files analyzed by FSL disciplines increased to 732, also forming a major part of the study.
Table 1. Part of the database used for data collection (from Y2015).
3.2. Mathematical Formulations for Sample Influx Analysis
Table 2 summarizes the list of parameters developed in this study; definition and formula, where applicable; and, remarks on the respective parameter. In this paper, various ways for approximating the cost of laboratory analysis were considered, since different equipment with widely varying costs is used by the different FSL disciplines (and in some cases, by the same discipline), and since some equipment is utilized in different types of analysis. Thus, the equipment cost was not associated directly with the specific type of analysis, and instead, average cost per sample was used. Based on the above cost analysis, it was possible to determine the saving after applying a re-sampling technique. The study is meant to show that sampling techniques can bring different savings for forensic chemistry services offered by FSL based on the difference in Sf structure.
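To make the idea of a saving from re-sampling concrete, the sketch below compares the cost of analysing every sample with the cost of analysing only the re-sampled units, using an average cost per sample as described above. The formulas of Table 2 are not reproduced here, so the function names, the square-root plan and the influx values are illustrative assumptions only.

```python
import math


def sqrt_plan(sf: int) -> int:
    """Assumed arbitrary plan: analyse ceil(sqrt(Sf)) units of a case file."""
    return min(sf, math.ceil(math.sqrt(sf)))


def resampling_saving(influx_per_case_file, cost_per_sample, plan):
    """Saving = cost of analysing all samples minus cost after re-sampling."""
    full_cost = sum(influx_per_case_file) * cost_per_sample
    reduced_cost = sum(plan(sf) for sf in influx_per_case_file) * cost_per_sample
    return full_cost - reduced_cost


# Hypothetical Sf values for four chemistry case files (not study data).
influx = [1, 4, 30, 400]
saving = resampling_saving(influx, cost_per_sample=29.0, plan=sqrt_plan)
print(f"Estimated saving: ${saving:,.2f}")
```

Because the plan is applied per case file, the saving is dominated by the few case files with very large Sf, which is why the structure of the influx data matters.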
3.3. Selection of Re-Sampling Plans
Several sampling methods from the literature were implemented for sample influx data submitted to the FSL for forensic chemistry analysis, as summarized in Table 3. The advantages and disadvantages of such sampling plans are listed in Table 3 to provide insight into their applicability, pros and cons. The focus is to attain a minimum sample size while still producing a forensic report that can be used in the courts and in other decision making.
Table 2. Mathematical formulations.
Table 3. Sampling methods applicable to sample influx into FSL.
4. Results and Discussion
4.1. Detailed Analysis of Case File and Sample Influx Data
In this analysis, it is important to distinguish between case files and sample influx. The sample influx is the data attached to each case file as received from the investigation team. For this purpose the two data sets are discussed separately. Figure 2 shows the distribution of case files received by the FSL by forensic science discipline, i.e., the forensic chemistry, biology/DNA and toxicology laboratories. Based on the total case files received over the two years, most case files requested biology/DNA analysis (537), followed by forensic chemistry (511), while toxicology received the fewest (209). Moreover, the biology/DNA laboratory had the highest increase in case files between 2014 and 2015 (56.9%), compared to toxicology (40.2%) and forensic chemistry (23.1%).
The increase in case files for the biology/DNA laboratory can be attributed to increased stakeholder awareness resulting from training for investigators and prosecutors and from a public campaign on the DNA services offered by the FSL. It should be noted that no sampling can be applied to case files, as they all have to be completed and reported. While each case file presents a varying number of samples, efforts to reduce the cost of analysis focus on the samples per case file, or sample influx, rather than on Ncf, which is a fixed variable during laboratory analysis.
Figure 3 presents the distribution of the total number of samples arriving at the three laboratories for the two consecutive years. The number of samples for forensic chemistry was the highest in both years. Over the two years, a total of 13,048 samples were received in the forensic chemistry laboratory, which was about 7 and 15 times higher than the number of samples received in biology/DNA and toxicology, respectively. Analysis of such forensic chemistry samples, if conducted for each sample, is highly expensive and time consuming. However, re-sampling for such cases can be handled differently from DNA and toxicology samples, as emphasized later. Between the two years, there was a decrease in the number of samples requesting forensic chemistry analysis (−4.9%), while the number of samples requiring DNA and toxicology analysis increased by 37.9% and 16.0%, respectively.
Figure 2. Case files received and reported in the two consecutive years.
Figure 3. Samples received and analysed in the two consecutive years.
There is a great difference between the distribution of case files and the distribution of samples among the forensic science laboratories, as illustrated in Figure 4 using pie charts. About 85% and 80% of the samples for 2014 and 2015, respectively, were received in forensic chemistry, while the other laboratories received the remaining part. The portion of samples that required DNA analysis increased from 10% to 14% from 2014 to 2015. The change in the share of toxicology samples between 2014 and 2015 was small, from 5% to 6% only. This small change in share should be differentiated from the 16% increase, which is a comparison within the same laboratory (in Figure 3). Figure 2 and Figure 4 show the number of cases handled by the FSL from the criminal justice point of view, by presenting the number of case files submitted, and also the burden of analysis on the laboratory staff, by presenting the number of samples analyzed. This burden manifests itself in the FSL as a surge in human resource and financial requirements, necessitating an analysis of re-sampling after the samples are submitted to the laboratory.
4.2. Laboratory Analysis Cost for Forensic Samples Received
The price list used in the FSL gives analytical cost estimates for each forensic science discipline. The price list tells clients what charges to expect for specific analytical services. It recommends that the retailer sell the product, which is the analytical report, at prices recommended by the Government.
Figure 4. Distribution of the case files (top charts) and samples (bottom charts) among the three FSL disciplines for 2014 and 2015.
Each case has several parameters requested by the client or requesting authority, which together determine the total cost of analysis presented to the client. Each parameter is priced according to the chemicals, reagents and consultancy involved. However, such costs were estimated on the assumption that other costs, such as salaries, power, water, other utilities and overtime expenses, are covered by subsidies from the government, which is not the case for the Tanzania FSL at the moment. Based on the price list, the average cost per sample was established to be $29.0, $42.0 and $14.2 for chemistry, biology/DNA and toxicology samples, respectively, as shown in Table 4 for the case of Y2014.
Based on number of samples (assumed uniform), it was possible to estimate the total annual analysis cost for the samples submitted to the three forensic science disciplines for both 2014 and 2015 as shown in Table 4 and also in Figure 5.
Figure 5 shows that the analysis cost for forensic chemistry samples is the highest (up to $194,000 per year) compared to the other two forensic science laboratories ($47,400 and $6500 only per year). The cost of analysis for case files and the corresponding sample influx varied in the ratio of 35:6:1 and 28:12:1 for forensic chemistry, biology/DNA and toxicology for 2014 and 2015, respectively. Although Table 4 shows that the price per sample is highest for DNA analysis (about $42), the large number of forensic chemistry samples makes the total cost for that discipline far exceed the total cost for DNA analysis (by more than 4 times). Figure 4 shows one of the reasons why re-sampling plans are required for forensic chemistry samples rather than for forensic biology/DNA and toxicology.
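The annual cost estimate amounts to multiplying the number of samples by the average cost per sample. The sketch below reproduces that arithmetic for 2014, using the average costs quoted above and the 2014 sample totals reported in Section 4.4.1 (6689, 818 and 393); the variable and function names are illustrative.

```python
# Average cost per sample (USD) from the FSL price list, Y2014.
AVG_COST_USD = {"chemistry": 29.0, "biology/DNA": 42.0, "toxicology": 14.2}
# Total samples received per discipline in 2014 (Section 4.4.1).
SAMPLES_2014 = {"chemistry": 6689, "biology/DNA": 818, "toxicology": 393}


def annual_cost(samples, avg_cost):
    """Annual analysis cost per discipline = number of samples x average cost."""
    return {d: samples[d] * avg_cost[d] for d in samples}


costs = annual_cost(SAMPLES_2014, AVG_COST_USD)
smallest = min(costs.values())
for discipline, cost in costs.items():
    # The ratio is expressed relative to the cheapest discipline.
    print(f"{discipline}: ${cost:,.0f} (ratio {cost / smallest:.1f})")
```

For 2014 this gives roughly $194,000 for forensic chemistry and a cost ratio close to 35:6:1, in line with the figures quoted above.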
Table 4. Analysis costs for received samples based on FSL price list (for Y2014).
Figure 5. Estimated annual analysis costs based on sample influx and average analysis cost per sample (from price list).
4.3. Time Series of Sample Influx into Forensic Science Laboratory
The sample influx into the FSL does not have a uniform trend and shows random behaviour. For the data collected in 2015 and segregated by FSL discipline, the much higher influx values for forensic chemistry, compared to forensic biology/DNA and forensic toxicology, are obvious in Figure 6. Sf values of up to 400 samples/case file were observed in forensic chemistry, while for the other laboratories the highest values were about 25 samples/case file for forensic biology/DNA and about 20 samples/case file for forensic toxicology. Small sample influx values, below 10 samples/case file, were dominant for forensic toxicology.
The observed time series behavior of the sample influx necessitates the use of advanced data analysis techniques, such as frequency analysis or chaos analysis, to gain insight into its inherent behavior.
4.4. Statistical Analysis of the Sample Influx Data
4.4.1. Analysis of Sample Influx and Total Number of Samples
The mean sample influx, or the average number of samples per case file, is equivalent to the total number of samples divided by the number of case files for the period considered, such as a year or a month. The sample influx profile was characterized by the number of case files received and the number of samples in each case file (sample influx), as shown in Table 5.
Figure 6. Time series of the sample influx data into forensic science laboratories for the year 2015.
The data presented in Table 5 comprise Ncf,d (the number of case files for a specific FSL discipline), the mean sample influx (average number of samples per case file), the mode (most recurring sample influx value), and the skewness and kurtosis as measures of the shape of the sample influx distribution. The sum indicates the total number of samples received in the specific forensic science laboratory discipline for that year.
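The statistics listed for Table 5 can be computed from a list of samples-per-case-file values. The sketch below shows one way to do so with the Python standard library. The study does not state which skewness and kurtosis estimators were used, so the population moment-based definitions (with excess kurtosis) are an assumption here, and the data are hypothetical.

```python
import statistics


def influx_summary(samples_per_case_file):
    """Summary statistics of sample influx (Sf) for one discipline."""
    x = list(samples_per_case_file)
    n = len(x)
    mean = statistics.fmean(x)
    # Central moments of the data (population form, assumed definition).
    m2 = sum((v - mean) ** 2 for v in x) / n
    m3 = sum((v - mean) ** 3 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    return {
        "Ncf": n,                       # number of case files
        "sum": sum(x),                  # total number of samples
        "mean": mean,                   # mean sample influx
        "mode": statistics.mode(x),     # most recurring Sf value
        "skewness": m3 / m2 ** 1.5,     # undefined if all values are equal
        "kurtosis": m4 / m2 ** 2 - 3.0, # excess kurtosis
    }


# Hypothetical DNA-like data: mostly 3 samples per case file, one large case.
print(influx_summary([3, 3, 3, 3, 2, 4, 3, 1, 3, 25]))
```

A single large case file is enough to pull the mean above the mode and to produce positive skewness and kurtosis, which is the pattern discussed for the FSL data.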
The average sample influx for chemistry, biology/DNA and toxicology was observed to be 30, 4 and 5 samples per case file, respectively, for the year 2014, and was higher for forensic chemistry than for forensic biology/DNA and toxicology in both years. However, the mode values were 1, 3 and 1 for forensic chemistry, biology/DNA and toxicology, respectively, which is low compared to the average values because some case files contained a very large number of samples. The mode for toxicology increased to 3 in 2015. In the case of forensic biology/DNA, the modal Sf value remained at 3 samples because investigative questions in forensic biology/DNA case files are not answered through a single sample, and because many cases involve paternity issues. Different biological samples have an equal chance of contributing to the search for, or identification of, a missing person. Therefore, it is important to have as many samples as are needed to give answers beyond reasonable doubt, giving a high skewness of 12.2 for 2014. In the same year, crime scene investigation officers received training on sampling organized by the FSL, leading to a drop in Sk,d in 2015 for DNA.
However, most case files in forensic biology/DNA are related to paternity testing (requests submitted by advocates, courts and social welfare officers), which involves three types of samples that may answer the request, leading to a mode of 3 and a higher Ku of 163.3. These data are important for FSL management in terms of staffing or placement, since sample influx data from received case files do not recur. In terms of supply chain management (SCM), special procurement methods and efforts are also needed for the goods required in the laboratories.
The higher kurtosis for biology/DNA sample influx data shows that there is a high peak around the modal value of 3 samples per case file, with very few case files having a large sample influx. This is expected because most paternity case files are submitted with 3 samples, in addition to the samples from criminal cases. The sample influx data show positive skewness, i.e., a peak on the left and a longer tail on the right, towards higher Sf values that appear at lower frequency.
Table 5. Sample influx distribution and statistics by forensic science disciplines for the year 2014.
The forensic chemistry sample influx data show the highest average value, 30 samples/case file, compared with only 5 and 4 samples per case file for forensic toxicology and biology/DNA, respectively. A similar trend was observed for the year 2015. Similarly, the total number of samples, Ns, was highest for chemistry (6689), followed by biology/DNA (818) and toxicology (393), with similar trends in both 2014 and 2015. These data support the need for cost-saving concepts such as re-sampling plans.
While the numbers of case files for biology/DNA and chemistry were very close (209 and 229, respectively) in 2014, the total numbers of samples received for analysis were completely different: forensic chemistry accounted for 85% of all samples, compared with only 10% for biology/DNA, as per Figures 1-3. The forensic toxicology case files are characterized by few samples per case file, leading to a small fraction of the total samples, that is, only 5% and 6% for 2014 and 2015, respectively.
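The shares quoted above can be checked directly from the 2014 totals reported in the text (6689, 818 and 393 samples; 229 and 209 case files for chemistry and biology/DNA). The short sketch below only reworks these reported figures; the variable names are illustrative, and the toxicology case-file count is left out because it is not stated in this section.

```python
# 2014 totals as reported in the text
total_samples = {"chemistry": 6689, "biology/DNA": 818, "toxicology": 393}
case_files = {"chemistry": 229, "biology/DNA": 209}

grand_total = sum(total_samples.values())

# Share of all received samples per discipline, in percent
share_pct = {d: 100.0 * n / grand_total for d, n in total_samples.items()}

# Sample ratio relative to toxicology
ratio_to_tox = {d: n / total_samples["toxicology"] for d, n in total_samples.items()}

# Mean sample influx = total samples / number of case files
mean_influx = {d: total_samples[d] / case_files[d] for d in case_files}

for d in total_samples:
    print(d, round(share_pct[d], 1), round(ratio_to_tox[d], 1))
print({d: round(v, 1) for d, v in mean_influx.items()})
```

The rounded shares are 85%, 10% and 5%, the sample ratio is about 17:2:1, and the means come out at about 29 and 4 samples per case file for chemistry and biology/DNA, close to the reported averages.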
4.4.2. Probability Functions for Sample Influx Data
Figure 7 shows the PDFs of sample influx data for the three forensic science laboratories for 2014 and 2015. The tallest peak, corresponding to forensic biology/DNA in both plots, can be attributed to the fact that this laboratory tends to receive at least three samples per case file for paternity testing, causing a distinct modal value of 3. The higher values of sample influx for this discipline result from criminal cases whose samples are collected from crime scenes. The results indicate high positive skewness, that is, the sample influx data show an asymmetric tail extending towards more positive values (i.e., towards very high sample influx at lower frequency).
This trend was observed for data from both 2014 and 2015. The Sk,d and Ku,d values, together with the average values, were presented in Table 3. The shapes of the PDF curves were the same for each forensic science discipline in both years. Moreover, the curves were specific to, and different for, each laboratory discipline. In comparison, the sample influx data for forensic chemistry show the longest tail of all laboratories, reaching close to 1000 samples/case file. Based on the observed analysis cost (Figures 4-6), it was necessary to study the possibility of applying re-sampling plans to forensic chemistry samples using scientific methods.
Figure 8 shows the cumulative distribution functions for sample influx into the three forensic science laboratory disciplines (chemistry, biology/DNA and toxicology), for the two consecutive years. The curves were observed to be similar for 2014 and 2015, indicating that controlling factors remained the same.
Figure 7. Probability density functions of sample influx data for FSL.
The cumulative charts show that forensic chemistry samples occur at higher frequency than other samples for Sf < 3 samples/case file. For 3 < Sf < 60 samples/case file, forensic biology and toxicology samples appear more frequently than forensic chemistry. For Sf > 60 samples/case file, no more biology/DNA and toxicology samples appear, showing that very high sample influx into the FSL comes from forensic chemistry only. For the year 2015, the maximum Sf values were distinct, being 25, 60 and 400 for forensic biology/DNA, toxicology and chemistry, respectively.
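The probability and cumulative curves discussed above can be approximated from raw per-case-file counts. The following sketch builds an empirical (discrete) probability function and its cumulative distribution; it is an assumption that a simple frequency estimate is adequate here, since the method used to draw Figures 7 and 8 is not described in this section, and the function name and example data are hypothetical.

```python
from collections import Counter


def empirical_pmf_cdf(samples_per_case):
    """Return (pmf, cdf) dictionaries keyed by Sf value.

    pmf[v] is the fraction of case files with exactly v samples;
    cdf[v] is the fraction of case files with at most v samples.
    """
    n = len(samples_per_case)
    counts = Counter(samples_per_case)
    pmf = {}
    cdf = {}
    running = 0
    for value in sorted(counts):
        running += counts[value]
        pmf[value] = counts[value] / n
        cdf[value] = running / n
    return pmf, cdf


# Hypothetical biology/DNA-like data: most case files carry 3 samples
pmf, cdf = empirical_pmf_cdf([3, 3, 3, 1, 2, 10])
print(pmf)
print(cdf)
```

In this toy data set the peak of the probability function sits at Sf = 3, and the cumulative curve reaches 1 only at the largest case file, mirroring the long right tail described in the text.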
4.5. Cost of Analysis for Case File and Samples without Re-Sampling
To start with, the analysis costs were determined for each case file, with the corresponding sample influx, using the average cost per sample. According to Table 3, this is the so-called “all” plan (Nas = Sf), in which the analyst has 100% certainty about the composition of the population. Its main disadvantage is excessive sample sizes for larger populations, such as the forensic chemistry sample influx observed in this study.
The results are presented in Figure 9 for all three forensic science laboratories.
Figure 8. Cumulative distribution functions for sample influx data from three forensic science laboratories.
It is interesting to note that all the Sf data sets lead to semi-logarithmic curves, differing only in the cost values because of differences in the average cost per sample. The forensic chemistry data set, however, was clearly distinguishable by its large number of data points on the graph, despite having a number of case files similar to biology/DNA, owing to the strong variation in sample influx data, as per Figure 9. In all three cases, the data follow a general rule as per Equation (11):
with values of a equal to cost per sample based on price list and b = 1, thus, varying according to discipline of forensic samples. A semi-logarithmic scale was used due to wide differences in sample influx values between the three disciplines, but also due to large number of values at lower ranges of Sf. It should be noted that the data points plotted in Figure 9 correspond to the actual data of sample influx received within FSL for 2014 or 2015, at a varying frequency or repetition, such that when multiplied by average analysis cost per sample, the cost function is obtained.
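The cost rule described here, with a equal to the cost per sample and b = 1, can be sketched in code. The power-law form C = a·Sf^b used below is an assumption inferred from that description, not a reproduction of the published equation, and the cost per sample in the example is a hypothetical figure rather than a value from the FSL price list.

```python
def cost_without_resampling(sf, cost_per_sample, b=1.0):
    """Analysis cost of one case file when every sample is analyzed.

    sf: sample influx (number of samples in the case file)
    cost_per_sample: coefficient a, the average cost per sample
    b: exponent, equal to 1 when cost grows in proportion to Sf
    The form a * sf**b is an assumed reading of the rule in the text.
    """
    return cost_per_sample * sf ** b


# Hypothetical price of 100 currency units per sample
for sf in (1, 30, 400):
    print(sf, cost_without_resampling(sf, 100.0))
```

With b = 1 the cost is simply proportional to Sf, so the curves for the three disciplines differ only through the coefficient a.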
As stated above, the plot for forensic chemistry shows a large number of data points compared with the other disciplines, mainly because the latter show the least variation in sample influx, leading to repeated Sf values. However, in Figure 9, where the cumulative functions for forensic chemistry have a large number of data points, the data points correspond to the large number of case files.
Figure 9. Analysis cost for the case files without re-sampling (Y2014).
4.6. Minimization Function for Sample Influx into Forensic Chemistry
Based on the results presented in Figures 1-9, it is evident that the forensic chemistry sample influx is not only the highest but also the most widely spread, with a large number of case files as well. Only forensic chemistry samples satisfy the conditions for re-sampling. Figure 10 shows the results of six sampling regimes applicable to the sample influx into the forensic chemistry laboratory. The plot is on logarithmic scales, and the curves are linear except for the non-power function. The latter was also observed to yield the highest number of arbitrary samples to be analyzed per case file. The focus is to attain a minimum sample size while still producing a forensic report that is useful in court and for other decision making. Based on Figure 10, it is possible to establish which sampling plan leads to the minimum number of samples or the minimum analysis cost. For instance, for Sf between 20 and 100 samples per case file, the minimum cost is attained by using the first sampling regime; outside this range, that regime does not lead to the minimum number of samples. That no re-sampling is required for Sf = 1 can be deduced from the point (1, 1), the starting point on the graph in Figure 10. Another key point is that annual variations in Sf are not important in determining the number of arbitrary samples to test, Nas, under any selected re-sampling plan, as data points will only shift along the curve.
Figure 10. Number of samples required under different arbitrary sampling plans applied to sample influx.
On the upper part of Figure 10, the maximum possible number of samples can also be determined by selecting the sampling regime that leads to the maximum number of samples per case file. Although the average price is used for forensic chemistry samples, Figure 10 also makes it possible to determine the pricing when a different pricing regime is used. That is, this graph can be used for different types of samples and for different case files where arbitrary sampling is required.
Figure 11 shows the locus of the minimum samples, obtained by selecting the sampling regimes that yield the minimum number of arbitrary samples to analyze for a given range of sample influx. The minimum number of arbitrary samples is given by a thick line at the bottom of the curves, which combines five sampling plans in no particular order.
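The idea of a minimum locus can be sketched as the lower envelope of several sampling rules evaluated at each Sf. The rules below are commonly quoted arbitrary sampling formulas and are used here only as stand-ins: they are assumptions, not the six regimes of Table 3, and they carry no relation to the 1st to 5th numbering used in the text. The function names are likewise hypothetical.

```python
import math

# Illustrative arbitrary sampling rules (assumed, not the paper's Table 3)
PLANS = {
    "sqrt(N)": lambda n: math.sqrt(n),
    "0.5*sqrt(N)": lambda n: 0.5 * math.sqrt(n),
    "0.1*N": lambda n: 0.1 * n,
    "0.05*N": lambda n: 0.05 * n,
    "20+0.1*(N-20)": lambda n: n if n <= 20 else 20 + 0.1 * (n - 20),
}


def samples_to_analyze(plan, sf):
    """Number of arbitrary samples, Nas, for one plan and one case file."""
    raw = PLANS[plan](sf)
    # Remove floating-point noise, then round up to a whole sample
    n_as = math.ceil(round(raw, 9))
    # At least one sample, never more than were received
    return min(sf, max(1, n_as))


def minimum_plan(sf):
    """Plan giving the fewest samples at this Sf, and that sample count."""
    best = min(PLANS, key=lambda name: samples_to_analyze(name, sf))
    return best, samples_to_analyze(best, sf)


for sf in (1, 10, 100, 400):
    print(sf, minimum_plan(sf))
```

Every rule returns one sample at Sf = 1, which corresponds to the point (1, 1), and the rule that attains the minimum changes as Sf grows, which is why the locus is assembled from several plans rather than a single one.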
Figure 11. Locus of the minimum number of samples required using selected arbitrary sampling plans.
Based on Figure 11, it is evident that to attain minimum arbitrary samples, five sampling regimes must be used, depending on the range of sample influx arriving at the FSL. For example, for, , and, sampling plans giving minimum samples are 3rd, 5th and 4th, respectively. On the other hand, for, , and samples per case file, the plans giving minimum samples to be analyzed are 2nd, 1st and 4th, in this order, respectively. The 4th sampling plan reappears at higher Sf values beyond 100 samples/case file. Thus, the minimum number of arbitrary samples, for a wide range of sample influx, can be represented using the mathematical expression shown in Equation (12), also denoted as the arbitrary sample minimization function:
4.7. Optimization of Analysis by Arbitrary Re-Sampling Forensic Chemistry Sample Influx
After establishing the number of samples to analyze under the different sampling plans, it was possible to determine the cost of analysis per case file using the average cost per sample (from Table 2). Figure 12 shows the analysis costs after re-sampling of forensic chemistry samples. Re-sampling is more cost-effective: it saves a large portion of the FSL running cost while still yielding a useful forensic report.
Figure 12. Cost of analysis of the case files when arbitrary sampling is applied for forensic chemistry case files.
Figure 13 shows the locus of minimum analysis cost per case file, obtained by purposefully selecting re-sampling schemes. Results show that the minimum cost of analysis is obtained when 2nd, 1st and 4th re-sampling regimes are applied for, and, respectively. In Figure 13, the locus of minimum cost has only three sampling plans, compared with the five used to determine the minimum number of arbitrary samples.
Figure 13. Determination of the minimum cost of analysis per case file using selected arbitrary sampling plans.
Based on the results presented in Figure 13, the minimum function for the cost per case file can be presented as per Equation (13):
4.8. Analysis of Savings from Arbitrary Re-Sampling of Forensic Chemistry Sample Influx
Comparing the costs of analysis before and after re-sampling, it was possible to estimate the saving that can be realized by the FSL, as presented in Figure 14. Analysis of savings aims at attaining maximum values, unlike cost analysis, which aims at minimum cost. Thus, based on Figure 14, the maximum saving locus was determined, leading to the results presented using a dotted line. Results show that the difference in saving is more pronounced at low sample influx values (Sf = 1 to 10 samples per case file), beyond which the difference due to the choice of sampling plan is negligible.
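The saving comparison can be expressed as the difference between the cost of analyzing all Sf samples and the cost of analyzing only the Nas samples chosen by a sampling plan. The sketch below assumes both costs are proportional to the same average cost per sample; the function names, plan labels and numbers are hypothetical and serve only to show the calculation.

```python
def analysis_saving(sf, n_as, cost_per_sample):
    """Return (saving, fraction saved) for one case file.

    sf: samples received; n_as: samples analyzed after re-sampling.
    """
    cost_all = cost_per_sample * sf
    cost_resampled = cost_per_sample * n_as
    saving = cost_all - cost_resampled
    return saving, saving / cost_all


def max_saving(sf, n_as_by_plan, cost_per_sample):
    """Plan with the largest saving, i.e. the fewest samples analyzed."""
    plan = min(n_as_by_plan, key=n_as_by_plan.get)
    saving, _ = analysis_saving(sf, n_as_by_plan[plan], cost_per_sample)
    return plan, saving


# Hypothetical case file: 100 samples, two candidate plans, 50 units/sample
print(analysis_saving(100, 10, 50.0))
print(max_saving(100, {"plan A": 10, "plan B": 5}, 50.0))
```

Because the saving is largest for the plan that analyzes the fewest samples, the maximum saving locus follows the same plan selection as the minimum cost locus under this assumption.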
Figure 14. Saving realized from arbitrary re-sampling for forensic chemistry sample influx.
Based on the above discussion, it can be concluded that, although the number of case files received by forensic chemistry and forensic biology/DNA was comparable in 2014 and 2015, the number of samples was totally different, at a ratio of 17:2:1 in 2014 and 14:2.5:1 in 2015 for forensic chemistry, forensic biology/DNA and forensic toxicology, respectively. The cost of analysis was highest for forensic chemistry, owing to the large number of samples received, at about 35:6:1 and 28:12:1 for the three forensic science disciplines (forensic chemistry, biology/DNA and toxicology) for the two consecutive years, 2014 and 2015, respectively. Being the highest of all FSL disciplines, the analysis cost for forensic chemistry necessitates implementation of re-sampling plans whenever possible. Based on statistical analysis of sample influx data, the average number of samples per case file was highest for forensic chemistry (30 samples per case file) compared with forensic biology/DNA and toxicology (about 3 - 5 samples per case file). The PDFs of sample influx data for forensic biology/DNA are characterized by high positive skewness compared with other disciplines, while forensic chemistry has the highest maximum value of Sf, reaching 600 samples per case file. The PDFs of forensic toxicology and DNA show peaks at 1 and 3 samples per case file, while the forensic chemistry discipline shows a single maximum at Sf = 3 samples per case file with the lowest skewness.
Applying different arbitrary sampling rules to the sample influx data for forensic chemistry (where multiple similar items are received) gives the minimum possible number of samples to be analyzed, the locus of which was presented mathematically and graphically for the first time in this paper. Moreover, using the case file analysis cost for forensic chemistry samples determined with different arbitrary sampling techniques, the minimum analysis cost per case file for three different ranges of sample influx was determined and presented both graphically and mathematically. The maximum saving, based on a comparison between the analysis cost with and without arbitrary re-sampling, was presented graphically. It was further observed that the effect of the sampling plan on savings diminished with increasing number of samples per case file.
The author is grateful to the management and staff of the Government Chemist Laboratory Authority (GCLA) for support during the course of this study.