A few weeks ago, a process engineer hoping to glean a model of yield as a function of 8 factors asked me to explain why the analysis of variance (ANOVA) failed to produce p-values. See this deficiency on the left side of the software output shown in Figure 1. On the right side, notice the dire warning about the fit statistics. The missing p-values and other non-available ("NA") statistics created great concern about the validity of the entire analysis.
Figure 1: Alarming results in ANOVA and fit statistics
The tip-off for what went wrong can be found in the footnote: "Case(s) with leverage of 1." After poring over the inputs, which stemmed from existing data rather than a designed experiment, I discovered that many of the rows had been duplicated. Removing these 'dups' left only 9 unique runs to fit a linear model requiring 9 coefficients: 8 main-effect slopes for the 8 factors, plus 1 for the intercept. The statistical software did the best it could with this 'mission impossible.' It did nothing wrong.
Creating total leverage, as in this multifactor case, can be likened to fitting a line to two points. It leaves no degrees of freedom (df) for estimating error (as shown in the Figure 1 ANOVA). Thus, the F-test cannot be performed and, therefore, no p-values can be estimated.
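To see this trap in miniature, here is a quick sketch in Python (my own illustration, not output from any particular statistical package) fitting a straight line, two coefficients, to just two points. The leverage of 1 at each point and the zero df for error fall out immediately:

```python
import numpy as np

# Straight-line model (intercept + slope = 2 coefficients) fitted to just 2 points
x = np.array([0.0, 1.0])
y = np.array([3.1, 4.7])                 # hypothetical responses; any values behave the same
X = np.column_stack([np.ones_like(x), x])

H = X @ np.linalg.inv(X.T @ X) @ X.T     # the 'hat' matrix; its diagonal holds the leverages
print(np.diag(H))                        # [1. 1.] -> both points fitted exactly
print(len(y) - X.shape[1])               # 0 df for error -> no F-test, no p-values
```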
A model can be generated (barely!), but without statistical tests it provides literally zero confidence in the outcome.
The remedy is very simple: Collect more data!
Leverage is a numerical value between 0 and 1 that indicates the potential for a design point to influence the model fit. It’s strictly a function of the design itself—not the responses. Thus, leverage can be assessed before running the experiment.
A leverage of 1 means that the model will fit that observation exactly. That is never good: unless the point falls precisely where it ought to be, your predictive model will be thrown off-kilter.
Leverage ("L") is an easy statistic to master. On average, it equals the number of coefficients in your model divided by the number of unique runs (dups do not count!).
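To make the 'dups do not count' rule concrete, here is another minimal sketch (again my own illustration): duplicating both points of an already saturated two-point design leaves each unique location soaking up a full unit of leverage, merely split between its copies, so the saturated model stays saturated (L = 2 coefficients / 2 unique runs = 1).

```python
import numpy as np

# Two unique design points, each duplicated, straight-line model (2 coefficients)
x = np.array([0.0, 1.0, 0.0, 1.0])
X = np.column_stack([np.ones_like(x), x])

h = np.diag(X @ np.linalg.inv(X.T @ X) @ X.T)
print(h)          # [0.5 0.5 0.5 0.5] -> each copy carries half of its point's leverage of 1
print(h.sum())    # 2.0: the leverages always total the number of model coefficients
```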
You have seen what happens when all the runs are completely leveraged (L = 1), but even one run at a leverage of 1 creates issues. For example, consider a hypothetical experiment aimed at establishing a linear fit of a key process attribute Y to a single factor X. The researchers intend to make 20 runs at two levels. However, due to circumstances beyond their control, they complete all 10 runs at the low level but only one at the high level. The 10 points at the low end come in at a leverage of 0.1 each, so none of them individually exerts much influence on the fit. That's good. But the single point at the high level exhibits a leverage of 1, so it will be fitted exactly wherever it may fall. That's not good, though it may be OK if the result lands where it ought to be. However, if something unusual happens at the high level, there will be no way of knowing. I would be very skeptical of such an experiment; best to go for a complete 'do over.'
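If you care to check those leverages yourself, the following sketch reproduces the 0.1 and 1.0 values. Note that no responses are needed; leverage depends only on the design:

```python
import numpy as np

# The lopsided experiment above: 10 runs at the low level (-1), 1 lone run at the
# high level (+1), straight-line model (2 coefficients)
x = np.array([-1.0] * 10 + [1.0])
X = np.column_stack([np.ones_like(x), x])

h = np.diag(X @ np.linalg.inv(X.T @ X) @ X.T)
print(h.round(3))    # ten values of 0.1, then 1.0 for the lone high-level run
print(h.sum())       # 2.0, as always: the sum of leverages equals the coefficient count
```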
Watch for leverages close to 1.0. Consider replicating these points, or make sure they are run very carefully.
Some designs, such as standard two-level factorials with no center points, produce runs with equal leverage; others do not. For example, a two-level design on 4 factors with 4 center points, fitted with the full 16-term factorial model, features 16 runs with a leverage of 0.9875 each, far exceeding the center-point leverage of 0.05. Nevertheless, by the generally accepted guideline that leverages below twice the average cause no great concern, this design gets a pass: the average leverage works out to 16 coefficients divided by 20 runs, or 0.8. A two-level design with center points is like a teeter-totter: points at the center sit at the fulcrum and thus carry very low leverage.
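Those numbers can be reproduced with a few more lines of the same sort of sketch (assuming, per the 0.8 average cited above, the full 16-term factorial model):

```python
import numpy as np
from itertools import combinations, product

# 2^4 factorial (16 runs) plus 4 center points
F = np.array(list(product([-1.0, 1.0], repeat=4)))
runs = np.vstack([F, np.zeros((4, 4))])

# Build the full 16-term model matrix: intercept plus every interaction up to 4th order
cols = [np.ones(len(runs))]
for k in range(1, 5):
    for c in combinations(range(4), k):
        cols.append(np.prod(runs[:, list(c)], axis=1))
X = np.column_stack(cols)                        # 20 x 16

h = np.diag(X @ np.linalg.inv(X.T @ X) @ X.T)
print(h[:16].round(4))   # 0.9875 at every factorial point
print(h[16:].round(4))   # 0.05 at every center point
print(h.mean())          # 0.8 average; twice that (1.6) is the rule-of-thumb threshold
```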
I advise you to focus only on runs with leverage greater than twice the average (or any with a leverage of 1, of course). It is best to identify high-leverage points before running the experiment via a design evaluation and, if affordable, replicate them, thus reducing their leverage.
Do not be greatly concerned if leverages get flagged after you remove insignificant terms from your model. For example, see the case study by our founder Pat Whitcomb in his article on "Bad Leverages" in the March 1998 Stat-Teaser, a must-read if you want to get a good grasp of leverage.
Keep in mind that, despite being flagged for high leverage (over twice the average), a design point may generate a response that typifies how the process behaves at that setting. In that case it does not invalidate the model. Apply your subject matter knowledge and/or ask an expert colleague to be the judge of that.
If you use standard DOE templates or optimal design tools to lay out an experiment, it is unlikely that your design will include points with leverage over twice the average. But if you override the defaults and warnings in your software, issues with leverage can arise. For example, I often see published factorial designs with only 1 center point, not the 3 or 4 that our software advises; this creates a leverage of 1 for the curvature test. Not good. Believe it or not, as a peer reviewer for a number of technical journals I have also seen many manuscripts that lay out the recommended number of center points for standard designs (e.g., 4 for a two-level factorial), yet all of those center points report exactly the same result, revealing them to be duplicates of a single run rather than true replicates. As already explained, when it comes to leverage, do not be duped by 'dups.'
I am particularly wary of historical data with runs done haphazardly (no plan). These often create a cloud of points at one end with very few at the opposite extreme. For example, see the scatter plot in Figure 2 (real data from a study of infection rates after varying numbers of days at various hospitals in the USA).
Figure 2: A real-life dataset with a badly leveraged point
In this case, the point at the upper right exhibits a leverage of 0.99, versus an average of 0.17 for the other 12 points. If possible, replicating such a high-leverage point would be very helpful, cutting its leverage roughly in half. Better yet, run two more replicates to cut this problematic point's leverage to roughly one-third. Though it will not emerge as an outlier in the diagnostics (very unlikely for a highly leveraged point, since it will be closely fitted), this particular result must be carefully evaluated, and set aside if determined to be exceptional.
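The arithmetic behind that advice is easy to demonstrate. In the sketch below, a lone extreme point (hypothetical values standing in for the real hospital data) starts near a leverage of 1; each added replicate splits the burden further:

```python
import numpy as np

def leverages(x):
    # Leverages for a straight-line fit to the design points in x
    x = np.asarray(x, dtype=float)
    X = np.column_stack([np.ones_like(x), x])
    return np.diag(X @ np.linalg.inv(X.T @ X) @ X.T)

cluster = [1.0, 1.2, 1.5, 1.8, 2.0, 2.2, 2.5]       # hypothetical low-end cloud
print(leverages(cluster + [10.0])[-1].round(2))      # lone extreme point: ~0.97
print(leverages(cluster + [10.0] * 2)[-1].round(2))  # replicated once: ~0.49 each
print(leverages(cluster + [10.0] * 3)[-1].round(2))  # two more replicates: ~0.33 each
```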
Pay attention to leverage, ideally before you complete your experiment; if you are developing a model from existing data, do so via the diagnostics in your statistical software. Beware of totally leveraged runs, the worst-case scenario. If things are not quite that bad, watch for leverages more than twice the average and, if possible, replicate those points. Otherwise, apply engineering and scientific expertise to decide whether the results can be accepted.
Here's the latest Publication Roundup! In these monthly posts, we'll feature recent papers that cited Design-Expert® or Stat-Ease® 360 software. Please submit your paper to us if you haven't seen it featured yet!
Microwave-assisted extraction of bioactive compounds from Urtica dioica using solvent-based process optimization and characterization
Scientific Reports volume 15, Article number: 25375 (2025)
Authors: Anjali Sahal, Afzal Hussain, Ritesh Mishra, Sakshi Pandey, Ankita Dobhal, Waseem Ahmad, Vinod Kumar, Umesh Chandra Lohani, Sanjay Kumar
Mark's comments: Kudos to this team for deploying a Box-Behnken response-surface design, convenient because it requires only 3 levels of each of their 3 factors (power, time, and sample-to-solvent ratio), to optimize their process. Given that all the raw data was provided, I was able to easily copy it into my Stat-Ease software and check the modeling; no major issues uncovered. The authors did well by diagnosing residuals and making use of our numerical optimization tools to find the most desirable factor combination for their multiple-response goals.
Orange-Fleshed Sweet Potatoes, Grain Amaranth, Biofortified Beans, and Maize Composite Flour Formulation Optimization and Product Characterization
Food Science and Nutrition, Volume 13, Issue 6. June 2025.
Authors: Julius Byamukama, Robert Mugabi, Dorothy Nakimbugwe, John Muyonga
Mark's comments: "It's good to see response surface methods for optimization of food recipes via mixture design. I appreciate publications that include all the data needed to assess the predictive modeling. Kudus to EU for funding research like this that alleviates malnutrition in vulnerable populations."
Enhanced 4-chlorophenol adsorption from aqueous solution using eco-friendly nanocomposite
Ecological Engineering & Environmental Technology, 26(5), pp.174-189
Authors: Fadia A. Sulaiman, Rasha Khalid Sabri Mhemid, Noor A. Mohammed
Mark's comments: It is great to see the application of response surface methods (RSM) to reduce the release of toxic chlorophenols into our environment, particularly via such an eco-friendly process utilizing a natural polymer, xanthan gum. The 3D graphics are compelling and well supported by the reported statistics. I also appreciate that all the raw data is included, making it possible for me to reproduce the results.
Be sure to check out this important study, and the other research listed below!
Ideally all variables other than those included in an experiment are held constant or blocked out in a controlled fashion. However, sometimes a variable that one knows will create an important effect, such as ambient temperature or humidity, cannot be controlled. In such cases it pays to collect measurements run by run. Then the results can be analyzed with and without this ‘covariate.’
Douglas Montgomery provides a great example of analysis of covariance in section 15.3 of his textbook Design and Analysis of Experiments. It details a simple comparative experiment aimed at assessing the breaking strength in pounds of monofilament fiber produced by three machines. The process engineer collected five samples at random from each machine, measured the diameter of each (knowing this could affect the outcome), and tested them for strength. The results by machine are shown below, with the diameters, measured in mils (thousandths of an inch), provided in parentheses:
The data on diameter can be easily captured via a second response column alongside the strength measures. Montgomery reports that "there is no reason to believe that machines produce fibers of different diameters." Therefore, creating a new factor column, copying in the diameters, and regressing out their impact on strength leads to a clearer view of the differences attributable to the machines.
I will now show you the procedure for handling a covariate with Stat-Ease software. Before doing so, however, analyze the experiment as planned and save this work so you can make a before-and-after comparison.
Figure 1 illustrates how to insert a new factor. As seen in the screenshot, I recommend this be done before the first controlled factor.
Figure 1: Inserting a new factor column for the covariate entered initially as a response
The Edit Info dialog box then appears. Type in the name and units of measure for the covariate and the actual range from low to high.
Figure 2: Detailing the covariate as a factor, including the actual range
Press “Yes” to confirm the change in actual values when the warning pops up.
Figure 3: Warning about actual values.
After the new factor column appears, the rows will be crossed out. However, when you copy over the covariate data, the software stops being so ‘cross’ (pun intended).
Press ahead to the analysis. Include only the main effect of the covariate in your model. The remaining terms, involving the controlled factors, may go beyond linear if estimable. As a start, select the same terms as you did before adding the covariate.
In this case, the model must be linear because there is only one factor (machine) and it is categorical. The p-value on the machine effect increases from 0.0442 (significant at p < 0.05) with only the machine modeled, not the diameter, to 0.1181 (not significant!) with diameter included as a covariate. The story becomes even more interesting upon viewing the effects plots.
Figure 4: No covariate.
Figure 5: With covariate accounted for.
You can see that the least significant difference (LSD) bars decrease considerably from Figure 4 (without the covariate) to Figure 5 (with it). That is a good sign: the fitting becomes far more precise by taking diameter into account. However, as Montgomery says, the process engineer reaches "exactly the opposite conclusion": Machine 3 looks very weak (literally!) without considering the monofilament diameter, but once the covariate analysis is done, it aligns far more closely with the other two machines.
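For those who prefer to script such analyses, here is a rough equivalent in Python with statsmodels. The numbers are simulated (not Montgomery's actual table) merely to show the mechanics of the before-and-after comparison:

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

rng = np.random.default_rng(7)
machine = np.repeat(["M1", "M2", "M3"], 5)
# Hypothetical diameters (mils); suppose Machine 3's sample happened to run thin
diameter = np.concatenate([rng.normal(25, 2, 5), rng.normal(25, 2, 5), rng.normal(21, 2, 5)])
strength = 10 + 1.2 * diameter + rng.normal(0, 2, 15)   # strength driven by diameter alone

df = pd.DataFrame({"machine": machine, "diameter": diameter, "strength": strength})

plain = smf.ols("strength ~ C(machine)", data=df).fit()
ancova = smf.ols("strength ~ C(machine) + diameter", data=df).fit()
print(sm.stats.anova_lm(plain, typ=2))    # machine may look significant on its own...
print(sm.stats.anova_lm(ancova, typ=2))   # ...but not once diameter is regressed out
```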
In conclusion, this case illustrates the value of recording external variables run by run throughout your experiment whenever possible. They can then be studied via covariate analysis for a more precise model of your factors and their effects.
This case is a bit tricky due to the question of whether fiber strength differs by machine because the machines produce differing diameters, in which case diameter should be modeled as a primary response. A far less problematic example would be an experiment investigating the drying time of different types of paint in an uncontrolled environment: obviously, the type of paint does not affect the temperature or humidity. By recording ambient conditions, the coating researcher could see whether they varied greatly during the experiment and, if so, include the data on these uncontrolled variables in the model via covariate analysis. That would be very wise!
PS: Joe Carriere, a fellow consultant at Stat-Ease, suggested I discuss this topic, which appeals greatly to me as a chemical process engineer. He found the monofilament machine example, which proved very helpful (it was also good to see the statistical results from our software agree with those from the software used by Montgomery).
PPS: For more advice on covariates, see this topic in Help.