Note
Screenshots may differ slightly depending on software version.
If you still have the Longley data active in Stat-Ease® software from Part 1 of this tutorial, continue on. If you exited the program, restart it, open the Help, Tutorial Data menu, and select Employment. Under the Design branch of the program, click Evaluation. The software brings up a quadratic polynomial model by default. The screenshot shows the Response field set at “Design Only” as opposed to the Employment response. In other words, it will evaluate the entire matrix of factors, regardless of whether response data are present. The other option (response by response) comes in handy when experimenters end up with missing data, thus degrading the “designed-for” model.
Press the Results tab.
This model is badly aliased. For example, the effect of A is confounded with -24.5 CD, etc. Go back to Model and reduce the Order to Linear.
Press Results again, then move to the Alias Matrix pane and note “No aliases found…” Much better!
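To make the idea of an alias matrix concrete, here is a minimal sketch in Python using one common textbook definition: with X1 holding the terms in the fitted model and X2 the terms left out, the alias matrix is (X1'X1)^-1 X1'X2. The small half-fraction design below is a hypothetical illustration, not the Longley data, and the software's own calculation may differ in detail.

```python
import numpy as np

# Hypothetical 2^(3-1) half-fraction with generator C = AB
# (for illustration only; this is NOT the Longley data).
A = np.array([-1.0, 1.0, -1.0, 1.0])
B = np.array([-1.0, -1.0, 1.0, 1.0])
C = A * B

# X1: terms in the fitted (linear) model; X2: terms left out of it.
X1 = np.column_stack([np.ones(4), A, B, C])
X2 = np.column_stack([A * B, A * C, B * C])

# Alias matrix: each row shows how one fitted coefficient is biased
# by the omitted terms (columns AB, AC, BC).
alias = np.linalg.solve(X1.T @ X1, X1.T @ X2)
alias = np.round(alias, 10) + 0.0  # tidy up floating-point noise and -0.0

for name, row in zip(["Intercept", "A", "B", "C"], alias):
    print(name, row)
```

A nonzero entry means the fitted term cannot be separated from the omitted one. A matrix of all zeros corresponds to the “No aliases found…” message.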
Move over to the Degrees of Freedom pane to evaluate them.
Looking over the annotations provided by the software (activated via View, Show Annotation), notice this design flunks the recommendation for pure error df. Of course this really is not a designed experiment, but rather historical data collected by happenstance.
Study the Model Terms section of the evaluation. Do any of the statistics pass the tests suggested for a good design? No!
Now move on to the Leverage report. These statistics come out surprisingly good – none exceeds twice the average.
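The "twice the average" check can be sketched outside the software. Leverage is the diagonal of the hat matrix X(X'X)^-1X', and its average always equals p/n (model terms over runs). The factor values below are synthetic random numbers standing in for a 16-run, 6-factor data set, and the function name `leverages` is made up for this example.

```python
import numpy as np

def leverages(X):
    """Diagonal of the hat matrix X (X'X)^-1 X', computed via a thin QR."""
    Q, _ = np.linalg.qr(X)          # Q has orthonormal columns spanning X
    return np.sum(Q ** 2, axis=1)   # row sums of squares give h_ii

# Synthetic stand-in data (assumption): 16 runs, 6 factors, linear model.
rng = np.random.default_rng(0)
factors = rng.normal(size=(16, 6))
X = np.column_stack([np.ones(16), factors])  # add the intercept column

h = leverages(X)
average = X.shape[1] / X.shape[0]            # p / n
flagged = np.where(h > 2 * average)[0]       # runs exceeding twice the average

print("average leverage:", average)
print("runs above 2x average:", flagged)
```

An empty `flagged` array matches the situation described above, where no run exceeds twice the average leverage.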
More statistics are available by going back to Model, selecting Options, and checking Matrix Measure and Highlight Correlation Values (if not already selected).
Click OK and view the Results. Now look at the Matrix pane and go to the Matrix Measures tab to see new statistics.
Notice the condition number (12,220) far exceeds 1000, the level above which multicollinearity in a design matrix is generally considered severe. Check out the Correlation Matrix and Pearson’s r panes to see specific correlations and reveal why.
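For a feel of how a condition number behaves, the sketch below uses one common definition: the ratio of the largest to the smallest eigenvalue of the factors' correlation matrix. This is an assumption for illustration; the software may compute its value differently, and the two small data sets are synthetic.

```python
import numpy as np

def condition_number(factors):
    """Ratio of largest to smallest eigenvalue of the correlation matrix."""
    R = np.corrcoef(factors, rowvar=False)   # columns are factors
    eig = np.linalg.eigvalsh(R)              # eigenvalues in ascending order
    return eig[-1] / eig[0]

# Orthogonal two-level factorial: factors are uncorrelated.
orthogonal = np.array([[-1.0, -1.0], [1.0, -1.0], [-1.0, 1.0], [1.0, 1.0]])

# Synthetic near-collinear factors: x2 is x1 plus a tiny perturbation.
x1 = np.linspace(0.0, 1.0, 16)
x2 = x1 + 0.001 * np.sin(np.arange(16))
collinear = np.column_stack([x1, x2])

cond_orthogonal = condition_number(orthogonal)
cond_collinear = condition_number(collinear)
print(cond_orthogonal, cond_collinear)
```

The orthogonal design gives the ideal value of 1, while the near-collinear pair lands far above the 1000 mark.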
Note
Click the blue layout icons on your toolbar to select different pane layouts. Click and drag the tab for each pane to the different sections to customize your view.
The Correlation Matrix shows how the factors are correlated with one another on a scale of -1 (perfect negative correlation) to +1 (perfect positive correlation). These correlations are shown in a grid form and color coded to see at a glance where there may be issues. Remember, we don’t want our factors to be correlated. We want independent estimates of how they affect the responses. Therefore, white boxes on the grid are good. By just glancing at this grid, you can see there are a lot of correlations among factors (dark blue and red colors). It’s no wonder Longley picked this data set to test regression software! The Pearson’s r matrix shows Pearson’s correlation coefficients. It’s just a different way of calculating correlation. You can learn more about that by clicking on the tips icon.
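As a rough illustration of what a Pearson's r grid reports, the sketch below computes r for each pair of columns and lists the pairs whose correlation is strong. The column names, the synthetic values, and the 0.8 cutoff are all assumptions made for this example. They are not taken from the software or from the Longley data.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length samples."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()
    dy = y - y.mean()
    return float(dx @ dy / np.sqrt((dx @ dx) * (dy @ dy)))

def correlated_pairs(columns, threshold=0.8):
    """List (name_a, name_b, r) for every pair with |r| >= threshold."""
    names = list(columns)
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson_r(columns[a], columns[b])
            if abs(r) >= threshold:
                pairs.append((a, b, r))
    return pairs

# Synthetic columns (hypothetical names): X2 is a linear function of X1,
# while X3 alternates between -1 and +1 and is only weakly related to X1.
t = np.arange(16.0)
demo = {"X1": t, "X2": 3.0 * t + 5.0, "X3": np.array([-1.0, 1.0] * 8)}

flagged = correlated_pairs(demo)
for a, b, r in flagged:
    print(a, b, round(r, 3))
```

Pairs that show up in `flagged` play the role of the dark blue and red cells on the grid. Pairs that do not are the near-white ones.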
Now, just for fun, press the Graphs tab and select Perturbation from the toolbar.
Notice factors B and F exhibit the most dramatic tracks for standard error. On the Graphs Toolbar select 3D Surface. On the Factors Tool, right-click factor F:Time and change it to X1 axis.
There’s no sense doing anything more. By now it’s clear that this ‘design’ fails all the tests for a good experiment, but that’s generally the nature of the beast for happenstance data.