This basic structure is entirely general (figure 1 A), and is
well described in textbooks such as Duda *et al.* (2001)
and Hastie *et al.* (2001).

Since there are two possible classes, the outcome of a set of predictions relative to the ‘true’ class membership is usually set out as a 2 × 2 matrix, the so-called confusion matrix (figure 1 B), whose cells count the true positives (TP), false positives (FP), true negatives (TN) and false negatives (FN).
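As a minimal sketch of how the four cells are tallied, the following assumes that class labels are coded as 1 (positive) and 0 (negative). The function name `confusion_counts` and the label coding are illustrative assumptions, not taken from the article.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Tally the four cells of a two-class confusion matrix.

    y_true: the 'true' class memberships; y_pred: the predicted classes.
    The label coding (positive=1) is an assumption made for illustration.
    """
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1  # predicted positive, truly positive
            else:
                fp += 1  # predicted positive, truly negative
        else:
            if truth == positive:
                fn += 1  # predicted negative, truly positive
            else:
                tn += 1  # predicted negative, truly negative
    return {"TP": tp, "FP": fp, "TN": tn, "FN": fn}


# Six hypothetical samples, invented purely for this example.
counts = confusion_counts([1, 1, 0, 0, 1, 0], [1, 0, 0, 1, 1, 0])
```

The four counts always sum to the number of samples, which is a convenient sanity check on any tabulated confusion matrix.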

Some metrics derived from the confusion matrix of figure 1,
where *N* is the total number of samples and *a*, *b*, *c*, *d* refer
to numbers rather than percentages.
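The table that this caption belongs to is not reproduced here, so the sketch below computes a few widely used metrics from raw counts. It is written in terms of TP, FP, TN and FN rather than *a*, *b*, *c*, *d*, because the mapping between the two notations is not given in this text. The function name `basic_metrics` and the choice of metrics are assumptions and may not match the original table.

```python
def basic_metrics(tp, fp, tn, fn):
    """Compute common two-class metrics from raw counts (not percentages).

    The selection of metrics is illustrative; it is not claimed to be
    the set listed in the article's table.
    """
    n = tp + fp + tn + fn  # N, the total number of samples

    def ratio(numerator, denominator):
        # Return NaN rather than raising when a denominator is zero.
        return numerator / denominator if denominator else float("nan")

    return {
        "accuracy": ratio(tp + tn, n),       # fraction classified correctly
        "sensitivity": ratio(tp, tp + fn),   # true positive rate
        "specificity": ratio(tn, tn + fp),   # true negative rate
        "precision": ratio(tp, tp + fp),     # positive predictive value
    }


# Hypothetical counts, invented purely for this example.
metrics = basic_metrics(tp=2, fp=1, tn=2, fn=1)
```

Working from counts rather than percentages keeps *N* explicit, so that the same proportion obtained from 6 samples and from 6000 samples is not mistaken for equally strong evidence.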

Some methods such as Principal Components Analysis (Jolliffe, 1986)
and a variety of clustering methods (Everitt, 1993; Handl *et al.*,
2005) use only the x-data as defined in figure 1
and are fundamentally designed for what Tukey called Exploratory
Data Analysis (Tukey, 1977).
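To illustrate what using only the x-data means, the sketch below extracts the first principal component of a small data matrix by power iteration on its covariance matrix. No class labels are passed in at any point. Power iteration is chosen here only to keep the example dependency-free; it is not the procedure described by the cited authors, and the function name and data are assumptions.

```python
import math


def first_principal_component(rows, iterations=200):
    """Estimate the first principal component by power iteration.

    rows: list of samples, each a list of variable values (the x-data).
    Returns (loading vector, per-sample scores). Illustrative only: the
    all-ones starting vector fails if it is orthogonal to the leading
    component, which a production implementation would guard against.
    """
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centred = [[r[j] - means[j] for j in range(p)] for r in rows]

    # Sample covariance matrix of the mean-centred variables.
    cov = [[sum(c[i] * c[j] for c in centred) / (n - 1) for j in range(p)]
           for i in range(p)]

    v = [1.0] * p
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            break  # no variance along the current direction
        v = [x / norm for x in w]

    # Scores: projection of each centred sample onto the loading vector.
    scores = [sum(c[j] * v[j] for j in range(p)) for c in centred]
    return v, scores


# Hypothetical data in which the second variable is twice the first.
loadings, scores = first_principal_component([[1, 2], [2, 4], [3, 6], [4, 8]])
```

Because the second variable is exactly twice the first in this invented data set, the loading vector should point along the direction (1, 2), normalised to unit length.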

Either way, the final model can be expressed in terms of a multivariate classifier as defined in figure 1 A.

This image is from the article titled "Statistical strategies for avoiding false discoveries in metabolomics and related experiments"
