Build a confusion matrix from two vectors

Issue

I need to define a function that generates a confusion matrix. I have two vectors, y_label and y_predict, whose elements are each 0, 1, or 2. The goal of the function is to count how often each (label, prediction) pair occurs:

  | 0 | 1 | 2 |
--------------
0 |   |   |   |
--------------
1 |   |   |   |
--------------
2 |   |   |   |
--------------

For example, cm[0,1] should contain counts of elements where y_label[i] = 0 and y_predict[i] = 1, for every i.

So far, this is what I’ve done:

def get_confusion_matrix(y_label, y_predict):

    cm = np.ndarray([3,3])

    for i in range(3):
        for j in range(3):
            cm[i, j] = ....

    return cm

Of course, I could easily count with nested for loops, but I want to avoid that if there is a shortcut in Python / numpy.

I’m also thinking of merging y_label and y_predict into an array of tuples and then counting the tuples with a dict/zip technique, similar to the one described here:

How to count the occurrence of certain item in an ndarray in Python?

But the solution is still a bit hazy in my head. Please confirm whether this is also possible.

Solution

Here’s a quick way to create the confusion matrix, using numpy.add.at.

First, here’s some sample data:

In [93]: y_label
Out[93]: array([2, 2, 0, 0, 1, 0, 0, 2, 1, 1, 0, 0, 1, 2, 1, 0])

In [94]: y_predict
Out[94]: array([2, 1, 0, 0, 0, 0, 0, 1, 0, 2, 2, 1, 0, 0, 2, 2])

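Create the array cm filled with zeros, and then add 1 at each index pair (y_label[i], y_predict[i]). np.add.at performs unbuffered addition, so an index pair that occurs several times is counted every time; a plain cm[y_label, y_predict] += 1 would increment each distinct pair only once: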

In [95]: cm = np.zeros((3, 3), dtype=int)

In [96]: np.add.at(cm, (y_label, y_predict), 1)

In [97]: cm
Out[97]: 
array([[4, 1, 2],
       [3, 0, 2],
       [1, 2, 1]])
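
To connect this back to the function asked for in the question, here is a minimal sketch that wraps the np.add.at call. The n_classes parameter and the second function, get_confusion_matrix_counter, are assumptions added for illustration and are not part of the original answer. The second function is one possible reading of the tuple-counting idea from the question, using collections.Counter over zip(y_label, y_predict); it suggests that approach works too, although it loops over the distinct pairs in Python.

```python
from collections import Counter

import numpy as np


def get_confusion_matrix(y_label, y_predict, n_classes=3):
    """Count (label, prediction) pairs with np.add.at.

    n_classes is a hypothetical parameter; labels are assumed to be
    integers in the range 0 .. n_classes - 1.
    """
    y_label = np.asarray(y_label)
    y_predict = np.asarray(y_predict)
    cm = np.zeros((n_classes, n_classes), dtype=int)
    # Unbuffered in-place add: every occurrence of a pair contributes 1.
    np.add.at(cm, (y_label, y_predict), 1)
    return cm


def get_confusion_matrix_counter(y_label, y_predict, n_classes=3):
    """Same counts via the tuple-counting idea (hypothetical helper)."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    # zip pairs the two vectors element by element; Counter tallies the pairs.
    for (i, j), count in Counter(zip(y_label, y_predict)).items():
        cm[i, j] = count
    return cm


y_label = np.array([2, 2, 0, 0, 1, 0, 0, 2, 1, 1, 0, 0, 1, 2, 1, 0])
y_predict = np.array([2, 1, 0, 0, 0, 0, 0, 1, 0, 2, 2, 1, 0, 0, 2, 2])

print(get_confusion_matrix(y_label, y_predict))
print(get_confusion_matrix_counter(y_label, y_predict))
```

Both calls should print the same 3x3 matrix as Out[97] above for the sample data.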

Answered By – Warren Weckesser

This answer was collected from Stack Overflow and is licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
