# Build a confusion matrix from two vectors

## Issue

I need to define a function that generates a confusion matrix. I have two vectors, `y_label` and `y_predict`, whose elements are each 0, 1, or 2. The function should count how often each (label, prediction) pair occurs:

``````
  | 0 | 1 | 2 |
--------------
0 |   |   |   |
--------------
1 |   |   |   |
--------------
2 |   |   |   |
--------------
``````

For example, `cm[0, 1]` should contain the number of indices `i` for which `y_label[i] == 0` and `y_predict[i] == 1`.

So far, this is what I’ve done:

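``````
import numpy as np

def get_confusion_matrix(y_label, y_predict):
    # 3x3 matrix of zeros: rows are labels, columns are predictions
    cm = np.zeros((3, 3), dtype=int)

    for i in range(3):
        for j in range(3):
            cm[i, j] = ....  # count where label is i and prediction is j

    return cm
``````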

Of course, I could count with nested `for` loops, but I want to avoid that if Python / NumPy offers a shortcut.

I’m also thinking of merging `y_label` and `y_predict` into an array of tuples and then using the dict-zip technique, similar to this question:

How to count the occurrence of certain item in an ndarray in Python?

But the solution is still a bit hazy in my head. Please confirm whether this approach would also work.

## Solution

Here’s a quick way to create the confusion matrix, using `numpy.add.at`.

First, here’s some sample data:

``````
In [93]: y_label
Out[93]: array([2, 2, 0, 0, 1, 0, 0, 2, 1, 1, 0, 0, 1, 2, 1, 0])

In [94]: y_predict
Out[94]: array([2, 1, 0, 0, 0, 0, 0, 1, 0, 2, 2, 1, 0, 0, 2, 2])
``````

Create the array `cm` containing zeros, and then add 1 at each index `(y_label[i], y_predict[i])`:

``````
In [95]: cm = np.zeros((3, 3), dtype=int)

In [96]: np.add.at(cm, (y_label, y_predict), 1)

In [97]: cm
Out[97]:
array([[4, 1, 2],
       [3, 0, 2],
       [1, 2, 1]])
``````
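For completeness, here is a minimal sketch that wraps the `np.add.at` approach in the function from the question, next to a variant of the tuple idea the question mentions (pairing the two vectors with `zip` and counting the pairs). The `n_classes` parameter and the name `get_confusion_matrix_counter` are assumptions added for illustration; they do not appear in the original post.

```python
from collections import Counter

import numpy as np


def get_confusion_matrix(y_label, y_predict, n_classes=3):
    """Count (label, prediction) pairs with np.add.at."""
    y_label = np.asarray(y_label)
    y_predict = np.asarray(y_predict)
    cm = np.zeros((n_classes, n_classes), dtype=int)
    # Unbuffered add: every repeated (row, col) pair contributes 1.
    np.add.at(cm, (y_label, y_predict), 1)
    return cm


def get_confusion_matrix_counter(y_label, y_predict, n_classes=3):
    """Hypothetical variant: count zipped pairs with collections.Counter."""
    pairs = zip(np.asarray(y_label).tolist(), np.asarray(y_predict).tolist())
    pair_counts = Counter(pairs)
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for (i, j), count in pair_counts.items():
        cm[i, j] = count
    return cm


# Sample data from the answer above.
y_label = np.array([2, 2, 0, 0, 1, 0, 0, 2, 1, 1, 0, 0, 1, 2, 1, 0])
y_predict = np.array([2, 1, 0, 0, 0, 0, 0, 1, 0, 2, 2, 1, 0, 0, 2, 2])

cm = get_confusion_matrix(y_label, y_predict)
cm_counter = get_confusion_matrix_counter(y_label, y_predict)
print(cm)
print(np.array_equal(cm, cm_counter))
```

So the tuple-based idea should also work, although it still loops over the distinct pairs in Python. Note that plain `cm[y_label, y_predict] += 1` is not a substitute for `np.add.at`: with fancy indexing, an index pair that appears several times is incremented only once.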