# Heatmap of common values between columns in dataframe

## Issue

I have a Pandas dataframe `df` with two columns, `A` and `B` such that all values in `B` would be different for a value in `A`, but two different values in `A` can have duplicate values in `B`.

``````     A     B
ABC1   XYZ XYZ 123
ABC2   XYZ XYZ 123
ABC1   XYZ XYZ 135
ABC3   XYZ PQR 123
ABC2   XYZ PQR 123
ABC3   XYZ XYZ 135
ABC4   XYZ XYZ 135
ABC2   XYZ PQR 987
ABC4   XYZ PQR 123
ABC5   PQR PQR 567
``````

How can I create a heatmap from this dataframe that shows me the number of common `B` values between any two `A` values? I presume I’ll have to create a pivot that might look something like this –

``````         ABC1   ABC2   ABC3   ABC4   ABC5
ABC1      2      1     ...
ABC2      1      3
ABC3      0      1            ...
ABC4      0      0     ...            ...
ABC5      0      0
``````

Any help on how this can be done efficiently?

## Solution

I think we can use `get_dummies` and `np.tensordot` for this.

First, let’s find out for each value of `A` which `B` are present

``````>>> common = pd.get_dummies(df['A']).groupby(df['B']).max()
>>> common
Bar  Foo
B
2    1    0
3    0    1
4    0    1
5    1    1
6    1    0
9    1    0
``````

Then to compute the intersections of each rows we need the dot product of every pair of rows, for that we can use:

``````>>> np.tensordot(common, common, (0, 0))
array([[4, 1],
[1, 3]], dtype=uint8)
>>> pd.DataFrame(np.tensordot(common, common, (0, 0)), columns=common.columns, index=common.columns)
Bar  Foo
Bar    4    1
Foo    1    3
``````

The diagonal shows the number of unique `B` values for each `A` column.