Python Pandas : Pivot table : aggfunc concatenate instead of np.size or np.sum

Issue

I have some entries in dataframe like :

name, age, phonenumber
 A,10, Phone1
 A,10,Phone2
 B,21,PhoneB1
 B,21,PhoneB2
 C,23,PhoneC

Here is what I am trying to achieve as result of pivot table:

 name, age, phonenumbers, phonenocount
 A,10, "Phone1,Phone2" , 2
 B,21,  "PhoneB1,PhoneB2", 2
 C,23, "PhoneC" , 1

I was trying something like :

pd.pivot_table(phonedf, index=['name','age','phonenumbers'], values=['phonenumbers'], aggfunc=np.size)

however I want the phone numbers to be concatenated as part of aggfunc.
Any Suggestions ?

Solution

You can use agg function after the groupby:

df.groupby(['name', 'age'])['phonenumber'].\
    agg({'phonecount': pd.Series.nunique, 
         'phonenumber': lambda x: ','.join(x)
        }
       )

#               phonenumber  phonecount
# name  age     
#    A   10   Phone1,Phone2           2
#    B   21 PhoneB1,PhoneB2           2
#    C   23          PhoneC           1

Or a shorter version according to @root and @Jon Clements:

df.groupby(['name', 'age'])['phonenumber'].\
   agg({'phonecount': 'nunique', 'phonenumber': ','.join})

Answered By – Psidom

This Answer collected from stackoverflow, is licensed under cc by-sa 2.5 , cc by-sa 3.0 and cc by-sa 4.0

Leave a Reply

(*) Required, Your email will not be published