convert and reformat multiple month abbreviations into integer

Issue

Given the below column:

     col 
  0  NaN
  1  Jan,Apr,Jul,Oct
  2  Jan,Jun,Jul
  3  Apr,May,Oct,Nov
  4  NaN
  ...

How to convert the months abbreviation to month number (int)? and replace tghe comma with a hyphen? The output should look like:

     col 
  0  NaN
  1  01-04-07-10
  2  01-06-07
  3  04-05-10-11
  4  NaN
  ...

Solution

Use Series.replace with dictionary with months and if necesary is added , for replace to -:

d = {',':'-', 'Jan': '01', 'Feb': '02', 'Mar': '03', 'Apr': '04', 
              'May': '05', 'Jun': '06', 'Jul': '07', 'Aug': '08', 
              'Sep': '09', 'Oct': '10', 'Nov': '11', 'Dec': '12'}

df['col'] = df['col'].replace(d, regex=True)
print (df)

0          NaN
1  01-04-07-10
2     01-06-07
3  04-05-10-11
4          NaN

Answered By – jezrael

This Answer collected from stackoverflow, is licensed under cc by-sa 2.5 , cc by-sa 3.0 and cc by-sa 4.0

Leave a Reply

(*) Required, Your email will not be published