Issue
I have a DataFrame like the one below:
import pandas as pd
df = pd.DataFrame({'col1': ['A', 'A', 'B', 'C', 'D', 'D'],
                   'col2': [1, 0, 1, 0, 0, 1]})
col1 | col2 |
---|---|
A | 1 |
A | 0 |
B | 1 |
C | 0 |
D | 0 |
D | 1 |
If any row for a given col1 value has col2 = 1, I want to set col2 to 1 in every row with that same col1 value. The table I want is below:
col1 | col2 |
---|---|
A | 1 |
A | 1 |
B | 1 |
C | 0 |
D | 1 |
D | 1 |
Any help is much appreciated!
Solution
If you only have 0 and 1 in the column, you can use groupby.transform with 'max', since a group's maximum is 1 exactly when the group contains a 1:
df['col2'] = df.col2.groupby(df.col1).transform('max')
df
col1 col2
0 A 1
1 A 1
2 B 1
3 C 0
4 D 1
5 D 1
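For reference, here is a self-contained sketch of the same idea that can be run as-is. It assumes the sample data from the question and uses the column-selection spelling df.groupby('col1')['col2'], which should be equivalent to the attribute form above.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({'col1': ['A', 'A', 'B', 'C', 'D', 'D'],
                   'col2': [1, 0, 1, 0, 0, 1]})

# transform('max') computes the maximum of col2 within each col1 group
# and broadcasts it back to every row of that group.
df['col2'] = df.groupby('col1')['col2'].transform('max')

print(df)
```

Because transform returns a result aligned with the original index, it can be assigned straight back to the column without a merge.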
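More generally, if the column can hold values other than 0 and 1, you can pass a lambda that returns 1 when any value in the group is 1 and otherwise leaves the group unchanged: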
df['col2'] = df.col2.groupby(df.col1).transform(lambda g: 1 if (g == 1).any() else g)
df
col1 col2
0 A 1
1 A 1
2 B 1
3 C 0
4 D 1
5 D 1
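A possible vectorized alternative to the lambda, sketched under the assumption that col2 may hold values other than 0 and 1. The extra rows (the value 5 in group D and the whole group E) are hypothetical and are not part of the question's data; they are only there to show where 'max' would give a different answer.

```python
import pandas as pd

# Hypothetical data: group D contains a 5, and group E has no 1 at all.
df = pd.DataFrame({'col1': ['A', 'A', 'B', 'C', 'D', 'D', 'E', 'E'],
                   'col2': [1, 0, 1, 0, 5, 1, 3, 2]})

# Boolean flag per row: True if any row in the same col1 group has col2 == 1.
has_one = df['col2'].eq(1).groupby(df['col1']).transform('any')

# Replace col2 with 1 where the flag is True; leave other rows untouched.
df['col2'] = df['col2'].mask(has_one, 1)

print(df)
```

Here group D becomes all 1s (with 'max' it would have become 5), while group E keeps its original values because it contains no 1.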
Answered By – Psidom
This answer, collected from Stack Overflow, is licensed under CC BY-SA 2.5, CC BY-SA 3.0, and CC BY-SA 4.0.