Issue
I'm new to coding and feel that to really understand it, I have to truly grasp the concepts.
Quality of life edit:
Why do we write df[df['col a'] == x] instead of just df['col a'] == x when making a search? I understand that with the second expression I would be looking at column values that equal x, but I'd love to know what wrapping it in df[] (making it a list, as I see it) does for the code.
I would love to know the difference between those two and what I am actually doing when I nest the column in a list.
Any help is appreciated. Thank you so much!
Solution
In general, df[index] selects slices from a dataframe based on an index.
Pandas supports several different indexing methods. The expression in your question chains two of them together. First, the inner index df['col_a'] selects all values in column col_a. These values are then compared against x in a boolean expression, which returns a series (a "mask") holding True where the value in the column meets the condition and False elsewhere. The outer df[...] is not a list: it uses boolean indexing to select all rows of the entire dataframe where the mask is True.
Example:
import pandas as pd
df = pd.DataFrame({'column1': ['a', 'b', 'c', 'd', 'e'], 'column2': ['x', 'x', 'x', 'y', 'y']})
[In] df
[Out]
  column1 column2
0       a       x
1       b       x
2       c       x
3       d       y
4       e       y
Selecting a single column:
[In] df['column2']
[Out]
0 x
1 x
2 x
3 y
4 y
Name: column2, dtype: object
Creating a mask:
[In] df['column2'] == 'x'
[Out]
0 True
1 True
2 True
3 False
4 False
Name: column2, dtype: bool
Selecting all rows that have value x in column column2:
[In] df[df['column2'] == 'x']
[Out]
  column1 column2
0       a       x
1       b       x
2       c       x
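To make the two steps easier to see, here is a minimal sketch that stores the mask in a variable before using it. It assumes pandas is installed and imported as pd, and it uses string values in column1. The names mask and selected are only illustrative and are not part of pandas.

```python
import pandas as pd

df = pd.DataFrame({
    'column1': ['a', 'b', 'c', 'd', 'e'],
    'column2': ['x', 'x', 'x', 'y', 'y'],
})

# Step 1: the comparison on its own returns a boolean Series (the mask),
# one True/False per row. It does not return any rows of df.
mask = df['column2'] == 'x'

# Step 2: passing that mask to df[...] keeps only the rows where it is True.
selected = df[mask]

# The one-line form from the question is the same two steps chained together.
assert selected.equals(df[df['column2'] == 'x'])

print(mask.tolist())
print(selected['column1'].tolist())
```

Written this way, df['column2'] == 'x' answers "which rows match?", while df[mask] answers "give me those rows".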
Answered By – lua
This Answer collected from stackoverflow, is licensed under cc by-sa 2.5 , cc by-sa 3.0 and cc by-sa 4.0