A Byte of my 2.2-lb Brain

Just sharing stuff…

pandas groupby Function

Grouping

To group the rows of a data frame by a column, say data frame df by Column 1, use

grouped_df = df.groupby("Column 1")
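Here is a small sketch of the same call on a made-up data frame. The sample data and the "Value" column are my own assumptions for illustration, not anything from a real dataset.

```python
import pandas as pd

# Hypothetical sample data; only the column names "Column 1" and
# "Column 2" come from this post, the rest is made up.
df = pd.DataFrame({
    "Column 1": ["a", "b", "a", "b"],
    "Column 2": ["x", "x", "y", "y"],
    "Value": [1, 2, 3, 4],
})

# Group the rows by the values found in "Column 1"
grouped_df = df.groupby("Column 1")

# Count how many rows landed in each group
group_sizes = grouped_df.size()
print(group_sizes)
```

Nothing is computed until you ask the grouped object for something, such as the group sizes here.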

To group by multiple columns, pass a list of column names to the `DataFrame.groupby()` method instead. Note that the order of the column names in the list matters: it sets the order of the levels in each group key, so the rows are grouped by the first column in the list first, then the second, then the third, etc. So, if I want to group the rows in my df by Column 1 and then by Column 2, the expression is just

grouped_df = df.groupby(["Column 1", "Column 2"])
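To see what the column order does, here is a minimal sketch that groups a made-up data frame both ways and sums a hypothetical "Value" column. The data is assumed purely for illustration.

```python
import pandas as pd

# Hypothetical sample data, made up for this example
df = pd.DataFrame({
    "Column 1": ["a", "b", "a", "b", "a"],
    "Column 2": ["x", "x", "y", "y", "x"],
    "Value": [1, 2, 3, 4, 5],
})

# Group by "Column 1" first, then by "Column 2"
grouped_df = df.groupby(["Column 1", "Column 2"])

# Each group key is now a tuple such as ("a", "x")
sums = grouped_df["Value"].sum()
print(sums)

# Swapping the list order swaps the order of the key levels
swapped = df.groupby(["Column 2", "Column 1"])["Value"].sum()
print(swapped.index.names)
```

The totals per combination are the same either way; only the nesting of the keys changes.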

Accessing the Groups

The groups are identified by keys. To access a group by its key, you first need to know what the keys are. To do this, just build a dictionary that maps each key to the rows in its group:

groupings = dict(list(grouped_df))

Once you have the keys, you can access the rows of each group. For example, if you want to see the rows under the first key, key1, the statements to use are:

key1 = list(groupings.keys())[0]
print(grouped_df.get_group(key1))
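Putting the steps together, here is a rough end-to-end sketch for Python 3. It reads the keys from the `groups` attribute of the grouped object instead of building the dictionary first; the sample data is again made up.

```python
import pandas as pd

# Hypothetical sample data, made up for this example
df = pd.DataFrame({
    "Column 1": ["a", "b", "a", "b"],
    "Column 2": ["x", "x", "y", "y"],
    "Value": [1, 2, 3, 4],
})

grouped_df = df.groupby("Column 1")

# .groups maps each key to the row labels that belong to that group
keys = list(grouped_df.groups.keys())

# Pull out the rows of the first group as an ordinary data frame
first_group = grouped_df.get_group(keys[0])
print(first_group)

# Iterating over the grouped object yields (key, sub-frame) pairs
for key, group in grouped_df:
    print(key, len(group))
```

If you grouped by a list of columns, pass the whole tuple key to `get_group()`.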

Information

This entry was posted on April 20, 2015 in Geek.