Group by column in Pandas

Pandas supports grouping by one or more columns with the groupby method.

Syntax : Group by a column name in pandas
dataset.groupby('column_name')

The groupby method returns a grouped data frame object (DataFrameGroupBy), on which aggregation operations such as size(), count(), sum() or mean() can then be performed.
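As a rough illustration of aggregating on a grouped object, the sketch below groups by one column and then by a list of columns. The sample data reuses the students from the example on this page; the variable names mean_age and counts are only illustrative.

```python
import pandas

# Small sample data frame (illustrative values)
data = pandas.DataFrame({'Student Name': ['Anil', 'Musk', 'Bill'],
                         'Class': [1, 2, 2],
                         'Age': [6, 7, 8]})

# Group by a single column, then average the Age values within each class
mean_age = data.groupby('Class')['Age'].mean()
print(mean_age)

# Group by more than one column by passing a list of column names
counts = data.groupby(['Class', 'Age']).size()
print(counts)
```

Passing a list of names produces a result indexed by every grouping column, so each distinct (Class, Age) pair gets its own row.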

Example : Get count(*) for every group in pandas
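import pandas
data = pandas.DataFrame({'Student Name' : ['Anil', 'Musk', 'Bill'],
                        'Class' : [1, 2, 2],
                        'Age' : [6, 7, 8]})
# Number of rows in each Class, returned as a Series indexed by Class
data.groupby("Class").size()
# The same counts as a data frame with 'Class' and 'count' columns
data.groupby("Class").size().to_frame('count').reset_index()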


size() returns a Series indexed by Class. to_frame('count') converts that Series into a pandas data frame with a single column named count, so that further data frame operations can be applied. The group key Class stays in the index rather than becoming a column, so reset_index() is called to move Class back into a regular column and restore the default integer index. If reset_index() is not used and Class is later selected as a column, you will get a KeyError.
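The difference can be checked with a short sketch like the one below. The names without_reset and with_reset are hypothetical and exist only for this illustration.

```python
import pandas

data = pandas.DataFrame({'Student Name': ['Anil', 'Musk', 'Bill'],
                         'Class': [1, 2, 2],
                         'Age': [6, 7, 8]})

# Without reset_index(): Class is the index, count is the only column
without_reset = data.groupby('Class').size().to_frame('count')
print(list(without_reset.columns))

# With reset_index(): Class becomes a regular column again
with_reset = without_reset.reset_index()
print(list(with_reset.columns))

# Selecting Class as a column fails while it is still the index
try:
    without_reset['Class']
except KeyError:
    print('Class is the index here, not a column')

# After reset_index() the same selection works
print(with_reset['Class'].tolist())
```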