Creating and Using Pivot Tables in Pandas

Pandas provide a way to summarize and analyze your data set in a spreadsheet-like format. They are incredibly versatile for aggregating, sorting, and filtering data. Let's explore how to use pivot tables in Pandas, from basic to advanced concepts.
20 Videos
No Coding Experience Required
45 Assignments
Self Paced
An abstract design featuring smooth curves and geometric shapes, creating a minimalist aesthetic.

Sign Up For Free

Join now for expert-led courses, hands-on exercises, and a supportive learning community!

Pivot tables

in Pandas provide a way to summarize and analyze your data set in a spreadsheet-like format. They are incredibly versatile for aggregating, sorting, and filtering data. Let's explore how to use pivot tables in Pandas, from basic to advanced concepts.

Advanced: Custom Aggregates with aggfunc

You're not limited to predefined functions. You can define your own aggregation function.

# Custom aggregation function to find the range of salaries
def range_salary(series):
    return series.max() - series.min()

pivot_custom_agg = pd.pivot_table(df,
                                  values='Salary',
                                  index='Department',
                                  aggfunc=range_salary)
print(pivot_custom_agg)

Scenario: Sales Data Analysis

Imagine we have a dataset containing sales transactions over a year for a multi-branch company. This dataset includes the transaction date, branch location, department, salesperson, and sales amount. We want to analyze monthly sales trends per branch, performance by department within each branch, and identify top-performing salespersons.

import pandas as pd
import numpy as np

# Sample data
data = {
    'Date': pd.date_range(start='2023-01-01', periods=120, freq='D'),
    'Branch': np.random.choice(['North', 'South', 'East', 'West'], 120),
    'Department': np.random.choice(['Electronics', 'Clothing', 'Home & Garden'], 120),
    'Salesperson': np.random.choice(['Alice', 'Bob', 'Charlie', 'Diane'], 120),
    'Sales': np.random.randint(100, 1000, size=120)
}

df = pd.DataFrame(data)

# Ensure 'Date' is datetime type
df['Date'] = pd.to_datetime(df['Date'])
df.head()

Analyzing Monthly Sales Trends Per Branch

monthly_sales = pd.pivot_table(df,
                               values='Sales',
                               index=[pd.Grouper(key='Date', freq='M'), 'Branch'],
                               aggfunc=np.sum)
print(monthly_sales)

Performance by Department Within Each Branch

department_performance = pd.pivot_table(df,
                                        values='Sales',
                                        index=['Branch', 'Department'],
                                        columns=pd.Grouper(key='Date', freq='M'),
                                        aggfunc=np.sum,
                                        fill_value=0)
print(department_performance)

Identifying Top-Performing Salespersons

top_salespersons = pd.pivot_table(df,
                                   values='Sales',
                                   index=['Branch', 'Salesperson'],
                                   aggfunc={'Sales': [np.sum, np.mean, 'count']},
                                   fill_value=0).sort_values(by=['sum'], ascending=False)
print(top_salespersons)
# Pivot table to calculate monthly sales
monthly_growth = pd.pivot_table(df,
                                values='Sales',
                                index=pd.Grouper(key='Date', freq='M'),
                                aggfunc=np.sum)

# Calculate month-over-month growth rate
monthly_growth['Growth Rate'] = monthly_growth.pct_change().fillna(0)
print(monthly_growth)
# Sample DataFrames
df_patients = pd.DataFrame({
    'Patient ID': [101, 102, 103],
    'Name': ['Alice', 'Bob', 'Charlie'],
    'DOB': ['1990-01-01', '1985-05-12', '1978-11-30'],
    'Gender': ['F', 'M', 'M']
})

df_records = pd.DataFrame({
    'Record ID': ['R1', 'R2', 'R3'],
    'Patient ID': [101, 102, 101],
    'Diagnosis': ['Flu', 'Cold', 'Allergy'],
    'Treatment Date': ['2023-03-01', '2023-03-05', '2023-03-08']
})

# Merging DataFrames
df_merged_healthcare = pd.merge(df_patients, df_records, on='Patient ID')

df_merged_healthcare

Conclusion

These examples demonstrate the power of pivot tables in Pandas for advanced data analysis. By manipulating and aggregating data in various ways, we can extract meaningful insights about sales trends, department performance, and individual contributions within a company. The ability to customize aggregation functions and leverage hierarchical indexes allows for in-depth exploration of complex datasets.

Lesson Assignment
Challenge yourself with our lab assignment and put your skills to test.
# Python Program to find the area of triangle

a = 5
b = 6
c = 7

# Uncomment below to take inputs from the user
# a = float(input('Enter first side: '))
# b = float(input('Enter second side: '))
# c = float(input('Enter third side: '))

# calculate the semi-perimeter
s = (a + b + c) / 2

# calculate the area
area = (s*(s-a)*(s-b)*(s-c)) ** 0.5
print('The area of the triangle is %0.2f' %area)
Sign up to get access to our code lab and run this code.
AI icon

AI Assistant For Help

Enhance your learning experience with our AI Learning Assistant. This sophisticated tool seamlessly evaluates your progress, course materials, and code, providing customized feedback and suggestions on the spot.
development icon

Flexible Mobile Coding

Engage with your coding tasks anytime, anywhere. Our adaptable, mobile optimized IDE lets you execute programming tasks directly from any web enabled device.
web
search icon

Project Development Support

Navigate through project challenges effortlessly with AI- powered support and swift access to a resource- rich community network.
file sharing icon

On-Demand Documentation

Quickly access integrated, context-specific documentation directly within the learning platform, streamlining your study process without the need to switch applications.
An abstract design featuring smooth curves and geometric shapes, creating a minimalist aesthetic.

Ready to become a Data Scientist that industry loves to hire? Apply Now. 

Explore Courses