The pandas cheatsheet covers DataFrame creation, reading data, selection (loc/iloc), groupby, merge, pivot, apply, and data cleaning. Search by operation and copy any example with one click.
How to Use This Pandas Cheatsheet
Pandas is the go-to Python library for data analysis, providing DataFrame and Series objects for structured data manipulation. This cheatsheet covers the operations you'll use daily for data processing and analysis.
DataFrames vs Series
A DataFrame is a 2D table (columns + index). A Series is a 1D column. Most operations work on both. Selecting a single column returns a Series: df['col']. Selecting multiple columns returns a DataFrame: df[['col1', 'col2']].
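As a quick illustration of the two selection forms above, here is a minimal sketch. The column names and values are made up for the example.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration.
df = pd.DataFrame(
    {"name": ["Ann", "Bob"], "age": [31, 45], "dept": ["ops", "eng"]}
)

ages = df["age"]              # single label in brackets -> Series
subset = df[["name", "age"]]  # list of labels in brackets -> DataFrame

print(type(ages).__name__)
print(type(subset).__name__)
```

The double brackets are not special syntax: the outer pair is the indexing operator and the inner pair is an ordinary Python list of column names.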
loc vs iloc
Use df.loc for label-based selection (row index + column names). Use df.iloc for integer position-based selection. df.loc[condition, 'col'] is the most common pattern for filtered column selection.
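The filtered-selection pattern can be sketched as follows. The frame, its string index, and the threshold of 100 are assumptions chosen for the example.

```python
import pandas as pd

# Hypothetical sample data with a string index so labels and positions differ.
df = pd.DataFrame(
    {"dept": ["eng", "ops", "eng"], "salary": [120, 90, 110]},
    index=["ann", "bob", "cy"],
)

# Boolean condition for the rows, column name for the column.
high_paid = df.loc[df["salary"] > 100, "salary"]

# Label-based: row "bob", column "dept".
bob_dept = df.loc["bob", "dept"]

# Position-based: first row, second column.
first_salary = df.iloc[0, 1]

print(high_paid)
```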
Method Chaining
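Most pandas operations return a new DataFrame or Series, which enables method chaining: df.dropna().groupby('dept')['salary'].mean().sort_values(ascending=False). The chain reads like a SQL query: filter, group, aggregate, order. Note that this particular chain ends in a Series (mean salary per department), not a DataFrame.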
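Here is the same chain run on a small, made-up frame. Wrapping it in parentheses is a common way to put one step per line; the data and the variable name avg_by_dept are assumptions for illustration.

```python
import pandas as pd

# Hypothetical sample data; one salary is missing on purpose.
df = pd.DataFrame(
    {
        "dept": ["eng", "eng", "ops", "ops", "hr"],
        "salary": [100.0, 120.0, 80.0, None, 70.0],
    }
)

avg_by_dept = (
    df.dropna()                      # drop the row with the missing salary
      .groupby("dept")["salary"]     # group rows by department
      .mean()                        # average salary per department
      .sort_values(ascending=False)  # highest average first
)

print(avg_by_dept)
```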
Frequently Asked Questions
Is this pandas cheatsheet free?
Yes, completely free with no signup. All examples are copyable.
What is the difference between loc and iloc in pandas?
loc is label-based selection: df.loc[row_label, col_label] uses actual index values and column names. iloc is integer-based: df.iloc[0, 1] uses position (0-indexed). Use loc when your index has meaningful labels, iloc when you need positional access.
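The difference is easiest to see when the index labels are integers that do not match the row positions. This is a small sketch with invented data; the index values 10, 20, 30 are an assumption for the example.

```python
import pandas as pd

# Hypothetical frame whose integer labels differ from row positions.
df = pd.DataFrame(
    {"item": ["a", "b", "c"], "qty": [5, 7, 9]},
    index=[10, 20, 30],
)

by_label = df.loc[20, "item"]  # row labeled 20 -> "b"
by_position = df.iloc[0, 1]    # first row, second column -> 5

# There is no row *labeled* 0, so loc raises KeyError even though
# a row at *position* 0 exists.
try:
    df.loc[0]
    label_zero_exists = True
except KeyError:
    label_zero_exists = False

print(by_label, by_position, label_zero_exists)
```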
How do I handle missing data in pandas?
Use df.isna() (or its alias df.isnull()) to detect missing values. df.dropna() removes every row that contains at least one NaN. df.fillna(value) replaces NaN with a fixed value; for forward fill, use df.ffill(), since fillna(method='ffill') is deprecated as of pandas 2.1. df.interpolate() fills gaps with interpolated values (linear by default). Always check for NaN before analysis.
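The calls above can be tried side by side on a tiny frame. The data is invented, and forward fill is written with the dedicated ffill() method rather than a fillna keyword.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data with one missing value per column.
df = pd.DataFrame(
    {"temp": [20.0, np.nan, 24.0], "site": ["north", None, "south"]}
)

na_counts = df.isna().sum()              # missing values per column
complete_rows = df.dropna()              # keep only rows with no NaN
zero_filled = df["temp"].fillna(0.0)     # replace NaN with a constant
forward_filled = df["temp"].ffill()      # carry the last valid value forward
interpolated = df["temp"].interpolate()  # linear interpolation by default

print(na_counts)
print(interpolated)
```

Which strategy fits depends on the data: dropping rows loses information, while filling or interpolating invents values that downstream analysis will treat as real.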
What is the difference between merge and join in pandas?
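pd.merge() is the main function for database-style joins on columns or indexes. df.join() is a convenience method that joins on the index by default and performs a left join unless told otherwise. Use merge for column-based joins (like SQL) and join for index-based joins; if the two frames share column names, join requires lsuffix or rsuffix to disambiguate them.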
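A minimal sketch of both approaches on the same two tables. The table names, the emp_id key, and the values are assumptions for illustration.

```python
import pandas as pd

# Hypothetical tables sharing an "emp_id" key.
employees = pd.DataFrame({"emp_id": [1, 2, 3], "name": ["Ann", "Bob", "Cy"]})
salaries = pd.DataFrame({"emp_id": [1, 2, 4], "salary": [100, 90, 80]})

# Column-based join, SQL style: only keys present in both tables survive.
merged = pd.merge(employees, salaries, on="emp_id", how="inner")

# Index-based join: move the key into the index first. join() defaults to
# a left join, so every employee is kept and unmatched salaries become NaN.
joined = employees.set_index("emp_id").join(salaries.set_index("emp_id"))

print(merged)
print(joined)
```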
How do I apply a function to every row or column in pandas?
df.apply(func, axis=0) applies func to each column; df.apply(func, axis=1) applies it to each row. For element-wise operations use df.map(func) (pandas 2.1 and later); df.applymap(func) is the older name and is deprecated as of pandas 2.1. For simple transformations, vectorized operations (df['col'] * 2) are faster than apply.
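The three styles can be compared on a small frame. The data is invented, and the hasattr check is only an assumption-level workaround so the sketch runs on pandas versions both before and after DataFrame.map was introduced.

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({"a": [1, 2, 3], "b": [10, 20, 30]})

# axis=0: the function receives each column as a Series.
col_ranges = df.apply(lambda col: col.max() - col.min(), axis=0)

# axis=1: the function receives each row as a Series.
row_sums = df.apply(lambda row: row["a"] + row["b"], axis=1)

# Element-wise: DataFrame.map on newer pandas, applymap on older versions.
elementwise = df.map if hasattr(df, "map") else df.applymap
squared = elementwise(lambda x: x * x)

# Vectorized: no Python-level function call per element.
doubled = df["a"] * 2

print(col_ranges)
print(row_sums)
```

For the row sum in particular, df["a"] + df["b"] gives the same result without apply and is the form to prefer on large frames.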