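pandas is a data analysis library for Python that provides functions and data structures for manipulating and analyzing data. It is designed for tabular and heterogeneous data, which makes it well suited to cleaning, organizing, and processing data.
Below are 20 commonly used pandas examples: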
import pandas as pd
# Read a CSV file into a DataFrame
df = pd.read_csv('path/to/file.csv')
# Read an Excel file into a DataFrame
df = pd.read_excel('path/to/file.xlsx')
# Read a JSON file into a DataFrame
df = pd.read_json('path/to/file.json')
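The file paths above are placeholders. As a minimal sketch, the same `pd.read_csv` call can be tried without any file on disk by reading from an in-memory buffer. The column names and values here are made up for illustration.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for 'path/to/file.csv'
csv_text = "name,score\nalice,90\nbob,85\n"

# read_csv accepts a file-like object as well as a path
df = pd.read_csv(io.StringIO(csv_text))

print(df.shape)          # (rows, columns)
print(list(df.columns))  # header names taken from the first line
```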
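# Filter rows where a column equals a given value
df_new = df[df['column_name'] == 'column_value']
# Select a subset of columns
df_new = df[['column_name1', 'column_name2']]
# Drop duplicate rows
df_new = df.drop_duplicates()
# Replace missing values with a placeholder
df_new = df.fillna('missing')
# Convert a column to int64 (raises if the column holds missing or non-numeric values)
df_new = df.astype({'column_name': 'int64'})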
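The cleaning calls above are often chained. The following is a minimal sketch on made-up data, assuming a hypothetical `visits` column. It fills missing values with a numeric placeholder rather than a string so that the later integer conversion succeeds.

```python
import numpy as np
import pandas as pd

# Hypothetical data with one duplicate row and one missing value
df = pd.DataFrame({
    "city": ["Paris", "Paris", "Rome", "Oslo"],
    "visits": [3.0, 3.0, np.nan, 7.0],
})

deduped = df.drop_duplicates()                # removes the repeated "Paris" row
filled = deduped.fillna({"visits": 0})        # numeric placeholder keeps the column numeric
cleaned = filled.astype({"visits": "int64"})  # works because no NaN remains in the column

print(cleaned)
```

Filling with a string such as `'missing'` would turn the column into mixed text and numbers, and `astype('int64')` would then raise an error.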
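# Merge two DataFrames on a shared key column; `how` selects the join type
df_new = pd.merge(df1, df2, on='column_name', how='inner')  # keys present in both
df_new = pd.merge(df1, df2, on='column_name', how='left')   # every row of df1
df_new = pd.merge(df1, df2, on='column_name', how='right')  # every row of df2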
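To make the difference between the three join types concrete, here is a small sketch with two made-up tables that share a hypothetical key column `id`. Only the keys 2 and 3 appear in both tables.

```python
import pandas as pd

# Hypothetical tables sharing the key column "id"
df1 = pd.DataFrame({"id": [1, 2, 3], "name": ["a", "b", "c"]})
df2 = pd.DataFrame({"id": [2, 3, 4], "score": [20, 30, 40]})

inner = pd.merge(df1, df2, on="id", how="inner")  # ids found in both tables
left = pd.merge(df1, df2, on="id", how="left")    # all ids from df1
right = pd.merge(df1, df2, on="id", how="right")  # all ids from df2

print(inner["id"].tolist())
print(left["id"].tolist())
print(right["id"].tolist())
```

Rows without a match on the other side are kept by the left and right joins, with the missing columns filled with NaN.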
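# Group rows by a column; this returns a GroupBy object, nothing is computed yet
grouped = df.groupby('column_name')
# Sum one column within each group
grouped = df.groupby('column_name')['column_name1'].sum()
# Apply a different aggregation to each column
grouped = df.groupby('column_name').agg({'column_name1': 'mean', 'column_name2': 'max'})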
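The grouping calls above can be tried on a small table. This is a sketch with made-up data; the column names `team`, `points`, and `age` are hypothetical.

```python
import pandas as pd

# Hypothetical data: two teams with two members each
df = pd.DataFrame({
    "team": ["x", "x", "y", "y"],
    "points": [1, 3, 5, 7],
    "age": [20, 30, 40, 50],
})

# Sum of one column per group; the result is a Series indexed by team
totals = df.groupby("team")["points"].sum()

# A different aggregation per column; the result is a DataFrame indexed by team
summary = df.groupby("team").agg({"points": "mean", "age": "max"})

print(totals)
print(summary)
```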
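# Pivot table: one row per index_column value, one column per column_name value
pivot = pd.pivot_table(df, values='value', index='index_column', columns='column_name')
# Same table with the aggregation stated explicitly ('mean' is already the default)
pivot = pd.pivot_table(df, values='value', index='index_column', columns='column_name', aggfunc='mean')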
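The two pivot calls above produce the same table, because `pivot_table` averages by default. The sketch below checks this on made-up sales data and also shows a sum for comparison. The column names are hypothetical.

```python
import pandas as pd

# Hypothetical sales records; "north"/"a" appears twice so aggregation matters
df = pd.DataFrame({
    "region": ["north", "north", "north", "south"],
    "product": ["a", "a", "b", "a"],
    "sales": [10, 20, 5, 8],
})

pivot_default = pd.pivot_table(df, values="sales", index="region", columns="product")
pivot_mean = pd.pivot_table(df, values="sales", index="region", columns="product",
                            aggfunc="mean")
pivot_sum = pd.pivot_table(df, values="sales", index="region", columns="product",
                           aggfunc="sum")

print(pivot_default.equals(pivot_mean))  # the default aggregation is the mean
print(pivot_sum)
```

Combinations that never occur in the data, such as "south" with product "b" here, show up as NaN in the result.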
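# Sort rows by a column in descending order
df_new = df.sort_values('column_name', ascending=False)
# Summary statistics for a numeric column
mean = df['column_name'].mean()
std = df['column_name'].std()  # sample standard deviation (ddof=1 by default)
min_value = df['column_name'].min()
max_value = df['column_name'].max()
median = df['column_name'].median()
q1 = df['column_name'].quantile(0.25)  # first quartile
q3 = df['column_name'].quantile(0.75)  # third quartile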
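As a worked example of the statistics above, the sketch below uses a made-up column holding the integers 1 to 9, so the results are easy to check by hand. The column name `value` is hypothetical.

```python
import pandas as pd

# Hypothetical column with the values 1..9
df = pd.DataFrame({"value": [1, 2, 3, 4, 5, 6, 7, 8, 9]})

mean = df["value"].mean()
median = df["value"].median()
std = df["value"].std()           # sample standard deviation (divides by n - 1)
q1 = df["value"].quantile(0.25)   # linear interpolation between data points by default
q3 = df["value"].quantile(0.75)
iqr = q3 - q1                     # interquartile range

print(mean, median, std)
print(q1, q3, iqr)
```

The interquartile range computed from `q1` and `q3` is a common follow-up, for instance when flagging outliers.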
# Rename a column
df_new = df.rename(columns={'old_column_name': 'new_column_name'})
# Drop a column
df_new = df.drop(columns=['column_name'])
Notes: