Learn together, grow together!
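Querying a DataFrame is one of the most frequent operations in data processing and analysis. The Excel data files we usually work with, for example, have exactly this kind of tabular structure, so knowing how to query (or filter) it matters a great deal. The session below works on a DataFrame with a date index and assumes that `import numpy as np`, `import pandas as pd`, `from pandas import DataFrame` and `from datetime import datetime` were run earlier in the session. The details are as follows: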
In [41]: dates=pd.date_range('1/1/2019',periods=100,freq='W-WED')
In [42]: long_df=DataFrame(np.random.randn(100,4),index=dates,columns=['字段1','字段2','字段3','字段4'])
In [43]: long_df.loc['1/2019']
Out[43]:
字段1 字段2 字段3 字段4
2019-01-02 -0.612222 -1.040934 2.082731 1.348500
2019-01-09 -0.335291 0.381831 0.744737 0.651845
2019-01-16 0.286719 0.661222 0.064456 1.137021
2019-01-23 1.217550 -0.099077 1.297057 -0.570431
2019-01-30 -0.852166 -0.794072 1.374697 0.260344
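Selecting rows by a partial date string is most portable through `.loc`. As far as I know, pandas 2.0 and later no longer accept a bare date string inside `[]` on a DataFrame, because it is read as a column label. Here is a minimal sketch of the month selection; it assumes deterministic values from `np.arange` in place of the random data, so the result can be reproduced.

```python
import numpy as np
import pandas as pd

# Same weekly (Wednesday) index and column names as in the session,
# but with deterministic values instead of random ones (an assumption
# made only so that the result is reproducible).
dates = pd.date_range('1/1/2019', periods=100, freq='W-WED')
long_df = pd.DataFrame(np.arange(400).reshape(100, 4),
                       index=dates,
                       columns=['字段1', '字段2', '字段3', '字段4'])

# Partial-string selection: every row whose date falls in January 2019.
january = long_df.loc['2019-01']
print(january)
```

The string `'2019-01'` is less precise than the daily timestamps in the index, so pandas treats it as "the whole month" and returns all five Wednesdays.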
In [44]: long_df[['字段1','字段2']].loc['1/2019']
Out[44]:
字段1 字段2
2019-01-02 -0.612222 -1.040934
2019-01-09 -0.335291 0.381831
2019-01-16 0.286719 0.661222
2019-01-23 1.217550 -0.099077
2019-01-30 -0.852166 -0.794072
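The column selection and the date filter can also be combined into a single `.loc` call, which avoids building an intermediate DataFrame. The sketch below again assumes deterministic stand-in values, and `subset` is just an illustrative name.

```python
import numpy as np
import pandas as pd

# Deterministic stand-in for the random data used in the session.
dates = pd.date_range('1/1/2019', periods=100, freq='W-WED')
long_df = pd.DataFrame(np.arange(400).reshape(100, 4),
                       index=dates,
                       columns=['字段1', '字段2', '字段3', '字段4'])

# Rows: January 2019. Columns: the first two fields. One .loc call.
subset = long_df.loc['2019-01', ['字段1', '字段2']]
print(subset)
```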
In [45]: long_df[['字段1','字段2']]['1/2/2019':'1/10/2019']
Out[45]:
字段1 字段2
2019-01-02 -0.612222 -1.040934
2019-01-09 -0.335291 0.381831
In [46]: long_df[['字段1','字段2']][:'1/10/2019']
Out[46]:
字段1 字段2
2019-01-02 -0.612222 -1.040934
2019-01-09 -0.335291 0.381831
In [47]: long_df[['字段1','字段2']][:'2/10/2019']
Out[47]:
字段1 字段2
2019-01-02 -0.612222 -1.040934
2019-01-09 -0.335291 0.381831
2019-01-16 0.286719 0.661222
2019-01-23 1.217550 -0.099077
2019-01-30 -0.852166 -0.794072
2019-02-06 -0.029565 0.044062
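Two things are worth noting about the string slices above: both endpoints are inclusive, and the endpoints do not have to exist in the index. A month-first string such as `'2/10/2019'` is read as 10 February, so ISO-style strings (`'2019-02-10'`) are less ambiguous. A small sketch with deterministic stand-in values (the variable names are illustrative):

```python
import numpy as np
import pandas as pd

# Deterministic stand-in for the random data used in the session.
dates = pd.date_range('1/1/2019', periods=100, freq='W-WED')
long_df = pd.DataFrame(np.arange(400).reshape(100, 4),
                       index=dates,
                       columns=['字段1', '字段2', '字段3', '字段4'])
cols = ['字段1', '字段2']

# Closed range: both boundary dates are included if they are in the index.
first_two = long_df.loc['2019-01-02':'2019-01-10', cols]

# Open start: everything up to and including 10 February 2019.
up_to_feb10 = long_df.loc[:'2019-02-10', cols]

# The month-first form used in the session denotes the same day.
same_day = pd.Timestamp('2/10/2019') == pd.Timestamp('2019-02-10')

print(first_two)
print(up_to_feb10)
print(same_day)
```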
In [53]: long_df[['字段1','字段2']][datetime(2019,1,5):datetime(2019,2,15)]
Out[53]:
字段1 字段2
2019-01-09 -0.335291 0.381831
2019-01-16 0.286719 0.661222
2019-01-23 1.217550 -0.099077
2019-01-30 -0.852166 -0.794072
2019-02-06 -0.029565 0.044062
2019-02-13 0.783809 -0.098006
In [58]: long_df[['字段1','字段2']][:datetime(2019,2,6)]
Out[58]:
字段1 字段2
2019-01-02 -0.612222 -1.040934
2019-01-09 -0.335291 0.381831
2019-01-16 0.286719 0.661222
2019-01-23 1.217550 -0.099077
2019-01-30 -0.852166 -0.794072
2019-02-06 -0.029565 0.044062
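Slicing with `datetime` objects behaves like slicing with strings: the boundaries are inclusive and need not be present in the index. The following sketch repeats the two slices with `.loc`, using deterministic stand-in values so the row counts can be checked.

```python
from datetime import datetime

import numpy as np
import pandas as pd

# Deterministic stand-in for the random data used in the session.
dates = pd.date_range('1/1/2019', periods=100, freq='W-WED')
long_df = pd.DataFrame(np.arange(400).reshape(100, 4),
                       index=dates,
                       columns=['字段1', '字段2', '字段3', '字段4'])
cols = ['字段1', '字段2']

# Neither 5 January nor 15 February is a Wednesday, so the slice
# starts at the first index value on or after the lower bound and
# ends at the last one on or before the upper bound.
window = long_df.loc[datetime(2019, 1, 5):datetime(2019, 2, 15), cols]

# 6 February is in the index and is included in the result.
head = long_df.loc[:datetime(2019, 2, 6), cols]

print(window)
print(head)
```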
In [60]: long_df[['字段1','字段2']].truncate(after='2/6/2019')
Out[60]:
字段1 字段2
2019-01-02 -0.612222 -1.040934
2019-01-09 -0.335291 0.381831
2019-01-16 0.286719 0.661222
2019-01-23 1.217550 -0.099077
2019-01-30 -0.852166 -0.794072
2019-02-06 -0.029565 0.044062
In [62]: long_df[['字段1','字段2']].truncate(before='1/20/2019',after='2/6/2019')
Out[62]:
字段1 字段2
2019-01-23 1.217550 -0.099077
2019-01-30 -0.852166 -0.794072
2019-02-06 -0.029565 0.044062
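On a sorted date index, `truncate(before=..., after=...)` should give the same rows as an inclusive `.loc` slice. One difference I am aware of is that `truncate` refuses to work on an unsorted index. The comparison below is a sketch with deterministic stand-in values:

```python
import numpy as np
import pandas as pd

# Deterministic stand-in for the random data used in the session.
dates = pd.date_range('1/1/2019', periods=100, freq='W-WED')
long_df = pd.DataFrame(np.arange(400).reshape(100, 4),
                       index=dates,
                       columns=['字段1', '字段2', '字段3', '字段4'])
cols = ['字段1', '字段2']

# Drop everything before 20 January and after 6 February (both inclusive).
truncated = long_df[cols].truncate(before='2019-01-20', after='2019-02-06')

# The same window expressed as a label slice.
sliced = long_df.loc['2019-01-20':'2019-02-06', cols]

print(truncated)
print(truncated.equals(sliced))
```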