How do you join and concatenate two datasets in Python?
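Use `pd.merge` to left-join `t3` with `t2` on the keys `user_id` and `coupon_id`, and store the result back in `t3`. Then compute `this_month_user_receive_same_coupon_lastone` on `t3` as the difference between `max_date_received` (a column brought in from `t2`) and `date_received`. The code is as follows:
t3 = t3.merge(t2, on=['user_id', 'coupon_id'], how='left')
t3['this_month_user_receive_same_coupon_lastone'] = t3.max_date_received - t3.date_received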
Python: concatenating two datasets and join operations
import pandas as pd
t3 = dataset3[['user_id','coupon_id','date_received']]
t3 = pd.merge(t3,t2,on=['user_id','coupon_id'],how='left')
t3['this_month_user_receive_same_coupon_lastone'] = t3.max_date_received - t3.date_received
t3['this_month_user_receive_same_coupon_firstone'] = t3.date_received - t3.min_date_received
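The left join and the two date differences above can be tried on a small table. The following is only a minimal sketch: the sample rows are invented, and the way `t2` is built here (per-pair minimum and maximum of `date_received`) is an assumption, since the original does not show how `t2` is derived. It also assumes `date_received` is a datetime column, so the subtraction yields time deltas.

```python
import pandas as pd

# Hypothetical sample data standing in for dataset3.
dataset3 = pd.DataFrame({
    'user_id': [1, 1, 1, 2],
    'coupon_id': [10, 10, 10, 20],
    'date_received': pd.to_datetime(
        ['2016-07-01', '2016-07-05', '2016-07-09', '2016-07-03']),
})

t3 = dataset3[['user_id', 'coupon_id', 'date_received']]

# Assumed construction of t2: first and last receive date per (user, coupon).
t2 = (
    t3.groupby(['user_id', 'coupon_id'])['date_received']
      .agg(min_date_received='min', max_date_received='max')
      .reset_index()
)

# Left join keeps every row of t3 and attaches the per-pair min/max dates.
t3 = pd.merge(t3, t2, on=['user_id', 'coupon_id'], how='left')
t3['this_month_user_receive_same_coupon_lastone'] = t3.max_date_received - t3.date_received
t3['this_month_user_receive_same_coupon_firstone'] = t3.date_received - t3.min_date_received

# Express the gaps in whole days for easier inspection.
lastone_days = t3['this_month_user_receive_same_coupon_lastone'].dt.days.tolist()
firstone_days = t3['this_month_user_receive_same_coupon_firstone'].dt.days.tolist()
```

A value of 0 in the `lastone` column marks the last time the user received that coupon, and a value of 0 in the `firstone` column marks the first time.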
# Merge on multiple columns
other_feature3 = pd.merge(t1,t,on='user_id')
other_feature3 = pd.merge(other_feature3,t3,on=['user_id','coupon_id'])
other_feature3 = pd.merge(other_feature3,t4,on=['user_id','date_received'])
other_feature3 = pd.merge(other_feature3,t5,on=['user_id','coupon_id','date_received'])
other_feature3 = pd.merge(other_feature3,t7,on=['user_id','coupon_id','date_received'])
other_feature3.to_csv('data/other_feature3.csv',index=False)
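One point worth keeping in mind with the chained merges above: `pd.merge` defaults to `how='inner'`, so each step keeps only the rows whose keys appear in both tables. The sketch below uses invented frames (`base`, `per_user`, `per_user_coupon` are hypothetical names, not the original `t1` to `t7`) to compare the default with `how='left'`.

```python
import pandas as pd

# Hypothetical feature tables keyed at different granularities.
base = pd.DataFrame({
    'user_id': [1, 1, 2],
    'coupon_id': [10, 11, 20],
    'date_received': ['20160701', '20160702', '20160703'],
})
per_user = pd.DataFrame({'user_id': [1, 2], 'user_total': [2, 1]})
per_user_coupon = pd.DataFrame({
    'user_id': [1, 2],
    'coupon_id': [10, 20],
    'same_coupon_count': [1, 1],
})

# Single-key merge: every base row finds its user.
features = pd.merge(base, per_user, on='user_id')

# Multi-key merge with the default inner join: the (1, 11) row has no match and is dropped.
inner = pd.merge(features, per_user_coupon, on=['user_id', 'coupon_id'])

# The same merge with how='left' keeps that row and fills the new column with NaN.
left = pd.merge(features, per_user_coupon, on=['user_id', 'coupon_id'], how='left')
```

If a feature table may lack some key combinations, comparing row counts before and after each merge is a quick way to notice silently dropped rows.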
# Concatenate datasets
# Combine two DataFrames into one
df_train_stmt = pd.concat([df_train_stmt,df_train_stmt_test],axis = 0)
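With `axis=0`, `pd.concat` stacks the frames row-wise and keeps each frame's original index labels, which can leave duplicates. The short sketch below uses two invented frames (`df_a` and `df_b` are hypothetical) to show this, along with the `ignore_index=True` option for renumbering the rows.

```python
import pandas as pd

# Hypothetical frames with the same columns.
df_a = pd.DataFrame({'user_id': [1, 2], 'amount': [10.0, 20.0]})
df_b = pd.DataFrame({'user_id': [3], 'amount': [30.0]})

# Row-wise concatenation; original index labels are preserved.
stacked = pd.concat([df_a, df_b], axis=0)

# Same data, but with a fresh 0..n-1 index.
stacked_reset = pd.concat([df_a, df_b], axis=0, ignore_index=True)
```

Whether to reset the index depends on whether the original labels carry meaning; for plain row numbers, a fresh index avoids ambiguous label lookups later.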

