How do you join and concatenate two datasets in Python?

2026-05-21 15:21


Use the `merge` function from `pandas` to join two datasets, here with a left join on the `user_id` and `coupon_id` fields. A simplified version of the code:

t3 = pd.merge(t3, t2, on=['user_id', 'coupon_id'], how='left')
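To see what the left join does with rows that have no match, here is a minimal sketch on toy data. The column names follow the article, but the values and the `max_date_received` column in `t2` are made up for illustration. `indicator=True` is a standard `pd.merge` option that records where each row came from.

```python
import pandas as pd

# Hypothetical toy data: column names follow the article, values are invented.
t3 = pd.DataFrame({
    "user_id": [1, 1, 2],
    "coupon_id": [10, 11, 10],
    "date_received": [20160501, 20160503, 20160507],
})
t2 = pd.DataFrame({
    "user_id": [1, 2],
    "coupon_id": [10, 10],
    "max_date_received": [20160520, 20160507],
})

# how='left' keeps every row of t3; rows without a match in t2 get NaN
# in the columns that come from t2. indicator=True adds a `_merge` column
# with the values 'both' or 'left_only'.
merged = pd.merge(t3, t2, on=["user_id", "coupon_id"], how="left", indicator=True)
print(merged)
```

The row for `(user_id=1, coupon_id=11)` has no counterpart in `t2`, so it survives the join with a missing `max_date_received`. With the default inner join it would have been dropped.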

Python: concatenating two datasets and join operations

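import pandas as pd
# Keep only the join keys and the date the coupon was received
t3 = dataset3[['user_id','coupon_id','date_received']]
# Left join; t2 must supply max_date_received and min_date_received per (user_id, coupon_id)
t3 = pd.merge(t3,t2,on=['user_id','coupon_id'],how='left')
# Gap between this receipt and the last / first receipt of the same coupon
t3['this_month_user_receive_same_coupon_lastone'] = t3.max_date_received - t3.date_received
t3['this_month_user_receive_same_coupon_firstone'] = t3.date_received - t3.min_date_received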
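The snippet above does not show how `t2` is built or what type `date_received` has. The following is a sketch under two assumptions that are not in the original: `date_received` is stored as `YYYYMMDD` strings, and `t2` is a per-`(user_id, coupon_id)` aggregate of the maximum and minimum receive date. Converting to datetimes first makes the subtraction yield a true number of days.

```python
import pandas as pd

# Hypothetical stand-in for dataset3; the values are invented.
dataset3 = pd.DataFrame({
    "user_id": [1, 1, 1, 2],
    "coupon_id": [10, 10, 10, 10],
    "date_received": ["20160501", "20160510", "20160520", "20160507"],
})

t3 = dataset3[["user_id", "coupon_id", "date_received"]].copy()
# Assumption: dates arrive as YYYYMMDD strings.
t3["date_received"] = pd.to_datetime(t3["date_received"], format="%Y%m%d")

# Assumed construction of t2: last and first receive date per user and coupon.
t2 = (
    t3.groupby(["user_id", "coupon_id"])["date_received"]
    .agg(max_date_received="max", min_date_received="min")
    .reset_index()
)

t3 = pd.merge(t3, t2, on=["user_id", "coupon_id"], how="left")

# Timedelta columns converted to whole days.
t3["this_month_user_receive_same_coupon_lastone"] = (
    t3["max_date_received"] - t3["date_received"]
).dt.days
t3["this_month_user_receive_same_coupon_firstone"] = (
    t3["date_received"] - t3["min_date_received"]
).dt.days
print(t3)
```

If the dates were left as integers such as `20160520`, the same subtraction would still run, but the result would only equal a day count when both dates fall in the same month.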

# Merge on multiple fields (pd.merge defaults to an inner join, so unmatched rows are dropped)
other_feature3 = pd.merge(t1,t,on='user_id')
other_feature3 = pd.merge(other_feature3,t3,on=['user_id','coupon_id'])
other_feature3 = pd.merge(other_feature3,t4,on=['user_id','date_received'])
other_feature3 = pd.merge(other_feature3,t5,on=['user_id','coupon_id','date_received'])
other_feature3 = pd.merge(other_feature3,t7,on=['user_id','coupon_id','date_received'])
# Write the result without the row index
other_feature3.to_csv('data/other_feature3.csv',index=False)
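A chain of merges like the one above can silently multiply rows when a feature table has duplicate keys, or lose rows when keys are missing. One way to guard against that, sketched here with hypothetical frames (`base`, `per_user`, `per_user_coupon`) and invented feature columns, is to loop over `(frame, keys)` pairs and pass `validate="many_to_one"`, which makes `pd.merge` raise an error if the right-hand keys are not unique.

```python
import pandas as pd

# Hypothetical base table and feature tables; names and values are illustrative only.
base = pd.DataFrame({
    "user_id": [1, 1, 2],
    "coupon_id": [10, 11, 10],
    "date_received": [20160501, 20160503, 20160507],
})
per_user = pd.DataFrame({"user_id": [1, 2], "user_total": [2, 1]})
per_user_coupon = pd.DataFrame({
    "user_id": [1, 1, 2],
    "coupon_id": [10, 11, 10],
    "same_coupon_total": [1, 1, 1],
})

# Each step names the frame to attach and the key columns to join on.
steps = [
    (per_user, ["user_id"]),
    (per_user_coupon, ["user_id", "coupon_id"]),
]

features = base
for frame, keys in steps:
    # validate="many_to_one" raises pandas.errors.MergeError if `keys`
    # are duplicated in `frame`, which would otherwise duplicate rows.
    features = pd.merge(features, frame, on=keys, how="inner", validate="many_to_one")

# An inner join can still drop rows; compare row counts to notice that.
print(len(base), len(features))
```

Comparing the row count before and after the loop is a cheap check that every row of the base table found a match in each feature table.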

# Concatenate datasets
# Stack two DataFrames into one (axis=0 appends rows)
df_train_stmt = pd.concat([df_train_stmt,df_train_stmt_test],axis = 0)
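One detail worth knowing about row-wise `pd.concat`: by default each input keeps its own index labels, so the result can contain repeated labels. A small sketch, with invented columns (`user_id`, `amount`) standing in for the real data, shows the difference that `ignore_index=True` makes.

```python
import pandas as pd

# Hypothetical frames; the article does not show the real columns.
df_train_stmt = pd.DataFrame({"user_id": [1, 2], "amount": [10.0, 20.0]})
df_train_stmt_test = pd.DataFrame({"user_id": [3], "amount": [30.0]})

# Default behaviour: original index labels are kept, so label 0 appears twice.
stacked = pd.concat([df_train_stmt, df_train_stmt_test], axis=0)

# ignore_index=True discards the old labels and renumbers from 0.
renumbered = pd.concat([df_train_stmt, df_train_stmt_test], axis=0, ignore_index=True)

print(stacked.index.tolist())
print(renumbered.index.tolist())
```

Repeated labels are harmless for writing to CSV, but they can surprise later code that selects rows with `.loc`, so renumbering is often the safer choice when the old index carries no meaning.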

