Concatenate rows of two dataframes in pandas

PythonPandasDataframe

Python Problem Overview


I need to concatenate two dataframes df_a anddf_b having equal number of rows (nRow) one after another without any consideration of keys. This function is similar to cbind in R programming language. The number of columns in each dataframe may be different.

The resultant dataframe will have the same number of rows nRow and number of columns equal to the sum of number of columns in both the dataframes. In othe words, this is a blind columnar concatenation of two dataframes.

import pandas as pd
dict_data = {'Treatment': ['C', 'C', 'C'], 'Biorep': ['A', 'A', 'A'], 'Techrep': [1, 1, 1], 'AAseq': ['ELVISLIVES', 'ELVISLIVES', 'ELVISLIVES'], 'mz':[500.0, 500.5, 501.0]}
df_a = pd.DataFrame(dict_data)
dict_data = {'Treatment1': ['C', 'C', 'C'], 'Biorep1': ['A', 'A', 'A'], 'Techrep1': [1, 1, 1], 'AAseq1': ['ELVISLIVES', 'ELVISLIVES', 'ELVISLIVES'], 'inte1':[1100.0, 1050.0, 1010.0]}
df_b = pd.DataFrame(dict_data)

Python Solutions


Solution 1 - Python

call concat and pass param axis=1 to concatenate column-wise:

In [5]:

pd.concat([df_a,df_b], axis=1)
Out[5]:
        AAseq Biorep  Techrep Treatment     mz      AAseq1 Biorep1  Techrep1  \
0  ELVISLIVES      A        1         C  500.0  ELVISLIVES       A         1   
1  ELVISLIVES      A        1         C  500.5  ELVISLIVES       A         1   
2  ELVISLIVES      A        1         C  501.0  ELVISLIVES       A         1   

  Treatment1  inte1  
0          C   1100  
1          C   1050  
2          C   1010  

There is a useful guide to the various methods of merging, joining and concatenating online.

For example, as you have no clashing columns you can merge and use the indices as they have the same number of rows:

In [6]:

df_a.merge(df_b, left_index=True, right_index=True)
Out[6]:
        AAseq Biorep  Techrep Treatment     mz      AAseq1 Biorep1  Techrep1  \
0  ELVISLIVES      A        1         C  500.0  ELVISLIVES       A         1   
1  ELVISLIVES      A        1         C  500.5  ELVISLIVES       A         1   
2  ELVISLIVES      A        1         C  501.0  ELVISLIVES       A         1   

  Treatment1  inte1  
0          C   1100  
1          C   1050  
2          C   1010  

And for the same reasons as above a simple join works too:

In [7]:

df_a.join(df_b)
Out[7]:
        AAseq Biorep  Techrep Treatment     mz      AAseq1 Biorep1  Techrep1  \
0  ELVISLIVES      A        1         C  500.0  ELVISLIVES       A         1   
1  ELVISLIVES      A        1         C  500.5  ELVISLIVES       A         1   
2  ELVISLIVES      A        1         C  501.0  ELVISLIVES       A         1   

  Treatment1  inte1  
0          C   1100  
1          C   1050  
2          C   1010  

Solution 2 - Python

Thanks to @EdChum I was struggling with same problem especially when indexes do not match. Unfortunatly in pandas guide this case is missed (when you for example delete some rows)

import pandas as pd
t=pd.DataFrame()
t['a']=[1,2,3,4]
t=t.loc[t['a']>1] #now index starts from 1

u=pd.DataFrame()
u['b']=[1,2,3] #index starts from 0

#option 1
#keep index of t
u.index = t.index 

#option 2
#index of t starts from 0
t.reset_index(drop=True, inplace=True)

#now concat will keep number of rows 
r=pd.concat([t,u], axis=1)

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
Questionuser1140126View Question on Stackoverflow
Solution 1 - PythonEdChumView Answer on Stackoverflow
Solution 2 - PythonYury WalletView Answer on Stackoverflow