How to insert a string column into another string column in a pandas DataFrame?

Question

I have a dataset with over 100,000 rows and 300 columns.

Here is a sample dataset:

import numpy as np
import pandas as pd

pd.options.display.max_colwidth = 1000

df = pd.DataFrame({'EVENT_DTL':['1. Name : John Johns \n2. Date : 05 March 2013 \n3. founded : 75075 Plano, Dallas Texas \n4. Charactor : Impersive \n5. Corona corelation : Cannot be found',
                               '1. Name : Mark Dwaine \n2. Date : 13 January 2020 \n3. founded : 45184 Miami, Florida \n4. Charactor : Slow learner \n5. Corona corelation : Suicide because of the economic difficulty',
                               '1. Name : Janny chung \n2. Date : 11 December 2011 \n3. founded : 77543 Bay area, San Fransisco \n4. Charactor : Always ambitious \n5. Corona corelation : Cannot be found but probably related to epidemic',
                               '1. Name : Sally \n2. Date : 11 December 2021 \n3. founded : 75074 Saginow, Fort Worth \n4. Charactor : energetic \n5. Corona corelation : Her friends guess it is because of corona'],
                   'EVENT_DTL_2':['He is always fast mover','He is brillient, smart','she is kind of person who is always eager to learn new subejct','he was a lunatic, his neighber said']})
df.loc[2,'EVENT_DTL_2'] = np.nan

df

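I'm trying to insert the value of 'EVENT_DTL_2' into 'EVENT_DTL', right after the '\n4. Charactor : xxx' substring (that is, just before '\n5.'). Rows where 'EVENT_DTL_2' is NaN should stay unchanged.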

The desired output is:

df2 = pd.DataFrame({'EVENT_DTL':['1. Name : John Johns \n2. Date : 05 March 2013 \n3. founded : 75075 Plano, Dallas Texas \n4. Charactor : Impersive He is always fast mover\n5. Corona corelation : Cannot be found',
                               '1. Name : Mark Dwaine \n2. Date : 13 January 2020 \n3. founded : 45184 Miami, Florida \n4. Charactor : Slow learner He is brillient, smart\n5. Corona corelation : Suicide because of the economic difficulty',
                               '1. Name : Janny chung \n2. Date : 11 December 2011 \n3. founded : 77543 Bay area, San Fransisco \n4. Charactor : Always ambitious \n5. Corona corelation : Cannot be found but probably related to epidemic',
                               '1. Name : Sally \n2. Date : 11 December 2021 \n3. founded : 75074 Saginow, Fort Worth \n4. Charactor : energetic he was a lunatic, his neighber said\n5. Corona corelation : Her friends guess it is because of corona'],
                   'EVENT_DTL_2':['He is always fast mover','He is brillient, smart',np.nan,'he was a lunatic, his neighber said']})
df2

I need an efficient way to do this, since I have to apply it to a very large dataset.

mozway · Accepted Answer · 2022-11-15 07:42:34Z

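You can split and merge again. Split just before '\n5.' so the inserted text lands at the end of the '4. Charactor' line, and replace NaN with an empty string so rows without 'EVENT_DTL_2' are left unchanged:

# zero-width lookahead: split once, keeping '\n5.' at the start of the second part
parts = df['EVENT_DTL'].str.split(r'(?=\n5\.)', n=1, expand=True)
df['EVENT_DTL'] = parts[0] + df['EVENT_DTL_2'].fillna('') + parts[1]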
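
As a self-contained illustration of the split-and-merge idea, here is a minimal sketch on a shortened version of the sample data. The helper name `insert_before_marker` and its `marker` parameter are hypothetical, not part of pandas. It assumes every row contains the '\n5.' marker exactly where the text should go, and that a missing 'EVENT_DTL_2' should leave the row untouched.

```python
import numpy as np
import pandas as pd


def insert_before_marker(df, target, extra, marker=r"\n5\."):
    """Return `target` with `extra` inserted just before the first `marker` match.

    Hypothetical helper for illustration; `marker` is a regex fragment.
    """
    # Zero-width lookahead: split once and keep the marker in the second part.
    parts = df[target].str.split(f"(?={marker})", n=1, expand=True)
    # Missing extras become empty strings, so those rows come back unchanged.
    return parts[0] + df[extra].fillna("") + parts[1]


# Shortened sample rows (assumption: same layout as the question's data).
df = pd.DataFrame({
    "EVENT_DTL": [
        "1. Name : John Johns \n4. Charactor : Impersive \n5. Corona corelation : Cannot be found",
        "1. Name : Janny chung \n4. Charactor : Always ambitious \n5. Corona corelation : Cannot be found",
    ],
    "EVENT_DTL_2": ["He is always fast mover", np.nan],
})

df["EVENT_DTL"] = insert_before_marker(df, "EVENT_DTL", "EVENT_DTL_2")
print(df["EVENT_DTL"].tolist())
```

The split and the concatenation are both vectorized string operations, so this should avoid a Python-level loop over rows. If some rows might lack the marker, the second part would be missing for those rows and they would need separate handling.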
