Python Loop for Multiple Row String Concatenation

Question

I want to write a loop in Python that concatenates strings across multiple rows. The table I have now is shown as "Before", and the table I am trying to create is shown as "After". Any thoughts on how to do this? I am currently using the following code, which gives me just one string, but I need to apply it across the entire data frame:

df['Text'].str.cat(sep='')

Before:

Text       |    Channel  |  Destination   | Amount  | Total
string1           NaN           NaN           NaN      NaN
string2           DKI           US             34       5   
string3           NaN           NaN           NaN      NaN
string4           DKI           CA             39       20

After:

Text           |    Channel  |  Destination   | Amount  | Total
string1string2        DKI           US            34       5
string3string4        DKI           CA            39       20

Please show your current attempts, and clarify the logic by which you want to concatenate the strings (why do All and purpose go together, for instance?)
– sacuL, Commented Jun 27, 2018 at 14:00
@sacul I am just trying to concatenate strings. I have updated the tables.
– jumpman23, Commented Jun 27, 2018 at 14:02
Possible duplicate of How to concatenate values of all rows in a dataframe into a single row without altering the columns?
– Nae, Commented Jun 27, 2018 at 14:03
@sacul The way to determine that string1 and string2 go together is to concatenate everything from the NaN row down to the first row with a number, then essentially restart the concatenation.
– jumpman23, Commented Jun 27, 2018 at 14:05

jezrael · Accepted Answer · 2018-06-27 14:28:51Z


Create a helper Series with shift, flag the non-NaN values with notna, and build group labels with cumsum.

Then aggregate by a dict of functions, remove the index name with rename_axis, and add reindex to keep the original column order:

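# group labels: the counter increases on the row after each non-NaN total
a = df['total'].shift().notna().cumsum()
# for older pandas versions
#a = df['total'].shift().notnull().cumsum()
# one aggregation per column; ''.join concatenates the strings in each group
d = {'row':'first', 'total':'last', 'Text':''.join}

df = df.groupby(a).agg(d).rename_axis(None).reindex(columns=df.columns)
print(df)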
   row            Text  total
0    1  string1string2    3.0
1    3  string3string4    1.0
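
The answer does not show its input frame, so the following is only a self-contained sketch of the same steps. The input DataFrame here is an assumption, reconstructed so that it is consistent with the printed output above; the variable name out is also just for illustration.

```python
import numpy as np
import pandas as pd

# Assumed input (not shown in the answer): NaN in 'total' marks a row
# whose text continues into the next row.
df = pd.DataFrame({
    "row": [1, 2, 3, 4],
    "Text": ["string1", "string2", "string3", "string4"],
    "total": [np.nan, 3.0, np.nan, 1.0],
})

# shift() moves the values down one row, so the cumulative counter
# increases on the row *after* each filled 'total'.
a = df["total"].shift().notna().cumsum()

# One aggregation per column; ''.join concatenates the strings of a group.
d = {"row": "first", "total": "last", "Text": "".join}

out = df.groupby(a).agg(d).rename_axis(None).reindex(columns=df.columns)
print(out)
```

The shift is what lets several NaN rows in a row fall into the same group as the filled row that follows them, because the counter only moves after a filled row has been seen.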

edited Jun 27, 2018 at 14:28

answered Jun 27, 2018 at 14:05


5 Comments

jezrael Over a year ago

@ScottBoston - I test it with multiple NaNs

Scott Boston Over a year ago

Gotcha. I see. I knew there was a reason you're doing it that way. Thank you.

jumpman23 Over a year ago

@jezrael What if I have multiple metrics to group and also multiple text columns?

jezrael Over a year ago

@jumpman23 - Then change aggregate dictionary like d = { 'Text':''.join, 'Channel':'last', 'Destination':'last', 'Amount':'last', 'Total':'last'}

jumpman23 Over a year ago

@jezrael Doesn't a in your code need to be adjusted as well?
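
The thread does not answer this last question. As a hedged sketch only: if the grouping Series is built from one of the question's metric columns (Total is assumed here, on the assumption that it is filled exactly on the rows that close a group), the extended dictionary from the comment above can be applied to the "Before" table like this. The names key and out are illustrative.

```python
import numpy as np
import pandas as pd

# The "Before" table from the question.
df = pd.DataFrame({
    "Text": ["string1", "string2", "string3", "string4"],
    "Channel": [np.nan, "DKI", np.nan, "DKI"],
    "Destination": [np.nan, "US", np.nan, "CA"],
    "Amount": [np.nan, 34, np.nan, 39],
    "Total": [np.nan, 5, np.nan, 20],
})

# Assumption: 'Total' is non-NaN only on the last row of each group,
# so it can play the role that 'total' plays in the answer.
key = df["Total"].shift().notna().cumsum()

# Join the text; for the metric columns 'last' keeps the last non-NaN value.
d = {
    "Text": "".join,
    "Channel": "last",
    "Destination": "last",
    "Amount": "last",
    "Total": "last",
}

out = df.groupby(key).agg(d).rename_axis(None)
print(out)
```

Any column that is reliably filled on the closing row of each group could serve as the key instead; a second text column would get its own ''.join entry in the dictionary.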
