How to explode multiple columns using a pandas DataFrame

Question

df=spark.sql("select key, name, subjects from table")

df in, from the select statement above:

key name    subjects
12  x,y,z   1,2,3
20  a,b     8,7

df out :

I tried converting the columns to lists and exploding them, but it still throws an error. Please help: what is an efficient way to achieve this?

Related to this question.

– Quang Hoang, Commented Feb 4, 2021 at 4:42

Chris · Accepted Answer · 2021-02-04 04:43:30Z

2

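One way, using pandas.DataFrame.apply:

import pandas as pd

# If the columns are not already split into lists, split them first:
# df["name"] = df["name"].str.split(",")
# df["subjects"] = df["subjects"].str.split(",")

# Explode every column; each new row keeps its original index label.
new_df = df.apply(pd.Series.explode)
print(new_df)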

Output:

   key name subjects
0   12    x        1
0   12    y        2
0   12    z        3
1   20    a        8
1   20    b        7
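
For readers on pandas 1.3 or newer, the same result can be sketched without apply, because DataFrame.explode accepts a list of columns there. The sample frame below is rebuilt by hand from the question's table (an assumption, since the original comes from spark.sql), and ignore_index=True is used so the result gets a fresh 0..n-1 index instead of repeated labels.

```python
import pandas as pd

# Sample data rebuilt by hand from the question's table (assumption).
df = pd.DataFrame({
    "key": [12, 20],
    "name": ["x,y,z", "a,b"],
    "subjects": ["1,2,3", "8,7"],
})

# Turn the comma-separated strings into lists.
df["name"] = df["name"].str.split(",")
df["subjects"] = df["subjects"].str.split(",")

# Multi-column explode requires pandas >= 1.3, and the lists in each
# row must have the same length across the exploded columns.
new_df = df.explode(["name", "subjects"], ignore_index=True)
print(new_df)
```

This will not run on pandas 1.0.5, where explode takes a single column only.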

codek · Answer · 2021-02-08 04:29:48Z

0

Thanks, Chris. The columns are getting exploded, but I am still facing the error "Cannot reindex from a duplicate axis". Concat with ignore_index is not working. Is it possible to generate temporary unique indexes, since the key is duplicated during the explode? pandas version: 1.0.5

df["name"] = df["name"].str.split(",")
df["subjects"] = df["subjects"].str.split(",")
new_df = df.apply(pd.Series.explode).reindex()
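
The repeated labels come from explode itself: every new row keeps the index label of the row it was expanded from, and reindex() with no arguments does not renumber them. One possible way to get unique labels, sketched here with calls that should also exist in pandas 1.0.5, is to move key into the index before exploding and then call reset_index, which turns key back into a column and numbers the rows from 0. The sample frame is rebuilt by hand from the question's table (an assumption).

```python
import pandas as pd

# Sample data rebuilt by hand from the question's table (assumption).
df = pd.DataFrame({
    "key": [12, 20],
    "name": ["x,y,z", "a,b"],
    "subjects": ["1,2,3", "8,7"],
})
df["name"] = df["name"].str.split(",")
df["subjects"] = df["subjects"].str.split(",")

# With key in the index, only list columns are exploded, so every
# exploded column ends up with the same (repeated) index of keys.
exploded = df.set_index("key").apply(pd.Series.explode)

# reset_index restores key as a column and assigns a unique RangeIndex.
new_df = exploded.reset_index()
print(new_df)
print(new_df.index.is_unique)
```

This relies on the lists in each row having the same length in both columns, as they do in the question's data.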
