Explode a MapType column in PySpark

Question

I have a DataFrame like this:

data = [("ID1", {'A': 1, 'B': 2})]
df = spark.createDataFrame(data, ["ID", "Coll"])
df.show()

+---+----------------+
| ID|            Coll|
+---+----------------+
|ID1|[A -> 1, B -> 2]|
+---+----------------+

df.printSchema()
root
 |-- ID: string (nullable = true)
 |-- Coll: map (nullable = true)
 |    |-- key: string
 |    |-- value: long (valueContainsNull = true)

I want to explode the 'Coll' column so that each map entry becomes its own row, with the ID repeated:

+---+---+-----+
| ID|Key|Value|
+---+---+-----+
|ID1|  A|    1|
|ID1|  B|    2|
+---+---+-----+

I am trying to do this in PySpark.

It works if I select only the exploded column, but I also want to keep the ID column:

from pyspark.sql.functions import explode
df.select(explode("Coll").alias("x", "y")).show()

+---+---+
|  x|  y|
+---+---+
|  A|  1|
|  B|  2|
+---+---+

Shaido · Accepted Answer · 2019-03-07 09:47:11Z

Simply add the ID column to the select and it should work:

df.select("ID", explode("Coll").alias("x", "y"))
