I am trying to split a DataFrame into multiple arrays according to the id of each row.
So I have a table:
id|name
12|a
12|b
12|c
13|x
13|y
13|z
and I want to get multiple vectors that look like:
<a,b,c> <x,y,z>
So I have managed to get all the different IDs using:
val ids=dataframe.select("id").distinct.collect.flatMap(_.toSeq)
That returns 12 and 13. I then tried to get the names for each one of them:
val namesArray=ids.map(id=>dataframe.where($"id"===id))
but that doesn't seem to be the way, since it returns the column names along with the data, and I need a way to get only the name values out of it.
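One possible direction, as a minimal sketch rather than a definitive answer: group by id and aggregate the names with collect_list, then pull the values out of each Row. It assumes that id is an integer column and name is a string column, and that the grouped result is small enough to collect to the driver. The object name GroupNamesById, the helper namesById, and the local SparkSession are made up for illustration. The sample rows mirror the table above, using x, y, z for id 13 to match the desired output.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, collect_list}

// Hypothetical helper object, not part of the original code.
object GroupNamesById {

  // Assumes `id` is an Int column and `name` is a String column.
  // Returns one sequence of names per distinct id.
  def namesById(df: DataFrame): Map[Int, Seq[String]] =
    df.groupBy(col("id"))
      .agg(collect_list(col("name")).as("names"))
      .collect()                                   // brings the grouped rows to the driver
      .map(row => row.getInt(0) -> row.getSeq[String](1).toSeq)
      .toMap

  def main(args: Array[String]): Unit = {
    // Local session for the sketch only; a real job would reuse its own session.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("group-names-by-id")
      .getOrCreate()
    import spark.implicits._

    val dataframe = Seq(
      (12, "a"), (12, "b"), (12, "c"),
      (13, "x"), (13, "y"), (13, "z")
    ).toDF("id", "name")

    // collect_list does not guarantee the order of names within a group.
    namesById(dataframe).foreach { case (id, names) =>
      println(s"$id -> ${names.mkString("<", ",", ">")}")
    }

    spark.stop()
  }
}
```

If the per-id filtering from the attempt above is kept instead, selecting the column and collecting it, for example dataframe.where($"id" === id).select("name").as[String].collect(), should give the bare values rather than a DataFrame, at the cost of one Spark job per id.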