I have input like this:
Input:
+----------+-----------+---------+
|customerId|Header     |Line     |
+----------+-----------+---------+
|1001      |1001aa     |1001aa1  |
|1001      |1001aa     |1001aa2  |
|1001      |1001aa     |1001aa3  |
|1001      |1001aa     |1001aa4  |
|1002      |1002bb     |1002bb1  |
|1002      |1002bb     |1002bb2  |
|1002      |1002bb     |1002bb3  |
|1002      |1002bb     |1002bb4  |
|1003      |1003cc     |1003cc1  |
|1003      |1003cc     |1003cc2  |
|1003      |1003cc     |1003cc3  |
+----------+-----------+---------+
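Using a DataFrame and a UDF, I am able to group the lines under each header and pair every line with its header (code below).
However, the result is a nested array of plain strings. I would like the elements to be of struct data type so that the column names are kept as field names. Any help is appreciated.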
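import org.apache.spark.sql.functions.{col, collect_list, udf}

// Pairs the header with each of its lines; the result is array<array<string>>, so field names are lost.
val udfHeaderLineList1 = udf((header: String, line: Seq[String]) =>
  line.map(record => List(header, record)).toList)

val eventingDFtable = my_dataframe_data_Table
  .groupBy(col("customerId"), col("header"))
  .agg(collect_list(col("Line")).alias("Line"))
  .withColumn("TransHeaderStruct", udfHeaderLineList1(col("header"), col("Line")))

// printSchema returns Unit, so call it separately to keep eventingDFtable a DataFrame.
eventingDFtable.printSchema()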
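
For reference, here is a minimal sketch of one possible way to get named fields, assuming it is acceptable to drop the UDF and build the struct with Spark's built-in struct function before collecting. The local SparkSession, the sample rows, and the names sample and grouped are assumptions made only for this illustration; they are not part of my real job. It is meant to be pasted into spark-shell or a Scala worksheet.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, collect_list, struct}

// Assumption: a local SparkSession created only for this illustration.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("struct-sketch")
  .getOrCreate()
import spark.implicits._

// Hypothetical sample rows mirroring the input table above.
val sample = Seq(
  ("1001", "1001aa", "1001aa1"),
  ("1001", "1001aa", "1001aa2"),
  ("1002", "1002bb", "1002bb1")
).toDF("customerId", "Header", "Line")

// struct(...) keeps the source column names as field names, so each
// array element has the fields Header and Line instead of bare strings.
val grouped = sample
  .groupBy(col("customerId"), col("Header"))
  .agg(collect_list(struct(col("Header"), col("Line"))).alias("TransHeaderStruct"))

grouped.printSchema()
```

With this shape, the individual fields should be reachable by name, for example through col("TransHeaderStruct").getItem(0).getField("Line"), which is what the nested string lists from the UDF do not offer.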
