In Scala, I have a List[String] that I want to add as a new column to an existing DataFrame.

Original DF:

Name  | Date
======|===========
Rohan | 2007-12-21
...   | ...
...   | ...

Suppose I want to add a new column, Department.

Expected DF:

Name | Date       | Department
=====|============|============
Rohan| 2007-12-21 | Comp
...  | ...        | ...
...  | ...        | ...

How can I do this in Scala?

  • Is there any relation between the columns? Any rules? Commented Oct 3, 2016 at 9:15
  • So what do you want to join them on? I wouldn't think you just want to add some random values. Commented Oct 3, 2016 at 10:16
  • Probably you want to join your df on name with another df. Commented Oct 3, 2016 at 10:37
  • @eliasah No rules, just a new column of data. Commented Oct 3, 2016 at 12:59
  • @Reactormonk I just need to add a new data column; I'm not sure whether I need to use joins. Commented Oct 3, 2016 at 13:00

2 Answers

One way to do it is to create a second DataFrame that holds the names together with the list values, and then join the two DataFrames on the name column.
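A minimal sketch of that join, assuming each department value can be paired with a name. The object name `JoinByName`, the helper `withDepartment`, and the second sample row ("Asha", "Mech") are hypothetical and only there for illustration; the column names come from the question.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object JoinByName {
  // Hypothetical helper: adds the Department column by joining on Name.
  // A left join keeps rows of `df` that have no match (Department is null there).
  def withDepartment(df: DataFrame, departments: DataFrame): DataFrame =
    df.join(departments, Seq("Name"), "left")

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("join-by-name")
      .getOrCreate()
    import spark.implicits._

    // Sample data standing in for the original DataFrame (second row is made up).
    val df = Seq(("Rohan", "2007-12-21"), ("Asha", "2008-01-15")).toDF("Name", "Date")

    // The list values, paired with the names they belong to (assumed to be known).
    val departments = Seq(("Rohan", "Comp"), ("Asha", "Mech")).toDF("Name", "Department")

    withDepartment(df, departments).show()
    spark.stop()
  }
}
```

This only works if Name identifies a row uniquely; duplicate names would multiply rows in the join.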

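This solved my issue:

import org.apache.spark.sql.Row

// Pair every row and every list element with its position, then join on that position.
val newrows = dataset.rdd.zipWithIndex.map(_.swap)
      .join(spark.sparkContext.parallelize(results).zipWithIndex.map(_.swap))
      .values
      .map { case (row: Row, x: String) => Row.fromSeq(row.toSeq :+ x) } // append the value to the row

I still need an exact explanation of how it works, though.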
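As a rough explanation, the snippet can be broken into steps: `zipWithIndex` attaches a position to every row and to every list element, `map(_.swap)` turns each pair into (position, value) so the position becomes the key, `join` matches rows and list elements that share a position, and `values` drops the key again. The sketch below spells this out and also converts the resulting RDD[Row] back into a DataFrame. The names `AppendListColumn` and `appendColumn` are hypothetical, and the `sortByKey` step is an addition of mine, since a join does not by itself guarantee the original row order.

```scala
import org.apache.spark.sql.{DataFrame, Row, SparkSession}
import org.apache.spark.sql.types.{StringType, StructField, StructType}

object AppendListColumn {
  // Hypothetical helper: appends `values` to `df` as a new string column, matched by position.
  def appendColumn(spark: SparkSession, df: DataFrame, values: Seq[String], name: String): DataFrame = {
    // (row, index) becomes (index, row), so the index can serve as the join key.
    val indexedRows = df.rdd.zipWithIndex.map(_.swap)
    val indexedValues = spark.sparkContext.parallelize(values).zipWithIndex.map(_.swap)

    val rows = indexedRows
      .join(indexedValues) // inner join: (index, (row, value)); unmatched positions are dropped
      .sortByKey()         // restore the original order
      .values
      .map { case (row, value) => Row.fromSeq(row.toSeq :+ value) }

    // Extend the original schema with the new column, then rebuild the DataFrame.
    val schema = StructType(df.schema.fields :+ StructField(name, StringType, nullable = true))
    spark.createDataFrame(rows, schema)
  }
}
```

Called as `AppendListColumn.appendColumn(spark, dataset, results, "Department")`, this should yield the expected DataFrame from the question, provided the list has one element per row and the row order of `dataset` is stable between evaluations.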
