I'm trying to take a hardcoded String and turn it into a 1-row Spark DataFrame (with a single column of type StringType) such that:

String fizz = "buzz"

would result in a DataFrame whose .show() output looks like:

+-----+
| fizz|
+-----+
| buzz|
+-----+

My best attempt thus far has been:

val rawData = List("fizz")
val df = sqlContext.sparkContext.parallelize(Seq(rawData)).toDF()

df.show()

But I get the following exception at runtime:

java.lang.ClassCastException: org.apache.spark.sql.types.ArrayType cannot be cast to org.apache.spark.sql.types.StructType
    at org.apache.spark.sql.SQLContext.createDataFrame(SQLContext.scala:413)
    at org.apache.spark.sql.SQLImplicits.rddToDataFrameHolder(SQLImplicits.scala:155)

Any ideas as to where I'm going awry? Also, how do I set "buzz" as the row value for the fizz column?


Update:

Trying:

sqlContext.sparkContext.parallelize(rawData).toDF()

I get a DF that looks like:

+----+
|  _1|
+----+
|buzz|
+----+

2 Answers

Try:

sqlContext.sparkContext.parallelize(rawData).toDF()
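
As far as I can tell, the original attempt fails because Seq(rawData) wraps the list a second time, so the RDD holds a single element of type List[String]. Spark then infers an array schema for that element instead of a struct of columns, which would explain the ArrayType to StructType cast in the stack trace. A minimal plain-Scala sketch of the difference (no Spark needed; the object name NestingCheck is only illustrative):

```scala
object NestingCheck {
  val rawData: List[String] = List("buzz")

  // Wrapping the list again yields ONE element that is itself a List[String].
  val nested: Seq[List[String]] = Seq(rawData)

  // Passing the list directly yields one element per string.
  val flat: Seq[String] = rawData

  def main(args: Array[String]): Unit = {
    println(nested) // List(List(buzz))
    println(flat)   // List(buzz)
  }
}
```

Passing rawData itself to parallelize gives Spark one String per row, which is what toDF() needs here.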

In Spark 2.0 and later you can convert the list directly:

import spark.implicits._

rawData.toDF

To set the column name, pass it to toDF (it accepts one name per column):

sqlContext.sparkContext.parallelize(rawData).toDF("fizz")
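
Putting the pieces together, here is a rough end-to-end sketch for Spark 2.x. It assumes a local SparkSession; the object name, the helper name build, the app name, and the local[1] master are placeholders of mine, not anything from the question:

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object SingleStringDataFrame {
  // Hypothetical helper: builds a 1-row, 1-column DataFrame of StringType.
  def build(spark: SparkSession, column: String, value: String): DataFrame = {
    import spark.implicits._ // brings toDF into scope for local Seqs
    Seq(value).toDF(column)
  }

  def main(args: Array[String]): Unit = {
    // Assumed local session, for illustration only.
    val spark = SparkSession.builder()
      .appName("single-string-df")
      .master("local[1]")
      .getOrCreate()

    val df = build(spark, "fizz", "buzz")
    df.printSchema() // one string column named "fizz"
    df.show()        // one row containing "buzz"

    spark.stop()
  }
}
```

On Spark 1.x the same idea should work with the sqlContext.sparkContext.parallelize(...).toDF("fizz") form shown above, after importing sqlContext.implicits._.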

3 Comments

Thanks @LostInOverflow (+1) - I think I'm almost there, please see my update. I am getting a single-row DF, with the correct value in it ("buzz" string), but the column name is "_1"...thoughts?
A DataFrame is like a Dataset in tabular format with named columns. In the first case, you created the DataFrame without specifying any column names, so Spark assigned default names such as "_1", "_2".
How would this work in Java? sparkContext.parallelize takes two additional parameters: numSlices and ClassTag. The 2nd isn't clear to me.

In Java, the following works:

List<String> textList = Collections.singletonList("yourString");
SQLContext sqlContext = new SQLContext(sparkContext);
Dataset<Row> data = sqlContext
      .createDataset(textList, Encoders.STRING())
      .withColumnRenamed("value", "text");
