I have two DataFrames and need to compare the maximum timestamp value from each.

val Table1max=spark.read.format("parquet").option("header","true").load(s"${SourcePath}/ab12")
Table1max.createOrReplaceTempView("temp") 

val table2max=spark.read.format("parquet").option("header","true").load(s"${SourcePath}/abc")
table2max.createOrReplaceTempView("temp1")

Then I select the maximum UPDATE_DATE from both:

val table1maxvalue = spark.sql(s"select max(UPDATE_DATE) from temp")
val table2maxvalue= spark.sql(s"select max(UPDATE_DATE) from temp1")

Here table1maxvalue and table2maxvalue are dataframes.

table1maxvalue
+--------------------+
|    max(UPDATE_DATE)|
+--------------------+
|2022-05-02 01:04:...|
+--------------------+

table2maxvalue

+--------------------+
|    max(UPDATE_DATE)|
+--------------------+
|2022-05-02 01:04:...|
+--------------------+

Now, how can I compare table1maxvalue with table2maxvalue and run some logic depending on the result? For example:

if(table1maxvalue<table2maxvalue){
Do something
}

Because both values are DataFrames, I get this error: value >= is not a member of org.apache.spark.sql.DataFrame

Please suggest a solution.

1 Answer

You are trying to compare one DataFrame to another DataFrame, and DataFrame does not define comparison operators. You need to take the first row of each result and then retrieve the value from that row.

In this case you can use the following:

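table1maxvalue       // DataFrame with a single row and a single column
.head()              // get the first row
.getTimestamp(0)     // get the first column as a java.sql.Timestamp (use getDate(0) if UPDATE_DATE is a DATE column)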
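
To make the comparison itself concrete, here is a minimal sketch of the whole flow. It assumes UPDATE_DATE is stored as a timestamp column; the object name CompareMaxUpdateDate, the helpers maxUpdateDate and isBehind, and the sample rows are hypothetical stand-ins for illustration, not part of the original code. An empty table yields a null max, so the value is wrapped in an Option.

```scala
import java.sql.Timestamp
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, max}

object CompareMaxUpdateDate {
  // Pull the single aggregated value out of the DataFrame.
  // Assumes UPDATE_DATE is a TimestampType column; returns None if the max is null.
  def maxUpdateDate(df: DataFrame): Option[Timestamp] =
    Option(df.agg(max(col("UPDATE_DATE"))).head().getTimestamp(0))

  // True only when both values exist and the left one is strictly earlier.
  def isBehind(left: Option[Timestamp], right: Option[Timestamp]): Boolean =
    (left, right) match {
      case (Some(l), Some(r)) => l.before(r)
      case _                  => false
    }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("compare-max-update-date")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical sample data standing in for the two parquet tables.
    val table1 = Seq(
      Timestamp.valueOf("2022-05-01 10:00:00"),
      Timestamp.valueOf("2022-05-02 01:04:00")
    ).toDF("UPDATE_DATE")
    val table2 = Seq(Timestamp.valueOf("2022-05-02 03:30:00")).toDF("UPDATE_DATE")

    // Plain Scala values now, so ordinary comparisons work.
    val table1Max = maxUpdateDate(table1)
    val table2Max = maxUpdateDate(table2)

    if (isBehind(table1Max, table2Max)) {
      println(s"table1 ($table1Max) is behind table2 ($table2Max)")
    }

    spark.stop()
  }
}
```

Once the values are java.sql.Timestamp instances, before, after, and compareTo are available. If the column is a string instead, reading it with getString(0) and parsing it first would be needed.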