I have a DataFrame with N columns, and I want to add a new column that holds, for each row, the number of columns containing a NULL value. I tried to write a UDF for this, but it doesn't work because I can't pass it an array of columns as a parameter.
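import org.apache.spark.sql.expressions.UserDefinedFunction
import org.apache.spark.sql.functions.udf
import spark.implicits._ // assumes a SparkSession named `spark`, as in spark-shell

val simpleData = Seq(
  ("row1", "NULL", "NULL", "NULL", "NULL", "NULL", "1"),
  ("row2", "1", "NULL", "2023", "NULL", "01", "NULL"))
val myDs = simpleData.toDF("row", "field1", "field2", "field3", "field4", "field5", "field6")
myDs.show()
// Every column except the "row" identifier
val windowcols = myDs.columns.filterNot(List("row").contains(_))
// My attempt: count the values equal to the string "NULL"
def countNullsUDF: UserDefinedFunction = udf { (values: List[String]) =>
  values.filter(value => value == "NULL").length
}
// Does not compile: windowcols is an Array[String], but the UDF expects Column arguments
myDs.withColumn("columnsWithNull", countNullsUDF(windowcols)).show(10, false)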
Is it possible to pass the UDF an array of columns, or something similar? I couldn't figure out how.
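
For reference, here is a minimal sketch of two ways this could be approached, assuming Spark 3.x and a local SparkSession. The object name `CountNullColumns` and the helpers `withUdfCount` and `withExprCount` are hypothetical names introduced for illustration. The first option wraps the column names in `array(...)` so the UDF receives a single array column; since Spark hands an array column to a Scala UDF as a `Seq`, the parameter is declared as `Seq[String]` rather than `List[String]`. The second option avoids the UDF and sums one 0/1 indicator per column.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{array, col, udf, when}

// Hypothetical names, for illustration only.
object CountNullColumns {

  // Spark passes an array column to a Scala UDF as a Seq, so take Seq[String].
  private val countNullsUDF = udf { (values: Seq[String]) =>
    values.count(_ == "NULL")
  }

  // Option 1: pack the columns into one array column and hand it to the UDF.
  def withUdfCount(df: DataFrame, cols: Seq[String]): DataFrame =
    df.withColumn("columnsWithNull", countNullsUDF(array(cols.map(col): _*)))

  // Option 2: no UDF; add up one 0/1 indicator per column.
  // Assumes cols is non-empty (reduce fails on an empty sequence).
  def withExprCount(df: DataFrame, cols: Seq[String]): DataFrame = {
    val indicators = cols.map(c => when(col(c) === "NULL", 1).otherwise(0))
    df.withColumn("columnsWithNull", indicators.reduce(_ + _))
  }

  def main(args: Array[String]): Unit = {
    // Local session, for trying the sketch out only.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("count-null-columns")
      .getOrCreate()
    import spark.implicits._

    val myDs = Seq(
      ("row1", "NULL", "NULL", "NULL", "NULL", "NULL", "1"),
      ("row2", "1", "NULL", "2023", "NULL", "01", "NULL")
    ).toDF("row", "field1", "field2", "field3", "field4", "field5", "field6")

    // Every column except the "row" identifier.
    val windowcols = myDs.columns.filterNot(_ == "row").toSeq

    withUdfCount(myDs, windowcols).show(10, false)
    withExprCount(myDs, windowcols).show(10, false)

    spark.stop()
  }
}
```

Both variants compare against the literal string "NULL", as in the sample data. If the columns held real SQL nulls instead, the condition in the second option would be `col(c).isNull`. The second option is usually the lighter choice, because built-in column expressions stay visible to Spark's optimizer while a UDF is opaque to it.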