I would like to check that the value in a string-type column (a column, not a string literal) does not exist in another array-type column.

I tried something like this:

~col('string_column').isin(col('array_column'))

but it fails because the two columns are not the same type.

I also tried:

~col('array_column').contains(col('string_column'))

but this also fails, with a data type mismatch.

How can I solve this?

1 Answer

You can use array_contains inside a SQL expression, which lets both arguments be columns, and negate the result:

import pyspark.sql.functions as F

df2 = df.withColumn('contain', ~F.expr('array_contains(array_column, string_column)'))