-2

I have a dataset with two columns: Country, Adclicks. How do I find the country with the most ad clicks?

Country | Ad Click
USA       1
USA       0
USA       1
PR        0
PR        0
PR        1
1
  • 1) groupBy the country + sum 2) agg w/ max on the clicks Commented Nov 30, 2021 at 20:20

1 Answer 1

0

Assuming your DataFrame if defined as the variable "df" then something like this:

import pyspark.sql.functions as psf

# Get aggregate sum
s = df.groupby("Country").agg({'Ad Click': 'sum'})

# Get and display top country
s.registerTempTable("sums_table")
query = """
    SELECT Country
    FROM sums_table
    WHERE `sum(Ad Click)` = (
        SELECT MAX(`sum(Ad Click)`)
        FROM sums_table)
"""
top_country = spark.sql(query).collect()
print(top_country[0]["Country"])
Sign up to request clarification or add additional context in comments.

Comments

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.