I'm still learning Python. In the following example (taken from Method 3 of this article), the user-defined function (UDF) is named Total(...,...), but the author calls it by the name new_f(...,...).

Question: In the code below, how do we know that the call new_f(...,...) invokes the function Total(...,...)? What if there were another UDF, say Sum(...,...)? In that case, how would the code know whether new_f(...,...) means Total(...,...) or Sum(...,...)?

# import the functions module from pyspark.sql as F
import pyspark.sql.functions as F
from pyspark.sql.types import IntegerType

# plain Python function: fee after subtracting the discount
def Total(Course_Fees, Discount):
    res = Course_Fees - Discount
    return res

# create a UDF from Total with an integer return type
new_f = F.udf(Total, IntegerType())

# call the UDF on two columns and store the result
# in a new column named Total_price
new_df = df.withColumn(
    "Total_price", new_f("Course_Fees", "Discount"))

# show the DataFrame
new_df.show()
Comments

  • new_f = F.udf(Total, IntegerType()) assigns the name new_f to that user-defined function. (Commented May 20, 2022 at 21:15)
  • @C.Nivs Got it. Thank you. Should be an Answer. (Commented May 20, 2022 at 21:18)

1 Answer

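new_f = F.udf(Total, IntegerType())

assigns the name new_f to that user-defined function. F.udf wraps the function passed as its first argument (here Total) and returns a UDF object, so calling new_f(...) runs Total on the column values. A second function such as Sum would need its own F.udf(Sum, ...) call, assigned to a different name.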
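
The same name binding can be seen without Spark. The sketch below is only an illustration: make_wrapper is a hypothetical stand-in for F.udf (it is not part of PySpark), and Sum and sum_f are made-up names for the second function mentioned in the question. It shows that the function a name refers to is fixed at the moment of assignment.

```python
# Hypothetical stand-in for F.udf: takes a plain function and
# returns a new callable that delegates to it.
def make_wrapper(func):
    def wrapper(*args):
        # Calls whichever function was passed in when the wrapper was built.
        return func(*args)
    return wrapper


def Total(Course_Fees, Discount):
    return Course_Fees - Discount


# Made-up second function, as in the question.
def Sum(Course_Fees, Discount):
    return Course_Fees + Discount


new_f = make_wrapper(Total)  # new_f now refers to a wrapper around Total
sum_f = make_wrapper(Sum)    # the wrapper around Sum gets its own name

print(new_f(1000, 100))  # 900, because new_f delegates to Total
print(sum_f(1000, 100))  # 1100, because sum_f delegates to Sum
```

There is no ambiguity to resolve at call time: new_f can only mean Total because Total is what was passed in on the line that created new_f.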
