I am trying to export the results of a spark.sql query in Databricks to a folder in Azure Data Lake Store (ADLS).

The tables that I'm querying are also in ADLS.

I have accessed the files in ADLS from Databricks with the following command:

base = spark.read.csv("adl://carlslake.azuredatalakestore.net/landing/",inferSchema=True,header=True)
base.createOrReplaceTempView('basetable')

I am querying the table with the following command:

try:
  dataframe = spark.sql("select * from basetable where LOAD_ID = 1199")
except:
  print("Exception occurred 1166")
else:
  print("Table Load_id 1166")

I am then attempting to export the results to the folder in Azure using the following:

try:
  dataframe.coalesce(1).write.option("header","true").mode("overwrite").csv("adl://carlslake.azuredatalakestore.net/jfolder2/outputfiles/")
  rename_file("adl://carlslake.azuredatalakestore.net/jfolder2/outputfiles", "adl://carlslake.azuredatalakestore.net/landing/RAW", "csv", "Delta_LoyaltyAccount_merged")
except:
  print("Exception Occurred 1166")
else:
  print("Delta File Created")

There are two weird issues here:

  1. The query filters on load_id = 1199, and although no row has load_id = 1199, the query still succeeds.

  2. I would like the second "try" statement to fail if the first "try" failed, but the second "try" statement runs regardless of the first.
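
On the second issue: each try block is independent, so the export runs whether or not the query raised. A minimal sketch of one way to chain the two steps, assuming the query and the export are each wrapped in a callable; `query_then_export`, `run_query`, and `export` are hypothetical names, not part of Spark or Databricks:

```python
def query_then_export(run_query, export):
    """Run export(dataframe) only if run_query() succeeded; return True on success."""
    try:
        dataframe = run_query()
    except Exception as exc:
        print(f"Exception occurred in query: {exc}")
        return False  # stop here, so the export step is never reached
    try:
        export(dataframe)
    except Exception as exc:
        print(f"Exception occurred in export: {exc}")
        return False
    print("Delta File Created")
    return True
```

In the notebook this could be called with something like `query_then_export(lambda: spark.sql("select * from basetable where LOAD_ID = 1166"), write_and_rename)`, where `write_and_rename` is an assumed wrapper around the `write` and `rename_file` calls above.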

Can someone let me know where I'm going wrong?

The table can be viewed here: thetable

  • I think I may have figured out why the second "try" statement failed, but I still can't understand why the spark.sql statement is successful even though there isn't a load_id = 1199. Having said that, if I replace load_id = 1199 with load_id = abcde it fails as I would expect. Very strange. Commented Jan 5, 2019 at 16:38
  • I think I know why it's still running even though I enter an incorrect load_id. I think it's because, although there isn't a load_id = 1199, the query is still seen as successful since a table is being returned, albeit without any data in the columns. If I'm correct, can someone let me know how to prevent the query from returning any result if the wrong load_id is entered? Commented Jan 5, 2019 at 17:17
  • As a suggestion, I was thinking of using absolute value in databricks. Will that work? Commented Jan 5, 2019 at 20:34
  • Hi guys, I'm trying to help myself here. So I've tried the following:

      try:
        dataframe = spark.sql("select * from basetable where LOAD_ID = 1166")
      except:
        print("Exception occurred 1166")
      if dataframe.count == 0:
        print("No data rows")
      else:
        dataframe.coalesce(1).write.option("header","false").mode("overwrite").csv("adl://carlslake.azuredatalakestore.net/jfolder2/outputfiles/")
        rename_file("adl://carlslake.azuredatalakestore.net/jfolder2/outputfiles", "adl://carlslake.azuredatalakestore.net/landing/RAW", "csv", "Delta_LoyaltyAccount_merged")  # create base Union

    Commented Jan 5, 2019 at 20:57
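
Regarding the comments above: a filter that matches no rows returns a valid, empty DataFrame, so Spark raises nothing, and `dataframe.count == 0` compares the method object itself to 0 (always False) because the parentheses are missing. A minimal sketch of an explicit check, assuming it is acceptable to raise when nothing matches; `require_rows` is a hypothetical helper name:

```python
def require_rows(dataframe, load_id):
    """Return dataframe unchanged, or raise ValueError if it has no rows."""
    # limit(1) caps the result at one row, so count() only has to tell 0 from 1.
    if dataframe.limit(1).count() == 0:
        raise ValueError(f"No rows found for LOAD_ID = {load_id}")
    return dataframe
```

Calling `require_rows(dataframe, 1199)` inside the same try block as the query would turn the "no such load_id" case into an exception that the existing except clause can report.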

1 Answer

Just thought I would share the answer with you:

try:
  dataframe = spark.sql("select * from basetable where LOAD_ID = 1166")
except Exception as e:
  # The query itself failed (for example LOAD_ID = abcde), so skip the export.
  print("Exception occurred 1166:", e)
else:
  # An empty result is not an error in Spark, so check for rows explicitly.
  if dataframe.count() == 0:
    print("No data rows 1166")
  else:
    # coalesce(1) produces a single output part file.
    dataframe.coalesce(1).write.option("header","true").mode("overwrite").csv("adl://carlslake.azuredatalakestore.net/jfolder2/outputfiles/")
    rename_file("adl://carlslake.azuredatalakestore.net/jfolder2/outputfiles", "adl://carlslake.azuredatalakestore.net/landing/RAW", "csv", "Delta_LoyaltyAccount_merged")

I hope it works for you too.

1 Comment

So which questions in particular here on SO are you talking about on the SU Meta post that you tried to fix? Give me a link to the question and I'll give you an example of what I think could be adjusted. Also, look over some of my questions here on SO and see the level of detail I provide. When it's time for me to ask a question, I'm usually at a point where I need good help, so I always try my best to show my work and what I tried and be clear about what I need to accomplish, end result examples, etc. Maybe I can help clarify with an edit example of one of your questions?
