
I've been following this tutorial, which lets me connect to Databricks from Python and then run Delta table queries. However, I've stumbled upon a problem. When I run it for the FIRST time, I get the following error:

Container container-name in account storage-account.blob.core.windows.net not found, and we can't create it using anoynomous credentials, and no credentials found for them in the configuration.

When I go back to my Databricks cluster and run this code snippet:

from pyspark import SparkContext

# Reuse the Spark context that is already running on the cluster
spark_context = SparkContext.getOrCreate()

if StorageAccountName is not None and StorageAccountAccessKey is not None:
  print('Configuring the spark context...')
  # Register the account key with the underlying Hadoop configuration
  spark_context._jsc.hadoopConfiguration().set(
    f"fs.azure.account.key.{StorageAccountName}.blob.core.windows.net",
    StorageAccountAccessKey)

(where StorageAccountName and StorageAccountAccessKey are known) and then run my Python app once again, it runs successfully without the previous error. Is there a way to run this code snippet from my Python app so that the change is also reflected on my Databricks cluster?
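
For reference, this is roughly what I would be trying from the PyCharm side (just a sketch with placeholder values, assuming a legacy Databricks Connect session):

from pyspark.sql import SparkSession

StorageAccountName = "storage-account"    # placeholder, not my real account
StorageAccountAccessKey = "<access-key>"  # placeholder, not my real key

# With databricks-connect configured, getOrCreate() attaches to the remote cluster
spark = SparkSession.builder.getOrCreate()

# Session-level setting; whether this reaches the cluster's Hadoop
# configuration is exactly what I'm unsure about
spark.conf.set(
  f"fs.azure.account.key.{StorageAccountName}.blob.core.windows.net",
  StorageAccountAccessKey)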

  • are you running this code via Databricks Connect, or directly on the cluster? Commented Nov 29, 2021 at 9:49
  • The code snippet is directly on the cluster. How could I run it from PyCharm? @AlexOtt Commented Nov 29, 2021 at 10:13

1 Answer


You just need to add these configuration options to the cluster itself, as described in the docs. You need to set the following Spark property, the same one you set in your code:

fs.azure.account.key.<storage-account-name>.blob.core.windows.net <storage-account-access-key>

For security, it's better to put the access key into a secret scope and reference it from the Spark configuration (see docs).
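
For example, the cluster's Spark config entry could then look like this (a sketch; my-scope and storage-key are placeholder names for a secret scope and secret created beforehand, for example with the Databricks CLI):

fs.azure.account.key.<storage-account-name>.blob.core.windows.net {{secrets/my-scope/storage-key}}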


Comments

  • This should go through databricks-connect or just pyspark?
  • It should be done on the cluster that you are querying.
  • Directly on Databricks? Sorry if the question seems kind of dumb, I'm the new guy when it comes to Databricks.
  • Yes, go to your cluster, click Edit, scroll down to "Advanced options", and put that configuration into the "Spark" part.
  • Yes. This option provides the way for clusters to authenticate to the storage account. But I would recommend using a SAS key instead of the storage key (see the sketch below).
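
To illustrate the SAS suggestion from the last comment: with the WASB driver, a per-container SAS token can be registered instead of the account key. A minimal sketch mirroring the question's snippet, where ContainerName, StorageAccountName, and SasToken are placeholders:

from pyspark import SparkContext

ContainerName = "container-name"        # placeholder
StorageAccountName = "storage-account"  # placeholder
SasToken = "<sas-token>"                # placeholder, a SAS scoped to the container

spark_context = SparkContext.getOrCreate()

# fs.azure.sas.<container>.<account>.blob.core.windows.net is the WASB
# property for per-container SAS authentication
spark_context._jsc.hadoopConfiguration().set(
  f"fs.azure.sas.{ContainerName}.{StorageAccountName}.blob.core.windows.net",
  SasToken)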
