I want to load data from Azure Blob Storage into Azure SQL Database using a Databricks notebook. Could anyone help me with this?


1 Answer


I'm new to this, so I cannot comment, but why use Databricks for this? It would be much easier and cheaper to use Azure Data Factory.

https://learn.microsoft.com/en-us/azure/data-factory/tutorial-copy-data-dot-net

If you really need to use Databricks, you would need to either mount your Blob Storage account, or access it directly from your Databricks notebook or JAR, as described in the documentation (https://docs.azuredatabricks.net/spark/latest/data-sources/azure/azure-storage.html).
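For example, here is a minimal sketch of both options in a Python notebook cell. The storage account, container, secret scope, and key names are all placeholders; substitute your own:

```python
# Option 1: access Blob Storage directly with an account key
# ("mystorageaccount", "mycontainer", and the secret names are placeholders).
spark.conf.set(
    "fs.azure.account.key.mystorageaccount.blob.core.windows.net",
    dbutils.secrets.get(scope="my-scope", key="storage-key"))

# Option 2: mount the container under DBFS so it behaves like a regular path.
dbutils.fs.mount(
    source="wasbs://mycontainer@mystorageaccount.blob.core.windows.net",
    mount_point="/mnt/blob",
    extra_configs={
        "fs.azure.account.key.mystorageaccount.blob.core.windows.net":
            dbutils.secrets.get(scope="my-scope", key="storage-key")})
```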

You can then read the files into DataFrames, whatever format they are in, and use the SQL JDBC connector to open a connection for writing the data to SQL (https://docs.azuredatabricks.net/spark/latest/data-sources/sql-databases.html).
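As a sketch, assuming the source files are CSV under the mount point from above and using placeholder server, database, table, and credential names:

```python
# Read the source files into a DataFrame (format assumed to be CSV here).
df = (spark.read
      .option("header", "true")
      .option("inferSchema", "true")
      .csv("/mnt/blob/input/"))

# JDBC connection details for the Azure SQL Database (placeholder values).
jdbc_url = "jdbc:sqlserver://myserver.database.windows.net:1433;database=mydb"
connection_properties = {
    "user": "sqladmin",
    "password": dbutils.secrets.get(scope="my-scope", key="sql-password"),
    "driver": "com.microsoft.sqlserver.jdbc.SQLServerDriver",
}

# Write the DataFrame to a table; "overwrite" creates the table if it
# does not already exist, using types inferred from the DataFrame schema.
df.write.jdbc(url=jdbc_url, table="dbo.my_table", mode="overwrite",
              properties=connection_properties)
```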


3 Comments

One of the possible reasons (to be confirmed) to use Databricks over ADF is that ADF requires a specific table in SQL DB to be defined with a schema. I am importing data from an API in Databricks and there are 200 columns, and I don't want to specify a schema. I am hoping that in Databricks I can just create the table in SQL dynamically and infer the schema from the DataFrame. It will be used as a holding table for Power BI (something like learn.microsoft.com/en-us/azure/hdinsight/spark/…)
@Rodney, that makes sense, but I'd be curious to see whether you can actually get the data types to work with a dynamic schema. Type inference does not always work as you would hope, especially if your source data has a lot of null values or possibly bad data. If your only reason for using SQL is as a temp table, an alternative approach you could consider is to store the data in a Hive or Delta table in Databricks and query it directly from Power BI (see the sketch after these comments).
Yes, it's something of a temporary solution; I just had to get something into the DB quickly, ideally keeping the same types as my DataFrame. It did work, but ultimately I will use ADF, as it is a LOT faster and provides logging etc., not to mention cheaper. Just good to know the Spark connector is there for those edge cases...
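For reference, a minimal sketch of the Delta-table alternative suggested in the comments, assuming the DataFrame df and the table name are the ones used in the earlier example:

```python
# Persist the DataFrame as a managed Delta table instead of staging it in
# SQL; Power BI can then query it through the Databricks connector.
df.write.format("delta").mode("overwrite").saveAsTable("holding_table")
```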
