0

i've created a data governance tool, in one module of which i schedule a job that runs a few sql queries. It is now running synchronously but i want to convert it to a way that it runs faster.

the question is, should i use Parallel or Task or Async/Await method in order to achieve what i want?

EDIT:these queries are completely different from each other. They each run a different data quality checks on different columns or tables. These are all select queries, and i use the ExecuteScalar method returning only the count of the rows that breaks my quality rules, such as "how many null values in ABC column" I scheduled them to run at night as soon as DWH is populated

15
  • Whatever you want. Commented Oct 15, 2018 at 8:31
  • 1
    async won't make anything run faster, it will release a thread that would be blocked otherwise. Individual queries will be slower than their sync versions though. Parallel execution will decrease performance due to increased locking. Batching queries or passing multiple values in IN() or VALUES() clauses is a lot faster. SqlBulkCopy for bulk inserts is another fast option Commented Oct 15, 2018 at 8:31
  • It depends on what are these SQL queries. Could you share more info on these queries and why do they need to be run on schedule? Commented Oct 15, 2018 at 8:33
  • hi, i edited my post, giving details about the queries Commented Oct 15, 2018 at 8:37
  • 1
    @ExcelinEfendisi I understand the question because I also work with data warehouses and need to run quality checks. How you start the queries won't affect total execution time as much as finding which queries can be run in parallel and which don't. You can run all of them in parallelf with Parallel.ForEach(jobs,job=>runJob(job));, var jobTasks=allJobs.Select(job=>Task.Run(()=>runJob(job)) or jobTasks =await Task.WhenAll(allJobs.Select(job=>runJobAsync(job))). Should you do so though? Or would this result in slower execution due to blocking? All queries run on the disks, RAM Commented Oct 15, 2018 at 10:01

1 Answer 1

0

At the first I suggest you to optimize your SQL queries. After that, you need to use Async/await because you don't need to block thread.

Take a look on the following topic:

Async/await and parallel in C#

Sign up to request clarification or add additional context in comments.

5 Comments

actually, my code runs at the background since it is scheduled to run at night. so i dont have the concern of blocking the thread. i just want my code to complete as soon as possbile
To my mind at the first you have to optimize sql queries by creating some indexes and other techniques.
@MohammadMohabbati this isn't the generic "make my queries run faster" question. The OP wants to run quality checks after an ETL job and asks how to avoid running them sequentially. Those types of queries crunch a lot of data and can end up using a lot of RAM, IO bandwidth. Some of them could be run in parallel to reduce the total time, as long as they don't contend with each other for resources.
@MohammadMohabbati which means, how you run the queries isn't as important as what those queries are and which tables they touch.
@PanagiotisKanavos you are right. I didn't understand the question quite well.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.