Tweeted twitter.com/StackSoftEng/status/791327918257532928
Robert Harvey

How to create a timed-event architecture using a SQL database

The title of my question is general because I feel like this problem is of a general nature, but to set the stage I'm going to provide a specific example.

We use a homegrown workflow engine that is driven by database tables. Within those tables lurks a directed graph that represents the workflow. The graph contains Stages and Activities; a line is drawn between two Stage nodes, and the resulting Activity node contains code to be executed. We use CSScript to compile and execute the code on the fly.

Within the workflow, Task records represent the work to be executed. Each Task contains some relevant metadata in XML form. The Task records traverse the directed graph, and the code is executed as the Task passes through the activity. So at any given moment, each stage might contain x number of tasks, waiting to be executed on an activity.

To execute a Task on an activity, it needs to be scheduled. A Schedule record containing a datetime, taskid, stageid and activityid determines when and where this Task gets executed next. Periodically, we execute a query that returns Schedule records that are due, and then for each record so returned we stand up an Activity instance and execute it, handing it the Task record as a parameter.
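To make the shape of that "what is due?" query concrete, here is a minimal sketch. It is not our actual schema or code: the table and column names (`schedule`, `due_at`) are made up for illustration, and it uses Python with an in-memory SQLite database purely so it can be run standalone.

```python
import sqlite3
from datetime import datetime, timedelta


def create_schedule_table(conn):
    # Hypothetical schema: one row per scheduled execution of a Task.
    conn.execute(
        """CREATE TABLE schedule (
               due_at     TEXT    NOT NULL,
               taskid     INTEGER NOT NULL,
               stageid    INTEGER NOT NULL,
               activityid INTEGER NOT NULL)"""
    )


def fetch_due(conn, now):
    """Return schedule rows whose due time is at or before `now`, oldest first."""
    cur = conn.execute(
        "SELECT due_at, taskid, stageid, activityid FROM schedule "
        "WHERE due_at <= ? ORDER BY due_at",
        (now.isoformat(),),  # ISO-8601 strings of equal precision sort chronologically
    )
    return cur.fetchall()


def demo():
    conn = sqlite3.connect(":memory:")
    create_schedule_table(conn)
    now = datetime(2024, 1, 1, 12, 0, 0)  # arbitrary fixed "current time" for the demo
    rows = [
        ((now - timedelta(seconds=5)).isoformat(), 1, 10, 100),  # already due
        ((now + timedelta(minutes=5)).isoformat(), 2, 10, 101),  # not yet due
    ]
    conn.executemany("INSERT INTO schedule VALUES (?, ?, ?, ?)", rows)
    return fetch_due(conn, now)
```

In this sketch `demo()` returns only the row for task 1; the row for task 2 stays in the table until its due time passes. Each returned row is what would be handed to an Activity instance.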

This query used to run 10 times per second. Recently, I added some code that counts how many times the query returns no records; if this count gets to 60, I slow the polling rate to once per second and start counting again. If the count reaches 60 again, I slow the rate to once per minute. If a record appears in the query result, I set the rate back to 10 times per second and begin the counting process again. The net effect is that the schedule table is polled rapidly during busy periods and sparsely during quiet periods. We expect to save a few hundred dollars per Azure instance per month, just from this one simple change.
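The back-off logic just described can be sketched roughly as follows. This is an illustration rather than our production code: the class name `AdaptivePoller` and its members are invented for the example, and it is written in Python only to keep it short and runnable.

```python
class AdaptivePoller:
    """Tracks the polling interval: 0.1 s -> 1 s -> 60 s, stepping after 60 empty polls."""

    INTERVALS = (0.1, 1.0, 60.0)  # seconds between polls at each level
    EMPTY_LIMIT = 60              # consecutive empty results before slowing down

    def __init__(self):
        self.level = 0
        self.empty_count = 0

    @property
    def interval(self):
        return self.INTERVALS[self.level]

    def record(self, found_rows):
        """Update state after one poll and return the interval to wait before the next."""
        if found_rows:
            # Any due record snaps polling back to the fastest rate.
            self.level = 0
            self.empty_count = 0
        else:
            self.empty_count += 1
            if self.empty_count >= self.EMPTY_LIMIT:
                self.empty_count = 0  # start counting again
                if self.level < len(self.INTERVALS) - 1:
                    self.level += 1   # slow down one step; stay at the slowest level
        return self.interval
```

A polling loop would run the due-records query, pass whether it returned anything to `record()`, and sleep for the interval it returns. Note that this still polls; it only changes how often.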

So here's my question.

This is obviously a polling pattern. Is there a way to make it "event-driven," so that the database is only hit when a schedule record is due, without having to constantly poll the database?
