
Data-Driven Design: Leveraging Lessons from Game Development in Everyday Software

Originally posted on Methodox Wiki: Data-Driven Design.

Overview

Modern software often needs to adapt quickly - whether that means processing new data sets, adjusting to user preferences, or deploying new features safely without downtime. To achieve such flexibility, software engineers increasingly adopt a methodology known as Data-Driven Design (DDD).

Originally popularized by game development, Data-Driven Design emerged prominently in the 1990s as large studios confronted a challenging problem: the need to iterate rapidly on complex and interactive content. Game developers realized it was costly and slow to rebuild and redeploy an entire game every time designers wanted to tweak gameplay mechanics, adjust character behaviors, or revise in-game dialogues.

Jason Gregory's influential book Game Engine Architecture highlighted how AAA games effectively tackled this challenge by externalizing game logic into structured data files. Instead of embedding behaviors directly in C++ code, developers loaded data such as AI rules, game levels, item descriptions, and story dialogues from easily editable files like YAML, JSON, or custom formats. This dramatically accelerated iteration, empowering non-programmers - artists, designers, and writers - to directly experiment and refine experiences without requiring code recompilation or redeployment.

Although originally rooted in game development, Data-Driven Design has proven invaluable across software domains, ranging from web development, data analytics, and DevOps automation to no-code and low-code platforms. The fundamental principle remains the same: separate generic engines from domain-specific data.

The Core Idea Behind Data-Driven Design

Before we dive deeper, let's provide a clear working definition that illustrates why Data-Driven Design is relevant to developers and system administrators today:

Data-Driven Design means that the software's behavior is governed by external data rather than hard-coded logic. The code provides generic mechanisms for processing that data, but the specifics of “what to do” or “how to behave” live in data files (such as YAML, JSON, or databases) that can be changed independently from the source code itself.

To clarify further, here's a quick comparison:

| Pattern | What drives behavior? | Practical example |
| --- | --- | --- |
| Hard-coded | Embedded conditional statements in code | if user.is_premium: enable_feature() |
| Config-driven | Simple flags or settings in config files | max_connections = 10 in config.ini |
| Data-driven | Entire behaviors defined in structured data (YAML, JSON, SQLite) | YAML defining a workflow, or SQLite storing business rules |
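
To make the data-driven row concrete, here's a minimal sketch - the rule file, field names, and actions are purely illustrative - where the engine only knows how to evaluate rules, while what the rules say lives entirely in YAML:

# rules.yaml might look like:
#   - field: is_premium
#     equals: true
#     action: enable_feature

import yaml

def run_rules(user, rules_path="rules.yaml"):
    # Generic engine: it knows how to match fields, not what the rules mean.
    with open(rules_path) as f:
        rules = yaml.safe_load(f)
    return [r["action"] for r in rules if user.get(r["field"]) == r["equals"]]

print(run_rules({"is_premium": True}))   # -> ['enable_feature']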

Below are some common misconceptions and anti-patterns to help clarify things:

  • "It's just a config file." – If removing the file breaks the program, it's not mere config; it's content that defines runtime behaviour.
  • "DDD = No code." – Wrong: the goal is to keep the engine generic and thin, but domain logic still has to live somewhere (often in a domain-specific language interpreted by the engine).
  • Premature complexity – Don't invent a custom DSL when a handful of YAML documents plus a small interpreter class will do.
  • Debugging blind – Always log "which data row caused this action?" so you can trace bugs quickly.

High-Level Architecture Overview

A typical Data-Driven Design architecture clearly separates:

  1. Authoring Layer

    • Human-readable formats like YAML or JSON.
    • Editable directly by users or via automated processes.
  2. Validation & Build Layer

    • Schema definitions or migrations that ensure data consistency.
  3. Runtime Loader & Hot Reload

    • Reads and validates data at runtime.
    • Supports hot-reload for rapid iterations.
  4. Generic Runtime System

    • Executes logic based purely on loaded data.

This creates a robust pipeline where data edits alone trigger different application behaviors.

Data Files (YAML/SQLite)
    │ load & validate
    ▼
Generic Runtime Engine
    │ executes behaviors based on data
    ▼
Dynamic Behavior in Application
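As a rough sketch of layers 2–4 (validation, runtime loading, hot reload, and the generic runtime), the loader below polls a YAML file and re-activates its contents whenever the file changes on disk. The file name, rule shape, and polling approach are illustrative assumptions, not a prescription:

import os
import time
import yaml

RULES_PATH = "rules.yaml"   # authoring layer: a human-editable data file

def load_and_validate(path):
    # Validation layer: fail fast on malformed entries before activating them.
    with open(path) as f:
        rules = yaml.safe_load(f) or []
    for rule in rules:
        if "action" not in rule:
            raise ValueError(f"rule missing 'action': {rule}")
    return rules

def watch(path, on_change, interval=1.0):
    # Runtime loader & hot reload: re-read the file whenever it changes.
    last_mtime = None
    while True:
        mtime = os.path.getmtime(path)
        if mtime != last_mtime:
            last_mtime = mtime
            on_change(load_and_validate(path))
        time.sleep(interval)

# Generic runtime system: its behavior is whatever the loaded data says.
watch(RULES_PATH, lambda rules: print(f"activated {len(rules)} rules"))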

Choosing Your Data Container

Here's a quick guide to choosing your data format based on needs:

| Format | Advantages | Typical use case |
| --- | --- | --- |
| YAML/JSON/TOML | Human-readable, simple to version control | Small-to-medium complexity workflows, configs |
| SQLite | Relational queries, consistency, single file | Complex rule sets, relational data, analytics |
| Binary formats | Performance-critical loading | Embedded systems, high-performance scenarios |

Typically, you start with YAML or JSON, then upgrade to SQLite as complexity grows.
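
For example, once relational queries become useful, the same kind of rules can move into SQLite. The table and column names below are made up for illustration:

import sqlite3

conn = sqlite3.connect("rules.db")
conn.execute(
    "CREATE TABLE IF NOT EXISTS routing_rules (pattern TEXT, priority INTEGER, action TEXT)"
)
conn.execute(
    "INSERT INTO routing_rules VALUES ('invoice', 1, 'forward_to_accounting')"
)
conn.commit()

# The engine stays generic: it asks the data which action applies, in priority order.
for pattern, action in conn.execute(
    "SELECT pattern, action FROM routing_rules ORDER BY priority"
):
    print(pattern, "->", action)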

Practical Example: Data-Driven Workflows with Divooka

Divooka is a visual programming language that naturally embodies Data-Driven Design principles through its composable dataflow nodes. Let's illustrate this with a straightforward example.

Scenario:

A data analyst wants to build an automated workflow to:

  • Load a CSV file containing customer orders.
  • Filter orders exceeding a certain threshold.
  • Send a summary via a web API.

In Divooka, the process is streamlined as follows:

1. Load Data with Path and File Nodes

  • The Path node specifies the file location (e.g., /data/orders.csv).
  • The Load from CSV node reads the file directly into a structured DataGrid.
[Path Node] "/data/orders.csv"
    │
    └─► [Load from CSV Node]
           │
           └─► DataGrid

Equivalent Python-like pseudocode:

import pandas as pd
orders_df = pd.read_csv("/data/orders.csv")

2. Filtering Data

  • The Filter node takes the DataGrid and filters rows based on a condition (e.g., order_total > 100).
DataGrid
    │
    └─► [Filter Node] condition: order_total > 100
           │
           └─► Filtered DataGrid

Equivalent pseudocode:

high_value_orders = orders_df[orders_df["order_total"] > 100]

3. Sending Data via HTTP API

  • The String node specifies the webhook URI.
  • The HTTP Send Request node sends the filtered data, automatically serialized as JSON.
[String Node] "https://hooks.example.com/orders"
    │
    └─► [Send Request Node] method: POST, body: Filtered DataGrid

Equivalent pseudocode:

import requests
# POST the filtered orders to the webhook, serialized as JSON records
payload = high_value_orders.to_dict(orient='records')
requests.post("https://hooks.example.com/orders", json=payload)

Why Divooka Is Naturally Data-Driven:

  • High-level abstraction nodes (Path, String, File I/O, Web Request) handle data transparently.
  • Nodes clearly define dependencies and flow, automatically adjusting behavior when input data changes.
  • Entire workflows can be modified without touching underlying engine code or restarting the runtime environment.

Best Practices and Pitfalls

When embracing Data-Driven Design, keep the following in mind:

| Practice | Recommendation |
| --- | --- |
| Schema validation | Always enforce schemas (e.g., JSON Schema or SQLite migrations) to prevent errors |
| Hot reloading | Clearly separate parsing from data activation steps |
| Security considerations | Never trust external data implicitly; always validate or sanitize it before use |
| Performance optimization | Cache loaded data where appropriate to avoid redundant parsing |
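
As one way to apply the first row, a workflow document loaded from YAML can be checked against a JSON Schema before the engine ever activates it. The schema and workflow shape below are invented for illustration, using the common Python jsonschema package:

import yaml
from jsonschema import validate, ValidationError

WORKFLOW_SCHEMA = {
    "type": "object",
    "required": ["name", "steps"],
    "properties": {
        "name": {"type": "string"},
        "steps": {
            "type": "array",
            "items": {"type": "object", "required": ["node"]},
        },
    },
}

workflow = yaml.safe_load("""
name: high-value-orders
steps:
  - node: load_csv
  - node: filter
""")

try:
    # Reject malformed data before it ever reaches the runtime engine.
    validate(instance=workflow, schema=WORKFLOW_SCHEMA)
except ValidationError as err:
    raise SystemExit(f"invalid workflow: {err.message}")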

Try It Yourself: Starter Projects

To practice Data-Driven Design, consider these beginner-friendly projects:

  • Feature Flag Dashboard: YAML-based flags control feature visibility.
  • Email Routing Automation: SQLite stores routing rules to sort incoming emails.
  • Procedural Content Generator: YAML files configure parameters for generated data outputs.

Each project reinforces the same point: the engine stays generic while the data drives the behavior - see the feature-flag sketch below for a starting point.
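
Here's one possible starting point for the feature-flag project; the flag names and file layout are just an assumption:

import yaml

# flags.yaml might look like:
#   new_dashboard: true
#   beta_search: false

with open("flags.yaml") as f:
    FLAGS = yaml.safe_load(f) or {}

def is_enabled(flag):
    # Feature visibility is decided by the data file, not by a code change.
    return bool(FLAGS.get(flag, False))

if is_enabled("new_dashboard"):
    print("rendering new dashboard")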

Conclusion

Data-Driven Design originated as a powerful solution to rapid iteration in the game development world, enabling large teams to build and adjust complex software efficiently. Today, its principles transcend industries, benefiting system administrators, data analysts, web developers, and more.

In tools like Divooka, Data-Driven Design emerges naturally - dataflow-based nodes inherently separate behavior from implementation, ensuring flexibility and robustness. By clearly distinguishing between generic program logic and externally driven data, you create maintainable, adaptable, and resilient software.
