Cameron Archer for Tinybird

Build a Real-Time Customer Segmentation API with Tinybird

Personalized experiences and targeted marketing campaigns depend on understanding customer behavior. A real-time customer segmentation API lets you analyze and segment customers by their behavior and attributes as the data arrives. This tutorial walks through building such an API with Tinybird, focusing on retrieving customer segments, analyzing engagement patterns, and summarizing segment data to support marketing efforts.

Tinybird is a data analytics backend for software developers. You use Tinybird to build real-time analytics APIs without needing to set up or manage the underlying infrastructure. Tinybird offers a local-first development workflow, git-based deployments, resource definitions as code, and features for AI-native developers. With Tinybird's data sources and pipes, you can ingest, transform, and serve large volumes of data through low-latency APIs, which supports real-time decision-making and personalization at scale.

Understanding the data

Imagine your data looks like this: a collection of customer profiles and a log of their activities. The customer profiles include demographics and a unique customer ID, while the activity log tracks events such as purchases and page views. To store this data in Tinybird, you create two data sources: customers and customer_events. Here's how you define their schemas:

Datasource for customer information:

DESCRIPTION >
    Customer information including demographic data and customer IDs

SCHEMA >
    `customer_id` String `json:$.customer_id`,
    `name` String `json:$.name`,
    `email` String `json:$.email`,
    `age` Int32 `json:$.age`,
    `gender` String `json:$.gender`,
    `location` String `json:$.location`,
    `signup_date` DateTime `json:$.signup_date`,
    `lifetime_value` Float64 `json:$.lifetime_value`,
    `timestamp` DateTime `json:$.timestamp`

ENGINE "MergeTree"
ENGINE_PARTITION_KEY "toYYYYMM(timestamp)"
ENGINE_SORTING_KEY "customer_id, timestamp"

Datasource for customer events:

DESCRIPTION >
    Customer activity events like purchases, logins, page views, etc.

SCHEMA >
    `event_id` String `json:$.event_id`,
    `customer_id` String `json:$.customer_id`,
    `event_type` String `json:$.event_type`,
    `event_value` Float64 `json:$.event_value`,
    `product_id` String `json:$.product_id`,
    `category` String `json:$.category`,
    `timestamp` DateTime `json:$.timestamp`

ENGINE "MergeTree"
ENGINE_PARTITION_KEY "toYYYYMM(timestamp)"
ENGINE_SORTING_KEY "customer_id, timestamp"

Both schemas use a sorting key of customer ID and timestamp, so lookups for a single customer and scans over a time range for that customer are fast.

For data ingestion, Tinybird's Events API lets you stream JSON/NDJSON events from your application frontend or backend with a simple HTTP request. Events sent this way become available for querying with low latency. Here's sample ingestion code for each of the data sources:

Ingesting customer data:

curl -X POST "https://api.europe-west2.gcp.tinybird.co/v0/events?name=customers&utm_source=DEV&utm_campaign=tb+create+--prompt+DEV" \
  -H "Authorization: Bearer $TB_ADMIN_TOKEN" \
  -d '{
    "customer_id": "cust123",
    "name": "John Doe",
    ... }'

Ingesting customer event data:

curl -X POST "https://api.europe-west2.gcp.tinybird.co/v0/events?name=customer_events&utm_source=DEV&utm_campaign=tb+create+--prompt+DEV" \
  -H "Authorization: Bearer $TB_ADMIN_TOKEN" \
  -d '{
    "event_id": "evt456",
    ... }'

Beyond the Events API, Tinybird also supports data ingestion via Kafka connectors for event/streaming data and the Data Sources API and S3 connectors for batch/file data.
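
As a complement to the curl calls above, here is a minimal Python sketch of how a backend might batch several events into one NDJSON body for the Events API. The host, path, and `name` parameter are copied from the curl examples; the helper names and the sample field values (other than `evt456` and `cust123`) are hypothetical, and the request is only built, not sent.

```python
import json
import urllib.parse
import urllib.request

# Host and path copied from the curl examples; adjust to your workspace region.
EVENTS_URL = "https://api.europe-west2.gcp.tinybird.co/v0/events"


def to_ndjson(events):
    """Serialize a list of dicts as NDJSON: one JSON object per line."""
    return "\n".join(json.dumps(event) for event in events)


def build_events_request(datasource, events, token):
    """Build (but do not send) a POST request for the Events API."""
    query = urllib.parse.urlencode({"name": datasource})
    return urllib.request.Request(
        f"{EVENTS_URL}?{query}",
        data=to_ndjson(events).encode("utf-8"),
        headers={"Authorization": f"Bearer {token}"},
        method="POST",
    )


# Hypothetical sample rows; field names follow the customer_events schema.
sample_events = [
    {"event_id": "evt456", "customer_id": "cust123", "event_type": "purchase",
     "event_value": 49.99, "product_id": "prod1", "category": "books",
     "timestamp": "2023-06-01 12:00:00"},
    {"event_id": "evt457", "customer_id": "cust123", "event_type": "page_view",
     "event_value": 0.0, "product_id": "prod2", "category": "books",
     "timestamp": "2023-06-01 12:05:00"},
]

request = build_events_request("customer_events", sample_events, "<your-token>")
# To actually send it: urllib.request.urlopen(request)
```

Batching several events per request keeps the number of HTTP calls down; whether that matters depends on your event volume.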

Transforming data and publishing APIs

In Tinybird, pipes are used to transform data and publish APIs. Pipes can perform batch transformations, create real-time Materialized views, and expose the transformed data through API endpoints.

customer_segment_lookup

This endpoint categorizes customers into value and loyalty segments based on lifetime value and signup date.

DESCRIPTION >
    Endpoint to look up a customer's segment based on their customer ID

SQL >
    SELECT 
        c.customer_id,
        ... CASE
            WHEN DATEDIFF('day', c.signup_date, now()) > 365 THEN 'Loyal'
            ... FROM customers c
    WHERE c.customer_id = {{String(customer_id, '')}}

TYPE endpoint

This SQL logic leverages conditional statements to segment customers, using the customer_id as a query parameter to retrieve individual customer segments.
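
To make the segmentation rules easier to reason about, here is a small Python sketch that mirrors the shape of the CASE logic. Only the 365-day 'Loyal' rule is visible in the pipe above; the value thresholds, the 'Other' fallback, and all function names are hypothetical placeholders, since the remaining branches are elided.

```python
from datetime import datetime

# Hypothetical thresholds: the pipe's value branches are elided, so these
# numbers are placeholders, not the values the endpoint uses.
HIGH_VALUE_THRESHOLD = 1000.0
MEDIUM_VALUE_THRESHOLD = 500.0


def loyalty_segment(signup_date, now):
    """Mirror the visible rule: more than 365 days since signup is 'Loyal'."""
    # Compare calendar dates, which approximates counting day boundaries.
    days_since_signup = (now.date() - signup_date.date()).days
    if days_since_signup > 365:
        return "Loyal"
    return "Other"  # placeholder for the branches elided in the pipe


def value_segment(lifetime_value):
    """Bucket a customer by lifetime value using the hypothetical thresholds."""
    if lifetime_value >= HIGH_VALUE_THRESHOLD:
        return "High Value"
    if lifetime_value >= MEDIUM_VALUE_THRESHOLD:
        return "Medium Value"
    return "Low Value"


now = datetime(2024, 1, 1)
print(loyalty_segment(datetime(2022, 1, 1), now), value_segment(1500.0))
```

A local mirror like this can be handy for unit-testing threshold changes before editing the pipe, as long as the two are kept in sync.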

customer_engagement_analysis

This endpoint analyzes customer engagement by counting activities and summing event values within a specified date range and for certain event types.

DESCRIPTION >
    Analyze customer engagement based on events, with filters for date range and event types

SQL >
    SELECT 
        c.customer_id,
        ... FROM customers c
    LEFT JOIN customer_events e ON c.customer_id = e.customer_id
    ... LIMIT {{Int32(limit, 100)}}

TYPE endpoint

The SQL logic for this endpoint includes joining tables, conditional filtering, and aggregation functions to provide insights into customer engagement.
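
The join, filter, and aggregate steps can be sketched in plain Python to show the kind of result the endpoint produces. This is an illustration under assumptions, not the pipe's actual query: the output column names (`event_count`, `total_value`), the sort order, and the choice to keep customers with no matching events are all guesses, because most of the SQL is elided.

```python
from datetime import datetime


def engagement_summary(customers, events, start, end, event_types=None, limit=100):
    """Per-customer event count and summed event_value within [start, end]."""
    # Start every customer at zero so customers without events still appear,
    # which is the usual intent of a LEFT JOIN from customers to events.
    totals = {
        c["customer_id"]: {"customer_id": c["customer_id"],
                           "event_count": 0, "total_value": 0.0}
        for c in customers
    }
    for event in events:
        row = totals.get(event["customer_id"])
        if row is None:
            continue  # event for a customer we do not know about
        if not (start <= event["timestamp"] <= end):
            continue  # outside the requested date range
        if event_types is not None and event["event_type"] not in event_types:
            continue  # filtered out by event type
        row["event_count"] += 1
        row["total_value"] += event["event_value"]
    # Sort order is an assumption; the pipe's ORDER BY is not shown.
    ranked = sorted(totals.values(), key=lambda r: r["total_value"], reverse=True)
    return ranked[:limit]


# Hypothetical sample data.
customers = [{"customer_id": "cust123"}, {"customer_id": "cust999"}]
events = [
    {"customer_id": "cust123", "event_type": "purchase", "event_value": 20.0,
     "timestamp": datetime(2023, 3, 1)},
    {"customer_id": "cust123", "event_type": "purchase", "event_value": 30.0,
     "timestamp": datetime(2023, 7, 1)},
    {"customer_id": "cust123", "event_type": "page_view", "event_value": 0.0,
     "timestamp": datetime(2023, 7, 2)},
    {"customer_id": "cust123", "event_type": "purchase", "event_value": 99.0,
     "timestamp": datetime(2022, 12, 31)},
]

summary = engagement_summary(
    customers, events,
    start=datetime(2023, 1, 1), end=datetime(2023, 12, 31, 23, 59, 59),
    event_types={"purchase"},
)
```

One thing to check in the real pipe: when filters on event columns sit in a WHERE clause after a LEFT JOIN, customers with no matching events are dropped, which differs from the zero-filled rows this sketch returns.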

customer_segments_summary

This endpoint summarizes customer segments, showing customer counts and average lifetime value per segment, with optional filters for location and age.

DESCRIPTION >
    Summary of customer segments showing counts and average lifetime value by segment

SQL >
    SELECT 
        CASE
            ... FROM customers
    ... ORDER BY avg_lifetime_value DESC

TYPE endpoint

This pipe uses a SQL CASE expression to assign each customer to a segment and GROUP BY to aggregate per segment, with query parameters for dynamic filtering. Example API call:

curl -X GET "https://api.europe-west2.gcp.tinybird.co/v0/pipes/customer_segments_summary.json?location=New+York&min_age=25&max_age=45&token=$TB_ADMIN_TOKEN&utm_source=DEV&utm_campaign=tb+create+--prompt+DEV"
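
Because the location and age filters are optional, client code usually needs to leave out any parameter the caller did not set. Here is a hedged Python sketch of that; the base URL and parameter names come from the curl example, while `build_endpoint_url` is a hypothetical helper.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Host and path copied from the curl example; adjust to your workspace region.
BASE_URL = "https://api.europe-west2.gcp.tinybird.co/v0/pipes"


def build_endpoint_url(pipe_name, token, **filters):
    """Build a .json endpoint URL, omitting filters that are None."""
    params = {key: value for key, value in filters.items() if value is not None}
    params["token"] = token
    # urlencode handles escaping, e.g. "New York" becomes "New+York".
    return f"{BASE_URL}/{pipe_name}.json?{urlencode(params)}"


url = build_endpoint_url(
    "customer_segments_summary",
    token="<your-token>",
    location="New York",
    min_age=25,
    max_age=None,  # not set by the caller, so left out of the query string
)
print(url)
```

How the endpoint behaves when a filter is omitted depends on how the pipe handles missing parameters, which the elided SQL does not show.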

Deploying to production

Deploying your project to Tinybird Cloud is as simple as running tb --cloud deploy in your command line. This command deploys your data sources and pipes, creating production-ready, scalable API Endpoints. Tinybird manages your resources as code, making it easy to integrate with CI/CD pipelines for automated deployments. To secure your APIs, Tinybird uses token-based authentication, ensuring that only authorized requests can access your data. Here's how you might call a deployed endpoint:

curl -X GET "https://api.europe-west2.gcp.tinybird.co/v0/pipes/customer_engagement_analysis.json?token=$TB_ADMIN_TOKEN&event_type=purchase&start_date=2023-01-01+00%3A00%3A00&end_date=2023-12-31+23%3A59%3A59&limit=50&utm_source=DEV&utm_campaign=tb+create+--prompt+DEV"
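
From application code, you may prefer to keep the token out of the URL. The sketch below assumes the endpoint accepts the same `Authorization: Bearer` header used in the ingestion examples, and that the JSON response carries its rows under a top-level `data` key; verify both against your workspace before relying on them. The helper names are hypothetical, and the request is built but not sent.

```python
import json
import os
import urllib.request
from urllib.parse import urlencode

# Host and path copied from the curl example; adjust to your workspace region.
BASE_URL = "https://api.europe-west2.gcp.tinybird.co/v0/pipes"


def build_engagement_request(token, **params):
    """Build (but do not send) a GET request with the token in a header."""
    url = f"{BASE_URL}/customer_engagement_analysis.json?{urlencode(params)}"
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})


def rows_from_response(body):
    """Return result rows, assuming they sit under a top-level "data" key."""
    return json.loads(body).get("data", [])


# Read the token from the environment instead of hard-coding it.
token = os.environ.get("TB_ADMIN_TOKEN", "<your-token>")
request = build_engagement_request(
    token,
    event_type="purchase",
    start_date="2023-01-01 00:00:00",
    end_date="2023-12-31 23:59:59",
    limit=50,
)
# To send it: rows = rows_from_response(urllib.request.urlopen(request).read())
```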

Conclusion

In this tutorial, you've learned how to ingest, transform, and expose customer data through real-time APIs using Tinybird. With a customer segmentation API in place, you can analyze customer behavior, segment customers by their attributes and activities, and support targeted marketing and personalized experiences. Tinybird simplifies working with large volumes of data in real time, so you can focus on creating value from your data rather than managing infrastructure. Sign up for Tinybird to build and deploy your first real-time data APIs in a few minutes.
