Skip to content

SQL reference

R2 SQL is Cloudflare's serverless, distributed, analytics query engine for querying Apache Iceberg tables stored in R2 Data Catalog. This page documents the supported SQL syntax.


Query syntax

SELECT column_list | expression | aggregation_function
FROM namespace_name.table_name
[JOIN namespace_name.table_name ON condition]
[WHERE conditions]
[GROUP BY column_list]
[HAVING conditions]
[ORDER BY expression [ASC | DESC]]
[LIMIT number]

Schema discovery commands

SHOW DATABASES

Lists all available namespaces.

SHOW DATABASES;

SHOW NAMESPACES

Alias for SHOW DATABASES. Lists all available namespaces.

SHOW NAMESPACES;

SHOW TABLES

Lists all tables within a specific namespace.

SHOW TABLES IN namespace_name;

DESCRIBE

Describes the structure of a table, showing column names and data types.

DESCRIBE namespace_name.table_name;

SELECT clause

Syntax

SELECT column_specification [, column_specification, ...]

Column specification

  • Column name: column_name
  • All columns: *
  • Qualified wildcard: table_name.*
  • Column alias: column_name AS alias
  • Expressions: arithmetic, function calls, CASE expressions, and casts

Examples

SELECT * FROM my_namespace.sales_data LIMIT 10
SELECT customer_id, region, total_amount FROM my_namespace.sales_data LIMIT 10
SELECT region, total_amount * 1.1 AS total_with_tax FROM my_namespace.sales_data LIMIT 10

Common table expressions (CTEs)

CTEs let you define named temporary result sets using WITH that you can reference in the main query. CTEs can reference different tables and can include JOINs. A CTE can also be joined with other CTEs or regular tables in the main query.

Syntax

WITH cte_name AS (
SELECT ...
FROM namespace_name.table_name
[WHERE ...]
)
SELECT ... FROM cte_name

Chained CTEs

A CTE can reference a previously defined CTE.

WITH filtered AS (
SELECT customer_id, department, total_amount
FROM my_namespace.sales_data
WHERE total_amount > 0
),
summary AS (
SELECT department,
COUNT(*) AS order_count,
round(AVG(total_amount), 2) AS avg_amount
FROM filtered
GROUP BY department
)
SELECT *
FROM summary
WHERE order_count > 100
ORDER BY avg_amount DESC

CTE joined with another table

WITH enterprise_zones AS (
SELECT zone_id, domain, plan
FROM my_namespace.zones
WHERE plan = 'enterprise'
)
SELECT ez.domain, f.action, COUNT(*) AS cnt
FROM enterprise_zones ez
INNER JOIN my_namespace.firewall_events f ON ez.zone_id = f.zone_id
GROUP BY ez.domain, f.action
ORDER BY cnt DESC
LIMIT 20

Two CTEs joined together

WITH top_zones AS (
SELECT zone_id, COUNT(*) AS req_count
FROM my_namespace.http_requests
GROUP BY zone_id
ORDER BY req_count DESC
LIMIT 50
),
zone_threats AS (
SELECT zone_id, COUNT(*) AS threat_count
FROM my_namespace.firewall_events
WHERE risk_score > 0.5
GROUP BY zone_id
)
SELECT tz.zone_id, tz.req_count, COALESCE(zt.threat_count, 0) AS threat_count
FROM top_zones tz
LEFT JOIN zone_threats zt ON tz.zone_id = zt.zone_id
ORDER BY tz.req_count DESC
LIMIT 20

FROM clause

Syntax

SELECT * FROM namespace_name.table_name

R2 SQL queries can reference one or more tables. Tables are specified as namespace_name.table_name. Multiple tables can be combined using JOINs or comma-separated syntax. Refer to the JOIN clause section for details.


JOIN clause

R2 SQL supports joining multiple Iceberg tables in a single query. All join types use standard SQL syntax.

Supported join types

Join typeSyntaxDescription
Inner joinINNER JOIN ... ONReturns rows that match in both tables
Left outer joinLEFT JOIN ... ONReturns all rows from the left table, NULLs for non-matching right
Right outer joinRIGHT JOIN ... ONReturns all rows from the right table, NULLs for non-matching left
Full outer joinFULL OUTER JOIN ... ONReturns all rows from both tables, NULLs where no match
Cross joinCROSS JOINCartesian product of both tables
Implicit joinFROM t1, t2 WHERE t1.id = t2.idComma-separated tables with join condition in WHERE

Syntax

-- Explicit JOIN
SELECT columns
FROM namespace.table1 alias1
[INNER | LEFT | RIGHT | FULL OUTER | CROSS] JOIN namespace.table2 alias2
ON alias1.column = alias2.column
[WHERE conditions]
-- Implicit join
SELECT columns
FROM namespace.table1 alias1, namespace.table2 alias2
WHERE alias1.column = alias2.column

Multi-way joins

You can join three or more tables in a single query:

SELECT z.domain, h.method, f.action, COUNT(*) AS cnt
FROM my_namespace.zones z
INNER JOIN my_namespace.http_requests h ON z.zone_id = h.zone_id
INNER JOIN my_namespace.firewall_events f ON z.zone_id = f.zone_id
WHERE h.status_code >= 400
GROUP BY z.domain, h.method, f.action
ORDER BY cnt DESC
LIMIT 20

Self-joins

A table can be joined with itself using different aliases:

SELECT f1.source_ip, f1.zone_id AS zone1, f2.zone_id AS zone2
FROM my_namespace.firewall_events f1
INNER JOIN my_namespace.firewall_events f2
ON f1.source_ip = f2.source_ip
AND f1.zone_id < f2.zone_id
WHERE f1.action = 'block'
LIMIT 20

Join conditions

  • Join conditions use the ON clause with equality (=) or expression-based predicates.
  • Functions are supported in join predicates (for example, ON LOWER(a.col) = LOWER(b.col)).
  • Multiple conditions can be combined with AND.

Best practices for joins

  • Include WHERE filters to reduce intermediate result sizes, especially for multi-way joins.
  • Join large fact tables through a shared dimension table rather than directly cross-joining two large tables.
  • Use LIMIT to cap result sizes.

Subqueries

R2 SQL supports subqueries in multiple positions within a query.

Subqueries in FROM (derived tables)

A subquery in the FROM clause creates a derived table that can be referenced in the outer query:

SELECT sub.domain, sub.total_requests
FROM (
SELECT z.domain, COUNT(*) AS total_requests
FROM my_namespace.zones z
INNER JOIN my_namespace.http_requests h ON z.zone_id = h.zone_id
GROUP BY z.domain
) sub
WHERE sub.total_requests > 1000
ORDER BY sub.total_requests DESC
LIMIT 20

Derived tables can be joined with other derived tables or regular tables:

SELECT req.domain, req.total_reqs, fw.total_events
FROM (
SELECT zone_id, domain, COUNT(*) AS total_reqs
FROM my_namespace.zones z
INNER JOIN my_namespace.http_requests h ON z.zone_id = h.zone_id
GROUP BY zone_id, domain
) req
INNER JOIN (
SELECT zone_id, COUNT(*) AS total_events
FROM my_namespace.firewall_events
GROUP BY zone_id
) fw ON req.zone_id = fw.zone_id
ORDER BY fw.total_events DESC
LIMIT 20

IN / NOT IN subqueries

Filter rows based on whether a value exists in the result of a subquery:

-- Find requests from enterprise zones
SELECT method, status_code, COUNT(*) AS cnt
FROM my_namespace.http_requests
WHERE zone_id IN (
SELECT zone_id FROM my_namespace.zones WHERE plan = 'enterprise'
)
GROUP BY method, status_code
ORDER BY cnt DESC
LIMIT 20
-- NOT IN example
SELECT zone_id, COUNT(*) AS cnt
FROM my_namespace.http_requests
WHERE zone_id NOT IN (
SELECT zone_id FROM my_namespace.firewall_events WHERE action = 'block'
)
GROUP BY zone_id
LIMIT 10

EXISTS / NOT EXISTS subqueries

Test for the existence of rows matching a correlated condition:

-- Find zones with blocked firewall events (semi-join)
SELECT z.domain, z.plan
FROM my_namespace.zones z
WHERE EXISTS (
SELECT 1 FROM my_namespace.firewall_events f
WHERE f.zone_id = z.zone_id AND f.action = 'block'
)
ORDER BY z.domain
LIMIT 20
-- Find zones with NO firewall events (anti-join)
SELECT z.domain, z.plan
FROM my_namespace.zones z
WHERE NOT EXISTS (
SELECT 1 FROM my_namespace.firewall_events f
WHERE f.zone_id = z.zone_id
)
ORDER BY z.domain
LIMIT 20

Scalar subqueries

A subquery that returns a single value can be used in SELECT, WHERE, or HAVING:

-- In SELECT (constant value per row)
SELECT z.domain, z.plan,
(SELECT COUNT(*) FROM my_namespace.zones) AS total_zones
FROM my_namespace.zones z
WHERE z.plan = 'enterprise'
LIMIT 10
-- In WHERE (comparison)
SELECT z.domain, z.plan, z.requests_30d
FROM my_namespace.zones z
WHERE z.requests_30d > (
SELECT AVG(requests_30d) FROM my_namespace.zones
)
ORDER BY z.requests_30d DESC
LIMIT 20

WHERE clause

Syntax

SELECT * FROM namespace_name.table_name WHERE condition [AND | OR condition ...]

Conditions

Comparison operators

=, !=, <>, <, >, <=, >=

Null checks

  • column_name IS NULL
  • column_name IS NOT NULL

Boolean checks

  • IS TRUE, IS FALSE, IS NOT TRUE, IS NOT FALSE
  • IS UNKNOWN, IS NOT UNKNOWN

Range

  • column_name BETWEEN value1 AND value2
  • column_name NOT BETWEEN value1 AND value2

List membership

  • column_name IN ('value1', 'value2')
  • column_name NOT IN ('value1', 'value2')

Pattern matching

  • column_name LIKE 'pattern'
  • column_name NOT LIKE 'pattern'
  • column_name ILIKE 'pattern' (case-insensitive)
  • column_name NOT ILIKE 'pattern'
  • column_name SIMILAR TO 'regex_pattern'

Logical operators

  • AND
  • OR
  • NOT

Examples

SELECT * FROM my_namespace.sales_data
WHERE timestamp BETWEEN '2025-09-24T01:00:00Z' AND '2025-09-25T01:00:00Z'
SELECT * FROM my_namespace.sales_data
WHERE status = 200 AND response_time > 1000
SELECT * FROM my_namespace.sales_data
WHERE (region = 'North' OR region = 'South')
AND total_amount IS NOT NULL
SELECT * FROM my_namespace.sales_data
WHERE department ILIKE '%eng%'

GROUP BY clause

Syntax

SELECT column_list, aggregation_function(column)
FROM namespace_name.table_name
[WHERE conditions]
GROUP BY column_list

Examples

SELECT department, COUNT(*) AS dept_count
FROM my_namespace.sales_data
GROUP BY department
SELECT department, category, SUM(total_amount) AS total
FROM my_namespace.sales_data
GROUP BY department, category

HAVING clause

Syntax

SELECT column_list, aggregation_function(column) AS alias
FROM namespace_name.table_name
GROUP BY column_list
HAVING aggregation_function(column) comparison_operator value

Examples

SELECT department, COUNT(*) AS dept_count
FROM my_namespace.sales_data
GROUP BY department
HAVING COUNT(*) > 1000
SELECT region, SUM(total_amount) AS total
FROM my_namespace.sales_data
GROUP BY region
HAVING SUM(total_amount) > 1000000

ORDER BY clause

Syntax

ORDER BY expression [ASC | DESC] [, expression [ASC | DESC], ...]
  • ASC: Ascending order (default)
  • DESC: Descending order
  • Multi-column ordering is supported

Examples

SELECT customer_id, total_amount
FROM my_namespace.sales_data
WHERE total_amount IS NOT NULL
ORDER BY total_amount DESC
LIMIT 50
SELECT department, COUNT(*) AS dept_count
FROM my_namespace.sales_data
GROUP BY department
ORDER BY dept_count DESC, department ASC

LIMIT clause

Syntax

LIMIT number
  • Type: Integer only
  • Default: 500

Examples

SELECT * FROM my_namespace.sales_data LIMIT 100

EXPLAIN

Returns the execution plan for a query without running it.

EXPLAIN SELECT department, COUNT(*) AS dept_count
FROM my_namespace.sales_data
WHERE total_amount IS NOT NULL
GROUP BY department;

EXPLAIN FORMAT JSON

Returns the execution plan as structured JSON for programmatic analysis.

EXPLAIN FORMAT JSON SELECT * FROM my_namespace.sales_data LIMIT 10;

Expressions

Expressions can be used in SELECT, WHERE, GROUP BY, HAVING, and ORDER BY clauses.

Literals

SELECT 42 AS int_val, 3.14 AS float_val, 'hello' AS str_val, TRUE AS bool_val, NULL AS null_val
FROM my_namespace.sales_data LIMIT 1

Arithmetic operators

+, -, *, /, %

SELECT customer_id, total_amount * 1.1 AS total_with_tax, total_amount % 10 AS remainder
FROM my_namespace.sales_data
WHERE total_amount IS NOT NULL
LIMIT 5

String concatenation

SELECT customer_id || ' - ' || region AS label
FROM my_namespace.sales_data
LIMIT 5

CASE expressions

Searched form:

SELECT customer_id,
CASE
WHEN total_amount > 1000 THEN 'high'
WHEN total_amount > 100 THEN 'medium'
ELSE 'low'
END AS tier
FROM my_namespace.sales_data
LIMIT 10

Simple form:

SELECT customer_id,
CASE region
WHEN 'North' THEN 'N'
WHEN 'South' THEN 'S'
ELSE 'Other'
END AS region_code
FROM my_namespace.sales_data
LIMIT 10

Type casting

-- CAST
SELECT CAST(total_amount AS INT) AS amount_int FROM my_namespace.sales_data LIMIT 5
-- TRY_CAST (returns NULL on failure instead of error)
SELECT TRY_CAST(customer_id AS INT) AS id_int FROM my_namespace.sales_data LIMIT 5
-- Shorthand (::)
SELECT total_amount::INT AS amount_int FROM my_namespace.sales_data LIMIT 5

EXTRACT

SELECT EXTRACT(YEAR FROM timestamp) AS yr,
EXTRACT(MONTH FROM timestamp) AS mo,
EXTRACT(DAY FROM timestamp) AS dy
FROM my_namespace.sales_data
LIMIT 1

Data type reference

TypeDescriptionExample Values
integerWhole numbers1, 42, -10, 0
floatDecimal numbers1.5, 3.14, -2.7, 0.0
stringText values'hello', 'GET', '2024-01-01'
booleanBoolean valuestrue, false
timestampRFC3339'2025-09-24T01:00:00Z'
dateDate values'2025-09-24'
structNamed fieldsstruct_col['field_name']
arrayOrdered listarray_col[1] (1-indexed)
mapKey-value pairsmap_keys(map_col)

Operator precedence

  1. Comparison operators: =, !=, <, <=, >, >=, LIKE, BETWEEN, IS NULL, IS NOT NULL
  2. AND (higher precedence)
  3. OR (lower precedence)

Use parentheses to override default precedence:

SELECT * FROM my_namespace.sales_data WHERE (status = 404 OR status = 500) AND region = 'North'

Complete query examples

Basic query

SELECT *
FROM my_namespace.sales_data
WHERE timestamp BETWEEN '2025-09-24T01:00:00Z' AND '2025-09-25T01:00:00Z'
LIMIT 100

Filtered query with sorting

SELECT customer_id, timestamp, status, total_amount
FROM my_namespace.sales_data
WHERE status >= 400 AND total_amount > 5000
ORDER BY total_amount DESC
LIMIT 50

Aggregation with HAVING

SELECT region, COUNT(*) AS region_count, AVG(total_amount) AS avg_amount
FROM my_namespace.sales_data
WHERE status = 'completed'
GROUP BY region
HAVING COUNT(*) > 1000
ORDER BY avg_amount DESC
LIMIT 20

Conditional categorization

SELECT customer_id,
CASE
WHEN total_amount >= 1000 THEN 'Premium'
WHEN total_amount >= 100 THEN 'Standard'
ELSE 'Basic'
END AS tier,
total_amount
FROM my_namespace.sales_data
WHERE total_amount IS NOT NULL
ORDER BY total_amount DESC
LIMIT 20