Do you really need to use NoSQL databases? Or can a relational database satisfy your semi-structured data requirements?
PostgreSQL is a very complete and popular relational database, but did you know that it also has several functionalities of a non-relational database?
How does PostgreSQL support non-relational data?
PostgreSQL supports non-relational data in 3 different ways:
- XML
- JSON
- JSONB
XML
PostgreSQL supports XML through a native xml
data type. You can store and query XML data using XPath.
CREATE TABLE products (
id SERIAL PRIMARY KEY,
data XML
);
INSERT INTO products (data)
VALUES (
'<product><name>Smartphone</name><price>999.99</price></product>'
);
-- Extract the product name using XPath
SELECT
xpath('//name/text()', data) AS name
FROM products;
XML support is useful when you're integrating with legacy systems or external APIs that still use XML formats. But it's not very efficient compared to JSON for most modern use cases.
JSON
PostgreSQL introduced native support for the json
data type to store plain text JSON structures. It's great for storing semi-structured data.
CREATE TABLE users (
id SERIAL PRIMARY KEY,
profile JSON
);
INSERT INTO users (profile)
VALUES (
'{
"name": "John Doe",
"age": 21,
"skills": ["golang", "typescript", "clickhouse"]
}'
);
-- Querying JSON data
SELECT
profile->>'name' AS name,
profile->'skills' AS skills
FROM users;
The json
type preserves the original formatting, including whitespace and ordering of keys, which is good for logging or raw data storage. However, it’s not the most efficient format for querying, even though writes are fine.
JSONB
jsonb
stands for binary JSON. It stores JSON in a binary format that's optimized for indexing and querying. If you're going to filter or search data inside JSON structures, prefer jsonb
.
CREATE TABLE events (
id SERIAL PRIMARY KEY,
metadata JSONB
);
INSERT INTO events (metadata)
VALUES (
'{
"type": "click",
"user_id": 42,
"timestamp": "2025-06-18T09:00:00Z"
}'
);
-- Filter events by user_id
SELECT *
FROM events
WHERE metadata->>'user_id' = '42';
-- Create a GIN index for better performance
CREATE INDEX idx_metadata ON events USING GIN (metadata);
jsonb
allows indexing with GIN (Generalized Inverted Indexes), which makes querying large JSON documents much faster. It also removes duplicate keys and ignores whitespace. If GIN indexing doesn't solve your case, you can use B-Tree and Hash indexes as well; however, they might not be extremely useful.
Final Comparison
To help you decide which one to use, here's a table comparing these three NoSQL in Postgres:
Feature | JSON | JSONB | XML |
---|---|---|---|
Data type | Text | Binary format | Text |
Storage efficiency | Less efficient | More efficient | Less efficient |
Read performance | Slower | Faster | Slower |
Write performance | Faster (no parsing on insert) | Slower (parsing + conversion) | Slower |
Index support | Limited | Supports GIN, B-Tree and Hash | Functional indexes only |
Key order preserved | ✅ Yes | ❌ No | ✅ Yes |
Duplicate keys allowed | ✅ Yes | ❌ No (last key kept) | ✅ Yes |
Schema enforcement | ❌ None | ❌ None | ❌ None |
Ideal use case | Logging, preserving raw data | Querying, filtering, indexing | Legacy systems, XML APIs |
Conclusion
PostgreSQL has a solid toolset for handling semi-structured data, making it extremely suitable for performing as a NoSQL database when needed. For this reason, if you are planning a new system that will require semi-structured data processing, you should consider using PostgreSQL if it's already part of your team's tech stack. It can be a very good choice without increasing complexity or introducing a new element with a steeper learning curve.
It's important to notice that there are many cases where you will need an actual NoSQL database, like MongoDB, Cassandra, or Scylla, etc. So, remember to evaluate each one's strengths and weaknesses before making your architecture decision.
Top comments (0)