How to index and query a JSON object in PostgreSQL®?
- 4 minutes read - 742 wordsPostgreSQL® offers two types of data types to handle JSON data, JSON
and JSONB
. This doc showcases how to index a JSONB
column with a GIN index.
Some of the examples are based on this blog.
NOTE: To review the differences between
JSON
andJSONB
check out the related article.
NOTE: more info is available in the PostgreSQL JSON functions documentation page
👉 Want a FREE PostgreSQL database?👈
🦀 Check Aiven’s FREE plans! 🦀
⚡️ Want to optimize your SQL query with AI? ⚡️
Check Aiven SQL Query Optimizer!
⚡️ Want a Fully PostgreSQL optimized Database? ⚡️
Check Aiven AI Database Optimizer!
Powered by EverSQL, it provides you index and SQL rewrite recommendations to take your database performance to the next level
The dataset
The dataset is the following:
{
"id": 778,
"shop": "Luigis Pizza",
"name": "Edward Olson",
"phoneNumbers":
["(935)503-3765x4154","(935)12345"],
"address": "Unit 9398 Box 2056\nDPO AP 24022",
"image": null,
"pizzas": [
{
"pizzaName": "Salami",
"additionalToppings": ["🥓", "🌶️"]
},
{
"pizzaName": "Margherita",
"additionalToppings": ["🍌", "🌶️", "🍍"]
}
]
}
Check out the description of the fields
The following examples use a pizza order dataset with an order having:id
: 778shop
: “Luigis Pizza”name
: “Edward Olson”phoneNumbers
:["(935)503-3765x4154","(935)12345"]address
: “Unit 9398 Box 2056\nDPO AP 24022”image
: null- and two pizzas contained in the
pizzas
item:
[
{
"pizzaName": "Salami",
"additionalToppings": ["🥓", "🌶️"]
},
{
"pizzaName": "Margherita",
"additionalToppings": ["🍌", "🌶️", "🍍"]
}
]
If you want to reproduce the examples, check how to recreate the dataset
It can be recreated with the following script:
create table test(id serial, json_data jsonb);
insert into test(json_data) values (
'{
"id": 778,
"shop": "Luigis Pizza",
"name": "Edward Olson",
"phoneNumbers":
["(935)503-3765x4154","(935)12345"],
"address": "Unit 9398 Box 2056\nDPO AP 24022",
"image": null,
"pizzas": [
{
"pizzaName": "Salami",
"additionalToppings": ["🥓", "🌶️"]
},
{
"pizzaName": "Margherita",
"additionalToppings": ["🍌", "🌶️", "🍍"]
}
]
}');
Create a GIN index on the JSONB column
To create a GIN index over the JSONB column:
create index json_data_gin on test using gin (json_data);
Retrieve all rows with a value
To retrieve all rows having the shop
name Luigis Pizza
, you can use the @>
operator:
select * from test where json_data @> '{"shop":"Luigis Pizza"}';
Result
id | json_data
----+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
1 | {"id": 778, "name": "Edward Olson", "shop": "Luigis Pizza", "image": null, "pizzas": [{"pizzaName": "Salami", "additionalToppings": ["🥓", "🌶️"]}, {"pizzaName": "Margherita", "additionalToppings": ["🍌", "🌶️", "🍍"]}], "address": "Unit 9398 Box 2056\nDPO AP 24022", "phoneNumbers": ["(935)503-3765x4154", "(935)12345"]}
(1 row)
Checking with EXPLAIN
:
explain analyse select * from test where json_data @> '{"shop":"Luigis Pizza"}';
Result shows the usage of the index:
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on test (cost=41.00..42.01 rows=1 width=394) (actual time=0.398..0.399 rows=1 loops=1)
Recheck Cond: (json_data @> '{"shop": "Luigis Pizza"}'::jsonb)
Heap Blocks: exact=1
-> Bitmap Index Scan on json_data_gin (cost=0.00..41.00 rows=1 width=0) (actual time=0.389..0.390 rows=1 loops=1)
Index Cond: (json_data @> '{"shop": "Luigis Pizza"}'::jsonb)
Planning Time: 0.087 ms
Execution Time: 0.470 ms
(7 rows)
But, if weyou filter for a specific subitem, like:
explain analyse select * from test where json_data -> 'pizzas' @> '{"pizzaName":"Margherita"}';
You get a sequence scan:
QUERY PLAN
---------------------------------------------------------------------------------------------------
Seq Scan on test (cost=0.00..470.13 rows=69 width=394) (actual time=2.064..2.064 rows=0 loops=1)
Filter: ((json_data -> 'pizzas'::text) @> '{"pizzaName": "Margherita"}'::jsonb)
Rows Removed by Filter: 6939
Planning Time: 0.056 ms
Execution Time: 2.085 ms
(5 rows)
If the above filter is common, you might want to create specific indexes for a single item (pizzas
in this case).
Create an index on a specific JSONB item
To create an index for a specific item like the pizzas
you can:
CREATE INDEX pizzas_gin ON test USING GIN ((json_data -> 'pizzas'));
Retrieve all rows with a specific filter
To filter for a specific item value, like pizzas
having pizzaName
equal to Margherita
you can use the @>
operator:
explain analyse select * from test where json_data -> 'pizzas' @> '{"pizzaName":"Margherita"}';
Results
id | json_data
----+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
1 | {"id": 778, "name": "Edward Olson", "shop": "Luigis Pizza", "image": null, "pizzas": [{"pizzaName": "Salami", "additionalToppings": ["🥓", "🌶️"]}, {"pizzaName": "Margherita", "additionalToppings": ["🍌", "🌶️", "🍍"]}], "address": "Unit 9398 Box 2056\nDPO AP 24022", "phoneNumbers": ["(935)503-3765x4154", "(935)12345"]}
(1 row)
Checking with EXPLAIN
:
explain analyse select * from test where json_data @> '{"pizzas": [{"pizzaName":"Margherita"}]}';
Results, the index is used:
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on test (cost=43.00..44.01 rows=1 width=394) (actual time=0.493..0.494 rows=1 loops=1)
Recheck Cond: (json_data @> '{"pizzas": [{"pizzaName": "Margherita"}]}'::jsonb)
Heap Blocks: exact=1
-> Bitmap Index Scan on json_data_gin (cost=0.00..43.00 rows=1 width=0) (actual time=0.483..0.483 rows=1 loops=1)
Index Cond: (json_data @> '{"pizzas": [{"pizzaName": "Margherita"}]}'::jsonb)
Planning Time: 0.186 ms
Execution Time: 0.523 ms
(7 rows)