Performance Analysis of Updating Postgres JSONB and Columns

Irakli DD
3 min read · Aug 30, 2023

This article looks at storing and updating data in columns of different types, JSONB among them.

Keep in mind that these attributes and columns should be purely informational, single-use values: they must not participate in relationships with other tables, constraints, or indexes.

Let’s create a test table with a JSONB column holding seven attributes (six of which we will update later) and six regular columns of different types, and insert random values into them.

CREATE TABLE random_data_json (
    id serial PRIMARY KEY,
    json_data jsonb,
    int_col integer,
    text_col text,
    float_col float,
    date_col date,
    boolean_col boolean,
    varchar_col varchar(50)
);

DO
$$
DECLARE
    i integer;
BEGIN
    FOR i IN 1..6000000 LOOP
        INSERT INTO random_data_json (json_data, int_col, text_col, float_col, date_col, boolean_col, varchar_col)
        SELECT json_build_object(
                   'attr1', random() * 100,
                   'attr2', random() * 100,
                   'attr3', random() * 100,
                   'attr4', random() * 100,
                   'attr5', random() * 100,
                   'attr6', random() * 100,
                   'attr7', random() * 100
               ),
               random() * 100,
               md5(random()::text),
               random(),
               current_date - make_interval(days => (random() * 365)::integer),
               random() < 0.5,
               substr(md5(random()::text), 1, 50);
    END LOOP;
END
$$;
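
A side note: a single set-based INSERT is usually much faster than millions of single-row inserts in a loop. A minimal sketch of that variant (my rewrite, not benchmarked in this article):

-- Set-based population: one statement instead of a PL/pgSQL loop.
-- random() is volatile, so every generated row gets fresh values.
INSERT INTO random_data_json (json_data, int_col, text_col, float_col, date_col, boolean_col, varchar_col)
SELECT jsonb_build_object(
           'attr1', random() * 100,
           'attr2', random() * 100,
           'attr3', random() * 100,
           'attr4', random() * 100,
           'attr5', random() * 100,
           'attr6', random() * 100,
           'attr7', random() * 100
       ),
       (random() * 100)::integer,
       md5(random()::text),
       random(),
       current_date - make_interval(days => (random() * 365)::integer),
       random() < 0.5,
       substr(md5(random()::text), 1, 50)
FROM generate_series(1, 6000000);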

Create the second table, named random_data_json2, in the same way (a sketch below).
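
One way to do it (my sketch, not shown in the original): repeat the definition so the copy gets its own id sequence, then rerun the population block with the table name swapped.

-- Same structure as random_data_json, with its own id sequence
CREATE TABLE random_data_json2 (
    id serial PRIMARY KEY,
    json_data jsonb,
    int_col integer,
    text_col text,
    float_col float,
    date_col date,
    boolean_col boolean,
    varchar_col varchar(50)
);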

The easiest way to read and write JSONB attributes in Postgres is the subscript syntax (available since PostgreSQL 14):

-- get an attribute value from a JSONB column
column_name = {"attr1": 23.88174093244453, "attr2": 29.70989357350731, ...}
column_name['attr1'] = 23.88174093244453
-- replace a JSONB column attribute value
column_name = {"attr1": 23.88174093244453, "attr2": 29.70989357350731, ...}
column_name['attr2'] = to_jsonb(5)
new_column_value = {"attr1": 23.88174093244453, "attr2": 5, ...}
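
In runnable form against our test table (id = 1 is just a sample row):

-- Read a single attribute; subscripting returns jsonb
SELECT json_data['attr1'] AS attr1
FROM random_data_json
WHERE id = 1;

-- Replace a single attribute; the assigned value must be jsonb,
-- hence the to_jsonb() wrapper
UPDATE random_data_json
SET json_data['attr2'] = to_jsonb(5)
WHERE id = 1;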

Here is the JSONB update query (1):

UPDATE am_promo_testing_purpose.random_data_json a
SET json_data['attr1'] = to_jsonb(b.json_data['attr1']),
    json_data['attr2'] = to_jsonb(b.json_data['attr2']),
    json_data['attr3'] = to_jsonb(b.json_data['attr3']),
    json_data['attr4'] = to_jsonb(b.json_data['attr4']),
    json_data['attr5'] = to_jsonb(b.json_data['attr5']),
    json_data['attr6'] = to_jsonb(b.json_data['attr6'])
FROM am_promo_testing_purpose.random_data_json2 b
WHERE a.id = b.id;

(The to_jsonb() calls are effectively no-ops here, since subscripting already returns jsonb; they are kept as benchmarked.)
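
For comparison, the same six-attribute update can be written as a single jsonb merge, since the || operator overwrites matching keys with values from the right-hand operand. A sketch I have not benchmarked (it assumes attr7 should keep a’s original value, so it is stripped from b before merging):

UPDATE am_promo_testing_purpose.random_data_json a
SET json_data = a.json_data || (b.json_data - 'attr7')
FROM am_promo_testing_purpose.random_data_json2 b
WHERE a.id = b.id;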

And the column-based update query (2):

UPDATE am_promo_testing_purpose.random_data_json a
SET boolean_col = b.boolean_col,
    date_col = b.date_col,
    float_col = b.float_col,
    int_col = b.int_col,
    text_col = b.text_col,
    varchar_col = b.varchar_col
FROM am_promo_testing_purpose.random_data_json2 b
WHERE a.id = b.id;

Run from the console:

(1) 6,000,000 rows affected in 2 m 16 s 279 ms
(2) 6,000,000 rows affected in 1 m 38 s 760 ms
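
The “rows affected in …” format above looks like IDE console output. If you want to reproduce the measurement in psql instead (an assumption on my part about the setup), per-statement wall-clock times can be printed with the built-in meta-command:

-- psql meta-command: print wall-clock time for each statement
\timing on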

EXPLAIN ANALYZE of query 1 (JSONB):

Update on random_data_json  (cost=33.56..1203601.89 rows=0 width=0) (actual time=112911.158..112911.159 rows=0 loops=1)
  ->  Merge Join  (cost=33.56..1203601.89 rows=6000000 width=44) (actual time=62.869..49522.358 rows=6000000 loops=1)
        Merge Cond: (a.id = b.id)
        ->  Index Scan using random_data_json_pkey on random_data_json a  (cost=0.43..662292.66 rows=6305132 width=221) (actual time=62.795..1744.585 rows=6000001 loops=1)
        ->  Index Scan using random_data_json2_pkey on random_data_json2 b  (cost=0.43..360568.18 rows=6000000 width=221) (actual time=0.034..1842.678 rows=6000000 loops=1)
Planning Time: 0.397 ms
JIT:
  Functions: 8
  Options: Inlining true, Optimization true, Expressions true, Deforming true
  Timing: Generation 1.220 ms, Inlining 14.693 ms, Optimization 36.733 ms, Emission 27.137 ms, Total 79.783 ms
Execution Time: 112912.308 ms

EXPLAIN ANALYZE of query 2 (columns):

Update on random_data_json a  (cost=33.91..1081753.74 rows=0 width=0) (actual time=70922.676..70922.677 rows=0 loops=1)
  ->  Merge Join  (cost=33.91..1081753.74 rows=6000000 width=95) (actual time=52.548..5256.279 rows=6000000 loops=1)
        Merge Cond: (a.id = b.id)
        ->  Index Scan using random_data_json_pkey on random_data_json a  (cost=0.43..631191.61 rows=6003995 width=10) (actual time=52.513..1536.106 rows=6000001 loops=1)
        ->  Index Scan using random_data_json2_pkey on random_data_json2 b  (cost=0.43..360568.18 rows=6000000 width=93) (actual time=0.028..2077.242 rows=6000000 loops=1)
Planning Time: 0.259 ms
JIT:
  Functions: 8
  Options: Inlining true, Optimization true, Expressions true, Deforming true
  Timing: Generation 1.015 ms, Inlining 8.876 ms, Optimization 35.409 ms, Emission 23.916 ms, Total 69.216 ms
Execution Time: 70923.678 ms


Conclusion

I don’t think any definitive conclusion needs to be drawn here, as the outcome will vary with your workload. In this test the column-based update was noticeably faster (1 m 39 s vs 2 m 16 s), which is plausible: the JSONB update has to read and rebuild the whole document for every row, which also shows up as the wider rows in the plans (width=221 vs width=10 on the driving index scan). However, from my perspective, it’s unlikely that you’ll have to update more than one million records at a time, so you can confidently explore the option of employing the JSON method.
