Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adds JSON_TABLE() support, and SQL/JSON constructor/query functions tests #7816

Merged
merged 3 commits into from
Dec 30, 2024

Conversation

naisila
Copy link
Member

@naisila naisila commented Dec 27, 2024

DESCRIPTION: Adds JSON_TABLE() support

PG17 has added basic JSON_TABLE() functionality
JSON_TABLE() allows JSON data to be converted into a relational view and thus used, for example, in a FROM clause, like other tabular data.

We treat JSON_TABLE the same as correlated functions (e.g., recurring tuples). In the end, for multi-shard JSON_TABLE commands, we apply the same restrictions as reference tables (e.g., cannot perform a lateral outer join when a distributed subquery references a (reference table)/(json table) etc.)

Relevant PG17 commits:
basic JSON table, nested paths in json table

Onder had previously added json table support for PG15BETA1, but we reverted that commit because json table was reverted in PG15. ce7f1a5
Previous relevant PG15Beta1 commit: postgres/postgres@4e34747c8
Therefore, I referred to Onder's commit for this commit as well, with a few changes due to some differences between PG15/PG17:

  1. In PG15Beta1, we had also PLAN clauses for JSON_TABLE postgres/postgres@fadb48b00, and Onder's commit includes tests for those as well. However, PLAN nodes are not added in PG17. Therefore, I didn't include the json_table_select_only test, which had mostly queries involving PLAN. I only included the last query from that test.

  2. In PG15 timeline (Citus 11.1), we didn't support outer joins where the outer rel is a recurring one and the inner one is a non-recurring one. However, Onur added support for that one in Citus 11.2, therefore I updated the tests from Onder's commit accordingly.

  3. PG17 json table has nested paths and columns, therefore I added a test
    with a distributed table, which is exactly the same as the one in
    sqljson_jsontable in PG17. postgres/postgres@bb766cde6

This pull request also adds some basic tests on validation of SQL/JSON constructor functions JSON(), JSON_SCALAR(), and JSON_SERIALIZE(), and also SQL/JSON query functions JSON_EXISTS(), JSON_QUERY(), and JSON_VALUE(). The relevant PG commits are the following:
JSON(), JSON_SCALAR(), JSON_SERIALIZE()
JSON_EXISTS(), JSON_VALUE(), JSON_QUERY()

Copy link

codecov bot commented Dec 27, 2024

Codecov Report

Attention: Patch coverage is 90.47619% with 2 lines in your changes missing coverage. Please review.

Please upload report for BASE (release-13.0@8a8b2f9). Learn more about missing BASE report.

Additional details and impacted files
@@               Coverage Diff               @@
##             release-13.0    #7816   +/-   ##
===============================================
  Coverage                ?   89.63%           
===============================================
  Files                   ?      274           
  Lines                   ?    59749           
  Branches                ?     7452           
===============================================
  Hits                    ?    53557           
  Misses                  ?     4058           
  Partials                ?     2134           

@naisila naisila force-pushed the naisila/json_table branch 3 times, most recently from 825abff to e169a91 Compare December 27, 2024 20:53
@naisila naisila changed the title Adds JSON_TABLE() support Adds JSON_TABLE() support, and SQL/JSON constructor/query functions tests Dec 27, 2024
Comment on lines +284 to +302
-- JSON_TABLE can be on the outer side of the join
-- We support outer joins where the outer rel is a recurring one
-- and the inner one is a non-recurring one if we don't reference the outer from the inner
-- https://github.com/citusdata/citus/pull/6512
SELECT *
FROM json_table('[{"a":10,"b":20},{"a":30,"b":40}]'::JSONB, '$[*]'
COLUMNS (id FOR ORDINALITY, column_a int4 PATH '$.a', column_b int4 PATH '$.b', a int4, b int4, c text))
LEFT JOIN LATERAL
(SELECT *
FROM my_films) AS foo on(foo.id = a);
DEBUG: recursively planning right side of the left join since the outer side is a recurring rel
DEBUG: recursively planning the distributed subquery since it is part of a distributed join node that is outer joined with a recurring rel
DEBUG: generating subplan XXX_1 for subquery SELECT id, js FROM pg17_json.my_films
DEBUG: Plan XXX query after replacing subqueries and CTEs: SELECT "json_table".id, "json_table".column_a, "json_table".column_b, "json_table".a, "json_table".b, "json_table".c, foo.id, foo.js FROM (JSON_TABLE('[{"a": 10, "b": 20}, {"a": 30, "b": 40}]'::jsonb, '$[*]' AS json_table_path_0 COLUMNS (id FOR ORDINALITY, column_a integer PATH '$."a"', column_b integer PATH '$."b"', a integer PATH '$."a"', b integer PATH '$."b"', c text PATH '$."c"')) LEFT JOIN LATERAL (SELECT intermediate_result.id, intermediate_result.js FROM read_intermediate_result('XXX_1'::text, 'binary'::citus_copy_format) intermediate_result(id bigint, js jsonb)) foo ON ((foo.id OPERATOR(pg_catalog.=) "json_table".a)))
id | column_a | column_b | a | b | c | id | js
---------------------------------------------------------------------
1 | 10 | 20 | 10 | 20 | | |
2 | 30 | 40 | 30 | 40 | | |
(2 rows)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thank for adjusting tests for #6512

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for adding support for outer joins where the outer rel is a recurring one and the inner one is a non-recurring one if we don't reference the outer from the inner :D

@onurctirtir onurctirtir self-requested a review December 30, 2024 10:09
@@ -1204,7 +1213,8 @@ DeferErrorIfUnsupportedTableCombination(Query *queryTree)
*/
if (rangeTableEntry->rtekind == RTE_RELATION ||
rangeTableEntry->rtekind == RTE_SUBQUERY ||
rangeTableEntry->rtekind == RTE_RESULT)
rangeTableEntry->rtekind == RTE_RESULT ||
IsJsonTableRTE(rangeTableEntry))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because as of today we don't have any conditions that could prevent us from pushing down a json table expr (e.g., we cannot pushdown a VALUES expr when it contains mutable function call as we check below), we treat json table exprs as if they're simple set returning things, like relations and subqueries.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(like a self-note for me to remember the reasoning here if I again need to look into this in the future)

Copy link
Member

@onurctirtir onurctirtir left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks very good to me!

PG17 has added basic JSON_TABLE() functionality
JSON_TABLE() allows JSON data to be converted into a relational view
and thus used, for example, in a FROM clause, like other tabular
data.

We treat JSON_TABLE the same as correlated functions (e.g., recurring tuples).
In the end, for multi-shard JSON_TABLE commands, we apply the same
restrictions as reference tables (e.g., cannot perform a lateral outer join
when a distributed subquery references a (reference table)/JSON_TABLE etc.)

Relevant PG commit:
postgres/postgres@de3600452

Onder had previously added json table support for PG15BETA1,
but we reverted that commit because json table was reverted in PG15.
ce7f1a5
Therefore, I referred to that commit for this commit as well,
with a few changes due to some differences between PG15/PG17:

1) In PG15Beta1, we had also PLAN clauses for JSON_TABLE, and Onder's commit
includes tests for those as well. However, PLAN nodes are not added in PG17.
Therefore I didn't include the json_table_select_only test, which had mostly
queries involving PLAN.

2) In PG15 timeline (Citus 11.1), we didn't support outer joins where the
outer rel is a recurring one and the inner one is a non-recurring one.
However, Onur added support for that one in Citus 11.2, therefore I updated
the tests from Onder's commit accordingly.

3) PG17 json table has nested paths and columns, therefore I added a test
with a distributed table, which is exactly the same as the one in
sqljson_jsontable in PG17. postgres/postgres@bb766cde6
@naisila naisila merged commit 9e3c329 into release-13.0 Dec 30, 2024
151 of 159 checks passed
@naisila naisila deleted the naisila/json_table branch December 30, 2024 16:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants