-
Notifications
You must be signed in to change notification settings - Fork 684
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Expand functions while resolving dependencies #5744
Conversation
1db9b25
to
b5e1eb9
Compare
b5e1eb9
to
6535298
Compare
Codecov Report
@@ Coverage Diff @@
## master #5744 +/- ##
=======================================
Coverage 92.56% 92.56%
=======================================
Files 223 223
Lines 47315 47363 +48
=======================================
+ Hits 43797 43843 +46
- Misses 3518 3520 +2 |
*/ | ||
if (deprec->classid == AttrDefaultRelationId && | ||
deprec->objsubid == 0 && | ||
deprec->refobjsubid != 0 && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what is the meaning of deprec->objsubid == 0 && deprec->refobjsubid != 0
here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
objsubid is nonzero only for table columns and zero for anything else. Since we are trying to find a dependency from the column of a table to function we've added such a check. We may omit deprec->objsubid == 0
part , but I'm more inclined to keep it to not to get any unintentional dependency.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you maybe add this to code? Because this is something we'd forget :)
bool IsColumnDependency = deprec->refobjsubid != 0
6535298
to
9d5ea21
Compare
/* | ||
* Add columns with constraint check | ||
*/ | ||
if (deprec->classid == ConstraintRelationId && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When I check pg_catalog, I see that "table constraint"
, "default value"
, "rule"
, "index"
can be possible when refobjsubid != 0
.
Do we need anything for the latter two?
select pg_identify_object_as_address(classid, objid, objsubid), pg_identify_object_as_address(refclassid, refobjid, refobjsubid) from pg_depend where refobjsubid != 0 ORDER BY objid DESC;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like we could add statistics here?
-- create statistics based on a valid function
BEGIN;
CREATE OR REPLACE FUNCTION non_sense_func_for_index()
RETURNS int
LANGUAGE plpgsql IMMUTABLE AS
$$
BEGIN
return 1;
END;
$$;
CREATE TABLE t3 (
a int,
b int
);
-- build ndistinct statistics on the pair of expressions (per-expression
-- statistics are built automatically)
CREATE STATISTICS s3 ON (a > non_sense_func_for_index()) FROM t3;
SELECT create_distributed_table('t3', 'a');
ERROR: function public.non_sense_func_for_index() does not exist
HINT: No function matches the given name and argument types. You might need to add explicit type casts.
CONTEXT: while executing command on localhost:9700
Where we seem to have a dependency:
│ pg_identify_object_as_address │ ("statistics object","{public,s3}",{}) │
│ pg_identify_object_as_address │ ("table column","{public,t3,a}",{}) │
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similarly GENERATED ALWAYS:
-- generated always
BEGIN;
CREATE OR REPLACE FUNCTION non_sense_func_for_generated_always()
RETURNS int
LANGUAGE plpgsql IMMUTABLE AS
$$
BEGIN
return 1;
END;
$$;
CREATE TABLE people (
id int,
height_cm numeric,
height_in numeric GENERATED ALWAYS AS (height_cm / non_sense_func_for_generated_always()) STORED
);
SELECT create_distributed_table('people', 'id');
DETAIL: on server onderkalaci@localhost:9701 connectionId: 8
ERROR: function public.non_sense_func_for_generated_always() does not exist
HINT: No function matches the given name and argument types. You might need to add explicit type casts.
CONTEXT: while executing command on localhost:9700
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Logic for finding dependent functions has been updated and tests above has been added.
9d5ea21
to
5ac0d4c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you maybe go over the CREATE TABLE syntax once more and check if we need any more objects that might depend on a function?
My notes are the following, which you might double check:
- Partition by a function such as use something other than
lower
CREATE TABLE measurement ( city_id int not null, logdate date not null, peaktemp int, unitsales int ) PARTITION BY RANGE (lower(peaktemp::text));
- function on primary key:
BEGIN;
CREATE TABLE TABLE (
column_1 data_type,
column_2 data_type,
…
PRIMARY KEY (column_1, column_2 * func())
);
- Does this work fine with Citus local tables and/or foreign tables?
- Exclude constraints (seems to work fine)
- Indexes depending on functions (seems to work fine, would be good to have a test)
- Can we have functions as part of foreign keys? (e.g., ALTER TABLE table ADD CONSTRAINT fkey FOREIGN KEY (measureid) REFERENCES colocated_partitioned_table(func(measureid));
- Can we have unique constraints with function? Probably works fine, good to manaully test:
CREATE TABLE products (
product_no integer,
name text,
price numeric,
UNIQUE (func(product_no))
);
/* | ||
* Add columns with constraint check | ||
*/ | ||
if (deprec->classid == ConstraintRelationId && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similarly GENERATED ALWAYS:
-- generated always
BEGIN;
CREATE OR REPLACE FUNCTION non_sense_func_for_generated_always()
RETURNS int
LANGUAGE plpgsql IMMUTABLE AS
$$
BEGIN
return 1;
END;
$$;
CREATE TABLE people (
id int,
height_cm numeric,
height_in numeric GENERATED ALWAYS AS (height_cm / non_sense_func_for_generated_always()) STORED
);
SELECT create_distributed_table('people', 'id');
DETAIL: on server onderkalaci@localhost:9701 connectionId: 8
ERROR: function public.non_sense_func_for_generated_always() does not exist
HINT: No function matches the given name and argument types. You might need to add explicit type casts.
CONTEXT: while executing command on localhost:9700
63f44d5
to
201d6a8
Compare
7bf3629
to
5ec5e6a
Compare
INSERT INTO table_1_for_rule VALUES(7,7); | ||
CREATE TABLE table_2_for_rule(id int, col_1 int); | ||
INSERT INTO table_2_for_rule VALUES(7,7); | ||
CREATE RULE rule_1 AS ON UPDATE TO table_1_for_rule DO ALSO UPDATE table_2_for_rule SET col_1 = col_1 * func_for_rule(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess with MX the rule should also exists on the workers (which does not happen now). Not related to here, just noting which is already listed here: #4812
It is good that we are prepared from the function propagation front
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems rules show up as Normal dependency. Lets fix that and improve the test here:
SELECT objid, pg_identify_object_as_address(refclassid, refobjid, refobjsubid),deptype from pg_depend where objid = 17001;
┌───────┬───────────────────────────────────────────────────────┬─────────┐
│ objid │ pg_identify_object_as_address │ deptype │
├───────┼───────────────────────────────────────────────────────┼─────────┤
│ 17001 │ (table,"{public,table_1_for_rule}",{}) │ a │
│ 17001 │ ("table column","{public,table_2_for_rule,col_1}",{}) │ n │
│ 17001 │ (table,"{public,table_1_for_rule}",{}) │ n │
│ 17001 │ (function,"{public,func_for_rule}",{}) │ n │
└───────┴───────────────────────────────────────────────────────┴─────────┘
(4 rows)
Time: 2.108 ms
--------------------------------------------------------------------- | ||
(0 rows) | ||
|
||
CREATE TABLE table_to_check_func_index_dep (id int, col_2 int); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mostly sharing for reference.
As far as I can tell, this test already works on the master branch via the text search configuration changes.
And, note that the dependency type is Normal.
SELECT pg_identify_object_as_address(classid, objid, objsubid),deptype from pg_depend where refobjid = 'func_for_index_predicate'::regproc;
┌───────────────────────────────────────────────────────────────┬─────────┐
│ pg_identify_object_as_address │ deptype │
├───────────────────────────────────────────────────────────────┼─────────┤
│ (index,"{public,table_to_check_func_index_dep_col_2_idx}",{}) │ n │
└───────────────────────────────────────────────────────────────┴─────────┘
(1 row)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe one more test idea, other than this, looks pretty good:
BEGIN;
CREATE OR REPLACE FUNCTION func_in_transaction_def_with_seq(val bigint)
RETURNS int
LANGUAGE SQL AS
$$
SELECT 2 * val;
$$;
CREATE OR REPLACE FUNCTION func_in_transaction_def_with_func(val bigint)
RETURNS int
LANGUAGE SQL AS
$$
SELECT func_in_transaction_def_with_seq(val);
$$;
-- Function shouldn't be propagated within transaction
SELECT pg_identify_object_as_address(classid, objid, objsubid) from citus.pg_dist_object where objid = 'func_in_transaction_def_with_seq'::regproc::oid;
CREATE SEQUENCE myseq;
CREATE TABLE table_to_prop_func(id int, col_1 int default func_in_transaction_def_with_func(func_in_transaction_def_with_seq(nextval('myseq'))));
SELECT create_distributed_table('table_to_prop_func','id');
-- Function should be marked as distributed after distributing the table that depends on it
SELECT pg_identify_object_as_address(classid, objid, objsubid) from citus.pg_dist_object where objid = 'func_in_transaction_def_with_seq'::regproc::oid;
COMMIT;
* function we've added deprec->refobjsubid != 0 check. | ||
*/ | ||
if (deprec->refobjsubid != 0 && | ||
deprec->deptype == DEPENDENCY_AUTO) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for reference, we discussed that we can also follow DEPENDENCY_NORMAL
here. DEPENDENCY_AUTO will follow all the things that are dependent via column. Still, there seems no harm following DEPENDENCY_NORMAL just in case there are some complex SQL that we don't know
INSERT INTO table_1_for_rule VALUES(7,7); | ||
CREATE TABLE table_2_for_rule(id int, col_1 int); | ||
INSERT INTO table_2_for_rule VALUES(7,7); | ||
CREATE RULE rule_1 AS ON UPDATE TO table_1_for_rule DO ALSO UPDATE table_2_for_rule SET col_1 = col_1 * func_for_rule(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems rules show up as Normal dependency. Lets fix that and improve the test here:
SELECT objid, pg_identify_object_as_address(refclassid, refobjid, refobjsubid),deptype from pg_depend where objid = 17001;
┌───────┬───────────────────────────────────────────────────────┬─────────┐
│ objid │ pg_identify_object_as_address │ deptype │
├───────┼───────────────────────────────────────────────────────┼─────────┤
│ 17001 │ (table,"{public,table_1_for_rule}",{}) │ a │
│ 17001 │ ("table column","{public,table_2_for_rule,col_1}",{}) │ n │
│ 17001 │ (table,"{public,table_1_for_rule}",{}) │ n │
│ 17001 │ (function,"{public,func_for_rule}",{}) │ n │
└───────┴───────────────────────────────────────────────────────┴─────────┘
(4 rows)
Time: 2.108 ms
04b92aa
to
d484744
Compare
d484744
to
f17872a
Compare
Expand functions while resolving dependencies.
In order to handle function dependencies with relations, we need to expand relation graph to include functions on relation nodes as there is no direct dependency between relation and functions. We follow a similar logic to sequences.