[SPARK-48033][SQL] Fix RuntimeReplaceable expressions being used in default columns

### What changes were proposed in this pull request?

Currently, columns whose default value is a `RuntimeReplaceable` expression fail.

This is because the `AlterTableCommand` constant-folds the default expression before replacing `RuntimeReplaceable` expressions with their actual implementation. For example:
```
sql(s"CREATE TABLE t(v VARIANT DEFAULT parse_json('1')) USING PARQUET")
sql("INSERT INTO t VALUES(DEFAULT)")
```
fails because `parse_json` is `RuntimeReplaceable` and is evaluated before the analyzer inserts the correct expression into the plan.

To fix this, we run the `ReplaceExpressions` rule before `ConstantFolding`.
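
The ordering problem can be illustrated with a small toy model. This is only a sketch and does not use Spark's Catalyst classes: `Expr`, `Lit`, `Concrete`, `Replaceable`, `replaceExpressions`, and `constantFold` are hypothetical stand-ins, and the model assumes that a replaceable expression refuses to evaluate until it has been swapped for its replacement.

```scala
import scala.util.Try

// Toy model only: every name below is hypothetical, not a Spark API.
object RuleOrderSketch {
  sealed trait Expr { def eval(): String }

  // A plain constant.
  final case class Lit(value: String) extends Expr {
    def eval(): String = value
  }

  // Stand-in for the concrete implementation that can actually be evaluated.
  final case class Concrete(child: Expr) extends Expr {
    def eval(): String = child.eval().trim
  }

  // Stand-in for a RuntimeReplaceable-style expression (assumption: it cannot
  // be evaluated directly and must first be swapped for `replacement`).
  final case class Replaceable(child: Expr) extends Expr {
    def replacement: Expr = Concrete(child)
    def eval(): String =
      throw new UnsupportedOperationException("cannot evaluate before replacement")
  }

  // Analogue of a replace-expressions pass: swap each Replaceable node
  // for its replacement, recursing into children.
  def replaceExpressions(e: Expr): Expr = e match {
    case r: Replaceable => replaceExpressions(r.replacement)
    case Concrete(c)    => Concrete(replaceExpressions(c))
    case l: Lit         => l
  }

  // Analogue of a constant-folding pass: evaluate the tree into a literal.
  def constantFold(e: Expr): Expr = Lit(e.eval())

  def main(args: Array[String]): Unit = {
    val default: Expr = Replaceable(Lit(" 1 "))
    // Folding first fails, because the replaceable node cannot be evaluated.
    println(Try(constantFold(default)).isFailure)
    // Replacing first and then folding succeeds.
    println(constantFold(replaceExpressions(default)))
  }
}
```

In this model, folding the raw tree throws, while folding the replaced tree yields a literal, which mirrors the reordering described above.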

### Why are the changes needed?

This allows default columns to use expressions that are `RuntimeReplaceable`.

This is especially important for Variant types because literal variants are difficult to create, so `parse_json` will likely be used the majority of the time.

### Does this PR introduce _any_ user-facing change?

no

### How was this patch tested?

Added a unit test.

### Was this patch authored or co-authored using generative AI tooling?

no

Closes #46269 from richardc-db/fix_default_cols_runtime_replaceable.

Authored-by: Richard Chen <r.chen@databricks.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
richardc-db authored and cloud-fan committed Apr 30, 2024
1 parent fe05eb8 commit da92293
Showing 2 changed files with 10 additions and 2 deletions.
@@ -28,7 +28,7 @@ import org.apache.spark.sql.catalyst.analysis._
 import org.apache.spark.sql.catalyst.catalog.{CatalogDatabase, InMemoryCatalog, SessionCatalog}
 import org.apache.spark.sql.catalyst.expressions._
 import org.apache.spark.sql.catalyst.expressions.{Literal => ExprLiteral}
-import org.apache.spark.sql.catalyst.optimizer.ConstantFolding
+import org.apache.spark.sql.catalyst.optimizer.{ConstantFolding, ReplaceExpressions}
 import org.apache.spark.sql.catalyst.parser.{CatalystSqlParser, ParseException}
 import org.apache.spark.sql.catalyst.plans.logical._
 import org.apache.spark.sql.catalyst.trees.TreePattern.PLAN_EXPRESSION
@@ -289,7 +289,7 @@ object ResolveDefaultColumns extends QueryErrorsBase
       val analyzer: Analyzer = DefaultColumnAnalyzer
       val analyzed = analyzer.execute(Project(Seq(Alias(parsed, colName)()), OneRowRelation()))
       analyzer.checkAnalysis(analyzed)
-      ConstantFolding(analyzed)
+      ConstantFolding(ReplaceExpressions(analyzed))
     } catch {
       case ex: AnalysisException =>
         throw QueryCompilationErrors.defaultValuesUnresolvedExprError(

@@ -279,4 +279,12 @@ class ResolveDefaultColumnsSuite extends QueryTest with SharedSparkSession {
       checkAnswer(sql("select CAST(c as STRING) from t"), Row("2018-11-17 13:33:33"))
     }
   }
+
+  test("SPARK-48033: default columns using runtime replaceable expression works") {
+    withTable("t") {
+      sql("CREATE TABLE t(v VARIANT DEFAULT parse_json('1')) USING PARQUET")
+      sql("INSERT INTO t VALUES(DEFAULT)")
+      checkAnswer(sql("select v from t"), sql("select parse_json('1')").collect())
+    }
+  }
 }
