# [SPARK-50515][CORE] Add read-only interface to SparkConf
### What changes were proposed in this pull request?
This PR lifts read-only APIs of `SparkConf` into a new `ReadOnlySparkConf`. `SparkContext` now exposes a new read-only API to the conf through `SparkContext.getReadOnlyConf`, which can be used by clients outside the `spark` package if they require only read-only access. The new API avoids copying the entire (potentially large) conf as in `SparkContext.getConf`. This PR also changes all appropriate call sites to use the new API.
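The pattern being applied can be sketched in miniature. The snippet below is illustrative only (the names `ReadOnlyMiniConf` and `MiniConf` are made up; Spark's actual types are `ReadOnlySparkConf` and `SparkConf`): the read-only accessors of a mutable conf are lifted into a trait, so callers that only read settings can be handed the live object directly instead of an O(n) defensive copy.

```scala
import scala.collection.concurrent.TrieMap

// Read-only view of a conf: only accessors, no mutators.
trait ReadOnlyMiniConf {
  def get(key: String, default: String): String
  def getLong(key: String, default: Long): Long
}

// The mutable conf implements the read-only trait.
class MiniConf extends ReadOnlyMiniConf {
  private val settings = TrieMap.empty[String, String]

  def set(key: String, value: String): this.type = {
    settings.put(key, value)
    this
  }

  override def get(key: String, default: String): String =
    settings.getOrElse(key, default)

  override def getLong(key: String, default: Long): Long =
    get(key, default.toString).toLong

  // Old-style accessor: copies every entry on each call,
  // paying CPU for the copy and GC for the garbage.
  def cloneConf(): MiniConf = {
    val copy = new MiniConf
    settings.foreach { case (k, v) => copy.set(k, v) }
    copy
  }

  // New-style accessor: the same object, viewed through the
  // read-only trait. O(1), no allocation.
  def readOnly: ReadOnlyMiniConf = this
}
```

A caller that previously cloned the conf just to read one value can instead do:

```scala
val conf = new MiniConf().set("spark.streaming.backpressure.initialRate", "100")
val ro: ReadOnlyMiniConf = conf.readOnly
val initialRate = ro.getLong("spark.streaming.backpressure.initialRate", 0L)
```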

### Why are the changes needed?
Cloning the entire conf adds unnecessary CPU overhead due to copying, and GC overhead due to cleanup. Both affect tail latencies on certain workloads.

### Does this PR introduce _any_ user-facing change?
It adds a new public API `SparkContext.getReadOnlyConf`.

### How was this patch tested?
It is a refactoring PR, so we rely on existing tests.

### Was this patch authored or co-authored using generative AI tooling?
No.

Closes #49100 from pmenon/read-only-confs.

Authored-by: Prashanth Menon <prashanth.menon@databricks.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
pmenon authored and cloud-fan committed Dec 14, 2024
1 parent f8de6c7 commit 2b9eb08
Showing 8 changed files with 239 additions and 192 deletions.
A representative hunk, from `DirectKafkaInputDStream`:

```diff
@@ -57,7 +57,7 @@ private[spark] class DirectKafkaInputDStream[K, V](
     ppc: PerPartitionConfig
   ) extends InputDStream[ConsumerRecord[K, V]](_ssc) with Logging with CanCommitOffsets {

-  private val initialRate = context.sparkContext.getConf.getLong(
+  private val initialRate = context.sparkContext.getReadOnlyConf.getLong(
     "spark.streaming.backpressure.initialRate", 0)

   val executorKafkaParams = {
```
