Avoid having to have an implicit SparkSession in OpWorkFlow(Model)#468
Avoid having to have an implicit SparkSession in OpWorkFlow(Model)#468
Conversation
… moving job grouping down to reader/writer
Codecov Report
@@ Coverage Diff @@
## master #468 +/- ##
==========================================
+ Coverage 81.25% 86.99% +5.74%
==========================================
Files 345 345
Lines 11616 11614 -2
Branches 376 374 -2
==========================================
+ Hits 9438 10104 +666
+ Misses 2178 1510 -668
Continue to review full report at Codecov.
|
| case Failure(error) => throw new RuntimeException(s"Failed to load Workflow from path '$path'", error) | ||
| case Success(wf) => wf | ||
| implicit val spark: SparkSession = this.sparkSession | ||
| JobGroupUtil.withJobGroup(OpStep.ModelIO) { |
There was a problem hiding this comment.
nit: we could just pass sparkSession explicitly instead of creating an implicit val.
| * @param path path to save the model and its stages | ||
| * @param overwrite should overwrite the destination | ||
| */ | ||
| def save(model: OpWorkflowModel, path: String, overwrite: Boolean = true): Unit = { |
There was a problem hiding this comment.
the scope of this PR did not require it, we should not do breaking changes such as removing public methods.
There was a problem hiding this comment.
Oh, I missed that one! good catch @gerashegalov
There was a problem hiding this comment.
@nicodv please revert this part of the change.
There was a problem hiding this comment.
@gerashegalov / @tovbinm So we are OK with adding an implicit SparkSession to OpWorkflowModel.save() instead? (And having inconsistent approaches between model loading/saving.)
There was a problem hiding this comment.
@nicodv this is not the choice we have to make. We can make it consistent by just changing saveImpl in class OpWorkflowModelWriter
override protected def saveImpl(path: String): Unit = {
JobGroupUtil.withJobGroup(OpStep.ModelIO) {
sc.parallelize(Seq(toJsonString(path)), 1)
.saveAsTextFile(OpWorkflowModelReadWriteShared.jsonPath(path), classOf[GzipCodec])
}(sparkSession)
}If you do this and undo the save method move we should be ok
There was a problem hiding this comment.
Ah, I overlooked the saveImpl method. New PR that reverts this: #470
Related issues
N/A
Describe the proposed solution
Avoid having to provide a
SparkSessionimplicit toOpWorkflow.loadModel/OpWorkflowModel.saveto keep backward compatibility.Describe alternatives you've considered
N/A
Additional context
Small follow-up on #467