Skip to content

Allow customizing serialization for feature generator extract functions#352

Merged
tovbinm merged 4 commits intomasterfrom
mt/serialize-parametrized-extract
Jul 6, 2019
Merged

Allow customizing serialization for feature generator extract functions#352
tovbinm merged 4 commits intomasterfrom
mt/serialize-parametrized-extract

Conversation

@tovbinm
Copy link
Copy Markdown
Collaborator

@tovbinm tovbinm commented Jul 1, 2019

Related issues
It's not possible to save a model with a feature generator stage that has an extract function with ctor arguments.

Describe the proposed solution
Allow customizing read/write for extract functions, e.g.

// extract function IntMultiplyExtractor has an argument 'multiplier' passed during training time
val multiplier = 10
val multiplied = FeatureBuilder.Integral[Int].extract(new IntMultiplyExtractor(multiplier)).asPredictor

// one can now specify a read/write implementation to serialize for the extractor instance
@ReaderWriter(classOf[IntMultiplyExtractorReadWrite])
class IntMultiplyExtractor(val multiplier: Int) extends Function1[Int, Integral] with Serializable {
  def apply(i: Int): Integral = (i * multiplier).toIntegral
}
class IntMultiplyExtractorReadWrite extends ValueReaderWriter[IntMultiplyExtractor] {
  implicit val formats = DefaultFormats
  def read(valueClass: Class[IntMultiplyExtractor], json: JValue): Try[IntMultiplyExtractor] = Try {
    new IntMultiplyExtractor((json \ "multiplier").extract[Int])
  }
  def write(value: IntMultiplyExtractor): Try[JValue] = Try {
    "multiplier" -> JInt(value.multiplier)
  }
}

Describe alternatives you've considered
NA

@tovbinm tovbinm requested a review from leahmcguire as a code owner July 1, 2019 23:14
@tovbinm tovbinm changed the title Allow customizing serialization for FeatureGenerator extract function Allow customizing serialization for FeatureGenerator extract functions Jul 1, 2019
@tovbinm tovbinm changed the title Allow customizing serialization for FeatureGenerator extract functions Allow customizing serialization for feature generator extract functions Jul 1, 2019
@codecov
Copy link
Copy Markdown

codecov bot commented Jul 1, 2019

Codecov Report

Merging #352 into master will increase coverage by 0.05%.
The diff coverage is 94.33%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #352      +/-   ##
==========================================
+ Coverage   86.68%   86.73%   +0.05%     
==========================================
  Files         335      336       +1     
  Lines       10781    10799      +18     
  Branches      347      346       -1     
==========================================
+ Hits         9345     9367      +22     
+ Misses       1436     1432       -4
Impacted Files Coverage Δ
...op/stages/DefaultOpPipelineStageReaderWriter.scala 69.23% <ø> (-0.77%) ⬇️
...m/salesforce/op/stages/FeatureGeneratorStage.scala 95.08% <100%> (+5.08%) ⬆️
...alesforce/op/stages/DefaultValueReaderWriter.scala 100% <100%> (ø)
...sforce/op/stages/OpPipelineStageReaderWriter.scala 85.71% <76.92%> (+1.09%) ⬆️
...la/com/salesforce/op/features/FeatureBuilder.scala 35.66% <0%> (+1.39%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update e6cc43f...31fa50d. Read the comment docs.

Copy link
Copy Markdown
Contributor

@michaelweilsalesforce michaelweilsalesforce left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tested this internally. Seems to work as expected.

@tovbinm tovbinm merged commit c25f4eb into master Jul 6, 2019
@tovbinm tovbinm deleted the mt/serialize-parametrized-extract branch July 6, 2019 00:20
This was referenced Jul 10, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants