Threshold metrics calculation fix when unseen labels are present#293
Merged
Conversation
leahmcguire
approved these changes
Apr 19, 2019
gerashegalov
approved these changes
Apr 19, 2019
Contributor
gerashegalov
left a comment
LGTM, minor comment
```diff
  val label: Label = scoresAndLabels._2.toInt
- val trueClassScore: Double = scores(label)
+ // The label may be unseen during model training, so treat scores for unseen classes as all being zero
+ val trueClassScore: Double = if (scores.isDefinedAt(label)) scores(label) else 0.0
```
Contributor
How about `scores.lift(label).getOrElse(0.0)`?
Contributor
Author
Ahh, I was trying to remember how to do that!
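The two forms discussed here are equivalent: `Seq.lift` turns positional indexing into a total function returning `Option`, so `lift(label).getOrElse(0.0)` collapses the explicit `isDefinedAt` check into one expression. A minimal sketch with hypothetical score values:

```scala
// Hypothetical per-class scores from a model trained on 3 classes
val scores: Seq[Double] = Seq(0.7, 0.2, 0.1)

// Explicit bounds check, as in the original fix
def scoreExplicit(scores: Seq[Double], label: Int): Double =
  if (scores.isDefinedAt(label)) scores(label) else 0.0

// Reviewer's suggestion: lift indexing into an Option and default to 0.0
def scoreLifted(scores: Seq[Double], label: Int): Double =
  scores.lift(label).getOrElse(0.0)
```

Both return the score for an in-range label and `0.0` for a label index the model never produced.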
Related issues
Issue exposed by fixes in #263
Describe the proposed solution
Simple one-line fix to the threshold metrics calculation. When the true label score is looked up, the label will either be among the classes the model was trained on (which may have been pruned by DataCutter), in which case its score is used, or it will be a label the model was not trained on, in which case the score is treated as 0.
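The behavior above can be sketched over a small batch of (score vector, label) pairs; the values and the pruned label are hypothetical, for illustration only:

```scala
// Hypothetical (per-class scores, true label) pairs; label 3 was pruned by
// DataCutter, so the model only emits scores for classes 0..2
val scoresAndLabels: Seq[(Seq[Double], Int)] = Seq(
  (Seq(0.6, 0.3, 0.1), 0),
  (Seq(0.2, 0.5, 0.3), 1),
  (Seq(0.1, 0.2, 0.7), 3) // unseen label: no score at index 3
)

// The one-line fix: fall back to 0.0 when the label has no score entry
val trueClassScores: Seq[Double] = scoresAndLabels.map { case (scores, label) =>
  scores.lift(label).getOrElse(0.0)
}
```

Rows with seen labels keep their actual score; the row with the unseen label contributes 0.0 instead of throwing `IndexOutOfBoundsException`.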
Describe alternatives you've considered
n/a
Additional context
Merge #263 first since this PR was branched off of it.
Unit test resides in `MultiClassificationModelSelectorTest`.