Mismatches between ScoreCardPoints object and calibrate_to_master_scale scores #68
Comments
Hey, thanks for reporting. I happen to know the maintainer of this project is on paternity leave :) @sbjelogr Perhaps you can have a look?
Sure! I think you just need to add a negative sign somewhere in the equation. I followed the code, added a negative sign when multiplying the WoE by the coefficient, and was able to get the same results in the example. I'm not really familiar with OOP yet, so I just created a custom function.
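To illustrate the negative-sign point, here is a minimal sketch in plain Python of why a points table may need a minus sign to agree with a score derived from the predicted probability. It is not skorecard's implementation. It assumes the usual scorecard scaling (factor = pdo / ln 2, offset = ref_score - factor * ln(ref_odds)) and assumes the odds are defined as good : bad; the intercept, coefficients, and WoE values are made up for illustration.

```python
import math

# Scaling parameters, mirroring the values used in this thread.
pdo, ref_score, ref_odds = 25, 400, 20

factor = pdo / math.log(2)
offset = ref_score - factor * math.log(ref_odds)

# Hypothetical model on WoE features:
# logit(P(default)) = intercept + sum(coef * woe)
intercept = -1.2
coefs = {"EDUCATION": 0.9, "MARRIAGE": 0.7}
woes = {"EDUCATION": -0.35, "MARRIAGE": 0.10}  # WoE of the buckets one row falls into

logit_bad = intercept + sum(coefs[f] * woes[f] for f in coefs)
p_bad = 1 / (1 + math.exp(-logit_bad))

# Route 1: scale the predicted probability (assumption: odds = good : bad).
score_from_proba = offset + factor * math.log((1 - p_bad) / p_bad)

# Route 2: add up points per feature.
# ln(good / bad) equals -logit(P(default)), hence the minus sign.
points = {f: -factor * coefs[f] * woes[f] for f in coefs}
base_points = offset - factor * intercept
score_from_points = base_points + sum(points.values())

# The same table built without the minus sign, for comparison.
score_without_sign = offset + factor * logit_bad

print(score_from_proba, score_from_points, score_without_sign)
```

Under these assumptions the two routes agree only when the minus sign is present. If the odds were defined as bad : good instead, the sign would flip the other way, so the convention has to match on both sides.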
Thanks for this issue. I think the ScoreCardPoints is actually quite broken and I propose to remove it. Looking at a minimal example, we see that the
@sbjelogr @timvink It seems the
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from skorecard import datasets
from skorecard import Skorecard
from skorecard.bucketers import OrdinalCategoricalBucketer
from skorecard.rescale import calibrate_to_master_scale, ScoreCardPoints
X, y = datasets.load_uci_credit_card(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X[["EDUCATION", "MARRIAGE"]], y)
o = OrdinalCategoricalBucketer(variables=["EDUCATION"])
sc = Skorecard(
    bucketing=o,
    variables=["EDUCATION"],
    calculate_stats=True,
)
sc.fit(X_train, y_train)
scp = ScoreCardPoints(skorecard_model=sc, pdo=25, ref_score=400, ref_odds=20)
sc.bucket_table("EDUCATION")
woe_enc = scp.skorecard_model.pipeline_.named_steps["encoder"]
woe_dict = woe_enc.mapping
woe_dict['EDUCATION']
See that the WoE for -1 and 2 is bad.
@orchardbirds, they are not exactly the same.
In addition, it provides an extra tabular representation of the points per feature per bucket.
@idellang, I will be investigating this issue in the coming days. I'll keep you posted.
Please excuse the way that I reported this issue. This is my first time reporting a GitHub issue. I get different results from the ScoreCardPoints object: the scores from calibrate_to_master_scale on proba_train differ from the scores from scp.transform(X_train). I believe the calibrate_to_master_scale scores are right.
EDIT: I tried following the last tutorial example, 'Scorecard Model', and I encountered the same problem. Going through the example, I noticed that in the tutorial the coefficients from scorecard.get_stats() are negative and the scorecard.woe_transform(X_test) values are positive, whereas I get positive coefficients and negative scorecard.woe_transform(X_test) values.
Check the following images. In this example, I used a single categorical variable educational attainment versus default rate. Thank you!
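The observation about coefficients and WoE values having swapped signs can be reproduced with a small, self-contained sketch. It does not use skorecard. The bucket counts are hypothetical, and the two WoE conventions (ln of bad share over good share, and its negative) are assumptions about what could differ between versions or settings. A hand-written Newton-Raphson logistic fit shows that flipping the WoE convention flips the coefficient sign while the product of coefficient and WoE stays the same.

```python
import math

# Hypothetical bucket counts for one categorical feature: (non-defaults, defaults).
buckets = {"A": (400, 40), "B": (300, 90), "C": (100, 70)}

total_good = sum(g for g, _ in buckets.values())
total_bad = sum(b for _, b in buckets.values())


def woe(good, bad, bad_over_good):
    """WoE of a bucket under one of the two sign conventions."""
    ratio = (bad / total_bad) / (good / total_good)
    return math.log(ratio) if bad_over_good else -math.log(ratio)


def fit_logit(xs, goods, bads, n_iter=25):
    """Newton-Raphson fit of P(default) = sigmoid(b0 + b1 * x) on grouped data."""
    b0, b1 = 0.0, 0.0
    for _ in range(n_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, good, bad in zip(xs, goods, bads):
            n = good + bad
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            r = bad - n * p        # observed minus expected defaults
            w = n * p * (1.0 - p)  # Newton weight
            g0 += r
            g1 += r * x
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


goods = [g for g, _ in buckets.values()]
bads = [b for _, b in buckets.values()]

woe_bg = [woe(g, b, bad_over_good=True) for g, b in buckets.values()]
woe_gb = [woe(g, b, bad_over_good=False) for g, b in buckets.values()]

b0_bg, b1_bg = fit_logit(woe_bg, goods, bads)
b0_gb, b1_gb = fit_logit(woe_gb, goods, bads)

# The coefficient changes sign with the convention; coef * woe does not.
contrib_bg = [b1_bg * x for x in woe_bg]
contrib_gb = [b1_gb * x for x in woe_gb]
print(b1_bg, b1_gb)
```

Because the predicted probabilities are identical under both conventions, a probability-based score would be unaffected, while any points formula that hard-codes one sign convention could drift from it.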