I have been experimenting with and modifying the LightGBMLSS code. One thing I have done is implement the option of using a different evaluation metric from the one defined by the distribution class.
If, in our use case, we care more about the point predictions than about the full predicted distribution, we might choose to evaluate the model based on the MSE of torch_distribution.mean vs the target.
Or we might create an evaluation metric such as the MSE of abs(torch_distribution.mean - target) vs torch_distribution.std.
Or we might optimize with NLL but evaluate using CRPS.
We can also provide multiple metrics and compare the early stopping round each one would select.
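To make this concrete, here is a minimal sketch of what such evaluation metrics could look like for a Gaussian response. The function names, the two-column layout of the predicted parameters, and the LightGBM-style `(name, value, is_higher_better)` return tuple are assumptions for illustration only, not the existing LightGBMLSS interface:

```python
import math

import numpy as np
import torch
from torch.distributions import Normal


def mean_mse(params: np.ndarray, target: np.ndarray):
    # Point-prediction quality only: MSE of the predicted mean vs. the target.
    # `params` is assumed to hold one column per (already transformed)
    # distributional parameter, here the loc and scale of a Gaussian.
    dist = Normal(torch.tensor(params[:, 0]), torch.tensor(params[:, 1]))
    mse = torch.mean((dist.mean - torch.tensor(target)) ** 2).item()
    return "mean_mse", mse, False


def abs_err_vs_std_mse(params: np.ndarray, target: np.ndarray):
    # Rough calibration check: MSE of abs(mean - target) vs. the predicted std.
    dist = Normal(torch.tensor(params[:, 0]), torch.tensor(params[:, 1]))
    abs_err = torch.abs(dist.mean - torch.tensor(target))
    mse = torch.mean((abs_err - dist.stddev) ** 2).item()
    return "abs_err_vs_std_mse", mse, False


def crps_gaussian(params: np.ndarray, target: np.ndarray):
    # Optimize with NLL but evaluate with CRPS, using the closed-form CRPS of
    # a Gaussian forecast (Gneiting & Raftery, 2007).
    loc = torch.tensor(params[:, 0])
    scale = torch.tensor(params[:, 1])
    z = (torch.tensor(target) - loc) / scale
    std_normal = Normal(torch.zeros_like(z), torch.ones_like(z))
    pdf = torch.exp(std_normal.log_prob(z))
    cdf = std_normal.cdf(z)
    crps = scale * (z * (2.0 * cdf - 1.0) + 2.0 * pdf - 1.0 / math.sqrt(math.pi))
    return "crps", crps.mean().item(), False
```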
Some observations from trying this:
Optimizing with NLL while evaluating with the MSE of torch_distribution.mean vs the target resulted in more boosting rounds. It seems there was a point at which the fit of the full distribution was deteriorating while the loc parameter was still improving.
If we evaluate on a different metric from the one we optimize, early stopping is sometimes triggered even when the training set is used as the evaluation set. This seems quite useful when data is scarce, as we don't necessarily require a hold-out set.
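For context, a sketch of how a custom evaluation metric that differs from the training objective plugs into plain LightGBM's early stopping (the LightGBMLSS integration itself works differently; the data, parameters, and the simple MAE metric below are purely illustrative):

```python
import lightgbm as lgb
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))
y = X[:, 0] + rng.normal(scale=0.5, size=1000)


def mae_eval(preds: np.ndarray, eval_data: lgb.Dataset):
    # Custom evaluation metric, deliberately different from the L2 objective.
    labels = eval_data.get_label()
    return "mae_eval", float(np.mean(np.abs(preds - labels))), False


dtrain = lgb.Dataset(X[:800], label=y[:800])
dvalid = lgb.Dataset(X[800:], label=y[800:], reference=dtrain)

booster = lgb.train(
    {"objective": "regression", "metric": "None", "verbosity": -1},
    dtrain,
    num_boost_round=1000,
    valid_sets=[dtrain, dvalid],   # the training set is monitored as well
    valid_names=["train", "valid"],
    feval=mae_eval,
    callbacks=[lgb.early_stopping(stopping_rounds=20)],
)
print("best iteration:", booster.best_iteration)
```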
Next Steps
@StatMixedML I'm happy to put this together in a PR, but I only want to do so if you have time to review it.
@Fish-Soup Thanks for opening the issue, and please excuse my late reply.
Using a different metric from the one we optimize sounds interesting, even though I don't agree that we should use MSE to evaluate the accuracy of a distributional model.