Run Benchmark for custom model #2

Open
AnFreTh opened this issue Nov 14, 2024 · 1 comment

AnFreTh commented Nov 14, 2024

Hi,

First of all, great code and paper!
I would like to run a simple model and compare it with your results. Is there an easy way to run, for example, a simple sklearn model so that I can compare it directly to the results reported in your paper?

Contributor

puhsu commented Nov 14, 2024

Hi! I'm glad to hear that you are interested in the benchmark!

We plan to add a more streamlined way to set up and process the datasets.

Until then, the steps to use TabReD are as follows:

  1. Create an env (following the README instructions)
  2. Run mkdir data
  3. Run python preprocessing/<dataset-name>.py for each dataset (it should be quick; the longest parts are the downloads)
  4. Modify any of the bin scripts as appropriate. I suggest looking at bin/xgboost.py and swapping the model for the one you would like to test (the GBDT implementations are sklearn-API compatible, so an sklearn model should slot in the same way).
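
For step 4, the model swap could look roughly like the sketch below. It is not taken from bin/xgboost.py, whose internals are not shown in this thread. It only illustrates the point that any object exposing the sklearn-style fit/predict interface can be passed to the same evaluation code. MeanRegressor, rmse, evaluate and the toy data are hypothetical names made up for illustration, and RMSE is an assumed metric, so check the bin script for the metric actually used per dataset.

```python
import math


class MeanRegressor:
    """Hypothetical stand-in baseline with an sklearn-style fit/predict API.

    Replace it with a real estimator (for example one from sklearn);
    the evaluation code below only relies on fit() and predict().
    """

    def fit(self, X, y):
        # Remember the mean training target; X is ignored by this baseline.
        self.mean_ = sum(y) / len(y)
        return self

    def predict(self, X):
        # Predict the stored mean for every row.
        return [self.mean_ for _ in X]


def rmse(y_true, y_pred):
    """Root mean squared error (assumed metric, not confirmed by the thread)."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def evaluate(model, X_train, y_train, X_test, y_test):
    """Fit any fit/predict-compatible model and score it on the test split."""
    model.fit(X_train, y_train)
    return rmse(y_test, model.predict(X_test))


# Toy data standing in for a preprocessed dataset split (illustration only).
X_train = [[0.0], [1.0], [2.0], [3.0]]
y_train = [1.0, 2.0, 3.0, 4.0]
X_test = [[4.0], [5.0]]
y_test = [2.0, 3.0]

score = evaluate(MeanRegressor(), X_train, y_train, X_test, y_test)
print(f"test RMSE: {score:.3f}")
```

Because evaluate only calls fit and predict, swapping in a different model means changing the single argument passed to it. To compare with the paper's numbers, the data splits and metric must match the ones the bin scripts use.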

I'll keep the issue open for now, until we have a more streamlined setup for dataset preparation.
