Chess SAE trainer - tensor dim mismatch #2

Ivan-Z · 2024-08-14T04:29:49Z

Trying to repro the chess SAE training:

python circuits/sae_training/chess_sae_trainer.py --save_dir=/tmp/sae_debug

After modifying this line to pass the meta.pkl from circuits/resources/meta.pkl
https://github.com/adamkarvonen/SAE_BoardGameEval/blob/master/circuits/sae_training/chess_sae_trainer.py#L65

I get:

  File "/home/tmp/SAE_BoardGameEval/circuits/dictionary_learning/buffer.py", line 404, in refresh
    self.activations = t.cat([self.activations, hidden_states.to(self.device)], dim=0)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: Sizes of tensors must match except in dimension 0. Expected size 2048 but got size 512 for tensor number 1 in the list.

adamkarvonen · 2024-08-14T17:11:30Z

Apologies for this! I fixed chess_sae_trainer.py. There was a mismatch between submodule_type (mlp, dim 2048) and submodule (resid_post, dim 512).
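
For anyone hitting the same error, here is a rough sketch of why the two settings have to agree. It is not code from the repository: `ACTIVATION_DIMS`, `buffer_shape_for`, and `check_cat_compatible` are hypothetical names, and shapes are modelled as plain tuples rather than tensors. The only facts taken from this thread are the two widths (mlp is 2048, resid_post is 512) and the rule in the error message that all dimensions except dim 0 must match when concatenating along dim 0.

```python
# Hypothetical illustration; none of these names exist in the repository.
# Widths are the ones quoted in this thread.
ACTIVATION_DIMS = {"resid_post": 512, "mlp": 2048}


def buffer_shape_for(submodule_type, n_rows=0):
    """Shape of an activation buffer sized from the configured submodule type."""
    return (n_rows, ACTIVATION_DIMS[submodule_type])


def check_cat_compatible(buffer_shape, new_shape):
    """Apply the rule behind the RuntimeError: when concatenating along
    dim 0, every other dimension must match. Returns the resulting shape."""
    buffer_shape, new_shape = tuple(buffer_shape), tuple(new_shape)
    if len(buffer_shape) != len(new_shape) or buffer_shape[1:] != new_shape[1:]:
        raise ValueError(
            f"cannot concatenate along dim 0: buffer has shape {buffer_shape}, "
            f"new activations have shape {new_shape}"
        )
    return (buffer_shape[0] + new_shape[0],) + buffer_shape[1:]


# Buffer sized for "mlp", but activations actually read from resid_post.
buffer_shape = buffer_shape_for("mlp")
hidden_shape = (4096, ACTIVATION_DIMS["resid_post"])  # 4096 rows is arbitrary

try:
    check_cat_compatible(buffer_shape, hidden_shape)
except ValueError as err:
    print(err)

# With matching settings the concatenation is well defined.
print(check_cat_compatible(buffer_shape_for("resid_post"), hidden_shape))
```

A check like this, run once before the buffer starts filling, would turn the failure inside `refresh` into an immediate, more readable configuration error.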

However, after further thought I believe train_saes_parallel.py is the better default training script. I have archived chess_sae_trainer.py and othello_sae_trainer.py, and updated the training README with instructions for using train_saes_parallel.py.

I have tested this script on both ChessGPT and OthelloGPT using a variety of SAE training types (TopK, P_Anneal, Gated, and Standard).

Please let me know if you have any other issues.
