[Minor] Fix __str__ method in intervenable base model #194

Open · wants to merge 1 commit into main
Conversation

ramvenkat98

Description

Currently, printing or stringifying a model usually throws an error because the __str__ method tries to read an attribute of the class (intervention_types) that does not exist. This PR replaces that read with the correct way to get the intervention types, so that printing the model (and converting it to a string) works.
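
For reference, the shape of the fix is a small change inside __str__: derive the type names from the interventions actually registered on the model instead of reading a nonexistent attribute. Below is a hedged sketch only; the internal container self.interventions and its (intervention, hook) value layout are assumptions about pyvene's internals, not verbatim code from this PR:

import json

def __str__(self):
    # Summary dict rendered with json.dumps when the model is printed.
    # Before the fix this read self.intervention_types, which is never set
    # on IntervenableModel, so torch.nn.Module.__getattr__ raised an
    # AttributeError (see the traceback below).
    attr_dict = {
        "model_type": self.model_type,
        "intervention_types": [
            type(intervention).__name__
            # assumed layout: keys map to (intervention_module, hook) pairs
            for intervention, _ in self.interventions.values()
        ],
        "alignables": self.sorted_keys,
        "mode": self.mode,
    }
    return json.dumps(attr_dict, indent=4)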

Testing Done

Local Testing

Used this small script:

import torch
import pyvene as pv

_, tokenizer, gpt2 = pv.create_gpt2()

# One AdditionIntervention attached to the layer-0 MLP input.
config = pv.IntervenableConfig(
    {"layer": 0, "component": "mlp_input"},
    pv.AdditionIntervention,
)

pv_gpt2 = pv.IntervenableModel(config, model=gpt2)

print(pv_gpt2)

On the base version (without this fix), printing the model raises an exception:

nnsight is not detected. Please install via 'pip install nnsight' for nnsight backend.
/nlp/scr/ram1998/miniconda3/envs/pyreft_dev/lib/python3.12/site-packages/transformers/tokenization_utils_base.py:1617: FutureWarning: `clean_up_tokenization_spaces` was not set. It will be set to `True` by default. This behavior will be deprecated in transformers v4.45, and will be then set to `False` by default. For more details check this issue: https://github.com/huggingface/transformers/issues/31884
  warnings.warn(
loaded model
Traceback (most recent call last):
  File "/juice2/scr2/ram1998/pyvene_print_test.py", line 14, in <module>
    print(pv_gpt2)
  File "/nlp/scr/ram1998/miniconda3/envs/pyreft_dev/lib/python3.12/site-packages/pyvene/models/intervenable_base.py", line 255, in __str__
    "intervention_types": self.intervention_types,
                          ^^^^^^^^^^^^^^^^^^^^^^^
  File "/nlp/scr/ram1998/miniconda3/envs/pyreft_dev/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1729, in __getattr__
    raise AttributeError(f"'{type(self).__name__}' object has no attribute '{name}'")
AttributeError: 'IntervenableModel' object has no attribute 'intervention_types'. Did you mean: '_intervention_state'?
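
For context on the traceback: torch.nn.Module overrides __getattr__ to look up only registered parameters, buffers, and submodules, so reading any attribute that was never assigned fails with exactly this kind of AttributeError. A minimal standalone illustration (plain PyTorch, not pyvene code):

import torch

class Demo(torch.nn.Module):
    pass

m = Demo()
# AttributeError: 'Demo' object has no attribute 'intervention_types'
print(m.intervention_types)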

On the test version (with the fix applied), the output is correct:

nnsight is not detected. Please install via 'pip install nnsight' for nnsight backend.
loaded model
{
    "model_type": "GPT2Model",
    "intervention_types": [
        "AdditionIntervention"
    ],
    "alignables": [
        "layer.0.comp.mlp_input.unit.pos.nunit.1#0"
    ],
    "mode": "parallel"
}

Unit Testing

Added a print statement to the test_less_lazy_demo method (and removed a duplicate of that method), and verified that the print statement gives the correct output. The full unit-test logs (including the printed output) are below.
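
The test change amounts to constructing the intervenable model in test_less_lazy_demo and printing it. A hedged sketch of the updated test body (the fixture name self.gpt2 and the surrounding assertions are assumptions about the suite, not the PR's exact diff):

def test_less_lazy_demo(self):
    # Four VanillaInterventions on mlp_output at layers 0-3, matching the
    # config and model summaries in the log below.
    config = pv.IntervenableConfig(
        [{"layer": l, "component": "mlp_output"} for l in range(4)],
        pv.VanillaIntervention,
    )
    print(config)
    pv_gpt2 = pv.IntervenableModel(config, model=self.gpt2)
    print(pv_gpt2)  # exercises the fixed __str__
    # ... existing intervention assertions unchanged ...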

###############################
running following processes

	python -m unittest discover -s pyvene -p "*TestCase.py"


###############################
command outputs: 


nnsight is not detected. Please install via 'pip install nnsight' for nnsight backend.
'pyvene' is not installed.
PASS: pyvene is not installed. Testing local dev code.
=== Test Suite: VanillaInterventionWithTransformerTestCase ===
loaded model
./juice2/scr2/ram1998/pyvene/pyvene/models/intervenable_base.py:69: DeprecationWarning: The 'warn' function is deprecated, use 'warning' instead
  logging.warn(
WARNING:root:Detected use_fast=True means the intervention location will be static within a batch.

In case multiple location tags are passed only the first one will be considered
.WARNING:root:Detected use_fast=True means the intervention location will be static within a batch.

In case multiple location tags are passed only the first one will be considered
.WARNING:root:Detected use_fast=True means the intervention location will be static within a batch.

In case multiple location tags are passed only the first one will be considered
.loaded model
`GPT2SdpaAttention` is used but `torch.nn.functional.scaled_dot_product_attention` does not support `output_attentions=True` or `head_mask`. Falling back to the manual attention implementation, but specifying the manual implementation will be required from Transformers version v5.0.0 onwards. This warning can be removed using the argument `attn_implementation="eager"` when loading the model.
.loaded model
.loaded model
.loaded model
.loaded model
.loaded model
.loaded model
IntervenableConfig
{
    "model_type": "None",
    "representations": [
        {
            "layer": 0,
            "component": "mlp_output",
            "unit": "pos",
            "max_number_of_units": 1,
            "low_rank_dimension": null,
            "intervention_type": null,
            "intervention": null,
            "subspace_partition": null,
            "group_key": null,
            "intervention_link_key": null,
            "moe_key": null,
            "source_representation": "PLACEHOLDER",
            "hidden_source_representation": null,
            "latent_dim": null
        },
        {
            "layer": 1,
            "component": "mlp_output",
            "unit": "pos",
            "max_number_of_units": 1,
            "low_rank_dimension": null,
            "intervention_type": null,
            "intervention": null,
            "subspace_partition": null,
            "group_key": null,
            "intervention_link_key": null,
            "moe_key": null,
            "source_representation": "PLACEHOLDER",
            "hidden_source_representation": null,
            "latent_dim": null
        },
        {
            "layer": 2,
            "component": "mlp_output",
            "unit": "pos",
            "max_number_of_units": 1,
            "low_rank_dimension": null,
            "intervention_type": null,
            "intervention": null,
            "subspace_partition": null,
            "group_key": null,
            "intervention_link_key": null,
            "moe_key": null,
            "source_representation": "PLACEHOLDER",
            "hidden_source_representation": null,
            "latent_dim": null
        },
        {
            "layer": 3,
            "component": "mlp_output",
            "unit": "pos",
            "max_number_of_units": 1,
            "low_rank_dimension": null,
            "intervention_type": null,
            "intervention": null,
            "subspace_partition": null,
            "group_key": null,
            "intervention_link_key": null,
            "moe_key": null,
            "source_representation": "PLACEHOLDER",
            "hidden_source_representation": null,
            "latent_dim": null
        }
    ],
    "intervention_types": "<class 'pyvene.models.interventions.VanillaIntervention'>",
    "mode": "parallel",
    "sorted_keys": "None",
    "intervention_dimensions": "None"
}
{
    "model_type": "GPT2Model",
    "intervention_types": [
        "VanillaIntervention",
        "VanillaIntervention",
        "VanillaIntervention",
        "VanillaIntervention"
    ],
    "alignables": [
        "layer.0.comp.mlp_output.unit.pos.nunit.1#0",
        "layer.1.comp.mlp_output.unit.pos.nunit.1#0",
        "layer.2.comp.mlp_output.unit.pos.nunit.1#0",
        "layer.3.comp.mlp_output.unit.pos.nunit.1#0"
    ],
    "mode": "parallel"
}
.loaded model
.loaded model
Setting `pad_token_id` to `eos_token_id`:None for open-end generation.
Once upon a time there was a little girl named Lucy. She was three years old and loved to explore. One day, Lucy was walking in the park when
.loaded model
loaded model
.loaded model
loaded model
.loaded model
.You are using the default legacy behaviour of the <class 'transformers.models.t5.tokenization_t5.T5Tokenizer'>. This is expected, and simply means that the `legacy` (previous) behavior will be used so nothing changes for you. If you want to use the new behaviour, set `legacy=False`. This should only be set if you understand what it means, and thoroughly read the reason why this was added as explained in https://github.com/huggingface/transformers/pull/24565
Passing a tuple of `past_key_values` is deprecated and will be removed in Transformers v4.48.0. You should pass an instance of `EncoderDecoderCache` instead, e.g. `past_key_values=EncoderDecoderCache.from_legacy_cache(past_key_values)`.
.loaded model
Directory './tmp/' already exists.
/juice2/scr2/ram1998/pyvene/pyvene/models/intervenable_base.py:179: DeprecationWarning: The 'warn' function is deprecated, use 'warning' instead
  logging.warn(
WARNING:root:The key is provided in the config. Assuming this is loaded from a pretrained module.
.loaded model
.loaded model
.loaded model
.loaded model
.loaded model
Directory './test_output_dir_prefix-8fd8c6' already exists.
WARNING:root:The key is provided in the config. Assuming this is loaded from a pretrained module.
.loaded model
.loaded model
.loaded model
.Removing testing dir ./test_output_dir_prefix-8fd8c6
=== Test Suite: InterventionWithGPT2TestCase ===
loaded model
testing stream: head_attention_value_output with multiple heads positions
testing stream: head_query_output with multiple heads positions
testing stream: head_key_output with multiple heads positions
testing stream: head_value_output with multiple heads positions
.=== Test Suite: InterventionWithLlamaTestCase ===
You are using the default legacy behaviour of the <class 'transformers.models.llama.tokenization_llama.LlamaTokenizer'>. This is expected, and simply means that the `legacy` (previous) behavior will be used so nothing changes for you. If you want to use the new behaviour, set `legacy=False`. This should only be set if you understand what it means, and thoroughly read the reason why this was added as explained in https://github.com/huggingface/transformers/pull/24565 - if you loaded a llama tokenizer from a GGUF file you can ignore this message
loaded model
testing stream: head_attention_value_output with multiple heads positions
testing stream: head_query_output with multiple heads positions
testing stream: head_key_output with multiple heads positions
testing stream: head_value_output with multiple heads positions
.=== Test Suite: InterventionWithMLPTestCase ===
loaded model
......=== Test Suite: CausalModelTestCase ===
......=== Test Suite: IntervenableConfigUnitTestCase ===
loaded model
.=== Test Suite: InterventionUtilsTestCase ===
loaded model
.....Directory './test_output_dir_prefix-ef01f3' created successfully.
WARNING:root:The key is provided in the config. Assuming this is loaded from a pretrained module.
Directory './test_output_dir_prefix-37f95d' created successfully.
WARNING:root:The key is provided in the config. Assuming this is loaded from a pretrained module.
/juice2/scr2/ram1998/pyvene/pyvene/models/intervenable_base.py:1308: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
  saved_state_dict = torch.load(os.path.join(load_directory, binary_filename))
Directory './test_output_dir_prefix-605169' created successfully.
WARNING:root:The key is provided in the config. Assuming this is loaded from a pretrained module.
.Directory './test_output_dir_prefix-39d3c6' created successfully.
.Directory './test_output_dir_prefix-3bcf6c' created successfully.
.tensor([[3.5763e-06, 1.0000e+00, 1.4000e+01, 1.5000e+01, 1.6000e+01, 1.7000e+01],
        [6.0000e+00, 7.0000e+00, 2.0000e+01, 2.1000e+01, 2.2000e+01, 2.3000e+01]],
       grad_fn=<AddBackward0>)
./juice2/scr2/ram1998/pyvene/pyvene/models/interventions.py:437: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
  mask_sigmoid = torch.sigmoid(self.mask / torch.tensor(self.temperature))
.........Removing testing dir ./test_output_dir_prefix-ef01f3
Removing testing dir ./test_output_dir_prefix-37f95d
Removing testing dir ./test_output_dir_prefix-605169
Removing testing dir ./test_output_dir_prefix-39d3c6
Removing testing dir ./test_output_dir_prefix-3bcf6c
.............
----------------------------------------------------------------------
Ran 72 tests in 71.307s

OK
###############################

Checklist:

  • My PR title strictly follows the format: [Your Priority] Your Title
  • I have attached the testing log above
  • I provide enough comments to my code (no comments needed, small self-explanatory change)
  • I have changed documentations (no documentation change needed)
  • I have added tests for my changes

@aryamanarora aryamanarora self-requested a review December 29, 2024 01:22
@aryamanarora aryamanarora self-assigned this Dec 29, 2024