
Support for LLaMA-2 #23

Open
junzhang-zj opened this issue Sep 16, 2023 · 17 comments

Comments

@junzhang-zj

I couldn't reach 'allenai/c4' on the Hub.

@junzhang-zj
Author

junzhang-zj commented Sep 17, 2023

I have solved the data problem, but I ran into a new one: after using Wanda to prune LLaMA-2-13B, I got a zero ROUGE-2 score on CNN/DM, and my C4 perplexity under unstructured pruning is as high as 56050.3008.

@junzhang-zj junzhang-zj changed the title Data problem Data & Rouge-2 on CNN/DM problem Sep 17, 2023
@Eric-mingjie
Collaborator

Eric-mingjie commented Sep 23, 2023

Hi, we just updated the repo to support pruning LLaMA-2 models; see here for the corresponding command. We also provide the results from our own run.

@Eric-mingjie Eric-mingjie changed the title Data & Rouge-2 on CNN/DM problem Support for LLaMA2 Sep 23, 2023
@Eric-mingjie Eric-mingjie changed the title Support for LLaMA2 Support for LLaMA-2 Sep 23, 2023
@junzhang-zj
Author

@Eric-mingjie Thanks!

@junzhang-zj
Author

junzhang-zj commented Sep 23, 2023

@Eric-mingjie Could the perplexity results depend on the environment? I still get poor results on LLaMA-2.

@Eric-mingjie
Collaborator

Eric-mingjie commented Sep 23, 2023

For LLaMA-2, I used the transformers library at version 4.34.0.dev0 to load the models, specifically commit 0a55d9f7376f72ad3ff296d4249840021b03bcc4 on the main branch. What ppl number do you get?

@junzhang-zj
Author

My environment is transformers 4.34.0.dev0 and accelerate 0.24.0.dev0; I get ppl 146760.7188, and now a lot of CUDA errors.

@Eric-mingjie
Collaborator

Hmm, can you load the dense llama-2-7b model and test its perplexity? In that case, you can simply pass --sparsity_ratio 0 to skip pruning.
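For reference, a dense-baseline run might look like the following; the flag names are taken from the repo's README, so treat the exact arguments as assumptions:

```shell
# Evaluate the unpruned LLaMA-2-7B model; --sparsity_ratio 0 skips pruning,
# so the reported perplexity is the dense baseline.
python main.py \
    --model meta-llama/Llama-2-7b-hf \
    --prune_method wanda \
    --sparsity_ratio 0 \
    --sparsity_type unstructured
```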

@junzhang-zj
Author

OK, I will try it and check.

@Eric-mingjie
Collaborator

This is the output of conda env export from the conda environment I am running; hope it helps: https://gist.github.com/Eric-mingjie/4ca851c64144d53800d60e4c74ebfbaf

@junzhang-zj
Author

junzhang-zj commented Sep 23, 2023

@Eric-mingjie I get ppl wikitext_train 5.171178340911865 and wikitext_test 4.883730888366699 on LLaMA-2-13B with no pruning.
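For anyone comparing numbers: perplexity here is just the exponential of the average per-token negative log-likelihood over the evaluation text. A minimal sketch with toy probabilities (not the repo's eval code):

```python
import math

# Perplexity = exp(mean negative log-likelihood) over the evaluated tokens,
# equivalently the geometric mean of 1/p over each token probability p.
probs = [0.25, 0.5, 0.125, 0.5]      # toy per-token probabilities
nll = [-math.log(p) for p in probs]
ppl = math.exp(sum(nll) / len(nll))
print(ppl)  # 2 ** 1.75 ≈ 3.3636
```

On this scale, a dense LLaMA-2-13B at roughly 4.9 on wikitext is a sensible result, while values in the tens of thousands mean the model is effectively emitting noise.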

@junzhang-zj
Author

junzhang-zj commented Sep 23, 2023

I think it might help to look at why wrapped_layers[name].scaler_row is an all-zero tensor, which makes the metric fail. Have you run into this? It looks like something is wrong with the hook.
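For context, Wanda scores weight (i, j) as |W[i][j]| * sqrt(scaler_row[j]), where scaler_row accumulates input-activation norms collected by the hooks. A minimal sketch (a hypothetical simplification, not the repo's prune.py) of why an all-zero scaler_row breaks the metric:

```python
# metric[i][j] = |W[i][j]| * sqrt(scaler_row[j]); with scaler_row all zeros,
# every weight gets score 0 and the pruning mask becomes arbitrary.
def wanda_metric(W, scaler_row):
    return [[abs(w) * s ** 0.5 for w, s in zip(row, scaler_row)] for row in W]

W = [[0.5, -2.0], [1.5, 0.1]]
print(wanda_metric(W, [4.0, 1.0]))  # [[1.0, 2.0], [3.0, 0.1]] — informative scores
print(wanda_metric(W, [0.0, 0.0]))  # [[0.0, 0.0], [0.0, 0.0]] — every weight ties
```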

@junzhang-zj
Author

junzhang-zj commented Sep 23, 2023

😭 I finally found the bug: we need to set pretraining_tp to 1; otherwise the module forward is never executed and the hook callback never fires. ppl of LLaMA-2-13B (4:8): wikitext_train 7.27443265914917, wikitext_test 7.004149913787842.
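A small PyTorch sketch of this failure mode (toy code, not the transformers internals): forward hooks registered on an nn.Linear fire only when the module itself is called. If its weight is used in a raw F.linear call, which is what the LLaMA MLP does when pretraining_tp > 1, the hook is silently skipped and the calibration statistics stay at zero.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

lin = nn.Linear(4, 4, bias=False)
calls = []
# Wanda-style hook: would normally accumulate input-activation statistics.
lin.register_forward_hook(lambda mod, inp, out: calls.append(inp[0].shape))

x = torch.randn(2, 4)
lin(x)                   # Module.__call__ runs the hook
F.linear(x, lin.weight)  # same math, but bypasses the module: hook skipped
print(len(calls))        # 1, not 2
```

The fix described above then amounts to setting model.config.pretraining_tp = 1 after loading (attribute name per the Hugging Face LLaMA config) before collecting calibration statistics.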

@Eric-mingjie
Collaborator

That's good to know. I was starting to rerun the code on my end.

@simlaharma

simlaharma commented Jan 26, 2024

> I couldn't reach 'allenai/c4' on the Hub.

Hello @junzhang-zj, how did you solve the data problem? I get the following message:

```
ValueError: BuilderConfig 'allenai--c4' not found. Available: ['en', 'en.noblocklist', 'en.noclean', 'realnewslike', 'multilingual', 'af', 'am', 'ar', 'az', 'be', 'bg', 'bg-Latn', 'bn', 'ca', 'ceb', 'co', 'cs', 'cy', 'da', 'de', 'el', 'el-Latn', 'en-multi', 'eo', 'es', 'et', 'eu', 'fa', 'fi', 'fil', 'fr', 'fy', 'ga', 'gd', 'gl', 'gu', 'ha', 'haw', 'hi', 'hi-Latn', 'hmn', 'ht', 'hu', 'hy', 'id', 'ig', 'is', 'it', 'iw', 'ja', 'ja-Latn', 'jv', 'ka', 'kk', 'km', 'kn', 'ko', 'ku', 'ky', 'la', 'lb', 'lo', 'lt', 'lv', 'mg', 'mi', 'mk', 'ml', 'mn', 'mr', 'ms', 'mt', 'my', 'ne', 'nl', 'no', 'ny', 'pa', 'pl', 'ps', 'pt', 'ro', 'ru', 'ru-Latn', 'sd', 'si', 'sk', 'sl', 'sm', 'sn', 'so', 'sq', 'sr', 'st', 'su', 'sv', 'sw', 'ta', 'te', 'tg', 'th', 'tr', 'uk', 'und', 'ur', 'uz', 'vi', 'xh', 'yi', 'yo', 'zh', 'zh-Latn', 'zu']
```

I changed the code for the c4 data to the following:

```python
traindata = load_dataset('allenai/c4', 'en', data_files={'train': 'en/c4-train.00000-of-01024.json.gz'}, split='train')
valdata = load_dataset('allenai/c4', 'en', data_files={'validation': 'en/c4-validation.00000-of-00008.json.gz'}, split='validation')
```

Then, I started getting the following error:

```
File "/simla/wanda/lib/data.py", line 48, in get_c4
traindata = load_dataset('allenai/c4', 'en', data_files={'train': 'en/c4-train.00000-of-01024.json.gz'}, split='train')
File "/home/.local/lib/python3.10/site-packages/datasets/load.py", line 2549, in load_dataset
builder_instance.download_and_prepare(
File "/home/.local/lib/python3.10/site-packages/datasets/builder.py", line 1005, in download_and_prepare
self._download_and_prepare(
File "/home/.local/lib/python3.10/site-packages/datasets/builder.py", line 1118, in _download_and_prepare
verify_splits(self.info.splits, split_dict)
File "/home/.local/lib/python3.10/site-packages/datasets/utils/info_utils.py", line 92, in verify_splits
raise ExpectedMoreSplits(str(set(expected_splits) - set(recorded_splits)))
datasets.utils.info_utils.ExpectedMoreSplits: {'validation'}
```

I tried downloading with:

```shell
GIT_LFS_SKIP_SMUDGE=1 git clone https://huggingface.co/datasets/allenai/c4
cd c4
git lfs pull --include "en/*"
```

After downloading the whole dataset, I needed to change the load_dataset call to point at the local files, so I did the following:

```python
traindata = load_dataset('/simla/wanda/c4', 'en', data_files={'train': 'en/c4-train.00000-of-01024.json.gz'}, split='train', trust_remote_code=True)
valdata = load_dataset('/simla/wanda/c4', 'en', data_files={'validation': 'en/c4-validation.00000-of-00008.json.gz'}, split='validation', trust_remote_code=True)
```

Now I am getting the following error:

```
Failed to read file '/simla/wanda/c4/en/c4-train.00000-of-01024.json.gz' with error <class 'pyarrow.lib.ArrowInvalid'>: JSON parse error: Invalid value. in row 0
Generating train split: 0%| | 0/364868892 [00:00<?, ? examples/s]
Traceback (most recent call last):
File "/home/.local/lib/python3.10/site-packages/datasets/packaged_modules/json/json.py", line 144, in _generate_tables
dataset = json.load(f)
File "/usr/lib/python3.10/json/__init__.py", line 293, in load
return loads(fp.read(),
File "/usr/lib/python3.10/json/__init__.py", line 346, in loads
return _default_decoder.decode(s)
File "/usr/lib/python3.10/json/decoder.py", line 337, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/usr/lib/python3.10/json/decoder.py", line 355, in raw_decode
raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/home/.local/lib/python3.10/site-packages/datasets/builder.py", line 1973, in _prepare_split_single
for _, table in generator:
File "/home/.local/lib/python3.10/site-packages/datasets/packaged_modules/json/json.py", line 147, in _generate_tables
raise e
File "/home/.local/lib/python3.10/site-packages/datasets/packaged_modules/json/json.py", line 121, in _generate_tables
pa_table = paj.read_json(
File "pyarrow/_json.pyx", line 259, in pyarrow._json.read_json
File "pyarrow/error.pxi", line 144, in pyarrow.lib.pyarrow_internal_check_status
File "pyarrow/error.pxi", line 100, in pyarrow.lib.check_status
pyarrow.lib.ArrowInvalid: JSON parse error: Invalid value. in row 0

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "/simla/wanda/main.py", line 110, in
main()
File "/simla/wanda/main.py", line 69, in main
prune_wanda(args, model, tokenizer, device, prune_n=prune_n, prune_m=prune_m)
File "/simla/wanda/lib/prune.py", line 132, in prune_wanda
dataloader, _ = get_loaders("c4",nsamples=args.nsamples,seed=args.seed,seqlen=model.seqlen,tokenizer=tokenizer)
File "/simla/wanda/lib/data.py", line 80, in get_loaders
return get_c4(nsamples, seed, seqlen, tokenizer)
File "/simla/wanda/lib/data.py", line 50, in get_c4
traindata = load_dataset('/simla/wanda/c4', 'en', data_files={'train': 'en/c4-train.00000-of-01024.json.gz'}, split='train', trust_remote_code=True)
File "/home/.local/lib/python3.10/site-packages/datasets/load.py", line 2549, in load_dataset
builder_instance.download_and_prepare(
File "/home/.local/lib/python3.10/site-packages/datasets/builder.py", line 1005, in download_and_prepare
self._download_and_prepare(
File "/home/.local/lib/python3.10/site-packages/datasets/builder.py", line 1100, in _download_and_prepare
self._prepare_split(split_generator, **prepare_split_kwargs)
File "/home/.local/lib/python3.10/site-packages/datasets/builder.py", line 1860, in _prepare_split
for job_id, done, content in self._prepare_split_single(
File "/home/.local/lib/python3.10/site-packages/datasets/builder.py", line 2016, in _prepare_split_single
raise DatasetGenerationError("An error occurred while generating the dataset") from e
datasets.exceptions.DatasetGenerationError: An error occurred while generating the dataset
```

@junzhang-zj
Author

@simlaharma Have you tried downloading directly from the Hugging Face website and then loading it locally?

@rsong0606

@simlaharma

I had a similar issue to yours. Check this post; it worked for me:

huggingface/datasets#6746

@rakeshsai22

Can we use Wanda to prune the last linear layer in LLaMA-2?
