Hello, I used the scripts to prune OPT-66B (unstructured, n_samples = 128).
After pruning, I get a wikitext perplexity of 3404, which is way off the number reported in the paper.
I was wondering whether the script's output metric should be scaled by 0.001 (i.e., a perplexity of 3.404),
or whether this is simply an outlier result.
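For reference, here is a minimal sketch of how I understand the wikitext perplexity to be computed: a standard exp-of-mean-NLL evaluation over the raw wikitext-2 test split, with no extra scaling applied. This is my own standalone loop, not the repo's code, and the small OPT checkpoint name is only a placeholder for the pruned OPT-66B model.

```python
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder checkpoint; swap in the pruned OPT-66B model directory.
model_name = "facebook/opt-125m"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype=torch.float16, device_map="auto"
)
model.eval()

# Concatenate the raw wikitext-2 test split and tokenize it once.
test = load_dataset("wikitext", "wikitext-2-raw-v1", split="test")
enc = tokenizer("\n\n".join(test["text"]), return_tensors="pt")

seqlen = 2048  # context length used for evaluation
input_ids = enc.input_ids
n_chunks = input_ids.size(1) // seqlen

nlls = []
for i in range(n_chunks):
    chunk = input_ids[:, i * seqlen:(i + 1) * seqlen].to(model.device)
    with torch.no_grad():
        # Passing labels=chunk returns the mean cross-entropy over the chunk.
        loss = model(chunk, labels=chunk).loss
    # Approximate total NLL for this chunk.
    nlls.append(loss.float() * seqlen)

# Perplexity = exp(total NLL / total tokens); reported as-is, no 0.01 scaling.
ppl = torch.exp(torch.stack(nlls).sum() / (n_chunks * seqlen))
print(f"wikitext perplexity: {ppl.item():.2f}")
```

With this recipe the raw value is already the perplexity, which is why the 3404 result looked suspicious to me.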
This seems to be an outlier result, which I have also seen before when running on OPT-66B. I haven't been able to look into it (mainly because LLaMA and LLaMA-2 are much more popular), but it would be interesting to study why this happens from a scientific perspective.