
size mismatch for vision_model.post_layernorm.weight #355

Open
zhanghang1995 opened this issue Nov 28, 2024 · 8 comments
@zhanghang1995

When I use LLaVA-Video for inference, loading the model produces these errors:

[screenshot of the size-mismatch traceback]

@zhanghang1995 (Author)

I load the models from local paths: lmms-lab/LLaVA-Video-7B-Qwen2 and siglip-so400m-patch14-384.
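One thing worth checking with local checkpoints (an assumption, not confirmed in this thread): the language model's config.json records the vision tower path, and if that entry still points at a name that does not resolve locally, loading can fall back to the wrong encoder class. A hedged sketch, assuming the LLaVA-style mm_vision_tower key; the paths are placeholders:

# Hypothetical check: make sure the vision tower path recorded in the
# LLaVA checkpoint's config.json resolves to the local SigLIP directory.
import json

with open("LLaVA-Video-7B-Qwen2/config.json") as f:  # local path, adjust as needed
    cfg = json.load(f)
print(cfg.get("mm_vision_tower"))  # e.g. "google/siglip-so400m-patch14-384"

cfg["mm_vision_tower"] = "/path/to/siglip-so400m-patch14-384"  # your local copy
with open("LLaVA-Video-7B-Qwen2/config.json", "w") as f:
    json.dump(cfg, f, indent=2)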

@KejiaZhang-Robust commented Nov 29, 2024

I have the same problem. Could the authors' team help us fix it?

@KejiaZhang-Robust

[screenshot of the same size-mismatch errors]

@KejiaZhang-Robust

I am attempting to load a CLIPConfig to work around the issue, as in the following code:

from transformers import CLIPConfig, CLIPVisionModel

config = CLIPConfig.from_pretrained('google/siglip-so400m-patch14-384')
self.vision_tower = CLIPVisionModel.from_pretrained(self.vision_tower_name, device_map=device_map, config=config.vision_config)

This gets me down to a single dimension-mismatch error when the weights are initialized:

ValueError: Trying to set a tensor of shape torch.Size([729, 1152]) in "weight" (which has shape torch.Size([730, 1152])), this looks incorrect.
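The off-by-one in that error is consistent with mixing CLIP and SigLIP classes: SigLIP has no [CLS] token, so siglip-so400m-patch14-384 stores (384 // 14)**2 = 729 position embeddings, while CLIPVisionModel allocates 729 + 1 = 730 rows to make room for the extra class embedding. A minimal sketch of loading the checkpoint with the matching SigLIP classes instead (assuming a transformers version that ships SiglipVisionModel, i.e. >= 4.37):

# Sketch: load the SigLIP tower with the Siglip* classes rather than CLIP* ones.
# SigLIP has no [CLS] token, so its position-embedding table has
# (384 // 14)**2 = 729 rows; CLIPVisionModel expects 730 (patches + class token).
from transformers import SiglipVisionModel

vision_tower = SiglipVisionModel.from_pretrained(
    "google/siglip-so400m-patch14-384",  # or the local checkpoint path
)
print(vision_tower.vision_model.embeddings.position_embedding.weight.shape)
# expected: torch.Size([729, 1152])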

@KejiaZhang-Robust

Hi @Luodian @kcz358 @ZhangYuanhan-AI @ChunyuanLI,

Could you kindly take a look at this issue? Your input would be greatly appreciated in resolving it.

@ZHANGH83 commented Dec 6, 2024

I got the same size-mismatch errors for vision_model.embeddings/encoder/post_layernorm... when running inference with LLaVA-OV.
Is this a problem with the vision tower? It seems siglip-so400m-patch14-384 does not match the LLaVA checkpoints...
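One way to check whether the checkpoint and the model class disagree is to inspect the shapes stored in the weights file directly, without instantiating any model. A diagnostic sketch with safetensors (the tensor name assumes the standard SiglipModel layout):

# Download only the weights file and read tensor shapes lazily.
from huggingface_hub import hf_hub_download
from safetensors import safe_open

path = hf_hub_download("google/siglip-so400m-patch14-384", "model.safetensors")
with safe_open(path, framework="pt") as f:
    key = "vision_model.embeddings.position_embedding.weight"
    print(f.get_slice(key).get_shape())  # [729, 1152] -> SigLIP layout, no CLS token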

@ZHANGH83 commented Dec 6, 2024

Hey guys, I just tried the method in #246 (comment), and it works for me!

[screenshot of the successful run]
