> Thank you for your feedback. It seems that the context buffer allocation is not compatible with the shape of the omnivision-968M vision embedding. We will fix it as soon as possible (in 1 or 2 ...
> I'm not quite sure what your question is. Could you describe the problem more clearly?
Like TokenPacker, the image tokens are compressed. For example, the original number of image tokens is
about 27*27 = 7...
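
To make the arithmetic concrete, here is a minimal sketch of how the token count follows from the patch grid. The helper `image_token_count` and the `reduction_factor` value of 9 are my own assumptions for illustration, not taken from llama.cpp or the model code; check the model card for the actual compressed count.

```python
def image_token_count(grid_side: int, reduction_factor: int = 1) -> int:
    """Return the number of image tokens for a square patch grid.

    grid_side: patches per side (27 in the comment above).
    reduction_factor: hypothetical compression ratio applied by the projector.
    """
    total = grid_side * grid_side
    if reduction_factor < 1 or total % reduction_factor != 0:
        raise ValueError("reduction_factor must be >= 1 and divide the token count evenly")
    return total // reduction_factor


# Uncompressed: one token per patch in the 27 x 27 grid.
uncompressed = image_token_count(27)

# Compressed: the factor 9 is an assumed value, used only as an example.
compressed = image_token_count(27, reduction_factor=9)

print(uncompressed, compressed)
```

If the runtime reports the uncompressed count per image, that would suggest the compression step is not being applied in that build.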
### Question or Issue
I am using llama.cpp to run inference with OmniVision-968M on an Android device.
It reports far more image and text tokens than the numbers mentioned on Hugging Face.
Can someone help?...