Ecosyste.ms: Timeline

Browse the timeline of events for every public repo on GitHub. Data updated hourly from GH Archive.

Tlntin

Tlntin pushed 1 commit to main Tlntin/qwen-ascend-llm
  • support pytorch session type fb8640d

Tlntin pushed 1 commit to main Tlntin/qwen-ascend-llm
  • update kv-cache to 2048, reduce memory use, speed onnx runtime b6b11e6

Tlntin created a comment on an issue on Tlntin/qwen-ascend-llm
The new version optimizes memory loading a bit; it should now handle a 10k (10240) context. Still testing; I'll push it once everything checks out.

Tlntin created a comment on an issue on Tlntin/qwen-ascend-llm
@Carious-ads When converting to ONNX you set kv-cache-length to 4096, so when running cli_chat you also need to pass `--max_output_length` as 4096. Omitting it causes a shape mismatch, which is why you got the error above. I'll add a note about this to the README later.

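The constraint described in the comment above can be sketched as a minimal consistency check (the function name and error message are illustrative, not the project's actual code):

```python
# Illustrative check: the runtime --max_output_length must match the
# kv-cache length the model was exported with, or tensor shapes clash.
def check_lengths(kv_cache_length: int, max_output_length: int) -> None:
    if max_output_length != kv_cache_length:
        raise ValueError(
            f"--max_output_length ({max_output_length}) must equal the "
            f"kv-cache-length used at ONNX export time ({kv_cache_length})"
        )

check_lengths(4096, 4096)  # OK: matches the export-time setting
```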
Tlntin starred hyperai/triton-cn
Tlntin pushed 1 commit to main Tlntin/qwen-ascend-llm
  • fixup some bug when run api.py d56c319

Tlntin starred Tlntin/qwen-ascend-llm
Tlntin pushed 2 commits to main Tlntin/qwen-ascend-llm
  • code optimization 7b7e445
  • Merge branch 'main' of github.com:Tlntin/qwen-ascend-llm 8a27cbb

Tlntin closed an issue on Tlntin/qwen-ascend-llm
How is the inference speed?
Tlntin created a comment on an issue on Tlntin/qwen-ascend-llm
qwen2-1.5b already takes about 3 GB of model files, so this should be unrelated to the 2 GB limit: anything over 2 GB gets split into a bunch of smaller files. Most likely the limit is not the cause; you probably ran out of memory.

Tlntin closed an issue on Tlntin/qwen-ascend-llm
ONNX 2 GB size limit
Hello, when I tested converting qwen-7b to ONNX it errored out and aborted, but converting qwen-1.5b worked fine. I understand a single ONNX file has a size limit, so I'd like to ask: have you tested larger models?
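For context on the 2 GB question: the ONNX protobuf container caps a single serialized file at 2 GiB, and exporters work around this by writing large weights as external data files. A rough back-of-envelope check (illustrative only, assuming fp16 weights at 2 bytes per parameter):

```python
# The ONNX protobuf format caps a single serialized file at 2 GiB;
# larger models are saved with weights split into external data files.
TWO_GIB = 2**31  # protobuf serialized-size limit in bytes

def needs_external_data(num_params: int, bytes_per_param: int = 2) -> bool:
    """Rough check: does the raw fp16 weight payload exceed the 2 GiB cap?"""
    return num_params * bytes_per_param > TWO_GIB

# qwen2-1.5b in fp16: ~1.5e9 params * 2 bytes ≈ 3 GB, so it is split
print(needs_external_data(1_500_000_000))  # True
```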
Tlntin created a comment on an issue on Tlntin/qwen-ascend-llm
@yaohaojie I added 910 adaptation today, so 310p should run as well, though inference speed will certainly be far below MindIE.

Tlntin pushed 1 commit to main Tlntin/qwen-ascend-llm

Tlntin pushed 1 commit to main Tlntin/qwen-ascend-llm

Tlntin pushed 1 commit to main Tlntin/qwen-ascend-llm

Tlntin pushed 1 commit to main Tlntin/qwen-ascend-llm

Tlntin pushed 1 commit to main Tlntin/qwen-ascend-llm

Tlntin pushed 1 commit to main Tlntin/qwen-ascend-llm
  • code optimization for support 910A(and more) 607b473

Tlntin starred luhengshiwo/LLMForEverybody
Tlntin created a comment on an issue on tw93/Pake
See this issue: [Can't install on Arch linux](https://github.com/tw93/Pake/issues/113)

Tlntin closed an issue on Tlntin/qwen-ascend-llm
Incorrect inference output
I'm on an OpenEuler system, not using the Docker image; I converted the OM model manually, and inference produces this output: ![image](https://github.com/user-attachments/assets/6d50f9db-5387-4852-a0c8-df38c555824a)
Tlntin created a comment on an issue on Tlntin/qwen-ascend-llm
You're welcome.

Tlntin created a comment on an issue on Tlntin/qwen-ascend-llm
I just noticed the kv-cache you set above is quite long. Try the default 1024; if it is too long, you may run out of memory.

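The memory concern above can be made concrete with a standard KV-cache size estimate: two tensors (K and V) per layer, each scaling linearly with sequence length. The layer and head counts below are hypothetical, not Qwen's exact config:

```python
# Back-of-envelope KV-cache memory: 2 tensors (K and V) per layer,
# each of shape [kv_heads, seq_len, head_dim], stored in fp16 (2 bytes).
def kv_cache_bytes(layers: int, kv_heads: int, head_dim: int,
                   seq_len: int, dtype_bytes: int = 2) -> int:
    return 2 * layers * kv_heads * head_dim * seq_len * dtype_bytes

# Hypothetical small-model shapes (illustrative only):
mb = kv_cache_bytes(layers=28, kv_heads=2, head_dim=128, seq_len=1024) / 2**20
print(f"{mb:.0f} MiB")  # prints "28 MiB"; doubling seq_len doubles the cache
```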
Tlntin created a comment on an issue on Tlntin/qwen-ascend-llm
Then it is probably a CANN problem. Which CANN version are you on? And which Python version?

Tlntin created a comment on an issue on Tlntin/qwen-ascend-llm
The steps look fine. Can you check whether ONNX inference works correctly?

Tlntin created a comment on an issue on Tlntin/qwen-ascend-llm
Judging from the error log, it looks like dynamic shapes were not set when converting the model.

Tlntin pushed 1 commit to main Tlntin/qwen-mindspore-lite-llm
