Tlntin Events in 2024 - Ecosyste.ms: Timeline

Tlntin pushed 1 commit to main Tlntin/qwen-ascend-llm

October 24, 2024 2:41pm

support pytorch session type fb8640d

Tlntin pushed 1 commit to main Tlntin/qwen-ascend-llm

October 24, 2024 10:32am

update kv-cache to 2048, reduce memory use, speed onnx runtime b6b11e6

Tlntin created a comment on an issue on Tlntin/qwen-ascend-llm

October 23, 2024 7:15am

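The new version optimizes memory loading, so it should now handle a 10k (10240) context. It is still being tested; I will push it once everything checks out.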

Tlntin created a comment on an issue on Tlntin/qwen-ascend-llm

October 23, 2024 7:10am

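@Carious-ads When you converted to ONNX you set kv-cache-length to 4096, so when running cli_chat you also need to pass `--max_output_length` as 4096. Without it, the shapes will not match, which is why you got the error above. I will add a note about this to the README later.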
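
The constraint described in this comment can be sketched as a small pre-flight check. This is only an illustration: `check_lengths` is a hypothetical helper, not a function from the repository, and it assumes the two settings simply have to be equal, as the comment suggests.

```python
def check_lengths(kv_cache_length: int, max_output_length: int) -> None:
    """Raise if the chat-time length differs from the length used at export time.

    Hypothetical helper for illustration only; not part of qwen-ascend-llm.
    """
    if kv_cache_length != max_output_length:
        raise ValueError(
            f"max_output_length={max_output_length} does not match "
            f"kv_cache_length={kv_cache_length}; "
            "the KV-cache tensors would have the wrong shape"
        )


# Exported with a 4096-long KV cache, so the chat side must use 4096 too.
check_lengths(kv_cache_length=4096, max_output_length=4096)  # passes silently
```

Failing early with a readable message is friendlier than the shape error reported in the issue.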

Tlntin starred hyperai/triton-cn

October 22, 2024 7:26am

Tlntin pushed 1 commit to main Tlntin/qwen-ascend-llm

October 22, 2024 3:24am

fixup some bug when run api.py d56c319

Tlntin starred Tlntin/qwen-ascend-llm

October 22, 2024 3:13am

Tlntin pushed 2 commits to main Tlntin/qwen-ascend-llm

October 22, 2024 3:13am

code optimization 7b7e445
Merge branch 'main' of github.com:Tlntin/qwen-ascend-llm 8a27cbb

Tlntin closed an issue on Tlntin/qwen-ascend-llm

October 21, 2024 3:34pm

How is the inference speed?

Tlntin created a comment on an issue on Tlntin/qwen-ascend-llm

October 21, 2024 3:34pm

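qwen2-1.5b already takes up about 3 GB of model files, so this should be unrelated to the 2 GB size limit. Anything over 2 GB gets split into a set of smaller files, so the limit is most likely not the cause; you have probably run out of memory.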

Tlntin closed an issue on Tlntin/qwen-ascend-llm

October 21, 2024 3:34pm

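ONNX 2 GB size limit

Hello, when I test qwen-7b, converting to ONNX fails and the process aborts. Converting qwen-1.5b to ONNX works fine, though. I understand that a single ONNX file has a size limit, so I wanted to ask whether larger models have been tested.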

Tlntin created a comment on an issue on Tlntin/qwen-ascend-llm

October 21, 2024 3:32pm

@yaohaojie I did the 910 adaptation today, so the 310p should be able to run it as well, but inference will certainly be far slower than MindIE.

Tlntin closed an issue on Tlntin/qwen-ascend-llm

October 21, 2024 3:32pm

OM model inference crashes with a segmentation fault (core dumped)

Tlntin pushed 1 commit to main Tlntin/qwen-ascend-llm

October 21, 2024 3:31pm

Update README.md e6d27bf

Tlntin pushed 1 commit to main Tlntin/qwen-ascend-llm

October 21, 2024 3:27pm

update README.md a087551

Tlntin pushed 1 commit to main Tlntin/qwen-ascend-llm

October 21, 2024 3:26pm

update requirements.txt a1da4e4

Tlntin pushed 1 commit to main Tlntin/qwen-ascend-llm

October 21, 2024 3:13pm

code optimization 99530d5

Tlntin pushed 1 commit to main Tlntin/qwen-ascend-llm

October 21, 2024 6:48am

Update README.md d2dd231

Tlntin pushed 1 commit to main Tlntin/qwen-ascend-llm

October 21, 2024 6:47am

code optimization for support 910A（and more) 607b473

Tlntin starred luhengshiwo/LLMForEverybody

October 18, 2024 10:14am

Tlntin created a comment on an issue on tw93/Pake

October 18, 2024 9:56am

See this: [Can't install on Arch linux](https://github.com/tw93/Pake/issues/113)

Tlntin closed an issue on Tlntin/qwen-ascend-llm

October 18, 2024 6:02am

Inference output is incorrect

I am using OpenEuler without the Docker image and converted the OM model manually. At inference time I get this output: ![image](https://github.com/user-attachments/assets/6d50f9db-5387-4852-a0c8-df38c555824a)

Tlntin created a comment on an issue on Tlntin/qwen-ascend-llm

October 18, 2024 6:00am

You're welcome.

Tlntin created a comment on an issue on Tlntin/qwen-ascend-llm

October 18, 2024 3:28am

I just noticed that the kv-cache length you set above is rather long; try the default of 1024. If it is too long, you may not have enough memory.
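
As a rough illustration of why a longer kv-cache costs more memory, the sketch below estimates the cache size of a generic decoder-only transformer. The layer count, KV-head count, head dimension, and fp16 element size are illustrative assumptions, not values read from the repository or from any specific Qwen checkpoint, and the repository's actual memory use may differ.

```python
def kv_cache_bytes(
    seq_len: int,
    num_layers: int = 28,      # assumed, illustrative value
    num_kv_heads: int = 2,     # assumed, illustrative value
    head_dim: int = 128,       # assumed, illustrative value
    bytes_per_elem: int = 2,   # assumed fp16 storage
    batch_size: int = 1,
) -> int:
    """Estimate KV-cache size: one K and one V tensor per layer."""
    per_token = 2 * num_layers * num_kv_heads * head_dim * bytes_per_elem
    return per_token * seq_len * batch_size


for length in (1024, 4096):
    mib = kv_cache_bytes(length) / (1024 * 1024)
    print(f"kv-cache length {length}: about {mib:.1f} MiB")
```

The estimate grows linearly with the cache length, so going from 1024 to 4096 quadruples this part of the footprint under these assumptions.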

Tlntin created a comment on an issue on Tlntin/qwen-ascend-llm

October 18, 2024 3:24am

Then it may be a CANN issue. What is your CANN version? And your Python version?

Tlntin created a comment on an issue on Tlntin/qwen-ascend-llm

October 18, 2024 3:17am

Your steps look fine. Could you check whether ONNX inference works correctly?

Tlntin created a comment on an issue on Tlntin/qwen-ascend-llm

October 18, 2024 3:08am

Judging from the error log, it looks like dynamic shape was not set when converting the model.

Tlntin pushed 1 commit to main Tlntin/qwen-mindspore-lite-llm

October 17, 2024 1:05pm

code optimization 47a6b7d
