Ecosyste.ms: Timeline

Browse the timeline of events for every public repo on GitHub. Data updated hourly from GH Archive.

sam-hey

sam-hey created a branch on sam-hey/mteb

b1.12.1-mteb - MTEB: Massive Text Embedding Benchmark

sam-hey pushed 1 commit to index-colbert sam-hey/mteb

sam-hey pushed 1 commit to index-colbert sam-hey/mteb
  • feat(wip): colbert with index 7c20f1b

sam-hey created a branch on sam-hey/mteb

index-colbert - MTEB: Massive Text Embedding Benchmark

sam-hey pushed 1 commit to colbert-with-index sam-hey/mteb

sam-hey opened an issue on embeddings-benchmark/mteb
GermanDPR Dataset Causes Cross-Encoder Failure Due to Unexpected dict
When using the GermanDPR dataset with a CrossEncoder, the dataset is returning a dict instead of a str. This results in an error because the CrossEncoder expects text data as a string. The follo...

sam-hey pushed 8 commits to main sam-hey/mteb
  • doc: colbert add score_function & doc section (#1592) * doc: colbert add score_function & doc section * doc: Update... 992b20b
  • Feat: add support for scoring function (#1594) * add support for scoring function * lint * move similarity to wrap... 8e6ee46
  • Add new models nvidia, gte, linq (#1436) * Add new models nvidia, gte, linq * add warning for gte-Qwen and nvidia m... 95d5ae5
  • Leaderboard: Refined plots (#1601) * Added embedding size guide to performance-size plot, removed shading on radar c... 0c9e046
  • fix: Leaderboard refinements (#1603) * Added explanation of aggregate measures * Added download button to result ... 6ecc86f
  • 1.25.1 Automatically generated by python-semantic-release 5e9c468
  • Feat: Use similarity scores if available (#1602) * Use similarity scores if available * lint b81b584
  • Merge branch 'embeddings-benchmark:main' into main 4a75f71

sam-hey pushed 1 commit to main sam-hey/ColBERT-training

sam-hey pushed 1 commit to main sam-hey/mteb

sam-hey pushed 1 commit to main sam-hey/mteb

sam-hey created a review comment on a pull request on embeddings-benchmark/mteb
I completely agree with you. In my opinion, it's a bit surprising that `ModelMeta.similarity_fn_name` isn't being utilized. Priorities: 1. Pass `score_function` directly to `run()` 2. Utilize ...

sam-hey created a review on a pull request on embeddings-benchmark/mteb

sam-hey created a review comment on a pull request on embeddings-benchmark/mteb
https://github.com/embeddings-benchmark/mteb/pull/1592 Resolved this issue. Apologies for the inconvenience, and thank you very much for your support!

sam-hey created a review on a pull request on embeddings-benchmark/mteb

sam-hey opened a pull request on embeddings-benchmark/mteb
doc: colbert add score_function & doc section
Closes: https://github.com/embeddings-benchmark/mteb/issues/1589 Improve the documentation and add information about the PLAID Index in PyLate.

sam-hey pushed 1 commit to main sam-hey/mteb
  • doc: colbert add score_function & doc section fb628ee

sam-hey pushed 0 commits to main sam-hey/mteb

sam-hey pushed 5 commits to main sam-hey/mteb
  • fix: Eval langs not correctly passed to monolingual tasks (#1587) * fix SouthAfricanLangClassification.py * add che... 373db74
  • 1.24.2 Automatically generated by python-semantic-release eecc9f1
  • feat: Add ColBert (#1563) * feat: add max_sim operator for IR tasks to support multi-vector models * docs: add doc ... fdfdaef
  • 1.25.0 Automatically generated by python-semantic-release b466051
  • Merge branch 'embeddings-benchmark:main' into main 1fbbd4e

sam-hey created a comment on an issue on embeddings-benchmark/mteb
Hello @CZH-THU, passing a ColBERT model directly is not supported and will default to cosine similarity, which results in an error. You can refer to this example for guidance: [Using Late Inter...

sam-hey pushed 1 commit to main sam-hey/RAGatouille-training
  • fix: python number to big for srsly fa3a7fb

sam-hey created a comment on a pull request on embeddings-benchmark/mteb
@Samoed ran additional tasks, and the results were as expected. Added a note to the documentation indicating that MaxSim becomes resource-intensive with large datasets. A solution is already und...

sam-hey created a branch on sam-hey/mteb

colbert-with-index - MTEB: Massive Text Embedding Benchmark

sam-hey pushed 17 commits to main sam-hey/mteb
  • fix(bm25s): search implementation (#1566) fix: bm25s implementation ac44e58
  • 1.22.1 Automatically generated by python-semantic-release b8ff89c
  • docs: Fix dependency library name for bm25s (#1568) * fix: bm25s implementation * correct library name --------- ... 03347eb
  • fix: Add training dataset to model meta (#1561) * fix: Add training dataset to model meta Adresses #1556 * Add... 6489fca
  • feat: (cohere_models) cohere_task_type issue, batch requests and tqdm for visualization (#1564) * feat: batch reques... 1d21818
  • fix(publichealth-qa): ignore rows with `None` values in `question` or `answer` (#1565) 68bd8ac
  • 1.23.0 Automatically generated by python-semantic-release 2550a27
  • fix: Added metadata for miscellaneous models (#1557) * Added script for generating metadata, and metadata for the li... ce8c175
  • 1.23.1 Automatically generated by python-semantic-release f9ede12
  • fix: Added radar chart displaying capabilities on task types (#1570) * Added radar chart displaying capabilities on ... c49f838
  • 1.23.2 Automatically generated by python-semantic-release e605c7b
  • feat: add new arctic v2.0 models (#1574) * feat: add new arctic v2.0 models * chore: make lint 53756ad
  • 1.24.0 Automatically generated by python-semantic-release 27f7d8c
  • fix: Add namaa MrTydi reranking dataset (#1573) * Add dataset class and file requirements * pass tests * make ... 7b9b3c9
  • Update tasks table 1101db7
  • 1.24.1 Automatically generated by python-semantic-release 9c0b208
  • Merge branch 'embeddings-benchmark:main' into main a3a126f

sam-hey pushed 1 commit to main sam-hey/mteb
  • doc: warning: higher resource usage for MaxSim 67e5200

sam-hey created a comment on a pull request on embeddings-benchmark/mteb
@isaac-chung and @Samoed, Thanks for your support! 😊 I’ve added the handling of `prompt_name` and integrated jinja-colbertv2 (128). Let me know your thoughts!

sam-hey pushed 1 commit to main sam-hey/mteb
  • feat: integrate Jinja templates for ColBERTv2 and add model prompt handling 8b64f4c

sam-hey created a comment on a pull request on embeddings-benchmark/mteb
> @sam-hey Just pushed a few changes, and now the example script runs and gives `ndcg_at_10": 0.27872` on NFCorpus, which is close to the 0.338 reported in the [Colbert v2 paper](https://arxiv.org/...

sam-hey pushed 1 commit to main sam-hey/mteb
  • fix: max_sim add pad_sequence 216d3f8

sam-hey pushed 1 commit to main sam-hey/mteb
  • fix: pass is_query to pylate 1167517

sam-hey created a comment on a pull request on embeddings-benchmark/mteb
@isaac-chung Yes, I put it to draft as I found a major bug. The Encode function needs is_query to work. I am really sorry - I am working on a fix...
