Embedding Models Overview
A comprehensive guide to the diverse models we offer.
We offer a variety of models to cover a wide range of use cases, and we are continuously expanding our portfolio. As new models become available, they will be added promptly to our API Reference.
Models
Our model selection is based on MTEB leaderboard rankings and overall popularity. For most use cases, we recommend the e5-large-v2 model, which ranks among the top performers on the MTEB leaderboard. If you need a smaller model, we suggest all-MiniLM-L6-v2. For multilingual use cases, multilingual-e5-base is the most suitable choice.
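These recommendations can be captured in a small helper. The model names are the ones listed in the table below; the selection logic itself is only an illustration, not part of the API:

```python
# Illustrative mapping of the use cases described above to the
# recommended model names (names taken from the table below).
RECOMMENDED_MODELS = {
    "general": "e5-large-v2",           # strong all-round performance
    "lightweight": "all-MiniLM-L6-v2",  # smaller and faster
    "multilingual": "multilingual-e5-base",
}


def recommended_model(use_case: str) -> str:
    """Return the recommended model name for a use case,
    defaulting to the general-purpose model."""
    return RECOMMENDED_MODELS.get(use_case, "e5-large-v2")


print(recommended_model("multilingual"))  # → multilingual-e5-base
```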
Name | Recommended Sequence Length | Dimensions |
---|---|---|
stella-large-zh-v2 (Chinese) | 1024 | 1024 |
bge-large-en-v1.5 | 512 | 1024 |
gte-large | 512 | 1024 |
e5-large-v2 | 512 | 1024 |
instructor-large | 512 | 768 |
multilingual-e5-large | 512 | 1024 |
multilingual-e5-base | 512 | 768 |
all-MiniLM-L6-v2 | 256 | 384 |
paraphrase-multilingual-mpnet-base-v2 | 128 | 768 |
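The Dimensions column gives the length of the vector each model returns. Downstream, embeddings are typically compared with cosine similarity; here is a minimal, framework-free sketch (the vectors are made-up stand-ins for real model output):

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length embedding vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimensionality")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)


# Stand-in 384-dimensional vectors (all-MiniLM-L6-v2's output size).
v1 = [0.1] * 384
v2 = [0.1] * 384
print(round(cosine_similarity(v1, v2), 4))  # identical vectors → 1.0
```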
The maximum sequence length may exceed the recommended sequence length; nevertheless, we recommend staying within the recommended sequence length for each model for best performance.
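Sequence length is measured in model tokens, so an exact count requires the model's own tokenizer. As a rough pre-flight check, a whitespace word count can flag inputs that are clearly too long, since with subword tokenizers each word usually maps to at least one token. A hedged sketch:

```python
def likely_exceeds_limit(text: str, recommended_length: int) -> bool:
    """Rough check: a whitespace word count is a lower bound on the
    subword token count, so exceeding the limit in words means the
    tokenized input would exceed it too. For exact counts, use the
    model's own tokenizer (e.g. via the Hugging Face transformers
    library)."""
    return len(text.split()) > recommended_length


print(likely_exceeds_limit("a short query", 512))  # → False
```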
Performance Benchmarks
The performance benchmarks are sourced from the MTEB leaderboard. The Classification Average is derived from 12 datasets and the Retrieval Average from 15 datasets; the Overall Average is the mean score across 56 datasets spanning a range of tasks. Note that stella-large-zh-v2 is evaluated on the Chinese MTEB (C-MTEB) benchmark, so its scores are not directly comparable to those of the English and multilingual models.
Name | Classification Average | Retrieval Average | Overall Average |
---|---|---|---|
stella-large-zh-v2 (Chinese) | 69.05 | 70.14 | 65.13 |
bge-large-en-v1.5 | 75.97 | 54.29 | 64.23 |
gte-large | 73.33 | 52.22 | 63.13 |
e5-large-v2 | 75.24 | 50.56 | 62.25 |
instructor-large | 73.86 | 47.57 | 61.59 |
multilingual-e5-large | 74.81 | 51.43 | 61.5 |
text-embedding-ada-002 (reference) | 70.93 | 49.25 | 60.99 |
multilingual-e5-base | 73.02 | 48.88 | 59.45 |
all-MiniLM-L6-v2 | 63.21 | 42.69 | 56.53 |
paraphrase-multilingual-mpnet-base-v2 | 67.9 | 35.34 | 54.71 |