Update README.md
README.md CHANGED

@@ -79,34 +79,37 @@ The provided OpenVINO™ IR model is compatible with:
Before:

1. Install packages required for using [Optimum Intel](https://huggingface.co/docs/optimum/intel/index) integration with the OpenVINO backend:

```
- pip install
```

2. Run model inference:

```
- embedding_model_name = 'OpenVINO/bge-base-en-v1.5-int8-ov'
- embedding_model_kwargs = {"device": "CPU", "compile": False}
- encode_kwargs = {
-     "mean_pooling": False,
-     "normalize_embeddings": True,
-     "batch_size": 4,
- }

- )
```
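The removed example only set up LangChain-style settings (`embedding_model_kwargs`, `encode_kwargs`) and ended with a closing `)`. A minimal sketch of how such a configuration is typically wired up, assuming LangChain's `OpenVINOBgeEmbeddings` from `langchain_community` (an assumption based on the surviving variable names, not confirmed by this diff):

```python
# Assumption: the removed snippet followed the LangChain OpenVINO embeddings pattern.
from langchain_community.embeddings import OpenVINOBgeEmbeddings

embedding_model_name = 'OpenVINO/bge-base-en-v1.5-int8-ov'
embedding_model_kwargs = {"device": "CPU", "compile": False}
encode_kwargs = {
    "mean_pooling": False,
    "normalize_embeddings": True,
    "batch_size": 4,
}

# Build the embedding wrapper from the settings above.
embedding = OpenVINOBgeEmbeddings(
    model_name_or_path=embedding_model_name,
    model_kwargs=embedding_model_kwargs,
    encode_kwargs=encode_kwargs,
)

# Embed a couple of documents and inspect the first values of one vector.
vectors = embedding.embed_documents(["Sample Data-1", "Sample Data-2"])
print(vectors[0][:4])
```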

For more examples and possible optimizations, refer to the [Inference with Optimum Intel](https://docs.openvino.ai/2025/openvino-workflow-generative/inference-with-optimum-intel.html) guide.
After:

1. Install packages required for using [Optimum Intel](https://huggingface.co/docs/optimum/intel/index) integration with the OpenVINO backend:

```
+ pip install optimum[openvino]
```

2. Run model inference:

```
+ import torch
+ from transformers import AutoTokenizer
+ from optimum.intel.openvino import OVModelForFeatureExtraction

+ # Sentences we want sentence embeddings for
+ sentences = ["Sample Data-1", "Sample Data-2"]

+ # Load model from HuggingFace Hub
+ tokenizer = AutoTokenizer.from_pretrained('OpenVINO/bge-base-en-v1.5-int8-ov')
+ model = OVModelForFeatureExtraction.from_pretrained('OpenVINO/bge-base-en-v1.5-int8-ov')

+ # Tokenize sentences
+ encoded_input = tokenizer(sentences, padding=True, truncation=True, return_tensors='pt')

+ # Compute token embeddings
+ model_output = model(**encoded_input)

+ # Perform pooling. In this case, cls pooling.
+ sentence_embeddings = model_output[0][:, 0]

+ # normalize embeddings
+ sentence_embeddings = torch.nn.functional.normalize(sentence_embeddings, p=2, dim=1)
+ print("Sentence embeddings:", sentence_embeddings)
```
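As a quick follow-up to the new example (not part of the diff itself), the two normalized embeddings it produces can be compared with standard PyTorch ops; a minimal sketch, continuing from the `sentence_embeddings` tensor defined above:

```python
import torch

# `sentence_embeddings` has shape [2, hidden_size] and is already L2-normalized.
similarity = torch.nn.functional.cosine_similarity(
    sentence_embeddings[0], sentence_embeddings[1], dim=0
)
print("Cosine similarity between the two sentences:", similarity.item())

# For normalized vectors, the plain dot product gives the same value.
print("Dot product:", (sentence_embeddings[0] @ sentence_embeddings[1]).item())
```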

For more examples and possible optimizations, refer to the [Inference with Optimum Intel](https://docs.openvino.ai/2025/openvino-workflow-generative/inference-with-optimum-intel.html) guide.