Update README.md

README.md +82 -26

@@ -26,11 +26,13 @@ LLM2Vec4CXR is a bidirectional language model that converts the base decoder-onl
 ### Key Features
 - **Base Architecture**: LLM2CLIP-Llama-3.2-1B-Instruct
-- **Pooling Mode**: Latent Attention (modified from original)
 - **Bidirectional Processing**: Enabled for better context understanding
 - **Medical Domain**: Specialized for chest X-ray report analysis
 - **Max Length**: 512 tokens
 - **Precision**: bfloat16
 ## Training Details
@@ -62,20 +64,22 @@ pip install -e .
 ### Basic Usage
 ```python
 from llm2vec_wrapper import LLM2VecWrapper as LLM2Vec
-# Load the model
 model = LLM2Vec.from_pretrained(
     base_model_name_or_path='lukeingawesome/llm2vec4cxr',
-    enable_bidirectional=True,
-    pooling_mode="latent_attention",
     max_length=512,
     torch_dtype=torch.bfloat16,
 )
-# Simple text encoding (built-in method)
 report = "There is a small increase in the left-sided effusion. There continues to be volume loss at both bases."
-embedding = model.encode_text(report)
 # Multiple texts at once
 reports = [
@@ -86,38 +90,90 @@ reports = [
 embeddings = model.encode_text(reports)
 ```
-### Advanced Usage with Instructions
 ```python
 # For instruction-following tasks with separator
-separator = '!@#$%^&*()'
 instruction = 'Determine the change or the status of the pleural effusion.'
 report = 'There is a small increase in the left-sided effusion.'
-text_with_instruction = instruction + separator + report
-# Use the built-in method for instruction-based encoding
-embedding = model.encode_with_instruction([text_with_instruction])
 ```
-**Note**: The model now includes convenient `encode_text()` and `encode_with_instruction()` methods that handle the `embed_mask` automatically.
-### Manual Usage (if you need more control)
-If you need more control over the tokenization process, you can still use the manual approach:
 ```python
-# Manual tokenization with embed_mask
-def encode_text_manual(model, text):
-    inputs = model.tokenizer(text, return_tensors="pt", padding=True, truncation=True, max_length=512)
-    inputs["embed_mask"] = inputs["attention_mask"].clone()  # Required for proper functioning
-    with torch.no_grad():
-        embeddings = model(inputs)
-    return embeddings
-# For instruction-based tasks, use the built-in tokenize_with_separator method
-tokenized = model.tokenize_with_separator([text_with_instruction])
-embedding = model(tokenized)
 ```
 ## Evaluation

 ### Key Features
 - **Base Architecture**: LLM2CLIP-Llama-3.2-1B-Instruct
+- **Pooling Mode**: Latent Attention (fine-tuned weights automatically loaded)
 - **Bidirectional Processing**: Enabled for better context understanding
 - **Medical Domain**: Specialized for chest X-ray report analysis
 - **Max Length**: 512 tokens
 - **Precision**: bfloat16
+- **Automatic Loading**: Latent attention weights are automatically loaded from safetensors
+- **Simple API**: Built-in methods for similarity computation and instruction-based encoding
 ## Training Details
 ### Basic Usage
 ```python
+import torch
 from llm2vec_wrapper import LLM2VecWrapper as LLM2Vec
+# Load the model - latent attention weights are automatically loaded!
 model = LLM2Vec.from_pretrained(
     base_model_name_or_path='lukeingawesome/llm2vec4cxr',
+    pooling_mode="latent_attention",  # This automatically loads the trained weights
     max_length=512,
+    enable_bidirectional=True,
     torch_dtype=torch.bfloat16,
+    use_safetensors=True,
 )
+# Simple text encoding
 report = "There is a small increase in the left-sided effusion. There continues to be volume loss at both bases."
+embedding = model.encode_text([report])
 # Multiple texts at once
 reports = [
 embeddings = model.encode_text(reports)
 ```
+### Advanced Usage with Instructions and Similarity
 ```python
 # For instruction-following tasks with separator
 instruction = 'Determine the change or the status of the pleural effusion.'
 report = 'There is a small increase in the left-sided effusion.'
+query_text = instruction + '!@#$%^&*()' + report
+# Compare against multiple options
+candidates = [
+    'No pleural effusion',
+    'Pleural effusion present',
+    'Pleural effusion is worsening',
+    'Pleural effusion is improving'
+]
+# Get similarity scores using the built-in method
+similarities = model.compute_similarities(query_text, candidates)
+print(f"Similarities: {similarities}")
+# For custom separator-based encoding
+embeddings = model.encode_with_separator([query_text], separator='!@#$%^&*()')
+```
+**Note**: The model now includes convenience methods such as `compute_similarities()` and `encode_with_separator()` that handle the separator-based tokenization for you.
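
The separator convention used above can be pictured with a small sketch. It assumes the separator only marks where the instruction ends and the report begins. `split_instruction` is a hypothetical helper written for illustration, not a method of the wrapper, and the wrapper's actual tokenization may differ.

```python
SEPARATOR = '!@#$%^&*()'  # separator string used in the README examples


def split_instruction(text, separator=SEPARATOR):
    """Split 'instruction + separator + report' into its two parts.

    Hypothetical helper for illustration only; not part of the wrapper.
    """
    instruction, found, report = text.partition(separator)
    if not found:
        # No separator present: treat the whole string as the report.
        return '', text
    return instruction, report


instruction = 'Determine the change or the status of the pleural effusion.'
report = 'There is a small increase in the left-sided effusion.'
query_text = instruction + SEPARATOR + report

parts = split_instruction(query_text)
print(parts)
```

Because `str.partition` splits only at the first occurrence, a report that happens to contain the separator again would keep the later occurrences in the report part.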
+### Quick Start Example
+Here's a complete example showing the model's capabilities:
+```python
+import torch
+from llm2vec_wrapper import LLM2VecWrapper as LLM2Vec
+# Load model
+model = LLM2Vec.from_pretrained(
+    'lukeingawesome/llm2vec4cxr',
+    pooling_mode="latent_attention",
+    torch_dtype=torch.bfloat16,
+    use_safetensors=True,
+)
+# Medical text analysis
+instruction = 'Determine the change or the status of the pleural effusion.'
+report = 'There is a small increase in the left-sided effusion.'
+query = instruction + '!@#$%^&*()' + report
+# Compare with different diagnoses
+options = [
+    'No pleural effusion',
+    'Pleural effusion is worsening',
+    'Pleural effusion is stable',
+    'Pleural effusion is improving'
+]
+# Get similarity scores
+scores = model.compute_similarities(query, options)
+best_match = options[torch.argmax(scores)]
+print(f"Best match: {best_match} (score: {torch.max(scores):.4f})")
 ```
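
When the full ranking matters rather than only the best match, the options can be sorted by score. The sketch below is plain Python and does not call the model. `rank_options` is a hypothetical helper, and the score values are made-up placeholders, not model output.

```python
def rank_options(options, scores):
    """Return (option, score) pairs sorted from highest to lowest score.

    Hypothetical helper for illustration; not part of the wrapper.
    """
    if len(options) != len(scores):
        raise ValueError("options and scores must have the same length")
    return sorted(zip(options, scores), key=lambda pair: pair[1], reverse=True)


options = [
    'No pleural effusion',
    'Pleural effusion is worsening',
    'Pleural effusion is stable',
    'Pleural effusion is improving',
]
# Placeholder numbers chosen for the example, NOT real similarity scores.
placeholder_scores = [0.10, 0.90, 0.40, 0.20]

ranking = rank_options(options, placeholder_scores)
for option, score in ranking:
    print(f"{score:.2f}  {option}")
```

If the scores come back as a tensor, converting them first with `scores.tolist()` gives the plain list this helper expects.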
+## API Reference
+The model provides several convenient methods:
+### Core Methods
+- **`encode_text(texts)`**: Simple text encoding with automatic embed_mask handling
+- **`encode_with_separator(texts, separator='!@#$%^&*()')`**: Encoding with instruction/content separation
+- **`compute_similarities(query_text, candidate_texts)`**: One-line similarity computation
+- **`from_pretrained(..., pooling_mode="latent_attention")`**: Automatic latent attention weight loading
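
The list above describes `compute_similarities` only as a one-line similarity computation and does not state which measure it uses. As a rough mental model, the sketch below assumes cosine similarity between one query embedding and several candidate embeddings. The function names and the toy vectors are illustrative assumptions, not the wrapper's implementation.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def score_candidates(query_vec, candidate_vecs):
    """Score every candidate vector against the query vector."""
    return [cosine_similarity(query_vec, c) for c in candidate_vecs]


# Toy 3-dimensional vectors standing in for real embeddings.
query_vec = [1.0, 0.0, 1.0]
candidate_vecs = [
    [1.0, 0.0, 1.0],  # same direction as the query
    [0.0, 1.0, 0.0],  # orthogonal to the query
    [1.0, 0.0, 0.0],  # partially aligned with the query
]

scores = score_candidates(query_vec, candidate_vecs)
print(scores)
```

Cosine similarity ignores vector length, so only the direction of each embedding affects the ranking.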
+### Migration from Manual Usage
+If you were previously using manual tokenization, you can now simply use:
 ```python
+# Old way (still works)
+tokenized = model.tokenizer(text, return_tensors="pt", ...)
+tokenized["embed_mask"] = tokenized["attention_mask"].clone()
+embeddings = model(tokenized)
+# New way (recommended)
+embeddings = model.encode_text([text])
 ```
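
The manual path above copies the attention mask into `embed_mask`. The sketch below shows what that copy amounts to, using plain lists in place of tensors. The token IDs, the right-side padding, and the `pad_batch` helper are assumptions made for illustration, so the real tokenizer output may look different.

```python
def pad_batch(token_id_lists, pad_id=0):
    """Right-pad token ID lists and build the matching attention mask.

    Hypothetical stand-in for a tokenizer call; not the wrapper's code.
    """
    max_len = max(len(ids) for ids in token_id_lists)
    input_ids, attention_mask = [], []
    for ids in token_id_lists:
        padding = max_len - len(ids)
        input_ids.append(ids + [pad_id] * padding)
        attention_mask.append([1] * len(ids) + [0] * padding)
    return {"input_ids": input_ids, "attention_mask": attention_mask}


# Made-up token IDs for two texts of different length.
batch = pad_batch([[101, 7, 8, 9], [101, 5]])

# Equivalent of `attention_mask.clone()`: an independent copy, so later
# changes to one mask do not affect the other.
batch["embed_mask"] = [row[:] for row in batch["attention_mask"]]

print(batch["embed_mask"])
```

For plain text without an instruction, the two masks are identical in this sketch, matching the one-line copy in the manual path.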
 ## Evaluation