Query Auto-Completion Model

A CNN+Transformer model for ranking query auto-completion suggestions.

Model Description

This model scores how well a candidate completion matches a given query prefix. It uses:

  • Prefix Encoder: Multi-scale CNN + Transformer for extracting search intent from partial queries
  • Candidate Encoder: Transformer for encoding candidate completions
  • Match Predictor: MLP that scores the compatibility between prefix and candidate

The model uses pretrained ByT5 byte-level embeddings, so it operates on raw UTF-8 bytes rather than on a word or subword vocabulary.
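
To make the byte-level input concrete, the sketch below emulates in plain Python how ByT5-style tokenization maps text to IDs: each UTF-8 byte becomes one token (the byte value plus an offset of 3, since IDs 0, 1, and 2 are reserved for padding, end-of-sequence, and unknown), followed by an end-of-sequence token. The helper name `byte_ids` is hypothetical, and this is an approximation for illustration rather than the tokenizer's actual code; use `AutoTokenizer` for real inputs.

```python
PAD_ID, EOS_ID, OFFSET = 0, 1, 3  # ByT5 reserves IDs 0-2 for pad, eos, unk

def byte_ids(text: str, max_length: int = 20) -> list[int]:
    """Hypothetical helper: approximate ByT5 IDs with truncation and padding."""
    ids = [b + OFFSET for b in text.encode("utf-8")]
    ids = ids[: max_length - 1] + [EOS_ID]           # truncate, keeping room for EOS
    return ids + [PAD_ID] * (max_length - len(ids))  # pad on the right

print(byte_ids("how to")[:8])                          # six byte tokens, EOS, then padding
print(len(byte_ids("best restaurant in new york")))    # always max_length
```

Under this emulation, a `max_length` of 20 leaves room for 19 bytes of text, so longer strings are cut off and non-ASCII characters (which take several bytes each) use up the budget faster.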

Model Architecture

  • Embedding Dimension: 256
  • CNN Filters: 64 (filter sizes: [3, 4, 5])
  • Transformer Heads: 4
  • Transformer Layers: 2
  • Base Embeddings: google/byt5-small
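
The model's actual layer code is not shown here, so the following is only a minimal sketch of what a multi-scale CNN with these hyperparameters could look like in PyTorch. The class name `MultiScaleCNN`, the ReLU activation, and the max-over-time pooling are assumptions for illustration; how the real model combines the CNN output with the Transformer is not covered.

```python
import torch
import torch.nn as nn

class MultiScaleCNN(nn.Module):
    """Illustrative only: parallel 1-D convolutions with different kernel widths."""

    def __init__(self, embed_dim: int = 256, num_filters: int = 64,
                 filter_sizes: tuple = (3, 4, 5)):
        super().__init__()
        self.convs = nn.ModuleList(
            [nn.Conv1d(embed_dim, num_filters, kernel_size=k) for k in filter_sizes]
        )

    def forward(self, embedded: torch.Tensor) -> torch.Tensor:
        # embedded: (batch, seq_len, embed_dim); Conv1d expects channels first
        x = embedded.transpose(1, 2)
        # One max-pooled feature vector per filter size (assumed pooling choice)
        pooled = [torch.relu(conv(x)).max(dim=2).values for conv in self.convs]
        # Concatenate to (batch, num_filters * len(filter_sizes))
        return torch.cat(pooled, dim=1)

features = MultiScaleCNN()(torch.randn(2, 20, 256))
print(features.shape)
```

Each kernel width captures byte n-grams of a different length, and concatenating the pooled outputs gives one fixed-size vector per input regardless of sequence length.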

Usage

Installation

pip install transformers torch

Loading the Model

from transformers import AutoTokenizer, AutoConfig, AutoModel

# Load model (trust_remote_code required for custom architecture)
config = AutoConfig.from_pretrained("lv12/sin-qac-model", trust_remote_code=True)
model = AutoModel.from_pretrained("lv12/sin-qac-model", trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained("google/byt5-small")

Scoring Candidates

import torch

def score_completion(model, tokenizer, prefix: str, candidate: str, max_length: int = 20):
    """Score how well a candidate matches a prefix."""
    model.eval()

    prefix_encoding = tokenizer(
        prefix,
        max_length=max_length,
        padding="max_length",
        truncation=True,
        return_tensors="pt"
    )
    candidate_encoding = tokenizer(
        candidate,
        max_length=max_length,
        padding="max_length",
        truncation=True,
        return_tensors="pt"
    )

    with torch.no_grad():
        score = model(
            prefix_ids=prefix_encoding["input_ids"],
            candidate_ids=candidate_encoding["input_ids"]
        )

    return score.squeeze().item()

# Example usage
prefix = "how to"
candidates = ["how to cook pasta", "how to learn python", "weather today"]

scores = []
for candidate in candidates:
    score = score_completion(model, tokenizer, prefix, candidate)
    scores.append((candidate, score))

# Sort by score (higher is better match)
scores.sort(key=lambda x: x[1], reverse=True)
for candidate, score in scores:
    print(f"{score:.4f} - {candidate}")

Batch Scoring

def batch_score(model, tokenizer, prefix: str, candidates: list[str], max_length: int = 20):
    """Score several candidates against one prefix, tokenizing the prefix only once."""
    model.eval()

    prefix_encoding = tokenizer(
        prefix,
        max_length=max_length,
        padding="max_length",
        truncation=True,
        return_tensors="pt"
    )
    prefix_ids = prefix_encoding["input_ids"]

    candidate_encodings = tokenizer(
        candidates,
        max_length=max_length,
        padding="max_length",
        truncation=True,
        return_tensors="pt"
    )

    scores = []
    with torch.no_grad():
        for i in range(len(candidates)):
            score = model(
                prefix_ids=prefix_ids,
                candidate_ids=candidate_encodings["input_ids"][i:i+1]
            )
            scores.append(score.squeeze().item())

    return list(zip(candidates, scores))

# Example
results = batch_score(model, tokenizer, "best resta", [
    "best restaurants near me",
    "best restaurant in new york",
    "best resume templates",
    "weather forecast"
])
for candidate, score in sorted(results, key=lambda x: -x[1]):
    print(f"{score:.4f} - {candidate}")
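
When only the best few suggestions are shown to the user, a full sort is not needed. A minimal sketch, assuming `results` is a list of `(candidate, score)` pairs like the one returned by `batch_score` above; the helper name `top_k` and the example scores are made up for illustration and are not real model output.

```python
import heapq

def top_k(results: list[tuple[str, float]], k: int = 3) -> list[tuple[str, float]]:
    """Hypothetical helper: return the k highest-scoring (candidate, score) pairs."""
    return heapq.nlargest(k, results, key=lambda pair: pair[1])

# Made-up scores for illustration, not real model output
example = [
    ("best restaurants near me", 0.91),
    ("weather forecast", 0.08),
    ("best resume templates", 0.35),
    ("best restaurant in new york", 0.87),
]
for candidate, score in top_k(example, k=2):
    print(f"{score:.4f} - {candidate}")
```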

Training Details

Evaluation

The model outputs scores between 0 and 1, which can be read roughly as follows:

  • > 0.7: Strong match (candidate is a likely completion)
  • 0.4 - 0.7: Moderate match
  • < 0.4: Weak match (candidate unlikely to be what user wants)
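
The thresholds above can be applied in code with a small helper. This is a sketch only: the function name `match_label` is hypothetical, and the boundary values 0.4 and 0.7 are assumed to fall in the moderate band.

```python
def match_label(score: float) -> str:
    """Hypothetical helper: map a model score in [0, 1] to a match band."""
    if score > 0.7:
        return "strong"
    if score >= 0.4:
        return "moderate"   # covers 0.4 through 0.7 inclusive (assumed)
    return "weak"

for s in (0.85, 0.55, 0.12):
    print(f"{s:.2f} -> {match_label(s)}")
```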

Limitations

  • Optimized for English queries
  • Best performance on short prefixes (< 20 characters)
  • Trained on search autocomplete data; may not generalize to other domains

Citation

If you use this model, please cite:

@misc{query-completion-model,
  title={Query Auto-Completion Model},
  year={2024},
  publisher={HuggingFace},
  url={https://huggingface.co/lv12/sin-qac-model}
}