# RAG Pipeline For LLMs

[Open in Hugging Face Spaces](https://huggingface.co/spaces/Mehardeep79/rag-pipeline-llm) · [Python](https://python.org)
## Project Overview

An intelligent **Retrieval-Augmented Generation (RAG)** pipeline that combines semantic search with question-answering capabilities. This system fetches Wikipedia articles, processes them into searchable chunks, and uses state-of-the-art AI models to provide accurate, context-aware answers.
## Key Features

- **Dynamic Knowledge Retrieval** from Wikipedia with error handling
- **Semantic Search** using sentence transformers (no keyword dependency)
- **Fast Vector Similarity** with FAISS indexing (sub-second search)
- **Intelligent Answer Generation** using pre-trained QA models
- **Confidence Scoring** for answer quality assessment
- **Customizable Parameters** (chunk size, retrieval count, overlap)
- **Smart Text Chunking** with overlapping segments for context preservation
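The overlapping-chunk idea from the feature list can be sketched in plain Python. This is a minimal stand-in, not the app's actual code: it splits on pre-tokenized items (the real pipeline uses the `all-mpnet-base-v2` tokenizer), and `chunk_text` is a hypothetical helper name.

```python
def chunk_text(tokens, chunk_size=256, overlap=20):
    """Split a token list into overlapping chunks so that context
    at each chunk boundary is preserved in both neighbors."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap  # stride between chunk starts
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # last window already covers the tail
    return chunks

# 600 placeholder tokens -> 3 chunks; each chunk shares its last
# 20 tokens with the start of the next one.
tokens = [f"tok{i}" for i in range(600)]
chunks = chunk_text(tokens, chunk_size=256, overlap=20)
```

Because consecutive windows share `overlap` tokens, a sentence that straddles a boundary still appears whole in at least one chunk.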
## Architecture

```
User Query → Embedding → FAISS Search → Retrieve Chunks → QA Model → Answer + Confidence
```
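The "FAISS Search" step above is an exact L2 nearest-neighbor lookup (what `faiss.IndexFlatL2.search` computes). The same logic can be sketched in plain NumPy; the random vectors here are stand-ins for the real 768-dimensional sentence embeddings.

```python
import numpy as np

def l2_search(index_vectors, query, k=3):
    """Exact nearest-neighbor search by squared L2 distance,
    mirroring what faiss.IndexFlatL2 returns (ascending order)."""
    diffs = index_vectors - query                 # broadcast over rows
    dists = np.einsum("ij,ij->i", diffs, diffs)   # squared L2 per row
    order = np.argsort(dists)[:k]
    return order, dists[order]

rng = np.random.default_rng(0)
embeddings = rng.normal(size=(1000, 768)).astype("float32")  # fake chunk embeddings
# A query very close to chunk 42 should retrieve chunk 42 first.
query = embeddings[42] + 0.01 * rng.normal(size=768).astype("float32")
ids, dists = l2_search(embeddings, query, k=3)
```

For the chunk counts this app handles, a brute-force flat index like this is already sub-second; approximate indexes only pay off at much larger scales.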
## AI Models Used

- **Text Chunking**: `sentence-transformers/all-mpnet-base-v2` tokenizer
- **Vector Embeddings**: `sentence-transformers/all-mpnet-base-v2` (768-dimensional)
- **Question Answering**: `deepset/roberta-base-squad2` (RoBERTa fine-tuned on SQuAD 2.0)
- **Vector Search**: FAISS `IndexFlatL2` for exact L2-distance similarity
## How to Use

1. **Process Article**: enter any Wikipedia topic and configure chunk settings
2. **Ask Questions**: switch to the Q&A tab and enter your questions
3. **View Results**: explore answers with confidence scores and similarity metrics
4. **Analyze**: check the retrieved context and visualization analytics
## Example Usage

```
Topic: "Artificial Intelligence"
Question: "What is machine learning?"
Answer: "Machine learning is a subset of artificial intelligence..."
Confidence: 89.7%
```
## Configuration Options

- **Chunk Size**: 128-512 tokens (default: 256)
- **Overlap**: 10-50 tokens (default: 20)
- **Retrieval Count**: 1-10 chunks (default: 3)
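With these settings, chunk starts are spaced `chunk_size - overlap` tokens apart, so an article of `n` tokens yields roughly `ceil((n - overlap) / (chunk_size - overlap))` chunks. A quick sanity check (hypothetical helper, not the app's code):

```python
import math

def num_chunks(n_tokens, chunk_size=256, overlap=20):
    """Number of overlapping windows needed to cover n_tokens when
    windows of chunk_size start every (chunk_size - overlap) tokens."""
    if n_tokens <= chunk_size:
        return 1
    step = chunk_size - overlap
    return math.ceil((n_tokens - overlap) / step)

print(num_chunks(600))  # a 600-token article under the defaults -> 3
```

Larger chunks mean fewer embeddings to search but coarser retrieval; more overlap improves boundary context at the cost of extra chunks.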
## Performance

- **Search Speed**: sub-second retrieval across 1000+ chunks
- **Answer Quality**: every answer carries a confidence score, making weak answers easy to spot
- **Memory Efficient**: bounded chunk sizes keep inputs within model token limits
## Links

- **Full Project**: [GitHub Repository](https://github.com/Mehardeep79/RAG_Pipeline_LLM)
- **Jupyter Notebook**: complete implementation with explanations
- **Streamlit App**: alternative web interface
## Credits

Built with ❤️ using:

- **Hugging Face** for transformers and model hosting
- **FAISS** for efficient vector search
- **Gradio** for the interactive interface
- **Wikipedia API** for knowledge content

---

**⭐ If you find this useful, please give it a star on GitHub!**