---
title: GraphLLM - PDF Knowledge Graph RAG
emoji: πŸ•ΈοΈ
colorFrom: blue
colorTo: purple
sdk: docker
pinned: false
license: mit
app_port: 7860
---

# πŸ•ΈοΈ GraphLLM - PDF Knowledge Graph + RAG System

Transform PDFs into interactive knowledge graphs with AI-powered Q&A.

## πŸš€ Features

- **πŸ“„ PDF Processing:** Extract text, tables, and images from PDFs
- **πŸ•ΈοΈ Knowledge Graph Generation:** Build semantic graphs using Gemini AI
- **πŸ” Vector Search:** FAISS-powered semantic search with sentence transformers
- **πŸ’¬ RAG Chat:** Ask questions and get answers with source citations
- **🎨 Interactive Visualization:** Explore knowledge graphs in your browser

## πŸ› οΈ Technology Stack

- **LLM:** Google Gemini (gemini-2.5-flash)
- **Embeddings:** sentence-transformers/all-MiniLM-L6-v2
- **Vector Store:** FAISS with HNSW index (see the sketch below)
- **Graph:** NetworkX (in-memory)
- **Backend:** FastAPI + Uvicorn
- **Frontend:** Vanilla JS with D3.js/Cytoscape
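
The Vector Search feature ties the embedding model and FAISS together: chunk embeddings from sentence-transformers are stored in an HNSW index and queried at question time. Below is a minimal sketch of that wiring, assuming chunk texts have already been extracted; the chunk contents and index parameters are illustrative, not the repo's actual values.

```python
from sentence_transformers import SentenceTransformer
import faiss

model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

chunks = [
    "FAISS is a library for efficient similarity search.",
    "HNSW builds a navigable small-world graph over the stored vectors.",
]

# Encode chunks to normalized float32 vectors (384 dims for all-MiniLM-L6-v2).
embeddings = model.encode(chunks, normalize_embeddings=True).astype("float32")

# HNSW index with 32 neighbors per node; inner product on normalized vectors
# is equivalent to cosine similarity.
index = faiss.IndexHNSWFlat(embeddings.shape[1], 32, faiss.METRIC_INNER_PRODUCT)
index.add(embeddings)

# Embed the question and retrieve the top-2 most similar chunks.
query = model.encode(["What does HNSW do?"], normalize_embeddings=True).astype("float32")
scores, ids = index.search(query, 2)
print([chunks[i] for i in ids[0]])
```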

## πŸ“‹ Setup

### Required: Gemini API Key

This app requires a Google Gemini API key:

1. Get your API key from [Google AI Studio](https://makersuite.google.com/app/apikey)
2. Add it as a **Secret** in your Hugging Face Space settings:
   - Name: `GEMINI_API_KEY`
   - Value: Your API key

### Configuration (Optional)

You can set these environment variables in Space Settings:

```bash
# LLM Settings
GEMINI_MODEL=gemini-2.5-flash     # Gemini model
LLM_TEMPERATURE=0.0               # Temperature for extraction

# Embedding Settings
EMBEDDING_MODEL=sentence-transformers/all-MiniLM-L6-v2

# Environment
ENVIRONMENT=production
LOG_LEVEL=INFO
```
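
A minimal sketch of how these settings and the `GEMINI_API_KEY` secret might be read at startup, assuming the `google-generativeai` client library (the exact client and config plumbing in this repo may differ):

```python
import os

import google.generativeai as genai

# GEMINI_API_KEY comes from the Space secret; the other settings fall back
# to the defaults documented above.
genai.configure(api_key=os.environ["GEMINI_API_KEY"])

model = genai.GenerativeModel(os.getenv("GEMINI_MODEL", "gemini-2.5-flash"))
response = model.generate_content(
    "List the key technical concepts on this page: ...",
    generation_config={"temperature": float(os.getenv("LLM_TEMPERATURE", "0.0"))},
)
print(response.text)
```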

## 🎯 Usage

1. **Upload PDF:** Click "Upload PDF" and select your document
2. **Wait for Processing:** The system will:
   - Extract text chunks
   - Generate embeddings
   - Build knowledge graph with Gemini
3. **Explore Graph:** Click nodes to see details and related concepts
4. **Ask Questions:** Use the chat interface for Q&A with citations
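
The same workflow can also be driven over HTTP instead of the UI. The sketch below uses hypothetical route names (`/upload`, `/chat`) and response fields; check the FastAPI docs at `/docs` on your Space for the actual endpoints.

```python
import requests

BASE = "https://your-space-url.hf.space"

# Step 1: upload a PDF for processing (route name is an assumption).
with open("paper.pdf", "rb") as f:
    upload = requests.post(f"{BASE}/upload", files={"file": f})
upload.raise_for_status()
doc_id = upload.json().get("document_id")  # hypothetical response field

# Step 4: ask a question and print the answer with its citations.
answer = requests.post(
    f"{BASE}/chat",
    json={"document_id": doc_id, "question": "What is the main contribution?"},
)
print(answer.json())
```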

## πŸ“Š Graph Generation

- **Per-Page Extraction:** Max 2 concepts per page (quality over quantity)
- **Parallel Processing:** All pages are processed concurrently via the Gemini API
- **Strict Filtering:** Only technical/domain-specific concepts are kept
- **Co-occurrence Relationships:** Concepts that appear on the same page are linked
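
A minimal sketch of the co-occurrence step with NetworkX, assuming concepts have already been extracted per page (the data and edge attribute names are illustrative):

```python
from itertools import combinations

import networkx as nx

# Output of the per-page extraction step (illustrative).
concepts_per_page = {
    1: ["knowledge graph", "RAG"],
    2: ["RAG", "vector search"],
}

graph = nx.Graph()
for page, concepts in concepts_per_page.items():
    graph.add_nodes_from(concepts)
    # Link every pair of concepts that appears on the same page.
    for a, b in combinations(concepts, 2):
        graph.add_edge(a, b, relation="co-occurrence", page=page)

print(graph.edges(data=True))
```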

## 🎨 Frontend

The frontend is a single-page application located in `/frontend/`:
- `index.html` - Main UI
- `app.js` - Graph visualization & API calls
- `styles.css` - Styling

Access it at: `http://your-space-url.hf.space/frontend/`


## πŸ“¦ Docker

This Space uses Docker for deployment:
- Base: `python:3.12-slim`
- Port: 7860 (HF Spaces default)
- Health check enabled
- Persistent data directory

## 🀝 Credits

- **LLM:** Google Gemini
- **Embeddings:** Hugging Face sentence-transformers


---