yashgori20 committed on
Commit 780dd63 · 1 Parent(s): 6db5c5e
SETUP.md DELETED
@@ -1,108 +0,0 @@
- # SEO Report Generator - Setup Instructions
-
- ## Quick Start
-
- 1. **Install Dependencies**
-    ```bash
-    python -m pip install -r requirements.txt
-    ```
-
- 2. **Run the Application**
-    ```bash
-    python -m streamlit run app.py
-    ```
-    Or use the helper script:
-    ```bash
-    python run.py
-    ```
-
- 3. **Access the App**
-    - Open your browser to: http://localhost:8501
-    - The app will automatically open if you use `python run.py`
-
- 4. **Test the System** (Optional)
-    ```bash
-    python test_app.py
-    ```
-
- ## Requirements
-
- - Python 3.8+
- - Internet connection for API calls and web crawling
- - Modern web browser
-
- ## Key Features Ready to Use
-
- ### ✅ Core Features Implemented
- - **Technical SEO Analysis** - PageSpeed Insights integration
- - **Content Audit** - Automated web crawling and analysis
- - **Professional Reports** - HTML with interactive charts
- - **PDF Export** - Professional PDF generation
- - **Competitor Benchmarking** - Side-by-side comparison
- - **Executive Summary** - Health scoring and quick wins
-
- ### 📊 Report Sections
- 1. Executive Summary with overall health score
- 2. Technical SEO performance metrics
- 3. Content audit results
- 4. Competitor comparison (if provided)
- 5. Placeholder sections for future modules
- 6. Prioritized recommendations
-
- ## Usage Tips
-
- 1. **URLs**: Always include `https://` for best results
- 2. **Competitor Analysis**: Add 1-3 competitor URLs for benchmarking
- 3. **Report Generation**: Takes 1-3 minutes depending on site size
- 4. **PDF Export**: May take additional time for complex reports
-
- ## API Limits
-
- - **PageSpeed Insights**: 25,000 requests/day (no API key needed)
- - For higher limits, get a free Google Cloud API key
-
- ## Troubleshooting
-
- ### Common Issues:
- 1. **Import Errors**: Run `python -m pip install -r requirements.txt`
- 2. **Command Not Found**: Use `python -m streamlit run app.py` instead of `streamlit run app.py`
- 3. **PDF Generation Issues**: Use HTML export and browser print-to-PDF as fallback
- 4. **Site Access Issues**: Some sites may block crawlers
- 5. **Slow Performance**: Large sites may take longer to analyze
-
- ### Performance Tips:
- - Use `quick_scan=True` for competitor analysis
- - Limit crawl to ~200 pages for faster results
- - Some sites may require custom headers
-
- ## File Structure
- ```
- ├── app.py                 # Main Streamlit application
- ├── run.py                 # Quick start script
- ├── test_app.py            # Test suite
- ├── requirements.txt       # Dependencies
- ├── modules/
- │   ├── technical_seo.py   # PageSpeed integration
- │   └── content_audit.py   # Content crawling
- ├── report_generator.py    # HTML report generation
- └── pdf_generator.py       # PDF export
- ```
-
- ## Next Steps
-
- The MVP is complete and ready for demo! Future enhancements can include:
- - Google Search Console integration for keyword data
- - Backlink analysis via Ahrefs/SEMrush APIs
- - GA4 conversion tracking
- - Advanced competitor analysis
- - Automated scheduling and monitoring
-
- ## Success Criteria ✅
-
- ✅ Functional: User can input URL and receive full HTML + PDF report
- ✅ Professional output: Agency-quality reports with charts and summaries
- ✅ Modular design: Independent technical and content modules
- ✅ Extensible: Template-based report generation for easy expansion
- ✅ Evaluation metrics: Works with multiple domains, reliable API integration
-
- The system is ready for demonstration and production use!
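Review note: the removed SETUP.md mentions the PageSpeed Insights integration and an optional API key, but the diff does not show how `modules/technical_seo.py` builds its requests. The sketch below is only an illustration of how such a request URL could be assembled for the public v5 `runPagespeed` endpoint; the helper name `build_psi_url` and its parameters are assumptions, not code from this repository.

```python
from typing import Optional
from urllib.parse import urlencode

# Public PageSpeed Insights v5 endpoint.
PSI_ENDPOINT = "https://www.googleapis.com/pagespeedonline/v5/runPagespeed"


def build_psi_url(page_url: str, strategy: str = "mobile",
                  api_key: Optional[str] = None) -> str:
    """Return a request URL for one page and one strategy (hypothetical helper)."""
    if strategy not in ("mobile", "desktop"):
        raise ValueError(f"unknown strategy: {strategy!r}")
    params = {"url": page_url, "strategy": strategy}
    if api_key:
        # The key is optional; it is only appended when one is supplied.
        params["key"] = api_key
    return f"{PSI_ENDPOINT}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_psi_url("https://example.com"))
    print(build_psi_url("https://example.com", strategy="desktop"))
```

Keeping URL construction separate from the HTTP call makes it possible to check the request offline before spending any quota.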
 
START.md DELETED
@@ -1,46 +0,0 @@
- # 🚀 Quick Start Guide
-
- ## Your SEO Report Generator is Ready!
-
- The application is currently running at: **http://localhost:8501**
-
- ### How to Use:
-
- 1. **📱 Open your browser** and go to: http://localhost:8501
- 2. **🌐 Enter a website URL** to analyze (e.g., https://example.com)
- 3. **⚔️ Add competitor URLs** (optional) for benchmarking
- 4. **🎯 Click "Generate SEO Report"** and wait 1-3 minutes
- 5. **📊 View the interactive report** with charts and analysis
- 6. **💾 Download HTML report** (PDF instructions included)
-
- ### What You'll Get:
-
- ✅ **Executive Summary** - Overall SEO health score
- ✅ **Technical Analysis** - PageSpeed performance metrics
- ✅ **Content Audit** - Metadata and content quality analysis
- ✅ **Competitor Comparison** - Performance benchmarking
- ✅ **Recommendations** - Prioritized action items
-
- ### Example URLs to Try:
-
- - https://example.com (simple test site)
- - https://python.org (tech documentation)
- - https://github.com (development platform)
- - Your own website!
-
- ### Features Available:
-
- - 🔍 **Technical SEO** via Google PageSpeed Insights
- - 📝 **Content Analysis** via automated web crawling
- - 📊 **Interactive Charts** with Plotly visualizations
- - 🏆 **Competitor Benchmarking** (up to 3 competitors)
- - 📄 **Professional HTML Reports** with executive summary
- - 💡 **PDF Creation** via browser print functionality
-
- ### Need Help?
-
- - **Stop the app**: Press `Ctrl+C` in the terminal
- - **Restart**: Run `python -m streamlit run app.py` again
- - **Issues**: Check SETUP.md for troubleshooting
-
- **🎉 Ready to analyze some websites? Open http://localhost:8501 and start generating reports!**
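Review note: the guide asks users to enter a site URL and up to three competitor URLs, and SETUP.md recommends always including `https://`. The app's actual input handling is not part of this diff, so the following is a minimal sketch of one way that normalization could work. The names `normalize_url`, `parse_competitors`, and `MAX_COMPETITORS` are hypothetical.

```python
from typing import List
from urllib.parse import urlsplit

MAX_COMPETITORS = 3  # the guide mentions "up to 3 competitors"


def normalize_url(raw: str) -> str:
    """Trim the input, default to https://, and reject non-web URLs."""
    candidate = raw.strip()
    if not candidate:
        raise ValueError("empty URL")
    if "://" not in candidate:
        candidate = "https://" + candidate
    parts = urlsplit(candidate)
    if parts.scheme not in ("http", "https") or "." not in parts.netloc:
        raise ValueError(f"not a valid web URL: {raw!r}")
    return candidate.rstrip("/")


def parse_competitors(text: str) -> List[str]:
    """Read one URL per line, skip blank lines, keep at most MAX_COMPETITORS."""
    urls = [normalize_url(line) for line in text.splitlines() if line.strip()]
    return urls[:MAX_COMPETITORS]


if __name__ == "__main__":
    print(normalize_url(" example.com/ "))
    print(parse_competitors("a.com\n\nb.com\nc.com\nd.com"))
```

Silently truncating the competitor list is a design choice made for brevity here; a real UI might prefer to warn the user instead.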
 
__pycache__/app.cpython-313.pyc DELETED
Binary file (7.56 kB)
 
__pycache__/pdf_generator.cpython-313.pyc DELETED
Binary file (12 kB)
 
__pycache__/report_generator.cpython-313.pyc DELETED
Binary file (43.6 kB)
 
__pycache__/simple_pdf_generator.cpython-313.pyc DELETED
Binary file (4.57 kB)
 
claude.md DELETED
@@ -1,115 +0,0 @@
-
- # PRD: One-Click SEO Report Generator (v1 MVP)
-
- ## Objective
-
- Deliver a working demo system that generates a structured SEO report from a website URL.
- The report should highlight **content audit** and **technical SEO performance**, and demonstrate the framework for future modules (keywords, backlinks, competitors).
-
- ---
-
- ## Scope (v1)
-
- **In scope**
-
- 1. **Input**:
-
-    * User enters website URL (and optional competitor domains).
-    * System validates and normalizes URL.
-
- 2. **Modules implemented**:
-
-    * **Technical SEO** (PageSpeed Insights API)
-
-      * Mobile & desktop performance scores
-      * Core Web Vitals (LCP, CLS, INP)
-      * Key flagged issues (e.g., oversized images, render-blocking JS)
-    * **Content Audit** (custom crawl)
-
-      * # of pages discovered (via sitemap / bounded crawl, capped \~200)
-      * Metadata completeness (Title, Description, H1)
-      * Avg. word count per page
-      * CTA keyword presence (“contact”, “download”, etc.)
-      * Content freshness (last modified vs today)
-
- 3. **Report generation**:
-
-    * Render as **HTML** report (modular sections).
-    * Provide **Download as PDF** option (same HTML rendered to PDF).
-    * Include **charts/visuals** (e.g., doughnut/pie for metadata completeness, freshness buckets, bar for Core Web Vitals vs benchmarks).
-
- 4. **Interface**:
-
-    * **Streamlit app** for demo UI.
-    * Inputs: URL (+ optional competitor domains).
-    * Buttons: “Generate Report”, “Download PDF”.
-    * Report preview inline in Streamlit.
-
- **Out of scope (v1, stub/fallback only)**
-
- * Keyword Rankings (GSC/SEMrush) → show placeholder section.
- * Backlink Profile (Ahrefs/SEMrush) → placeholder section.
- * Competitor benchmarking → limited to PageSpeed/content freshness comparison if URLs provided.
- * GA4 / conversion metrics.
-
- ---
-
- ## Output structure (MVP report)
-
- 1. **Executive Summary**
-
-    * Quick health snapshot: Technical performance + Content audit highlights.
-    * “Quick wins” (e.g., missing metadata, low mobile score).
-
- 2. **Technical SEO**
-
-    * PageSpeed scores (Mobile + Desktop).
-    * Core Web Vitals chart.
-    * Top issues flagged.
-
- 3. **Content Audit**
-
-    * Indexed pages count (discovered pages).
-    * Metadata completeness (% with title, description, H1).
-    * Avg. word count per page (vs benchmark 800–1200 words).
-    * CTA presence (% pages with calls-to-action).
-    * Content freshness buckets (<6 months, 6–18 months, >18 months).
-
- 4. **Competitor Light (optional if input provided)**
-
-    * PageSpeed score comparison.
-    * Content freshness comparison (avg. last-modified).
-
- 5. **Placeholder sections**
-
-    * Keywords, backlinks, conversions → visible but labeled as “to be added in future versions.”
-
- 6. **Recommendations**
-
-    * Auto-generated based on findings (ruleset from benchmarks).
-    * Example: “50% of pages missing meta descriptions → prioritize metadata optimization.”
-
- ---
-
- ## Success criteria
-
- * **Functional**: User can input a URL and receive a full HTML + PDF report in <3 minutes.
- * **Professional output**: Report visually resembles an agency deck (charts, tables, summaries).
- * **Modular design**: Technical SEO and Content Audit implemented as independent modules, with stubs for others.
- * **Extensible**: Report generator uses templates so adding future modules is straightforward.
-
- ---
-
- ## Evaluation metrics
-
- * Report generates without failures for at least 3 different domains.
- * PageSpeed data fetched reliably via Google API.
- * Crawl completes within 200 pages, respecting robots.txt.
- * Charts render correctly in HTML and export cleanly to PDF.
- * Report structure matches defined format.
-
- ---
-
- This PRD keeps the v1 realistic (2–4 days build) while laying the bones for the full system.
-
- Do you want me to next **map this PRD to required API keys/libraries** so we know what accounts to set up before coding, or should we first design the **module interfaces (input/output contract)**?
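Review note: the PRD defines content freshness buckets (<6 months, 6–18 months, >18 months) based on last-modified dates, but `modules/content_audit.py` is not shown in this diff. As a rough sketch of that bucketing, the code below approximates 6 and 18 months as 183 and 548 days; those thresholds and the names `freshness_bucket` and `freshness_summary` are assumptions for illustration.

```python
from collections import Counter
from datetime import date
from typing import Dict, Iterable

BUCKETS = ("<6 months", "6-18 months", ">18 months")
SIX_MONTHS_DAYS = 183       # assumed approximation of 6 months
EIGHTEEN_MONTHS_DAYS = 548  # assumed approximation of 18 months


def freshness_bucket(last_modified: date, today: date) -> str:
    """Map a page's last-modified date to one of the PRD's three buckets."""
    age_days = (today - last_modified).days
    if age_days < SIX_MONTHS_DAYS:
        return BUCKETS[0]
    if age_days <= EIGHTEEN_MONTHS_DAYS:
        return BUCKETS[1]
    return BUCKETS[2]


def freshness_summary(dates: Iterable[date], today: date) -> Dict[str, int]:
    """Count pages per bucket, always returning all three keys."""
    counts = Counter(freshness_bucket(d, today) for d in dates)
    return {label: counts.get(label, 0) for label in BUCKETS}


if __name__ == "__main__":
    today = date(2025, 1, 1)
    pages = [date(2024, 12, 1), date(2024, 1, 1), date(2022, 1, 1)]
    print(freshness_summary(pages, today))
```

Returning every bucket, including empty ones, keeps a doughnut or bar chart stable when a site has no pages in one of the ranges.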
 
modules/__pycache__/__init__.cpython-313.pyc DELETED
Binary file (144 Bytes)
 
modules/__pycache__/content_audit.cpython-313.pyc DELETED
Binary file (17.1 kB)
 
modules/__pycache__/technical_seo.cpython-313.pyc DELETED
Binary file (9.8 kB)
 
pdf_generator.py DELETED
@@ -1,457 +0,0 @@
- from weasyprint import HTML, CSS
- import base64
- import io
- from typing import Dict, Any, List
-
- class PDFGenerator:
-     def __init__(self):
-         self.css_styles = self._get_pdf_styles()
-
-     def generate_pdf(self, html_content: str) -> bytes:
-         """
-         Generate PDF from HTML content
-
-         Args:
-             html_content: HTML string to convert to PDF
-
-         Returns:
-             PDF content as bytes
-         """
-         try:
-             # Clean HTML for PDF generation (remove interactive elements)
-             pdf_html = self._prepare_html_for_pdf(html_content)
-
-             # Create HTML document
-             html_doc = HTML(string=pdf_html)
-
-             # Generate PDF
-             pdf_buffer = io.BytesIO()
-             html_doc.write_pdf(pdf_buffer, stylesheets=[CSS(string=self.css_styles)])
-
-             return pdf_buffer.getvalue()
-
-         except Exception as e:
-             print(f"PDF generation failed: {e}")
-             raise
-
-     def _prepare_html_for_pdf(self, html_content: str) -> str:
-         """
-         Prepare HTML content for PDF generation by removing interactive elements
-         """
-         # Remove Plotly scripts and interactive charts
-         # Replace with static chart placeholders
-         pdf_html = html_content.replace(
-             '<script src="https://cdn.plot.ly/plotly-latest.min.js"></script>',
-             ''
-         )
-
-         # Remove any JavaScript
-         import re
-         pdf_html = re.sub(r'<script[^>]*>.*?</script>', '', pdf_html, flags=re.DOTALL)
-
-         # Replace interactive Plotly divs with chart placeholders
-         pdf_html = re.sub(
-             r'<div[^>]*class="plotly-graph-div"[^>]*>.*?</div>',
-             '<div class="chart-placeholder"><p>📊 Chart: View interactive version in HTML report</p></div>',
-             pdf_html,
-             flags=re.DOTALL
-         )
-
-         return pdf_html
-
-     def _get_pdf_styles(self) -> str:
-         """
-         Get CSS styles optimized for PDF generation
-         """
-         return """
-         @page {
-             margin: 2cm;
-             size: A4;
-             @top-center {
-                 content: "SEO Report";
-                 font-size: 10pt;
-                 color: #666;
-             }
-             @bottom-center {
-                 content: "Page " counter(page) " of " counter(pages);
-                 font-size: 10pt;
-                 color: #666;
-             }
-         }
-
-         * {
-             margin: 0;
-             padding: 0;
-             box-sizing: border-box;
-         }
-
-         body {
-             font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, sans-serif;
-             line-height: 1.4;
-             color: #333;
-             font-size: 11pt;
-         }
-
-         .report-container {
-             max-width: 100%;
-         }
-
-         .report-header {
-             background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
-             color: white;
-             padding: 30px;
-             text-align: center;
-             border-radius: 8px;
-             margin-bottom: 20px;
-             break-inside: avoid;
-         }
-
-         .report-header h1 {
-             font-size: 24pt;
-             margin-bottom: 10px;
-         }
-
-         .section {
-             background: white;
-             margin-bottom: 20px;
-             padding: 20px;
-             border: 1px solid #ddd;
-             border-radius: 8px;
-             break-inside: avoid-page;
-         }
-
-         .section h2 {
-             color: #2c3e50;
-             margin-bottom: 15px;
-             font-size: 16pt;
-             border-bottom: 2px solid #3498db;
-             padding-bottom: 5px;
-         }
-
-         .summary-card {
-             display: flex;
-             justify-content: space-between;
-             align-items: center;
-             margin-bottom: 20px;
-             padding: 15px;
-             background: #f8f9fa;
-             border-radius: 8px;
-             border: 1px solid #dee2e6;
-         }
-
-         .health-score {
-             text-align: center;
-             margin-right: 20px;
-         }
-
-         .score-circle {
-             width: 80px;
-             height: 80px;
-             border: 4px solid #3498db;
-             border-radius: 50%;
-             display: flex;
-             flex-direction: column;
-             align-items: center;
-             justify-content: center;
-             margin: 10px auto;
-         }
-
-         .score-number {
-             font-size: 18pt;
-             font-weight: bold;
-             color: #3498db;
-         }
-
-         .score-label {
-             font-size: 8pt;
-         }
-
-         .key-metrics {
-             display: flex;
-             gap: 20px;
-             flex: 1;
-         }
-
-         .metric {
-             text-align: center;
-             flex: 1;
-         }
-
-         .metric h4 {
-             margin-bottom: 5px;
-             font-size: 10pt;
-             color: #666;
-         }
-
-         .quick-wins {
-             background: #fff3cd;
-             border: 1px solid #ffeeba;
-             border-radius: 6px;
-             padding: 15px;
-             break-inside: avoid;
-         }
-
-         .quick-wins h3 {
-             color: #856404;
-             margin-bottom: 10px;
-             font-size: 12pt;
-         }
-
-         .quick-wins ul {
-             list-style-type: none;
-         }
-
-         .quick-wins li {
-             color: #856404;
-             margin-bottom: 5px;
-             padding-left: 15px;
-             position: relative;
-         }
-
-         .quick-wins li:before {
-             content: "→";
-             position: absolute;
-             left: 0;
-             color: #ffc107;
-             font-weight: bold;
-         }
-
-         .metric-row {
-             display: flex;
-             gap: 15px;
-             margin-bottom: 20px;
-             flex-wrap: wrap;
-         }
-
-         .metric-card {
-             background: #667eea;
-             color: white;
-             padding: 15px;
-             border-radius: 8px;
-             text-align: center;
-             flex: 1;
-             min-width: 120px;
-         }
-
-         .metric-card h4 {
-             font-size: 9pt;
-             margin-bottom: 8px;
-             opacity: 0.9;
-         }
-
-         .metric-card .score {
-             font-size: 16pt;
-             font-weight: bold;
-         }
-
-         .chart-placeholder {
-             background: #f8f9fa;
-             border: 2px dashed #ddd;
-             padding: 40px;
-             text-align: center;
-             border-radius: 8px;
-             margin: 15px 0;
-         }
-
-         .chart-placeholder p {
-             color: #666;
-             font-style: italic;
-         }
-
-         .stat {
-             display: flex;
-             justify-content: space-between;
-             align-items: center;
-             padding: 8px 0;
-             border-bottom: 1px solid #eee;
-         }
-
-         .stat:last-child {
-             border-bottom: none;
-         }
-
-         .stat .label {
-             font-weight: 600;
-             color: #2c3e50;
-             font-size: 10pt;
-         }
-
-         .stat .value {
-             font-weight: bold;
-             color: #3498db;
-             font-size: 10pt;
-         }
-
-         .stat .benchmark {
-             font-size: 8pt;
-             color: #7f8c8d;
-         }
-
-         .opportunity {
-             background: #f8f9fa;
-             border-left: 3px solid #ff6b6b;
-             padding: 10px;
-             margin-bottom: 10px;
-             break-inside: avoid;
-         }
-
-         .opportunity h4 {
-             color: #2c3e50;
-             margin-bottom: 5px;
-             font-size: 11pt;
-         }
-
-         .savings {
-             display: inline-block;
-             background: #ff6b6b;
-             color: white;
-             padding: 2px 6px;
-             border-radius: 3px;
-             font-size: 8pt;
-             margin-top: 5px;
-         }
-
-         .comparison-table {
-             width: 100%;
-             border-collapse: collapse;
-             margin-top: 15px;
-             font-size: 9pt;
-         }
-
-         .comparison-table th,
-         .comparison-table td {
-             padding: 8px;
-             text-align: left;
-             border-bottom: 1px solid #ddd;
-         }
-
-         .comparison-table th {
-             background: #f8f9fa;
-             font-weight: bold;
-             color: #2c3e50;
-         }
-
-         .primary-site {
-             background: #e8f5e8;
-             font-weight: bold;
-         }
-
-         .placeholder-sections {
-             display: flex;
-             flex-wrap: wrap;
-             gap: 15px;
-         }
-
-         .placeholder-section {
-             border: 2px dashed #ddd;
-             border-radius: 8px;
-             padding: 15px;
-             text-align: center;
-             background: #fafafa;
-             flex: 1;
-             min-width: 250px;
-         }
-
-         .placeholder-section h3 {
-             color: #7f8c8d;
-             margin-bottom: 10px;
-             font-size: 12pt;
-         }
-
-         .placeholder-content p {
-             color: #7f8c8d;
-             font-style: italic;
-             margin-bottom: 10px;
-             font-size: 9pt;
-         }
-
-         .placeholder-content ul {
-             list-style: none;
-             color: #95a5a6;
-             font-size: 9pt;
-         }
-
-         .recommendations-section {
-             background: #667eea;
-             color: white;
-             border-radius: 8px;
-             padding: 20px;
-         }
-
-         .recommendations-section h3 {
-             margin-bottom: 15px;
-             font-size: 14pt;
-         }
-
-         .recommendation {
-             background: white;
-             color: #333;
-             border-radius: 6px;
-             padding: 15px;
-             margin-bottom: 15px;
-             break-inside: avoid;
-         }
-
-         .rec-header {
-             display: flex;
-             align-items: center;
-             gap: 8px;
-             margin-bottom: 8px;
-         }
-
-         .rec-number {
-             background: #3498db;
-             color: white;
-             width: 24px;
-             height: 24px;
-             border-radius: 50%;
-             display: flex;
-             align-items: center;
-             justify-content: center;
-             font-weight: bold;
-             font-size: 10pt;
-         }
-
-         .rec-priority {
-             color: white;
-             padding: 3px 6px;
-             border-radius: 3px;
-             font-size: 8pt;
-             font-weight: bold;
-         }
-
-         .rec-category {
-             background: #ecf0f1;
-             color: #2c3e50;
-             padding: 3px 6px;
-             border-radius: 3px;
-             font-size: 8pt;
-         }
-
-         .recommendation h4 {
-             font-size: 11pt;
-             margin-bottom: 5px;
-         }
-
-         .recommendation p {
-             font-size: 9pt;
-             line-height: 1.3;
-         }
-
-         .rec-timeline {
-             color: #7f8c8d;
-             font-size: 8pt;
-             margin-top: 8px;
-             font-weight: bold;
-         }
-
-         .error-message {
-             background: #f8d7da;
-             border: 1px solid #f5c6cb;
-             color: #721c24;
-             padding: 15px;
-             border-radius: 6px;
-             text-align: center;
-             font-size: 10pt;
-         }
-         """
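Review note: `_prepare_html_for_pdf` in the removed file strips scripts and swaps Plotly chart containers for placeholders using regular expressions. The standalone sketch below reproduces that idea with only the standard library so it can be tried without WeasyPrint. Two things here are assumptions rather than the removed code's behavior: the added `re.IGNORECASE` flag on the script pattern, and the plain-text placeholder wording.

```python
import re

# Matches <script ...>...</script> across lines; IGNORECASE is an addition
# in this sketch so that <SCRIPT> tags are also removed.
SCRIPT_RE = re.compile(r"<script[^>]*>.*?</script>", flags=re.DOTALL | re.IGNORECASE)

# Same pattern as the removed code. The lazy .*? stops at the first </div>,
# so a chart container with nested <div> children would leave markup behind.
PLOTLY_DIV_RE = re.compile(
    r'<div[^>]*class="plotly-graph-div"[^>]*>.*?</div>', flags=re.DOTALL
)

PLACEHOLDER = (
    '<div class="chart-placeholder">'
    "<p>Chart: view the interactive version in the HTML report</p></div>"
)


def prepare_html_for_pdf(html: str) -> str:
    """Drop all scripts, then replace Plotly chart divs with a static note."""
    without_scripts = SCRIPT_RE.sub("", html)
    return PLOTLY_DIV_RE.sub(PLACEHOLDER, without_scripts)


if __name__ == "__main__":
    sample = (
        '<h1>Report</h1><SCRIPT src="x.js"></SCRIPT>'
        '<div id="c1" class="plotly-graph-div" style="height:400px"></div>'
        '<script>Plotly.newPlot("c1")</script>'
    )
    print(prepare_html_for_pdf(sample))
```

Because the generic script pattern already covers the Plotly CDN tag, the separate `str.replace` call in the removed code appears redundant. If chart containers can contain nested elements, an HTML parser would likely be more robust than a regex.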
 
run.py DELETED
@@ -1,40 +0,0 @@
- """
- Quick start script for SEO Report Generator
- """
-
- import subprocess
- import sys
- import os
-
- def main():
-     print("🔍 SEO Report Generator")
-     print("=" * 40)
-
-     # Check if we're in the right directory
-     if not os.path.exists('app.py'):
-         print("❌ Error: app.py not found. Make sure you're in the correct directory.")
-         sys.exit(1)
-
-     print("📦 Starting Streamlit application...")
-     print("🌐 App will be available at: http://localhost:8501")
-     print("🔄 Press Ctrl+C to stop the application")
-     print("\n💡 Quick Tips:")
-     print("   • Enter any website URL to analyze")
-     print("   • Add competitor URLs for benchmarking")
-     print("   • Reports include technical SEO + content audit")
-     print("   • Download HTML reports (PDF via browser print)")
-     print("=" * 40)
-
-     try:
-         # Start Streamlit app
-         subprocess.run([sys.executable, "-m", "streamlit", "run", "app.py"], check=True)
-     except KeyboardInterrupt:
-         print("\n👋 Application stopped by user")
-     except subprocess.CalledProcessError as e:
-         print(f"❌ Error starting application: {e}")
-         print("💡 Make sure you have installed the requirements: pip install -r requirements.txt")
-     except FileNotFoundError:
-         print("❌ Streamlit not found. Install it with: pip install streamlit")
-
- if __name__ == "__main__":
-     main()
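Review note: because `run.py` launches Streamlit through `sys.executable -m streamlit`, a missing Streamlit install would most likely surface as a non-zero exit (`CalledProcessError`) rather than `FileNotFoundError`, since the interpreter itself exists. One possible alternative, sketched here with hypothetical helper names, is to check for the package before starting the subprocess.

```python
import importlib.util
import sys
from typing import Iterable, List


def missing_modules(names: Iterable[str]) -> List[str]:
    """Return the top-level module names that cannot be found on sys.path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def streamlit_command(app_path: str = "app.py") -> List[str]:
    """Build the same launch command the removed run.py used."""
    return [sys.executable, "-m", "streamlit", "run", app_path]


if __name__ == "__main__":
    missing = missing_modules(["streamlit"])
    if missing:
        print(f"Missing packages: {', '.join(missing)}")
        print("Install them with: python -m pip install -r requirements.txt")
    else:
        print("Would run:", " ".join(streamlit_command()))
```

Checking up front lets the script print a targeted install hint instead of relying on which exception the subprocess happens to raise.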
 
test_app.py DELETED
@@ -1,122 +0,0 @@
- """
- Test script for SEO Report Generator
- Run this to test the core functionality without the Streamlit UI
- """
-
- from modules.technical_seo import TechnicalSEOModule
- from modules.content_audit import ContentAuditModule
- from report_generator import ReportGenerator
- from pdf_generator import PDFGenerator
-
- def test_seo_report_generation():
-     """Test the complete SEO report generation process"""
-
-     # Test URLs
-     test_urls = [
-         "https://example.com",
-         "https://python.org",
-         "https://github.com"
-     ]
-
-     print("🔍 Starting SEO Report Generator Tests\n")
-
-     for url in test_urls:
-         print(f"Testing URL: {url}")
-         print("-" * 50)
-
-         try:
-             # Initialize modules
-             technical_module = TechnicalSEOModule()
-             content_module = ContentAuditModule()
-             report_gen = ReportGenerator()
-
-             # Technical SEO Analysis
-             print("⚡ Running Technical SEO analysis...")
-             technical_data = technical_module.analyze(url)
-
-             if technical_data.get('error'):
-                 print(f"⚠️ Technical analysis failed: {technical_data['error']}")
-             else:
-                 mobile_score = technical_data.get('mobile', {}).get('performance_score', 0)
-                 desktop_score = technical_data.get('desktop', {}).get('performance_score', 0)
-                 print(f"✅ Performance scores - Mobile: {mobile_score}/100, Desktop: {desktop_score}/100")
-
-             # Content Audit
-             print("📝 Running Content audit...")
-             content_data = content_module.analyze(url, quick_scan=True)  # Quick scan for testing
-
-             if content_data.get('error'):
-                 print(f"⚠️ Content analysis failed: {content_data['error']}")
-             else:
-                 pages_analyzed = content_data.get('pages_analyzed', 0)
-                 title_coverage = content_data.get('metadata_completeness', {}).get('title_coverage', 0)
-                 print(f"✅ Content metrics - Pages analyzed: {pages_analyzed}, Title coverage: {title_coverage}%")
-
-             # Generate HTML Report
-             print("📊 Generating HTML report...")
-             report_html = report_gen.generate_html_report(
-                 url=url,
-                 technical_data=technical_data,
-                 content_data=content_data,
-                 include_charts=True
-             )
-
-             # Save HTML report
-             filename = f"test_report_{url.replace('https://', '').replace('/', '_')}.html"
-             with open(filename, 'w', encoding='utf-8') as f:
-                 f.write(report_html)
-             print(f"✅ HTML report saved: {filename}")
-
-             # Test PDF generation
-             print("📑 Testing PDF generation...")
-             try:
-                 pdf_gen = PDFGenerator()
-                 pdf_data = pdf_gen.generate_pdf(report_html)
-
-                 pdf_filename = filename.replace('.html', '.pdf')
-                 with open(pdf_filename, 'wb') as f:
-                     f.write(pdf_data)
-                 print(f"✅ PDF report saved: {pdf_filename}")
-
-             except Exception as pdf_error:
-                 print(f"⚠️ PDF generation failed: {pdf_error}")
-
-             print("✅ Test completed successfully!\n")
-
-         except Exception as e:
-             print(f"❌ Test failed for {url}: {str(e)}\n")
-
- def test_individual_modules():
-     """Test individual modules separately"""
-     print("🧪 Testing Individual Modules\n")
-
-     # Test Technical SEO Module
-     print("Testing Technical SEO Module...")
-     tech_module = TechnicalSEOModule()
-     tech_result = tech_module.analyze("https://example.com")
-     print(f"Technical SEO result keys: {list(tech_result.keys())}")
-
-     # Test Content Audit Module
-     print("\nTesting Content Audit Module...")
-     content_module = ContentAuditModule()
-     content_result = content_module.analyze("https://example.com", quick_scan=True)
-     print(f"Content Audit result keys: {list(content_result.keys())}")
-
-     print("\n✅ Individual module tests completed!")
-
- if __name__ == "__main__":
-     print("=" * 60)
-     print("SEO REPORT GENERATOR - TEST SUITE")
-     print("=" * 60)
-
-     # Run individual module tests
-     test_individual_modules()
-     print("\n" + "=" * 60 + "\n")
-
-     # Run full report generation tests
-     test_seo_report_generation()
-
-     print("=" * 60)
-     print("🎉 All tests completed!")
-     print("Check the generated HTML and PDF files to verify output.")
-     print("=" * 60)
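Review note: the removed test script derives report filenames with `url.replace('https://', '').replace('/', '_')`. That works for the three hard-coded URLs but would leave characters such as `:` or `?` in the name for `http://` URLs or URLs with query strings. A small sketch of a more defensive variant follows; `report_filename` is a hypothetical helper, and the `test_report_` prefix simply mirrors the removed script.

```python
import re
from urllib.parse import urlsplit


def report_filename(url: str, extension: str = "html") -> str:
    """Build a filesystem-friendly report name from a URL's host and path."""
    parts = urlsplit(url)
    # Scheme, query string, and fragment are deliberately ignored.
    raw = f"{parts.netloc}{parts.path}".strip("/") or "report"
    # Collapse anything outside a conservative character set into "_".
    slug = re.sub(r"[^A-Za-z0-9._-]+", "_", raw).strip("_")
    return f"test_report_{slug}.{extension}"


if __name__ == "__main__":
    print(report_filename("https://example.com"))
    print(report_filename("http://example.com/blog/post?id=1"))
    print(report_filename("https://python.org/", "pdf"))
```

For the original test URLs this produces the same names as before, so existing output files would not be renamed.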