structured-llm-ready-doc-converter

pierreguillou committed 17 days ago

Commit

448f55b

verified

1 Parent(s): b740887

Update app.py

Files changed (1)

app.py CHANGED

@@ -285,9 +285,10 @@ def reset_form():
 # Gradio Interface
 with gr.Blocks(title="LLM-Ready Document Converter") as app:
     gr.Markdown("# 📄 LLM-Ready Document Converter")
     gr.Markdown("**HOWTO** : Upload a document or image and get 4 output files: Docling JSON, TXT, Markdown, and HTML")
-	gr.Markdown("**EXPLANATION** : This app transforms various document formats (like TXT, standard and scanned PDFs, DOCX, PPT, CSV, XLS, XLSX) and **images (PNG, JPG, JPEG, BMP, TIFF)** into structured, machine-readable outputs optimized for Large Language Models (LLMs). For images, it uses OCR (Optical Character Recognition) to extract text. For all input documents, it extracts and converts content into clean formats such as DocLing JSON (for document structure), plain text, Markdown, and HTML making it easier for AI models to process, analyze, or generate responses from complex documents without losing key details like layout or formatting. Essentially, it's a bridge between raw files and AI-ready data.")
+    gr.Markdown("**EXPLANATION** : This app transforms various document formats (like TXT, standard and scanned PDFs, DOCX, PPT, CSV, XLS, XLSX) and **images (PNG, JPG, JPEG, BMP, TIFF)** into structured, machine-readable outputs optimized for Large Language Models (LLMs). For images, it uses OCR (Optical Character Recognition) to extract text. For all input documents, it extracts and converts content into clean formats such as DocLing JSON (for document structure), plain text, Markdown, and HTML making it easier for AI models to process, analyze, or generate responses from complex documents without losing key details like layout or formatting. Essentially, it's a bridge between raw files and AI-ready data.")
     with gr.Row():
         with gr.Column():
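
The removed and added lines appear to differ only in their leading whitespace: a tab on the removed line, four spaces on the added one. Assuming that is the whole change, the sketch below shows why it matters in Python 3. The `BEFORE` and `AFTER` strings are hypothetical stand-ins for the `with gr.Blocks(...)` body, with `print` in place of `gr.Markdown`, and `indentation_status` is a helper invented for this illustration; none of it comes from `app.py`.

```python
def indentation_status(source: str) -> str:
    """Return 'ok' if the source compiles, else the name of the syntax error raised."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError as exc:  # TabError and IndentationError are subclasses of SyntaxError
        return type(exc).__name__
    return "ok"


# Hypothetical stand-in for the block in app.py: the second statement is
# indented with a tab while its sibling uses four spaces.
BEFORE = "if True:\n    print('howto')\n\tprint('explanation')\n"

# Same block after the change: both statements use four spaces.
AFTER = "if True:\n    print('howto')\n    print('explanation')\n"

before_status = indentation_status(BEFORE)
after_status = indentation_status(AFTER)
print(before_status, after_status)
```

Using `compile` rather than executing the snippet keeps the check free of side effects; it only asks whether the interpreter accepts the indentation.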