---
title: Search Engine
emoji: ๐
colorFrom: blue
colorTo: indigo
sdk: docker
pinned: false
---
# Prompt Search Engine

## Table of Contents

1. [Project Overview](#project-overview)
2. [Environment Setup](#environment-setup)
3. [Run the Project](#run-the-project)
4. [Instructions for Building and Running the Docker Container](#instructions-for-building-and-running-the-docker-container)
5. [API Endpoints and Usage](#api-endpoints-and-usage)
6. [Deployment Details](#deployment-details)
7. [Running Tests](#running-tests)
8. [Information on How to Use the UI](#information-on-how-to-use-the-ui)
9. [Future Improvements](#future-improvements)

---
## Project Overview

The Prompt Search Engine addresses the growing need for high-quality prompts in AI-generated content, particularly for models like Stable Diffusion. By leveraging a database of existing prompts, the search engine helps users discover the most relevant and effective prompts, significantly enhancing the quality of generated images.

The main goal of the prompt search engine is to return the top *n* most similar prompts for a given input query. Surfacing better prompts for Stable Diffusion models in this way helps produce higher-quality images.
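At its core, this is a vector-similarity search over the prompt corpus. The sketch below illustrates the idea, assuming prompts are embedded as vectors (for example with a sentence encoder) and ranked by cosine similarity; the function and variable names are illustrative, not the project's actual API.

```python
import numpy as np

def top_n_similar(query_vec: np.ndarray, corpus_vecs: np.ndarray,
                  prompts: list[str], n: int = 5):
    """Return the n prompts whose embeddings are most similar to the query.

    query_vec:   (d,) embedding of the query prompt.
    corpus_vecs: (num_prompts, d) embeddings of the prompt corpus.
    """
    # Cosine similarity is the dot product of L2-normalized vectors.
    q = query_vec / np.linalg.norm(query_vec)
    c = corpus_vecs / np.linalg.norm(corpus_vecs, axis=1, keepdims=True)
    scores = c @ q
    # Indices of the n highest-scoring prompts, best first.
    top_idx = np.argsort(scores)[::-1][:n]
    return [(float(scores[i]), prompts[i]) for i in top_idx]
```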
### Technology Used

This project leverages a modern tech stack to deliver efficient search functionality:

1. **FastAPI**: A high-performance web framework for building the backend API.
2. **Gradio**: A lightweight UI framework for creating the frontend interface.
3. **Hugging Face Spaces**: Hosts the application using Docker.
4. **Hugging Face Datasets**: Downloads and processes the `google-research-datasets/conceptual_captions` dataset at runtime (see the sketch after this list).
5. **Uvicorn**: ASGI server for running the FastAPI application.
6. **Python**: Core language used for development and scripting.
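As a rough illustration of the runtime dataset step in item 4, the corpus can be pulled with the `datasets` library; the split and column names used here are assumptions rather than a description of the actual code.

```python
from datasets import load_dataset

# Assumptions: the "validation" split is used and prompts come from the
# dataset's "caption" column; the real code may differ.
dataset = load_dataset("google-research-datasets/conceptual_captions", split="validation")
prompts = dataset["caption"]
print(f"Loaded {len(prompts)} prompts")
```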
---

## Environment Setup

To set up the environment for the Prompt Search Engine, follow these steps:

### Prerequisites

1. **Python**: Ensure Python >= 3.9 is installed. You can download it from [Python.org](https://www.python.org/downloads/).
2. **Docker**: Install Docker to containerize and deploy the application. Visit [Docker's official site](https://www.docker.com/get-started) for installation instructions.
3. **Conda (optional)**: Install Miniconda or Anaconda for managing a virtual environment locally.

### Steps to Install Dependencies

1. Navigate to the project directory:
   ```bash
   cd <project-directory>
   ```
2. Create and activate a Conda environment (optional):
   ```bash
   conda create -n prompt_search_env python={version} -y
   conda activate prompt_search_env
   ```
   - Replace `{version}` with your desired Python version (e.g., 3.9).
3. Install dependencies inside the Conda environment using `pip`:
   ```bash
   pip install -r requirements.txt
   ```
4. Review and update `config.py` to match your environment (for example, API keys or dataset paths); a hypothetical example of the kind of values involved is shown below.
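The exact contents of `config.py` are project-specific; this snippet only sketches the sort of settings you may need to adjust, with hypothetical names.

```python
# config.py -- illustrative only; these setting names are hypothetical
DATASET_NAME = "google-research-datasets/conceptual_captions"
MAX_PROMPTS = 10_000      # optional cap on the prompt corpus size
API_HOST = "0.0.0.0"
API_PORT = 8000
FRONTEND_PORT = 7860
```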
## Run the Project

You can run the application locally using either a Conda environment or Docker:

- **Using a Conda environment:**
  1. Start the backend API. Swagger documentation will be accessible at `http://0.0.0.0:8000/docs`:
     ```bash
     python run.py
     ```
  2. Run the frontend application:
     ```bash
     python -m fe.gradio_app
     ```
     The frontend will be accessible at `http://0.0.0.0:7860`.
- **Using Docker:**
  Refer to the instructions in the next section for building and running the Docker container.
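For orientation, `run.py` typically just hands the FastAPI app to Uvicorn. A minimal sketch, assuming the app object is importable as `api.main:app` (the actual module path may differ):

```python
# run.py -- minimal sketch; the "api.main:app" import string is an assumption
import uvicorn

if __name__ == "__main__":
    # Serve the FastAPI application on all interfaces, port 8000.
    uvicorn.run("api.main:app", host="0.0.0.0", port=8000)
```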
## Instructions for Building and Running the Docker Container

1. Build the Docker image:
   ```bash
   docker build -t prompt-search-engine .
   ```
2. Run the Docker container:
   ```bash
   docker run -p 8000:8000 -p 7860:7860 prompt-search-engine
   ```
   - The backend API will be accessible at `http://0.0.0.0:8000/docs`.
   - The frontend will be accessible at `http://0.0.0.0:7860`.

Your environment is now ready to use the Prompt Search Engine.
## API Endpoints and Usage

### `/search` (GET)

Endpoint for querying the search engine.

#### Parameters:

- `query` (str): The search query. **Required**.
- `n` (int): Number of results to return (default: 5). Must be greater than or equal to 1.

#### Example Request:

```bash
curl -X GET "http://0.0.0.0:8000/search?query=example+prompt&n=5"
```

#### Example Response:

```json
{
  "query": "example prompt",
  "results": [
    {"score": 0.95, "prompt": "example similar prompt 1"},
    {"score": 0.92, "prompt": "example similar prompt 2"}
  ]
}
```
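The same request can also be issued from Python; a minimal client example using `requests` and the documented response shape:

```python
import requests

# Query the /search endpoint for the 5 most similar prompts.
response = requests.get(
    "http://0.0.0.0:8000/search",
    params={"query": "example prompt", "n": 5},
)
response.raise_for_status()
for result in response.json()["results"]:
    print(f'{result["score"]:.2f}  {result["prompt"]}')
```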
---

## Deployment Details

### Overview

This section outlines the steps to deploy the **Prompt Search Engine** application using Docker and Hugging Face Spaces. The application comprises a backend (API) and a frontend (Gradio-based UI) that run together in a single Docker container.

### Prerequisites

1. A [Hugging Face account](https://huggingface.co/).
2. Git installed locally.
3. Access to the project repository on GitHub.
4. Docker installed locally for testing.
5. A Hugging Face **Access Token** (needed for authentication).

### Deployment Steps

1. **Create a Hugging Face Space:**
   - Log in to [Hugging Face Spaces](https://huggingface.co/spaces).
   - Click on **Create Space**.
   - Fill in the details:
     - **Space Name**: Choose a name like `promptsearchengine`.
     - **SDK**: Select `Docker`.
     - **Visibility**: Choose between public or private.
   - Click **Create Space** to generate a new repository.
2. **Create a Hugging Face Access Token:**
   - Log in to [Hugging Face](https://huggingface.co/).
   - Navigate to **Settings** > **Access Tokens**.
   - Click **New Token**:
     - **Name**: `Promptsearchengine Deployment`.
     - **Role**: Select `Write`.
   - Copy the token. You'll need it for pushing to Hugging Face Spaces.
3. **Test the Application Locally:**
   ```bash
   docker build -t promptsearchengine .
   docker run -p 8000:8000 -p 7860:7860 promptsearchengine
   ```
   - **Backend**: Test at `http://localhost:8000`.
   - **Frontend**: Test at `http://localhost:7860`.
4. **Prepare the Project for Hugging Face Spaces:**
   - Ensure the `Dockerfile` is updated for Hugging Face Spaces:
     - Set environment variables for writable directories (e.g., `HF_HOME=/tmp/huggingface`); see the Dockerfile sketch after these steps.
   - Ensure a valid `README.md` is present at the root with the Hugging Face configuration:
     ```markdown
     ---
     title: Promptsearchengine
     emoji: ๐
     colorFrom: blue
     colorTo: indigo
     sdk: docker
     pinned: false
     ---
     ```
5. **Push the Project to Hugging Face Spaces:**
   ```bash
   git remote add space https://huggingface.co/spaces/<your-username>/promptsearchengine
   git push space main
   ```
6. **Monitor the Build Logs:**
   - Navigate to your Space on Hugging Face.
   - Monitor the "Logs" tab to ensure the build completes successfully.
### Testing the Deployment

Once deployed, test the application at `https://huggingface.co/spaces/<your-username>/promptsearchengine`.
## Running Tests

Execute all tests from a terminal within your local project environment:

```bash
python -m pytest -vv tests/
```

### Test Structure

- **Unit Tests**: Focus on isolated functionality, like individual endpoints or methods.
- **Integration Tests**: Verify end-to-end behavior using real components.
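A unit test for the `/search` endpoint could look roughly like this, assuming the FastAPI app object is importable as shown (the import path is an assumption):

```python
from fastapi.testclient import TestClient

from api.main import app  # assumed import path for the FastAPI app

client = TestClient(app)

def test_search_returns_at_most_n_results():
    # Request the top 3 results and check the documented response shape.
    response = client.get("/search", params={"query": "sunset over mountains", "n": 3})
    assert response.status_code == 200
    body = response.json()
    assert body["query"] == "sunset over mountains"
    assert len(body["results"]) <= 3
```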
## Information on How to Use the UI

The **Prompt Search Engine** interface is designed for simplicity and ease of use. Follow these steps to interact with the application:

1. **Enter Your Query**:
   - In the "Enter your query" field, type a phrase or keywords for which you want to find related prompts.
2. **Set the Number of Results**:
   - Use the "Number of top results" field to specify how many similar prompts you want to retrieve. The default is 5.
3. **Submit a Query**:
   - Click the **Search** button to execute your query and display results in real time.
4. **View Results**:
   - The results are displayed in a table with the following columns:
     - **Prompt**: The retrieved prompts that are most similar to your query.
     - **Similarity**: The similarity score between your query and each retrieved prompt.
5. **Interpret the Results**:
   - Higher similarity scores indicate a closer match to your query.
   - Use these prompts to refine or inspire new input for your task.

The clean, dark theme is optimized for readability, making it easier to analyze and use the results effectively.
## Future Improvements

1. **Replace Print Statements with Logging**
   - Integrate the `logging` module in place of print statements for better debugging, configurability, and structured logs (a short sketch follows this list).
2. **GitHub Workflow for Continuous Integration**
   - Set up GitHub Actions to automatically sync the GitHub repository with the Hugging Face Space. This will streamline deployment and ensure consistency across versions.
3. **Support for Predownloaded Datasets**
   - Add options to use predownloaded datasets instead of downloading them at runtime, improving usability in restricted or offline environments.
4. **Code Refactoring**
   - Extract remaining hardcoded values into a constants module or configuration file for better maintainability.
5. **Improve Prompt Corpus Handling**
   - Make the current limit on the number of prompts configurable via a parameter, or remove it entirely if unnecessary, to give users greater flexibility.
6. **Database or Persistent Storage**
   - Explore integrating a database for persistent storage, moving away from runtime memory and temporary files to improve scalability and reliability.
7. **Enhance Unit Testing**
   - Expand test coverage for edge cases and performance testing.
   - Automate test execution with GitHub Actions to maintain code quality and reliability.
8. **Frontend Enhancements**
   - Focus on general improvements to the Gradio frontend, such as better customization, theming, and user experience.
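As a sketch of the first item, print statements could be swapped for the standard `logging` module along these lines:

```python
import logging

# Configure a simple structured format once at application startup.
logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(name)s %(levelname)s: %(message)s",
)
logger = logging.getLogger(__name__)

prompts = ["a photo of a cat", "a painting of a city at night"]  # placeholder data

# Instead of: print(f"Loaded {len(prompts)} prompts")
logger.info("Loaded %d prompts", len(prompts))
```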