RAG Document Assistant

This project is a production-ready Retrieval-Augmented Generation (RAG) application designed for intelligent document querying and analysis. Available on GitHub at ammarlouah/rag_project, it supports multiple document formats, hybrid vector stores, caching, and real-time streaming responses using LangChain, Groq API, ChromaDB, FAISS, Redis, and Streamlit. Below, I detail the project’s features, architecture, setup instructions, and key implementation insights.

Project Overview

The RAG Document Assistant enables users to upload documents in various formats, index them efficiently, and query them using natural language. It combines retrieval from vector stores with generation from a large language model, providing accurate, context-aware responses with source attributions.

Key features include:

  • Multi-format support: PDF, DOCX, TXT, Markdown, CSV.
  • Hybrid vector store: ChromaDB for persistence and FAISS for fast similarity search.
  • Intelligent caching with Redis for faster repeated queries.
  • Streaming responses from Groq LLM for real-time interaction.
  • Source citation with page numbers.
  • Adaptive chunking based on document type.
  • Persistent chat history and document management via Streamlit UI.

The request pipeline is: User Query → Cache Check → Vector Search (FAISS/ChromaDB) → LLM (Groq) → Streaming Response → Cache Store → UI Display.
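The pipeline above can be sketched in a few lines of Python. This is a minimal illustration, not the repository's actual code: the `retriever`, `llm`, and dict-based `cache` below are stand-ins for the app's FAISS/ChromaDB search, Groq client, and Redis cache.

```python
import hashlib

def answer(query, cache, retriever, llm, top_k=4):
    """Illustrative RAG flow: cache check -> retrieval -> LLM -> cache store."""
    key = hashlib.sha256(query.encode("utf-8")).hexdigest()
    if key in cache:                      # cache check (Redis in the real app)
        return cache[key]
    chunks = retriever(query, top_k)      # vector search (FAISS/ChromaDB)
    context = "\n\n".join(c["text"] for c in chunks)
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}"
    response = llm(prompt)                # Groq LLM (streamed in the real app)
    sources = sorted({c["page"] for c in chunks})  # pages for source attribution
    result = {"answer": response, "sources": sources}
    cache[key] = result                   # cache store for repeated queries
    return result

# Toy stand-ins to show the flow end to end:
docs = [{"text": "Redis caches query results.", "page": 3}]
fake_retriever = lambda q, k: docs
fake_llm = lambda prompt: "Redis is used for caching."
cache = {}
first = answer("What does Redis do?", cache, fake_retriever, fake_llm)
second = answer("What does Redis do?", cache, fake_retriever, fake_llm)  # cache hit
```

The second call never touches the retriever or the LLM, which is exactly the latency win the Redis layer provides for repeated queries.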

Video Demonstration

To illustrate the application’s capabilities, here’s a video walkthrough:

RAG Document Assistant Demo (video)

Repository Structure

The GitHub repository is organized as follows:

  • app.py: Main Streamlit application entry point.
  • requirements.txt: List of Python dependencies.
  • .env.example: Template for environment variables.
  • rag/:
    • groq_llm.py: Groq LLM integration.
    • embeddings.py: Embedding model setup (HuggingFace).
    • vector_store.py: Hybrid vector store management (ChromaDB + FAISS).
    • chain.py: RAG chain configuration.
  • processing/:
    • document_processor.py: Handles document ingestion and processing.
    • chunker.py: Adaptive text chunking logic.
    • file_loaders/*.py: Loaders for different file formats (PDF, DOCX, etc.).
  • redis_backend/:
    • redis_client.py: Redis connection utilities.
    • cache.py: Caching mechanisms for queries and responses.
  • utils/:
    • helpers.py: Utility functions for the application.

Functionality

Document Processing and Indexing

  • Supports loading and parsing multiple formats using PyMuPDF, python-docx, and pandas.
  • Smart chunking adapts overlap and size based on document type for optimal retrieval.
  • Embeddings generated via HuggingFace’s all-MiniLM-L6-v2 model.
  • Indexed in hybrid stores: ChromaDB for persistent storage and FAISS for efficient querying.
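The adaptive chunking idea can be illustrated with a small sketch. The per-type sizes and overlaps below are hypothetical defaults chosen for illustration, not the repository's actual values.

```python
# Hypothetical (chunk_size, overlap) presets per document type; the real
# project tunes these differently, this only shows the mechanism.
CHUNK_PARAMS = {
    "pdf": (1000, 200),
    "md": (800, 100),
    "csv": (500, 0),
}

def chunk_text(text, doc_type):
    """Split text into overlapping windows sized for the document type."""
    size, overlap = CHUNK_PARAMS.get(doc_type, (1000, 200))
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

chunks = chunk_text("x" * 2500, "pdf")  # 2500 chars -> 3 overlapping chunks
```

Overlap matters because a sentence split across a chunk boundary would otherwise be unretrievable as a whole; the 200-character overlap keeps boundary context in both neighbors.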

Query Handling

  • Checks Redis cache for existing responses.
  • Performs vector search to retrieve relevant chunks (configurable top-K).
  • Augments query with retrieved context and passes to Groq LLM (e.g., llama-3.1-70b-versatile).
  • Streams response in real-time via Streamlit.
  • Caches new responses for future use.
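The top-K vector search step can be demystified with a brute-force stand-in. FAISS does this over optimized index structures; the sketch below computes cosine similarity directly over a tiny hypothetical index to show what "retrieve the K most similar chunks" means.

```python
import math

def cosine(a, b):
    """Cosine similarity between two dense vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def top_k(query_vec, index, k=2):
    """Return the k index entries most similar to the query vector."""
    ranked = sorted(index, key=lambda item: cosine(query_vec, item["vec"]),
                    reverse=True)
    return ranked[:k]

# Toy 2-dimensional "embeddings"; real ones are 384-dim (all-MiniLM-L6-v2).
index = [
    {"vec": [1.0, 0.0], "text": "redis caching"},
    {"vec": [0.0, 1.0], "text": "pdf parsing"},
    {"vec": [0.9, 0.1], "text": "cache eviction"},
]
hits = top_k([1.0, 0.0], index, k=2)
```

The configurable top-K setting in the UI simply changes `k` here: larger values give the LLM more context at the cost of longer prompts.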

User Interface

  • Streamlit-based UI for document upload, querying, and management.
  • Sidebar for file handling (upload, view, delete).
  • Chat interface with history persistence.
  • Settings for retrieval count, temperature, and caching toggle.

Setup Instructions

Prerequisites

  • Python: 3.8+.
  • Redis: Server installed and running.
  • Groq API Key: Obtain from Groq Console.

Cloning the Repository

git clone https://github.com/ammarlouah/rag_project.git
cd rag_project

Installation

  1. Create and activate a virtual environment:
    
    python -m venv venv
    source venv/bin/activate  # On Windows: venv\Scripts\activate
    
  2. Install dependencies:
    
    pip install -r requirements.txt
    

Install and Start Redis

  • Ubuntu/Debian:
    
    sudo apt-get update
    sudo apt-get install redis-server
    sudo systemctl start redis-server
    sudo systemctl enable redis-server
    
  • macOS:
    
    brew install redis
    brew services start redis
    
  • Windows: Download from Redis for Windows.
  • Verify: redis-cli ping should return “PONG”.

Configure Environment

cp .env.example .env
# Edit .env to add GROQ_API_KEY and other variables
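For reference, loading a `.env` file boils down to parsing `KEY=VALUE` lines. Projects typically use python-dotenv for this; the tiny stand-in below shows the mechanics with the standard library only (the key name and value are illustrative).

```python
import os
import tempfile

def load_env(path=".env"):
    """Parse KEY=VALUE lines from a .env file into a dict, skipping
    blank lines and # comments (a tiny stand-in for python-dotenv)."""
    values = {}
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            values[key.strip()] = value.strip()
    return values

# Demo with a temporary file (the key value is a placeholder, not a real key):
with tempfile.NamedTemporaryFile("w", suffix=".env", delete=False) as fh:
    fh.write("# Groq credentials\nGROQ_API_KEY=gsk_example\n")
    path = fh.name
cfg = load_env(path)
```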

Running the Application

streamlit run app.py

Access at http://localhost:8501.

Implementation Details

  • LLM Integration: Uses Groq API with models like llama-3.1-70b-versatile; configurable in .env.
  • Embeddings: HuggingFace all-MiniLM-L6-v2 producing 384-dimensional vectors; alternatives such as all-mpnet-base-v2 are available.
  • Chunking: Default size 1000 chars with 200 overlap; adjustable via .env.
  • Caching: Redis stores query-response pairs; toggleable in UI.
  • Source Attribution: Automatically includes page numbers and metadata in responses.
  • Extensibility: Easy to add new file loaders by registering in DocumentProcessor.
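The loader-registration pattern mentioned above can be sketched with a decorator-based registry. This is a hypothetical illustration of the pattern; the actual `DocumentProcessor` interface in the repository may differ.

```python
import os
import tempfile

# Registry mapping file extension -> loader function.
LOADERS = {}

def register_loader(extension):
    """Decorator that registers a loader for a given file extension."""
    def decorator(fn):
        LOADERS[extension] = fn
        return fn
    return decorator

@register_loader(".txt")
def load_txt(path):
    with open(path, encoding="utf-8") as fh:
        return fh.read()

def load_document(path):
    """Dispatch to the loader registered for the file's extension."""
    ext = os.path.splitext(path)[1].lower()
    if ext not in LOADERS:
        raise ValueError(f"Unsupported format: {ext}")
    return LOADERS[ext](path)

# Demo: register-and-dispatch round trip on a temporary text file.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as fh:
    fh.write("hello")
    path = fh.name
content = load_document(path)
```

Adding support for a new format then means writing one function and decorating it with `@register_loader(".ext")`, with no changes to the dispatch logic.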

Troubleshooting

  • Redis Connection: Ensure server is running; check with redis-cli ping.
  • Groq API: Verify key and rate limits in Groq Console; try alternative models.
  • Import Errors: Reinstall dependencies with pip install --upgrade -r requirements.txt.
  • ChromaDB Issues: Delete chroma_db folder and restart.
  • Adding Loaders: Implement in processing/file_loaders/ and register in document_processor.py.

License

The project is released under the MIT License.

Contributing

Contributions are encouraged! Potential enhancements include multi-modal support, advanced memory, query rewriting, hybrid search, and summarization. Fork the repo, create a branch, and submit a Pull Request.

Contact

For questions or feedback, reach out via GitHub issues or email at ammarlouah9@gmail.com.

Explore the full project and try it out at ammarlouah/rag_project!

Last updated: December 21, 2025

This post is licensed under CC BY 4.0 by the author.