RAG Document Assistant
This project is a production-ready Retrieval-Augmented Generation (RAG) application designed for intelligent document querying and analysis. Available on GitHub at ammarlouah/rag_project, it supports multiple document formats, hybrid vector stores, caching, and real-time streaming responses using LangChain, Groq API, ChromaDB, FAISS, Redis, and Streamlit. Below, I detail the project’s features, architecture, setup instructions, and key implementation insights.
Project Overview
The RAG Document Assistant enables users to upload documents in various formats, index them efficiently, and query them using natural language. It combines retrieval from vector stores with generation from a large language model, providing accurate, context-aware responses with source attributions.
Key features include:
- Multi-format support: PDF, DOCX, TXT, Markdown, CSV.
- Hybrid vector store: ChromaDB for persistence and FAISS for fast similarity search.
- Intelligent caching with Redis for faster repeated queries.
- Streaming responses from Groq LLM for real-time interaction.
- Source citation with page numbers.
- Adaptive chunking based on document type.
- Persistent chat history and document management via Streamlit UI.
The architecture follows: User Query → Cache Check → Vector Search (FAISS/ChromaDB) → LLM (Groq) → Streaming Response → Cache Store → UI Display.
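The pipeline above can be sketched as a single function. This is an illustrative simplification with hypothetical stand-ins for the cache, search, and LLM components, not the project's actual code (which lives in rag/chain.py and related modules):

```python
def answer(query, cache, search, llm, top_k=4):
    """Illustrative RAG flow: Cache Check -> Vector Search -> LLM -> Cache Store.

    `cache` is any mapping (Redis in the app), `search` returns the top-K
    relevant chunks, and `llm` generates a response from the augmented prompt.
    """
    if query in cache:                      # Cache Check
        return cache[query]
    chunks = search(query, top_k)           # Vector Search (FAISS/ChromaDB)
    context = "\n\n".join(c["text"] for c in chunks)
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}"
    response = llm(prompt)                  # LLM (Groq)
    cache[query] = response                 # Cache Store
    return response
```

In the real application the response is streamed token by token to the UI rather than returned whole, but the control flow is the same.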
Video Demonstration
To illustrate the application’s capabilities, here’s a video walkthrough:
RAG Document Assistant Demo
Repository Structure
The GitHub repository is organized as follows:
- app.py: Main Streamlit application entry point.
- requirements.txt: List of Python dependencies.
- .env.example: Template for environment variables.
- rag/:
  - groq_llm.py: Groq LLM integration.
  - embeddings.py: Embedding model setup (HuggingFace).
  - vector_store.py: Hybrid vector store management (ChromaDB + FAISS).
  - chain.py: RAG chain configuration.
- processing/:
  - document_processor.py: Handles document ingestion and processing.
  - chunker.py: Adaptive text chunking logic.
  - file_loaders/*.py: Loaders for different file formats (PDF, DOCX, etc.).
- redis_backend/:
  - redis_client.py: Redis connection utilities.
  - cache.py: Caching mechanisms for queries and responses.
- utils/:
  - helpers.py: Utility functions for the application.
Functionality
Document Processing and Indexing
- Supports loading and parsing multiple formats using PyMuPDF, python-docx, and pandas.
- Smart chunking adapts overlap and size based on document type for optimal retrieval.
- Embeddings generated via HuggingFace’s all-MiniLM-L6-v2 model.
- Indexed in hybrid stores: ChromaDB for persistent storage and FAISS for efficient querying.
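The adaptive chunking step can be illustrated with a minimal sketch. The per-type parameters below are hypothetical placeholders; the project's real logic and values live in processing/chunker.py and are configurable via .env:

```python
# Hypothetical per-type chunking parameters (the actual values live in
# processing/chunker.py and .env).
CHUNK_PARAMS = {
    "pdf": {"size": 1000, "overlap": 200},
    "markdown": {"size": 800, "overlap": 150},
    "csv": {"size": 500, "overlap": 0},
}
DEFAULT_PARAMS = {"size": 1000, "overlap": 200}

def chunk_text(text: str, doc_type: str) -> list[str]:
    """Split text into overlapping character windows sized per document type."""
    params = CHUNK_PARAMS.get(doc_type, DEFAULT_PARAMS)
    size, overlap = params["size"], params["overlap"]
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]
```

Overlap between consecutive chunks preserves context that would otherwise be cut at a chunk boundary, at the cost of some index redundancy.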
Query Handling
- Checks Redis cache for existing responses.
- Performs vector search to retrieve relevant chunks (configurable top-K).
- Augments query with retrieved context and passes to Groq LLM (e.g., llama-3.1-70b-versatile).
- Streams response in real-time via Streamlit.
- Caches new responses for future use.
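A detail worth noting in the caching step: the cache key should encode not just the query but every setting that affects the answer, so that changing top-K or the model never returns a stale hit. A sketch of one way to derive such a key (illustrative only; the project's actual scheme lives in redis_backend/cache.py):

```python
import hashlib
import json

def cache_key(query: str, top_k: int, model: str) -> str:
    """Build a deterministic cache key from the normalized query plus the
    retrieval settings that influence the response."""
    payload = json.dumps(
        {"q": query.strip().lower(), "k": top_k, "m": model},
        sort_keys=True,
    )
    return "rag:answer:" + hashlib.sha256(payload.encode()).hexdigest()
```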
User Interface
- Streamlit-based UI for document upload, querying, and management.
- Sidebar for file handling (upload, view, delete).
- Chat interface with history persistence.
- Settings for retrieval count, temperature, and caching toggle.
Setup Instructions
Prerequisites
- Python: 3.8+.
- Redis: Server installed and running.
- Groq API Key: Obtain from Groq Console.
Cloning the Repository
```
git clone https://github.com/ammarlouah/rag_project.git
cd rag_project
```
Installation
- Create and activate a virtual environment:
```
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
```
- Install dependencies:
```
pip install -r requirements.txt
```
Install and Start Redis
- Ubuntu/Debian:
```
sudo apt-get update
sudo apt-get install redis-server
sudo systemctl start redis-server
sudo systemctl enable redis-server
```
- macOS:
```
brew install redis
brew services start redis
```
- Windows: Download from Redis for Windows.
- Verify: `redis-cli ping` should return "PONG".
Configure Environment
```
cp .env.example .env
# Edit .env to add GROQ_API_KEY and other variables
```
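For reference, a hypothetical .env might look like the following. Only GROQ_API_KEY is confirmed above; the other variable names are illustrative guesses based on the settings the project exposes (model, chunk size/overlap), so check .env.example for the actual keys:

```
GROQ_API_KEY=your_groq_api_key_here
# Hypothetical names -- verify against .env.example:
GROQ_MODEL=llama-3.1-70b-versatile
CHUNK_SIZE=1000
CHUNK_OVERLAP=200
```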
Running the Application
```
streamlit run app.py
```
Access at http://localhost:8501.
Implementation Details
- LLM Integration: Uses Groq API with models like llama-3.1-70b-versatile; configurable in .env.
- Embeddings: HuggingFace model for 384-dim vectors; alternatives like all-mpnet-base-v2 available.
- Chunking: Default size 1000 chars with 200 overlap; adjustable via .env.
- Caching: Redis stores query-response pairs; toggleable in UI.
- Source Attribution: Automatically includes page numbers and metadata in responses.
- Extensibility: Easy to add new file loaders by registering in DocumentProcessor.
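The loader-registration pattern can be sketched as follows. This is a hypothetical registry showing the general extension point; the project's actual registration happens inside document_processor.py, so treat names like `register_loader` as illustrative:

```python
from pathlib import Path

# Hypothetical registry mapping file extensions to loader functions.
LOADERS = {}

def register_loader(extension):
    """Decorator that registers a loader for a given file extension."""
    def wrap(fn):
        LOADERS[extension] = fn
        return fn
    return wrap

@register_loader(".txt")
def load_txt(path):
    """Minimal plain-text loader."""
    with open(path, encoding="utf-8") as f:
        return f.read()

def load_document(path):
    """Dispatch to the loader registered for the file's extension."""
    ext = Path(path).suffix.lower()
    if ext not in LOADERS:
        raise ValueError(f"No loader registered for {ext}")
    return LOADERS[ext](path)
```

Adding a new format then amounts to writing one loader function in processing/file_loaders/ and registering it, without touching the rest of the pipeline.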
Troubleshooting
- Redis Connection: Ensure the server is running; check with `redis-cli ping`.
- Groq API: Verify your key and rate limits in the Groq Console; try alternative models.
- Import Errors: Reinstall dependencies with `pip install --upgrade -r requirements.txt`.
- ChromaDB Issues: Delete the `chroma_db` folder and restart.
- Adding Loaders: Implement in `processing/file_loaders/` and register in `document_processor.py`.
License
The project is released under the MIT License.
Contributing
Contributions are encouraged! Potential enhancements include multi-modal support, advanced memory, query rewriting, hybrid search, and summarization. Fork the repo, create a branch, and submit a Pull Request.
Contact
For questions or feedback, reach out via GitHub issues or email at ammarlouah9@gmail.com.
Explore the full project and try it out at ammarlouah/rag_project!
Last updated: December 21, 2025