RAG Document Assistant
This project is a production-ready Retrieval-Augmented Generation (RAG) application designed for intelligent document querying and analysis. Available on GitHub at ammarlouah/rag_project, it supports multiple document formats, hybrid vector stores, caching, and real-time streaming responses using LangChain, Groq API, ChromaDB, FAISS, Redis, and Streamlit. Below, I detail the project’s features, architecture, setup instructions, and key implementation insights.
Project Overview
The RAG Document Assistant enables users to upload documents in various formats, index them efficiently, and query them using natural language. It combines retrieval from vector stores with generation from a large language model, providing accurate, context-aware responses with source attributions.
Key features include:
- Multi-format support: PDF, DOCX, TXT, Markdown, CSV.
- Hybrid vector store: ChromaDB for persistence and FAISS for fast similarity search.
- Intelligent caching with Redis for faster repeated queries.
- Streaming responses from Groq LLM for real-time interaction.
- Source citation with page numbers.
- Adaptive chunking based on document type.
- Persistent chat history and document management via Streamlit UI.
The architecture follows: User Query → Cache Check → Vector Search (FAISS/ChromaDB) → LLM (Groq) → Streaming Response → Cache Store → UI Display.
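The pipeline above can be sketched as a single function. This is an illustrative simplification with hypothetical stand-ins for the cache, search, and LLM components, not the project's actual code (which lives in rag/chain.py and related modules):

```python
def answer(query, cache, search, llm, top_k=4):
    """Illustrative RAG flow: Cache Check -> Vector Search -> LLM -> Cache Store.

    `cache` is any mapping (Redis in the app), `search` returns the top-K
    relevant chunks, and `llm` generates a response from the augmented prompt.
    """
    if query in cache:                      # Cache Check
        return cache[query]
    chunks = search(query, top_k)           # Vector Search (FAISS/ChromaDB)
    context = "\n\n".join(c["text"] for c in chunks)
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}"
    response = llm(prompt)                  # LLM (Groq)
    cache[query] = response                 # Cache Store
    return response
```

In the real application the response is streamed token by token to the UI rather than returned whole, but the control flow is the same.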
Video Demonstration
To illustrate the application’s capabilities, here’s a video walkthrough:
RAG Document Assistant Demo
Repository Structure
The GitHub repository is organized as follows:
- app.py: Main Streamlit application entry point.
- requirements.txt: List of Python dependencies.
- .env.example: Template for environment variables.
- rag/:
  - groq_llm.py: Groq LLM integration.
  - embeddings.py: Embedding model setup (HuggingFace).
  - vector_store.py: Hybrid vector store management (ChromaDB + FAISS).
  - chain.py: RAG chain configuration.
- processing/:
  - document_processor.py: Handles document ingestion and processing.
  - chunker.py: Adaptive text chunking logic.
  - file_loaders/*.py: Loaders for different file formats (PDF, DOCX, etc.).
- redis_backend/:
  - redis_client.py: Redis connection utilities.
  - cache.py: Caching mechanisms for queries and responses.
- utils/:
  - helpers.py: Utility functions for the application.
Functionality
Document Processing and Indexing
- Supports loading and parsing multiple formats using PyMuPDF, python-docx, and pandas.
- Smart chunking adapts overlap and size based on document type for optimal retrieval.
- Embeddings generated via HuggingFace’s all-MiniLM-L6-v2 model.
- Indexed in hybrid stores: ChromaDB for persistent storage and FAISS for efficient querying.
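The adaptive chunking step can be illustrated with a minimal sketch. The per-type parameters below are hypothetical placeholders; the project's real logic and values live in processing/chunker.py and are configurable via .env:

```python
# Hypothetical per-type chunking parameters (the actual values live in
# processing/chunker.py and .env).
CHUNK_PARAMS = {
    "pdf": {"size": 1000, "overlap": 200},
    "markdown": {"size": 800, "overlap": 150},
    "csv": {"size": 500, "overlap": 0},
}
DEFAULT_PARAMS = {"size": 1000, "overlap": 200}

def chunk_text(text: str, doc_type: str) -> list[str]:
    """Split text into overlapping character windows sized per document type."""
    params = CHUNK_PARAMS.get(doc_type, DEFAULT_PARAMS)
    size, overlap = params["size"], params["overlap"]
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]
```

Overlap between consecutive chunks preserves context that would otherwise be cut at a chunk boundary, at the cost of some index redundancy.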
Query Handling
- Checks Redis cache for existing responses.
- Performs vector search to retrieve relevant chunks (configurable top-K).
- Augments query with retrieved context and passes to Groq LLM (e.g., llama-3.1-70b-versatile).
- Streams response in real-time via Streamlit.
- Caches new responses for future use.
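A detail worth noting in the caching step: the cache key should encode not just the query but every setting that affects the answer, so that changing top-K or the model never returns a stale hit. A sketch of one way to derive such a key (illustrative only; the project's actual scheme lives in redis_backend/cache.py):

```python
import hashlib
import json

def cache_key(query: str, top_k: int, model: str) -> str:
    """Build a deterministic cache key from the normalized query plus the
    retrieval settings that influence the response."""
    payload = json.dumps(
        {"q": query.strip().lower(), "k": top_k, "m": model},
        sort_keys=True,
    )
    return "rag:answer:" + hashlib.sha256(payload.encode()).hexdigest()
```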
User Interface
- Streamlit-based UI for document upload, querying, and management.
- Sidebar for file handling (upload, view, delete).
- Chat interface with history persistence.
- Settings for retrieval count, temperature, and caching toggle.
Setup Instructions
Prerequisites
- Python: 3.8+.
- Redis: Server installed and running.
- Groq API Key: Obtain from Groq Console.
Cloning the Repository
```
git clone https://github.com/ammarlouah/rag_project.git
cd rag_project
```
Installation
- Create and activate a virtual environment:
```
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
```
- Install dependencies:
```
pip install -r requirements.txt
```
Install and Start Redis
- Ubuntu/Debian:
```
sudo apt-get update
sudo apt-get install redis-server
sudo systemctl start redis-server
sudo systemctl enable redis-server
```
- macOS:
```
brew install redis
brew services start redis
```
- Windows: Download from Redis for Windows.
- Verify: `redis-cli ping` should return "PONG".
Configure Environment
```
cp .env.example .env
# Edit .env to add GROQ_API_KEY and other variables
```
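For reference, a hypothetical .env might look like the following. Only GROQ_API_KEY is confirmed above; the other variable names are illustrative guesses based on the settings the project exposes (model, chunk size/overlap), so check .env.example for the actual keys:

```
GROQ_API_KEY=your_groq_api_key_here
# Hypothetical names -- verify against .env.example:
GROQ_MODEL=llama-3.1-70b-versatile
CHUNK_SIZE=1000
CHUNK_OVERLAP=200
```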
Running the Application
```
streamlit run app.py
```
Access at http://localhost:8501.
Implementation Details
- LLM Integration: Uses Groq API with models like llama-3.1-70b-versatile; configurable in .env.
- Embeddings: HuggingFace model for 384-dim vectors; alternatives like all-mpnet-base-v2 available.
- Chunking: Default size 1000 chars with 200 overlap; adjustable via .env.
- Caching: Redis stores query-response pairs; toggleable in UI.
- Source Attribution: Automatically includes page numbers and metadata in responses.
- Extensibility: Easy to add new file loaders by registering in DocumentProcessor.
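The loader-registration pattern can be sketched as follows. This is a hypothetical registry showing the general extension point; the project's actual registration happens inside document_processor.py, so treat names like `register_loader` as illustrative:

```python
from pathlib import Path

# Hypothetical registry mapping file extensions to loader functions.
LOADERS = {}

def register_loader(extension):
    """Decorator that registers a loader for a given file extension."""
    def wrap(fn):
        LOADERS[extension] = fn
        return fn
    return wrap

@register_loader(".txt")
def load_txt(path):
    """Minimal plain-text loader."""
    with open(path, encoding="utf-8") as f:
        return f.read()

def load_document(path):
    """Dispatch to the loader registered for the file's extension."""
    ext = Path(path).suffix.lower()
    if ext not in LOADERS:
        raise ValueError(f"No loader registered for {ext}")
    return LOADERS[ext](path)
```

Adding a new format then amounts to writing one loader function in processing/file_loaders/ and registering it, without touching the rest of the pipeline.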
Troubleshooting
- Redis Connection: Ensure the server is running; check with `redis-cli ping`.
- Groq API: Verify your key and rate limits in the Groq Console; try alternative models.
- Import Errors: Reinstall dependencies with `pip install --upgrade -r requirements.txt`.
- ChromaDB Issues: Delete the `chroma_db` folder and restart.
- Adding Loaders: Implement in `processing/file_loaders/` and register in `document_processor.py`.
License
The project is released under the MIT License.
Contributing
Contributions are encouraged! Potential enhancements include multi-modal support, advanced memory, query rewriting, hybrid search, and summarization. Fork the repo, create a branch, and submit a Pull Request.
Contact
For questions or feedback, reach out via GitHub issues or email at ammarlouah9@gmail.com.
Explore the full project and try it out at ammarlouah/rag_project!
Last updated: December 21, 2025