DocBrief – Smart Document Assistant

  • Tech Stack: Python · Flask · LangChain · FAISS · Pinecone · Google Generative AI · LangSmith
  • GitHub URL: Project Link

Project Overview
DocBrief is a web application that lets users upload PDF documents, get concise AI-generated summaries, and ask an assistant questions about them. The assistant answers from the document's own content, using vector search (FAISS or Pinecone) to retrieve the passages most relevant to each question.

Core Features

  • Upload PDF files for processing
  • Automatic document splitting and embedding generation
  • Smart assistant for Q&A based on document content
  • Real-time summaries and insight extraction
  • Uses FAISS or Pinecone for efficient vector search
  • Monitoring and observability via LangSmith
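
The splitting and vector search steps above can be sketched in plain Python. This is a minimal illustration rather than DocBrief's actual code: the function names (split_text, embed, cosine, top_k) are hypothetical, and a toy bag-of-words vector stands in for the real embedding model and for the FAISS or Pinecone index.

```python
import math
import re
from collections import Counter


def split_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into overlapping character windows (hypothetical helper)."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
        start += step
    return chunks


def embed(text: str) -> Counter:
    """Toy embedding: word counts. A real app would call an embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(question: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the question (brute-force search)."""
    query = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(query, embed(c)), reverse=True)
    return ranked[:k]


if __name__ == "__main__":
    chunks = [
        "The candidate has five years of Python experience.",
        "Hobbies include hiking and chess.",
        "Education: BSc in physics.",
    ]
    print(top_k("What is the candidate's experience?", chunks, k=1))
```

The overlap between windows keeps a sentence that straddles a chunk boundary retrievable from at least one chunk. A vector index such as FAISS or Pinecone replaces the brute-force sort in top_k once the number of chunks grows.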

Example Use Cases

  • For CVs: "What is the candidate's experience?"
  • For research papers: "What is the main research question?"
  • For reports: "Summarize the key findings."
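
One way questions like these might reach the model is as a prompt that combines the question with the retrieved passages. The sketch below is an assumption about how such a prompt could be assembled; build_prompt and the prompt wording are hypothetical, not taken from the project.

```python
def build_prompt(question: str, context_chunks: list[str]) -> str:
    """Combine retrieved excerpts and a user question into one prompt string."""
    # Number each excerpt so the model (and a reader) can refer back to it.
    numbered = "\n\n".join(
        f"[{i}] {chunk.strip()}" for i, chunk in enumerate(context_chunks, start=1)
    )
    return (
        "Answer the question using only the excerpts below. "
        "If the excerpts do not contain the answer, say so.\n\n"
        f"Excerpts:\n{numbered}\n\n"
        f"Question: {question}\nAnswer:"
    )


if __name__ == "__main__":
    print(build_prompt("Summarize the key findings.", ["Revenue grew 12%. ", "Costs fell."]))
```

Restricting the model to the supplied excerpts is what keeps answers tied to the uploaded document rather than to the model's general knowledge.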

Technologies

  • Flask: Web interface
  • LangChain: Document parsing and language model integration
  • FAISS / Pinecone: Vector search for fast information retrieval
  • Google Generative AI: Document summarization
  • LangSmith: Application performance tracking and monitoring

Customization
You can swap the LLM (Google Generative AI) for OpenAI's GPT models or your own fine-tuned model by updating the summarization logic.
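
Swapping models is easiest when the summarization logic depends only on a "prompt in, text out" callable. The following is a sketch under that assumption; summarize, the LLM type alias, and fake_llm are hypothetical names, and a real backend would wrap the chosen provider's client in a function with the same shape.

```python
from typing import Callable

# Any function that takes a prompt string and returns the model's text reply.
LLM = Callable[[str], str]


def summarize(text: str, llm: LLM, max_chars: int = 4000) -> str:
    """Summarize text with whichever model callable is passed in."""
    # Truncate long documents so the prompt stays within a rough size budget.
    prompt = "Summarize the following document in a few sentences:\n\n" + text[:max_chars]
    return llm(prompt).strip()


def fake_llm(prompt: str) -> str:
    """Stand-in model for local testing; it does not call any external service."""
    return f"  (stub summary of a {len(prompt)}-character prompt)  "


if __name__ == "__main__":
    print(summarize("DocBrief lets users upload PDFs and ask questions.", fake_llm))
```

With this shape, moving from one provider to another means passing a different callable to summarize; the rest of the application does not change.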