Category: Python

  • A Simple Implementation of the ‘Talk to Your Document’ Principle with AI using OpenAI & Python

    AI is changing how we work with information. With “talk to your documents,” files like PDFs or spreadsheets are no longer static — you can ask them questions directly and get instant, meaningful answers, instead of wasting time searching or scrolling.

    Why “Talk to Your Document”?

    • Save time: No more scanning through 50-page reports to find a single answer.
    • Make knowledge accessible: Anyone in your team can query technical or legal documents without being an expert.
    • Improve productivity: Teams spend less time searching and more time acting.

    From HR policies and financial reports to research papers and contracts, the principle applies everywhere.

    How It Works

    At a high level, the process is straightforward:

    1. Upload Documents – PDFs, Word files, or spreadsheets are ingested into the system.
    2. Chunking – The text is split into manageable sections (e.g., 500–1000 characters) so that AI can handle context effectively.
    3. Embedding & Indexing – Each chunk is converted into a vector (a numerical representation of meaning) using tools like SentenceTransformers. These vectors are stored in a search index such as FAISS.
    4. User Query – When a user asks a question, the query is also converted into a vector and matched with the most relevant chunks.
    5. AI Response – A language model uses the retrieved chunks to generate an accurate and conversational answer.

    Simple Retrieval Example in Python

    import fitz  # PyMuPDF for PDFs
    from sentence_transformers import SentenceTransformer
    import faiss
    
    # Load PDF and extract text
    doc = fitz.open("document.pdf")
    text = " ".join([page.get_text() for page in doc])
    
    # Split into chunks
    chunks = [text[i:i+500] for i in range(0, len(text), 500)]
    
    # Create embeddings
    model = SentenceTransformer("all-MiniLM-L6-v2")
    embeddings = model.encode(chunks)
    
    # Build FAISS index
    dimension = embeddings.shape[1]
    index = faiss.IndexFlatL2(dimension)
    index.add(embeddings)
    
    # Embed the query and retrieve the single closest chunk
    query = "What are the key benefits in this document?"
    query_embedding = model.encode([query])
    D, I = index.search(query_embedding, k=1)
    
    # No LLM yet: the "answer" is simply the most relevant chunk
    print("Answer:", chunks[I[0][0]])
    

    This script is greatly simplified (it retrieves the best-matching chunk rather than generating a new answer), but it demonstrates the talk-to-your-document principle:

    • Load a file
    • Create vector representations
    • Retrieve the most relevant text chunk
    • Return the answer

    We can build a more complete implementation with a short Python script using:

    • LangChain (to load, split, and search documents)
    • FAISS (for efficient vector storage & retrieval)
    • OpenAI GPT models (to generate conversational answers)

    How It Works

    1. Load the document (PDF or TXT).
    2. Split the text into chunks to make it manageable for embeddings.
    3. Generate embeddings using OpenAI models.
    4. Store embeddings in FAISS, a fast vector database.
    5. Retrieve the most relevant chunks based on the user’s question.
    6. Pass the context + question to an LLM (e.g., GPT-4o-mini).
    7. Get a natural language answer as if the document were talking back.

    Project Setup

    First, install the dependencies (pypdf backs LangChain's PyPDFLoader; sentence-transformers and PyMuPDF are only needed for the minimal example above):

    pip install langchain langchain-community langchain-openai langchain-text-splitters faiss-cpu python-dotenv pypdf sentence-transformers PyMuPDF

    And create a .env file for your API key and model:

    OPENAI_API_KEY=your_openai_api_key_here
    OPENAI_MODEL=gpt-4o-mini

    Example Code

    Here’s a complete script you can try:

    # talk_to_doc.py
    
    import os
    from dotenv import load_dotenv
    from langchain_text_splitters import RecursiveCharacterTextSplitter
    from langchain_community.document_loaders import PyPDFLoader, TextLoader
    from langchain_community.vectorstores import FAISS
    from langchain_openai import OpenAIEmbeddings, ChatOpenAI
    
    # Load environment variables
    load_dotenv()
    OPENAI_MODEL = os.getenv("OPENAI_MODEL", "gpt-4o-mini")
    
    # ---- Load Document ----
    def load_document(path: str):
        if path.endswith(".pdf"):
            loader = PyPDFLoader(path)
        else:
            loader = TextLoader(path, encoding="utf-8")
        return loader.load()
    
    # ---- Build Vector Store ----
    def build_vectorstore(docs):
        splitter = RecursiveCharacterTextSplitter(chunk_size=800, chunk_overlap=100)
        chunks = splitter.split_documents(docs)
        embeddings = OpenAIEmbeddings()
        return FAISS.from_documents(chunks, embeddings)
    
    # ---- Ask questions ----
    def query_vectorstore(vectorstore, question: str):
        retriever = vectorstore.as_retriever(search_type="similarity", search_kwargs={"k": 3})
        llm = ChatOpenAI(model=OPENAI_MODEL, temperature=0)
        # Retrieve the top-k chunks and pass them to the model as context
        docs = retriever.invoke(question)
        context = "\n\n".join([d.page_content for d in docs])
        prompt = f"Answer the question based only on the context:\n\n{context}\n\nQuestion: {question}"
        return llm.invoke(prompt).content
    
    if __name__ == "__main__":
        # Example usage
        path = "docs/example.pdf"  # or "example.txt"
        docs = load_document(path)
        vectorstore = build_vectorstore(docs)
    
        print("Chat with your document! (type 'exit' to quit)")
        while True:
            q = input("\nYour question: ")
            if q.lower() in ["exit", "quit"]:
                break
            answer = query_vectorstore(vectorstore, q)
            print(f"\nAI: {answer}")

    Example Run

    Suppose you point the script at a PDF (in the run below, a paper about the Docling document conversion tool):

    python talk_to_doc.py
    Chat with your document! (type 'exit' to quit)
    
    Your question: What is this document about?
    AI: The document discusses the development and features of an open-source document conversion tool called Docling, which focuses on ensuring that documents are free to use. It highlights the sources of data used for the tool, the challenges of annotating scanned documents, and the preparation work involved in using a cloud-native platform for visual annotation. Additionally, it mentions the gap in the market for open-source tools compared to commercial software for document understanding and conversion, emphasizing the capabilities of Docling in layout analysis and table structure recognition.

    Conclusion

    The “talk to your document” principle is about making information conversational and accessible. With just a few open-source libraries and a language model, you can transform static files into interactive knowledge companions.

  • Refactoring a Monolithic Django Application – Before/After and Performance Gains

    Refactoring a monolithic Django application can significantly improve maintainability, scalability, and performance. This article explores the before and after of such a refactor, the strategies used, and the measurable gains in performance.

    Why Refactor a Monolithic Django App?

    • Maintainability: As the codebase grows, a monolith can become difficult to maintain.
    • Performance: Tight coupling between modules may lead to slow responses and high memory usage.
    • Scalability: Monolithic apps are harder to scale horizontally compared to microservices.
    • Agility: Introducing new features is slower due to interdependencies.

    Common Challenges in Monolithic Django Applications

    • Tightly Coupled Code: Models, views, and templates are heavily interdependent.
    • Single Database Bottleneck: All modules access the same database schema, leading to contention.
    • Long Build and Deployment Times: Even minor changes require redeploying the entire application.
    • Testing Difficulties: Running tests can be slow and complex due to the large codebase.

    Refactoring Strategy

    Modularization

    • Split the monolith into reusable Django apps with clearly defined responsibilities.
    • Example: separate users, orders, and products apps, as sketched below.
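
    As a minimal sketch (app and model names here are hypothetical), each domain becomes its own Django app, and apps reference one another only through foreign keys rather than by importing internals:

    # settings.py: one app per domain
    INSTALLED_APPS = [
        "django.contrib.admin",
        "django.contrib.auth",
        "django.contrib.contenttypes",
        "users",     # accounts and profiles
        "products",  # catalog and pricing
        "orders",    # carts, checkout, fulfillment
    ]

    # orders/models.py: cross-domain references go through foreign keys
    from django.conf import settings
    from django.db import models

    class Order(models.Model):
        customer = models.ForeignKey(settings.AUTH_USER_MODEL, on_delete=models.PROTECT)
        created_at = models.DateTimeField(auto_now_add=True)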

    Decouple Services

    • Move non-critical or resource-intensive features into separate services or microservices.
    • Use Django REST Framework (DRF) to expose APIs for inter-service communication, as in the sketch below.
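
    For instance, a reporting feature can sit behind a small DRF API so other services talk to it over HTTP instead of importing its code (the Report model and its fields are hypothetical):

    # reports/serializers.py
    from rest_framework import serializers
    from .models import Report

    class ReportSerializer(serializers.ModelSerializer):
        class Meta:
            model = Report
            fields = ["id", "title", "created_at", "status"]

    # reports/views.py
    from rest_framework import viewsets
    from .models import Report
    from .serializers import ReportSerializer

    class ReportViewSet(viewsets.ReadOnlyModelViewSet):
        queryset = Report.objects.all()
        serializer_class = ReportSerializer

    # reports/urls.py
    from rest_framework.routers import DefaultRouter
    from .views import ReportViewSet

    router = DefaultRouter()
    router.register("reports", ReportViewSet)
    urlpatterns = router.urls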

    Optimize Database Access

    • Use the Django ORM efficiently: reduce N+1 queries with select_related and prefetch_related (see the sketch after this list).
    • Introduce caching for frequently accessed data with Redis or Memcached.
    • Consider read replicas for high-traffic tables.
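
    A minimal sketch of the N+1 fix plus a simple cache layer (the Order and Product models are hypothetical, and a Redis or Memcached backend is assumed to be configured as Django's default cache):

    from django.core.cache import cache
    from orders.models import Order      # hypothetical apps from the
    from products.models import Product  # modularization step above

    # Before: Order.objects.all() issues one extra query per order
    # when accessing order.customer (the classic N+1 pattern).

    # After: a single JOIN pulls customers in with the orders
    orders = Order.objects.select_related("customer")

    # Reverse and many-to-many relations use prefetch_related instead
    products = Product.objects.prefetch_related("categories")

    # Cache a frequently read, rarely changed queryset for 5 minutes
    def get_active_products():
        result = cache.get("active_products")
        if result is None:
            result = list(Product.objects.filter(active=True))
            cache.set("active_products", result, timeout=300)
        return result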

    Asynchronous Tasks

    • Offload heavy operations to background tasks using Celery or Django-Q; a minimal Celery sketch follows this list.
    • Examples: sending emails, processing images, generating reports.
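
    A minimal Celery sketch, assuming a broker such as Redis is configured in settings and using a hypothetical send_report_email task:

    # myproject/celery.py: standard Celery bootstrap for Django
    import os
    from celery import Celery

    os.environ.setdefault("DJANGO_SETTINGS_MODULE", "myproject.settings")
    app = Celery("myproject")
    app.config_from_object("django.conf:settings", namespace="CELERY")
    app.autodiscover_tasks()

    # reports/tasks.py: the heavy work runs in a worker process
    from celery import shared_task
    from django.core.mail import send_mail

    @shared_task
    def send_report_email(recipient, report_id):
        send_mail(
            subject="Your report is ready",
            message=f"Report {report_id} has been generated.",
            from_email="noreply@example.com",
            recipient_list=[recipient],
        )

    # In a view: enqueue and return immediately instead of blocking
    # send_report_email.delay("user@example.com", report.id)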

    Frontend Optimization

    • Minimize server-side rendering for static content.
    • Use client-side frameworks such as React for interactive components.

    Before/After Comparison

    Aspect                       Before             After
    Response Time                Avg. 1.2s          Avg. 0.5s
    Database Queries per Page    45                 12
    CPU Usage                    High under load    Moderate
    Deployment Time              15 min             4 min
    Test Suite Duration          45 min             15 min

    Lessons Learned

    • Incremental Refactoring: Avoid a complete rewrite. Refactor in stages to reduce risk.
    • Monitoring is Key: Use metrics (CPU, memory, response time) to measure performance gains.
    • Automated Testing: Ensure all refactored components are thoroughly tested.
    • Team Collaboration: Maintain clear documentation and consistent coding standards.
    • Use Modern Django Features: Leverage async views, QuerySet optimizations, and built-in caching mechanisms (a minimal async view sketch follows).
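
    As a closing sketch, an async view (supported since Django 3.1) can fan out slow I/O concurrently; the upstream URLs are placeholders and httpx is an assumed extra dependency:

    import asyncio
    import httpx
    from django.http import JsonResponse

    async def dashboard(request):
        # Fetch two slow upstream APIs concurrently instead of serially
        async with httpx.AsyncClient() as client:
            stats, alerts = await asyncio.gather(
                client.get("https://metrics.example.com/stats"),
                client.get("https://metrics.example.com/alerts"),
            )
        return JsonResponse({"stats": stats.json(), "alerts": alerts.json()})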