Summary

Short tutorial building a website Q&A chatbot with LangChain in 4 steps: scrape a webpage with WebBaseLoader, embed and store content in an InMemoryVectorStore, retrieve relevant chunks via similarity search, then answer questions using ChatOpenAI (GPT-4o-mini).

Key Points

  • 4-step pipeline: (1) WebBaseLoader → fetch webpage; (2) InMemoryVectorStore + OpenAIEmbeddings → embed docs; (3) similarity_search(question, k=3) → get relevant chunks; (4) ChatOpenAI + system prompt with context → answer
  • InMemoryVectorStore: simple in-RAM vector store; data is lost when the process exits; for persistence, use an index that can be saved to disk (e.g. FAISS via save_local) or a database-backed store
  • Pattern is the classic RAG (Retrieval-Augmented Generation) pipeline in minimal form
  • Practical use case: the author built it to track travel deals on specific sites
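
The four steps above can be sketched in Python roughly as follows. This is a minimal sketch, not the author's code: the function names build_system_prompt and answer_question and the prompt wording are hypothetical, and it assumes the langchain-community, langchain-core, langchain-openai, and beautifulsoup4 packages are installed and OPENAI_API_KEY is set.

```python
from typing import List


def build_system_prompt(chunks: List[str]) -> str:
    """Join retrieved chunks into a system prompt (wording is an assumption)."""
    context = "\n\n".join(chunks)
    return (
        "Answer the question using only the context below. "
        "If the answer is not in the context, say you don't know.\n\n"
        f"Context:\n{context}"
    )


def answer_question(url: str, question: str, k: int = 3) -> str:
    """Hypothetical helper that runs the 4-step pipeline for one URL."""
    # Imported lazily so this module loads even without the LangChain packages.
    from langchain_community.document_loaders import WebBaseLoader
    from langchain_core.vectorstores import InMemoryVectorStore
    from langchain_openai import ChatOpenAI, OpenAIEmbeddings

    # (1) Fetch the page and parse it into Document objects.
    docs = WebBaseLoader(url).load()

    # (2) Embed the documents and keep the vectors in RAM.
    store = InMemoryVectorStore(embedding=OpenAIEmbeddings())
    store.add_documents(docs)

    # (3) Retrieve up to k documents most similar to the question.
    hits = store.similarity_search(question, k=k)

    # (4) Ask the chat model, passing the retrieved text as context.
    llm = ChatOpenAI(model="gpt-4o-mini")
    reply = llm.invoke(
        [
            ("system", build_system_prompt([d.page_content for d in hits])),
            ("human", question),
        ]
    )
    return reply.content
```

Calling answer_question("https://example.com/deals", "Which deals are under $500?") would run all four steps; the URL and question here are placeholders. Keeping the prompt construction in its own function makes that part easy to inspect without any network or API access.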

Insights

This is a minimal but complete RAG implementation: it shows the core 4-step pattern that underlies most RAG chatbots. The simplicity is pedagogically valuable because each step is an explicit call rather than something hidden inside a prebuilt chain. The InMemoryVectorStore is fine for prototyping; production use calls for persistent storage (a FAISS index saved to disk, Chroma, pgvector, etc.). The tutorial omits chunking (splitting large documents), which becomes necessary for longer pages. WebBaseLoader yields one Document per URL, so without a text splitter the retrieval step returns whole pages rather than focused passages, and a long page can exceed the embedding model's input limit. That is a real limitation for real-world use.
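
To illustrate the chunking step the tutorial leaves out, here is a minimal pure-Python sketch of fixed-size splitting with overlap. It is a stand-in for a real text splitter (LangChain ships its own in langchain-text-splitters), not the tutorial's method; the name split_text and the default sizes are assumptions chosen for illustration.

```python
from typing import List


def split_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> List[str]:
    """Split text into character windows of chunk_size that overlap by `overlap`."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far the window advances each time
    chunks: List[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


print(split_text("abcdefghij", chunk_size=4, overlap=1))
# ['abcd', 'defg', 'ghij']
```

Each chunk would then be embedded and stored as its own document, so similarity search can return the few passages relevant to a question instead of the entire page. The overlap keeps a sentence that straddles a boundary visible in both neighbouring chunks.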

Connections

Raw Excerpt

With only a few lines of code, I was able to build a very well functioning chatbot. Here’s the basic idea: grab content from a webpage, store the content in a searchable vector store, search for relevant documents based on a user’s question, and use OpenAI’s language model to answer the question.