Key Technologies
- LanceDB: Multimodal vector database for efficient storage and retrieval
- PydanticAI: Modern AI agent framework with type safety
- Sentence Transformers: Text embeddings for semantic search
- CLIP: Vision-language model for image understanding
- Streamlit: Interactive web application framework
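To make the "semantic search" idea behind these tools concrete, here is a minimal sketch in plain Python. It is not LanceDB's API or the tutorial's code: the three-dimensional vectors, the `RECIPES` rows, and the helpers `cosine_similarity` and `top_k` are all hypothetical stand-ins. In the real pipeline, the vectors would come from Sentence Transformers or CLIP, and the nearest-neighbor lookup would be handled by the vector database.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector has no direction, so treat it as dissimilar
    return dot / (norm_a * norm_b)


def top_k(query, rows, k=2):
    """Return the titles of the k rows whose vectors are closest to the query."""
    scored = [(cosine_similarity(query, row["vector"]), row["title"]) for row in rows]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [title for _, title in scored[:k]]


# Toy data (hypothetical): real embeddings have hundreds of dimensions.
RECIPES = [
    {"title": "Tomato soup", "vector": [0.9, 0.1, 0.0]},
    {"title": "Gazpacho", "vector": [0.8, 0.2, 0.1]},
    {"title": "Chocolate cake", "vector": [0.0, 0.1, 0.9]},
]

query_vector = [1.0, 0.0, 0.0]  # stand-in for an embedded user question
print(top_k(query_vector, RECIPES))
```

A brute-force scan like this is only practical for tiny datasets; a vector database exists to do the same ranking efficiently over many rows.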
Tutorial Overview
Option 1: Notebook
The notebook walks through each step: prepare a small sample recipe dataset, generate both text and image embeddings, store everything in LanceDB, and build a PydanticAI agent with custom tools to query it. You’ll finish by testing the agent against a few example questions to see the full multimodal flow end to end.
Option 2: Demo Application (Local Setup)
The demo application is the full codebase: you’ll download and process a real recipe dataset with thousands of items, run a Streamlit chat interface that supports image upload, and follow a structure that includes production-minded touches like error handling, logging, and monitoring. Everything you need to deploy is included.
Download Tutorial Files
Download the files for the full demo application here.
Dataset Information
- Source: Kaggle Recipe Dataset
- Size: Thousands of recipes with images
- Format: CSV file with recipe data and image references
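As an illustration of working with a CSV that holds recipe data and image references, the sketch below reads rows with Python's standard `csv` module and resolves each image reference to a path. The column names (`title`, `ingredients`, `image_name`), the `images` directory, and the `load_recipes` helper are assumptions made for this example; the actual Kaggle file may use a different schema, so check its header row first.

```python
import csv
import io
from pathlib import Path


def load_recipes(csv_file, image_dir):
    """Read recipe rows and attach an image path where a reference exists."""
    recipes = []
    for row in csv.DictReader(csv_file):
        # Column names below are hypothetical; adjust to the real header.
        image_name = (row.get("image_name") or "").strip()
        recipes.append(
            {
                "title": (row.get("title") or "").strip(),
                "ingredients": (row.get("ingredients") or "").strip(),
                # Rows without an image reference get None instead of a path.
                "image_path": Path(image_dir) / image_name if image_name else None,
            }
        )
    return recipes


# Inline sample standing in for the downloaded CSV file.
SAMPLE_CSV = """title,ingredients,image_name
Tomato soup,"tomatoes, onion, basil",tomato-soup.jpg
Plain rice,"rice, water",
"""

recipes = load_recipes(io.StringIO(SAMPLE_CSV), "images")
for recipe in recipes:
    print(recipe["title"], recipe["image_path"])
```

With a real file, the same function can be called as `load_recipes(open(path, newline="", encoding="utf-8"), image_dir)`. Keeping rows that lack an image, rather than dropping them, lets them still be indexed by text alone.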