LanceDB

LanceDB is a multimodal lakehouse for AI, built on top of Lance, an open-source lakehouse format. Below, we list a few ways LanceDB can help you build and scale your AI and ML workloads.

Massively scalable, fast and high-quality retrieval − without breaking the bank

Use LanceDB as the data + retrieval layer for production AI workloads: RAG, agents, semantic search, recommendation systems, and more. Keep multimodal data, metadata, and embeddings in the same table and query them via vector search, full-text search or SQL. Easily add new features (columns in your tables) as your application evolves, without copying existing data.

High-performance random access and data management for model training

Use LanceDB to curate, explore and distribute very large multimodal datasets for training and fine-tuning models. LanceDB comes with built-in table versioning, schema evolution, and fast random access, making it practical to do dataset slicing, sampling, exploratory analysis and shuffles on large, evolving corpora.

LanceDB is designed for a variety of workloads and deployment scenarios, and supports use cases that are way beyond traditional vector search. The LanceDB suite includes three products, all built on top of the same open-source Lance format and table abstractions.

Use cases

Embedding pipelines: Add new columns (features), create embeddings, and transform your data at scale. LanceDB lets you extend tables both vertically and horizontally with minimal I/O overhead.
Search: Build high-performance search and retrieval applications using LanceDB’s optimized storage, including vector search, full-text search, and hybrid search with secondary indexes.
Training: Efficiently access and manage large-scale multimodal datasets for training and fine-tuning AI models.
Exploratory Data Analysis: Analyze and search through petabyte-scale multimodal datasets, including video and point cloud data, to gain insights and inform model development.

Choose how you run LanceDB

Depending on your needs, you can choose one of three ways to run LanceDB.

LanceDB OSS

The fastest way to get started is the open-source embedded library, with client SDKs in Python, TypeScript and Rust. Run it locally during development, then use the same data model and APIs as you scale up and need a managed solution. Start here:

Quickstart

Get started with LanceDB in minutes.

Basic Table Operations

Create tables, search vectors, and modify data in LanceDB.

LanceDB Enterprise

LanceDB Enterprise is a distributed and managed multimodal lakehouse built for search, exploratory data analysis, feature engineering, and training-oriented data access workflows on top of the same core table abstraction. This eliminates the need for teams to build bespoke infrastructure to manage petabyte-scale multimodal datasets. To get started, reach out at contact@lancedb.com.

Built with scale, performance, and security in mind.LanceDB Enterprise is designed for very large-scale, high-performance, distributed workloads in private deployments, and can operate under strict security requirements.

LanceDB Cloud

LanceDB Cloud is a serverless, managed service for users who are more focused on search use cases. You can easily create and manage projects in the Cloud UI, and integrate via REST API or client SDKs (Python, TypeScript, Rust).

Get started

User Guide

Feature Engineering (Geneva)

Support

Use cases

Choose how you run LanceDB

LanceDB OSS

Quickstart

Basic Table Operations

LanceDB Enterprise

LanceDB Cloud

Serverless vector search

Get started

User Guide

Feature Engineering (Geneva)

Support

​Use cases

​Choose how you run LanceDB

​LanceDB OSS

Quickstart

Basic Table Operations

​LanceDB Enterprise

​LanceDB Cloud

Serverless vector search

Use cases

Choose how you run LanceDB

LanceDB OSS

LanceDB Enterprise

LanceDB Cloud