Deploying with Docker
This document provides instructions for building a self-contained Docker image for Youtu Embedding that includes all model weights and dependencies.
Requirements:
- Docker installed on your system
- Sufficient disk space (~8GB for the image)
- NVIDIA GPU with CUDA support (optional, for GPU acceleration)
Setting Up the Build Directory
First, create a directory for building the Docker image and download the model weights:
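A minimal sketch of these steps, assuming the weights are published on the Hugging Face Hub (the `tencent/Youtu-Embedding` repo name and the `./model` target directory are assumptions — substitute the checkpoint you actually use):

```
mkdir youtu-embedding-docker && cd youtu-embedding-docker

# Download the model weights into ./model (repo id is an assumption)
huggingface-cli download tencent/Youtu-Embedding --local-dir ./model
```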
Creating the Embedding Server Script
Create a file named embedding_server.py with the server implementation. You can copy this from the local deployment guide or create it with the following content:
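If you do not have the local deployment guide at hand, the following sketch shows one possible shape of the server, assuming FastAPI, uvicorn, and sentence-transformers; the endpoint names and port match those documented below, but the request field names (`text`, `texts`, `instruction`) and the `MODEL_PATH` variable are assumptions, and the real script may differ:

```python
# embedding_server.py -- minimal sketch, not the official implementation
import os

from fastapi import FastAPI
from pydantic import BaseModel
from sentence_transformers import SentenceTransformer

# Checkpoint location inside the container; overridable via environment
MODEL_PATH = os.environ.get("MODEL_PATH", "/app/model")

app = FastAPI()
model = SentenceTransformer(MODEL_PATH)

class QueryRequest(BaseModel):
    text: str

class DocsRequest(BaseModel):
    texts: list[str]

class TextsRequest(BaseModel):
    texts: list[str]
    instruction: str = ""

@app.post("/embed_query")
def embed_query(req: QueryRequest):
    return {"embedding": model.encode(req.text).tolist()}

@app.post("/embed_docs")
def embed_docs(req: DocsRequest):
    return {"embeddings": model.encode(req.texts).tolist()}

@app.post("/embed")
def embed(req: DocsRequest):
    return {"embeddings": model.encode(req.texts).tolist()}

@app.post("/embed_texts")
def embed_texts(req: TextsRequest):
    # Prepend the caller-supplied instruction to each text before encoding
    prefixed = [req.instruction + t for t in req.texts]
    return {"embeddings": model.encode(prefixed).tolist()}

@app.get("/model_id")
def model_id():
    return {"model_id": MODEL_PATH}

@app.get("/health")
def health():
    return {"status": "ok"}

if __name__ == "__main__":
    import uvicorn
    uvicorn.run(app, host="0.0.0.0", port=8501)
```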
Creating the Dockerfile
Create a file named Dockerfile with the following content:
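A sketch of such a Dockerfile, assuming a CUDA-enabled PyTorch base image and the dependencies used by the server script (adjust image tags and packages to your environment):

```dockerfile
# Base image tag is an assumption; pick one matching your CUDA version
FROM pytorch/pytorch:2.1.0-cuda12.1-cudnn8-runtime

WORKDIR /app

# Server dependencies (assumed set; match your embedding_server.py)
RUN pip install --no-cache-dir fastapi uvicorn sentence-transformers

# Bake the downloaded weights and server script into the image
COPY model /app/model
COPY embedding_server.py /app/

EXPOSE 8501
CMD ["python", "embedding_server.py"]
```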
Building the Docker Image
Build the Docker image with the following command:
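For example, run this from inside the build directory (the image name is a suggestion):

```
docker build -t youtu-embedding:latest .
```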
This process may take several minutes as it downloads the base image and copies the model weights.
Running the Docker Container
Run the container with GPU support:
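For example, assuming the NVIDIA Container Toolkit is installed on the host:

```
docker run -d --gpus all -p 8501:8501 --name youtu-embedding youtu-embedding:latest
```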
For CPU-only mode:
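Omitting the `--gpus` flag runs the same image without GPU access, assuming the server falls back to CPU when no GPU is visible:

```
docker run -d -p 8501:8501 --name youtu-embedding youtu-embedding:latest
```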
You can customize the server parameters:
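For example, you might map the service to a different host port, or point the server at a different checkpoint via an environment variable (the `MODEL_PATH` variable name is hypothetical — use whatever your `embedding_server.py` actually reads):

```
docker run -d --gpus all -p 9000:8501 \
  -e MODEL_PATH=/app/model \
  youtu-embedding:latest
```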
Pushing to a Container Registry
To deploy on a remote machine, push the image to a container registry:
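For example, tag the image for your registry and push it (`registry.example.com/youruser` is a placeholder for your registry host and namespace):

```
docker tag youtu-embedding:latest registry.example.com/youruser/youtu-embedding:latest
docker push registry.example.com/youruser/youtu-embedding:latest
```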
Running on a Remote Machine
On the remote machine, pull and run the image:
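For example, using the same placeholder registry path as above:

```
docker pull registry.example.com/youruser/youtu-embedding:latest
docker run -d --gpus all -p 8501:8501 registry.example.com/youruser/youtu-embedding:latest
```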
The Youtu Embedding service will be available at http://<remote-machine-ip>:8501.
API Endpoints
Once running, the following endpoints are available:
| Endpoint | Method | Description |
|---|---|---|
| /embed_query | POST | Embed a single query |
| /embed_docs | POST | Embed multiple documents |
| /embed | POST | Generic embedding endpoint |
| /embed_texts | POST | Embed texts with a custom instruction |
| /model_id | GET | Get the model checkpoint path |
| /health | GET | Health check endpoint |
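A quick smoke test against a locally running container might look like this (the JSON field names are assumptions — match them to your `embedding_server.py`):

```
curl http://localhost:8501/health

curl -X POST http://localhost:8501/embed_query \
  -H "Content-Type: application/json" \
  -d '{"text": "What is dense retrieval?"}'
```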
Final Directory Structure
Before building, your youtu-embedding-docker directory should have the following structure:
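Assuming the files created in the steps above and a `model/` directory for the downloaded weights, the layout would be:

```
youtu-embedding-docker/
├── Dockerfile
├── embedding_server.py
└── model/            # downloaded model weights
```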
