Deploying with Docker
This document provides instructions for building a self-contained Docker image for Youtu HiChunk that includes all model weights and dependencies.
Requirements:
- Docker installed on your system
- Sufficient disk space (~10GB for the image)
- NVIDIA GPU with CUDA 12.x support (for running the container)
Setting Up the Build Directory
First, create a directory for building the Docker image and download the model weights:
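Below is a minimal sketch of these steps, assuming the build directory is named hichunk-docker (the name used later in this guide) and that the weights are fetched with huggingface-cli; the repository id is a placeholder and should be replaced with the actual Youtu HiChunk weights location.

```bash
# Create the build directory used throughout this guide
mkdir -p hichunk-docker && cd hichunk-docker

# Download the model weights into a local Youtu-HiChunk directory.
# The repository id below is a placeholder -- substitute the real weights repo.
huggingface-cli download <org>/<youtu-hichunk-weights> --local-dir ./Youtu-HiChunk
```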
Creating the Custom vLLM Model Files
Youtu HiChunk requires custom model files to be registered with vLLM. Create the following files in your build directory.
utu_v1.py
Create a file named utu_v1.py with the Youtu HiChunk model implementation. You can copy this file from the downloaded Youtu-HiChunk directory or from the local deployment guide.
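For example, assuming the file is available alongside your download (the source path below is illustrative):

```bash
# Adjust the source path to wherever utu_v1.py lives in your download
cp /path/to/Youtu-HiChunk/utu_v1.py ./utu_v1.py
```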
registry.py
Create a file named registry.py containing the updated vLLM model registry. You can copy this file from the downloaded Youtu-HiChunk directory or from the local deployment guide.
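As with utu_v1.py, the source path below is illustrative:

```bash
# Adjust the source path to wherever registry.py lives in your download
cp /path/to/Youtu-HiChunk/registry.py ./registry.py
```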
Creating the Dockerfile
Create a file named Dockerfile with the following content:
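The authoritative contents are in the local deployment guide; the sketch below only illustrates the general shape, assuming the official vllm/vllm-openai base image (whose entrypoint launches the OpenAI-compatible server), the weights directory produced above, and example server flags. The base image tag, paths, and flags are assumptions to adapt to your setup.

```dockerfile
# Illustrative sketch only; base image tag, paths, and flags are assumptions.
FROM vllm/vllm-openai:latest

# Bake the downloaded model weights into the image
COPY Youtu-HiChunk /models/Youtu-HiChunk

# Overwrite vLLM's model files with the custom Youtu HiChunk versions.
# The install location is resolved at build time so the base image's
# Python version does not have to be hard-coded.
COPY utu_v1.py registry.py /tmp/
RUN MODELS_DIR=$(python3 -c "import os, vllm.model_executor.models as m; print(os.path.dirname(m.__file__))") && \
    cp /tmp/utu_v1.py "$MODELS_DIR/utu_v1.py" && \
    cp /tmp/registry.py "$MODELS_DIR/registry.py"

EXPOSE 8501

# Arguments passed to the base image's entrypoint (the OpenAI-compatible server)
CMD ["--model", "/models/Youtu-HiChunk", "--host", "0.0.0.0", "--port", "8501"]
```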
Building the Docker Image
Build the Docker image with the following command:
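For example (the image name youtu-hichunk is an arbitrary choice):

```bash
# Run from inside the hichunk-docker directory
docker build -t youtu-hichunk:latest .
```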
This process may take several minutes as it downloads the base image and copies the model weights.
Running the Docker Container
Run the container with GPU support:
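A typical invocation, assuming the image name from the build step; --ipc=host is commonly recommended for vLLM workloads so the server has enough shared memory:

```bash
docker run --gpus all --ipc=host -p 8501:8501 --name youtu-hichunk youtu-hichunk:latest
```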
You can also customize the server parameters by overriding the CMD:
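If the Dockerfile follows the sketch above, the CMD holds only the server arguments, so anything appended after the image name replaces them. The extra flags below are standard vLLM server options shown purely as examples:

```bash
docker run --gpus all --ipc=host -p 8501:8501 youtu-hichunk:latest \
    --model /models/Youtu-HiChunk --host 0.0.0.0 --port 8501 \
    --max-model-len 8192 --gpu-memory-utilization 0.90
```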
Pushing to a Container Registry
To deploy on a remote machine, push the image to a container registry:
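For example, using a placeholder registry and namespace:

```bash
docker tag youtu-hichunk:latest <registry>/<namespace>/youtu-hichunk:latest
docker push <registry>/<namespace>/youtu-hichunk:latest
```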
Running on a Remote Machine
On the remote machine, pull and run the image:
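Using the same placeholder registry path as above:

```bash
docker pull <registry>/<namespace>/youtu-hichunk:latest
docker run --gpus all --ipc=host -p 8501:8501 <registry>/<namespace>/youtu-hichunk:latest
```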
The Youtu HiChunk service will be available at http://<remote-machine-ip>:8501.
Final Directory Structure
Before building, your hichunk-docker directory should have the following structure:
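```
hichunk-docker/
├── Dockerfile
├── utu_v1.py
├── registry.py
└── Youtu-HiChunk/        # downloaded model weights
```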
