Deploying Locally
This document describes how to deploy Youtu Parsing as a backend service for Youtu-RAG.
Requirements:
- Conda environment with Python 3.10
- CUDA version 12.x
- Linux x86_64 operating system
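The requirements above can be checked before installing anything. The following is a minimal sketch, not part of Youtu Parsing itself; the function names `check_requirements` and `cuda_major_is_12` are hypothetical helpers introduced for illustration. The CUDA helper only validates a version string that you supply (for example, one read from your CUDA toolkit or framework), since how CUDA is detected depends on your setup.

```python
import platform
import sys


def check_requirements(version_info=None, system=None, machine=None):
    """Return a list of problems; an empty list means every check passed.

    Arguments default to the values of the running interpreter and host,
    and can be overridden to test the logic on another machine.
    """
    version_info = sys.version_info if version_info is None else version_info
    system = platform.system() if system is None else system
    machine = platform.machine() if machine is None else machine

    problems = []
    if tuple(version_info[:2]) != (3, 10):
        problems.append(
            f"Python 3.10 expected, found {version_info[0]}.{version_info[1]}"
        )
    if system != "Linux":
        problems.append(f"Linux expected, found {system}")
    if machine != "x86_64":
        problems.append(f"x86_64 expected, found {machine}")
    return problems


def cuda_major_is_12(cuda_version):
    """Check a CUDA version string such as '12.1' against the 12.x requirement.

    The string must be supplied by the caller; this helper does not
    detect CUDA on its own.
    """
    if not cuda_version:
        return False
    return cuda_version.split(".")[0] == "12"


if __name__ == "__main__":
    for problem in check_requirements():
        print("Requirement not met:", problem)
```

Running the script prints one line per unmet requirement and prints nothing when the Python version, operating system, and architecture all match.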
Installing Youtu Parsing
First, create a conda environment and install the required dependencies:
Setting up Flash Attention
Flash Attention v2 is required for Youtu Parsing to run efficiently. Follow the instructions below to install it:
Note: Flash Attention installation is platform-specific. If you encounter issues, please refer to the official installation guide.
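Because the installation is platform-specific, it can help to confirm that the package is actually importable afterwards. The sketch below assumes Flash Attention was installed as the `flash-attn` distribution with the import name `flash_attn`; if your installation uses different names, pass them as arguments. The helper names are hypothetical.

```python
import importlib.util
from importlib import metadata


def flash_attention_version(module_name="flash_attn", dist_name="flash-attn"):
    """Return the installed version string, or None if the module is missing.

    Returns "unknown" when the module can be found but no distribution
    metadata is available under dist_name.
    """
    if importlib.util.find_spec(module_name) is None:
        return None
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return "unknown"


def is_v2(version):
    """True if the version string belongs to the 2.x series."""
    return bool(version) and version.split(".")[0] == "2"


if __name__ == "__main__":
    version = flash_attention_version()
    if version is None:
        print("Flash Attention is not installed in this environment.")
    elif is_v2(version):
        print(f"Flash Attention {version} found.")
    else:
        print(f"Flash Attention found, but the version is {version}.")
```

Run this inside the same conda environment that the server will use, since the check only sees packages installed for the active interpreter.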
Downloading the Youtu Parsing Model Weights
Download the pre-trained model weights from our official repository:
Installing dependencies for the server
Next, install the additional dependencies required to run the Youtu Parsing server:
Running the Youtu Parsing Server
Save the following code as youtu_parsing_server.py:
Now you may run the server with the following command:
API Endpoints
Once running, the following endpoints are available:
| Endpoint | Method | Description |
|---|---|---|
| /parse or / | POST | Parse a base64-encoded image |
| /health | GET | Health check endpoint |
Example Usage
To parse an image, send a POST request with a base64-encoded image:
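As an illustration of such a request from Python, here is a minimal client sketch that uses only the standard library. Several details are assumptions rather than facts about the server: the address `http://localhost:8000`, the JSON field name `image`, and a JSON response body. Adjust them to match your `youtu_parsing_server.py`.

```python
import base64
import json
import urllib.request

SERVER_URL = "http://localhost:8000"  # assumption: host and port of your server
IMAGE_FIELD = "image"                 # assumption: JSON key the server expects


def build_parse_request(image_bytes, server_url=SERVER_URL, field=IMAGE_FIELD):
    """Build a POST request for /parse carrying the image as base64 text."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    body = json.dumps({field: encoded}).encode("utf-8")
    return urllib.request.Request(
        url=f"{server_url}/parse",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_image(path, timeout=120):
    """Send the image at `path` to the server and return the decoded reply.

    Assumes the server answers with a JSON document.
    """
    with open(path, "rb") as image_file:
        request = build_parse_request(image_file.read())
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running, `parse_image("page.png")` would send the file and return the parsed result; `page.png` is a placeholder file name. Building the request separately from sending it makes the payload easy to inspect without a live server.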
Check server health:
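The health check can also be scripted, for example to wait until the model has finished loading before sending parse requests. This sketch assumes the server listens on `http://localhost:8000` and that a healthy server replies to `GET /health` with HTTP status 200; both are assumptions to verify against your deployment.

```python
import urllib.request


def health_url(server_url="http://localhost:8000"):
    """Return the /health URL for the given server base address."""
    return server_url.rstrip("/") + "/health"


def is_healthy(server_url="http://localhost:8000", timeout=5):
    """Return True if GET /health answers with status 200, else False.

    Connection errors, timeouts, and HTTP error statuses all count as
    unhealthy; urllib reports them as subclasses of OSError.
    """
    try:
        with urllib.request.urlopen(health_url(server_url), timeout=timeout) as response:
            return response.status == 200
    except OSError:
        return False


if __name__ == "__main__":
    print("healthy" if is_healthy() else "not reachable or unhealthy")
```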
