Quickstart: FlowGRPO training on Qwen-Image OCR dataset with Ascend NPU
Last updated: 05/09/2026
Post-train a diffusion image generation model with FlowGRPO on Atlas 800T A2.
Introduction
This guide launches FlowGRPO LoRA training for Qwen-Image OCR generation on Ascend NPU.
Prerequisites
Prepare an Atlas 800T A2 server with 8 NPUs, then install the required software stack:
Install CANN by following the Ascend CANN installation guide.
Install VeRL-Omni and its dependencies as described in the installation guide.
Install the FlowGRPO-specific reward dependency:
uv pip install Levenshtein
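This guide does not describe how the reward uses the Levenshtein dependency. As a rough illustration only, an OCR-style reward is often a normalized edit distance between the text recognized in a generated image and the target text. The sketch below assumes that form; the function names `edit_distance` and `ocr_reward` are hypothetical and are not taken from VeRL-Omni. It is written in pure Python so it runs without the package; `Levenshtein.distance` from the installed package computes the same distance.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance using a two-row dynamic programming table."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


def ocr_reward(recognized: str, target: str) -> float:
    """Hypothetical reward: 1.0 for an exact match, 0.0 when nothing matches."""
    if not recognized and not target:
        return 1.0
    distance = edit_distance(recognized, target)
    # Assumption: normalize by the longer string so the score stays in [0, 1].
    return 1.0 - distance / max(len(recognized), len(target))


if __name__ == "__main__":
    print(ocr_reward("FlowGRP0", "FlowGRPO"))
```

Normalizing by the longer string is one design choice among several; the actual reward may normalize or clip differently.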
Launch Training
Refer to flowgrpo_quickstart for details on the OCR dataset format, preprocessing commands, and general FlowGRPO task descriptions. The launch script for Ascend NPU is located at examples/flowgrpo_trainer/qwen_image/run_qwen_image_ocr_lora_npu.sh.
Run the FlowGRPO training script for Ascend NPU from the repository root:
bash examples/flowgrpo_trainer/qwen_image/run_qwen_image_ocr_lora_npu.sh
The script executes:
python3 -m verl_omni.trainer.main_diffusion
Checkpoints are saved to:
checkpoints/${trainer.project_name}/${trainer.experiment_name}
TensorBoard logs are saved to:
tensorboard_log/${trainer.project_name}/${trainer.experiment_name}
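To make the two path templates above concrete, the following sketch resolves them for one project and experiment name. The helper `output_dirs` and the example names `flowgrpo_demo` and `qwen_image_ocr_lora` are placeholders for illustration, not values from the launch script, and the sketch assumes the paths are relative to the directory the script is launched from.

```python
from pathlib import Path


def output_dirs(project_name: str, experiment_name: str, root: str = ".") -> dict:
    """Resolve the checkpoint and TensorBoard directories for one run.

    Assumption: both templates are relative to `root`, taken here to be the
    repository root from which the launch script is run.
    """
    base = Path(root)
    return {
        "checkpoints": base / "checkpoints" / project_name / experiment_name,
        "tensorboard": base / "tensorboard_log" / project_name / experiment_name,
    }


# Placeholder names; substitute the values configured in the launch script.
dirs = output_dirs("flowgrpo_demo", "qwen_image_ocr_lora")
for name, path in dirs.items():
    print(f"{name}: {path.as_posix()}")
```

The resolved TensorBoard directory, or its parent `tensorboard_log`, can then be passed to `tensorboard --logdir` to inspect training curves.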
To enable logging with Weights & Biases (WandB), modify examples/flowgrpo_trainer/qwen_image/run_qwen_image_ocr_lora_npu.sh and set:
trainer.logger='["console", "wandb"]'