VeRL-Omni

Overview: module code

All modules for which code is available

  • verl_omni.pipelines.model_base
  • verl_omni.pipelines.schedulers.flow_match_sde
  • verl_omni.utils.dataset.rl_dataset
  • verl_omni.utils.fs
  • verl_omni.utils.fsdp_utils
  • verl_omni.utils.reward_score
    • verl_omni.utils.reward_score.genrm_ocr
    • verl_omni.utils.reward_score.http_scorer_client
    • verl_omni.utils.reward_score.jpeg_compressibility
    • verl_omni.utils.reward_score.reward_utils
    • verl_omni.utils.reward_score.unified_reward
  • verl_omni.workers.config.diffusion.actor
  • verl_omni.workers.config.diffusion.model
  • verl_omni.workers.config.diffusion.rollout
  • verl_omni.workers.utils.padding

© Copyright 2026 Bytedance Ltd. and/or its affiliates.
