openrlhf-training

Item: openrlhf-training
Rating: 4.2
Author: Implexa

High-performance RLHF framework with Ray+vLLM acceleration. Use for PPO, GRPO, RLOO, DPO training of large models (7B-70B+). Built on Ray, vLLM, ZeRO-3. 2×…

SkillRank score: 4.2/10

evaluated by implexa, claude-haiku-4-5 · 2026-07-11

openrlhf-training provides distributed RLHF orchestration via Ray and vLLM, covering PPO/GRPO/DPO workflows for 7B-70B+ models. Installation and basic setup are documented, but the training procedure and failure modes are absent.

structure: 3.0
trigger phrases: 2.0
procedure: 4.0
edge cases: 2.0
documentation: 5.0

SKILL.md

OpenRLHF - High-Performance RLHF Training

Quick start

OpenRLHF is a Ray-based RLHF framework optimized for distributed training with vLLM inference acceleration.

Installation:

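# Launch the NVIDIA PyTorch container with the current directory mounted at /openrlhf
docker run --runtime=nvidia -it --rm --shm-size="10g" --cap-add=SYS_ADMIN \
  -v "$PWD":/openrlhf nvcr.io/nvidia/pytorch:25.02-py3 bash

# Inside the container: uninstall preinstalled packages that conflict with OpenRLHF
sudo pip uninstall xgboost transformer_engine flash_attn pynvml -y

# Install OpenRLHF with the vLLM extra (quoted so the shell does not glob the brackets)
pip install "openrlhf[vllm]"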
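
After the install commands above, a quick sanity check can confirm that the packages are visible to Python. The following is a minimal sketch, not part of OpenRLHF: the helper names `check_module` and `report_modules` are hypothetical, and it assumes the top-level import names match the package names (`openrlhf`, `vllm`, `ray`). It only looks the modules up and does not import them, so no GPU initialization is triggered.

```shell
#!/usr/bin/env bash
# Post-install sanity check (illustrative sketch; helper names are hypothetical).

# check_module NAME: succeed if Python can locate the top-level module NAME.
# importlib.util.find_spec only searches for the module; it does not import it.
check_module() {
  python3 -c 'import importlib.util, sys; sys.exit(0 if importlib.util.find_spec(sys.argv[1]) else 1)' "$1"
}

# report_modules NAME...: print one status line per module.
# Returns 1 if at least one module is missing, 0 otherwise.
report_modules() {
  local mod missing=0
  for mod in "$@"; do
    if check_module "$mod"; then
      echo "ok: $mod"
    else
      echo "missing: $mod"
      missing=1
    fi
  done
  return "$missing"
}

# Usage inside the container, after the install commands
# (module names assumed to match the package names):
#   report_modules openrlhf vllm ray
```

A "missing" line for `vllm` would suggest that the `[vllm]` extra was not installed; the check says nothing about whether the installed versions are compatible with each other.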
