Analysis · AI Models

Process-reward optimization advances computer use agents

Two papers propose process-reward optimization for training computer use agents (CUAs), addressing two limitations of existing approaches: sparse rewards and costly interaction with live environments. Methods such as filtered behavior cloning and multi-granularity reward models improve agent performance on complex digital workflows.

AIBriefs · 1 day ago