Abstract: By studying performance measures via reward structures, on-line error bounds are obtained by successive approximation. These bounds indicate when to ...