Reinforcement Learning Using Python

Explainable and Safety Aware Deep Reinforcement Learning-Based Control of Nonlinear Discrete-Time Systems Using Neural Network Gradient Decomposition

Abstract: This paper presents an explainable deep-reinforcement learning (DRL)-based safety-aware optimal adaptive tracking (SOAT) scheme for a class of nonlinear discrete-time (DT) affine systems ...


Quesma Releases OTelBench: Independent Benchmark Reveals Frontier LLMs Struggle with Real-World SRE Tasks

New benchmark shows top LLMs achieve only 29% pass rate on OpenTelemetry instrumentation, exposing the gap between ...

North Penn Now

Machine Learning Using Python: A Complete Learning Path With Practical Projects

Machine learning is an essential component of artificial intelligence. Whether it’s powering recommendation engines, fraud detection systems, self-driving cars, generative AI, or any of the countless ...

GitHub

InftyThink+: Effective and Efficient Infinite-Horizon Reasoning via Reinforcement Learning

Building upon our previous work InftyThink, we introduce InftyThink+, an end-to-end reinforcement learning framework that directly optimizes the complete iterative reasoning trajectory. Building on ...

GitHub

Rec-R1: Bridging Generative Large Language Models and User-Centric Recommendation Systems via Reinforcement Learning

REC-R1 is a general framework that bridges generative large language models (LLMs) and recommendation systems via reinforcement learning.

Frontiers

Spike-based Q-learning in a non-von Neumann architecture

Reinforcement learning (RL) provides a computational framework in which an agent learns optimal policies by interacting with the environment and receiving feedback in the form of rewards (Sutton and ...

IEEE

Encoding High-Level Knowledge in Offline Multi-Agent Reinforcement Learning Using Reward Machines

Abstract: Offline reinforcement learning (RL) learns policies from fixed-size datasets without interacting with the environment, while multi-agent reinforcement learning (MARL) faces challenges from ...
