Gradient Approximation Research Articles

Standard deep reinforcement learning (DRL) aims to maximize expected reward, considering collected experiences equally in formulating a policy. This differs from human decision-making, where gains and losses are valued differently and outlying outcomes are given increased consideration. It also fails to capitalize on opportunities to improve safety and/or performance through the incorporation of distributional context. Several approaches to distributional DRL have been investigated, with one popular strategy being to evaluate the projected distribution of returns for possible actions. We propose a more direct approach whereby risk-sensitive objectives, specified in terms of the cumulative distribution function (CDF) of the distribution of full-episode rewards, are optimized. This approach allows for outcomes to be weighed based on relative quality, can be used for both continuous and discrete action spaces, and may naturally be applied in both constrained and unconstrained settings. We show how to compute an asymptotically consistent estimate of the policy gradient for a broad class of risk-sensitive objectives via sampling, subsequently incorporating variance reduction and regularization measures to facilitate effective on-policy learning. We then demonstrate that the use of moderately "pessimistic" risk profiles, which emphasize scenarios where the agent performs poorly, leads to enhanced exploration and a continual focus on addressing deficiencies. We test the approach using different risk profiles in six OpenAI Safety Gym environments, comparing to state-of-the-art on-policy methods. Without cost constraints, we find that pessimistic risk profiles can be used to reduce cost while improving total reward accumulation. With cost constraints, they are seen to provide higher positive rewards than risk-neutral approaches at the prescribed allowable cost.
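To make the idea of a CDF-based objective concrete, here is a minimal sketch in Python. It is not the paper's policy-gradient estimator: it only scores a batch of full-episode rewards by weighting each episode according to its position in the empirical CDF. The function names, the midpoint ranking rule, and the `pessimistic` profile are assumptions chosen for illustration.

```python
import numpy as np

def cdf_weighted_objective(returns, weight_fn):
    """Score a batch of episode returns with a risk profile over the empirical CDF.

    Illustrative only: `weight_fn` is a hypothetical risk profile mapping a CDF
    position in (0, 1) to a non-negative weight.
    """
    returns = np.asarray(returns, dtype=float)
    n = len(returns)
    # Rank 0 is the worst episode. Ties are broken arbitrarily by argsort.
    ranks = np.argsort(np.argsort(returns))
    # Midpoint rule: place each episode at (rank + 0.5) / n, strictly inside (0, 1).
    cdf_pos = (ranks + 0.5) / n
    weights = weight_fn(cdf_pos)
    # Normalise so the weights average to 1, keeping scores comparable across profiles.
    weights = weights / weights.mean()
    return float(np.mean(weights * returns)), weights

def pessimistic(u, k=2.0):
    # Hypothetical profile: decreasing in u, so poorly performing episodes count more.
    return k * (1.0 - u) ** (k - 1.0)

def risk_neutral(u):
    # Uniform weights recover the ordinary mean return.
    return np.ones_like(u)
```

For the returns `[1, 2, 3, 4]`, `risk_neutral` gives the plain mean 2.5, while `pessimistic` with `k=2` gives 1.875, because the two worst episodes carry most of the weight.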

Given a dataset with incomplete data (e.g., missing values), training a machine learning model over the incomplete data requires two steps. First, it requires a data-effective step that cleans the data in order to improve the data quality (and the model quality on the cleaned data). Second, it requires a data-efficient step that selects a core subset of the data (called a coreset) such that models trained on the entire data and on the coreset have similar model quality, in order to improve the training efficiency. First-data-effective-then-data-efficient methods are too costly, because cleaning the whole dataset is expensive, while first-data-efficient-then-data-effective methods yield low model quality, because they cannot select a high-quality coreset from incomplete data. In this paper, we investigate the problem of coreset selection over incomplete data for data-effective and data-efficient machine learning. The essential challenge is how to model the incomplete data in order to select a high-quality coreset. To this end, we propose the GoodCore framework for selecting a good coreset over incomplete data at low cost. To model the unknown complete data, we use the combinations of possible repairs as possible worlds of the incomplete data. Based on these possible worlds, GoodCore selects an expected optimal coreset through gradient approximation without training ML models. We formally define the expected optimal coreset selection problem, prove its NP-hardness, and propose a greedy algorithm with an approximation ratio. To make GoodCore more efficient, we further propose optimization methods that incorporate human-in-the-loop imputation or automatic imputation methods into our framework. Experimental results show the effectiveness and efficiency of our framework at low cost.
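The combination of possible worlds, gradient approximation, and greedy selection can be illustrated with a small Python sketch. This is not the GoodCore algorithm itself: it is a generic greedy gradient-matching heuristic, and it assumes that per-example gradients are already available for each possible world and that all worlds are equally likely. The function name `greedy_gradient_coreset` and the input layout are hypothetical.

```python
import numpy as np

def greedy_gradient_coreset(world_grads, k):
    """Greedily pick k examples whose mean expected gradient tracks the full-data mean.

    world_grads: array of shape (n_worlds, n_examples, dim) holding hypothetical
    per-example gradients under each possible repair of the incomplete data.
    """
    world_grads = np.asarray(world_grads, dtype=float)
    # Assumption: worlds are equally likely, so the expectation is a plain average.
    expected = world_grads.mean(axis=0)      # shape (n_examples, dim)
    target = expected.mean(axis=0)           # full-data mean gradient
    n = expected.shape[0]
    if not 0 < k <= n:
        raise ValueError("k must be between 1 and the number of examples")

    chosen = []
    running_sum = np.zeros_like(target)
    for _ in range(k):
        best_i, best_err = None, np.inf
        for i in range(n):
            if i in chosen:
                continue
            # Mean gradient of the coreset if example i were added.
            approx = (running_sum + expected[i]) / (len(chosen) + 1)
            err = np.linalg.norm(target - approx)
            if err < best_err:
                best_i, best_err = i, err
        chosen.append(best_i)
        running_sum += expected[best_i]
    return chosen
```

Each step adds the example that most reduces the gap between the coreset's mean expected gradient and the full-data mean, so no model is trained during selection. The nested loop costs O(k · n · dim), which is acceptable for a sketch but would need batching or lazy evaluation at scale.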

Articles published on Gradient Approximation

A Risk-Sensitive Approach to Policy Optimization

Implicit Bilevel Optimization: Differentiating through Bilevel Optimization Programming

SSPAttack: A Simple and Sweet Paradigm for Black-Box Hard-Label Textual Adversarial Attack

Revisiting Item Promotion in GNN-Based Collaborative Filtering: A Masked Targeted Topological Attack Perspective

A note on gradient estimates for p-Laplacian equations

Correlation between energy band transition and optical absorption spectrum in bilayer armchair graphene nanoribbons

Optimal estimates for transmission problems including relative conductivities with different signs

GoodCore: Data-effective and Data-efficient Machine Learning through Coreset Selection over Incomplete Data

Automatic Optimization of Multi-Well Multi-Stage Fracturing Treatments Combining Geomechanical Simulation, Reservoir Simulation and Intelligent Algorithm

The perfect conductivity problem with arbitrary vanishing orders and non-trivial topology

Elliptic gradient estimates for a nonlinear equation with Dirichlet boundary condition

Policy ensemble gradient for continuous control problems in deep reinforcement learning

Full‐Waveform Tomography of the African Plate Using Dynamic Mini‐Batches

Harnack Estimation for Nonlinear, Weighted, Heat-Type Equation along Geometric Flow and Applications

Liquid temperature gradient estimation for screen channel liquid acquisition device bubble point tests in liquid nitrogen

Adversarial example‐based test case generation for black‐box speech recognition systems

A point cloud denoising network based on manifold in an unknown noisy environment

Gradient estimates for the Schrödinger potentials: convergence to the Brenier map and quantitative stability

Souplet–Zhang and Hamilton‐type gradient estimates for non‐linear elliptic equations on smooth metric measure spaces

Gradient estimates and the fundamental solution for higher-order elliptic systems with lower-order terms
