Initial Region Research Articles

Policy gradient is one of the best-known algorithms in reinforcement learning. This paper studies the mean dynamics of the soft-max policy gradient algorithm and its properties in multi-agent settings, using tools from evolutionary game theory and dynamical systems. Most multi-agent reinforcement learning algorithms have mean dynamics that are a slight variant of the replicator dynamics and leave the properties of the original dynamics unchanged. The soft-max policy gradient dynamics, by contrast, have a structure significantly different from that of the replicator. In particular, we show that the soft-max policy gradient dynamics in a given game are equivalent to the replicator dynamics in an auxiliary game obtained by a non-convex transformation of the payoffs of the original game. This structure gives the dynamics several non-standard properties. The first property we study concerns convergence to the best response. While the continuous-time mean dynamics always converge to the best response, the crucial question is the convergence speed. Precisely, we show that the space of initializations can be split into two complementary sets. Trajectories initialized from points of the first set (called the good initialization region) move directly to the best response. In contrast, those initialized from points of the second set (called the bad initialization region) move first to a series of sub-optimal strategies and only then to the best response. Interestingly, in multi-agent adversarial machine learning environments, we show that an adversary can exploit this property to make any current strategy of a learning agent using the soft-max policy gradient fall inside a bad initialization region, thus slowing its learning process and exploiting that policy. When the soft-max policy gradient dynamics are studied in multi-population games, which model the learning dynamics in self-play, we show that the dynamics preserve the volume of the set of initial points. This property implies that the dynamics cannot converge when the only equilibrium of the game is fully mixed, as the volume of the set of initial points would need to shrink. We also give empirical evidence that the volume expands over time, suggesting that the dynamics in games with a fully mixed equilibrium are chaotic.
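
The contrast between the two dynamics can be illustrated with a minimal sketch. It assumes a much simpler setting than the paper's: a single agent facing three actions with fixed payoffs, with the payoff vector `r`, the starting policy `pi0`, the step size, and all function names chosen here for illustration only. Under that assumption, gradient ascent on the logits moves the policy along a replicator field whose payoffs are rescaled by the policy itself, `pi * (r - pi @ r)`. This is one way to see how a policy-dependent payoff transformation can arise, and it is not claimed to be the paper's exact auxiliary game.

```python
import numpy as np

def softmax(theta):
    """Numerically stable soft-max over a vector of logits."""
    z = np.exp(theta - np.max(theta))
    return z / z.sum()

def replicator_field(pi, payoffs):
    """Replicator vector field: pi_a * (u_a - average payoff under pi)."""
    return pi * (payoffs - pi @ payoffs)

def pg_logit_gradient(pi, r):
    """Gradient of the expected payoff J = pi . r with respect to the logits."""
    return pi * (r - pi @ r)

def pg_policy_velocity(pi, r):
    """d(pi)/dt when the logits follow d(theta)/dt = grad J (chain rule)."""
    jacobian = np.diag(pi) - np.outer(pi, pi)  # d(pi)/d(theta) for a soft-max
    return jacobian @ pg_logit_gradient(pi, r)

def run_replicator(pi0, r, dt=0.01, steps=5000):
    """Forward-Euler integration of the replicator dynamics on the policy."""
    pi = np.array(pi0, dtype=float)
    for _ in range(steps):
        pi = pi + dt * replicator_field(pi, r)
    return pi

def run_softmax_pg(pi0, r, dt=0.01, steps=5000):
    """Forward-Euler integration of gradient ascent on the soft-max logits."""
    theta = np.log(np.array(pi0, dtype=float))
    for _ in range(steps):
        theta = theta + dt * pg_logit_gradient(softmax(theta), r)
    return softmax(theta)

# Illustrative values (assumptions): action 0 is the best response, but the
# starting policy puts almost all of its mass on the worst action.
r = np.array([1.0, 0.9, 0.1])
pi0 = np.array([0.01, 0.04, 0.95])

if __name__ == "__main__":
    print("replicator     :", run_replicator(pi0, r))
    print("soft-max policy:", run_softmax_pg(pi0, r))
```

From this starting point the logit of the second-best action initially grows faster than that of the best action, because each logit's gradient is weighted by its own probability. In this toy setting the soft-max run is therefore expected to lag behind the replicator run, in the spirit of the detour through sub-optimal strategies described for the bad initialization region.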


Context. Massive and luminous O-type star (O star) atmospheres with winds have been studied primarily using one-dimensional (1D), spherically symmetric, and stationary models. However, observations and theory have suggested that O star atmospheres are highly structured, turbulent, and time-dependent. As such, when making comparisons to observations, present-day 1D modeling tools require the introduction of ad hoc quantities such as photospheric macro- and microturbulence, wind clumping, and other relevant properties. Aims. We present a series of multi-dimensional, time-dependent, radiation-hydrodynamical (RHD) simulations for O stars that encapsulate the deeper sub-surface envelope (down to T ~ 450 kK), as well as the supersonic line-driven wind outflow, in one unified approach. Our overarching aim is to develop a framework that is free from the ad hoc prescriptions that plague present-day 1D models. Here, we start with an analysis of a small set of such multi-dimensional simulations and then compare them to atmospheric structures predicted by their 1D counterparts. Methods. We performed time-dependent, two-dimensional (2D) simulations of O star atmospheres with winds using a flux-limiting RHD finite volume modeling technique. Opacities are computed using a hybrid approach combining tabulated Rosseland means with calculations (based on the Sobolev approximation) of the enhanced line opacities expected for supersonic flows. The initial conditions and comparison models were derived using procedures similar to those applied in standard 1D stationary codes for model atmospheres with winds. Results. Structure starts appearing in our simulations just below the iron-opacity peak at ~200 kK. Local pockets of gas with radiative accelerations that exceed gravity then shoot up from these deep layers into the upper atmosphere, where they interact with the line-driven wind outflow initiated around or beyond the variable photosphere. This complex interplay creates large turbulent velocities in the photospheric layers of our simulations, on the order of ~30–100 km s⁻¹, with higher values for models with higher luminosity-to-mass ratios. This is in generally good agreement with observations of large photospheric ‘macroturbulence’ in O stars. When compared to 1D models, the average structures in the 2D simulations display less envelope expansion and no sharp density inversions, along with density and temperature profiles that are significantly less steep around the photosphere, and a strong anti-correlation between velocity and density in the supersonic wind. Although the wind initiation region is complex and highly variable in our simulations, our average mass-loss rates agree well with stationary wind models computed by means of full co-moving frame radiative transfer solutions. Conclusions. The different atmospheric structures found in 2D and 1D simulations are likely to affect the spectroscopic determination of fundamental stellar and wind parameters for O stars as well as the empirical derivation of their chemical abundance patterns. To qualitatively match the different density and temperature profiles seen in our multi-dimensional and 1D models, we need to add a modest amount of convective energy transport in the deep sub-surface layers and a large turbulent pressure around the photosphere to the 1D models.
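
The condition that radiative acceleration exceeds gravity can be made concrete with a back-of-the-envelope sketch. For a spherically symmetric radiation field, the ratio of radiative to gravitational acceleration is Γ = κL / (4πGMc), so it scales with the opacity κ and with the luminosity-to-mass ratio. The stellar parameters and opacities below are illustrative assumptions and are not taken from the simulations; the function name is hypothetical.

```python
import math

# Physical constants in CGS units (rounded).
G = 6.674e-8       # gravitational constant, cm^3 g^-1 s^-2
C = 2.998e10       # speed of light, cm s^-1
L_SUN = 3.828e33   # solar luminosity, erg s^-1
M_SUN = 1.989e33   # solar mass, g

def eddington_ratio(kappa, lum_lsun, mass_msun):
    """Ratio of radiative to gravitational acceleration, kappa*L / (4*pi*G*M*c).

    kappa     -- opacity in cm^2 g^-1
    lum_lsun  -- luminosity in solar units
    mass_msun -- mass in solar units
    """
    lum = lum_lsun * L_SUN
    mass = mass_msun * M_SUN
    return kappa * lum / (4.0 * math.pi * G * mass * C)

if __name__ == "__main__":
    # Assumed example star: 1e6 L_sun and 50 M_sun (illustrative only).
    for kappa in (0.34, 1.0):  # electron-scattering-like vs. an enhanced opacity
        gamma = eddington_ratio(kappa, 1.0e6, 50.0)
        state = "exceeds" if gamma > 1.0 else "stays below"
        print(f"kappa = {kappa:4.2f} cm^2/g: Gamma = {gamma:.2f} ({state} gravity)")
```

With these assumed numbers, an opacity close to the electron-scattering value keeps the ratio below one, while a locally enhanced opacity pushes it above one.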


Articles published on Initial Region

The Evolutionary Dynamics of Soft-Max Policy Gradient in Multi-Agent Settings

A Preliminary Assessment of Land Restoration Progress in the Great Green Wall Initiative Region Using Satellite Remote Sensing Measurements

Effect of Precipitation-Free Zone on Fatigue Properties in Al-7.02Mg-1.98Zn Alloys: Crystal Plasticity Finite Element Analysis.

Analysis of underwater radiated noise from ships using distributed acoustic sensing technology

Quantifying future carbon emissions uncertainties under stochastic modeling and Monte Carlo simulation: Insights for environmental policy consideration for the Belt and Road Initiative Region

Trade-offs and synergies pattern evolution of ecosystem structure-resilience-activity-services (SRAS) in the Belt and Road Initiative region

Bulk synthesis and beyond: The roles of eukaryotic replicative DNA polymerases

Consensus control and initialization region optimization for leader‐following multi‐agent systems under time‐varying communication delay and consecutive packet dropouts

Volumetric Image Analysis Of Pulsed Non-Thermal Plasma Produced By Microwave

Liner shipping connectivity: A dynamic link between energy trade, green exchange and inclusive growth using advanced econometric modelling

2D unified atmosphere and wind simulations of O-type stars

The regulation of methylation on the Z chromosome and the identification of multiple novel Male Hyper-Methylated regions in the chicken.

Unraveling the Dual-Stretch-Mode Impact on Tension Gauge Tethers' Mechanical Stability.

Automated detection of fin whale calls recorded with distributed acoustic sensing

Behavioral and Neurophysiological Implications of Pathological Human Tau Expression in Serotonin Neurons.

Low-cycle fatigue behaviour of extruded 7075 aluminium alloy bar: Competition of grain sizes and textures

The Effectiveness of the Regional Regulation Formation Agency in Producing Initiative Regional Regulations

Effects of pulse repetition frequency on bubble cloud characteristics and ablation in single-cycle histotripsy

Influence of the mRNA initial region on protein production: a case study using recombinant detoxified pneumolysin as a model.

Defense against smart invaders with swarms of sweeping agents
