The Lob–Pass Problem

Jun-Ichi Takeuchi, Naoki Abe, Shun-Ichi Amari

doi:10.1006/jcss.2000.1718

Open Access

https://doi.org/10.1006/jcss.2000.1718

Journal: Journal of Computer and System Sciences
Publication Date: Dec 1, 2000
Citations: 2
License type: publisher-specific-oa

Affiliation: NEC (Japan)

Abstract

We consider a new variant of the online learning model in which the goal of an agent is to choose his or her actions so as to maximize the number of successes, while learning about his or her reacting environment through those very actions. In particular, we consider a model of tennis play, in which the only actions that the player can take are a pass and a lob, and the opponent is modeled by two linear (probabilistic) functions $f_L(r) = a_1 r + b_1$ and $f_P(r) = a_2 r + b_2$, specifying the probability that a lob (and a pass, respectively) will win a point when the proportion of lobs played in the past trials is $r$. We measure the performance of a player in this model by his or her expected regret, namely how many fewer points the player expects to win as compared to the ideal player (one that knows the two probabilistic functions) as a function of $t$, the total number of trials, which is unknown to the player a priori. Assuming that the probabilistic functions satisfy the “matching shoulders condition,” i.e., $f_L(0) = f_P(1)$, we obtain a variety of upper bounds for assumptions and restrictions of varying degrees, ranging from $O(\log t)$, $O(t^{1/2})$, $O(t^{3/5})$, $O(t^{2/3})$ to $O(t^{5/7})$, as well as a matching lower bound of order $\Omega(\log t)$ for the first case. When the total number of trials $t$ is given to the player in advance, the upper bounds can be improved significantly. An extended abstract describing part of this work has appeared in N. Abe and J. Takeuchi, 1993, in “Proceedings of the Sixth Annual ACM Workshop on Computational Learning Theory,” pp. 422–428.
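
To make the setting in the abstract concrete, the following is a minimal Python sketch of the opponent model only. It is not the authors' learning algorithm. The class name `LobPassOpponent`, the helpers `matching_shoulders` and `play_fixed_ratio`, the example coefficients, and the choice of r = 0 before any trial has been played are all assumptions made for illustration; the abstract does not specify them.

```python
import random


class LobPassOpponent:
    """Opponent whose win probabilities depend on the past lob ratio r.

    f_L(r) = a1 * r + b1 is the probability that a lob wins a point,
    f_P(r) = a2 * r + b2 is the probability that a pass wins a point.
    """

    def __init__(self, a1, b1, a2, b2, seed=None):
        self.a1, self.b1, self.a2, self.b2 = a1, b1, a2, b2
        self.lobs = 0      # number of lobs played so far
        self.trials = 0    # total number of trials so far
        self.rng = random.Random(seed)

    def lob_ratio(self):
        # Assumption: r is taken to be 0 before the first trial.
        return self.lobs / self.trials if self.trials else 0.0

    def win_prob(self, action):
        r = self.lob_ratio()
        if action == "lob":
            return self.a1 * r + self.b1
        return self.a2 * r + self.b2

    def play(self, action):
        """Play one point; return True if the player wins it."""
        won = self.rng.random() < self.win_prob(action)
        self.trials += 1
        if action == "lob":
            self.lobs += 1
        return won


def matching_shoulders(a1, b1, a2, b2, tol=1e-9):
    """Check f_L(0) == f_P(1), i.e. b1 == a2 + b2, up to a tolerance."""
    return abs(b1 - (a2 + b2)) < tol


def play_fixed_ratio(opponent, target_ratio, t):
    """Hypothetical baseline player: lob whenever the past lob ratio is
    below target_ratio, otherwise pass. Returns the number of points won."""
    wins = 0
    for _ in range(t):
        action = "lob" if opponent.lob_ratio() < target_ratio else "pass"
        wins += opponent.play(action)
    return wins


if __name__ == "__main__":
    # Illustrative coefficients chosen so that f_L(0) = f_P(1) = 0.8.
    coeffs = (-0.5, 0.8, 0.5, 0.3)
    print("matching shoulders:", matching_shoulders(*coeffs))
    opponent = LobPassOpponent(*coeffs, seed=0)
    print("points won in 1000 trials:", play_fixed_ratio(opponent, 0.5, 1000))
```

The regret studied in the paper would compare the expected number of points won by a learner against that of an ideal player who knows both functions; the fixed-ratio player above is only a stand-in for exercising the model.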
