- Ariel Orda, Raphael Rom, Nahum Shimkin
- IEEE/ACM Trans. Netw.
- 1993

We consider a communication network shared by several selfish users. Each user seeks to optimize its own performance by controlling the routing of its given flow demand, giving rise to a noncooperative game. We investigate the Nash equilibrium of such systems. For a two-node multiple links system, uniqueness of the Nash equilibrium is proven under…

- Ishai Menache, Shie Mannor, Nahum Shimkin
- ECML
- 2002

We present the Q-Cut algorithm, a graph theoretic approach for automatic detection of sub-goals in a dynamic environment, which is used for acceleration of the Q-Learning algorithm. The learning agent creates an on-line map of the process history, and uses an efficient Max-Flow/Min-Cut algorithm for identifying bottlenecks. The policies for reaching…

- Eitan Altman, Tamer Basar, Tania Jiménez, Nahum Shimkin
- IEEE Trans. Automat. Contr.
- 2002

We study a class of noncooperative general topology networks shared by users. Each user has a given flow which it has to ship from a source to a destination. We consider a class of polynomial link cost functions adopted originally in the context of road traffic modeling, and show that these costs have appealing properties that lead to predictable and…

We consider a multi-class queueing system with multiple homogeneous servers and customer abandonment. For each customer class i, the holding cost per unit time, the service rate and the abandonment rate are denoted by c_i, μ_i and θ_i, respectively. We prove that under a many-server fluid scaling and overload conditions, a routing policy that assigns priority…

- Eitan Altman, Nahum Shimkin
- Operations Research
- 1998

- Jia Yuan Yu, Shie Mannor, Nahum Shimkin
- EWRL
- 2008

We consider a learning problem where the decision maker interacts with a standard Markov decision process, with the exception that the reward functions vary arbitrarily over time. We show that, against every possible realization of the reward process, the agent can perform as well, in hindsight, as every stationary policy. This generalizes the classical…

- Ishai Menache, Shie Mannor, Nahum Shimkin
- Annals OR
- 2005

We examine methods for on-line optimization of the basis function for temporal difference Reinforcement Learning algorithms. We concentrate on architectures with a linear parameterization of the value function. Our methods optimize the weights of the network while simultaneously adapting the parameters of the basis functions in order to decrease the Bellman…

- Mor Armony, Nahum Shimkin, Ward Whitt
- Operations Research
- 2009

This paper studies the performance impact of making delay announcements to arriving customers who must wait before starting service in a many-server queue with customer abandonment. The queue is assumed to be invisible to waiting customers, as in most customer contact centers, when contact is made by telephone, email or instant messaging. Customers who must…

- Ishai Menache, Nahum Shimkin
- IEEE Journal on Selected Areas in Communications
- 2008

We consider a wireless collision channel, shared by a finite number of users who transmit to a common base station. Each user wishes to minimize its average transmission rate (or power investment), subject to minimum throughput demand. The channel quality between each user and the base station is randomly time-varying, and partially observed by the user…

- Shie Mannor, Nahum Shimkin
- Journal of Machine Learning Research
- 2004

We consider the problem of reinforcement learning in a controlled Markov environment with multiple objective functions of the long-term average reward type. The environment is initially unknown, and furthermore may be affected by the actions of other agents, actions that are observed but cannot be predicted beforehand. We capture this situation using a…