In this paper, we address the problem of suboptimal behavior during online partially observable Markov decision process (POMDP) planning caused by time constraints on planning. Taking inspiration from the related field of reinforcement learning (RL), our solution is to shape the agent’s reward function in order to lead the agent to large future rewards …
Recent years have seen a surge in the use of intelligent computer-supported collaborative learning (CSCL) tools for improving student learning in traditional classrooms. However, adopting such a CSCL tool in a classroom still requires the teacher to develop (or decide on which to adopt) the CSCL tool and the CSCL script, design the relevant pedagogical …
Heuristic search algorithms for online POMDP planning have shown great promise in creating successful policies for maximizing agent rewards using heuristics typically focused on reducing the error bound in the agent’s cumulative future reward estimations. However, error bound-based heuristics are less informative in highly uncertain domains requiring long …
Prior research has established that active participation and collaboration by students result in multiple benefits during wiki-based CSCL activities. However, achieving such behavior can be a challenge without external motivation. To increase active participation and collaboration by users, we developed an enhanced wiki called the Written Agora. Using …
Traditional computer-supported collaborative learning (CSCL) systems focus primarily on facilitating group work among students, providing several modes for different types of communication, from chat applications to shared whiteboards. The power of a good CSCL system, in addition to the interactions it supports, lies in the data the system …
Agents operating in complex (e.g., dynamic, uncertain, partially observable) environments must gather information from various sources to inform their incomplete knowledge. Two popular types of sources are: (1) directly sensing the environment using the agent’s sensors, and (2) sharing information between networked agents occupying the same environment. …
In many real-world applications of multi-agent systems, agent reasoning suffers from bounded rationality caused by both limited resources and limited knowledge. When the sensing an agent performs to overcome its knowledge limitations also requires resource use, the agent’s knowledge refinement is affected due to its inability to always sense when and as accurately as …
We address the problem of suboptimal behavior caused by short horizons during online POMDP planning. Our solution extends potential-based reward shaping from the related field of reinforcement learning to online POMDP planning in order to improve planning without increasing the planning horizon. In our extension, information about the quality of belief …
One popular approach to active perception is using POMDPs to maximize rewards received for sensing actions towards task accomplishment and/or continually refining the agent’s knowledge. Multiple types of reward functions have been proposed to achieve these goals: (1) state-based rewards, which minimize sensing costs and maximize task rewards, (2) …