Ujjwal Das Gupta

Learn More
Much of the focus on finding good representations in reinforcement learning has been on learning complex non-linear predictors of value. Policy gradient algorithms , which directly represent the policy, often need fewer parameters to learn good policies. However, they typically employ a fixed parametric representation that may not be sufficient for complex(More)
In this paper, we study the problem of learning the structure of Markov Networks that permit efficient inference. We formulate structure learning as an optimization problem that maximizes the likelihood of the model such that the inference complexity on the resulting structure is bounded. The inference complexity is measured with respect to any chosen(More)
  • 1