We propose a method for automatically answering questions about images by bringing together recent advances from natural language processing and computer vision. We combine discrete reasoning with uncertain predictions through a multi-world approach that represents uncertainty about the perceived world in a Bayesian framework. Our approach can handle human…
We address a question answering task on real-world images that is set up as a Visual Turing Test. By combining the latest advances in image representation and natural language processing, we propose Neural-Image-QA, an end-to-end formulation of this problem for which all parts are trained jointly. In contrast to previous efforts, we are facing a multi-modal…
One of the difficulties in interactive music and entertainment is creating environments that reflect and react to the collective activity of groups with tens, hundreds, or even thousands of participants. Generating content on this scale involves many challenges. For example, how is the individual granted low-latency control and a sense of causality, while…
As language and visual understanding by machines progresses rapidly, we are observing an increasing interest in holistic architectures that tightly interlink both modalities in a joint learning and inference process. This trend has allowed the community to progress towards more challenging and open tasks and refueled the hope of achieving the old AI dream…
This paper describes CargoNet, a system of low-cost, micropower active sensor tags that seeks to bridge the current gap between wireless sensor networks and radio-frequency identification (RFID). CargoNet was aimed at applications in environmental monitoring at the crate and case level for supply-chain management and asset security. Custom-designed circuits…
This paper describes the design of 2.3 kV, 2.4 MVA two-level, three-level neutral-point-clamped, three-level flying-capacitor, and four-level flying-capacitor voltage source converters on the basis of state-of-the-art 6.5 kV, 3.3 kV, and 2.5 kV IGBTs. The semiconductor loss distribution, design, and costs of semiconductors and passive…
Scaling up visual category recognition to large numbers of classes remains challenging. A promising research direction is zero-shot learning, which does not require any training data to recognize new classes, but rather relies on some form of auxiliary information describing the new classes. Ultimately, this may allow the use of textbook knowledge that humans…
We present a wireless identification system that employs an optical communications link between an array of uniquely identifiable smart tags and an interrogator flashlight. As the tags consume a quiescent current of under 2 microamperes and are woken up directly by the interrogator's modulated illumination, they are able to last nearly the shelf life of…