Visual Context Modelling to improve security and logistics monitoring (2009-2012)

The ViCoMo project is developing advanced video-interpretation algorithms to enhance images acquired with multiple-camera systems. By modelling the context in which such systems are used, ViCoMo will significantly improve the intelligence of visual systems and enable recognition of the behaviour of persons, objects and events in a 3D view. The project will enable advanced content- and context-based applications in surveillance and security and in transport/logistics, with spin-offs in the consumer and multimedia domains.

ViCoMo will focus on visual interpretation and reasoning using context information. It will construct realistic context models that improve the decision making of complex vision systems so that they exhibit meaningful behaviour. The general goal is to find the context of events captured by cameras or image sensors, and to model that context so that reliable reasoning about the events becomes possible. These goals contribute to improving healthcare, security, safety and the public infrastructure in general. In addition, the project supports the development of data storage, efficient retrieval and data use for the emerging surveillance, healthcare and data-mining industries.

A novel aspect is that ViCoMo will merge the information from multiple camera sensors to build an extensive context model of the environment in which the visual data was captured. This model can be used to construct a world view from the captured data and to extend it with further modelling features such as distances, perspective correction and event indications. The context model is subsequently used to derive a more accurate analysis of, for example, an accident or the behaviour of a group of people. Besides the advances in video content analysis, the modelling also requires a new paradigm for content storage and retrieval. State-of-the-art systems store streams of consecutive video frames with time stamps, whereas a ViCoMo system stores visual-context information: among other things, events and objects with properties such as location, behaviour, identity and colour. Such information is indispensable for fast, explicit reasoning about visual context and rapid decision making, and it mirrors how humans interpret visual data.
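The event-centric storage paradigm described above can be sketched in a few lines. The following is a minimal, hypothetical illustration (all class and field names are invented for this sketch, not taken from the project): instead of frame streams, the store holds event records with the properties the text mentions, and retrieval filters on those attributes rather than scanning video.

```python
from dataclasses import dataclass

# Hypothetical sketch of event-centric storage as described in the text;
# class and field names are illustrative, not the project's actual schema.
@dataclass
class ContextEvent:
    camera_id: str
    timestamp: float            # seconds since start of observation
    location: tuple             # (x, y) in world coordinates, metres
    behaviour: str              # e.g. "walking", "loitering"
    identity: str = "unknown"
    colour: str = "unknown"

class EventStore:
    """Keeps context events and supports attribute-based retrieval."""
    def __init__(self):
        self._events = []

    def add(self, event):
        self._events.append(event)

    def query(self, behaviour=None, since=0.0):
        """Return events with a matching behaviour label at or after `since`."""
        return [e for e in self._events
                if (behaviour is None or e.behaviour == behaviour)
                and e.timestamp >= since]

store = EventStore()
store.add(ContextEvent("cam1", 10.0, (2.5, 4.0), "walking"))
store.add(ContextEvent("cam2", 12.0, (2.6, 4.1), "loitering", colour="red"))
hits = store.query(behaviour="loitering")
```

A query like the one above answers "who was loitering, and when" directly from metadata, which is what makes the explicit reasoning and rapid decision making mentioned in the text feasible.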

Key innovations will include: 3D environment modelling; context- and metadata-centric output rather than raw video output; high-level semantic reasoning; and information filtering that significantly improves information efficiency. The consortium intends to exploit context modelling in several domains: observation for surveillance and team training; 3D modelling of the real-world environment; observation of human behaviour for system control; and logistics control for traffic and transportation.
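The perspective correction that feeds such a 3D environment model is commonly realised as a planar homography mapping image pixels onto the ground plane. The sketch below illustrates the standard technique only; the matrix values are made up for the example and do not come from any real calibration.

```python
# Standard planar-homography mapping: a calibrated camera's 3x3 matrix H
# takes an image point (u, v) to ground-plane world coordinates (x, y).
# The matrix below is an invented example, not real calibration data.

def apply_homography(H, u, v):
    """Map image point (u, v) to world coordinates via homography H."""
    x = H[0][0] * u + H[0][1] * v + H[0][2]
    y = H[1][0] * u + H[1][1] * v + H[1][2]
    w = H[2][0] * u + H[2][1] * v + H[2][2]
    return (x / w, y / w)   # dehomogenise the projective coordinates

# Toy calibration: a pure pixel-to-metre scaling of 0.01 m per pixel.
H = [[0.01, 0.0, 0.0],
     [0.0, 0.01, 0.0],
     [0.0, 0.0, 1.0]]

x, y = apply_homography(H, 320, 240)   # -> (3.2, 2.4) metres
```

Once every camera's detections are mapped into the same world frame this way, observations from multiple sensors can be fused into the single context model the project describes.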