Research in MAIS+S

One major area of research in MAIS+S is on decision-theoretic models and methods for efficient optimization of the surveillance network performance. The other major area is concerned with machine vision.

Decision-theoretic research

Scalability of multiagent planning under uncertainty

An important focus area for state-of-the-art research on decentralized decision-theoretic planning is how to scale up (approximate) planning algorithms to larger numbers of agents. In MAIS+S we consider the nodes in the network as agents: a server handling a number of cameras, a mobile robot, or even a smart camera (a camera with local processing power). This implies that the surveillance network comprises a relatively large set of heterogeneous agents, which poses fresh challenges for current decentralized planning algorithms.

One major source of intractability for decentralized planning algorithms is that agents typically reason over all possible observations other agents can receive, and all possible actions they might take. This allows for tightly coupled optimal solutions, but does not scale well. On the other hand, in many real-world domains, including the one considered in this project, interactions between agents are a local phenomenon, and for satisfactory system performance it is often not necessary to consider interactions between all agents. In fact, the network structure naturally induces a topology for interactions between the agents.
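The scaling argument above can be made concrete with a small counting sketch. It is only an illustration under assumed numbers: the six agents, the four actions per agent, and the chain-shaped interaction topology are hypothetical and not taken from the project.

```python
import math


def joint_space_size(sizes):
    """Size of the Cartesian product of the per-agent sets."""
    return math.prod(sizes)


def local_space_sizes(sizes, neighbors):
    """For each agent, the joint size over itself and its neighbors only."""
    return {
        agent: joint_space_size([sizes[agent]] + [sizes[n] for n in nbrs])
        for agent, nbrs in neighbors.items()
    }


# Hypothetical example: 6 agents with 4 actions each, arranged in a chain,
# so every agent interacts with at most two neighbors.
num_actions = [4] * 6
chain = {i: [j for j in (i - 1, i + 1) if 0 <= j < 6] for i in range(6)}

full = joint_space_size(num_actions)            # all agents reasoned about jointly
local = local_space_sizes(num_actions, chain)   # only neighbors reasoned about
```

In this toy setting the full joint action space has 4^6 = 4096 elements, while no agent needs to consider more than 4^3 = 64 joint actions when interactions are restricted to its neighbors. The same counting applies to observations.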

For example, when a person is leaving the field-of-view of a particular camera, and the system must coordinate the handover of tracking that person to another camera, it is only necessary to consider cameras whose fields-of-view are close to that of the original camera. As the layout of the cameras is fixed, these interactions can be computed offline and exploited as such. For a mobile robot, it is possible that interactions depend on its location, and hence the potential interactions with other agents (be it other robots or cameras) will be dynamic.
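As a minimal sketch of how such handover candidates might be precomputed from a fixed layout, the following assumes that each camera's field-of-view can be summarized by a 2D center point and that "close" means within a distance threshold. The function name, camera names, coordinates, and threshold are all hypothetical.

```python
import math


def handover_candidates(camera_positions, max_distance):
    """For each camera, list the other cameras within max_distance of it."""
    candidates = {}
    for cam, (x, y) in camera_positions.items():
        candidates[cam] = sorted(
            other
            for other, (ox, oy) in camera_positions.items()
            if other != cam and math.hypot(x - ox, y - oy) <= max_distance
        )
    return candidates


# Hypothetical fixed layout: field-of-view centers in meters.
layout = {"cam_a": (0.0, 0.0), "cam_b": (5.0, 0.0), "cam_c": (20.0, 0.0)}

# Computed once, offline, because the cameras do not move.
table = handover_candidates(layout, max_distance=10.0)
```

For a mobile robot the same computation would have to be repeated at runtime with the robot's current position, which is what makes its interactions dynamic.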

Reasoning about communication in multiagent systems

Communication is an important aspect of any networked robot system, particularly one that uses cameras, given the large amount of data they generate. It is possible to distinguish two levels of communication: on a lower level, the network should carry video streams, and on a higher level, events detected by vision algorithms are transmitted. The video streams can be used for monitoring as well as storage and offline processing, and require high bandwidth. The detected events are typically much smaller in size, but they are crucial for the decision making in the system.

Efficient usage of the communication channels depends critically on the ability to decide when and what event information is shared among agents, without flooding the network. Sharing event information generally improves the situational awareness of each agent, potentially improving and simplifying several aspects of its process of decision-making. However, it also introduces an extra level of complexity into this process. In particular, an agent must now consider more inputs, which increases the size of the observation space during planning, and directly impacts the time required for planning.
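The growth of the observation space can be illustrated with a rough count. The sketch below assumes that each neighbor sharing events contributes one extra observation variable with a fixed number of possible values (including "no event"). The function name and the numbers are hypothetical.

```python
def observation_space_size(local_observations, event_values, sharing_neighbors):
    """Number of distinct observations when each sharing neighbor adds one
    event variable with `event_values` possible values."""
    return local_observations * event_values ** sharing_neighbors


# Hypothetical agent: 10 local observations, 3 possible event values per neighbor.
sizes = [observation_space_size(10, 3, k) for k in range(5)]
```

Under these assumptions the observation space grows from 10 with no sharing to 810 with four sharing neighbors, that is, exponentially in the number of neighbors whose events are received.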

Moreover, sharing all information with every agent effectively turns the problem into a centralized planning problem. For the scenario considered in the project, such a centralized problem would be too large to solve, and the intense information sharing would be likely to flood the network. Hence, to a certain degree, the project will assume that the agents have limited communication. Keeping the communication as local as possible also reduces the risk that event information the agents depend on arrives late, since events and video streams are likely to share the same medium.

Machine-vision research

Multi-camera tracking

One of the core elements of a distributed surveillance system is the capability of tracking targets across multiple cameras. In the case of non-overlapping cameras, tracking algorithms aim to minimize the probability of losing targets in a network of nodes. Intra-camera tracking is a difficult problem that depends greatly on:

  1. Finding adequate features to represent the targets to be tracked;
  2. Choosing adequate methods to adapt the selected features.

In MAIS+S we investigate how to use the local structure of a node to generate robust labels of the moving objects. Thus, the feature selection and adaptation techniques include linked representations across nodes of the network. We also investigate the possibility of including human activity recognition algorithms in the system, endowing the surveillance system with the ability to build higher-level behavior patterns in the environment.