Distributed Detection and Estimation
The literature on distributed detection and estimation is quite extensive, including the topic of multi-sensor data fusion. Initially, let us consider distributed detection; a good, relatively recent tutorial on the subject is given by Viswanathan and Varshney . The basic idea is to have a number of independent sensors each make a local decision (typically a binary one) and then to combine these decisions at a fusion center to generate a global decision. The figure below illustrates the parallel fusion topology, which implements this processing. Either the Bayesian or the Neyman-Pearson criterion can be used. Assuming the Neyman-Pearson formulation where one assumes a bound on the global probability of false alarm, the goal is to determine the optimum local and global decision rules that minimize the global probability of miss (or equivalently maximize the global probability of detection). When the sensors are deployed so that their observations are conditionally independent, one can show that these decision rules are threshold rules based on likelihood ratios . The problem now becomes one of determining the optimal threshold at each sensor, as well as at the fusion center. While this task is quite non-trivial, it can still be done for a reasonably small number of sensors using iterative techniques . More importantly, by using soft, multi-bit decisions at each of the sensors, it is possible to increase the performance so that it is asymptotically close to the optimal centralized scheme .
Depending on the sensor network topology, it may be more useful to implement the distributed detection or estimation using a tree structure. Tsistsiklis  shows that the optimal decision rules are still in the form of threshold tests. Tang et al.  consider the case where the local decisions made at a number of sensors are communicated to multiple root nodes for data fusion. With each sensor node characterized by a receiver operating curve (ROC) and assuming a Bayes' risk criterion, they reformulate the problem as a nonlinear optimal control problem that can be solved numerically. Furthermore, they briefly examine communication and robustness issues for two types of tree structures: a functional hierarchy and a decentralized market. One conclusion is that if the communication costs are a primary concern, then the functional hierarchy is preferred because it leads to less network traffic. However if robustness is the primary issue, then the decentralized market structure may be a better choice.
In the cases discussed above, the information flows in one direction from the sensors to either the single fusion center or to a number of root nodes. Even in the decentralized market topology, where numerous sensors report to multiple intermediate nodes, the graph of the network is still acyclic. If the communication network is able to handle the increased load, performance can be improved through the use of decision feedback [6, 7]. Pados et al.  examine two distributed structures: 1.) a network where the fusion center provides decision feedback connections to each of the sensor nodes, and 2.) a set of sensors that are fully interconnected via decision feedback. The performance of the fully connected network is quantifiably better, but their initial system was non-robust to variations in the statistical descriptions of the two hypotheses. Robust testing functions are able to overcome this problem, and they show that robust networks tend to reject the feedback when operating with contaminated data. Alhakem and Varshney  study a distributed detection system with feedback and memory. That is, each sensor not only uses its present input and the previous fed-back decision from the fusion center, but it also uses its own previous inputs. They derive the optimal fusion rule and local decision rules, and they show that the probability of error in a Bayesian formulation goes to zero asymptotically. Additionally, they address the communication requirements by developing two data transmission protocols that reduce the number of messages sent among the nodes.
Swaszek and Willet propose a more extensive feedback approach that they denote parleying . The basic idea is that each sensor makes an initial binary decision that is then distributed to all the other sensors. The goal is to achieve a consensus on the given hypothesis through multiple iterations. They develop two versions of the algorithm; the first is a greedy approach that achieves fast convergence at the expense of performance. The nth-root approach constrains the consensus to be optimum in that it would match that of a centralized processor having access to all the data. The main issue is the number of parleys (iterations) required to reach this consensus.
The second component of a smart sensor network is the wireless communications network used to relay the sensor information. In essentially all of the work discussed above, the initialization, routing, and reconfiguration details of this network are not considered. The effect of the distributed algorithm on the use of networking resources is often not examined. When it has been examined, the effects of lost or corrupted messages on the performance of the detection or estimation algorithm have been typically neglected. An exception is Tang et al. , who studied robustness to the loss of communication links. Also, Thomopoulos and Zhang  make some assumptions about networking delays and channel errors. Recent work in distributed estimation  assumes error-free communication channels with capacity constraints.
Since a real wireless network imposes channel errors, delays, packet losses, and power and topology constraints, it is essential that the over-all sensor network design consider these factors. Typically after deployment, the first action for a sensor network is to determine its topology. This step is done because many of the traditional routing protocols require topological information for initialization. This is especially true for link state routing, which forms the basis for the open shortest path first algorithm used within autonomous systems in the Internet . In order to both conserve battery power and reduce the probability of detection by hostile forces, it is better to use a reactive routing protocol. This is a protocol that determines a route only when it is required.
Another design choice is whether the network has a flat or hierarchical architecture. Both have advantages and disadvantages. The former is more survivable since it does not have a single point of failure; it also allows multiple routes between nodes. The latter provides simpler network management, and can help further reduce transmissions.
In a mobile ad-hoc network, the changing network topology leads to a requirement that the network periodically re-configure itself. Not only must the routing protocols be able to handle this situation, but also the media access mechanism. Link parameters, such as modulation type, amount of channel coding, transmitter power, etc. , must adapt to the new configuration. While we are initially assuming that the sensors are stationary, the possibility of (deliberate) sensor destruction requires that the communications network be re-configurable. Moreover, the distributed detection and estimation algorithms must also have this capability.
Summary of Research Issues
As discussed above, there has been a relatively long history of research in distributed detection and estimation in one research community, and wireless networking in another. However, there has been much less overlap between these two communities. The DARPA-sponsored SensIt program is also beginning to address this topic, with the focus primarily on collaborative signal processing, network routing protocols, and query procedures. The main contribution of our work is to combine the two disciplines of distributed detection/estimation and wireless networking so that realistic smart sensor network configurations are developed, evaluated, and optimized. Based on these designs, research is being conducted to answer the following inter-related questions:
* How do the communication and networking effects, specifically routing, bandwidth, and power constraints determine the quality of the distributed detection and/or estimation?
* What is the load on the communication network caused by the distributed processing? How many messages/second and how many system resources are required to achieve a desired quality of service?
* How robust is the resulting sensor network to lost nodes? What is the mechanism for reconfiguration that allows the network to adapt to such loss events?