TY - JOUR
T1 - Learning Recognition and Segmentation Using the Cresceptron
AU - Weng, John
AU - Ahuja, Narendra
AU - Huang, Thomas S.
N1 - Funding Information: The work was supported by the Defense Advanced Research Projects Agency and the National Science Foundation under grant IRI-8902728, NSF under IRI-9410741, Office of Naval Research under N00014-95-1-0637 and N00014-96-1-0129.
PY - 1997
Y1 - 1997
N2 - This paper presents a framework called Cresceptron for view-based learning, recognition and segmentation. Specifically, it recognizes and segments image patterns that are similar to those learned, using a stochastic distortion model and view-based interpolation, allowing other view points that are moderately different from those used in learning. The learning phase is interactive. The user trains the system using a collection of training images. For each training image, the user manually draws a polygon outlining the region of interest and types in the label of its class. Then, from the directional edges of each of the segmented regions, the Cresceptron uses a hierarchical self-organization scheme to grow a sparsely connected network automatically, adaptively and incrementally during the learning phase. At each level, the system detects new image structures that need to be learned and assigns a new neural plane for each new feature. The network grows by creating new nodes and connections which memorize the new image structures and their context as they are detected. Thus, the structure of the network is a function of the training exemplars. The Cresceptron incorporates both individual learning and class learning; with the former, each training example is treated as a different individual while with the latter, each example is a sample of a class. In the performance phase, segmentation and recognition are tightly coupled. No foreground extraction is necessary, which is achieved by backtracking the response of the network down the hierarchy to the image parts contributing to recognition. Several stochastic shape distortion models are analyzed to show why multilevel matching such as that in the Cresceptron can deal with more general stochastic distortions that a single-level matching scheme cannot. The system is demonstrated using images from broadcast television and other video segments to learn faces and other objects, and then later to locate and to recognize similar, but possibly distorted, views of the same objects.
KW - Associative memory
KW - Face detection
KW - Face recognition
KW - Feature extraction
KW - Feature selection
KW - Object recognition
KW - Object segmentation
KW - Self-organization
KW - Shape representation
KW - Visual learning
UR - http://www.scopus.com/inward/record.url?scp=0031274976&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=0031274976&partnerID=8YFLogxK
U2 - 10.1023/A:1007967800668
DO - 10.1023/A:1007967800668
M3 - Article
AN - SCOPUS:0031274976
SN - 0920-5691
VL - 25
SP - 109
EP - 143
JO - International Journal of Computer Vision
JF - International Journal of Computer Vision
IS - 2
ER -