PAMI, July 2009

Academic Progress   2009-05-26 11:07

1. A Fast 2D Shape Recovery Approach by Fusing Features and Appearance

In this paper, we present a fusion approach to solve the nonrigid shape recovery problem, which takes advantage of both the appearance information and the local features. We have two major contributions. First, we propose a novel progressive finite Newton optimization scheme for the feature-based nonrigid surface detection problem, which is reduced to only solving a set of linear equations. The key is to formulate the nonrigid surface detection as an unconstrained quadratic optimization problem that has a closed-form solution for a given set of observations. Second, we propose a deformable Lucas-Kanade algorithm that triangulates the template image into small patches and constrains the deformation through the second-order derivatives of the mesh vertices. We formulate it into a sparse regularized least squares problem, which is able to reduce the computational cost and the memory requirement. The inverse compositional algorithm is applied to efficiently solve the optimization problem. We have conducted extensive experiments for performance evaluation on various environments, whose promising results show that the proposed algorithm is both efficient and effective.

Nonrigid shape recovery using appearance information together with local features.

2. Context-Aware Visual Tracking

Enormous uncertainties in unconstrained environments lead to a fundamental dilemma that many tracking algorithms have to face in practice: Tracking has to be computationally efficient, but verifying whether or not the tracker is following the true target tends to be demanding, especially when the background is cluttered and/or when occlusion occurs. Due to the lack of a good solution to this problem, many existing methods tend to be either effective but computationally intensive by using sophisticated image observation models or efficient but vulnerable to false alarms. This greatly challenges long-duration robust tracking. This paper presents a novel solution to this dilemma by considering the context of the tracking scene. Specifically, we integrate into the tracking process a set of auxiliary objects that are automatically discovered in the video on the fly by data mining. Auxiliary objects have three properties, at least in a short time interval: 1) persistent co-occurrence with the target, 2) consistent motion correlation to the target, and 3) easy to track. Regarding these auxiliary objects as the context of the target, the collaborative tracking of these auxiliary objects leads to efficient computation as well as strong verification. Our extensive experiments have exhibited exciting performance in very challenging real-world testing cases.

Uses a set of auxiliary objects as context for the tracked target, balancing computational speed against tracking quality.

3. A Novel Algorithm for Detecting Singular Points from Fingerprint Images

Fingerprint analysis is typically based on the location and pattern of detected singular points in the images. These singular points (cores and deltas) not only represent the characteristics of local ridge patterns but also determine the topological structure (i.e., fingerprint type) and largely influence the orientation field. In this paper, we propose a novel algorithm for singular points detection. After an initial detection using the conventional Poincaré Index method, a so-called DORIC feature is used to remove spurious singular points. Then, the optimal combination of singular points is selected to minimize the difference between the original orientation field and the model-based orientation field reconstructed using the singular points. A core-delta relation is used as a global constraint for the final selection of singular points. Experimental results show that our algorithm is accurate and robust, giving better results than competing approaches. The proposed detection algorithm can also be used for more general 2D oriented patterns, such as fluid flow motion, and so forth.

Finding singular points in fingerprint images.
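The abstract takes the conventional Poincaré Index method as its starting point. As a rough illustration of that first step only (not the DORIC feature or the model-based selection, which the abstract does not detail), the sketch below sums the wrapped orientation differences around the 8-neighbourhood of a pixel. The function name, the list-of-lists field layout, and the traversal order are assumptions made for this example; which sign marks a core and which a delta depends on the traversal direction and axis convention.

```python
import math

def poincare_index(orient, i, j):
    """Poincare index at pixel (i, j) of an orientation field.

    orient[i][j] holds ridge orientations in radians, defined modulo pi.
    Assumed layout for this sketch: a list of rows, with (i, j) not on the border.
    """
    # Closed loop around the pixel, visiting the 8 neighbours in order.
    ring = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
            (1, 1), (1, 0), (1, -1), (0, -1)]
    angles = [orient[i + di][j + dj] for di, dj in ring]
    total = 0.0
    for k in range(8):
        d = angles[(k + 1) % 8] - angles[k]
        # Orientations are only defined modulo pi, so wrap into (-pi/2, pi/2].
        while d > math.pi / 2:
            d -= math.pi
        while d <= -math.pi / 2:
            d += math.pi
        total += d
    return total / (2 * math.pi)

# Synthetic 3x3 field that rotates by half the polar angle around the centre.
field = [[(0.5 * math.atan2(i - 1, j - 1)) % math.pi for j in range(3)]
         for i in range(3)]
index_at_centre = poincare_index(field, 1, 1)
```

Pixels whose index is close to plus or minus one half are singular-point candidates; values near zero indicate an ordinary ridge region.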

4. Minimum Distance between Pattern Transformation Manifolds: Algorithm and Applications

Transformation invariance is an important property in pattern recognition, where different observations of the same object typically receive the same label. This paper focuses on a transformation-invariant distance measure that represents the minimum distance between the transformation manifolds spanned by patterns of interest. Since these manifolds are typically nonlinear, the computation of the manifold distance (MD) becomes a nonconvex optimization problem. We propose representing a pattern of interest as a linear combination of a few geometric functions extracted from a structured and redundant basis. Transforming the pattern results in the transformation of its constituent parts. We show that, when the transformation is restricted to a synthesis of translations, rotations, and isotropic scalings, such a pattern representation results in a closed-form expression of the manifold equation with respect to the transformation parameters. The MD computation can then be formulated as a minimization problem whose objective function is expressed as the difference of convex functions (DC). This interesting property permits optimally solving the optimization problem with DC programming solvers that are globally convergent. We present experimental evidence which shows that our method is able to find the globally optimal solution, outperforming existing methods that yield suboptimal solutions.

Measuring the distance between different transformations of patterns.

5. Combining Slanted-Frame Classifiers for Improved HMM-Based Arabic Handwriting Recognition

The problem addressed in this study is the offline recognition of handwritten Arabic city names. The names are assumed to belong to a fixed lexicon of about 1,000 entries. A state-of-the-art classical right-left hidden Markov model (HMM)-based recognizer (reference system) using the sliding window approach is developed. The feature set includes both baseline-independent and baseline-dependent features. The analysis of the errors made by the recognizer shows that the inclination, overlap, and shifted positions of diacritical marks are major sources of errors. In this paper, we propose coping with these problems. Our approach relies on the combination of three homogeneous HMM-based classifiers. All classifiers have the same topology as the reference system and differ only in the orientation of the sliding window. We compare three combination schemes of these classifiers at the decision level. Our reported results on the benchmark IFN/ENIT database of Arabic Tunisian city names give a recognition rate higher than 90 percent and demonstrate the superiority of the neural network-based combination. Our results also show that the combination of classifiers performs better than a single classifier dealing with slant-corrected images and that the approach is robust for a wide range of orientation angles.

An improvement to HMM-based handwriting recognition.

6. Classification Based on Hybridization of Parametric and Nonparametric Classifiers

Parametric methods of classification assume specific parametric models for competing population densities (e.g., Gaussian population densities can lead to linear and quadratic discriminant analysis) and they work well when these model assumptions are valid. Violation of one or more of these parametric model assumptions often leads to a poor classifier. On the other hand, nonparametric classifiers (e.g., nearest-neighbor and kernel-based classifiers) are more flexible and free from parametric model assumptions. But, the statistical instability of these classifiers may lead to poor performance when we have small numbers of training sample observations. Nonparametric methods, however, do not use any parametric structure of population densities. Therefore, even when one has some additional information about population densities, that important information is not used to modify the nonparametric classification rule. This paper makes an attempt to overcome these limitations of parametric and nonparametric approaches and combines their strengths to develop some hybrid classification methods. We use some simulated examples and benchmark data sets to examine the performance of these hybrid discriminant analysis tools. Asymptotic results on their misclassification rates have been derived under appropriate regularity conditions.

A classification method that hybridizes parametric and nonparametric classifiers.

7. Preprocessing of Low-Quality Handwritten Documents Using Markov Random Fields

This paper presents a statistical approach to the preprocessing of degraded handwritten forms including the steps of binarization and form line removal. The degraded image is modeled by a Markov Random Field (MRF) where the hidden-layer prior probability is learned from a training set of high-quality binarized images and the observation probability density is learned on-the-fly from the gray-level histogram of the input image. We have modified the MRF model to drop the preprinted ruling lines from the image. We use the patch-based topology of the MRF and Belief Propagation (BP) for efficiency in processing. To further improve the processing speed, we prune unlikely solutions from the search space while solving the MRF. Experimental results show higher accuracy on two data sets of degraded handwritten images than previously used methods.

Preprocessing of handwritten documents using Markov random fields.

8. A Constant-Time Algorithm for Finding Neighbors in Quadtrees

Quadtrees and linear quadtrees are well-known hierarchical data structures to represent square images of size 2^{r} × 2^{r}. Finding the neighbors of a specific leaf node is a fundamental operation for many algorithms that manipulate quadtree data structures. In quadtrees, finding neighbors takes O(r) computational time for the worst case, where r is the resolution (or height) of a given quadtree. Schrack [1] proposed a constant-time algorithm for finding equal-sized neighbors in linear quadtrees. His algorithm calculates the location codes of equal-sized neighbors; it says nothing, however, about their existence. To ensure their existence, additional checking of the location codes is needed, which usually takes O(r) computational time. In this paper, a new algorithm to find the neighbors of a given leaf node in a quadtree is proposed which requires just O(1) (i.e., constant) computational time for the worst case. Moreover, the algorithm takes no notice of the existence or nonexistence of neighbors. Thus, no additional checking is needed. The new algorithm will greatly reduce the computational complexities of almost all algorithms based on quadtrees.

Finding neighbors in quadtrees.
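The abstract does not spell out the new algorithm, so the following is only a sketch of the older idea it refers to: computing the location code of an equal-sized neighbor with a fixed number of bit operations on an interleaved code, in the spirit of Schrack's method. The function names, the bit layout (x in even bits, y in odd bits), and the image-bounds check are assumptions made for this illustration. Like the approach described above, it says nothing about whether the neighbor actually exists as a node in the tree.

```python
def interleave(x, y, r):
    """Build a 2r-bit location code: x in even bits, y in odd bits (demo helper)."""
    code = 0
    for b in range(r):
        code |= ((x >> b) & 1) << (2 * b)
        code |= ((y >> b) & 1) << (2 * b + 1)
    return code

def neighbor(code, dx, dy, r):
    """Location code of the equal-sized neighbor at offset (dx, dy), each in {-1, 0, 1}.

    Returns None when the neighbor would fall outside the 2^r x 2^r image.
    The masks are rebuilt on every call for clarity; for a fixed r they are constants,
    after which the arithmetic below is a fixed number of word operations.
    """
    tx = sum(1 << (2 * b) for b in range(r))  # 0101...01: positions of the x bits
    ty = tx << 1                              # 1010...10: positions of the y bits
    xs, ys = code & tx, code & ty
    if dx == 1:
        if xs == tx:
            return None
        # Filling the y bits with ones lets the carry skip over them.
        xs = ((code | ty) + 1) & tx
    elif dx == -1:
        if xs == 0:
            return None
        # With the y bits cleared, the borrow passes straight through them.
        xs = (xs - 1) & tx
    if dy == 1:
        if ys == ty:
            return None
        ys = ((code | tx) + 1) & ty
    elif dy == -1:
        if ys == 0:
            return None
        ys = (ys - 1) & ty
    return xs | ys

# East neighbor of cell (3, 5) in an 8 x 8 image (r = 3).
east = neighbor(interleave(3, 5, 3), 1, 0, 3)
```

Checking that a returned code corresponds to a real leaf, possibly of a different size, is the extra step that the abstract says usually costs O(r).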

9. A New Distance Measure for Model-Based Sequence Clustering

We review the existing alternatives for defining model-based distances for clustering sequences and propose a new one based on the Kullback-Leibler divergence. This distance is shown to be especially useful in combination with spectral clustering. For improved performance in real-world scenarios, a model selection scheme is also proposed.

A method for computing distances for sequence clustering.

10. 3D Shape Recovery of Smooth Surfaces: Dropping the Fixed-Viewpoint Assumption

We present a new method for recovering the 3D shape of a featureless smooth surface from three or more calibrated images illuminated by different light sources (three of them are independent). This method is unique in its ability to handle images taken from unconstrained perspective viewpoints and unconstrained illumination directions. The correspondence between such images is hard to compute and no other known method can handle this problem locally from a small number of images. Our method combines geometric and photometric information in order to recover dense correspondence between the images and accurately computes the 3D shape. Only a single pass starting at one point and local computation are used. This is in contrast to methods that use the occluding contours recovered from many images to initialize and constrain an optimization process. The output of our method can be used to initialize such processes. In the special case of fixed viewpoint, the proposed method becomes a new perspective photometric stereo algorithm. Nevertheless, with the introduction of the multiview setup, self-occlusions and regions close to the occluding boundaries are better handled, and the method is more robust to noise than photometric stereo. Experimental results are presented for simulated and real images.

A 3D shape recovery method.

11. A Novel Feature Selection Methodology for Automated Inspection Systems

This paper proposes a new feature selection methodology. The methodology is based on the stepwise variable selection procedure, but, instead of using the traditional discriminant metrics such as Wilks' Lambda, it uses an estimation of the misclassification error as the figure of merit to evaluate the introduction of new features. The expected misclassification error rate (MER) is obtained by using the densities of a constructed function of random variables, which is the stochastic representation of the conditional distribution of the quadratic discriminant function estimate. The application of the proposed methodology results in significant savings of computational time in the estimation of classification error over the traditional simulation and cross-validation methods. One of the main advantages of the proposed method is that it provides a direct estimation of the expected misclassification error at the time of feature selection, which provides an immediate assessment of the benefits of introducing an additional feature into an inspection/class

A feature selection method that uses the misclassification error rate as its criterion and saves computation.

12. Generalized Risk Zone: Selecting Observations for Classification

In this paper, we extend the risk zone concept by creating the Generalized Risk Zone. The Generalized Risk Zone is a model-independent scheme to select key observations in a sample set. The observations belonging to the Generalized Risk Zone have shown comparable, in some experiments even better, classification performance when compared to the use of the whole sample. The main tool that allows this extension is the Cauchy-Schwarz divergence, used as a measure of dissimilarity between probability densities. To overcome the setback concerning pdf estimation, we used the ideas provided by Information Theoretic Learning, allowing the calculation to be performed on the available observations only. We used the proposed methodology with Learning Vector Quantization, feedforward Neural Networks, Support Vector Machines, and Nearest Neighbors.

Finding the key observations within a classifier's training set.
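To make the central quantity concrete, here is a minimal sketch of a sample-based Cauchy-Schwarz divergence, D_CS(p, q) = -log( (∫pq)^2 / (∫p^2 ∫q^2) ), estimated directly from observations with Gaussian Parzen windows as is common in Information Theoretic Learning. It is not the paper's Generalized Risk Zone procedure, which the abstract does not describe. The one-dimensional data, the function names, and the kernel width `sigma` are assumptions chosen for illustration.

```python
import math

def gauss(d, sigma):
    """One-dimensional Gaussian kernel evaluated at difference d."""
    return math.exp(-d * d / (2 * sigma * sigma)) / (math.sqrt(2 * math.pi) * sigma)

def cross_information_potential(xs, ys, sigma):
    """Parzen estimate of the integral of p(x) q(x) from samples xs ~ p and ys ~ q.

    Convolving two Gaussian windows of width sigma gives one of width sigma * sqrt(2),
    so the integral reduces to a double sum over sample pairs.
    """
    s = sigma * math.sqrt(2.0)
    return sum(gauss(x - y, s) for x in xs for y in ys) / (len(xs) * len(ys))

def cs_divergence(xs, ys, sigma=1.0):
    """Cauchy-Schwarz divergence between the Parzen estimates of two samples."""
    pq = cross_information_potential(xs, ys, sigma)
    pp = cross_information_potential(xs, xs, sigma)
    qq = cross_information_potential(ys, ys, sigma)
    return -math.log(pq * pq / (pp * qq))

# Hypothetical samples: b overlaps a, c lies far away from it.
a = [0.0, 0.1, -0.1]
b = [0.2, 0.3, 0.1]
c = [5.0, 5.1, 4.9]
near, far = cs_divergence(a, b), cs_divergence(a, c)
```

The estimate is zero when both samples are identical and grows as the two sample sets separate, and it needs no explicit density model, only pairwise kernel evaluations.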

13. Sign Language Spotting with a Threshold Model Based on Conditional Random Fields

Sign language spotting is the task of detecting and recognizing signs in a signed utterance, in a set vocabulary. The difficulty of sign language spotting is that instances of signs vary in both motion and appearance. Moreover, signs appear within a continuous gesture stream, interspersed with transitional movements between signs in a vocabulary and nonsign patterns (which include out-of-vocabulary signs, epentheses, and other movements that do not correspond to signs). In this paper, a novel method for designing threshold models in a conditional random field (CRF) model is proposed which performs an adaptive threshold for distinguishing between signs in a vocabulary and nonsign patterns. A short-sign detector, a hand appearance-based sign verification method, and a subsign reasoning method are included to further improve sign language spotting accuracy. Experiments demonstrate that our system can spot signs from continuous data with an 87.0 percent spotting rate and can recognize signs from isolated data with a 93.5 percent recognition rate versus 73.5 percent and 85.4 percent, respectively, for CRFs without a threshold model, short-sign detection, subsign reasoning, and hand appearance-based sign verification. Our system can also achieve a 15.0 percent sign error rate (SER) from continuous data and a 6.4 percent SER from isolated data versus 76.2 percent and 14.5 percent, respectively, for conventional CRFs.

Sign language recognition based on conditional random fields.

14. An O(N^{2}) Square Root Unscented Kalman Filter for Visual Simultaneous Localization and Mapping

This paper develops a Square Root Unscented Kalman Filter (SRUKF) for performing video-rate visual simultaneous localization and mapping (SLAM) using a single camera. The conventional UKF has been proposed previously for SLAM, improving the handling of nonlinearities compared with the more widely used Extended Kalman Filter (EKF). However, no account was taken of the comparative complexity of the algorithms: In SLAM, the UKF scales as O(N^{3}) in the state length, compared to the EKF's O(N^{2}), making it unsuitable for video-rate applications with other than unrealistically few scene points. Here, it is shown that the SRUKF provides the same results as the UKF to within machine accuracy and that it can be reposed with complexity O(N^{2}) for state estimation in visual SLAM. This paper presents results from video-rate experiments on live imagery. Trials using synthesized data show that the consistency of the SRUKF is routinely better than that of the EKF, but that its overall cost settles at an order of magnitude greater than the EKF for large scenes.

An improved Kalman filter for localization and mapping.

15. Supervised Learning of Quantizer Codebooks by Information Loss Minimization

This paper proposes a technique for jointly quantizing continuous features and the posterior distributions of their class labels based on minimizing empirical information loss such that the quantizer index of a given feature vector approximates a sufficient statistic for its class label. Informally, the quantized representation retains as much information as possible for classifying the feature vector correctly. We derive an alternating minimization procedure for simultaneously learning codebooks in the Euclidean feature space and in the simplex of posterior class distributions. The resulting quantizer can be used to encode unlabeled points outside the training set and to predict their posterior class distributions, and has an elegant interpretation in terms of lossless source coding. The proposed method is validated on synthetic and real data sets and is applied to two diverse problems: learning discriminative visual vocabularies for bag-of-features image classification and image segmentation.

Learning quantizer codebooks by minimizing information loss.

16. A Stochastic Filtering Technique for Fluid Flow Velocity Fields Tracking

In this paper, we present a method for the temporal tracking of fluid flow velocity fields. The technique we propose is formalized within a sequential Bayesian filtering framework. The filtering model combines an Itô diffusion process coming from a stochastic formulation of the vorticity-velocity form of the Navier-Stokes equation and discrete measurements extracted from the image sequence. In order to handle a state space of reasonable dimension, the motion field is represented as a combination of adapted basis functions, derived from a discretization of the vorticity map of the fluid flow velocity field. The resulting nonlinear filtering problem is solved with the particle filter algorithm in continuous time. An adaptive dimensional reduction method is applied to the filtering technique, relying on dynamical systems theory. The efficiency of the tracking method is demonstrated on synthetic and real-world sequences.

Tracking of fluid flow velocity fields.
