Fourth International Conference on Computer Science and Information Technology (CoSIT 2017)

Please download to get full document.

View again

of 9
All materials on our website are shared by users. If you have any questions about copyright issues, please report us to resolve them. We are always happy to assist you.
Information Report
Category:

Documents

Published:

Views: 4 | Pages: 9

Extension: PDF | Download: 0

Share
Related documents
Description
This paper presents a statistical framework for recognising 2D shapes which are represented as an arrangement of curves or strokes. The approach is a hierarchical one which mixes geometric and symbolic information in a three-layer architecture. Each curve primitive is represented using a point-distribution model which describes how its shape varies over a set of training data. We assign stroke labels to the primitives and these indicate to which class they belong. Shapes are decomposed into an arrangement of primitives and the global shape representation has two components. The first of these is a second point distribution model that is used to represent the geometric arrangement of the curve centre-points. The second component is a string of stroke labels that represents the symbolic arrangement of strokes. Hence each shape can be represented by a set of centre-point deformation parameters and a dictionary of permissible stroke label configurations. The hierarchy is a two-level architecture in which the curve models reside at the nonterminal lower level of the tree. The top level represents the curve arrangements allowed by the dictionary of permissible stroke combinations. The aim in recognition is to minimise the cross entropy between the probability distributions for geometric alignment errors and curve label errors. We show how the stroke parameters, shape-alignment parameters and stroke labels may be recovered by applying the expectation maximization EM algorithm to the utility measure. We apply the resulting shape-recognition method to Arabic character recognition.
Transcript
    Dhinaharan Nagamalai et al. (Eds) : CoSIT, SIGL, AIAPP, CYBI, CRIS, SEC, DMA - 2017 pp. 39– 47, 2017. © CS & IT-CSCP 2017 DOI : 10.5121/csit.2017.70405 H  ANDWRITTEN  C HARACTER R  ECOGNITION U SING  S TRUCTURAL  S HAPE D ECOMPOSITION   Abdullah A. Al-Shaher 1  and Edwin R. Hancock  2   1 Department of Computer and Information Systems, College of Business Studies, Public Authority for Applied Education and Training, Kuwait 2 Department of Computer Science, University of York, York, United Kingdom  A  BSTRACT This paper presents a statistical framework for recognising 2D shapes which are represented as an arrangement of curves or strokes. The approach is a hierarchical one which mixes geometric and symbolic information in a three-layer architecture. Each curve primitive is represented using a point-distribution model which describes how its shape varies over a set of training data. We assign stroke labels to the primitives and these indicate to which class they belong. Shapes are decomposed into an arrangement of primitives and the global shape representation has two components. The first of these is a second point distribution model that is used to represent the geometric arrangement of the curve centre-points. The second component is a string of stroke labels that represents the symbolic arrangement of strokes. Hence each shape can be represented by a set of centre-point deformation parameters and a dictionary of  permissible stroke label configurations. The hierarchy is a two-level architecture in which the curve models reside at the nonterminal lower level of the tree. The top level represents the curve arrangements allowed by the dictionary of permissible stroke combinations. The aim in recognition is to minimise the cross entropy between the probability distributions for geometric alignment errors and curve label errors. We show how the stroke parameters, shape-alignment  parameters and stroke labels may be recovered by applying the expectation maximization EM algorithm to the utility measure. We apply the resulting shape-recognition method to Arabic character recognition.  K   EYWORDS  point distribution models, expectation maximization algorithm, discrete relaxation, hierarchical mixture of experts, Arabic scripts, handwritten characters 1. I NTRODUCTION   The analysis and recognition of curved shapes has attracted considerable attention in the computer vision literature. Current work is nicely exemplified by point distribution models [1] and shape-contexts [2]. However, both of these methods are based on global shape-descriptors. This is potentially a limitation since a new model must be acquired for each class of shape and this is an inefficient process. An alternative and potentially more flexible route is to use a structural approach to the problem, in which shapes are decomposed into arrangements of  40 Computer Science & Information Technology (CS & IT) primitives. This idea was central to the work of Marr [3]. Shape learning may then be decomposed into a two-stage process. The first stage is to acquire a models of the variability is the distinct primitives. The second stage is to learn the arrangements of the primitives corresponding to different shape classes. Although this structural approach has found widespread use in the character recognition community, it has not proved popular in the computer vision domain. The reason for this is that the segmentation of shapes into stable primitives has proved to be an elusive problem. Hence, there is a considerable literature on curve polygonalisation, segmentation and grouping. However, one of the reasons that the structural approach has proved fragile is that it has been approached using geometric rather than statistical methods. Hence, the models learned and the recognition results obtained are highly sensitive to segmentation error. In order to overcome these problems, in this paper we explore the use of probabilistic framework for recognition. We focus on the problem of developing hierarchical shape models for handwritten Arabic characters. These characters are decomposed into concave or convex strokes. Our statistical learning architecture is reminiscent of the hierarchical mixture of experts algorithm. This is a variant of the expectation maximisation algorithm, which can deal with hierarchically structured models. The method was first introduced in 1994 by Jordan and Jacobs [4]. In its simplest form the method models data using a doubly nested mixture model. At the top layer the mixture is over a set of distinct object classes. This is sometimes referred to as the gating layer. At the lower level, the objects are represented as a mixture over object subclasses. These sub-classes feed into the gating later with predetermined weights. The parameters of the architecture reside in the sublayer, which is frequently represented using a Gaussian mixture model. The hierarchical mixture of experts algorithm provides a means of learning both the gating weights and the parameters of the Gaussian mixture model. Here our structural decomposition of 2D shapes is a hierarchical one which mixes geometric and symbolic information. The hierarchy has a two-layer architecture. At the bottom layer we have strokes or curve primitives. These fall into different classes. For each class the curve primitive is represented using a point-distribution model [5] which describes how its shape varies over a set of training data. We assign stroke labels to the primitives to distinguish their class identity. At the top level of the hierarchy, shapes are represented as an arrangement of primitives. The representation of the arrangement of primitives has two components. The first of these is a second point distribution model that is used to represent how the arrangement of the primitive centre-points varies over the training data. The second component is a dictionary of configurations of stroke labels that represents the arrangements of strokes at a symbolic level. Recognition hence involves assigning stroke symbols to curves primitives, and recovering both stroke and shape deformation parameters. We present a probabilistic framework which can be for the purposes of shape-recognition in the hierarchy. We apply the resulting shape-recognition method to Arabic character recognition. 2. S HAPE  R EPRESENTATION We are concerned with recognising a shape },...,{= 1  p wwW   which consists of a set of  p  ordered but unlabelled landmark points with 2D co-ordinate vectors 1 w , ....,  p w . The shape is assumed to be segmented into a set of K   non-overlapping strokes. Each stroke consists of a set  Computer Science & Information Technology (CS & IT) 41 of consecutive landmark points. The set of points belonging to the stroke indexed k   is k  S  . For each stroke, we compute the mean position ik S ik k  wS c  ρρ ∑ ∈ 1=  (1) The geometry of stroke arrangement is captured by the set of mean position vectors },....{= 1 K  ccC  . Our hierarchical model of the characters uses both geometric and symbolic representations of the shapes. The models are constructed from training data. Each training pattern consists of a set of landmark points that are segmented into strokes. We commence by specifying the symbolic components of the representation. Each training pattern is assigned to shape class and each component stroke is assigned to stroke class. The set of shape-labels is c Ω  and the set of stroke labels is s Ω . The symbolic structure of each shape is represented a permissible arrangement of stroke-labels. For shapes of class c Ω∈ ω   the permissible arrangement of strokes is denoted by >,...,=< 21 ω ω ω   λ λ  Λ  (2) We model the geometry of the strokes and stroke-centre arrangements using point distribution models. To capture the shape variations, we use training data. The data consists of a set of shapes which have been segmented into strokes. Let the th t   training pattern consist of the set of  p  landmark co-ordinate vectors },......,{= 1  pt   x x X  . Each training pattern is segmented into strokes. For the training pattern indexed t   there are t  K   strokes and the index set of the points belonging to the th k   stroke is t k  S  . To construct the point distribution model for the strokes and stroke-centre arrangements, we convert the point co-ordinates into long-vectors. For the training pattern indexed t  , the long-vector of stroke centres is T T t  LT t T t  t  ccc X  ))(,....,)(,)((= 21 . Similarly for the stroke indexed k   in the training pattern indexed t  , the long-vector of co-ordinates is denoted by k t   z , . For examples shapes belonging to the class ω  , to construct the stroke-centre point distribution model we need to first compute the mean long vector t T t   X T Y  ∑ ∈ ω ω ω  1=  (3) where ω  T   is the set of index patterns and the associated covariance matrix T t t T t  Y  X Y  X  T  ))((1= ω ω ω ω ω   −−Σ  ∑ ∈  (4)  42 Computer Science & Information Technology (CS & IT) The eigenmodes of the stroke-centre covariance matrix are used to construct the point-distribution model. First, the eigenvalues e  of the stroke covariance matrix are found by solving the eigenvalue equation 0|=|  I e ω ω   −Σ  where  I   is the  L L 22  ×  identity matrix. The eigen-vector i φ   corresponding to the eigenvalue ω  i e  is found by solving the eigenvector equation ω ω ω  φ φ  iii e = Σ . According to Cootes and Taylor [6], the landmark points are allowed to undergo displacements relative to the mean-shape in directions defined by the eigenvectors of the covariance matrix ω  Σ . To compute the set of possible displacement directions, the  M   most significant eigenvectors are ordered according to the magnitudes of their corresponding eigenvalues to form the matrix of column-vectors )|...||(= 21 ω ω ω ω   φ φ φ   M  Φ , where ω ω ω   M  eee ,.....,, 21  is the order of the magnitudes of the eigenvectors. The landmark points are allowed to move in a direction which is a linear combination of the eigenvectors. The updated landmark positions are given by ω ω ω   γ   Φ+ Y  X  =ˆ , where ω  γ    is a vector of modal co-efficients. This vector represents the free-parameters of the global shape-model. This procedure may be repeated to construct a point distribution model for each stroke class. The set of long vectors for strokes of class λ   is }=|{= ,  λ λ  λ  t k k t   Z T  . The mean and covariance matrix for this set of long-vectors are denoted by λ  Y   and λ  Σ  and the associated modal matrix is λ  Φ . The point distribution model for the stroke landmark points is λ λ λ   γ   Φ+ Y  Z  =ˆ . We have recently described how a mixture of point-distribution models may be fitted to samples of shapes. The method is based on the EM algorithm and can be used to learn point distribution models for both the stroke and shape classes in an unsupervised manner. We have used this method to learn the mean shapes and the modal matrices for the strokes. More details of the method are found in [7]. 3. H IERARCHICAL  A RCHITECTURE With the stroke and shape point distribution models to hand, our recognition method proceeds in a hierarchical manner. To commence, we make maximum likelihood estimates of the best-fit parameters of each stroke-model to each set of stroke-points. The best-fit parameters k  λ  γ    of the stroke-model with class-label λ   to the set of points constituting the stroke indexed k   is ),|( maxarg=  γ  γ   λ γ  λ   Φ k k   z p  (5) We use the best-fit parameters to assign a label to each stroke. The label is that which has maximum a posteriori probability given the stroke parameters. The label assigned to the stroke indexed k   is
Recommended
View more...
We Need Your Support
Thank you for visiting our website and your interest in our free products and services. We are nonprofit website to share and download documents. To the running of this website, we need your help to support us.

Thanks to everyone for your continued support.

No, Thanks