Stochastic processes and Differential geometry

2D-SHAPES DEFORMATIONS AND SIMILARITY
STOCHASTIC DEFORMATIONS OF 2D-SHAPES
3D-SHAPES RECOGNITION
IMAGE CONTENTS : INDEXATION AND BROWSING
AUTOMATIC INDEXATION OF SEMANTIC CONTENTS

 

2D-SHAPES DEFORMATIONS AND SIMILARITY
1994-1999

In 1994, I prepared and presented to major French national research funding agencies (DRET and CEA) an ambitious three-year research program on automatic classification of 2D planar shapes extracted from digital images, and obtained supporting grants for my SUDIMAGE consulting group. This research program focused on several exciting scientific goals: fast automatic classification, compression, and recognition of very large families of bounded connected 2D-shapes automatically extracted from large databases of digital images.
My main goal was to conceive, understand, and test sound mathematical approaches formalizing the fuzzy notions of visual similarity and shape recognition for planar shapes, in order to create fast programs for efficient automatic storage and retrieval of millions of planar shapes. One crucial functionality was the fast automatic extraction of all stored shapes sufficiently “similar” to any incoming digitized planar shape.
The first steps to be computerized were the massive extraction of bounded connected 2D-shapes, followed by automatic texture and colour segmentation of these shapes, construction of their piecewise smooth planar outlines, and storage of the results in adequately structured databases. I had then identified eight important scientific subgoals and the mathematical methods to attack them:
- define robust distances between any two families of multi-contours
- define geometrically invariant “shape distances” between planar shape outlines
- extract “signatures” based on curvature extrema for piecewise smooth curves
- define rigorously “texture distances” and distances between “texture distributions”
- define rigorously “colour distances” and distances between “colour distributions”
- compress extensive bases of 2D-shapes by epsilon-net coverings
- generate a dictionary of prototype shapes
- classify new planar shapes by extraction of the closest prototype in the dictionary.
In collaboration with Jose PAUMARD during his PhD thesis, we introduced simple but powerful ideas to define and compute “visual” distances between any two families of multi-contours: we introduced a very efficient probabilistic “censorship” to compute robust versions of Hausdorff distances between any pair of 2D-vignettes, after pretreatment by continuous contour line extraction, scale matching, and small geometric distortions.
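To make this concrete, here is a minimal sketch of a quantile-censored Hausdorff distance between two sampled contours, assuming the probabilistic “censorship” amounts to discarding a fraction of the worst-matched points before taking the maximum; the function name, the quantile parameter, and the toy example are illustrative only, not the actual SUDIMAGE code.

```python
import numpy as np

def censored_hausdorff(A, B, q=0.9):
    """Quantile-censored Hausdorff distance between two point sets.

    A, B : (n, 2) and (m, 2) arrays of contour points.
    q    : quantile in (0, 1]; q = 1.0 gives the classical Hausdorff distance,
           smaller q "censors" the worst-matched points to gain robustness.
    """
    # Pairwise Euclidean distances between the two contour samplings.
    d = np.linalg.norm(A[:, None, :] - B[None, :, :], axis=2)
    # For each point of A, distance to its nearest point of B, and conversely.
    dA = d.min(axis=1)
    dB = d.min(axis=0)
    # Replace the usual max by an upper quantile: a fraction (1 - q) of the
    # worst matches is discarded on each side before taking the larger value.
    return max(np.quantile(dA, q), np.quantile(dB, q))

# Example: two noisy samplings of the same circle remain close even when a
# few gross outlier points are present.
t = np.linspace(0.0, 2.0 * np.pi, 200)
circle1 = np.stack([np.cos(t), np.sin(t)], axis=1)
circle2 = circle1 + 0.01 * np.random.default_rng(0).normal(size=circle1.shape)
circle2[:5] += 3.0  # a few gross outliers
print(censored_hausdorff(circle1, circle2, q=0.9))
```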
Smart programming techniques allowed PAUMARD to implement optimized versions of these censored Hausdorff distances, and to build computationally intensive algorithms for crude but fast scene analysis in 2D-images. This approach was successfully tested to automatically locate and identify “known” 3D-objects, namely buildings, in large-scale 2D-images. Each “known” 3D-building was initially crudely “defined” by a finite sample of 2D-photos taken from various angles.
A whole family of interesting cross-related papers by Yali AMIT, Ulf GRENANDER, Demetri TERZOPOULOS, Laurent COHEN, Pierre CINQUIN, and Michael MILLER were at the time actively exploring the automatic matching of images by elastic deformations of curves and surfaces, and I launched several seminars at Ecole Normale Sup. on these topics. After reading a stimulating paper by Yali AMIT on Gaussian modeling of unknown elastic distortions yielding the “best” matching between two images, I had perceived that a key intrinsic mathematical limitation of AMIT’s paper was that, in his model, the “best” elastic deformation matching two images could very well fail to be a one-to-one diffeomorphism of the square.
Given my background in McKEAN style “multiplicative” stochastic integrals, I soon understood that a crucial mathematical concept still missing from the interesting Yali AMIT papers I was reading was a model of randomness for “paths” of small deformations within diffeomorphism groups, so that the search for optimal deformations would still yield an effective diffeomorphism. I hence began working on the exciting links between left invariant metrics on infinite dimensional Lie groups of diffeomorphisms of the unit interval (resp. of the unit square, or of the unit cube), and the rigorous construction of “deformation invariant” distances between two curves, two surfaces, or two images.
It became mathematically clear to me that the computation of such distances involved solving explicit variational problems, equivalent to the construction of precise geodesics in the group of deformations considered. Such deformation paths with minimal deformation “energy” had in fact been encountered much earlier by ARNOLD and MARSDEN in fluid mechanics.
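As a rough illustration of the variational problem involved (with notation introduced here only for this sketch: V denotes a Hilbert space of smooth velocity fields and φ_t the flow they generate), the deformation distance between two shapes can be written as the minimal “kinetic energy” of a path of diffeomorphisms carrying one shape onto the other:

```latex
% Sketch of the variational problem: the deformation distance between two
% shapes S_0 and S_1 is the minimal kinetic energy of a path of
% diffeomorphisms carrying S_0 onto S_1.
\[
  d(S_0, S_1)^2 \;=\; \inf_{v} \int_0^1 \lVert v_t \rVert_V^2 \, dt,
  \qquad \text{subject to} \quad
  \partial_t \varphi_t = v_t \circ \varphi_t, \quad
  \varphi_0 = \mathrm{id}, \quad
  \varphi_1(S_0) = S_1 .
\]
% Minimizing paths are geodesics for an invariant metric on the group of
% diffeomorphisms, and their Euler equations are of the type studied by
% Arnold and Marsden in fluid mechanics.
```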
I gave a communication on this beautiful theme at a 1994 congress on Monte-Carlo methods in Cortona, Italy, and strongly encouraged Laurent YOUNES and Alain TROUVE to undertake a methodical study of its feasibility. They both wrote several elegant mathematical papers solidifying this “Lie group of deformations” approach and testing the associated variational calculus on actual image matching by deformations. Since then, this field has bloomed into the fascinating domain of computational anatomy, where both TROUVE and YOUNES have become remarkable and internationally known experts.
In the simplest case of planar closed curves, I had pointed out that this approach and the use of Euler equations would actually yield explicit closed formulas for deformation invariant distances between two geometric curves. L. YOUNES studied this case thoroughly and obtained very satisfying, elegant closed-form mathematical expressions for such distances.
I then launched a major R&D effort for the SUDIMAGE group, with W.F. XIONG, J.P. WANG, F. COLDEFY, and J. PAUMARD, to undertake, under YOUNES’ technical direction, optimized coding implementations of massive computations of such “deformation distances” for very large databases of planar shape outlines. These shape databases became a fascinating experimentation and testing ground for storage and fast approximate retrieval of planar shapes in large “shape databases”. SUDIMAGE went on from there to the completion of the three-year research program outlined above, which then led in 2001 to the launching of a commercial navigation software for image databases.

 

 


STOCHASTIC DEFORMATIONS OF 2D-SHAPES
1994-1999

Simultaneously, I pursued another research path, directing the PhD theses of Pierre BENARD and Eric LOHMER, to explore stochastic deformations of curves (and of interrelated sets of curves) in shape recognition. With BENARD, we showed that the discrete schemes for Gaussian stochastic deformations of curves introduced by GRENANDER and his school converged towards explicit continuous random deformation paths in the appropriate Lie group of curve deformations.
With LOHMER’s PhD, focused on automatic recognition of handwritten characters, we extracted, from a large US database of handwritten data, 26 sparse and robust stochastic implicit models of random deformations for the 26 alphabet letters. These stochastic deformation models, learned from the image data, then naturally provided maximum likelihood discrimination between alphabet letters, and thus generated efficient automatic recognition of handwritten alphabet letters.
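As a simplified illustration of maximum likelihood discrimination under learned deformation models (the actual models were implicit stochastic deformation models; here each letter class is merely summarized by a Gaussian model of its deformation features, and all names are illustrative, not LOHMER’s code):

```python
import numpy as np

class GaussianDeformationClassifier:
    """Toy maximum-likelihood classifier: each letter class is summarized by a
    Gaussian model of its deformation features; the per-class means and
    covariances play the role of the "learned" deformation model here."""

    def fit(self, features_by_class):
        # features_by_class: dict mapping letter label -> (n_samples, dim) array
        self.models = {}
        for label, X in features_by_class.items():
            mean = X.mean(axis=0)
            cov = np.cov(X, rowvar=False) + 1e-6 * np.eye(X.shape[1])
            self.models[label] = (mean, np.linalg.inv(cov),
                                  np.linalg.slogdet(cov)[1])
        return self

    def predict(self, x):
        # Maximum-likelihood rule: pick the class whose Gaussian model gives
        # the highest log-density at the observed deformation features x.
        def loglik(model):
            mean, prec, logdet = model
            diff = x - mean
            return -0.5 * (diff @ prec @ diff + logdet)
        return max(self.models, key=lambda lbl: loglik(self.models[lbl]))

# Usage sketch: clf = GaussianDeformationClassifier().fit({'A': XA, 'B': XB})
#               letter = clf.predict(deformation_features_of_new_character)
```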
In a similar spirit, the SUDIMAGE group undertook, on its own funds, the implementation by J.P. WANG and L. YOUNES of automated algorithmic recognition of human faces, based on key feature extraction and elastic deformation matching of key feature spatial groupings.


 

3D-SHAPES RECOGNITION
1994-1999

With the SUDIMAGE team and my research group at the Ecole Normale Sup. CMLA lab, I had also launched a foray into generic recognition tools for 3D-shapes, based on the automatic computation of empirical aspect graphs extracted from a multiplicity of 2D views. My practical three-year goal was to conceive and realize fast and efficient retrieval algorithmics for large databases of 3D virtual objects, with numerous potential industrial applications for intensive end users and designers of 3D synthetic objects.
When cameras move around a 3D piecewise smooth object, the outlines of its 2D images undergo abrupt geometric “catastrophes”. These “aspect” changes, identified by discontinuities in local geometric characteristics, provided early computer vision algorithmics with ad hoc 3D-object identifiers, the so-called “aspect graphs”, with pairwise connections between contiguous aspects.
The high complexity of exact “aspect graphs” for algebraic surfaces had recently been amply demonstrated, and I felt that cruder empirical versions of aspect graphs would be the only feasible ones for practical 3D object recognition. In his thesis under my direction, Patrick HERBIN validated the feasibility of 3D-object discrimination by randomly moving cameras, through incremental extraction of empirical aspect graphs. A key point was to summarize such empirical graphs by finite time observation of finite-state Markov chains, and to estimate maximum likelihood discrimination rates thanks to large deviations asymptotics for Markov chains.
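A minimal sketch of this Markov chain summarization, assuming each observed view has already been labeled by a discrete “aspect” index (the functions and the smoothing parameter are illustrative, not HERBIN’s actual implementation):

```python
import numpy as np

def estimate_transitions(sequences, n_states, alpha=1.0):
    """Estimate a Markov transition matrix over 'aspect' labels from observed
    aspect sequences (Laplace smoothing with parameter alpha)."""
    counts = np.full((n_states, n_states), alpha)
    for seq in sequences:
        for a, b in zip(seq[:-1], seq[1:]):
            counts[a, b] += 1
    return counts / counts.sum(axis=1, keepdims=True)

def loglik(seq, P):
    """Log-likelihood of an observed aspect sequence under transition matrix P."""
    return sum(np.log(P[a, b]) for a, b in zip(seq[:-1], seq[1:]))

def classify(seq, models):
    """Maximum-likelihood rule: pick the 3D object whose Markov chain best
    explains the sequence of aspects seen by the randomly moving camera."""
    return max(models, key=lambda obj: loglik(seq, models[obj]))
```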
I went on to push these ideas towards more elegant principles: automatic segmentation of the sphere of views of a 3D object, by attaching to each viewpoint a complete 2D outline of the object, namely its “apparent contour”, and using Markovian energy functions to segment this 2D field of 2D outlines into “homogeneous zones”. This required the intensive use of adequate distances between 2D closed curves, a crucial point by now fairly well understood after the SUDIMAGE results recalled above. I presented this approach to 3D shape recognition in an advanced course which I gave at the 1998 international image analysis school organized at the Institut Henri Poincare (Paris) by Y. MEYER, J.M. MOREL, and D. MUMFORD.
A small-scale demonstrator of this methodology, handling “synthetic 3D models” of real objects, was realized and tested at SUDIMAGE by L. YOUNES with his PhD student T. FELDMAN.

 


IMAGE CONTENTS : INDEXATION AND BROWSING
1996-2001

The promising algorithmic results obtained by SUDIMAGE for compact storage and fast fuzzy retrieval within large databases of planar shapes extracted from digital images led me to undertake, in 1997, the development of a powerful software engine (called EasyGlider) for automatic indexation and retrieval of “fuzzy” graphic contents in large databases of digital images.
The mathematical and coding complexity of the task, which involved sophisticated algorithmics as well as advanced programming, forced me to seek and obtain a major grant from ANVAR (the French national agency for support of innovative technologies), and SUDIMAGE completed the first prototype version of EasyGlider in 1999.
After obtaining an ANVAR innovation label, I prepared and filed in 2000 two international patents protecting our software, and SUDIMAGE began the industrialisation and commercialization of EasyGlider, a costly operation which very quickly required recruiting more computer scientists.
This was achieved at the end of 2000, by recruiting an experienced CEO and a sales force, more computer science engineers, and gifted scientific advisors (H. KABLA as CTO and L. YOUNES as CSO), and by raising $6 million from venture capital funds to finance the EASYGLIDER startup company.
Technically, the initial steps for massive automatic indexation of image contents by EasyGlider are to automatically extract, after fast image segmentations, the texture and colour distributions, the closed contour outlines, etc.
For each type of graphic characteristic extracted, each described by high-dimensional vectors, such as the Gabor wavelet representations of microtextures, it is then essential to compress, in auto-adaptive mode, the mass of such vector characteristics extracted from the whole database. This goal is achieved by the introduction of an adequate “deformation distance” for each type of graphic characteristic, and the automatic generation of associated dictionaries of content prototypes. These dictionaries are essentially optimized pavings of the universe of all instances of graphic content characteristics extracted from the image database to be automatically indexed.
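A minimal sketch of such a dictionary generation, assuming it proceeds as a greedy epsilon-net covering for a given distance (the function, the distance placeholder, and the covering radius are illustrative, not the actual EasyGlider code):

```python
import numpy as np

def greedy_epsilon_net(features, dist, eps):
    """Greedy construction of a dictionary of prototypes: every feature vector
    ends up within distance eps of some prototype (an epsilon-net covering).

    features : list of feature vectors (textures, colour histograms, outlines...)
    dist     : callable dist(x, y) -> float, the chosen "deformation distance"
    eps      : covering radius controlling the compression / fidelity trade-off
    """
    prototypes = []
    for x in features:
        # Keep x as a new prototype only if no existing prototype covers it.
        if all(dist(x, p) > eps for p in prototypes):
            prototypes.append(x)
    return prototypes

# Example with plain Euclidean distance standing in for a deformation distance.
rng = np.random.default_rng(0)
feats = list(rng.normal(size=(1000, 16)))
dico = greedy_epsilon_net(feats, lambda x, y: np.linalg.norm(x - y), eps=4.0)
print(len(dico), "prototypes cover", len(feats), "feature vectors")
```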
The dictionaries and associated distances enable the massive automatic computation of hyperlinks between pairs of images in the database, in such a way that any two images J1 and J2 connected by such a hyperlink have approximately similar graphic content characteristics.
The offline computation of those millions of hyperlinks, and their automated selection, storage, and ordering by decreasing strength in Hyperlinks Tables, provides computerized compact tables of exploitable indexation results, stored in standard database formats such as the ORACLE database formats. It is then possible to install such indexation tables on powerful computing servers, accessible through standard SQL database queries via client-server 3-tiered architectures, these multi-user multi-query Internet or Intranet architectures being coded in commercial software involving fairly standard JAVA scripts and software tools.
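A much simplified sketch of such an offline hyperlink computation, assuming link strength decreases with content distance and only the k strongest links per image are kept (names and the strength formula are illustrative; in the actual pipeline the resulting triples would be dumped into a relational table):

```python
import heapq

def build_hyperlink_table(images, content_distance, k=20):
    """Offline computation of content hyperlinks: for each image, keep its k
    strongest links (smallest content distance), ordered by decreasing
    strength, ready to be stored as rows (image_id, linked_id, strength)."""
    table = {}
    for i, img_i in enumerate(images):
        # Strength is taken here as a simple decreasing function of distance.
        scored = ((1.0 / (1.0 + content_distance(img_i, img_j)), j)
                  for j, img_j in enumerate(images) if j != i)
        best = heapq.nlargest(k, scored)
        table[i] = [(j, strength) for strength, j in best]
    return table
```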
Thus, after automated offline indexation of the image database by EasyGlider, the online EasyGlider navigation architecture and engine enable, via Internet or Intranet, instant, intelligent, content-driven access to the image database for wide groups of non-specialized end users, accessing it from their standard personal PCs under ordinary Netscape or Internet Explorer environments.
In this context, content-driven navigation is implemented very intuitively, through implicit requests by end users: every image selected in the database by an end user automatically launches, in real time, the search for a short list of images with analogous graphic contents, and the retrieval results are displayed on line.

 


 

AUTOMATIC INDEXATION OF SEMANTIC CONTENTS
1997-2001

Most large image documents involved in practical applications, for instance in printed news or wide-audience media databases, are quite naturally linked to large databases of textual documents, so that massive indexation of document contents cannot focus on images only, without handling the automatic content indexation of the associated textual documents. In 1997, I had hence started at SUDIMAGE an intensive R&D effort to develop, within our EasyGlider software, an efficient algorithmic engine dedicated to semantic analysis, naturally based on massive and sophisticated frequency distribution analysis of adequate “word groupings” in large databases of text documents, after pretreatment by commercial computerized multilingual dictionaries and word lemmatization software. From a theoretical point of view, this was and still is a very active field of inquiry for probabilists, involving for instance the sophisticated handling of random tree structures, as shown in the interesting thesis written by J.P. VERT under O. CATONI’s scientific direction.
I actually enlisted bright young mathematicians, such as J.P. VERT, as part-time consultants to advise our computer scientists, but given the tremendous development of automated textual indexation software by commercial giants such as YAHOO or GOOGLE, we had to move quite fast to realize our own textual indexation software engine.
We implemented an automatic computer extraction of “rough pragmatic semantic contents”, based on sophisticated probabilistic clustering techniques, where distances between “word groupings” are computed from probabilistic models of multi-occurrences. Since such multi-occurrences are naturally authentic “rare events”, simplified large deviations theory turned out to be an interesting and useful conceptual guide to devise our semantic distances. However, the crushing sheer mass of word data to sift through at high speed always gave crucial importance to the super-optimization of computing times and computing codes. Automatic text indexation by EasyGlider then implemented steps similar to our image data indexation: automatic generation of content prototypes, computation of semantic distances between pairs of texts, creation of a minimal database covering by networks of text prototypes, massive pre-computation and pre-selection of text hyperlinks, optimized storage of hyperlink tables, and online content-driven navigation through Internet 3-tiered query architectures.
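A minimal sketch of a semantic distance of this flavour, assuming word co-occurrences are counted at the document level and scored by a large-deviations (Kullback-Leibler) rate against an independence model (both the scoring and the conversion to a distance are illustrative, not the actual EasyGlider formulas):

```python
import math

def association_rate(n_ab, n_a, n_b, n_docs):
    """Large-deviations style association score between two words A and B:
    the Kullback-Leibler rate between the observed joint occurrence
    frequencies and the product model in which A and B occur independently.
    High values flag co-occurrences that are 'rare events' under independence."""
    def kl(p, q):
        if p == 0.0:
            return 0.0
        return p * math.log(p / q)
    p_a, p_b, p_ab = n_a / n_docs, n_b / n_docs, n_ab / n_docs
    # Rate summed over the four joint cells (A&B, A&not-B, not-A&B, neither).
    cells = [
        (p_ab,                 p_a * p_b),
        (p_a - p_ab,           p_a * (1 - p_b)),
        (p_b - p_ab,           (1 - p_a) * p_b),
        (1 - p_a - p_b + p_ab, (1 - p_a) * (1 - p_b)),
    ]
    return sum(kl(p, q) for p, q in cells)

def semantic_distance(n_ab, n_a, n_b, n_docs):
    """Turn the association rate into a small 'semantic distance': strongly
    associated word groupings end up close to each other."""
    return 1.0 / (1.0 + association_rate(n_ab, n_a, n_b, n_docs))
```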

References:
Geodesics in diffeomorphisms groups: shapes deformation distance
R. Azencott; Congress “Stochastic structures and Monte-Carlo optimization”, Cortona, Italy, 1994

Distance for elastic matching in object recognition
R. Azencott, F. Coldefy, L. Younes; Int. Conf. Pattern Recognition (ICPR 96), Vienna, 1996

Multiscale identification of buildings in compressed aerial scenes
R. Azencott, F. Durbin, J. Paumard; Int. Conf. Pattern Recognition (ICPR 96), Vienna, 1996

Robust recognition in compressed large aerial scenes
R. Azencott, F. Durbin, J. Paumard; Int. Conf. Image Processing, Lausanne, 1996

Energy minimization for matching structured object representations
R. Azencott, L. Younes; Lecture Notes in Computer Science, vol. 1223, 1997

Texture classification using windowed Fourier filters
R. Azencott, J.P. Wang, L. Younes; IEEE Trans. PAMI, vol. 19, no. 2, pp. 148-152, 1997


 
