Neural Networks and Boltzmann Machines

STOCHASTIC NEURAL NETWORKS AND ARTIFICIAL RETINAS
1985-1992

In the US computer science and statistical physics communities, an extremely active research thrust began around 1982-84 to study formal models of neural network activity, mainly focused on automatic learning from examples, either to memorize or to classify complex patterns of stimuli. These artificial neural network models were an exciting relief after the built-in functional weaknesses and unabashed hype of the US expert systems fad, and were loosely inspired by collaborations with neurobiologists, as well as by new magnetic field techniques for indirect observation of brain activity.
I had noticed the strengths of a 1984 article by G. HINTON and T. SEJNOWSKI on asynchronous stochastic neural networks called Boltzmann machines. Contrary to rigid multilayer perceptron architectures, the connectivity architecture of these stochastic neural networks was very generic, and their very simple local asynchronous neural dynamics converged to an explicit limit: a Gibbs measure on the set of global network configurations, with only second order interactions. The magic ingredient which clinched my interest was that the associated “learning rule” was derived from perfectly sound probabilistic principles and yet turned out to be a fairly simple version of the celebrated biological HEBB rule, according to which the strength of a neural synaptic connection increases when its endpoints are neurons with strongly correlated activities.
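To fix ideas with the standard notation of the Boltzmann machine literature (the symbols below are generic, not those of a specific paper): for binary neurons $s_i \in \{-1,+1\}$ with symmetric weights $w_{ij}$ and thresholds $b_i$, the asynchronous stochastic dynamics converge to the Gibbs measure

$$\pi(s) = \frac{1}{Z}\,\exp\Big(\beta \sum_{i<j} w_{ij}\, s_i\, s_j \;+\; \beta \sum_i b_i\, s_i\Big),$$

and gradient descent on the Kullback-Leibler divergence between the environment distribution and $\pi$ yields the learning rule

$$\Delta w_{ij} = \eta\,\big(\langle s_i\, s_j\rangle_{\mathrm{clamped}} - \langle s_i\, s_j\rangle_{\mathrm{free}}\big),$$

a difference of pairwise activity correlations, which is exactly the HEBB-like character mentioned above.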
Instead of studying Boltzmann machines at dynamically decreasing temperature, as the original authors had done, which brought their Boltzmann machines very close to variants of simulated annealing schemes, I began exploring their learning capacity at fixed “temperatures”. I discovered exciting major generalizations of the Boltzmann machine paradigm, without losing much in terms of its aesthetic simplicity and learning power.
The first crucial extension was to prove that “multiple neuron” interactions could be handled, instead of simple pairwise interactions, and could be naturally formalized by arbitrary finite “clique” structures, which brought these learning machines within the framework of dynamic Markov random fields.
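In generic notation again, the pairwise energy above extends to potentials indexed by arbitrary finite cliques $C$ of the connectivity graph:

$$U(s) = \sum_{C} w_C \prod_{i \in C} s_i, \qquad \pi(s) = \frac{1}{Z}\, e^{\beta\, U(s)},$$

and the learning rule keeps the same clamped-minus-free form, $\Delta w_C = \eta\,(\langle \prod_{i \in C} s_i\rangle_{\mathrm{clamped}} - \langle \prod_{i \in C} s_i\rangle_{\mathrm{free}})$, so that each clique parameter is still updated from the activity statistics of its own neurons only.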
The second extension was to analyze “synchronous” Boltzmann machines, where the neuron dynamics were massively synchronized, lending themselves naturally to fast simulations on massively parallel computers.
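A minimal sketch of one synchronous sweep, written in present-day Python with NumPy purely for illustration (the function and variable names are mine); one point analyzed in the references below is that the stationary law of this synchronous chain differs from the asynchronous Gibbs measure and must be characterized separately:

    import numpy as np

    rng = np.random.default_rng(0)

    def synchronous_step(s, W, b, beta=1.0):
        # Every neuron computes its local field from the *current* global
        # state, and all neurons are resampled simultaneously, instead of
        # one at a time as in the asynchronous (Gibbs sampler) dynamics.
        h = W @ s + b
        p_on = 1.0 / (1.0 + np.exp(-2.0 * beta * h))  # P(s_i = +1 | field)
        return np.where(rng.random(s.size) < p_on, 1.0, -1.0)

    # Tiny demo: 12 neurons, random symmetric weights, 200 synchronous sweeps.
    n = 12
    W = 0.3 * rng.standard_normal((n, n))
    W = (W + W.T) / 2.0
    np.fill_diagonal(W, 0.0)
    b = np.zeros(n)
    s = rng.choice([-1.0, 1.0], size=n)
    for _ in range(200):
        s = synchronous_step(s, W, b)
    print(s)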
One of my key mathematical results was to show that completely computable, efficient learning rules existed for interactive arrays of formal neurons with quite arbitrary connectivity architectures, and that these learning rules were completely “local”, a beautifully striking point since the learning actually optimized global performance criteria for the machine. From an epistemological point of view, these were crucial features for potential implementation on “biological hardware”. Another striking feature of generalized Boltzmann machines was the possibility of embedding recursive feedback loops in their architecture, while preserving the existence of computable local learning rules. My goal in exploring these flexible and versatile learning machines, where memory as well as learning rules were massively distributed, was to implement and test fairly complex models of “artificial retinas”, able to perform basic low-level vision tasks and to auto-adapt their parameters by distributed learning from visual stimuli.
I enthusiastically attacked these ambitious research goals through the scientific direction of cross-related PhD theses (Jerome LACAILLE, Mathilde MOUGEOT, Oussama CHERIF, Patrick HERBIN) and through collaborations with Antoine DOUTRIAUX and Laurent YOUNES, as well as with electronics specialists such as Patrick GARDA and Eric BELHAIRE. In MOUGEOT’s PhD thesis, we explored stochastic models of self-organization for the visual cortex of mammals, to define and analyze unsupervised learning dynamics leading, through synaptic plasticity, to the progressive emergence of structured hypercolumns of oriented edge detectors. We showed that long simulated sequences of visual stimuli presented to formal retina models, namely random images exhibiting jumbles of arbitrary smooth 2D shapes, were sufficient to trigger the progressive structuration of their weakly structured synaptic connectivity into neural connectivity architectures roughly similar to the hypercolumns of edge detectors in the human brain.
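As a toy illustration of this emergence mechanism (a deliberate simplification, not the thesis model: the stimulus generator below and the use of OJA’s normalized HEBB rule are my own choices), a single linear unit trained on random oriented edges already develops an oriented receptive field; populations of interacting units are then needed to tile all orientations into hypercolumns, which is what the thesis models addressed:

    import numpy as np

    rng = np.random.default_rng(0)

    def random_edge_patch(size=8):
        # Simplified stimulus: one random oriented step edge plus noise,
        # standing in for the jumbles of smooth 2D shapes of the thesis.
        theta = rng.uniform(0.0, np.pi)
        ii, jj = np.mgrid[0:size, 0:size] - (size - 1) / 2.0
        patch = np.sign(np.cos(theta) * ii + np.sin(theta) * jj)
        patch += 0.1 * rng.standard_normal((size, size))
        patch -= patch.mean()                 # zero-mean stimulus
        return patch.ravel()

    # One linear unit with OJA's normalized HEBB rule:
    # dw = eta * y * (x - y * w). The weight vector drifts toward the
    # leading principal subspace of the stimuli, i.e. toward an oriented
    # edge filter; symmetry breaking selects one preferred orientation.
    w = 0.01 * rng.standard_normal(64)
    eta = 0.005
    for _ in range(20000):
        x = random_edge_patch()
        y = float(w @ x)
        w += eta * y * (x - y * w)
    print(w.reshape(8, 8).round(2))           # displays an oriented filter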
I knew that properly designed connectivity architectures for Boltzmann machines could emulate such hypercolumn architectures of neurons responding to specific orientations of local edges, and that Boltzmann neural dynamics on such structures could perform efficient continuous contour line extraction on digital images, starting with multilayer inputs generated by rectangular arrays of very simple local edge detectors. I had asked J. LACAILLE to explore, in his PhD thesis, Boltzmann machine architectures with clique interactions to emulate hypercolumn low-level vision systems, and he attacked the problem with unusual tenacity. His computer science virtuosity, coupled with algorithmic inventiveness and impressive scientific persistence, was an important factor in the successful formalization and simulation of generalized hypercolumn Boltzmann machines able to detect visual contour lines on digital images. The intensive simulations were performed on a challenging massively parallel computer: the Connection Machine with 16,000 processors bought by the ETCA lab in Paris.
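One illustrative way to write such an energy (my own generic form, not the exact energy of the thesis): binary edge units $e_k$, attached to positions and orientations of the pixel grid, receive a data term from local edge detector responses $d_k$ and interact through cliques $C$ of roughly aligned neighboring units that reward smooth continuation,

$$U(e) = \alpha \sum_k d_k\, e_k \;+\; \sum_{C} w_C \prod_{k \in C} e_k,$$

so that low-temperature Boltzmann dynamics favor configurations in which active edge units chain into continuous, smoothly turning contour lines.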
My team’s work on Markov random fields applied to texture identification pointed to Boltzmann machine architectures for multilayer arrays of interconnected local texture indicators. I conjectured that these intelligent “Boltzmann retinas” would perform well for texture segmentation of digital images, and asked a young Ecole Polytechnique alumnus, Oussama CHERIF, to center his PhD research on this question. O. CHERIF’s PhD thesis, powered by his exceptional computer science wizardry and a remarkably pragmatic perception of complex algorithmics, explored this conjecture successfully: he showed the efficiency of massively parallel implementations of quasi-synchronous dynamics for Boltzmann-like retinas dedicated to texture segmentation. His intensive simulations were performed on massively parallel computers such as the MasPar with 4,000 processors, as well as the Connection Machine just mentioned, to test and validate the feasibility of massively distributed Boltzmann neural dynamics for texture segmentation tasks on digital images.
With A. DOUTRIAUX and L. YOUNES, we then applied Boltzmann machine learning to generate object recognition algorithms based on shape outline identification and automatic learning. We designed generic neural architectures to implement multiscale recognition of smooth closed 2D curves, and one key idea I injected was to use local neural detectors to compute discrete piecewise versions of the Euler equation of 2D curves as low-level inputs of these automatic classifiers. This enabled us to generate Boltzmann planar curve classifiers naturally endowed with built-in invariance under translations, rotations, and scaling. We validated this technique on blurry infrared images, and L. YOUNES went on to prove that Boltzmann machine architectures had universal modeling capabilities for the approximate black-box emulation of unknown functions.
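The invariance argument can be made concrete on a sampled closed curve: discrete curvature estimates depend only on segment lengths and turning angles, hence are invariant under translations and rotations, and normalizing by the total perimeter removes scaling. A minimal sketch in Python (the helper names are mine, and plain finite-difference curvature stands in for whatever discretization of the Euler equation was actually used):

    import numpy as np

    def discrete_curvature(points):
        # points: (n, 2) array sampling a closed 2D curve, in order.
        # Turning angle at each vertex divided by local arc length gives
        # a finite-difference curvature estimate; it is unchanged by
        # translations and rotations of the curve.
        p = np.asarray(points, dtype=float)
        v1 = p - np.roll(p, 1, axis=0)
        v2 = np.roll(p, -1, axis=0) - p
        ang = np.arctan2(v2[:, 1], v2[:, 0]) - np.arctan2(v1[:, 1], v1[:, 0])
        ang = (ang + np.pi) % (2.0 * np.pi) - np.pi   # wrap to (-pi, pi]
        ds = 0.5 * (np.linalg.norm(v1, axis=1) + np.linalg.norm(v2, axis=1))
        return ang / ds

    def scale_invariant_signature(points):
        # Multiplying curvature by the total perimeter removes the effect
        # of uniform scaling, yielding a translation-, rotation- and
        # scale-invariant curve descriptor.
        p = np.asarray(points, dtype=float)
        seg = np.linalg.norm(p - np.roll(p, 1, axis=0), axis=1)
        return discrete_curvature(p) * seg.sum()

    # Demo: a circle of radius r has curvature 1/r, so the signature is
    # ~2*pi everywhere, for any radius, position, or orientation.
    t = np.linspace(0.0, 2.0 * np.pi, 200, endpoint=False)
    circle = np.c_[3.0 * np.cos(t), 3.0 * np.sin(t)]
    print(scale_invariant_signature(circle)[:5])      # ~ 6.28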

References:

Synchronous Boltzmann machines and Gibbs fields
R. Azencott; “Neurocomputing”, NATO ASI Lect. Notes, vol. F68, p. 51-62, Springer, 1990

Synchronous Boltzmann machines and artificial vision
R. Azencott; “Entretiens de Lyon: Neural Networks”, p. 135-143, Springer-Verlag, 1990

Synchronous Boltzmann machines and outline based image classification
R. Azencott, A. Doutriaux, L. Younes; IEEE Proc. “Neural Networks”, Kluwer, Paris, 1990

Self-organization of orientation selective cells in the visual cortex
R. Azencott, M. Mougeot; Proc. Congres Neurosciences, Aussois, France, 1990

Unsupervised learning for the visual cortex
R. Azencott, M. Mougeot; Proc. Int. Joint Conf. Neural Nets, Seattle, 1991

Contour lines extraction in images by synchronous Boltzmann machines
R. Azencott, J. Lacaille; Proc. Congres NeuroNimes, p. 507-518, Nimes, France, 1991

Smooth contour lines in images and synchronous Boltzmann machines
R. Azencott, J. Lacaille; Proc. Int. Joint Conf. Neural Nets, World Scientific, Singapore, 1991

Boltzmann machines: high order interaction and synchronous learning
R. Azencott; “Stoch. Models / Image Analysis”, Lect. Notes Stat., vol. 74, p. 14-45, Springer, 1992

Synchronous Boltzmann machines and artificial vision
R. Azencott, J. Lacaille, L. Younes; Courrier CNRS, Paris, vol. 79, 1992

Synchronous Boltzmann machines and curve identification tasks
R. Azencott, A. Doutriaux, L. Younes; Network, vol. 4, p. 461-480, 1993

Stochastic Neural Nets applied to optimization and pattern recognition
R. Azencott; Invited Lectures, Summer School EDF/CEA/INRIA, Le Breau, France, 1994