Thursday, March 11, 2010

face recognition using neural network

CHAPTER 1
INTRODUCTION
Facial feature extraction consists of localizing the most characteristic face components (eyes, nose, mouth, etc.) within images that depict human faces. This step is essential for the initialization of many face processing techniques such as face tracking, facial expression recognition or face recognition. Among these, face recognition is a lively research area in which great effort has been devoted in recent years to designing and comparing different techniques.

With the advent of electronic media, and especially the computer, society has become increasingly dependent on computers for the processing, storage and transmission of information. The computer plays an important role in every part of modern life and society. As technology has advanced, people have become ever more involved with the computer as the leading tool of this technological age, and a technological revolution based on it has taken place all over the world, opening a new age for humankind, commonly known as the technological world.
Information and Communication Technologies are increasingly entering all aspects of our life and all sectors, opening a world of unprecedented scenarios where people interact with electronic devices embedded in environments that are sensitive and responsive to the presence of users. Indeed, since the first examples of "intelligent" buildings featuring computer-aided security and fire safety systems, the request for more sophisticated services, provided according to each user's specific needs, has characterized the new tendencies within domotic research. The result of the evolution of the original concept of home automation is known as Ambient Intelligence, referring to an environment viewed as a "community" of smart objects powered by computational capability and high user-friendliness, capable of recognizing and responding to the presence of different individuals in a seamless, non-intrusive and often invisible way. As adaptivity here is the key to providing customized services, the role of person sensing and recognition becomes of fundamental importance.

This scenario offers the opportunity to exploit the potential of the face as a non-intrusive biometric identifier, not just to regulate access to the controlled environment but also to adapt the provided services to the preferences of the recognized user. Biometric recognition (Maltoni et al., 2003) refers to the use of distinctive physiological (e.g., fingerprints, face, retina, iris) and behavioural (e.g., gait, signature) characteristics, called biometric identifiers, for automatically recognizing individuals. Because biometric identifiers cannot be easily misplaced, forged, or shared, they are considered more reliable for person recognition than traditional token- or knowledge-based methods. Other typical objectives of biometric recognition are user convenience (e.g., service access without a Personal Identification Number) and better security (e.g., access that is difficult to forge). All these reasons make biometrics very well suited for Ambient Intelligence applications, and this is especially true for a biometric identifier such as the face, which is one of the most common cues humans use for recognition in their visual interactions and which allows the user to be recognized in a non-intrusive way, without any physical contact with the sensor.
A generic biometric system can operate either in verification or in identification modality, better known as one-to-one and one-to-many recognition (Perronnin & Dugelay, 2003). In the proposed Ambient Intelligence application we are interested in one-to-one recognition, as we want to recognize authorized users accessing the controlled environment or requesting a specific service. We present a face recognition system based on 3D features to verify the identity of subjects accessing the controlled Ambient Intelligence Environment and to customize all the services accordingly. In other terms, the aim is to add a social dimension to man-machine communication and thus help make such environments more attractive to the human user. The proposed approach relies on stereoscopic face acquisition and 3D mesh reconstruction to avoid highly expensive and non-automated 3D scanning, typically not suited for real-time applications. For each subject enrolled, a bidimensional feature descriptor is extracted from its 3D mesh and compared to the previously stored corresponding template. This descriptor is a normal map, namely a color image in which the RGB components represent the normals to the face geometry. A weighting mask, automatically generated for each authorized person, improves recognition robustness to a wide range of facial expressions. This chapter is organized as follows. In section 2 related works are presented and the proposed method is introduced. In section 3 the proposed face recognition method is presented in detail. In section 4 the Ambient Intelligence framework is briefly discussed and experimental results are shown and commented. The chapter concludes in section 5 with conclusions and directions for future research.
1.1 A brief history
The subject of face recognition is as old as computer vision. Face recognition has always remained a major focus of research, despite the fact that identification methods such as fingerprints or iris scans can be more accurate, because of its non-invasive nature and because it is people's primary method of person identification. An early example of a face recognition system is provided by Kohonen, who showed that a simple neural network could perform face recognition for aligned and normalized face images. In his system, the computation and recognition of a face description was done by approximating the eigenvectors of the face image's autocorrelation matrix; these eigenvectors are now known as 'eigenfaces'. Schemes based on edges, inter-feature distances, and other neural net approaches were tried out by many researchers. Kirby and Sirovich introduced an algebraic manipulation in 1989 which made it easy to directly calculate the eigenfaces, and showed that fewer than 100 were required to accurately code carefully aligned and normalized face images. Turk and Pentland demonstrated in 1991 that the residual error when coding using the eigenfaces could be used both to detect faces in cluttered natural imagery and to determine their precise location. They then showed that by coupling this method for detecting and localizing faces with the eigenface recognition method, one could achieve reliable, real-time recognition of faces in a minimally constrained environment.
1.2 MOTIVATION
• Identity fraud is becoming a major concern for governments around the globe.
• Reliable methods of biometric personal identification exist, but these methods rely on the cooperation of the participants.
• Neural networks are a good tool for classification.

1.3 EXPERIMENTAL RESULTS
As one of the aims of the experiments was to test the performance of the proposed method in a realistic operative environment, we decided to build a 3D face database from the face capture station used in the domotic system described above. The capture station featured two digital cameras with external electronic strobes shooting simultaneously with a shutter speed of 1/250 sec. while the subject was looking at a blinking LED to reduce posing issues. More precisely, every face model in the gallery has been created by deforming a pre-aligned prototype polygonal face mesh to closely fit a set of facial features extracted from front and side images of each individual enrolled in the system. Indeed, for each enrolled subject a set of corresponding facial features extracted by a structured snake method from the two orthogonal views are correlated first and then used to guide the prototype mesh warping, performed through a Dirichlet Free Form Deformation. The two captured face images are aligned, combined and blended, resulting in a color texture precisely fitting the reconstructed face mesh through the feature points previously extracted.

The prototype face mesh used in the dataset has about 7K triangular facets, and even if it is possible to use meshes with a higher level of detail, we found this resolution to be adequate for face recognition. This is mainly due to the optimized tessellation, which privileges key areas such as eyes, nose and lips, whereas a typical mesh produced by a 3D scanner features almost evenly spaced vertices. Another remarkable advantage of the warp-based mesh generation is the ability to reproduce a broad range of face variations through a rig-based deformation system. This technique is commonly used in computer graphics for facial animation (Lee et al., 1995; Blanz & Vetter, 1999) and is easily applied to the prototype mesh by linking the rig system to specific subsets of vertices on the face surface. Any facial expression can be mimicked by opportunely combining the effect of the rigs controlling lips, mouth shape, eye closing or opening, nose tip or bridge, cheek shape, eyebrows shape, etc. The facial deformation model we used is based on (Lee et al., 1995) and the resulting expressions are anatomically correct. We augmented the 3D dataset of each enrolled subject through the synthesis of fifteen additional expressions, selected to represent typical face shape deformation due to facial expressive muscles, each one included in the weighting mask. The fifteen variations of the neutral face are grouped in three different classes: "good-mood", "normal-mood" and "bad-mood" emotional status (see Figure 1.1).


Fig 1.1 Facial Expressions grouped in normal-mood (first row), good-mood (second row),
bad-mood (third row)
We acquired three sets of front-side pairs of face images from 235 different persons in three subjective facial expressions representing "normal-mood", "good-mood" and "bad-mood" emotional status respectively (137 males and 98 females, ages ranging from 19 to 65). For the first group of experiments, we obtained a database of 235 3D face models in neutral pose (represented by the "normal-mood" status), each one augmented with fifteen expressive variations. Experimental results are generally good in terms of accuracy, showing a Recognition Rate of 100% using the expression weighting mask and flesh mask, the Gaussian function with σ = 4.5 and k = 50, and a normal map sized 128 × 128 pixels. These results are generally better than those obtained by many 2D algorithms, but a more meaningful comparison would require a face dataset featuring both 2D and 3D data. To this aim we experimented with a PCA-based 2D face recognition algorithm [Moon and Phillips 1998; Martinez and Kak 2001] on the same subjects. We trained the PCA-based recognition system with frontal face images acquired during several enrolment sessions (from 11 to 13 images for each subject), while the probe set is obtained from the same frontal images used to generate the 3D face mesh for the proposed method. This experiment has shown that our method produces better results than a typical PCA-based recognition algorithm on the same subjects. More precisely, the PCA-based method reached a recognition rate of 88.39% on grayscale images sized to 200 × 256 pixels, proving that the face dataset was really challenging.
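For reference, the PCA-based (eigenface) baseline used for this comparison can be summarized in a few lines of code. The sketch below is only a minimal illustration, assuming grayscale training images flattened to row vectors and a simple nearest-neighbour match in the eigenface subspace; it is not the exact implementation of the algorithms cited above.

import numpy as np

def train_eigenfaces(train_images, num_components=50):
    """Compute the mean face and the leading eigenfaces from flattened training images."""
    X = np.asarray(train_images, dtype=np.float64)      # shape: (num_images, num_pixels)
    mean_face = X.mean(axis=0)
    A = X - mean_face                                    # centered data
    _, _, Vt = np.linalg.svd(A, full_matrices=False)     # rows of Vt are the eigenfaces
    return mean_face, Vt[:num_components]

def project(image, mean_face, eigenfaces):
    """Project a flattened face image into the eigenface subspace."""
    return eigenfaces @ (np.asarray(image, dtype=np.float64) - mean_face)

def recognize(probe, gallery_weights, gallery_labels, mean_face, eigenfaces):
    """Return the label of the gallery face closest to the probe in eigenface space."""
    w = project(probe, mean_face, eigenfaces)
    distances = np.linalg.norm(gallery_weights - w, axis=1)
    return gallery_labels[int(np.argmin(distances))]

Here gallery_weights would hold the projections of the 11 to 13 enrolment images per subject, and probe one of the frontal images described above.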
CHAPTER 2
Classical Neural Networks
During the last few decades, neural networks have moved from theory to offering solutions for industrial and commercial problems. Many people are interested in neural networks from many different perspectives. Engineers use them to build practical systems to solve industrial problems. For example, neural networks can be used for the control of industrial processes.
2.1 Neural Network History
Attempts to model the human brain appeared with the creation of the first computer. Neural network paradigms were used for sensor processing, pattern recognition, data analysis, control, etc. We analyze, in short, different approaches for neural network development.
2.2 McCulloch and Pitts Neural Networks
The paper of McCulloch and Pitts [5] was the first attempt to understand the functions of the nervous system. For their explanation, they used very simple types of neural networks, and they formulated the following five assumptions about neuron operation:
1. The activity of the neuron is an “all-or-none” process.
2. A certain fixed number of synapses must be excited within the period of latent addition in order to excite a neuron at any time, and this number is independent of previous activity and position of the neuron.
3. The only significant delay within the nervous system is synaptic delay.
4. The activity of any inhibitory synapse absolutely prevents excitation of the neuron at that time.
5. The structure of the net does not change with time.
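Taken together, these assumptions describe a binary threshold unit. The following minimal sketch illustrates such a neuron in Python; the function name and the AND-gate example are ours and are only illustrative.

def mcculloch_pitts_neuron(excitatory_inputs, inhibitory_inputs, threshold):
    """All-or-none neuron: fires (returns 1) only if no inhibitory synapse is active
    and the number of active excitatory synapses reaches the fixed threshold."""
    if any(inhibitory_inputs):        # assumption 4: inhibition absolutely prevents firing
        return 0
    return 1 if sum(excitatory_inputs) >= threshold else 0

# A two-input AND gate: both excitatory synapses must be excited (threshold = 2).
print(mcculloch_pitts_neuron([1, 1], [], threshold=2))   # -> 1
print(mcculloch_pitts_neuron([1, 0], [], threshold=2))   # -> 0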
2.3 Hebb Theory
Hebb tried to work out a general theory of behavior. The problem of understanding behavior is the problem of understanding the total action of the nervous system, and vice versa. He attempted to bridge the gap between neurophysiology and psychology. Perception, learning in perception, and assembly formation were the main themes in his scientific investigations. Experiments had shown perceptual generalization. The repeated stimulation of specific receptors will lead to the formation of an "assembly" of association-area cells which can act briefly as a closed system. The synaptic connections between the neurons of such an assembly become well developed.

Fig. 2.2 Neural presentation of a model “building”
Every assembly corresponds to an image or a concept. The idea that an image is represented not by a single neuron but by an assembly is fruitful. Any concept may have different meanings, and its content may vary depending on the context. Only the central core of the concept, whose activity may dominate in the system as a whole, remains almost unchangeable. Representing an image or concept with a single neuron deprives the concept of its features and characteristics; representation with a neuron assembly makes it possible to describe a concept or image with all its features and characteristics. These features can be influenced by the context of the situation where the concept is used. For example, consider the model of the concept "building". We can observe the building from different positions. A perceived object (a building) consists of a number of perceptual elements: we can see many windows or a door, and from other positions we see the walls and the roof of the building. In an assembly that is the model of the concept "building," one set of neurons corresponds to the walls, other neurons correspond to the windows, others to the white color of the walls, and so on. The more frequently perceived features of the building form the core of the assembly, and rare features create a fringe of the assembly (Fig. 2.2). Due to the fringe of the assembly, different concepts may have a large number of associations with other concepts. "Fringe" systems were introduced by Hebb to explain how associations are provided. Different circumstances lead to varying fringe activity. If it is day, the white color of the building will be observed, and in the model the neuron set that corresponds to color will be excited. The "core" is the most connected part of the assembly; in our example, the core will be the neurons that correspond to walls and windows. The conceptual activity that can be aroused with limited stimulation must have its organized core, but it may also have a fringe content, or meaning, that varies with the circumstances of arousal. An individual cell or neuron set may enter into more than one assembly at different times. A single assembly or small group of assemblies can be repeatedly aroused when some other activity intervenes. In vision, for example, the perception of vertical lines must occur thousands of times an hour; in conversation, the word "the" must be perceived and uttered with very high frequency; and so on.
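The strengthening of synaptic connections between co-active cells that underlies assembly formation is usually formalized today as the Hebbian learning rule: the weight between two neurons grows when they are active together. The snippet below is only an illustrative sketch of this rule, with an assumed learning rate; it is not a model of Hebb's full assembly theory.

import numpy as np

def hebbian_update(weights, pre, post, learning_rate=0.1):
    """Strengthen the connection between every pre- and post-synaptic pair
    that is active at the same time (delta_w = eta * post * pre)."""
    return weights + learning_rate * np.outer(post, pre)

# Repeated co-activation of the same pattern gradually builds strong mutual
# connections, i.e. a crude "cell assembly".
pattern = np.array([1.0, 1.0, 0.0, 1.0])
w = np.zeros((4, 4))
for _ in range(10):
    w = hebbian_update(w, pattern, pattern)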

2.4 Neural Networks of the 1980s

In the early 1980s, a new wave of interest arose due to the publications of John Hopfield, a researcher in the field of biophysics. He described the analogy between Hebb's neural network model and a certain class of physical systems. His efforts allowed hundreds of highly qualified scientists and engineers to join in the neural network investigation.

Fig. 2.3 Example of the EXCLUSIVE OR (XOR) classification problem

At this time, the DARPA (Defense Advanced Research Projects Agency) project was initiated. Around 1986, the new term "neurocomputer" appeared. Many international conferences on neural networks, neurocomputing, and neurocomputers took place all over the world. Hundreds of firms dedicated to neural network technology development and production were established. For example, the neurocomputer
Mark III was built at TRW, Inc. during 1984–1985, followed by Mark IV [1]. In 1988, the firm HNC (Hecht-Nielson Corporation) produced the neurocomputer "ANZA plus," which could work together with PC 386 and Sun machines. In the same year, the neurocomputer Delta II was produced by the firm SAIC. In the department of network systems of information processing at the Institute of Cybernetics, Kiev, Ukraine, the first neurocomputer "NIC" was created in 1988–1989 [32, 33] under the direction of Ernst Kussul. This neurocomputer is presented in Fig. 2.5. It was built on a domestic element base and was a personal computer add-on. Kussul put forward and analyzed a new neural network paradigm, which enabled the creation of neuron-like structures. These structures are known as associative-projective neuron-like networks [34–36]. After that, in 1991–1992, a Ukrainian-Japanese team created a new neurocomputer that used a more advanced element base. It was named "B-512," and it is presented in Fig. 2.6. Kussul and his collaborators and disciples Tatiana Baidyk, Dmitrij Rachkovskij, Mikhail Kussul, and Sergei Artykutsa participated in the neurocomputer development together with the Japanese investigators from "WACOM," Sadao Yamomoto, Masao Kumagishi, and Yuji Katsurahira. The latest neurocomputer version was developed and tested on image recognition tasks; for example, the task of handwritten word recognition was solved on this neurocomputer.

Fig. 2.5 First neurocomputer "NIC" developed at the Institute of Cybernetics, Kiev












CHAPTER 3
Description of Facial Recognition System
The basic idea behind the proposed system is to represent the user's facial surface by a digital signature called a normal map. A normal map is an RGB color image providing a 2D representation of the 3D facial surface, in which the normal to each polygon of a given mesh is represented by an RGB color pixel. To this aim, we project the 3D geometry onto 2D space through spherical mapping. The result is a bidimensional representation of the original face geometry which retains the spatial relationships between facial features. Color information coming from the face texture is used to mask any beard-covered regions according to their relevance, resulting in an 8 bit greyscale filter mask (Flesh Mask). Then, a variety of facial expressions are generated from the neutral pose through a rig-based animation technique, and the corresponding normal maps are used to compute a further 8 bit greyscale mask (Expression Weighting Mask) aimed at coping with expression variations. Finally, the two greyscale masks are multiplied and the resulting map is used to augment the normal map with an extra 8 bits per pixel, resulting in a 32 bit RGBA bitmap (Augmented Normal Map). The whole process (see Figure 3.1) is discussed in depth in the following subsections 3.1 to 3.4.

Figure 3.1 Facial and Facial Expression Recognition workflow

3.1 Face Capturing
As the proposed method works on 3D polygonal meshes, we first need to acquire actual faces and to represent them as polygonal surfaces. The Ambient Intelligence context in which we are implementing face recognition requires fast user enrollment to avoid annoying waiting times. Usually, most 3D face recognition methods work on a range image of the face, captured with a laser or structured-light scanner. This kind of device offers high resolution in the captured data, but it is too slow for real-time face acquisition. Unwanted face motion during capturing could be another issue, while laser scanning may not be harmless to the eyes. For all these reasons we opted for 3D mesh reconstruction from stereoscopic images, based on (Enciso et al., 1999), as it requires simple equipment more likely to be adopted in a real application: a couple of digital cameras shooting at high shutter speed from two slightly different angles with strobe lighting. Though the resulting face shape accuracy is inferior compared to real 3D scanning, it proved to be sufficient for recognition yet much faster, with a total time required for mesh reconstruction of about 0.5 sec. on a P4/3.4 GHz based PC, and it offers additional advantages, such as precise mesh alignment in 3D space thanks to the warp-based approach, facial texture generation from the two captured orthogonal views, and its automatic mapping onto the reconstructed face geometry.
3.2 Building a Normal Map
As the 3D polygonal mesh resulting from the reconstruction process is an approximation of the actual face shape, polygon normals describe the local curvature of the captured face, which can be viewed as its signature. As shown in Figure 3.2, we represent these normals by a color image, transferring the face's 3D features into a 2D space. We also want to preserve the spatial relationships between facial features, so we project the vertices' 3D coordinates onto a 2D space using a spherical projection. We can then store the normals of mesh M in a bidimensional array N using the mapping coordinates; in this way each pixel represents a normal as RGB values. We refer to the resulting array as the Normal Map N of mesh M, and this is the signature we intend to use for identity verification.
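A minimal sketch of how the face normals can be packed into such an image is given below. It assumes the mesh is available as arrays of vertices and triangle indices, centered at the origin, and uses a simple longitude/latitude spherical mapping; the resolution and the exact projection are illustrative assumptions rather than the authors' implementation.

import numpy as np

def build_normal_map(vertices, triangles, size=128):
    """Project per-facet normals onto a 2D grid via spherical mapping and
    encode each unit normal (components in [-1, 1]) as an RGB pixel in [0, 255]."""
    normal_map = np.zeros((size, size, 3), dtype=np.uint8)
    for tri in triangles:
        a, b, c = vertices[tri[0]], vertices[tri[1]], vertices[tri[2]]
        n = np.cross(b - a, c - a)
        n = n / (np.linalg.norm(n) + 1e-12)                  # unit normal of the facet
        center = (a + b + c) / 3.0
        d = center / (np.linalg.norm(center) + 1e-12)        # direction used for spherical mapping
        u = 0.5 + np.arctan2(d[0], d[2]) / (2.0 * np.pi)     # longitude -> horizontal coordinate
        v = 0.5 - np.arcsin(np.clip(d[1], -1.0, 1.0)) / np.pi  # latitude -> vertical coordinate
        row = min(int(v * size), size - 1)
        col = min(int(u * size), size - 1)
        normal_map[row, col] = ((n + 1.0) * 0.5 * 255.0).astype(np.uint8)  # normal to RGB
    return normal_map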

Figure 3.2. (a) 3D mesh model, (b) wireframe model, (c) projection in 2D spatial coordinates, (d) normal map
3.3 Normal Map Comparison
To compare the normal map NA of the input subject with a normal map NB previously stored in the reference database, we compute, for each pair of pixels with the same mapping coordinates, the angle included between the two normals they encode (their r, g and b components being opportunely normalized between the spatial domain and the color domain):

θ(x, y) = arccos( NA(x, y) · NB(x, y) )

and store it in a new Difference Map D as a gray-scale value. At this point, the histogram H of D is analyzed to estimate the similarity score between NA and NB. On the X axis we represent the resulting angles for each pair of compared normals (sorted from 0° to 180°), while on the Y axis we represent the total number of occurrences of each angular difference. The shape of H represents the angular distance distribution between mesh MA and mesh MB: two similar faces feature very high values at small angles, whereas two unlike faces have more spread-out differences (see Figure 3.3). We define a similarity score through a weighted sum between H and a Gaussian function G:

similarity(NA, NB) = Σx H(x) · G(x)

where varying the parameters σ and k of G makes it possible to change the recognition sensitivity. To reduce the effects of residual face misalignment introduced during the acquisition and sampling phases, we calculate the angle θ using a k × k (usually 3 × 3 or 5 × 5) matrix of neighbour pixels.
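In code, the comparison described above reduces to a per-pixel angle computation followed by a histogram-weighted score. The sketch below assumes normal maps stored as 8 bit RGB arrays and a Gaussian weight of the form k·exp(−x²/2σ²) centered at 0°; the exact weighting function used by the authors may differ.

import numpy as np

def decode_normals(normal_map):
    """Convert RGB values in [0, 255] back to unit normal vectors."""
    n = normal_map.astype(np.float64) / 255.0 * 2.0 - 1.0
    return n / (np.linalg.norm(n, axis=2, keepdims=True) + 1e-12)

def similarity(normal_map_a, normal_map_b, sigma=4.5, k=50.0):
    """Angular difference map, histogram of angles, and Gaussian-weighted score."""
    na, nb = decode_normals(normal_map_a), decode_normals(normal_map_b)
    dot = np.clip((na * nb).sum(axis=2), -1.0, 1.0)
    theta = np.degrees(np.arccos(dot))                    # difference map D, in degrees
    hist, _ = np.histogram(theta, bins=180, range=(0.0, 180.0))
    x = np.arange(180, dtype=np.float64)
    gauss = k * np.exp(-(x ** 2) / (2.0 * sigma ** 2))    # weights concentrated on small angles
    return float((hist * gauss).sum())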

Figure 3.3 Example of histogram H to represent the angular distances. (a) shows a typical histogram between two similar Normal Maps, while (b) between two different Normal Maps
3.4 Addressing Beard and Facial Expressions via 8 bit Alpha Channel
The presence of a beard of variable length covering a portion of the face surface of a subject previously enrolled without it (or vice versa) could lead to a measurable difference in the overall or local 3D shape of the face mesh (see Figure 3.4). In this case the recognition accuracy could be affected, resulting, for instance, in a higher False Rejection Rate (FRR). To improve the robustness to this kind of variable facial feature we rely on color data from the captured face texture to mask the non-skin regions, disregarding them during the comparison.

Figure 3.4 Normal maps of the same subject enrolled in two different sessions with and
without beard
We exploit flesh hue characterization in the HSB color space to discriminate between skin and beard/moustaches/eyebrows. Indeed, the hue component of each given texel is much less affected by lighting conditions during capturing than its corresponding RGB value. Nevertheless, there could be a wide range of hue values within each skin region due to factors like facial morphology, skin conditions and pathologies, race, etc., so we need to define this range on a case-by-case basis to obtain a valid mask. To this aim we use a set of specific hue sampling spots located over the face texture at absolute coordinates, selected to be representative of the flesh's full tonal range and possibly distant enough from the eyes, the lips and the regions typically covered by beard and hair.
This is possible because each face mesh and its texture are centered and normalized during the image-based reconstruction process (i.e. the face's median axis is always centered on the origin of 3D space with horizontal mapping coordinates equal to 0.5); otherwise normal map comparison would not be possible. We could use a 2D or 3D technique to locate the main facial features (eyes, nose and lips) and position the sampling spots relative to these features, but even these approaches are not safe under all conditions. For each sampling spot we sample not just that texel but a 5 × 5 matrix of neighbour texels, averaging them to minimize the effect of local image noise. As any sampling spot could accidentally pick up wrong values due to local skin color anomalies such as moles or scars, or even due to improper positioning, we calculate the median of the hue values resulting from all sampling spots, obtaining a main Flesh Hue Value FHV which is the center of the valid flesh hue range. We therefore consider as belonging to the skin region all the texels whose hue value is within the range FHV − t ≤ hue ≤ FHV + t, where t is a hue tolerance which we experimentally found could be set below 10° (see Figure 5-b). After the skin region has been selected, it is filled with pure white while the remaining pixels are converted to a greyscale value depending on their distance from the selected flesh hue range (the greater the distance, the darker the value).
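A simplified version of the flesh mask computation is sketched below. It assumes the face texture is an RGB array, uses a handful of fixed sampling spots given in normalized texture coordinates, and treats the hue tolerance and the grey fall-off as free parameters; the spot positions and constants are purely illustrative.

import colorsys
import numpy as np

# Illustrative sampling spots in normalized (u, v) texture coordinates (assumed positions).
SAMPLING_SPOTS = [(0.5, 0.35), (0.35, 0.55), (0.65, 0.55), (0.5, 0.8)]

def hue_image(texture):
    """Per-pixel hue in degrees, computed in HSB/HSV space."""
    rgb = texture.astype(np.float64) / 255.0
    h = np.empty(texture.shape[:2])
    for i in range(texture.shape[0]):
        for j in range(texture.shape[1]):
            h[i, j] = colorsys.rgb_to_hsv(*rgb[i, j])[0] * 360.0
    return h

def flesh_mask(texture, tolerance=10.0):
    """8 bit mask: white inside the flesh hue range, darker with growing hue distance."""
    hue = hue_image(texture)
    rows, cols = hue.shape
    samples = []
    for u, v in SAMPLING_SPOTS:
        r, c = int(v * (rows - 1)), int(u * (cols - 1))
        patch = hue[max(r - 2, 0):r + 3, max(c - 2, 0):c + 3]    # 5 x 5 neighbourhood average
        samples.append(patch.mean())
    fhv = float(np.median(samples))                              # main Flesh Hue Value
    dist = np.minimum(np.abs(hue - fhv), 360.0 - np.abs(hue - fhv))  # circular hue distance
    # Assumed linear fall-off of 4 grey levels per degree outside the tolerance band.
    mask = np.where(dist <= tolerance, 255.0,
                    np.clip(255.0 - (dist - tolerance) * 4.0, 0.0, 255.0))
    return mask.astype(np.uint8)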
To improve the facial recognition system and to address facial expressions, we opt for the use of an expression weighting mask, a subject-specific pre-calculated mask aimed at assigning different relevance to different face regions. This mask, which shares the same size as the normal map and the difference map, contains for each pixel an 8 bit weight encoding the local rigidity of the face surface, based on the analysis of a pre-built set of facial expressions of the same subject. Indeed, for each subject enrolled, each expression variation (see Figure 3.5) is compared to the neutral face, resulting in a set of difference maps.

Figure 3.5 An example of normal maps of the same subject featuring a neutral pose (leftmost face) and different facial expressions.
The average of this set of difference maps, specific to the same individual, represents its expression weighting mask. More precisely, given a generic face with its normal map N0 (neutral face) and the set of normal maps N1, N2, …, Nn (the expression variations), we first calculate the set of difference maps D1, D2, …, Dn resulting from {N0 − N1, N0 − N2, …, N0 − Nn}. The average of the set {D1, D2, …, Dn} is the expression weighting mask, which is multiplied by the difference map in each comparison between two faces. We generate the expression variations through a parametric rig-based deformation system previously applied to a prototype face mesh, morphed to fit the reconstructed face mesh (Enciso et al., 1999). This fitting is achieved via a landmark-based volume morphing, where the transformation and deformation of the prototype mesh is guided by the interpolation of a set of landmark points with a radial basis function. To improve the accuracy of this rough mesh fitting, we apply a surface optimization obtained by minimizing a cost function based on the Euclidean distance between vertices. We can then augment each 24 bit normal map with the product of the Flesh Mask and the Expression Weighting Mask normalized to 8 bits (see Figure 3.6). The resulting 32 bit per pixel RGBA bitmap can be conveniently managed via various image formats like the Portable Network Graphics format (PNG), which is typically used to store for each pixel 24 bits of colour and 8 bits of alpha channel (transparency). When comparing any two faces, the difference map is computed on the first 24 bits of color info (the normals) and multiplied by the alpha channel (the filtering mask).
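The construction of the expression weighting mask and of the final 32 bit augmented normal map can be sketched as follows. The angle-to-grey conversion and the plain averaging of the difference maps follow the description above but are simplifying assumptions.

import numpy as np

def _decode(normal_map):
    """RGB values in [0, 255] back to unit normals."""
    n = normal_map.astype(np.float64) / 255.0 * 2.0 - 1.0
    return n / (np.linalg.norm(n, axis=2, keepdims=True) + 1e-12)

def difference_map(normal_map_a, normal_map_b):
    """Per-pixel angular difference (0..180 degrees) rescaled to an 8 bit grey image."""
    dot = np.clip((_decode(normal_map_a) * _decode(normal_map_b)).sum(axis=2), -1.0, 1.0)
    theta = np.degrees(np.arccos(dot))
    return (theta / 180.0 * 255.0).astype(np.uint8)

def expression_weighting_mask(neutral_map, expression_maps):
    """Average of the difference maps between the neutral pose and each expression variation."""
    diffs = [difference_map(neutral_map, m).astype(np.float64) for m in expression_maps]
    return np.mean(diffs, axis=0).astype(np.uint8)

def augmented_normal_map(normal_map, flesh_mask, expression_mask):
    """32 bit RGBA bitmap: normals in RGB, product of the two masks in the alpha channel."""
    alpha = ((flesh_mask / 255.0) * (expression_mask / 255.0) * 255.0).astype(np.uint8)
    return np.dstack([normal_map, alpha])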

3.5. Testing Face Recognition System into an Ambient Intelligence Framework
Ambient Intelligence (AmI) worlds offer exciting potential for rich interactive experiences. The metaphor of AmI envisages the future as intelligent environments where humans are surrounded by smart devices that make the environment itself perceptive to humans' needs or wishes. The Ambient Intelligence Environment can be defined as the set of actuators and sensors composing the system, together with the domotic interconnection protocol. People interact with electronic devices embedded in environments that are sensitive and responsive to the presence of users. This objective is achievable if the environment is capable of learning, building and manipulating user profiles, considering on one side the need to clearly identify the human attitude; in other terms, on the basis of the physical and emotional user status captured from a set of biometric features.

Figure 3.6. Comparison of two Normal Maps using Flesh Mask and the resulting Difference Map
Figure 3.7. Ambient Intelligence Architecture
To design Ambient Intelligent Environments, many methodologies and techniques have to be merged together, giving rise to the many approaches reported in recent literature (Basten & Geilen, 2003). We opt for a framework aimed at gathering biometrical and environmental data, described in (Acampora et al., 2005), to test the effectiveness of face recognition in aiding security and in recognizing the emotional user status. This AmI system's architecture is organized in several sub-systems, as depicted in Figure 3.7, and it is based on the following sensors and actuators: internal and external temperature sensors and an internal temperature actuator, internal and external luminosity sensors and an internal luminosity actuator, an indoor presence sensor, an infrared camera to capture thermal images of the user, and a set of color cameras to capture information about gait and facial features. Firstly, Biometric Sensors are used to gather the user's biometrics (temperature, gait, position, facial expression, etc.), and part of this information is handled by Morphological Recognition Subsystems (MRS) able to organize it semantically. The resulting description, together with the remaining biometrics previously captured, is organized in a hierarchical structure based on XML technology in order to create a new markup language, called H2ML (Human to Markup Language), representing the user status at a given time. Considering a sequence of H2ML descriptions, the Behavioral Recognition Engine (BRE) tries to recognize a particular user behaviour for which the system is able to provide suitable services. The available services are regulated by means of the Service Regulation System (SRS), an array of fuzzy controllers coded in FML (Acampora & Loia, 2004) aimed at achieving hardware transparency and minimizing the fuzzy inference time. This architecture is able to distribute personalized services on the basis of the physical and emotional user status captured from a set of biometric features and modelled by means of a mark-up language based on XML. This approach is particularly suited to exploiting biometric technologies to capture the user's physical information, gathered in a semantic representation describing a human in terms of morphological features.





















CHAPTER 4
Face Recognition Using Neural Network

Extracted facial features are fed into a Back-propagation Neural Network, and the network is trained to create a knowledge base for recognition, which is then used to recognize unknown faces.

Fig.4.1 Cycle of Genetic Algorithm
In recognition by Genetic Algorithm, matrix crossover, a crossover rate of 5 and 10 generations have been used. An outline of the system is given in fig.4.1.
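A recognition step driven by a genetic algorithm can be sketched as below. The chromosome encoding, the fitness function, the population size and the block-wise "matrix crossover" shown here are illustrative assumptions; the report only states that matrix crossover, a crossover rate of 5 and 10 generations were used, and mutation is omitted for brevity.

import random
import numpy as np

def matrix_crossover(a, b):
    """Swap a randomly chosen block of rows between two parent feature matrices."""
    r1, r2 = sorted(random.sample(range(a.shape[0] + 1), 2))
    child = a.copy()
    child[r1:r2] = b[r1:r2]
    return child

def ga_recognize(probe, gallery, generations=10, population_size=20, crossovers_per_generation=5):
    """Evolve candidate templates seeded from the gallery of (label, feature matrix) pairs;
    the label of the fittest individual after the final generation is returned."""
    population = [(label, feats.copy()) for label, feats in random.choices(gallery, k=population_size)]
    fitness = lambda feats: -float(np.linalg.norm(feats - probe))   # closer to the probe = fitter
    for _ in range(generations):
        for _ in range(crossovers_per_generation):
            (la, fa), (lb, fb) = random.sample(population, 2)
            child = matrix_crossover(fa, fb)
            # the child inherits the label of the parent it resembles more
            label = la if np.linalg.norm(child - fa) <= np.linalg.norm(child - fb) else lb
            population.append((label, child))
        population.sort(key=lambda ind: fitness(ind[1]), reverse=True)
        population = population[:population_size]                   # elitist survival
    return population[0][0]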


Fig.4.2. Outline of Face Recognition System by using Back-propagation Neural Network
As the recognition machine of the system, a three-layer neural network has been used, trained with the Error Back-propagation learning technique with an error tolerance of 0.001. An outline of the complete system is given in fig.4.2.
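As a concrete illustration, a three-layer (input, hidden, output) network trained with error back-propagation until the mean squared error falls below the 0.001 tolerance can be written as follows. The layer sizes, the learning rate and the omission of bias terms are simplifying assumptions, not values taken from the report.

import numpy as np

class BackpropNetwork:
    """Minimal three-layer feed-forward network trained with error back-propagation."""

    def __init__(self, n_in, n_hidden, n_out, learning_rate=0.5, seed=0):
        rng = np.random.default_rng(seed)
        self.w1 = rng.uniform(-0.5, 0.5, (n_in, n_hidden))
        self.w2 = rng.uniform(-0.5, 0.5, (n_hidden, n_out))
        self.lr = learning_rate

    @staticmethod
    def _sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def forward(self, x):
        self.h = self._sigmoid(x @ self.w1)       # hidden layer activations
        self.o = self._sigmoid(self.h @ self.w2)  # output layer activations
        return self.o

    def train(self, X, T, tolerance=0.001, max_epochs=100000):
        """Repeat weight updates until the mean squared error drops below the tolerance."""
        for _ in range(max_epochs):
            out = self.forward(X)
            err = T - out
            if np.mean(err ** 2) < tolerance:
                break
            delta_o = err * out * (1.0 - out)                        # output-layer error term
            delta_h = (delta_o @ self.w2.T) * self.h * (1.0 - self.h)
            self.w2 += self.lr * self.h.T @ delta_o                  # gradient-descent updates
            self.w1 += self.lr * X.T @ delta_h

In the system described here, X would hold the 30 × 30 feature images flattened to 900-element vectors and T the target codes of the enrolled persons.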
Face Image Acquisition
To collect the face images, a scanner has been used. After scanning, the image can be saved into various formats such as Bitmap, JPEG, GIF and TIFF. This FRS can process face images of any format. The face images in fig.4.3 have been taken as samples.

Fig.4.3 Sample of Face Images
Filtering and Clipping
The input face image may contain noise and garbage data that must be removed. A filter has been used to fix these problems; for this purpose the median filtering technique has been applied. After filtering, the image is clipped to retain only the necessary data, removing the unnecessary background that surrounds the image. This is done by detecting the window co-ordinates (Xmin, Ymin) and (Xmax, Ymax). The clipped form of the previous sample image is shown in fig.4.4.
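A simplified version of this step, assuming a grayscale image stored as a 2D array and treating every non-background pixel as face data, is sketched below.

import numpy as np

def median_filter(image, size=3):
    """Replace each pixel by the median of its size x size neighbourhood to remove noise."""
    pad = size // 2
    padded = np.pad(image, pad, mode='edge')
    out = np.empty_like(image)
    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            out[i, j] = np.median(padded[i:i + size, j:j + size])
    return out

def clip_to_content(image, background=0):
    """Crop the image to the bounding window (Xmin, Ymin)-(Xmax, Ymax) of non-background pixels."""
    ys, xs = np.nonzero(image != background)
    return image[ys.min():ys.max() + 1, xs.min():xs.max() + 1]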

Fig.4.4 Clipped form of the sample Face Images
Edge detection
Several methods of edge detection exist in practice. The procedure for determining the edges of an image is similar in all of them; the only difference is the choice of mask. Different types of masks can be applied, such as Sobel, Prewitt, Kirsch or the quick mask, to obtain the edges of a face image. The performance of the different masks shows only negligible discrepancy, but here the quick mask has been used because it is smaller than the others. It is also applied in only one pass over an image, whereas the others are applied in eight directions, so the quick mask is roughly eight times faster than the other masks. The edges of a face detected after applying the quick mask are shown in fig.4.5.
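Edge detection with any of these masks is a single 2D convolution followed by thresholding. The sketch below uses a 3 × 3 kernel in the spirit of the quick mask; the exact quick mask coefficients are not reproduced in this report, so the kernel and threshold shown are assumptions.

import numpy as np

# Assumed 3 x 3 edge kernel; substitute the actual quick mask coefficients if available.
QUICK_MASK = np.array([[-1.0,  0.0, -1.0],
                       [ 0.0,  4.0,  0.0],
                       [-1.0,  0.0, -1.0]])

def detect_edges(image, mask=QUICK_MASK, threshold=30):
    """Convolve the image with the mask in a single pass and threshold the response."""
    rows, cols = image.shape
    padded = np.pad(image.astype(np.float64), 1, mode='edge')
    edges = np.zeros((rows, cols), dtype=np.uint8)
    for i in range(rows):
        for j in range(cols):
            response = np.sum(padded[i:i + 3, j:j + 3] * mask)
            edges[i, j] = 255 if abs(response) > threshold else 0
    return edges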

Fig.4.5 Edges of Face Images

Image Scaling
There are various techniques for scaling an image. Here the shrinking technique has been used to reduce the image to 30 × 30 pixels. After scaling, the images are:

Fig.4.6 Scaled images (30 × 30)
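One simple way to shrink an image to 30 × 30, by averaging the block of source pixels behind each target pixel, is sketched below; plain nearest-pixel sampling would work as well.

import numpy as np

def shrink(image, target=(30, 30)):
    """Reduce the image to the target size by averaging the source block behind each output pixel."""
    rows, cols = image.shape
    out = np.empty(target, dtype=image.dtype)
    for i in range(target[0]):
        for j in range(target[1]):
            r0 = i * rows // target[0]
            r1 = max((i + 1) * rows // target[0], r0 + 1)
            c0 = j * cols // target[1]
            c1 = max((j + 1) * cols // target[1], c0 + 1)
            out[i, j] = image[r0:r1, c0:c1].mean()
    return out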
Feature Extraction
To extract the features of a face, the image is first converted into binary form. From this binary image the centroid (X, Y) of the face image is calculated as

X = Σ x·m / Σ m,   Y = Σ y·m / Σ m

where the sums run over all pixels, x, y are the co-ordinate values and m = f(x, y) = 0 or 1. Then, starting from the centroid, only the face has been cropped and converted into grey level, and the features have been collected.
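A sketch of this step is given below. The binarization threshold, the assumption that dark pixels belong to the face, and the size of the window cropped around the centroid are illustrative choices.

import numpy as np

def extract_features(gray_image, threshold=128, half_size=15):
    """Binarize, locate the centroid of the face pixels, and crop a window around it."""
    m = (gray_image < threshold).astype(np.float64)       # assumed: dark pixels belong to the face
    total = m.sum()
    ys, xs = np.mgrid[0:gray_image.shape[0], 0:gray_image.shape[1]]
    cy = int((m * ys).sum() / total)                       # centroid Y = sum(y*m) / sum(m)
    cx = int((m * xs).sum() / total)                       # centroid X = sum(x*m) / sum(m)
    top, left = max(cy - half_size, 0), max(cx - half_size, 0)
    crop = gray_image[top:cy + half_size, left:cx + half_size]
    return crop.astype(np.float64).ravel() / 255.0         # grey-level feature vector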

Fig.4.7 Features of the faces
Recognition
Extracted features of the face images have been fed into the Genetic Algorithm and the Back-propagation Neural Network for recognition. In the first case, the unknown input face image has been recognized by both the Genetic Algorithm and the Back-propagation Neural Network. This is outlined in fig.4.8(a).

Fig.4.8 (a). Recognition phase
The unknown input face image has been recognized by the Genetic Algorithm, but has not been recognized by the Back-propagation Neural Network. This is outlined in fig.4.8(b).


Fig.4.8 (b). Recognition phase

The unknown input face image has been recognized by Back-propagation Neural Network, but has not been recognized by Genetic Algorithm.













CHAPTER 5
APPLICATIONS
Neural networks are applicable in virtually every situation in which a relationship between the predictor variables (independents, inputs) and the predicted variables (dependents, outputs) exists, even when that relationship is very complex and not easy to articulate in the usual terms of "correlations" or "differences between groups." A few representative examples of problems to which neural network analysis has been applied successfully are:
Detection of medical phenomena. A variety of health-related indices (e.g., a combination of heart rate, levels of various substances in the blood, respiration rate) can be monitored. The onset of a particular medical condition could be associated with a very complex (e.g., nonlinear and interactive) combination of changes on a subset of the variables being monitored. Neural networks have been used to recognize this predictive pattern so that the appropriate treatment can be prescribed.
Stock market prediction. Fluctuations of stock prices and stock indices are another example of a complex, multidimensional, but in some circumstances at least partially-deterministic phenomenon. Neural networks are being used by many technical analysts to make predictions about stock prices based upon a large number of factors such as past performance of other stocks and various economic indicators.
Credit assignment. A variety of pieces of information are usually known about an applicant for a loan. For instance, the applicant's age, education, occupation, and many other facts may be available. After training a neural network on historical data, neural network analysis can identify the most relevant characteristics and use those to classify applicants as good or bad credit risks.
Monitoring the condition of machinery. Neural networks can be instrumental in cutting costs by bringing additional expertise to scheduling the preventive maintenance of machines. A neural network can be trained to distinguish between the sounds a machine makes when it is running normally ("false alarms") versus when it is on the verge of a problem. After this training period, the expertise of the network can be used to warn a technician of an upcoming breakdown, before it occurs and causes costly unforeseen "downtime."
Engine management. Neural networks have been used to analyze the input of sensors from an engine. The neural network controls the various parameters within which the engine functions, in order to achieve a particular goal, such as minimizing fuel consumption.




















CHAPTER 6
CONCLUSION AND FUTURE

Neural networks are suitable for predicting time series mainly because they learn only from examples, without any need to add additional information that could bring more confusion than predictive effect. Neural networks are able to generalize and are resistant to noise. On the other hand, it is generally not possible to determine exactly what a neural network has learned, and it is also hard to estimate its possible prediction error.
Neural networking promises to provide computer science breakthroughs that rival anything we have yet witnessed. Once neural networks are trained properly, they can replace many human functions in targeted areas. We hope that our application will provide a small but important step in that journey.
We have only begun to scratch the surface in the development and implementation of neural networks in commercial applications. It is projected that there will be a lot of development in this area in the years to come, largely because neural networks are a very marketable technology: they are flexible, easy to integrate into a system, adapt to the data and can classify it in numerous fashions under extreme conditions (Rumelhart and McClelland).
Developments are already in place to create hardware to make neural nets faster and more efficient. And though many dream of one day perfecting neural nets to create a truly amazing AI system, it is important to remember where the development has taken us, the lessons that have been learned and the barriers that have been overcome to get here.





Saturday, March 6, 2010

Report on self healing robots

CHAPTER -1
1. INTRODUCTION
1.1 ROBOTS
A robot is a mechanical or virtual artificial agent. It is usually an electromechanical system which, by its appearance or movements, conveys a sense that it has intent or agency of its own.
A typical robot will have several, though not necessarily all, of the following properties:
• Is not 'natural' i.e. has been artificially created.
• Can sense its environment.
• Can manipulate things in its environment.
• Has some degree of intelligence or ability to make choices based on the environment or automatic control / pre-programmed sequence.
• Is programmable.
• Can move with one or more axes of rotation or translation.
• Can make dexterous coordinated movements.
• Appears to have intent or agency (reification or pathetic fallacy).

Robotic systems are of growing interest because of their many practical applications as well as their ability to help understand human and animal behavior, cognition, and physical performance. Although industrial robots have long been used for repetitive tasks in structured environments, one of the long-standing challenges is achieving robust performance under uncertainty. Most robotic systems use a manually constructed mathematical model that captures the robot's dynamics and is then used to plan actions. Although some parametric identification methods exist for automatically improving these models, making accurate models is difficult for complex machines, especially when trying to account for possible topological changes to the body, such as changes resulting from damage.





1.2. ERROR RECOVERY
Recovery from error, failure or damage is a major concern in robotics. A majority of the effort in programming automated systems is dedicated to error recovery. The need for automated error recovery is even more acute in the field of remote robotics, where human operators cannot manually repair or compensate for damage or failure. Here, it is explained how a four-legged robot automatically synthesizes a predictive model of its own topology (where and how its body parts are connected) through limited yet self-directed interaction with its environment, and then uses this model to synthesize successful new locomotive behavior before and after damage. These findings may help develop more robust robots, as well as shed light on the relation between curiosity and cognition in animals and humans.

Fig 1.1 Robot





CHAPTER 2

2. SELF HEALING OR SELF MODELLING ROBOTS

When people or animals get injured, they compensate for minor injuries and keep limping along. In the case of robots, however, even a slight injury can make them stumble and fall. Self-healing robots have the ability to adapt to minor injuries and continue their job. Such a robot is able to indirectly infer its own morphology through self-directed exploration and then use the resulting self-models to synthesize new behaviour. If the robot's topology unexpectedly changes, the same process restructures its internal self-models, leading to the generation of qualitatively different, compensatory behavior. In essence, the process enables the robot to continuously diagnose and recover from damage. Unlike other approaches to damage recovery, the concept introduced here does not presuppose built-in redundancy, dedicated sensor arrays, or contingency plans designed for anticipated failures. Instead, the approach is based on the concept of multiple competing internal models and the generation of actions that maximize disagreement between the predictions of these models.
2.1 RESEARCHERS

Fig 2.1 Victor Zykov, Josh Bongard, and Hod Lipson


This research was done at the Computational Synthesis Lab at Cornell University. The team members are Josh Bongard, Victor Zykov, and Hod Lipson. Josh Bongard was a postdoctoral researcher at Cornell while performing this research and has since moved to the University of Vermont, where he is now an Assistant Professor. Victor Zykov is a Ph.D. student at CCSL, and Hod Lipson is an Assistant Professor at Cornell and directs the Computational Synthesis Lab. This project was funded by the NASA Program on Intelligent Systems and by the National Science Foundation program in Engineering Design.

2.2 THE STARFISH ROBOT

2.2.1 CHARACTERIZING THE TARGET SYSTEM

The target system in this study is a quadrupedal, articulated robot with eight actuated degrees of freedom. The robot consists of a rectangular body and four legs attached to it with hinge joints on each of the four sides of the robot’s body. Each leg in turn is composed of an upper and lower leg, attached together with a hinge joint. All eight hinge joints of the robot are actuated with Airtronics 94359 high torque servomotors. However, in the current study, the robot was simplified by assuming that the knee joints are frozen: all four legs are held straight when the robot is commanded to perform some action. The following table gives the overall dimensions of the robot’s parts.
Table 2.1 Overall dimensions of robot

All eight servomotors are controlled using an on-board PC-104 computer via a serial servo control board SV-203B, which converts serial commands into pulse-width modulated signals. The servo drives are capable of producing a maximum of 200 ounce-inches of torque and 60 degrees per second of speed. The actuation ranges for all of the robot's joints are summarized in the following table.

Table 2.2 Actuation ranges


This four-legged robot can automatically synthesize a predictive model of its own topology (where and how its body parts are connected), and then successfully move around. It can also use this "proprioceptive" sense to determine if a component has been damaged, and then model new movements that take the damage into account.

The robot is equipped with a suite of different sensors polled by a 16-bit 32-channel PC-104 Diamond MM-32XAT data acquisition board. For the current identification task, three sensor modalities were used: an external sensor was used to determine the left/right and forward/back tilt of the robot; four binary values indicated whether each foot was touching the ground or not; and one value indicated the clearance distance from the robot's underbelly to the ground, along the normal to its lower body surface. All sensor readings were conducted manually; however, all three kinds of signals will in future be recorded by on-board accelerometers, the strain gauges built into the lower legs, and an optical distance sensor placed on the robot's belly.




Fig 2.2 The starfish robot with reflection

2.3 SELF MODELLING BRIEFLY
Here, it is explained how the four-legged robot automatically synthesizes a predictive model of its own topology (where and how its body parts are connected) through limited yet self-directed interaction with its environment, and then uses this model to synthesize successful new locomotive behavior before and after damage. These findings may help develop more robust robots, as well as shed light on the relation between curiosity and cognition in animals and humans.

A robot's most formidable enemy is an uncertain and changing environment. Typically, robots depend on internal maps (either provided or learned) and on sensory data to orient themselves with respect to that map and to update their location. If the environment is changing or noisy, the robot has to navigate under uncertainty and constantly update the probabilities that a particular action will achieve a particular result. The situation becomes even worse if the robot's own shape and configuration can change, that is, if its internal model becomes inaccurate. In most cases, such an event constitutes the end of that particular robot's adventure. Although much progress has been made in allowing robotic systems to model their environment autonomously, relatively little is known about how a robot can learn its own morphology, which cannot be inferred by direct observation or retrieved from a database of past experiences. Without internal models, robotic systems can autonomously synthesize increasingly complex behaviors or recover from damage through physical trial and error, but this requires hundreds or thousands of tests on the physical machine and is generally too slow, energetically costly, or risky.

Here, we describe an active process that allows a machine to sustain performance through an autonomous and continuous process of self-modeling. A robot is able to indirectly infer its own morphology through self-directed exploration and then use the resulting self-models to synthesize new behaviours. If the robot's topology unexpectedly changes, the same process restructures its internal self-models, leading to the generation of qualitatively different, compensatory behavior. In essence, the process enables the robot to continuously diagnose and recover from damage. Unlike other approaches to damage recovery, the concept introduced here does not presuppose built-in redundancy, dedicated sensor arrays, or contingency plans designed for anticipated failures. Instead, the approach is based on the concept of multiple competing internal models and the generation of actions that maximize disagreement between the predictions of these models. The process is composed of three algorithmic components that are executed continuously by the physical robot while moving or at rest (Fig. 2.3): modeling, testing, and prediction.

Phases in self-healing: Initially, the robot performs an arbitrary motor action and records the resulting sensory data (Fig. 2.3A). The model synthesis component (Fig. 2.3B) then synthesizes a set of 15 candidate self-models, using stochastic optimization, to explain the observed sensory-actuation causal relationship. The action synthesis component (Fig. 2.3C) then uses these models to find a new action most likely to elicit the most information from the robot. This is accomplished by searching for the actuation pattern that, when executed on each of the candidate self-models, causes the most disagreement across the predicted sensor signals. This new action is performed by the physical robot (Fig. 2.3A), and the model synthesis component then reiterates with more information available for assessing model quality. After 16 cycles of this process have been completed, the most accurate model is used by the behavior synthesis component to create a desired behavior (Fig. 2.3D) that can then be executed by the robot (Fig. 2.3E). If the robot detects unexpected sensor-motor patterns or an external signal as a result of unanticipated morphological change, it reinitiates the alternating cycle of modeling and exploratory actions to produce new models reflecting the change. The new most accurate model is then used to generate a new, compensating behavior that recovers functionality. A complete sample experiment is shown in Fig. 2.4.
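The alternating cycle can be summarized in a Python-style sketch. The four callables (execute_on_robot, synthesize_models, predict, optimize_behavior) are assumed interfaces standing in for the robot hardware, the stochastic model optimizer, the simulator and the behavior optimizer used by the researchers, and actions are reduced to single numbers for brevity.

import random

def self_modeling_cycle(execute_on_robot, synthesize_models, predict, optimize_behavior,
                        num_models=15, num_cycles=16):
    """Alternate exploratory actions and model synthesis, then derive a behavior.

    All four callables are hypothetical interfaces, not real library calls."""
    data = []                                              # (action, observed sensor values) pairs
    action = random.random()                               # arbitrary initial motor action
    for _ in range(num_cycles):
        data.append((action, execute_on_robot(action)))    # act and record the sensory result
        models = synthesize_models(data, num_models)       # candidate self-models explaining the data
        # choose the next action that makes the candidate models disagree the most
        candidates = [random.random() for _ in range(100)]
        action = max(candidates,
                     key=lambda a: disagreement([predict(m, a) for m in models]))
    best_model = synthesize_models(data, 1)[0]             # most accurate model after the 16 cycles
    return optimize_behavior(best_model)                   # e.g. a forward-locomotion gait

def disagreement(predictions):
    """Variance of the predicted sensor values across the competing models."""
    mean = sum(predictions) / len(predictions)
    return sum((p - mean) ** 2 for p in predictions) / len(predictions)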

Fig.2.3 Outline of the algorithm.



Fig. 2.4 Robot modeling and behavior.


The proposed process was tested on a four-legged physical robot that had eight motorized joints, eight joint angle sensors, and two tilt sensors. The space of possible models comprised any planar topological arrangement of eight limbs, including chains and trees (for examples, see Figs. 2.3 and 2.4). After damage occurs, the space of topologies is fixed to the previously inferred morphology, but the size of the limbs can be scaled (Fig. 2.4, N and O). The space of possible actions comprised desired angles that the motors were commanded to reach. Many other self-model representations could replace the explicit simulations used here, such as artificial neural or Bayesian networks, and other sensory modalities could be exploited, such as pressure and acceleration (here the joint angle sensors were used only to verify achievement of desired angles and orientation of the main body was used only for self-model synthesis). Nonetheless, the use of implicit representations such as artificial neural networks—although more biologically plausible than explicit simulation—would make the validation of our theory more challenging, because it would be difficult to assess the correctness of the model (which can be done by visual inspection for explicit simulations). More important, without an explicit representation, it is difficult to reward a model for a task such as forward locomotion (which requires predictions about forward displacement) when the model can only predict orientation data. The proposed process (Model-driven algorithm) was compared with two baseline algorithms,

Table 2.3 Results of baseline algorithms
The proposed process (the model-driven algorithm) was compared with two baseline algorithms, both of which use random rather than self-model-driven data acquisition. All three algorithm variants used a similar amount of computational effort (approximately 250,000 internal model simulations) and the same number (16) of physical actions (Table 2.3). In the first baseline algorithm, 16 random actions were executed by the physical robot (Fig. 2.3A), and the resulting data were supplied to the model synthesis component for batch training (Fig. 2.3B). In the second baseline algorithm, the components associated with Fig. 2.3, A to C, were cycled as in the proposed algorithm, but the action synthesis component (Fig. 2.3C) output a random action rather than one optimized to create disagreement among the competing candidate self-models. Before damage, the robot began each experiment with a set of random models; after damage, the robot began with the best model produced by the model-driven algorithm (Fig. 2.4F).

It was found that the probability of inferring a topologically correct model was notably higher for the model-driven algorithm than for either random baseline algorithm (Table 2.3), and that the final models were on average more accurate for the model-driven algorithm than for either random baseline algorithm (Table 2.3). Similarly, after damage, the robot was better able to infer that one leg had been reduced in length using the model-driven algorithm than using either baseline algorithm. This indicates that alternating random actions with modeling, compared with simply performing several actions first and then modeling, does not improve model synthesis (baseline 2 does not outperform baseline 1), but a robot that actively chooses which action to perform next on the basis of its current set of hypothesized self-models has a better chance of successfully inferring its own morphology than a robot that acts randomly (the model-driven algorithm outperforms baseline algorithms 1 and 2).

Because the robot is assumed not to know its own morphology a priori, there is no way for it to determine whether its current models have captured its body structure correctly. It was found that disagreement among the current model set (information that is available to the algorithm) is a good indicator of model error (the actual inaccuracy of the model, which is not available to the algorithm), because a positive correlation exists between model disagreement and model error across the (n = 30) experiments that used the model-driven algorithm (Spearman rank correlation = 0.425, P < 0.02). Therefore, the experiment that resulted in the most model agreement (through convergence toward the correct model) was judged the most successful of the 30 experiments performed, and the best model it produced (Fig. 2.4F) was selected for behavior generation. This was also the starting model that the robot used when it suffered unexpected damage (Table 2.3).

The behavior synthesis component (Fig. 2.3D) was executed many times with this model, starting each time with a different set of random behaviors. Although there is some discrepancy between the predicted distance and the actual distance, there is a clear forward motion trend that is absent from the random behaviors. This indicates that this automatically generated self-model was sufficiently predictive to allow the robot to consistently develop forward motion patterns without further physical trials. The transferal from the self-model to reality was not perfect, although the gaits were qualitatively similar; differences between the simulated and physical gaits were most likely due to friction and kinematic bifurcations at symmetrical postures, both of which are difficult to predict. Similarly, after damage, the robot was able to synthesize sufficiently accurate models (an example is given in Fig. 2.4O) for generating new, compensating behaviors that enabled it to continue moving forward.
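The key step in the model-driven algorithm is choosing the next physical action by how much the current candidate self-models disagree about its outcome. The sketch below only illustrates that idea; it is not the authors' implementation. The function simulate(model, action) is an assumed callback returning a predicted sensor value (for example, predicted body tilt), and plain random sampling of candidate actions stands in for the evolutionary search used in the actual experiments.

import random

def action_disagreement(models, action, simulate):
    # Variance of the predictions (e.g. expected body tilt) that the current
    # candidate self-models make for one candidate action.
    predictions = [simulate(model, action) for model in models]
    mean = sum(predictions) / len(predictions)
    return sum((p - mean) ** 2 for p in predictions) / len(predictions)

def choose_next_action(models, simulate, n_candidates=1000):
    # Pick the candidate action that the competing self-models disagree about
    # most, so that executing it on the physical robot is maximally informative.
    candidates = [[random.uniform(-0.5, 0.5) for _ in range(8)]   # 8 desired joint angles
                  for _ in range(n_candidates)]
    return max(candidates, key=lambda a: action_disagreement(models, a, simulate))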








3. ALGORITHM

A number of algorithms based on repeated testing for error recovery have been proposed and demonstrated for both robotics and electronic circuits. However, repeated generate-and-test algorithms for robotics are not desirable for several reasons: repeated trials may exacerbate damage and drain limited energy; long periods of time are required for repeated hardware trials; damage may require rapid compensation (e.g. power drain due to coverage of solar panels); and repeated trials continuously change the state of the robot, making damage diagnosis difficult. Due to recent advances in simulation, it has become possible to automatically evolve the morphology and the controller of simulated robots together in order to achieve some desired behavior. Here we also use evolutionary algorithms to co-evolve robot bodies and brains, but use an inverse process: instead of evolving a controller given a robot morphology, we evolve a robot morphology given a controller. Also, instead of evolving towards a high fitness as a form of design, we evolve towards an observed low fitness (caused by some unknown failure) as a form of diagnosis. By not making a distinction between the robot's morphology and controller, and by employing an evolutionary algorithm, the algorithm can compensate for damage or failure of the robot's mechanics, its sensory or motor apparatus, the controller itself, or some combination of these failure types. This stands in contrast to all other approaches to automated recovery so far, which can only compensate for a few pre-specified failures. Moreover, by using an evolutionary algorithm for recovery, qualitatively different behaviors (such as hopping instead of walking) can evolve in response to failure. More traditional analytic approaches can only produce slightly modified behaviors in response to mild damage.







3.1 ALGORITHM OVERVIEW

Estimation-exploration algorithm

The estimation-exploration algorithm is essentially a co-evolutionary process comprising two populations. One population is of candidate models of the target system, where a model's fitness is determined by its ability to correctly explain observed data from the target system. The other population is of candidate tests (inputs to the target system), each of whose fitness is determined by its ability to cause disagreement among the candidate models (thereby elucidating model uncertainties), or by exploiting agreement among models to achieve some desired output (thereby capitalizing on model certainties). In this application the estimation-exploration algorithm has two functions: damage hypothesis evolution (the estimation phase) and controller evolution (the exploration phase). The algorithm also maintains a database, which stores pairs of data: an evolved controller and the fitness produced by the 'physical' robot when that controller is used. Two separate evolutionary algorithms, the estimation EA and the exploration EA, are used to generate hypotheses regarding the failure incurred by the physical robot and controllers for the simulated and 'physical' robot, respectively. Figure 3.1 outlines the flow of the algorithm, along with a comparison against an algorithm that evolves function recovery entirely on the physical robot.


Fig. 3.1 Flow chart of estimation-exploration phases

Exploration Phase: Controller Evolution. The exploration EA is used to evolve a controller for the simulated robot, such that it is able to perform some task. The first pass through this phase generates the controller for the intact physical robot: subsequent passes attempt to evolve a compensatory controller for the damaged physical robot, using the current best damage hypothesis generated by the estimation phase. When the exploration EA terminates, the best controller from the run is transferred to and used by the physical robot.

Physical Robot Failure: The physical robot uses an evolved controller to walk forwards. An unanticipated failure occurs to the robot, and the broken robot records its own forward displacement for a period of time. The physical robot is then stopped, and the recorded forward displacement (fitness) is inserted into the database along with the evolved controller on-board the robot at that time: these become an input-output pair used to reverse engineer the damage suffered by the robot. During subsequent passes through the algorithm, the damaged robot attempts to function using the compensatory evolved controller produced by the exploration phase.
Estimation Phase: Damage Hypothesis Evolution. The estimation EA is used to evolve a hypothesis about the actual failure incurred by the physical robot. The estimation EA uses the forward displacements produced by the broken physical robot, along with the corresponding controllers running on the physical robot at the time, to measure the correctness of each of the diagnoses encoded by the estimation EA's genomes. When the estimation EA terminates, the most fit damage hypothesis is supplied to the exploration EA. The robot simulator is updated to reflect this damage hypothesis: for example, if the hypothesis is that one of the legs has fallen off, that leg is removed from the simulated robot. The exploration EA then evolves a compensatory controller using this updated model.
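Taken together, the three phases above form a simple outer loop around the database of controller/performance pairs. The following sketch is only schematic: evolve_controller, run_physical_robot and evolve_damage_hypothesis are hypothetical placeholders for the components described above, not actual code from the system.

def estimation_exploration(evolve_controller, run_physical_robot,
                           evolve_damage_hypothesis, n_cycles=3):
    # Schematic outer loop of the estimation-exploration algorithm.
    #   evolve_controller(hypothesis)      -> controller evolved in simulation
    #   run_physical_robot(controller)     -> measured forward displacement
    #   evolve_damage_hypothesis(database) -> best damage hypothesis so far
    database = []          # stored (controller, forward displacement) pairs
    hypothesis = None      # None means the robot is assumed intact
    controller = evolve_controller(hypothesis)
    for _ in range(n_cycles):
        displacement = run_physical_robot(controller)
        database.append((controller, displacement))
        hypothesis = evolve_damage_hypothesis(database)
        controller = evolve_controller(hypothesis)   # compensatory controller
    return controller, hypothesis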
3.2 EXPERIMENTAL SETUP

The proposed algorithm was applied to the recovery of locomotion in severely damaged legged robots. A robot simulator is used to evolve controllers for the 'physical' robot; here the 'physical' robot is itself also simulated. Evolved controllers are uploaded from the simulation to the physical robot, and performance measurements are downloaded from the physical robot to the simulation. The robot simulator is based on Open Dynamics Engine, an open-source 3D dynamics simulation package. The simulated robot is composed of a series of three-dimensional objects, connected with one degree-of-freedom rotational joints.

3.2.1 THE ROBOTS

The two hypothetical robots tested in this preliminary work, a quadrupedal and a hexapedal robot, are shown in Figure 3.2. The quadrupedal robot has eight mechanical degrees of freedom. There are two one degree-of-freedom rotational joints per leg: one at the shoulder and one at the knee. The quadrupedal robot contains four binary touch sensors, one in each of the lower legs. A touch sensor returns 1.0 if the lower leg is on the ground and −1.0 otherwise. There are also four angle sensors in the shoulder joints, which return a signal commensurate with the flex or extension of that joint (−1.0 for maximum flexure up to 1.0 for maximum extension). Each of the eight joints is actuated by a torsional motor. The joints have a maximum flex of 30 degrees from their original setting (shown in Figure 3.2) and a maximum extension of 30 degrees. The hexapedal robot has 18 mechanical degrees of freedom: each leg has a one degree-of-freedom rotational joint at the knee and a two degree-of-freedom rotational joint connecting the leg to the spine. Each joint is actuated by a torsional motor, and the joint ranges are the same as for the quadrupedal robot. The hexapedal robot contains six touch sensors, one per lower leg, and six angle sensors, placed on the joints connecting the legs to the spine.


Fig. 3.2 The simulated robots used for experimentation.




3.2.2 The Controllers

The robots are controlled by a neural network, which receives sensor data from the robot into its input layer at the beginning of each time step of the simulation, propagates those signals to a hidden layer, and finally propagates them to an output layer. The neural network architecture and connectivity are shown in Figure 3.3. Neuron values and synaptic weights are scaled to lie in the range [−1.00, 1.00]. A threshold activation function is applied at the neurons. There is one output neuron for each of the motors actuating the robot: the values arriving at the output neurons are scaled to desired angles for the joint corresponding to that motor. For both robots used here, joints can flex or extend to π/4 away from their default starting rotation of π/2. The desired angles are translated into torques using a PID controller, and the simulated motors then apply the resultant torques. The physical simulator then updates the position, orientation and velocity of the robot based on these torques, along with external forces such as gravity, friction, momentum and collision with the ground plane.
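To make the data flow concrete, the sketch below shows one control step under the assumptions stated in the comments; the layer sizes, weight matrices and the exact placement of the threshold activation are illustrative and do not reproduce the precise architecture of Figure 3.3.

import math
import numpy as np

def threshold(x):
    # Simple threshold activation applied at the neurons.
    return np.where(x >= 0.0, 1.0, -1.0)

def controller_step(sensors, w_hidden, w_out):
    # One control step: sensor values in [-1, 1] are propagated through the
    # hidden layer and the output layer, and each output is scaled to a desired
    # joint angle in [pi/2 - pi/4, pi/2 + pi/4], which the PID controller then
    # turns into motor torques.
    hidden = threshold(w_hidden @ np.asarray(sensors, dtype=float))
    outputs = np.clip(w_out @ hidden, -1.0, 1.0)
    return math.pi / 2 + outputs * math.pi / 4

# Example for the quadruped: 8 sensor inputs, 3 hidden neurons, 8 motor outputs
# (in the real system the weights come from a genome evolved by the exploration EA).
rng = np.random.default_rng(0)
desired_angles = controller_step(rng.uniform(-1, 1, 8),
                                 rng.uniform(-1, 1, (3, 8)),
                                 rng.uniform(-1, 1, (8, 3)))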


Fig. 3.3 The neural network architecture used for the quadrupedal robot.


3.3 ALGORITHM IMPLEMENTATION

The Exploration EA: The exploration EA is used to generate sets of synaptic weights for the robot's neural network (Figure 3.3). The fitness function rewards robots for moving forwards as far as possible during 1000 time steps of the simulation. The fitness function is given as f(g) = d(t1000) − d(t1), where f(g) is the fitness, measured in meters, of the robot whose neural network controller is labelled with the values encoded in genome g; d(t1) is the forward displacement of the robot, measured in meters, at the first time step of the simulation; and d(t1000) is the forward displacement of the robot (again in meters) at the final time step of the simulation.

The genomes of the exploration EA are strings of floating-point values, which encode the synaptic weights. For the quadrupedal robot there are a total of 68 synapses, giving a genome length of 68. For the hexapedal robot there are a total of 120 synapses, giving a genome length of 120. The encoded synaptic weights are represented to two decimal places and lie in the range [−1.00, 1.00]. At the beginning of each run a random population of 100 genomes is generated. If there are any previously evolved controllers stored in the database, these are downloaded into the starting population.

A genome is evaluated as follows: the encoded weights are used to label the controller; the robot is then evaluated in the simulator for 1000 time steps using that controller; and the resulting fitness value is returned. Once all of the genomes in the population have been evaluated, they are sorted in order of decreasing fitness, and the 50 least fit genomes are deleted from the population. Fifty new genomes are selected from the remaining 50 to replace them, using tournament selection with a tournament size of 3. Selected genomes undergo mutation: each floating-point value of the copied genome has a 1 per cent chance of undergoing a point mutation. Of the 50 newly generated genomes, 12 pairs are randomly selected and undergo one-point crossover.
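One generation of such an EA, using the parameters quoted above (population of 100, removal of the 50 least fit, tournament size 3, 1% point mutation, 12 crossover pairs), could look roughly like the following sketch. The evaluate callback, which would run the simulated robot for 1000 time steps and return its forward displacement, is assumed rather than implemented here.

import random

POP_SIZE = 100
GENOME_LEN = 68          # number of synaptic weights for the quadrupedal robot

def random_genome():
    # Used to build the initial population of 100 genomes.
    return [round(random.uniform(-1.0, 1.0), 2) for _ in range(GENOME_LEN)]

def mutate(genome, rate=0.01):
    # Each weight has a 1% chance of being replaced by a new random value.
    return [round(random.uniform(-1.0, 1.0), 2) if random.random() < rate else w
            for w in genome]

def next_generation(population, evaluate):
    # One generation of the exploration EA.  `evaluate` is an assumed callback
    # that loads the genome's weights into the controller, runs the simulated
    # robot for 1000 time steps and returns d(t1000) - d(t1) in metres.
    scored = sorted(((evaluate(g), g) for g in population), reverse=True)
    elite = scored[:POP_SIZE // 2]                    # the 50 least fit are deleted
    children = []
    while len(children) < POP_SIZE // 2:              # tournament selection, size 3
        _, parent = max(random.sample(elite, 3))
        children.append(mutate(parent))
    for _ in range(12):                               # 12 random pairs: one-point crossover
        i, j = random.sample(range(len(children)), 2)
        cut = random.randrange(1, GENOME_LEN)
        children[i], children[j] = (children[i][:cut] + children[j][cut:],
                                    children[j][:cut] + children[i][cut:])
    return [g for _, g in elite] + children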

The Estimation EA: The estimation EA evolves hypotheses about the failure incurred by the physical robot. The genomes of the estimation EA, like those of the exploration EA, are strings of floating-point values. Each genome in the estimation EA is composed of four genes, and each gene denotes a possible failure. In this preliminary study, the actual robot can undergo three different types of damage (joint breakage, joint jamming, and sensor failure) and can incur zero to four of these damages simultaneously. In joint breakage, any single joint of the robot can break completely, separating the two parts of the robot connected by that joint. In joint jamming, the two objects attached by that joint are welded together: actuation has no effect on the joint's angle. In sensor failure, any sensor within the robot (either one of the touch or angle sensors) feeds a zero signal into the neural network during subsequent time steps. Any type of failure that does not conform to one of these types is referred to henceforth as an unanticipated failure: in order to compensate for such cases, the estimation EA has to approximate the failure using aggregates of the encoded failure types.

Each of the four genes in an estimation EA genome is composed of four floating-point values, giving a total genome length of 16 values. As in the exploration EA, each of the values is represented to two decimal places and lies in [−1.00, 1.00]. The first floating-point value of a gene is rounded to an integer in [0, 1] and denotes whether the gene is dormant or active. If the gene is dormant, the damage encoded by this particular gene is not applied to the simulated robot during evaluation. If the gene is active, the second floating-point value is rounded to an integer in [0, 2] and indicates which of the three damage types should be applied to the simulated robot. If the damage type is either joint breakage or joint jamming, the third value is scaled to an integer in [0, j − 1], where j is the number of mechanical degrees of freedom of the robot (j = 8 for the quadrupedal robot and j = 12 for the hexapedal robot). If the damage type is sensor failure, then the third value is scaled to an integer in [0, s − 1], where s is the total number of sensors contained in the robot (s = 8 for the quadrupedal robot and s = 12 for the hexapedal robot). The fourth value is not used in this preliminary study, but will be used for additional damage types that are not binary and occur with a lesser or greater magnitude (e.g. a sensor that experiences 80% damage instead of failing completely).

For each genome in the estimation EA, the simulated robot is first broken according to the failure scenario encoded in the genome, and the broken robot is then evaluated using the controller just evolved by the exploration EA and tested on the 'physical' robot. The fitness function of the estimation EA attempts to minimize the difference between the forward displacement achieved by the 'physical' robot using that controller and the forward displacement achieved by the simulated robot using the encoded damage hypothesis. This is based on the observation that the closer the damage hypothesis encoded in the genome is to the actual damage, the smaller the difference between the two behaviors.
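The gene decoding just described can be sketched as follows. The mapping from the [−1, 1] values to the integer ranges is not spelled out in the text, so a simple linear rescaling is assumed, and the function and variable names are hypothetical.

def decode_damage_hypothesis(genome, n_joints=8, n_sensors=8):
    # Decode an estimation-EA genome (16 values in [-1, 1], four genes of four
    # values each) into a list of damage descriptors.
    def scale(value, n):                       # map [-1, 1] onto {0, ..., n-1}
        return min(n - 1, int((value + 1.0) / 2.0 * n))
    damage_types = ('joint breakage', 'joint jamming', 'sensor failure')
    damages = []
    for i in range(0, 16, 4):
        active, kind, target, _magnitude = genome[i:i + 4]   # 4th value unused for now
        if scale(active, 2) == 0:              # gene is dormant
            continue
        damage = damage_types[scale(kind, 3)]
        n = n_sensors if damage == 'sensor failure' else n_joints
        damages.append((damage, scale(target, n)))
    return damages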
During subsequent passes through the estimation phase, there are additional pairs of evolved controllers and forward displacements in the database: the controllers evolved by the exploration EA, and the fitness values attained by the 'physical' robot when using those controllers. In these cases, the simulated robot is evaluated once for each of the evolved controllers, and the fitness of the genome is then the sum of the errors between the corresponding forward displacements. When the estimation EA terminates, the best evolved damage hypothesis is stored in a database; these hypotheses are used to seed the random population at the beginning of the next run of the estimation EA, rather than starting each time with all random hypotheses. The estimation EA is otherwise similar to the exploration EA, except for the length of the genomes, what those genomes encode, and the fitness function: in the exploration EA, forward displacement is maximized; in the estimation EA, the error between the simulated and 'physical' robots' forward displacements is minimized.
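Correspondingly, the estimation EA's fitness over the growing database might be sketched as below. Here simulate_with_damage is an assumed callback that applies the decoded damage to the simulated robot, runs the given controller and returns the resulting forward displacement; the decoder is the one sketched above.

def hypothesis_error(genome, database, simulate_with_damage):
    # Fitness (to be minimised) of a damage hypothesis: for every stored pair of
    # (controller, displacement measured on the 'physical' robot), apply the
    # hypothesised damage to the simulated robot, run the same controller, and
    # sum the differences between simulated and measured forward displacement.
    damages = decode_damage_hypothesis(genome)
    return sum(abs(measured - simulate_with_damage(damages, controller))
               for controller, measured in database)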


Table 3.1 Damage scenarios tested



3.4 RESULTS OF ESTIMATION-EXPLORATION ALGORITHM

Control experiments were performed which conform to the algorithm outlined in the right-hand panel of Figure 3.1: all evolution is performed on the 'physical' robot after damage. In this case, controller evolution is performed by the exploration EA until generation 30 on the quadrupedal robot. The controller is then transferred to the 'physical' robot, which then undergoes separation of one of its lower legs (damage case 1). The exploration EA then continues on the 'physical' robot for a further 70 generations. The algorithm proposed here was then applied several times to the quadrupedal and hexapedal robots. During each application of the algorithm, the robots suffered a different damage scenario: the 10 scenarios are listed in Table 3.1. For each run of the algorithm, the exploration EA is run once to generate the initial evolved controller, and then both the estimation and exploration EAs are run three times each after physical robot failure. Each EA is run for 30 generations, using a population size of 100 genomes. Twenty runs of the algorithm were performed (10 damage cases for each of the two robots), in which both the exploration and estimation EAs were initialized with independent random starting populations seeded with any previously evolved controllers or damage hypotheses. Damage scenarios 1, 2, 3, 5 and 6 can be described by a single gene in the genomes of the estimation EA. Scenarios 4, 7 and 8 represent compound failures, and require more than one gene to represent them. Case 9 represents the situation in which the physical robot signals that it has incurred some damage when in fact no damage has occurred. Case 10 represents an unanticipated failure: hidden neuron failure cannot be described by the estimation EA genomes. Figure 3.4 shows the recovery of the quadrupedal robot after minor damage (scenario 3), after suffering unanticipated damage (scenario 10), and the recovery of the hexapedal robot after severe, compound damage (scenario 8). The recovery of both robots for all 10 damage scenarios is shown in Figure.

Fig. 3.4 Three typical damage recoveries.

3.5 ANALYSIS OF ESTIMATION-EXPLORATION ALGORITHM

It can be seen that, in the control experiment, even after several generations have elapsed since the 'physical' robot suffered damage and 3550 hardware evaluations have been performed, total function has not been restored. The degree of restoration (about 70%) is about the same as that achieved by the quadrupedal robot suffering the same type of damage when the proposed algorithm is used to restore function. However, the proposed algorithm requires only three hardware evaluations (more than two orders of magnitude fewer hardware trials) to restore function. Figure 3.4 shows that, for three sample damage scenarios, much functionality is restored to the physical robot after only three hardware trials. In the case of sensor failure for the quadrupedal robot, the forward displacement of the physical robot after the third hardware trial exceeds its original functionality. Often, the compensatory controller produces a much different gait from that exhibited by the original, undamaged robot. For example, the robot enduring sensor failure hops after recovery (note the discrete arcs in the trajectory of its center of mass (Figure 3.4d)), compared with a more stable but erratic gait before the failure occurred (Figure 3.4b). It is believed that the sporadic failure of the algorithm is due to the information-poor method of comparing the simulated robot's behavior against the physical robot's behavior, which in this paper is done simply by comparing forward displacement. This method will be replaced in future with a more sophisticated one, such as comparing sensor time series or measuring differences in gait patterns. The algorithm performs equally well for both morphological and controller damage: function recovery for scenarios 1 and 2 (morphological damage) and scenarios 3 and 5 (controller damage), for both robots, approaches or exceeds original performance. Because the algorithm evolves the robot simulator itself based on the 'physical' robot's experience, it would be straightforward to generalize the algorithm beyond internal damage: the estimation EA could evolve not only internal damage hypotheses but also hypotheses regarding environmental change, such as increased ruggedness of terrain or high winds.
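As one illustration of the richer comparison suggested above (and not part of the reported system), a behavioral error based on sensor time series could be as simple as a mean absolute difference between the physical and simulated sensor traces:

def sensor_trace_error(physical_trace, simulated_trace):
    # Mean absolute difference between two sensor time series of equal length,
    # e.g. 1000 time steps of 8 sensor readings each.  Comparing whole traces
    # is more informative than comparing net forward displacement alone.
    total, count = 0.0, 0
    for physical_step, simulated_step in zip(physical_trace, simulated_trace):
        for p, s in zip(physical_step, simulated_step):
            total += abs(p - s)
            count += 1
    return total / count if count else 0.0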

3.6 ESTIMATION-EXPLORATION ALGORITHM OVERVIEW

1. Characterization of the target system
• Define a representation, variation operators and similarity metric for the space of systems
• Define a representation and variation operators for the space of inputs (tests)
• Define a representation and similarity metric for the space of outputs
2. Initialization
• Create an initial population of candidate models (random, blank, or seeded with prior information)
• Create an initial population of candidate tests (random, or seeded with prior information)
3. Estimation Phase
• Evolve candidate models; encourage diversity
• Fitness of a model is its ability to explain all input-output data in training set
4. Exploration Phase
• Evolve candidate tests (input sets)
• Fitness of a test is the disagreement it causes among good candidate models
• Carry out best test on target system; add input/output data to training set
5. Termination
• Iterate estimation-exploration (steps 3-4) until the population of models converges on a sufficiently accurate solution, or the target system exhibits some desired behavior.
• If no model is found, the search space may be inappropriate, or the target system may be inconsistent
• If no good test is found, then either:
– All good candidate models are perfect;
– The search method for finding good tests is failing; or
– The target system may be partially unobservable
6. Validation
• Validate best model(s) using unseen inputs
• If validation fails, add new data to training set and resume estimation phase




4. CONCLUSION

Although the possibility of autonomous self-modeling has been suggested before, this work demonstrated for the first time a physical system able to autonomously recover its own topology with little or no prior knowledge, and to optimize the parameters of the resulting self-models after unexpected morphological change. These processes demonstrate both topological and parametric self-modeling. This suggests that future machines may be able to continually detect changes in their own morphology (e.g., after damage has occurred or when grasping a new tool) or in their environment (when the robot enters an unknown or changed environment) and use the inferred models to generate compensatory behavior. Beyond robotics, the ability to actively generate and test hypotheses can lead to general nonlinear and topological system identification in other domains, such as computational systems, biological networks, damaged structures, and even automated science. Aside from its practical value, the robot's ability suggests a similarity to human thinking, as the robot tries out various actions to figure out the shape of its world. These findings may help in developing more robust robots, as well as shed light on the relation between curiosity and cognition in animals and humans: creating models through exploration, and using them to create new behaviors through introspection. Someday similar robots may respond not only to damage to their own bodies but also to changes in the surrounding environment. Such responsiveness could lend autonomy to robotic explorers on other planets, a helpful feature, since such robots cannot always be in contact with human controllers on Earth.



report on smart note taker

CHAPTER 1
INTRODUCTION
The smart note taker (or digital pen) is a portable handwriting capture device that uses handwriting recognition technology to capture handwritten notes, drawings and sketches anytime, anywhere, and then upload, file or email them once the pen is connected to a computer.
For someone responsible for coordinating meetings and taking minutes, a high-quality pen recorder can greatly reduce the workload and help ensure complete and accurate minutes. A ballpoint-pen-shaped pen recorder with a metallic, stylish body and concealed mini buttons can produce professional sound quality. Such a pen recorder integrates four functions (writing, digital recording, MP3 playback and USB drive) into an ideal recording and entertainment device for business people, investigators, secretaries and students.
Note taking is recognized as a critical activity in learning contexts. Notes are essential for recalling what has been heard or seen, and can promote reflection afterwards. Different solutions have been proposed to support note taking,
but most of the existing tools are focusing on note taking in the context of traditional classroom education. This task aims at designing support for practice-based education, with focus on teacher education. The task will build on a previous project to understand the needs of students and the usage of digital writing systems for supporting note taking.
The main goal of the task is the design of a digital writing system supporting the pedagogical needs of practice-based teacher education.

Since a Java applet is suitable for both drawings and strings, all these applications can be put together by developing a single Java program. The Java code that we will develop will also be installed on the pen, so that the processor inside the pen will type and draw the desired shape or text on the display panel.



1.1 History
People typically take notes as a convenient way to create a written reminder of key issues, points, events and reminders, and to list "to-do" items in their work or personal life. Note taking is highly personal, and each person has his or her own style of note taking. Thus, although the notes taken are typically meant for local and personal use, the note taker may share the notes with others. For example, a person may make notes and list his or her to-dos but then decide to send the to-do list to another person; in this case, the to-do list is used to allocate tasks to that person. Hence, although the notes are personal, they are then shared. Another example of personal notes being shared is meeting minutes with various action items allocated to other people: the notes are personal but are also made to cover teams, groups of people or other individuals charged with carrying out the action items.

Often the note taking is done with the assistance of small handbooks, pocket books or applications available in electronic devices such as Tablet PCs, PDAs, smart phones and other similar devices. Other known prior-art note taking applications are available, for example, from Microsoft Corporation and 3M Corporation, which allow a user to "stick" a note on a document displayed on a computer screen. The known prior-art note taking applications may offer or suggest to the user to embed some application data such as a document and/or link; however, the prior art lacks interaction with the note taking applications, and note taking application interactions with other applications. Some note taking applications simplify note taking through the use of a stylus or pen, in which handwriting recognition is carried out to transfer what a person doodles or writes on a screen or tablet, using natural handwriting, into text characters that are recognized by software.

The notes typically denote a set of things related to a particular event or action that needs to be carried out, and are generally related to a set of actions that have happened or that are to be executed in the near future. The currently known and available note taking applications do not simplify or provide a means for carrying out such actions. Further, the known note taking tools or applications do not help in writing the note in any simple manner, even though the necessary information for the note may be accessible to the device. The notes, once taken down, are not easily accessible to the user based on the actions and events that are occurring. In known note taking applications, notes can be made available by predefined scheduled reminders in an ad hoc manner, for example in a time- or date-sequenced order. The notes cannot, however, be organized and made available in accordance with specific actions and events that occur as recorded in the contents of the taken notes.
It would be desirable therefore to provide a note taking application that overcomes the disadvantages and shortcomings of currently known note taking applications and devices.
It would also be desirable to provide an active note application that allows access to application data that can be embedded into the notes, enables invocation of local applications, allows access to the notes on the occurrence of specific external events or actions, and enables automatic embedding of data into the notes in response to actions performed.
















CHAPTER 2
2. Overview of SNT
2.1 Motivation
Note taking in education is perceived as a major component of the learning process. Students mainly use notes to capture information during lectures and research for use in assessments. But taking notes is also useful for learning itself. Vygotsky said:
“Thought is not merely expressed in words, it comes into existence
through them”
A customized, lightweight digital note taking system can increase the efficiency of situated note taking and the sharing of these notes. This can contribute to enhanced note taking and sharing experiences and therefore stimulate more note taking and sharing among the students.

Fig. 2.1 Normal pen

2.2 Thesis goal
A digital note taking system has the potential to promote more note taking and sharing among students, because it can provide enhanced handling and sharing of notes through digital means. Students in practice-based education face a different note taking and sharing environment than students in traditional lecture-based education: their note taking and sharing environment is more mobile and varied while they are in practice. Facilitating these students with a digital note taking and sharing system that is customized to their needs can enhance the experience of taking and sharing notes, and therefore promote more note taking and sharing of those notes among the students.
The main goal of this thesis is therefore to design a customized digital pen and paper based note taking and sharing system for the practice-based teacher education, PPE. The main customization focus will be on the handling and sharing of notes. There will be an additional feature of the product which will monitor the notes that were taken before in the application program used on the computer. This application program can be a word document or an image file. Then the sensed figures that were drawn in the air will be recognized and, with the help of the software program we will write, the desired character will be printed in the word document. If the application program is a paint-related program, then the most similar shape will be chosen by the program and printed on the screen.











CHAPTER 3
3. Working of SNT
3.1 Components:
1. Motion sensor
2. Memory chip
3. Voice recorder
4. Power button

Fig. 3.1 Smart pen
3.2 Working:
The main principle of the smart note taker depends on the movement of the pen, which is sensed by a motion sensor. The motion sensor works when the pen moves in a specific direction with respect to the ground. This product will be simple but powerful: it will be able to sense 3D shapes and motions that the user tries to draw. The sensed information will be processed, transferred to the memory chip and then monitored on the display device. The drawn shape can then be broadcast over the network or sent to a mobile device. There will be an additional feature of the product that will monitor the notes that were taken before in the application program used on the computer. This application program can be a word document or an image file. Then the sensed shapes that were drawn in the air will be recognized and, with the help of the software program we will write, the desired character will be printed in the word document. If the application program is a paint-related program, then the most similar shape will be chosen by the program and printed on the screen. The use of motion sensing to perform sophisticated command control and data input into a portable device is disclosed. A motion sensor is embedded in or fixedly attached to a portable device to measure movement, motion or tilt of the device in one, two or three dimensions when the portable device is used to air-write or make gestures. The use of full motion information, such as the rate of change of motion or tilt angle, to perform functions and commands is also disclosed. In addition, the use of air-writing to input search criteria and filter schemes for portable devices to manage, search and sort through various data, files and information is disclosed.



Fig. 3.2 Pen with PC connectivity cable


3.3 Construction:
Note Taker Digital Pen
1. Captures natural handwriting and drawings while away from the computer.
2. Saves captured handwritten notes into built-in flash memory.
3. Uploads captured handwritten notes to the computer via a USB connection.
4. Acts as a digital ink pen with hovering and mouse functionality to write directly into Windows Vista and Office 2007.
5. No installation is required to activate digital ink in Vista and Office 2007.
6. No need for special paper.
7. Uses standard off-the-shelf ink refills and batteries.
Since a Java applet is suitable for both drawings and strings, all these applications can be put together by developing a single Java applet program. The Java code that we will develop will also be installed on the pen, so that the processor in the pen will be able to draw and type the desired text on the display panel.
Applet:
An applet is a Java program that runs inside a container such as a web browser or applet viewer and can bundle a set of related classes. It is widely used for building Java-based applications and is one of the notable features of Java.
The various strings, drawings, etc. will be produced using class files; rather than a single file, a set of class files will be linked together in one applet program.
Database:
The system installed in the pen will include a database that helps the processor recognize the various words drawn in the air.
Each word written in the air will be matched against a word in the database, and the matching database word will be printed. This is the basic principle behind the working of the smart note taker.
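Purely as an illustration of this matching principle (the report itself proposes a Java applet for the pen), a nearest-template lookup could be sketched as follows; the stroke representation and the function names are hypothetical.

def closest_word(stroke, templates):
    # Match an air-written stroke against stored reference strokes and return
    # the best-matching word.  `stroke` and every template are equal-length
    # lists of (x, y) pen positions captured from the motion sensor; a real
    # system would first resample and normalise the strokes.
    def distance(a, b):
        return sum(((x1 - x2) ** 2 + (y1 - y2) ** 2) ** 0.5
                   for (x1, y1), (x2, y2) in zip(a, b))
    return min(templates, key=lambda word: distance(stroke, templates[word]))

# Hypothetical usage: `templates` maps each dictionary word to a recorded stroke.
# word = closest_word(captured_stroke, {"hello": hello_stroke, "note": note_stroke})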
3.4 Special features of SNT
1. Unlike other pen recorders with four or five pushbuttons on the body, it has only one concealed button, which makes it look more like a normal pen.

2. Unlike other pen recorders with distracting flashing lights while recording, the indicator lights can be switched off during recording, so it can be used as a normal ballpoint pen without any flashing lights.

3. The built-in microphone is sensitive and of good quality; it produces very clear recordings and can clearly record sound from 10 meters away.



Fig. 3.4 Uploading of data

4. Elegant looking, super thin and very handy; the largest diameter is only 13.9 mm.

5. One-touch record: the pen goes straight into voice record mode after power-on and starts recording with a single push (other similar pens on the market first go into music play mode).

Active Voice Memo System
Built-in voice/sound memo capture system for annotating notebooks.
Voice memo links.
Create and record standard sound files (MP3 and WAV).
AppleScript examples for moving voice memo files to iTunes/iPod.



Technical features:
1. USB 2.0 mass storage: up to 2 GB.
2. Built-in lithium battery, supporting over 12 hours of recording when fully charged.
3. Included USB cable connector docks easily to any USB jack; there is no LCD display.
4. Compatible with Windows 98/2000/ME/XP and Vista.
Price and cost
The smart pen comes in two versions: the 1 GB model, with 100 hours of audio storage and 16,000 pages of digital notes, is priced at $149; the 2 GB model doubles the storage and costs $199. It will be available in March, and the Apple version will ship in September.
Sharing Notes
Notebooks that are used for meetings generally contain notes that are taken during the meetings and notes that are taken offline. Both kinds of notes exist only in the author's notebook. However, we observed a frequent need to share and exchange notes between several people. With traditional notebooks, this requires copying or extracting pages from one notebook and inserting them into another. These workarounds were used by three (25%) of the interviewed persons in order to share notes with others. In addition to photocopies, digital photos and written duplicates of notes, two people used to share their whole notebook. Depending on the relevance of the notes for the person, the corresponding notebook was chosen: if the notes were important for them, they would let others write in their notebook, or vice versa. However, this approach still restricts the notes to a single notebook and moreover affects the notebook owner's privacy. Another way of sharing notes is to rewrite them, as reported in.
Top New Features in version 1.9:
• Enhanced Voice Memos include automatic bidirectional linking between recorded audio and text entries, for quick review
• Spring-loaded tabs and drawer views for conveniently moving and dragging information to other pages
• New Highlight & Summarize features for creating personal search and index reports (dynamic hyperlinks)
• New support for reorganizing by dragging multiple outline entries at the same level
• New Preferences for setting behavior of Return key while in various editing modes
• New multimedia AppleScript examples for using NoteTaker with iSight and iPod
• Built-in support for VoiceOver, the spoken interface for Mac OS X Universal Access
• Print and customize notebook covers
• Toolbar command button for adding a new page
• Paste as HTML option supports richer information display when used with Excel tables and Word content
• New global formatting behavior including control of spacing between outlines on a page (Preferences panel)
• New UI button behavior for toggling attribute column displays
The 1 GB model is now $129.95 and the 2 GB model is $169.95. The Pro-Pack, which includes the 2 GB Pulse model, five notebooks (instead of just one sample notebook) and a bunch of ink refills, now sells for $209.95.

Fig. 3.5 Separate components of the pen


CHAPTER 4
CATEGORIES
3 different types of digital writing technologies:
1. Anoto: Optical digital pen and augmented paper
2. Pegasus: Ultrasonic digital pen
3. ACECAD: Simple digital pen and an electromagnetic pad
4.1 Anoto technology
The digital writing system from Anoto is based on augmented paper, a dot pattern printed on normal paper, as seen in figure 4.1(a), and an ink pen with a digital camera/optical sensor and built-in flash memory, as seen in figure 4.1(b). Together they record the pen's movement over the augmented paper surface. The dot pattern functions as a coordinate code and an action code.
When the data are transferred to a computer, an application can process and interpret the recorded dot pattern, provide an exact image of the handwritten notes and execute any action chosen on the paper.

(a) Anoto dots pattern (b) Anoto digital pen technology

Fig. 4.1 Anoto digital writing technology

Anoto is behind the pen and paper technology, but only provides the pattern license and a software kit for users to develop their own digital writing solutions, including paper and software. The commercial digital pen and paper are sold by licensed partners of Anoto. Logitech is one of Anoto’s digital pen system partners.
(a) Logitech (b) Nokia
(c) Maxell


(d) HP

Fig. 4.2 Anoto-compatible digital pens


4.2 Welcome to the World of 2nd Generation Digital Pens

Pegasus Technologies, a leading provider of innovative digital pen technologies and solutions, invites you to experience the 2nd generation digital pen. Pegasus is the solution of choice for some of the world's foremost digital pen manufacturers. The Company's pioneering technology, unrivaled efficiency and unique product offering give global OEMs the tools they need to design and manufacture quality digital pens and complementary applications.

Fig. 4.3 Uploading of data


4.3 Handwriting recognition and conversion

All the systems mentioned above capture handwritten notes and provide an exact digital copy for use on the computer. Even though this is an easier way to transfer paper notes into the computer, the usage is still limited to viewing, storing and editing the notes as an image. Handwriting recognition and conversion programs take it a step further and provide users with automated transcription into digital text. MyScript Notes is such an application and is compatible with all the digital writing systems mentioned above. MyScript Notes is developed by Vision Objects [36]. This application converts handwritten notes to typed text on the computer. In the new version, MyScript Notes 2.0, the handwriting recognition is improved, providing users with the opportunity to add new words to the dictionary for a better recognition and conversion rate. Shape recognition has also been included, though it only handles the most basic shape types. The language database has also been expanded to include 14 languages: most of the Western languages plus Japanese and simplified Chinese. All the Nordic languages are represented except Norwegian. Vision Objects also provides a development version called MyScript Builder; the latest version, 4.1, does include Norwegian in its language database. This probably means that Norwegian will be included in the next official release of MyScript Notes. MyScript Notes was included in the Logitech io2 digital writing software bundle when the system was bought for the student experiment in autumn 2005, but it now seems that MyScript Notes is no longer offered as part of the Logitech package, only as a stand-alone application.

























CHAPTER 5
Economic issues

Expert Y was concerned about the economic aspect. She wondered whether the digital paper (or notebooks) is expensive, and whether it is possible to print it on site, or even design a paper form on the spot and print it. The paper is more expensive than normal paper, but given the digital affordances of the system, it is a better alternative than PDAs and Tablet PCs. The design of paper requires a licensed pattern purchased from Anoto, so designing forms on the spot might be too complex to be worth it. Printing on the spot is possible, but printing the paper file takes longer than printing normal documents, so printing in large amounts might be an issue. However, if the special paper is printed at the practice school, the system will be cheaper for the students to deploy, and it may become more popular.
When the Note-Taker was initially deployed in quiet classrooms, the PowerPod pan/tilt servo motors were found to be somewhat noisy, causing one professor to ask whether someone was "drilling". To compensate for this, the software was modified to limit the pan and tilt speeds, thus lowering the loudness and the pitch of the sound emitted by the servos to an acceptable level. The PowerPod's movements are also somewhat imprecise, allowing the camera to tilt slightly downward as it pans. We expect that a software solution can be found to compensate for this tilt problem. Given that the PowerPod is currently the only off-the-shelf electronic pan/tilt mechanism in its low price range, we find these issues acceptable for now. However, given sufficient resources we would consider building our own pan/tilt mechanism.

With regard to the Tablet PC, we found that the Gateway Tablet PC that we used had pen-input issues that did not seem to be evident in the Lenovo X-series Tablet PCs. It is interesting to note that the Gateway uses Finepoint digitization, as opposed to the very popular Wacom technology. In large part, this Gateway model was purchased over other Tablet PC models due to its larger screen size, a feature that neither David nor M felt was necessary after using it and comparing it to a smaller 12.1-inch Tablet PC screen. In both cases, these students needed to be very close to the screen, so differences in screen size seemed less important than reliable pen input for this application. The Note-Taker's camera is best positioned in the center of the front row of the classroom. However, this was not seen by our users as an unreasonable constraint, given that they typically sit there anyway. The addition of multiple control methods for the camera would be beneficial. The prototype provided only one means for aiming the camera: repeated tapping of directional buttons (left, right, up, down). Future prototypes should provide multiple ways to aim the camera, allowing users to choose the method that best suits the situation.

In addition to these mechanical and software enhancements, we have a number of additional features planned for the Note-Taker. We will add image warping to compensate for the linear perspective seen when the board is viewed obliquely, as can be seen in Figure 5. We will also experiment with computer vision techniques to automatically follow the professor, while allowing the student to take manual control of the camera as necessary. In doing so, we hope to decrease the time expended in controlling the pan/tilt mechanism. To assist during times when the professor occludes information on the board while the student is still copying it, the Note-Taker could keep a history of frames and allow the student to "rewind" with simple pen gestures. We also plan to use computer vision techniques to help the Note-Taker decide when to provide a live video feed, and when to show an older frame with less occlusion.

Beyond enhancements to the Note-Taker prototype, we wish to test it on a larger population of students who are legally blind, and we are actively dialoguing with the disability resource centers at our university and at local community colleges. We would like to construct several additional prototypes and provide them to these schools. Students would then be able to check out a Note-Taker for periods of time, to test it out and provide us with feedback.
Since there is a clear learning curve for the Note-Taker, we would need to train these students (or write tutorial software), and we would need to allow for at least week-long use periods. We are also exploring ways to further facilitate note-taking. This approach would equip the Note-Taker prototype to automatically track the professor, find the writing on the board, and convert it to digital ink. (This approach is similar to converting bitmaps to vectorized images; it does not involve optical character recognition.) This approach might not be feasible with a VGA-resolution camcorder, requiring instead a second, higher-resolution camera. Given the benefits of student-generated notes, we do not intend to replace the note-taking activity entirely. However, this could allow students who are legally blind to pay more attention to the lecture, while capturing and annotating board and projected content in real time. Regarding the design constraints of the Note-Taker, we cannot overemphasize the importance of developing assistive technologies that are portable and that require no significant set-up. While some interesting things can be done with solutions that require prior set-up in a classroom, we feel that most of these approaches will ultimately prove to be impractical because different students have different needs. Furthermore, portable solutions like the Note-Taker allow students who are legally blind to access virtually any presentation, including guest lectures, conference presentations, and even special events or live performances. We also feel that, by allowing students who are legally blind to fully control the assistive technologies upon which they rely, we can provide them with a level of independence that they would not otherwise enjoy.

Overall, we are pleased that both of the case studies yielded encouraging results. Even in its early development, the prototype Note-Taker became an integral component of the first author’s classroom workflow, and he feels that it is an essential tool for the completion of his degrees in computer science and mathematics. M felt that the Note-Taker had significant potential, given refinements. Based on these two case studies, we feel that the Note-Taker provides a better solution than existing technologies to the problems that these students who are legally blind encounter in their classrooms.



























CHAPTER 6
Applications of SNT
1. The Smart Note Taker makes it possible to take fast and easy notes for people who are busy with something else.
2. The Smart Note Taker is good and helpful for blind people, who can think and write freely with it.
3. The Smart Note Taker can also play an important role when two people talk on the phone. The subscribers are apart from each other during their call, and they may want to use figures or text to make themselves better understood.
4. It is also useful, especially for instructors, in presentations. The instructors may not want to present the lecture standing in front of the board. The drawn figure can be processed and sent directly to the server computer in the room. The server computer can then broadcast the drawn shape through the network to all of the computers present in the room. In this way, the lectures are intended to be more efficient and fun.
5. Voice can also be recorded in mp3 or wav formats.














CHAPTER 7
Conclusion and future work
7.1 Conclusion
Hence we can conclude that the SNT has many advantages over a normal pen, since the smart note taker is a device that can store visual recordings and can therefore be used widely. Legally blind students (people with partial blindness) find it difficult to take notes at the fast pace at which material is dictated, so the SNT can play a great role in helping blind students in this fast, technological life.

7.2 Future work
Although the smart note taker has several advantages and features, there is still a need for some improvement. Work is being done in China to add additional features for the comfort of users, such as a camera and a memory locator that will directly show the status of free memory.















