1 INTRODUCTION

RNNs are known to be a generalization of feedforward neural networks [13, 15]. Unlike a feedforward neural network, an RNN uses its internal memory to process arbitrary sequences of inputs.
The output of a normal feedforward network is a class or a predicted value, whereas the output of an RNN can be a sequence of values, depending on the application (e.g., classification, regression, or forecasting). RNNs are used for sequence mapping, which appears in many applications such as speech recognition, named entity recognition, and machine translation. The Restricted Boltzmann Machine (RBM) is a widely used density model, but it is not well suited to sequence data.
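The sequence-to-sequence mapping described above can be sketched with a minimal Elman-style RNN step. This is an illustrative sketch, not the paper's model; the sizes and weight names (W_xh, W_hh, W_hy) are hypothetical, and the weights are untrained.

```python
import numpy as np

# Hypothetical sizes for illustration.
rng = np.random.default_rng(0)
n_in, n_hid, n_out = 4, 8, 3

W_xh = rng.normal(scale=0.1, size=(n_hid, n_in))   # input -> hidden
W_hh = rng.normal(scale=0.1, size=(n_hid, n_hid))  # hidden -> hidden (the internal memory)
W_hy = rng.normal(scale=0.1, size=(n_out, n_hid))  # hidden -> output

def rnn_forward(xs):
    """Map an input sequence to an output sequence of the same length."""
    h = np.zeros(n_hid)
    ys = []
    for x in xs:
        h = np.tanh(W_xh @ x + W_hh @ h)  # hidden state carries the history
        ys.append(W_hy @ h)               # one output per time step
    return ys

seq = [rng.normal(size=n_in) for _ in range(5)]
outs = rnn_forward(seq)
print(len(outs), outs[0].shape)  # 5 (3,)
```

Because the hidden state is fed back at every step, the output at time t can depend on the entire input prefix, which is what distinguishes the RNN from a feedforward network.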
The Temporal Restricted Boltzmann Machine (TRBM) was introduced to extend the RBM to sequence data, and it could model highly complex sequences; the problem was that its parameter updates required crude approximations, which was unsatisfying. This issue was solved by modifying the TRBM into the RNN-RBM, in which the parameter updates can be computed nearly exactly. Since Hessian-Free (HF) optimization solved the seemingly impossible problem of training deep autoencoders, it was assumed that it could also solve the difficult problem of training RNNs. After successfully training the recurrent neural networks, we applied this approach to character-level modeling, i.e., predicting the next character in natural text.
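The next-character prediction task can be sketched as a single RNN pass followed by a softmax over the character vocabulary. This is a hypothetical, untrained sketch (the tiny vocabulary and weight names are made up for illustration), showing only the shape of the computation.

```python
import numpy as np

# Hypothetical three-character vocabulary and untrained weights.
vocab = list("ab ")
V = len(vocab)
rng = np.random.default_rng(1)
n_hid = 16
W_xh = rng.normal(scale=0.1, size=(n_hid, V))
W_hh = rng.normal(scale=0.1, size=(n_hid, n_hid))
W_hy = rng.normal(scale=0.1, size=(V, n_hid))

def next_char_probs(text):
    """Run the RNN over `text`; return a distribution over the next character."""
    h = np.zeros(n_hid)
    for ch in text:
        x = np.zeros(V)
        x[vocab.index(ch)] = 1.0          # one-hot encoding of the character
        h = np.tanh(W_xh @ x + W_hh @ h)  # history accumulates in h
    logits = W_hy @ h
    p = np.exp(logits - logits.max())     # numerically stable softmax
    return p / p.sum()

p = next_char_probs("ab a")
print(round(p.sum(), 6))  # distribution over the vocabulary sums to 1.0
```

In a trained model the weights would be fit (e.g., by HF optimization as in the text) so that the distribution places high probability on the actual next character.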
RNNs perform very well on almost every homogeneous language model and are the only approach that can exploit long character contexts; for example, they are able to balance parentheses and quotes over tens of characters. When it comes to training RNNs, GPUs are an obvious choice over ordinary CPUs. This was validated by the research team at Indigo, which uses these nets on text-processing tasks such as sentiment analysis; GPUs can train the nets 250 times faster. Finally, the belief that RNNs are very difficult to train is incorrect.

1.1 Markov Chains vs Recurrent Neural Networks

An RNN generates each character conditioned on the entire history of characters generated so far, whereas a Markov chain can only condition on a fixed window. A particular RNN may learn to truncate its conditioning context and behave like a Markov chain, or it may not, but RNNs in general can certainly generate formal languages that Markov chains cannot. In this sense RNNs are more powerful than Markov chains, or one could say the two are not directly comparable. For example, an RNN was capable of generating well-formed XML, producing matching opening and closing tags with an unbounded amount of text between them; a Markov chain cannot do this. The advantages of an RNN over Markov chains and hidden Markov models are the higher representational power of neural networks and their ability to take syntactic and semantic features into account.
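The fixed-window limitation of a Markov chain can be made concrete with a small sketch: an order-2 character model (names and corpus are made up for illustration) that sees only the last two characters, so it has no way to track how many outer brackets are still open.

```python
from collections import defaultdict, Counter

def fit_markov(text, order=2):
    """Count next-character occurrences for each fixed-size context window."""
    counts = defaultdict(Counter)
    for i in range(len(text) - order):
        counts[text[i:i + order]][text[i + order]] += 1
    return counts

model = fit_markov("(a(b)c)(a(b)c)")
# Conditioned on the two-character window "(b", the model has only
# ever observed ")" next; anything beyond the window is invisible to it.
print(model["(b"])  # Counter({')': 2})
```

An RNN, by contrast, carries the whole prefix in its hidden state, which is what lets it close a parenthesis opened tens of characters earlier.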
By comparison, n-grams have a number of parameters that explodes with the vocabulary size and with n, and they rely on simple smoothing techniques such as Kneser-Ney (KN) or Good-Turing.
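The parameter explosion can be checked with a line of arithmetic: an n-gram table has up to V**n entries for vocabulary size V, while an RNN's parameter count is independent of the context length. The sizes below (V = 10,000 words, 512 hidden units) are hypothetical, chosen only to illustrate the gap.

```python
# Hypothetical sizes: word vocabulary V and RNN hidden width h.
V, h = 10_000, 512

ngram_entries = V ** 3                  # trigram table: up to 10**12 entries
rnn_params = V * h + h * h + h * V     # input, recurrent, and output weights

print(ngram_entries)  # 1000000000000
print(rnn_params)     # 10502144
```

Roughly five orders of magnitude separate the two, which is why n-gram models must lean on smoothing while RNNs can share parameters across all contexts.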