Basic Models

A sequence-to-sequence model can be used for tasks such as machine translation and image captioning.

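For machine translation, a many-to-many encoder-decoder RNN can be used: the encoder reads the source sentence into a fixed-length representation, and the decoder generates the translation from it one word at a time. Because scoring every possible translation is intractable, an approximate search algorithm called beam search is used to find a highly probable translation; it is not guaranteed to find the single most probable one.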
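As a minimal sketch of how beam search works, the Python below keeps the `beam_width` highest-scoring partial sequences at each step instead of only the single best one. The names `beam_search`, `next_log_probs`, and the `TABLE` of probabilities are hypothetical and chosen for illustration; in a real translation system the decoder RNN would supply the next-word probabilities, and length normalization (omitted here) is commonly added.

```python
import math

# Hypothetical toy "decoder": next-token probabilities depend only on the last token.
TABLE = {
    "<s>": {"a": 0.6, "b": 0.4},
    "a": {"x": 0.55, "y": 0.45},
    "b": {"z": 0.9, "x": 0.1},
    "x": {"</s>": 1.0},
    "y": {"</s>": 1.0},
    "z": {"</s>": 1.0},
}

def toy_next_log_probs(seq):
    """Return {token: log-probability} for the token following seq."""
    return {tok: math.log(p) for tok, p in TABLE[seq[-1]].items()}

def beam_search(next_log_probs, start, end, beam_width=3, max_len=10):
    """Return the best (sequence, log_prob) pair found by beam search."""
    beams = [([start], 0.0)]   # unfinished hypotheses
    finished = []              # hypotheses that already emitted the end token
    for _ in range(max_len):
        # Extend every unfinished hypothesis by every possible next token.
        candidates = []
        for seq, score in beams:
            for token, logp in next_log_probs(seq).items():
                candidates.append((seq + [token], score + logp))
        if not candidates:
            break
        # Keep only the beam_width highest-scoring extensions.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for seq, score in candidates[:beam_width]:
            if seq[-1] == end:
                finished.append((seq, score))
            else:
                beams.append((seq, score))
        if not beams:
            break
    finished.extend(beams)  # fall back to unfinished hypotheses if max_len is hit
    return max(finished, key=lambda c: c[1])

greedy_seq, greedy_score = beam_search(toy_next_log_probs, "<s>", "</s>", beam_width=1)
beam_seq, beam_score = beam_search(toy_next_log_probs, "<s>", "</s>", beam_width=2)
```

With `beam_width=1` the search is greedy and commits to "a" first, ending with a sequence of probability 0.33. With `beam_width=2` it also keeps "b" alive and finds the sequence ending in "z", which has probability 0.36. A wider beam explores more hypotheses at a higher computational cost.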

For image captioning, the feature vector produced by the last fully-connected layer of a CNN (with the classification output removed) can be used as the input to a one-to-many RNN that generates the caption one word at a time.
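The one-to-many structure can be sketched in plain Python as follows. This is an illustrative toy under stated assumptions, not a trained model: the weights are random, the `features` vector stands in for real CNN output, the feature vector is used to initialize the hidden state (one common choice among several), and all names (`generate_caption`, `W_init`, `W_hh`, `W_xh`, `W_hy`, `embed`) are hypothetical.

```python
import math
import random

def matvec(W, v):
    """Multiply matrix W (list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def rand_matrix(rows, cols, rng):
    """Small random weights; a real model would learn these."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

def generate_caption(features, params, vocab, end_token="<eos>", max_len=8):
    W_init, W_hh, W_xh, W_hy, embed = params
    # One input: the image feature vector sets the initial hidden state.
    h = [math.tanh(z) for z in matvec(W_init, features)]
    x = [0.0] * len(embed[0])  # zero vector as the first word input
    caption = []
    # Many outputs: each generated word is fed back in as the next input.
    for _ in range(max_len):
        pre = [a + b for a, b in zip(matvec(W_hh, h), matvec(W_xh, x))]
        h = [math.tanh(z) for z in pre]
        logits = matvec(W_hy, h)
        idx = max(range(len(vocab)), key=lambda i: logits[i])  # greedy word choice
        if vocab[idx] == end_token:
            break
        caption.append(vocab[idx])
        x = embed[idx]
    return caption

rng = random.Random(0)
vocab = ["<eos>", "a", "cat", "on", "mat"]
feat_dim, hidden_dim, embed_dim = 6, 4, 3
params = (
    rand_matrix(hidden_dim, feat_dim, rng),    # W_init: features -> hidden state
    rand_matrix(hidden_dim, hidden_dim, rng),  # W_hh: hidden -> hidden
    rand_matrix(hidden_dim, embed_dim, rng),   # W_xh: word embedding -> hidden
    rand_matrix(len(vocab), hidden_dim, rng),  # W_hy: hidden -> vocabulary scores
    rand_matrix(len(vocab), embed_dim, rng),   # embed: one embedding row per word
)
features = [rng.random() for _ in range(feat_dim)]  # stand-in for CNN features
caption = generate_caption(features, params, vocab)
```

Because the weights are untrained, the resulting word sequence is meaningless; the point is only the data flow, in which a single image vector goes in and a variable-length sequence of words comes out. The greedy word choice here could be replaced by beam search.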
