Text-to-Motion generates a sequence of robot animations contingent on the sentiment inferred from an input sentence and its spoken audio. Using transfer learning, we trained a linear classifier on our corpus of animated robot speech on top of the DeepMoji network, a long short-term memory (LSTM) network with an attention model trained on more than a billion tweets.
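
The transfer-learning step can be illustrated with a minimal sketch: a linear (softmax) classifier trained on fixed sentence features. This is not the authors' implementation. The three-dimensional feature vectors stand in for embeddings that a frozen pretrained network such as DeepMoji might produce, and the animation labels, the helper names (`softmax`, `train_linear_classifier`, `predict`), and the hyperparameters are all hypothetical choices made for illustration.

```python
import math
import random


def softmax(scores):
    """Convert raw class scores into probabilities (numerically stable)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def class_scores(weights, biases, x):
    """Linear scores w . x + b for every class."""
    return [sum(w_j * x_j for w_j, x_j in zip(w, x)) + b
            for w, b in zip(weights, biases)]


def train_linear_classifier(features, labels, num_classes,
                            epochs=200, lr=0.5, seed=0):
    """Fit a softmax classifier with plain stochastic gradient descent.

    Only these weights are learned; the feature extractor is assumed frozen.
    """
    rng = random.Random(seed)
    dim = len(features[0])
    weights = [[rng.uniform(-0.01, 0.01) for _ in range(dim)]
               for _ in range(num_classes)]
    biases = [0.0] * num_classes
    for _ in range(epochs):
        for x, y in zip(features, labels):
            probs = softmax(class_scores(weights, biases, x))
            for c in range(num_classes):
                # Gradient of cross-entropy w.r.t. the score of class c.
                grad = probs[c] - (1.0 if c == y else 0.0)
                for j in range(dim):
                    weights[c][j] -= lr * grad * x[j]
                biases[c] -= lr * grad
    return weights, biases


def predict(weights, biases, x):
    """Return the index of the highest-scoring class."""
    scores = class_scores(weights, biases, x)
    return max(range(len(scores)), key=scores.__getitem__)


# Hypothetical animation classes and toy stand-in features (assumptions).
ANIMATIONS = ["happy", "sad", "neutral"]
FEATURES = [
    [0.9, 0.1, 0.0], [0.8, 0.2, 0.1],   # sentences labelled "happy"
    [0.1, 0.9, 0.0], [0.2, 0.8, 0.1],   # sentences labelled "sad"
    [0.1, 0.1, 0.9], [0.0, 0.2, 0.8],   # sentences labelled "neutral"
]
LABELS = [0, 0, 1, 1, 2, 2]

weights, biases = train_linear_classifier(FEATURES, LABELS, len(ANIMATIONS))
predicted = [ANIMATIONS[predict(weights, biases, x)] for x in FEATURES]
print(predicted)
```

In a real pipeline the feature vectors would come from the pretrained network's sentence representation rather than being written by hand, and the predicted class would select which robot animation to play.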