Category encoding and string lookup using Keras.
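One-hot encoding represents a categorical variable as binary vectors: each category maps to a vector that is 1 at that category's index and 0 everywhere else.
Keras provides a to_categorical() function that converts integer class labels into such a binary matrix. For string data, the one_hot() function in keras.preprocessing.text encodes each word as an integer; despite its name, it returns integer indices rather than binary vectors.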
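To make the binary-vector idea concrete, here is a minimal sketch in plain Python, with no Keras dependency, of the kind of conversion to_categorical() performs on integer class labels. The helper name one_hot_vectors and the sample labels are made up for illustration, and the sketch returns lists of ints rather than the NumPy array Keras produces.

```python
def one_hot_vectors(labels, num_classes=None):
    """Turn integer class labels into binary (one-hot) row vectors."""
    if num_classes is None:
        # Assumption: classes are numbered 0..max(labels)
        num_classes = max(labels) + 1
    vectors = []
    for label in labels:
        row = [0] * num_classes   # all zeros ...
        row[label] = 1            # ... except a single 1 at the label's index
        vectors.append(row)
    return vectors


labels = [0, 2, 1, 2]
encoded = one_hot_vectors(labels)
for label, row in zip(labels, encoded):
    print(label, row)
```

Each row contains exactly one 1, so the row length equals the number of classes.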
from keras.preprocessing.text import one_hot
from keras.preprocessing.text import text_to_word_sequence
from keras.preprocessing.text import Tokenizer
Define the text that you want to encode.
# Define the text
text = 'a book or other written or printed work, regarded in terms of its content rather than its physical form'

# Size of the vocabulary: the number of unique words in the text
words = set(text_to_word_sequence(text))
vocab_size = len(words)

# Integer-encode the document: each word is hashed to an index below vocab_size
result = one_hot(text, vocab_size)
print(result)
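Output (the exact integers can differ between runs, because one_hot() uses Python's built-in hash() by default):
[6, 2, 7, 3, 2, 7, 7, 1, 5, 2, 7, 4, 1, 2, 7, 4, 1, 4, 5]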
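The integer encoding above can be approximated without Keras. The sketch below assumes that one_hot() lowercases the text, strips punctuation, splits on whitespace, and maps each word to hash(word) % (n - 1) + 1; it swaps in an MD5-based hash so the result is reproducible across runs. The function names simple_tokenize, stable_hash and hashed_encode are hypothetical, and the tokenizer is a simplification of text_to_word_sequence.

```python
import hashlib
import string


def simple_tokenize(text):
    """Lowercase, replace punctuation with spaces, and split on whitespace."""
    table = str.maketrans({ch: " " for ch in string.punctuation})
    return text.lower().translate(table).split()


def stable_hash(word):
    """MD5-based hash: unlike the built-in hash(), it is identical on every run."""
    return int(hashlib.md5(word.encode("utf-8")).hexdigest(), 16)


def hashed_encode(text, n):
    """Map each word to an integer in [1, n - 1]; index 0 is left unused."""
    return [stable_hash(w) % (n - 1) + 1 for w in simple_tokenize(text)]


text = ('a book or other written or printed work, regarded in terms of '
        'its content rather than its physical form')
tokens = simple_tokenize(text)
vocab_size = len(set(tokens))          # number of unique words
encoded = hashed_encode(text, vocab_size)
print(vocab_size)
print(encoded)
```

Repeated words such as "or" and "its" always receive the same index. Because this is a hash, two different words can also collide on one index; choosing n larger than the vocabulary size makes collisions less likely.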