Tag Archives: BERT

Semantics at Scale: BERT + Elasticsearch

Ashok Chilakapati October 28, 2019 No Comments

Semantic search at scale is made possible with the advent of tools like BERT, bert-as-service, and of course support for dense vector manipulations in Elasticsearch. While the degree may vary depending on the use case, the search results can certainly benefit from augmenting the keyword based results with the semantic ones…

BoW vs BERT: Classification

Ashok Chilakapati October 10, 2019 7 Comments

BERT yields the best F1 scores on three different repositories representing binary, multi-class, and multi-label/class situations. BoW with tf-idf weighted one-hot word vectors using SVM for classification is not a bad alternative to going full bore with BERT however, as it is cheap.

BoW to BERT

Ashok Chilakapati September 23, 2019 No Comments

Word vectors have evolved over the years to know the difference between “record the play” vs “play the record”. They have evolved from a one-hot world where every word was orthogonal to every other word, to a place where word vectors morph to suit the context. Slapping a BoW on word vectors is the usual way to build a document vector for tasks such as classification. But BERT does not need a BoW as the vector shooting out of the top [CLS] token is already primed for the specific classification objective