Semantic search on the cheap



No neural networks

Let’s say PyTorch is still too much for us, and we’d rather not depend on anything that performs best on GPUs. That can be done! Next we’ll look at using a scikit-learn model to rank text semantically.
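The training code isn't reproduced here, so as a rough sketch, a TF-IDF + Logistic Regression classifier in plain scikit-learn might look like the following. The tiny dataset, the variable names and the two labels are made up for illustration; this is not the dataset or the txtai training call behind the accuracy figure reported in this article.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.pipeline import make_pipeline

# Hypothetical toy data; a real run would use a labeled emotion dataset
train_texts = [
    "I am so happy today",
    "This is a lovely gift, thank you",
    "Glad to hear the good news",
    "I am furious about the delay",
    "This makes me so angry",
    "I hate being ignored",
]
train_labels = ["joy", "joy", "joy", "anger", "anger", "anger"]

test_texts = ["Happy to hear that", "I am angry about this"]
test_labels = ["joy", "anger"]

# TF-IDF turns each text into a sparse term-weight vector,
# Logistic Regression learns one probability per label on top of it
model = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
model.fit(train_texts, train_labels)

accuracy = accuracy_score(test_labels, model.predict(test_texts))
print(f"Accuracy = {accuracy:.4f}")
```

Everything here runs on CPU and trains in well under a second on data this small, which is the whole appeal.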

Accuracy = 0.8595

Once again, the accuracy isn’t bad at 86%. txtai can use standard text classification models for similarity queries. The only caveat is that the queries must be pre-canned: they are limited to the labels determined at model training time.
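To make the "pre-canned" caveat concrete, here is a minimal sketch of how a classifier can answer a similarity query: the query is a trained label, and each text is scored by the predicted probability for that label. It uses scikit-learn directly rather than txtai's own API, and the `rank` helper and toy training data are hypothetical.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical toy training data, for illustration only
train_texts = [
    "I am so happy today",
    "This is a lovely gift, thank you",
    "Glad to hear the good news",
    "I am furious about the delay",
    "This makes me so angry",
    "I hate being ignored",
]
train_labels = ["joy", "joy", "joy", "anger", "anger", "anger"]

model = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
model.fit(train_texts, train_labels)


def rank(label, texts):
    """Sort texts by the model's predicted probability for a trained label."""
    labels = list(model.classes_)
    if label not in labels:
        # A query outside the training labels cannot be scored at all
        raise ValueError(f"'{label}' is not a trained label: {labels}")
    scores = model.predict_proba(texts)[:, labels.index(label)]
    return sorted(zip(texts, scores.tolist()), key=lambda pair: pair[1], reverse=True)


candidates = ["I'm angry", "What a cute picture", "Glad you found it"]
for label in ("joy", "anger"):
    print(label)
    for text, score in rank(label, candidates):
        print(f"  {text} {score:.4f}")
```

Asking for a label such as "boredom" raises an error in this sketch, because the model has no probability column for it. That is the trade-off compared with an embeddings model, which accepts any free-text query.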

Next we’ll run similarity queries for a couple of the trained labels.

joy
What a cute picture 0.9270861148834229
Glad you found it 0.9023486375808716
Happy to see you 0.8416115641593933

anger
I'm angry 0.9789760708808899
Didn't see that coming 0.2017391473054886
That's upsetting 0.16476769745349884

surprise
That is so troubling 0.04044938459992409
That's upsetting 0.03875105082988739
Never thought I would see that 0.030828773975372314

Not quite as good as the simple embeddings model but not too bad either. Remember that this is just a simple TF-IDF + Logistic Regression model!

This model can be put on top of a traditional search system to filter or re-rank results based on sentiment. The same methodology also works with a different dataset and different labels, which opens up a lot of possibilities.
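As a sketch of the filtering idea, assume a keyword search has already returned `(text, score)` hits. The classifier then keeps only the hits whose probability for a chosen label clears a threshold, leaving the search engine's own ordering intact. The hits, their scores, the `filter_hits` helper and the threshold value are all hypothetical.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical toy training data, for illustration only
train_texts = [
    "I am so happy today",
    "This is a lovely gift, thank you",
    "Glad to hear the good news",
    "I am furious about the delay",
    "This makes me so angry",
    "I hate being ignored",
]
train_labels = ["joy", "joy", "joy", "anger", "anger", "anger"]

model = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
model.fit(train_texts, train_labels)


def filter_hits(hits, label, min_probability=0.5):
    """Keep search hits whose predicted probability for label is high enough."""
    column = list(model.classes_).index(label)
    probabilities = model.predict_proba([text for text, _ in hits])[:, column]
    return [
        (text, score, float(probability))
        for (text, score), probability in zip(hits, probabilities)
        if probability >= min_probability
    ]


# Made-up output of a keyword search for "delay", best match first
hits = [
    ("I am furious about the delay", 7.2),
    ("Happy the delay gave me time to rest", 6.8),
    ("So angry about yet another delay", 5.1),
]

for text, score, probability in filter_hits(hits, "anger"):
    print(f"{text} (search score {score}, anger {probability:.2f})")
```

To re-rank instead of filter, sort the returned tuples by the probability, or by some blend of the two scores.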


Trending AI/ML Article Identified & Digested via Granola by Ramsey Elbasheer; a Machine-Driven RSS Bot



via WordPress https://ramseyelbasheer.io/2021/10/20/semantic-search-on-the-cheap/
