Our Blog

We post about the things we do, the challenges we crack, job openings, and more.

Data Cleaning using Regular Expression

Data cleaning is the process of fixing or removing incorrect, corrupted, incorrectly formatted, duplicate, or incomplete data within a dataset. Data is not always in a tabular format. As we enter the big data era, data comes in a pretty...

Build a Custom NER model using spaCy 3.0

spaCy is an open-source Python library used for Natural Language Processing (NLP). Unlike NLTK, which is widely used in research, spaCy focuses on production usage. It is a library for advanced, industrial-strength NLP in Python and Cython. As of now, this is the...

Stemming Vs. Lemmatization with Python NLTK

Stemming and lemmatization are text/word normalization techniques widely used in text pre-processing. They reduce words to their root form. Here is an example: let's say you have to train on data for classification and you are choosing any vectorizer to...

Text Classification using Machine Learning

Machine Learning, Deep Learning, and Artificial Intelligence are popular buzzwords today. Artificial Intelligence (AI) is the branch of computer science that deals with building artificial intelligence into machines so that they are able to think, act and...

Better Word Embeddings Using GloVe

We talked about word embeddings a bit in our last article, using word2vec. Word embeddings are one of the most powerful tools available to NLP developers today, and most NLP tasks require some kind of word embedding at one level or another. Thus, it is important to...

Feature Extraction in Natural Language Processing

In simple terms, feature extraction is transforming textual data into numerical data. In Natural Language Processing, feature extraction is an essential step for better understanding the context. After cleaning and normalizing textual data, we need to...

Abstractive Summarization Using Google’s T5

In this article, we will discuss abstractive summarization using T5 and how it differs from BERT-based models. T5 (Text-To-Text Transfer Transformer) is a transformer model that is trained end-to-end with text as input and modified text as output,...

Abstractive Summarization Using Pegasus

In the last article, we saw how to perform extractive summarization, which ranks the sentences in a text and outputs the most important ones without changing any of the wording. While such methods are suitable for some cases, they do not achieve the sophistication of...

Interested in working @ Turbolab?

We hire the best and brightest, give them competitive salaries, options, and flexible schedules, and remove every barrier we can to doing good work. Head to our careers page to find our latest openings, and tell us what makes you stand out from the rest of the pack.