DEV Community

Huynh-Chinh

GETTING STARTED WITH NATURAL LANGUAGE PROCESSING

Introduction

Natural language processing (NLP) is concerned with enabling computers to interpret, analyze, and generate human language. Typically, this refers to tasks such as answering questions, translating between languages, identifying languages, summarizing documents, understanding the sentiment of text, spell checking, speech recognition, and many others. The field sits at the intersection of linguistics, artificial intelligence, and computer science.

Roadmap of NLP for Machine Learning

1. Pre-processing

  • Sentence cleaning
  • Stop Words
  • Regular Expression
  • Tokenization
  • N-grams (Unigram, Bigram, Trigram)
  • Text Normalization
  • Stemming
  • Lemmatization

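Several of the steps above (cleaning, stop-word removal, tokenization, n-grams) can be sketched in plain Python. The stop-word list and the tokenizing regex below are illustrative assumptions, not a production pipeline; libraries like NLTK and spaCy ship proper tokenizers and stop-word lists.

```python
import re

# Illustrative stop-word list (real lists are much longer).
STOP_WORDS = {"the", "a", "an", "is", "of", "and", "to", "in"}

def preprocess(text):
    """Lowercase, tokenize with a simple regex, and drop stop words."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]

def ngrams(tokens, n):
    """Slide a window of size n over the token list (n=1: unigrams, n=2: bigrams, ...)."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

tokens = preprocess("The quick brown fox jumps over the lazy dog")
print(tokens)             # ['quick', 'brown', 'fox', 'jumps', 'over', 'lazy', 'dog']
print(ngrams(tokens, 2))  # bigrams over the cleaned tokens
```

Stemming and lemmatization are not shown here because they need linguistic resources; NLTK's `PorterStemmer` and `WordNetLemmatizer` are the usual starting points.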

2. Linguistics

  • Part-of-Speech Tags
  • Constituency Parsing
  • Dependency Parsing
  • Syntactic Parsing
  • Semantic Analysis
  • Lexical Semantics
  • Coreference Resolution
  • Chunking
  • Entity Extraction / Named Entity Recognition (NER)
  • Named Entity Disambiguation / Entity Linking
  • Knowledge Graphs
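To make entity extraction concrete without a trained model, here is a toy, rule-based extractor: runs of consecutive capitalized words are treated as candidate named entities. Real NER (e.g. in spaCy or NLTK) uses statistical models; this regex heuristic is only an illustration and will also pick up sentence-initial words.

```python
import re

def extract_entities(text):
    """Toy NER: match runs of one or more capitalized words."""
    pattern = r"\b(?:[A-Z][a-z]+)(?:\s+[A-Z][a-z]+)*\b"
    return re.findall(pattern, text)

print(extract_entities("Barack Obama met Angela Merkel in Berlin."))
# ['Barack Obama', 'Angela Merkel', 'Berlin']
```

Entity linking would then map each surface string to a knowledge-base entry (e.g. a Wikipedia page), which is well beyond what a regex can do.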

3. Word Embeddings

a. Frequency-based Word Embedding

  • One Hot Encoding
  • Bag of Words or CountVectorizer()
  • TF-IDF or TfidfVectorizer()
  • Co-occurrence Matrix, Co-occurrence Vector
  • Hashing Vectorizer

b. Pretrained Word Embedding

  • Word2Vec (by Google): CBOW, Skip-Gram
  • GloVe (by Stanford)
  • fastText (by Facebook)
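Word2Vec itself trains a shallow neural network (in practice you would use gensim's `Word2Vec` or load pretrained vectors). What the CBOW and Skip-Gram variants share is how training pairs are formed from a context window around each center word, which this sketch shows:

```python
def skipgram_pairs(tokens, window=2):
    """Generate (center, context) training pairs as Skip-Gram does."""
    pairs = []
    for i, center in enumerate(tokens):
        for j in range(max(0, i - window), min(len(tokens), i + window + 1)):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs

tokens = "the cat sat on the mat".split()
print(skipgram_pairs(tokens, window=1)[:4])
# [('the', 'cat'), ('cat', 'the'), ('cat', 'sat'), ('sat', 'cat')]
```

Skip-Gram predicts the context words from the center word; CBOW does the reverse, predicting the center word from the combined context.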

4. Topic Modeling

  • Latent Semantic Analysis (LSA)
  • Probabilistic Latent Semantic Analysis (pLSA)
  • Latent Dirichlet Allocation (LDA)
  • lda2Vec
  • Non-Negative Matrix Factorization (NMF)
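LSA, the oldest of these techniques, is just a truncated SVD of the term-document matrix. A NumPy sketch on a toy 4-term, 3-document count matrix (the numbers are made up for illustration):

```python
import numpy as np

# Rows = terms, columns = documents (toy counts).
X = np.array([
    [2, 0, 1],   # "cat"
    [1, 0, 2],   # "dog"
    [0, 3, 0],   # "stock"
    [0, 2, 1],   # "market"
], dtype=float)

U, s, Vt = np.linalg.svd(X, full_matrices=False)
k = 2  # keep the top-k latent "topics"
doc_topics = (np.diag(s[:k]) @ Vt[:k]).T  # each row: a document in topic space
print(doc_topics.shape)  # (3, 2)
```

LDA and NMF produce more interpretable, non-negative topic weights; scikit-learn's `LatentDirichletAllocation` and `NMF` are the usual tools there.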

5. NLP with Deep Learning

  • Classical machine learning baselines (Logistic Regression, SVM, Naïve Bayes)
  • Embedding Layer
  • Artificial Neural Network
  • Deep Neural Network
  • Convolutional Neural Network
  • RNN/LSTM/GRU
  • Bi-RNN/Bi-LSTM/Bi-GRU
  • Pretrained Language Models: ELMo, ULMFiT
  • Sequence-to-Sequence/Encoder-Decoder
  • Transformers (attention mechanism)
  • Encoder-only Transformers: BERT
  • Decoder-only Transformers: GPT
  • Transfer Learning
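The attention mechanism at the heart of Transformers is compact enough to sketch in NumPy: scaled dot-product attention computes softmax(QKᵀ/√d_k)V. The shapes below are toy values, not a full multi-head implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)      # similarity of each query to each key
    weights = softmax(scores, axis=-1)   # each row sums to 1
    return weights @ V                   # weighted mixture of the values

rng = np.random.default_rng(0)
Q = rng.normal(size=(4, 8))  # 4 query positions, dimension 8
K = rng.normal(size=(6, 8))  # 6 key/value positions
V = rng.normal(size=(6, 8))
out = attention(Q, K, V)
print(out.shape)  # (4, 8)
```

BERT stacks encoder layers built from this (bidirectional context), while GPT stacks decoder layers with a causal mask so each position only attends to earlier ones.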

6. Example Use cases

  • Sentiment Analysis
  • Question Answering
  • Language Translation
  • Text/Intent Classification
  • Text Summarization
  • Text Similarity
  • Text Clustering
  • Text Generation
  • Chatbots (DialogFlow, RASA, Self-made Bots)
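As a taste of the simplest of these use cases, here is a lexicon-based sentiment scorer. The word lists are illustrative assumptions; real sentiment analysis uses trained classifiers or pretrained language models.

```python
POSITIVE = {"good", "great", "excellent", "love", "happy"}
NEGATIVE = {"bad", "terrible", "awful", "hate", "sad"}

def sentiment(text):
    """Count positive vs. negative lexicon hits and return a label."""
    tokens = text.lower().split()
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

print(sentiment("I love this great movie"))  # positive
print(sentiment("what a terrible, sad day")) # negative
```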

7. Libraries

  • NLTK
  • spaCy
  • Gensim

Conclusion
Thank you very much for taking the time to read this. I would really appreciate any comments in the comment section.
Enjoy🎉
