NLTK stands for Natural Language Toolkit. It is a popular Python library for Natural Language Processing (NLP).

NLTK provides functions for most common language-processing tasks, such as splitting text into tokens, reducing words to their stems, and classifying documents.

NLTK Tutorial

The following NLP concepts are covered in this NLTK tutorial.

  • Classification
  • Tokenization
  • Stemming
  • Tagging
  • Parsing
  • Semantic Reasoning

Classification

Classification is a supervised machine learning method that assigns strings or documents to one of a set of predefined classes or categories. In the following tutorial, we will learn how to classify text or documents using NLTK.
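As a rough illustration of the idea, the sketch below trains NLTK's NaiveBayesClassifier on a tiny hand-written data set and classifies one new string. The training sentences, the "pos"/"neg" labels, and the word_features helper are made up for this example; a real classifier would need far more data and better features.

```python
from nltk.classify import NaiveBayesClassifier


def word_features(text):
    """Hypothetical helper: turn a string into a bag-of-words feature dict."""
    return {word: True for word in text.lower().split()}


# Toy labeled data, invented for illustration only.
train_data = [
    ("great movie loved it", "pos"),
    ("wonderful acting and great story", "pos"),
    ("loved the wonderful soundtrack", "pos"),
    ("terrible movie hated it", "neg"),
    ("boring plot and terrible acting", "neg"),
    ("hated the boring ending", "neg"),
]

# NLTK classifiers train on (feature_dict, label) pairs.
train_set = [(word_features(text), label) for text, label in train_data]
classifier = NaiveBayesClassifier.train(train_set)

# Classify an unseen string into one of the predefined classes.
prediction = classifier.classify(word_features("great wonderful story"))
print(prediction)
```

Each training example is a pair of a feature dictionary and a label, so any text has to be converted into features before it can be used for training or classification.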

Tokenization

Tokenization is the process of dividing a given text into simpler units such as sentences or words. NLTK can tokenize text into words or sentences.

  • Word Tokenizer – To convert given text into a list of words.
  • Sentence Tokenizer – To convert given text into a list of sentences.
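The two tokenizers can be sketched as follows. The commonly used entry points are nltk.word_tokenize and nltk.sent_tokenize, but they rely on pre-trained Punkt data that has to be downloaded separately. To stay self-contained, this sketch instead uses wordpunct_tokenize and an untrained PunktSentenceTokenizer, which need no extra data; the sample text is made up for illustration.

```python
from nltk.tokenize import wordpunct_tokenize
from nltk.tokenize.punkt import PunktSentenceTokenizer

text = "NLTK splits text into tokens. It is simple!"

# Word tokenizer: splits on whitespace and separates punctuation.
words = wordpunct_tokenize(text)
print(words)

# Sentence tokenizer: an untrained Punkt tokenizer with default parameters.
# It knows no abbreviations, so it may split incorrectly after "Dr." or "e.g.".
sentences = PunktSentenceTokenizer().tokenize(text)
print(sentences)
```

For real text, the pre-trained models behind sent_tokenize usually handle abbreviations and other edge cases better than the untrained tokenizer used here.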

Stemming

Tagging

Parsing

Semantic Reasoning