I&C SCI X426.77

Text Mining and Analytics for Machine Learning

With drastic improvements in computational and algorithmic capabilities over the past decade, and with the successful application of relatively simple sentiment analytics, the industry is moving on to gaining insight into human emotions and feelings from natural language. As a result, Natural Language Processing has moved from laboratories into industry, with applications such as text classification, language modeling, speech recognition, caption generation, document summarization, and chatbots. Today a large number of organizations use Natural Language Processing to process textual data from social media to make decisions in messaging, selling, and social entrepreneurship. This course provides a solid foundation in Text Mining and Natural Language Processing. The course starts with an introduction to text mining using Python. Students will learn searching, reading, scraping, cleaning, and processing text from multiple sources. The course will use regular expressions, TextBlob, word vectors, and NLTK (Natural Language Toolkit). Students will also learn how to implement Latent Semantic Analysis (LSA) for text indexing and Latent Dirichlet Allocation (LDA) for topic modeling. Finally, students will learn Natural Language Understanding and its applications.
Prerequisites: I&C SCI X426.75 Tools and Techniques for Machine Learning OR I&C SCI X426.64 Introduction to Programming with Python. Students are expected to be able to program, read from and write to files, and use nested loops, conditions, functions, etc.
Python is the programming language used in this course; basic experience with Python is helpful. This course also assumes that you are comfortable with statistics and mathematics, including the concepts of random variables and probability.

Course
Approximate Cost: TBD
Format: Online
Duration: TBD
Total Credits: 2