Data Structures, Data Mining and Big Data with Python
A required course in the Python for Data Science, Web and Core Programming Specialized Studies Program.
Learn advanced Python programming features used to solve large data problems. Topics include ETL using the command-line interface, functional programming, MySQL, the MapReduce framework using Hadoop Streaming and mrjob, and Spark with Spark ML. Students will gain practical experience with Amazon Web Services Elastic Compute Cloud (EC2) and Elastic MapReduce (EMR). Explore how Python's built-in data structures, such as lists, dictionaries, and tuples, can be used to perform increasingly complex data analysis. The course also introduces regression and cluster models for data mining and basic machine learning for analysis, and it emphasizes the use of cloud computing to solve large data problems.
Prerequisite: I&C SCI X426.64 Introduction to Programming for Python or I&C SCI X426.62 Python for Data Analysis