Title | Winter | Spring | Summer | Fall |
Data Science |
Math Review for Data Science and Analytics (2.50 Units)
I&C SCI X414.33
This course provides a foundation in the fundamental math skills needed for the emerging fields of Data Science, Predictive Analytics and Data Analytics. Key topics include basic algebra, pre-calculus concepts and fundamental concepts of statistics. Participants will learn these skills through practical applications of the mathematical concepts key to analytics, without theoretical derivations. The math skills taught will allow you to apply structured thinking and deliver statistical evidence and analytical results for business problems and decision making. This is a useful course for those who are interested in learning or relearning algebra and pre-calculus concepts. Prerequisite: Basic mathematics (high school level).
Winter: - | Spring: - | Summer: to be scheduled | Fall: - |
Hadoop: In Theory and Practice (3.00 Units)
I&C SCI X425.18
Today, organizations in every industry are being showered with imposing quantities of new information. Along with traditional sources, many more data channels and categories now exist. Collectively, these vastly larger information volumes and new assets are known as Big Data. Enterprises are using technologies such as MapReduce and Hadoop to extract value from Big Data. This course provides an in-depth overview of Hadoop and MapReduce, the cornerstones of big data processing. To crystallize the concepts behind Hadoop and MapReduce, you will work through a series of short, focused exercises: you will install and configure a Hadoop cluster, write basic MapReduce programs, gain familiarity with advanced MapReduce programming practices, and use interfaces such as Pig and Hive to interact with Hadoop. You will also learn about real-world situations where MapReduce techniques can be used.
Prerequisites: This course is appropriate for developers who want to get acquainted with Hadoop concepts and who will be writing, maintaining and/or optimizing Hadoop jobs. Data Architects and IT Managers/Directors who want an in-depth look into Big Data technologies such as Hadoop and MapReduce also stand to benefit from this course.
Participants should have some programming experience; a background in Java is preferred, but experience with other programming languages such as PHP, Python, or C# is sufficient. Understanding of common computer science concepts is a plus. Prior knowledge of Hadoop is not required.
Access to a Windows PC is required (for example, Windows XP, Windows 7, or Windows 10).
Hadoop is written in Java, so you will need to have Java, version 6 or later, installed on your machine. Sun’s JDK is the most widely used with Hadoop. If you do not have Java installed, step-by-step instructions will be provided to install it.
Winter: - | Spring: - | Summer: to be scheduled | Fall: - |
Introduction to Big Data (2.00 Units)
I&C SCI X425.80
Web 2.0 has changed the way we conduct business. It affects consumer interactions, information sharing, success measurement in terms of business revenue and customer wallet share, and brand management. Most importantly, it has created a revenue channel like no other. Personalization of products and services by enterprises has increased data volumes and the velocity of data production while creating a variety of data formats. We are now well and truly in the world of Big Data. The key value of this data is the vast amount of intelligence that can be found when data is modeled with geographic and demographic information. This introductory course will help students navigate through the complex layers of Big Data and Data Warehousing while providing information on how to effectively think about using these technologies and architectures to design the next-generation data warehouse. Concepts covered include an introduction to Big Data, discussion of Big Data Processing Architectures, explanation of the integration of Big Data and Data Warehouses, and fundamentals of Big Data Analytics. Examples will be used to step students through an implementation of a Big Data solution.
Winter: - | Spring: - | Summer: - | Fall: Online |
Effective Data Preparation (2.00 Units)
I&C SCI X425.63
Broadly speaking, data preparation for data mining consists of three (3) elements:
1. Data Mining Process delineation (understanding the overall process)
2. Data Understanding (data cannot be properly prepared without first understanding it)
3. Data Pre-processing (transforming data into a form compatible with data mining)
This intensive hands-on course gives students the skills necessary to extract stored data elements, understand what they mean in the company, transform their formats and derive new relationships among them to produce a dataset suitable for analytical modeling. Students will learn how to produce a fully processed dataset suitable for building powerful predictive models that can be deployed to increase business profitability. Required prerequisite: I&C SCI X425.61 Introduction to Predictive Analytics OR I&C SCI X425.60 Introduction to Data Science.
Winter: - | Spring: to be scheduled | Summer: - | Fall: Online |
Python for Data Analysis (1.50 Units)
I&C SCI X426.62
Python for Data Analysis is a course for students with some experience using Python who want to learn how to import and analyze data using the popular programming language. Students can immediately use what they have learned to ingest data, produce plots and analysis, and fit models. Note that not everything within the Python language will be covered (user interfaces, web services, and object-oriented programming, for example, are excluded). The main Python libraries introduced will be numpy, matplotlib, pandas, and scikit-learn. Major topics include how to import data and manipulate it efficiently using numpy, how to produce plots and data visualizations with matplotlib, how to run statistical analysis using pandas, and how to build predictive models with scikit-learn. A final project will help to tie the main concepts together. Additional topics include how to use Eclipse, a very handy development environment. Prerequisites: I&C SCI X426.64 Introduction to Programming with Python; knowledge of Python programming is required.
Winter: Online | Spring: to be scheduled | Summer: to be scheduled | Fall: Online |
Data Integration, Modeling, and ETL (2.50 Units)
I&C SCI X425.32
This course provides both introductory and advanced concepts and techniques for developing effective dimensional models, data integration, and the ETL process. Learn how to build a high-performance dimensional data model. A good dimensional model and its physical database form the hub of a business intelligence data warehouse, serving as the target of data integration and as the source of business intelligence data. Learn how to design dimensional models for extensibility, employ a proven dimensional design process, apply the process to several representative situations, and understand a variety of advanced dimensional modeling techniques. The Extract, Transform, and Load (ETL) process is typically the most time-consuming, misunderstood, and underestimated task in building a data warehouse and other data integration applications. The ETL process addresses and resolves the challenges of extracting data from disparate operational source systems, storing it in the data staging area, profiling data for errors, cleaning and transforming the data, and mass loading it into the target enterprise data warehouse, data marts, or operational systems. Prerequisite: Experience using a relational database (at a minimum, Microsoft Access or a similar product) is highly desirable.
Winter: Online | Spring: - | Summer: to be scheduled | Fall: - |
Tools and Techniques for Machine Learning (2.00 Units)
I&C SCI X426.75
Simply put, machine learning is a form of data analysis. Using algorithms that continuously learn from data, machine learning allows computers to uncover hidden patterns without being explicitly programmed to do so. The key aspect of machine learning is that as models are exposed to new data sets, they adapt to produce reliable and consistent output. In this course, we will cover the tools and techniques that are currently associated with the discipline of machine learning. We will start by exploring the field of machine learning and then gradually delve into practical examples of how to leverage machine learning algorithms to derive insights from big data. Once we have gained practical exposure to machine learning algorithms, we will use visualization capabilities to surface the output they generate. Finally, we will explore the field of deep learning and understand how it differs from machine learning.
Winter: - | Spring: to be scheduled | Summer: - | Fall: Online |
Data Engineering (2.50 Units)
I&C SCI X427.06
This course is designed to enhance student proficiency in data design, data management, data warehousing, data modeling, and query manipulation. Topics include techniques and methods for identification, extraction, and preparation of data for processing with database software. Gain an overview of the basic techniques of data engineering, including data normalization, relational and non-relational databases, SQL and NoSQL, manipulation of data at scale (big data), and algorithms for data operations. Students will work in teams on a final project to explore, analyze, summarize and present findings from a real-world big data set.
Winter: Online | Spring: to be scheduled | Summer: to be scheduled | Fall: Online |
R Programming (2.00 Units)
I&C SCI X425.20
R is a scripting language for statistical data manipulation and analysis. R is an open-source package available under the GNU license at no cost. R competes with SPSS, another very well-known statistical package used heavily in many industries. Statistics is used in every part of business data processing and prediction. Data captured by web analytics services need statistics. Statistics is also the foundation of predictive analytics. R business applications include correlation, regression, hypothesis testing, and all inference testing. This course will focus on R programming, which is used for solving business problems related to basic math and statistics. First, all relevant math concepts will be reviewed. This will include functions, regression, descriptive and inferential statistics, and matrix operations. All these basic math problems will be solved using R. The programmatic interface and graphic capabilities of R will also be explored. Finally, several real-world business problems will be solved using R. Prerequisites: I&C SCI X414.13 Math Review for Data Science and Analytics, or a background in basic math, statistics, functions, matrices, and basic programming.
Winter: Online | Spring: - | Summer: - | Fall: - |
Business Intelligence & The Data Warehouse Development Process (2.50 Units)
I&C SCI X427.01
Learn how to make better business decisions, use fewer resources, and improve your company's bottom line by developing and using a data warehouse. This course provides an overview of business intelligence and data warehousing and gives you a look at all the major facets of developing and using a data warehouse to make effective business decisions. Students will work on a single project to develop a comprehensive project plan and business case for a data warehouse including how to develop a dimensional model, a data staging process, and a data access process. Additional topics will include information on careers working with business intelligence and data warehousing as well as the educational requirements for this field.
Winter: - | Spring: to be scheduled | Summer: - | Fall: Online |
R Basics
I&C SCI X427.19
This course will focus on the foundational concepts of getting started with R programming, which is used for math, statistics, and data analysis. The programmatic interface and graphic capabilities of R will also be explored. Several case studies will be covered and examined using R. Prerequisites: I&C SCI X425.99 Practical Math & Stats for Data Science and basic experience with programming.
Winter: - | Spring: - | Summer: to be scheduled | Fall: - |
Data Preparation, Modeling and Visualization with Python (2.00 Units)
I&C SCI X426.54
Learn how to create business value by effectively importing, preparing, modeling and visualizing data using the Python programming language. Students will learn how to implement various models such as linear regression, logistic regression, and decision trees; both supervised and unsupervised modeling techniques will be covered. Pandas and scikit-learn will be the primary Python packages covered in this course. Both of these packages provide powerful tools for those in machine learning, data science, data mining, and web data professions. Prerequisites: I&C SCI X426.64 Introduction to Programming for Python or I&C SCI X426.62 Python for Data Analysis.
Winter: Online | Spring: - | Summer: to be scheduled | Fall: Online |