Big Data Analysis
A required course in the Data Science Certificate Program
Course Description
Big data is one of the most important technology trends fundamentally changing the way organizations operate and compete. As more companies collect large amounts of data through their daily operations, the ability to analyze big data and glean knowledge from it has become an integral part of a successful business. This course helps students navigate the complex layers of big data while providing insight into ways to use technologies and architectures effectively to create and manage big data workflows. Concepts covered include an introduction to big data and related technologies, a discussion of big data processing architectures, an explanation of the major concepts behind big data management, and how all of these topics are applied in big data analysis. Students will gain an understanding of the characteristics of big data and of techniques for working on big data platforms through hands-on exercises in the tools and systems used by data scientists and data engineers, including Hadoop (HiveQL and Pig), Apache Spark, and Spark SQL.
Required prerequisites: I&C SCI X427.05 Fundamentals of Data Science.
NOTE: This course may use live sessions via Zoom. While students are highly encouraged to attend, all sessions are optional and will be recorded. A device with audio and video capability is needed to participate. The following student guide provides additional resources and information on how to access and use your course's Zoom sessions.
Instructor
Yu Zhang, Ph.D., has over 5 years of experience in Python programming and has used BigQuery SQL, Python in Colab, and Tableau in previous and current work. She has over 5 years of academic experience in statistical analysis and 3 years of experience in industry and government working on complex data and machine learning projects. She is technically strong, with extensive experience in data validation and machine learning modeling. In addition, she has over 6 years of teaching experience as a lecturer, teaching assistant, and mentor at UC Santa Barbara, UC Davis, and UC Irvine. She is passionate about using data analytics to solve real-world problems and about working with diverse students.
Textbook Information
Textbooks for your course may be purchased from any vendor or bookseller of your choice.
No textbooks are required for this course.
Meeting Schedule
Event | Date | Day | Start Time | End Time | Location | Room
START | 04/24/2023 | Monday | --- | --- | Online (Access Begins) | ---
END | 06/18/2023 | Sunday | --- | --- | Online (Access Ends) | ---