ICS 411 Big Data Storage and Processing

Covers the concepts and approaches used by big data systems. Topics include the fundamentals of big data storage and processing using distributed file systems, the map-reduce programming paradigm, and NoSQL systems. Students gain hands-on experience by implementing solutions to big data problems using tools such as Hadoop, Apache Pig, Hive, Impala, MongoDB, Cassandra, Neo4j, or Spark.

Prerequisites

ICS 240: Introduction to Data Structures and ICS 311: Database Management Systems

Special information

First day attendance is mandatory.
Note: Students are responsible for knowing and abiding by the prerequisites for the ICS courses in which they enroll, and will be administratively dropped from a course if they have not met its prerequisites.

4 Undergraduate credits

Effective May 3, 2017 to present

Learning outcomes

General

Identify and justify the storage and processing requirements of data-intensive applications.
Explain the similarities and differences between the requirements of big data applications and the ACID requirements of traditional database applications.
Analyze and solve data-intensive problems using Hadoop and its distributed file system.
Design and develop algorithms using the map-reduce programming paradigm.
Classify and describe NoSQL systems.
Assess the suitability of a particular type of NoSQL database for an application.