Introduction to Big Data:
The four dimensions of Big Data (volume, velocity, variety, veracity), Drivers for Big Data, Introducing the Storage, Query Stack, Revisit useful technologies and concepts, Real-time Big Data Analytics.
Distributed File Systems:
Hadoop Distributed File System, Google File System, Data Consistency.
Big Data Storage Models:
Distributed Hash Table, Key-Value Storage Model (Amazon's Dynamo), Document Storage Model (Facebook's Cassandra), Graph Storage Models.
Mining large graphs, with a focus on social networks and web graphs: centrality, similarity, all-distances sketches, community detection, link analysis, spectral techniques. MapReduce, Pig Latin, and NoSQL, Algorithms for detecting similar items, Recommendation systems, Data stream analysis algorithms, Clustering algorithms, Detecting frequent items.
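As an illustration of the "detecting frequent items" topic, here is a minimal sketch of one well-known streaming approach, the Misra-Gries summary. The syllabus does not name a specific algorithm, so this choice is an assumption, and the function name `misra_gries` and its parameters are hypothetical.

```python
def misra_gries(stream, k):
    """Return candidate heavy hitters using at most k - 1 counters.

    Any item occurring more than len(stream) / k times is guaranteed
    to appear in the result; the result may also contain false positives.
    """
    counters = {}
    for item in stream:
        if item in counters:
            counters[item] += 1
        elif len(counters) < k - 1:
            counters[item] = 1
        else:
            # No free counter: decrement every counter and drop zeros.
            for key in list(counters):
                counters[key] -= 1
                if counters[key] == 0:
                    del counters[key]
    return counters


candidates = misra_gries(["a", "b", "a", "c", "a", "a", "d"], k=2)
```

The stored counts are lower bounds rather than exact frequencies, so a second pass over the data is normally needed to confirm which candidates are truly frequent.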
Employing Hadoop MapReduce:
Creating the components of Hadoop MapReduce jobs - Distributing data processing across server farms - Executing Hadoop MapReduce jobs - Monitoring the progress of job flows - The Building Blocks of Hadoop MapReduce - Distinguishing Hadoop daemons - Investigating the Hadoop Distributed File System - Selecting appropriate execution modes: local, pseudo-distributed, fully distributed.
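To make the components of a MapReduce job concrete, the following is a minimal single-process sketch of the map, shuffle, and reduce phases for a word count. It does not use Hadoop or any Hadoop API; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for each whitespace-separated word in a line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group values by key, mimicking the step between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Sum the counts collected for one word."""
    return key, sum(values)


def word_count(lines):
    """Run the three phases in sequence over an iterable of lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


counts = word_count(["big data", "Big graphs"])
```

In a real cluster the map and reduce functions run in parallel on different nodes and the framework performs the shuffle; this sketch only shows the data flow between the phases.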
Big Data Issues:
Privacy, Visualization, Compliance and Security, Structured vs. Unstructured Data.