2CSE50E15: Data Science & Modelling

Learning Outcomes: 
After completing the course, students should be able to:
Describe the fundamentals of data analytics and the data science pipeline
Scope the resources required for a data science project
Apply statistical methods, regression techniques, and machine learning algorithms to make sense of both large and small data sets
Demonstrate knowledge of statistical data analysis techniques used in business decision making
Apply principles of Data Science to the analysis of business problems
Use data mining software to solve real-world problems
Employ cutting-edge tools and technologies to analyze Big Data
Apply algorithms to build machine intelligence
Syllabus: 
Unit No. | Topics
1

Descriptive Statistics

Introduction to the course, Descriptive Statistics, Probability Distributions

2

Inferential Statistics

Inferential Statistics through hypothesis tests, Permutation & Randomization Test

3

Regression & ANOVA

Regression, ANOVA (Analysis of Variance)

4

Machine Learning Introduction and Concepts

Differentiating algorithmic and model-based frameworks, Regression: Ordinary Least Squares, Ridge Regression, Lasso Regression, K-Nearest Neighbours, Regression & Classification

5

Supervised Learning with Regression and Classification techniques

Bias-Variance Dichotomy, Model Validation Approaches, Logistic Regression, Linear Discriminant Analysis, Quadratic Discriminant Analysis, Regression and Classification Trees, Support Vector Machines, Ensemble Methods: Random Forest, Neural Networks, Deep Learning

6

Unsupervised Learning and Data Modelling

Clustering, Association Rule Mining, Logical Modelling: Converting a conceptual model to a logical model, Integrity constraints, Normalization

7

Introduction to Data Architecture Management

Governance, Architecture Management, Development, Operations, Security, Reference and Master Data Management, Warehousing and Business Intelligence, Metadata Management, Quality Management, Data Lifecycles

8

Data Model Patterns

Data Hierarchies & Aggregations, Introduction to the Business Information Model (BIM)

Text Books: 
Name : 
The Elements of Statistical Learning
Author: 
Trevor Hastie, Robert Tibshirani, and Jerome Friedman
Publication: 
New York: Springer, 2009.
Edition: 
Second Edition
Reference Books: 
Name: 
The Elements of Statistical Learning
Author: 
Trevor Hastie, Robert Tibshirani, and Jerome Friedman
Publication: 
Springer, 2009.
Name: 
Applied Statistics and Probability for Engineers
Author: 
Montgomery, Douglas C., and George C. Runger
Publication: 
John Wiley & Sons, 2010
Name: 
Scaling Up Machine Learning
Author: 
Ron Bekkerman, Mikhail Bilenko, and John Langford (eds.)
Name: 
Hadoop: The Definitive Guide
Author: 
Tom White
Publication: 
O'Reilly Media, 2012.
Edition: 
Third Edition
Name: 
Mining of Massive Datasets
Author: 
Anand Rajaraman and Jeffrey David Ullman
Publication: 
Cambridge University Press
Name: 
Developing Analytic Talent: Becoming a Data Scientist
Author: 
Vincent Granville
Publication: 
Wiley, 2014.
Name: 
Introduction to Data Science
Author: 
Jeffrey Stanton & Robert De Graaf
Edition: 
Version 2.0, 2013.
branch: 
BDA
Course: 
2014
2016
Stream: 
B.Tech