Data Science with Python

Course Schedule

Date          Day        Time           Mode        Cost
Mar 26 2020   Thursday   08:30 PM EDT   Live Demo   Free
Mar 27 2020   Friday     09:30 PM EDT   Live Demo   Free

Course Description

Data Science is one of the hottest fields of the 21st century. Data science with the Python programming language has broad scope in the IT industry, and data scientist and data analyst roles are in high demand and tend to come with competitive pay packages.

Hachion’s Data Science with Python online training is prepared by experienced trainers and covers both basic and advanced concepts of the Python programming language. The course provides a structured syllabus that starts from scratch and includes the basics of Python, data analysis, data scraping, data visualization, machine learning algorithms, and more. Solving the assignments included in the Python Data Science tutorial will strengthen your practical knowledge and programming skills.

Course Content

• What is analytics & Data Science?
• Common Terms in Analytics
• Analytics vs. Data warehousing, OLAP, MIS Reporting
• Relevance in industry and need of the hour
• Types of problems and business objectives in various industries
• How leading companies are harnessing the power of analytics?
• Critical success drivers
• Overview of analytics tools & their popularity
• Analytics Methodology & problem-solving framework
• List of steps in Analytics projects
• Identify the most appropriate solution design for the given problem statement
• The project plan for Analytics project & key milestones based on effort estimates
• Build a Resource plan for an analytics project
• Why Python for data science?

• Overview of Python- Starting with Python
• Introduction to installation of Python
• Introduction to Python Editors & IDEs (Canopy, PyCharm, Jupyter, Rodeo, IPython, etc.)
• Understand Jupyter notebook & Customize Settings
• Concept of Packages/Libraries - Important packages (NumPy, SciPy, scikit-learn, Pandas, Matplotlib, etc.)
• Installing & loading Packages & Name Spaces
• Data Types & Data objects/structures (strings, Tuples, Lists, Dictionaries)
• List and Dictionary Comprehensions
• Variable & Value Labels – Date & Time Values
• Basic Operations - Mathematical - string - date
• Reading and writing data
• Simple plotting
• Control flow & conditional statements
• Debugging & Code profiling
• How to create classes and modules and how to call them
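
As a small illustration of the list and dictionary comprehensions listed above, here is a minimal sketch. The temperature readings are hypothetical sample data and are not taken from the course material.

```python
# Hypothetical sample data (an assumption for illustration only).
temperatures_c = {"Mon": 21.5, "Tue": 19.0, "Wed": 24.2}

# List comprehension: convert each Celsius reading to Fahrenheit.
temperatures_f = [round(c * 9 / 5 + 32, 1) for c in temperatures_c.values()]

# Dictionary comprehension: keep only the days warmer than 20 degrees Celsius.
warm_days = {day: c for day, c in temperatures_c.items() if c > 20}

print(temperatures_f)
print(warm_days)
```

Both forms build a new collection in a single expression, which is often easier to read than an explicit loop with `append`.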

• NumPy, SciPy, pandas, scikit-learn, statsmodels, NLTK, etc.

• Importing Data from various sources (CSV, txt, excel, access, etc)
• Database Input (Connecting to the database)
• Viewing Data objects - subsetting, methods
• Exporting Data to various formats
• Important Python modules: Pandas, Beautiful Soup

• Cleansing Data with Python
• Data Manipulation steps (Sorting, filtering, duplicates, merging, appending, subsetting, derived variables, sampling, Data type conversions, renaming, formatting, etc.)
• Data manipulation tools (Operators, Functions, Packages, control structures, Loops, arrays, etc.)
• Python Built-in Functions (Text, numeric, date, utility functions)
• Python User Defined Functions
• Stripping out extraneous information
• Normalizing data
• Formatting data
• Important Python modules for data manipulation (Pandas, NumPy, re, math, string, datetime, etc.)
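
The "stripping out extraneous information" and "normalizing data" topics above can be sketched with the standard library alone. The records and the helper names `clean_name` and `min_max_normalize` are hypothetical, chosen for illustration; in practice Pandas offers equivalent vectorized operations.

```python
import re

# Hypothetical raw records (an assumption for illustration only).
raw_names = ["  Alice SMITH ", "bob   jones", "CAROL\tlee  "]
raw_scores = [40.0, 55.0, 100.0]


def clean_name(text):
    """Collapse runs of whitespace, trim both ends, and apply title case."""
    return re.sub(r"\s+", " ", text).strip().title()


def min_max_normalize(values):
    """Rescale values linearly so the smallest maps to 0 and the largest to 1."""
    low, high = min(values), max(values)
    if high == low:
        # All values are identical, so there is no range to rescale over.
        return [0.0 for _ in values]
    return [(v - low) / (high - low) for v in values]


names = [clean_name(n) for n in raw_names]
scores = min_max_normalize(raw_scores)
print(names)
print(scores)
```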

• Introduction to exploratory data analysis
• Descriptive statistics, Frequency Tables and summarization
• Univariate Analysis (Distribution of data & Graphical Analysis)
• Bivariate Analysis (Cross Tabs, Distributions & Relationships, Graphical Analysis)
• Creating Graphs - Bar/pie/line chart/histogram/boxplot/scatter/density, etc.
• Important Packages for Exploratory Analysis (NumPy Arrays, Matplotlib, seaborn, Pandas, scipy.stats, etc.)

• Basic Statistics - Measures of Central Tendencies and Variance
• Building blocks - Probability Distributions - Normal distribution - Central Limit Theorem
• Inferential Statistics - Sampling - Concept of Hypothesis Testing
• Statistical Methods - Z/t-tests (One sample, independent, paired), ANOVA, Correlations and Chi-square
• Important modules for statistical methods: NumPy, SciPy, Pandas
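
To make the one-sample t-test above concrete, the following minimal sketch computes the t statistic by hand with the standard library. The sample values and the hypothesized mean are made up for illustration; in practice a library routine such as `scipy.stats.ttest_1samp` would also return the p-value.

```python
import math
import statistics

# Hypothetical measurements and null-hypothesis mean (assumptions for illustration).
sample = [5.1, 4.9, 5.6, 5.8, 5.2, 5.4]
hypothesized_mean = 5.0

sample_mean = statistics.mean(sample)
sample_sd = statistics.stdev(sample)  # sample standard deviation (divides by n - 1)
standard_error = sample_sd / math.sqrt(len(sample))

# t = (sample mean - hypothesized mean) / standard error of the mean
t_statistic = (sample_mean - hypothesized_mean) / standard_error
degrees_of_freedom = len(sample) - 1

print(f"t = {t_statistic:.3f} with {degrees_of_freedom} degrees of freedom")
```

The statistic is then compared against the t distribution with `n - 1` degrees of freedom to decide whether to reject the null hypothesis.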

• Concept of model in analytics and how it is used?
• Common terminology used in analytics & modeling process
• Popular modeling algorithms
• Types of Business problems - Mapping of Techniques
• Different Phases of Predictive Modeling

• Need for structured exploratory data
• EDA framework for exploring the data and identifying any problems with the data (Data Audit Report)
• Identify missing data
• Identify outliers
• Visualize the data trends and patterns

• Need for Data preparation
• Consolidation/Aggregation - Outlier treatment - Flat Liners - Missing values - Dummy creation - Variable Reduction
• Variable Reduction Techniques - Factor Analysis & PCA

• Introduction to Segmentation
• Types of Segmentation (Subjective Vs Objective, Heuristic Vs. Statistical)
• Heuristic Segmentation Techniques (Value Based, RFM Segmentation and Life Stage Segmentation)
• Behavioral Segmentation Techniques (K-Means Cluster Analysis)
• Cluster evaluation and profiling - Identify cluster characteristics
• Interpretation of results - Implementation on new data

• Introduction - Applications
• Assumptions of Linear Regression
• Building Linear Regression Model
• Understanding standard metrics (Variable significance, R-square/Adjusted R-square, Global hypothesis, etc.)
• Assess the overall effectiveness of the model
• Validation of Models (Re running Vs. Scoring)
• Standard Business Outputs (Decile Analysis, Error distribution (histogram), Model equation, drivers, etc.)
• Interpretation of Results - Business Validation - Implementation on new data
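
The linear regression topics above (building the model and reading R-square) can be sketched for the single-predictor case using ordinary least squares. The data points and function names are hypothetical; this is an illustrative sketch, not the course's own implementation.

```python
def fit_simple_linear_regression(x, y):
    """Return (slope, intercept) of the least-squares line through (x, y)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(x, y, slope, intercept):
    """Share of the variance in y explained by the fitted line."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


# Hypothetical data (an assumption for illustration only).
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept = fit_simple_linear_regression(x, y)
r2 = r_squared(x, y, slope, intercept)
print(f"y = {intercept:.2f} + {slope:.2f} * x, R-square = {r2:.4f}")
```

Scoring new data then amounts to evaluating `intercept + slope * x_new` with the fitted coefficients.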

• Introduction - Applications
• Linear Regression Vs. Logistic Regression Vs. Generalized Linear Models
• Building Logistic Regression Model (Binary Logistic Model)
• Understanding standard model metrics (Concordance, Variable significance, Hosmer-Lemeshow Test, Gini, KS, Misclassification, ROC Curve, etc.)
• Validation of Logistic Regression Models (Re running Vs. Scoring)
• Standard Business Outputs (Decile Analysis, ROC Curve, Probability Cut-offs, Lift charts, Model equation, Drivers or variable importance, etc.)
• Interpretation of Results - Business Validation - Implementation on new data

• Introduction - Applications
• Time Series Components (Trend, Seasonality, Cyclicity and Level) and Decomposition
• Classification of Techniques (Pattern-based vs. Pattern-less)
• Basic Techniques - Averages, Smoothing, etc.
• Advanced Techniques - AR Models, ARIMA, etc
• Understanding Forecasting Accuracy - MAPE, MAD, MSE, etc
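
As an example of a basic averaging technique and of the MAPE accuracy measure listed above, here is a minimal sketch. The sales series, the window size, and the function names are assumptions made for illustration.

```python
def moving_average_forecast(series, window):
    """One-step-ahead forecasts: each is the mean of the previous `window` values."""
    return [
        sum(series[i - window:i]) / window
        for i in range(window, len(series))
    ]


def mape(actual, forecast):
    """Mean absolute percentage error, in percent (undefined if any actual is 0)."""
    errors = [abs((a - f) / a) for a, f in zip(actual, forecast)]
    return 100 * sum(errors) / len(errors)


# Hypothetical monthly sales (an assumption for illustration only).
sales = [100, 110, 120, 130, 140, 150]
window = 3

forecasts = moving_average_forecast(sales, window)
actuals = sales[window:]  # the values each forecast is trying to predict
error = mape(actuals, forecasts)
print(forecasts, f"MAPE = {error:.2f}%")
```

Because the series trends upward, the moving average lags behind the actual values, which is the kind of pattern-related weakness that motivates models such as ARIMA.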

• Introduction to Machine Learning & Predictive Modeling
• Types of Business problems - Mapping of Techniques - Regression vs. classification vs. segmentation vs. Forecasting
• Major Classes of Learning Algorithms - Supervised vs. Unsupervised Learning
• Different Phases of Predictive Modeling (Data Pre-processing, Sampling, Model Building, Validation)
• Overfitting (Bias-Variance Trade off) & Performance Metrics
• Feature engineering & dimension reduction
• Concept of optimization & cost function
• Overview of gradient descent algorithm
• Overview of Cross validation (Bootstrapping, K-Fold validation, etc.)
• Model performance metrics (R-square, Adjusted R-square, RMSE, MAPE, AUC, ROC curve, recall, precision, sensitivity, specificity, confusion matrix)
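
Several of the classification metrics above derive directly from the confusion matrix. The sketch below assumes binary labels with 1 as the positive class; the labels and the function name `confusion_counts` are hypothetical.

```python
def confusion_counts(actual, predicted, positive=1):
    """Return (tp, fp, tn, fn) for a binary classification problem."""
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            tp += 1
        elif p == positive:
            fp += 1
        elif a == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


# Hypothetical true labels and model predictions (assumptions for illustration).
actual = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 0, 1, 1, 0]

tp, fp, tn, fn = confusion_counts(actual, predicted)
precision = tp / (tp + fp)    # of the predicted positives, how many are correct
recall = tp / (tp + fn)       # also called sensitivity
specificity = tn / (tn + fp)  # of the actual negatives, how many are caught
accuracy = (tp + tn) / len(actual)
print(precision, recall, specificity, accuracy)
```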

• What is the segmentation & Role of ML in Segmentation?
• Concept of Distance and related math background
• K-Means Clustering
• Expectation Maximization
• Hierarchical Clustering
• Spectral Clustering and density-based clustering (DBSCAN)
• Principal component analysis (PCA)

• Decision Trees - Introduction - Applications
• Types of Decision Tree Algorithms
• Construction of Decision Trees through Simplified Examples; Choosing the "Best" attribute at each Non-Leaf node; Entropy; Information Gain, Gini Index, Chi Square, Regression Trees
• Generalizing Decision Trees; Information Content and Gain Ratio; Dealing with Numerical Variables; other Measures of Randomness
• Pruning a Decision Tree; Cost as a consideration; Unwrapping Trees as Rules
• Decision Trees - Validation
• Overfitting - Best Practices to avoid
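
Choosing the "best" attribute at a node is commonly done by comparing information gain, which is the drop in entropy after splitting on that attribute. The following minimal sketch uses a tiny made-up dataset; the attribute names and labels are assumptions for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum(
        (count / total) * math.log2(count / total)
        for count in Counter(labels).values()
    )


def information_gain(labels, feature_values):
    """Entropy of the labels minus the weighted entropy after splitting."""
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    total = len(labels)
    weighted = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - weighted


# Hypothetical training rows (assumptions for illustration only).
play = ["yes", "yes", "no", "no"]
outlook = ["sunny", "sunny", "rain", "rain"]  # separates the classes perfectly
windy = ["true", "false", "true", "false"]    # carries no information here

gain_outlook = information_gain(play, outlook)
gain_windy = information_gain(play, windy)
print(gain_outlook, gain_windy)
```

A tree-building algorithm based on this criterion would split on `outlook` first, because it has the larger gain.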

• Concept of Ensembling
• Manual Ensembling Vs. Automated Ensembling
• Methods of Ensembling (Stacking, Mixture of Experts)
• Bagging (Logic, Practical Applications)
• Random forest (Logic, Practical Applications)
• Boosting (Logic, Practical Applications)
• AdaBoost
• Gradient Boosting Machines (GBM)
• XGBoost

• Motivation for Neural Networks and Its Applications
• Perceptron and Single Layer Neural Network, and Hand Calculations
• Learning in a Multi-Layered Neural Net: Back Propagation and Conjugate Gradient Techniques
• Neural Networks for Regression
• Neural Networks for Classification
• Interpretation of Outputs and Fine tune the models with hyper parameters
• Validating ANN models

• Motivation for Support Vector Machine & Applications
• Support Vector Regression
• Support vector classifier (Linear & Non-Linear)
• Mathematical Intuition (Kernel Methods Revisited, Quadratic Optimization and Soft Constraints)
• Interpretation of Outputs and Fine tune the models with hyper parameters
• Validating SVM models
Supervised Learning: KNN

• What is KNN & Applications?
• KNN for missing-value treatment
• KNN For solving regression problems
• KNN for solving classification problems
• Validating KNN model
• Model fine tuning with hyper parameters
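
KNN classification can be sketched in a few lines: find the `k` training points closest to the query and take a majority vote. The points, labels, and the function name `knn_predict` are hypothetical; `math.dist` requires Python 3.8 or later.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest points."""
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical two-dimensional training data (assumptions for illustration).
points = [(1, 1), (1, 2), (2, 1), (6, 6), (7, 6), (6, 7)]
labels = ["a", "a", "a", "b", "b", "b"]

prediction_near_a = knn_predict(points, labels, (2, 2), k=3)
prediction_near_b = knn_predict(points, labels, (6, 5), k=3)
print(prediction_near_a, prediction_near_b)
```

The value of `k` is the main hyperparameter: small values follow the training data closely, while larger values smooth the decision boundary.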

• Concept of Conditional Probability
• Bayes Theorem and Its Applications
• Naïve Bayes for classification
• Applications of Naïve Bayes in Classifications
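
Bayes' theorem, which underlies Naïve Bayes classification, can be worked through with a single word in a spam filter. All probabilities below are made-up numbers for illustration, and the function name is hypothetical.

```python
def posterior_spam(p_spam, p_word_given_spam, p_word_given_ham):
    """P(spam | word) computed with Bayes' theorem."""
    p_ham = 1 - p_spam
    # Total probability of seeing the word in any message.
    p_word = p_word_given_spam * p_spam + p_word_given_ham * p_ham
    return p_word_given_spam * p_spam / p_word


# Hypothetical probabilities (assumptions for illustration only).
p = posterior_spam(p_spam=0.2, p_word_given_spam=0.6, p_word_given_ham=0.05)
print(f"P(spam | word) = {p:.2f}")
```

A Naïve Bayes classifier applies the same idea to many words at once, under the simplifying assumption that the words are conditionally independent given the class.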

• Taming big text; Unstructured vs. Semi-structured Data; Fundamentals of information retrieval; Properties of words; Creating Term-Document (TxD) Matrices; Similarity measures; Low-level processes (Sentence Splitting; Tokenization; Part-of-Speech Tagging; Stemming; Chunking)
• Finding patterns in text: text mining, text as a graph
• Natural Language processing (NLP)
• Text Analytics – Sentiment Analysis using Python
• Text Analytics – Word cloud analysis using Python
• Text Analytics - Segmentation using K-Means/Hierarchical Clustering
• Text Analytics - Classification (Spam/Not spam)
• Applications of Social Media Analytics
• Metrics (Measures, Actions) in social media analytics
• Examples & Actionable Insights using Social Media Analytics
• Important Python modules for Machine Learning (scikit-learn, statsmodels, SciPy, NLTK, etc.)
• Fine-tuning the models using hyperparameters, grid search, pipelines, etc.
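
The tokenization, term-document matrix, and similarity-measure topics above can be sketched with the standard library. The documents are made up, and the simple regular-expression tokenizer is an assumption; libraries such as NLTK and scikit-learn provide more capable tools.

```python
import math
import re
from collections import Counter

# Hypothetical documents (assumptions for illustration only).
documents = [
    "Free prize! Claim your free prize now.",
    "Team meeting moved to Monday.",
    "Claim your prize at the meeting.",
]


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine_similarity(u, v):
    """Cosine of the angle between two count vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


counts = [Counter(tokenize(doc)) for doc in documents]
vocabulary = sorted(set().union(*counts))

# One row per document, one column per vocabulary term.
matrix = [[c[term] for term in vocabulary] for c in counts]

similarity_0_1 = cosine_similarity(matrix[0], matrix[1])
similarity_0_2 = cosine_similarity(matrix[0], matrix[2])
print(vocabulary)
print(similarity_0_1, similarity_0_2)
```

The first and third documents share several terms, so they score higher than the first and second, which share none.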

• While loop
• If statement
• For loop
• Arithmetic operations

• Correlation
• Linear Regression
• Non-Linear Regression
• Predictive time series forecasting
• K-means clustering
• P-value
• Finding outliers
• Neural Network
• Error Measure
• Overview of R Shiny

• What is Hadoop
• Integration of Hadoop in R
• Data Mining using R
• Introduction to clinical research in R
• API in R (Twitter and Facebook)
• Word Cloud in R

Data Science with Python Training FAQs

This field deals intensively with mathematics for the analysis of data and algorithms, so a decent mathematical background is necessary, and engineers usually have one. If you studied mathematics at school and covered engineering mathematics (such as probability, statistics, and linear algebra), you are good to go; you may just need to brush up on these topics.
The course assumes a working knowledge of key data science topics (statistics, machine learning, and general data analytic methods). Programming experience in some language (such as R, MATLAB, SAS, Mathematica, Java, C, C++, VB, or FORTRAN) is expected. In particular, participants need to be comfortable with general programming concepts like variables, loops, and functions. Experience with Python is helpful (but not required).
Absolutely. Our training materials work with any Python distribution (such as Anaconda), as long as you also have all of the necessary packages, a text or code editor, package manager, interactive IPython shell, and Jupyter notebooks installed.
Machine learning and data science has been called the ‘hottest job of the 21st century’. If you learn the material in this course well, you will be well prepared to impress interviewers.
Even if you do not meet all the prerequisites, we will help you cover every topic in detail and provide an overview before diving deep into machine learning and data science. Python is a relatively easy language to learn, and you can pick up the basics quickly, so you will have ample time before the course to learn or brush up on the fundamentals.
This course is aligned with current industry requirements and gives exposure to the latest techniques and tools. The curriculum is designed by specialists in this field and is monitored and improved by industry practitioners on a continual basis.


Mentoring Mode    Not available
  • Trainer Support
  • Self-paced Videos
  • Exercises & Project Work
  • Get certified & Job Assistance
  • Flexible Schedule
  • 24 x 7 Lifetime Support & Access
Self-Paced     Not available
  • Free Mock Interview
  • Certification Assistance
  • Resume Assistance
  • Lifetime Access and 24x7