The online training course has been tested and approved by the State Central Agency for Distance Learning (ZFU) in Cologne under the number 73597.
Contents
1. Basics of data analytics with Python
- Working with the Data Lab
- Basics and concepts in Python
- Presentation of the tools pandas, Matplotlib and seaborn
- Database queries with SQLAlchemy
2. Linear algebra
- Mathematical background
- Basic concepts of linear algebra
- Calculation with vectors and matrices
- Use of the Python library NumPy
3. Probability distributions
- Statistics in data science algorithms
- Discrete and continuous distributions
- Versioning code with Git
4. Supervised learning (regression)
- Use linear regression
- Use of the Python package scikit-learn (sklearn)
- Understanding regression models
- Evaluation of the forecasts
- Bias-variance trade-off and regularization
- Measurement of model quality
5. Supervised learning (classification)
- Concepts of supervised learning
- Introduction to classification algorithms
- The k-Nearest Neighbors algorithm
- Assessment of the classification performance
- Optimization of the parameters
- Division of the data into training and evaluation sets
6. Unsupervised learning (clustering)
- Concepts of unsupervised learning
- The k-Means algorithm
- Evaluation of performance metrics
- Alternatives to k-means clustering
7. Unsupervised learning (dimensionality reduction)
- Reduce dimensions in the data view
- Principal Component Analysis (PCA)
- Create uncorrelated features from original data
- Introduction to feature engineering
8. Identify and exclude outliers
- Methods for detecting outliers
- Criteria for unusual data points
- Robust measurements and reduction of influences due to outliers
9. Collect and merge data
- Read data from web pages and PDF documents
- Use of regular expressions
- Structure text data before processing
10. Logistic regression
- Concepts of logistic regression
- Performance metrics for evaluation
- Using non-numerical data in models
11. Decision trees and random forests
- The concept of decision trees
- Combine several models into ensembles
- Methods for improving the quality of predictions
12. Support vector machines
- Use of Support Vector Machines (SVM)
- Introduction to Natural Language Processing (NLP)
- Text classification with bag-of-words models
13. Neural networks
- Basics of artificial neural networks (ANN)
- The basics of deep learning
- Deeper understanding of the layers in ANN
14. Visualization and model interpretation
- Derive and present the functionality of models
- Methods for interpretation and visualization
- Apply model-agnostic methods
15. Use distributed databases
- Using the Python package PySpark
- Read data from distributed databases
- Basics of big data analysis
- Using machine learning algorithms in distributed systems
16. Exercise project
- Work independently on a comprehensive exercise project
- Solve prediction problem using a larger data set
- Preparation for the final project
17. Final project
- Independent analysis of the data project
- Presentation of results and 1:1 feedback meeting with mentoring team
- Obtaining the certificate as a Data Scientist with Python
How do you learn in the course?
This online course offers you a particularly practice-oriented learning concept with comprehensive self-study units and a team of mentors who are available to you at all times. A new chapter is activated for you every week. With a time budget of around 6 hours per week, you are sure to reach your goal in 17 weeks. This is how you learn in the course:
Placement test: In an onboarding meeting at the beginning of the course, you and the mentoring team will determine what knowledge you already have and which parts of the course you should pay particular attention to. This will prepare you optimally for learning in the self-study units.
Data Lab: In the course's learning environment, you can expect videos, interactive graphics, text and, above all, many practical exercises with comprehensive datasets and coding tasks. You carry these out directly in the browser, with no installation or configuration required and with immediate feedback on your results.
Mentor team: Your learning coaches are available to answer any questions you may have. They are experienced data analysts who will be happy to help you - via chat, audio or video call.
Webinars: Once a week, you have the opportunity to take part in webinars and immerse yourself in selected special topics of data analysis.
Career coaching: What professional goals are you pursuing with your further training and how can you achieve them? A team of mentors is available to help you achieve your career goals.
Final project: In your own data project, you will work independently through the entire data pipeline and answer typical questions. At the end, you will present your project in a 1-to-1 feedback session with your mentoring team.
Certificate: After the final project, you will receive your official certificate as a Data Scientist with Python.
This online training is provided by our partner StackFuel GmbH. StackFuel specializes in training courses on data literacy, data science and AI.
Your benefit
In this practice-oriented training course, you will learn how to carry out data analyses with large data sets independently.
You will learn how to use Python competently, how to use the programming language for data evaluation and how to create effective visualizations.
You will learn how to connect a wide variety of data sources, filter data in them and merge them.
You will learn comprehensive methods, algorithms and technologies of machine learning and how to use them with Python packages.
You will learn everything you need to know about the use of deep learning and create an artificial neural network with several layers.
After the training, you will be able to examine company data, visualize it in a meaningful way and make it interactively accessible in dynamic dashboards.
The technical entry hurdles are minimized by the use of Jupyter notebooks, with which you can carry out the programming exercises directly in the browser.
Recommended for
The online training to become a Data Scientist with Python is suitable for anyone looking for comprehensive training on machine learning and data pipelines. Basic knowledge of Python is required. The course is also suitable for career changers.
Final examination
In your own data project, you will work independently through the entire data pipeline and answer typical questions. At the end, you will present your project in a 1-to-1 feedback session with your mentoring team.