Foundations for Big Data Analysis with SQL
This training course offers a big-picture view of using SQL for big data. It covers data, database systems, and the common querying language, SQL. Learners will explore the unique characteristics of big data and SQL tools for working on big data platforms. Additionally, they will install an exercise environment and have the opportunity to explore databases and tables. This course is available in face-to-face classroom or live virtual class training formats, making it accessible to learners worldwide. By completing this course, learners will have a comprehensive understanding of SQL and its applications for big data.
per person
Level
Duration
Training Delivery Format
Face-to-face / Virtual Class
Class types
Public Class
Private Class
In-House Training
Bespoke
About this course
In this course, you’ll get a big-picture view of using SQL for big data, starting with an overview of data, database systems, and the common querying language (SQL). Then you’ll learn the characteristics of big data and SQL tools for working on big data platforms.
You’ll also install an exercise environment (virtual machine) to be used through the specialization courses, and you’ll have an opportunity to do some initial exploration of databases and tables in that environment.
This course is available as face-to-face classroom training or as a live virtual class.
Who should attend?
This course is suitable for data analysts, data scientists, data engineers, and database administrators who want to enhance their knowledge and skills in working with big data using SQL. Professionals who work with big data in industries such as finance, healthcare, marketing, and retail would also benefit from attending.
No prior experience with SQL or big data is necessary, making it accessible to a broad range of learners.
Learning Outcome
By the end of the course, you will be able to:
- distinguish operational from analytic databases, and understand how these are applied in big data;
- understand how database and table design provides structures for working with data;
- appreciate how differences in the volume and variety of data affect your choice of an appropriate database system;
- recognize the features and benefits of SQL dialects designed to work with big data systems for storage and analysis; and
- explore databases and tables in a big data platform.
Prerequisites
To use the hands-on environment for this course, you need to download and install a virtual machine and the software on which to run it. Before continuing, be sure that you have access to a computer that meets the following hardware and software requirements:
• Windows, macOS, or Linux operating system (iPads and Android tablets will not work)
• 64-bit operating system (32-bit operating systems will not work)
• 8 GB RAM or more
• 25 GB free disk space or more
• Intel VT-x or AMD-V virtualization support enabled (on Mac computers with Intel processors, this is always enabled; on Windows and Linux computers, you might need to enable it in the BIOS)
• For Windows XP computers only: You must have an unzip utility such as 7-Zip or WinZip installed (Windows XP’s built-in unzip utility will not work)
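As a rough self-check against the list above, the following Python sketch reports the processor architecture and the free disk space on the current drive. It is not part of the course materials, and the function names are hypothetical. It does not check RAM or virtualization support, because the Python standard library has no portable call for either; verify those through your operating system and BIOS settings.

```python
import platform
import shutil

MIN_FREE_DISK_GB = 25  # "25 GB free disk space or more" from the list above


def is_64bit_machine(machine: str) -> bool:
    """Heuristic: True for names commonly reported by 64-bit Intel/AMD systems."""
    return machine.lower() in {"x86_64", "amd64"}


def free_disk_gb(path: str = ".") -> float:
    """Free space, in GiB, on the drive that contains `path`."""
    return shutil.disk_usage(path).free / 1024**3


def has_enough_disk(free_gb: float, minimum_gb: float = MIN_FREE_DISK_GB) -> bool:
    """True if the free space meets the stated minimum."""
    return free_gb >= minimum_gb


machine = platform.machine()
free_gb = free_disk_gb()
print(f"Architecture: {machine or 'unknown'} (64-bit Intel/AMD: {is_64bit_machine(machine)})")
print(f"Free disk: {free_gb:.1f} GB (enough: {has_enough_disk(free_gb)})")
```

Run it from the drive where you plan to store the virtual machine, since free space is measured for the current directory's drive.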
Course Content
Module 1: Data and Databases
You’ll learn about database systems and the distinction between operational and analytic databases.
- What Is Data?
- Why Organize Data?
- What Does a DBMS Do?
- Relational Databases and SQL
- The Success of RDBMSs and SQL
- Operational and Analytic Databases
- Comparing Operational and Analytic DBs: SELECT Statements
- Comparing Operational and Analytic DBs: DML Activity
- Operational and Analytic Databases: Further Comparisons
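To give a feel for the SELECT comparison listed in this module, here is a minimal sketch. It assumes the usual distinction that operational queries fetch a few specific rows while analytic queries summarize many rows. It uses Python's built-in SQLite module purely for convenience; SQLite is not a big data platform and is not the course's exercise environment, and the `orders` table and its data are hypothetical.

```python
import sqlite3

# In-memory database with a small, made-up orders table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    "order_id INTEGER PRIMARY KEY, customer TEXT, region TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?)",
    [
        (1, "Ana", "East", 20.0),
        (2, "Ben", "West", 35.0),
        (3, "Ana", "East", 15.0),
        (4, "Cy", "West", 30.0),
    ],
)

# Operational style: look up one specific row by its key.
operational_row = conn.execute(
    "SELECT customer, amount FROM orders WHERE order_id = ?", (3,)
).fetchone()

# Analytic style: read many rows and summarize them by group.
analytic_rows = conn.execute(
    "SELECT region, COUNT(*), SUM(amount) FROM orders "
    "GROUP BY region ORDER BY region"
).fetchall()

print(operational_row)  # ('Ana', 15.0)
print(analytic_rows)    # [('East', 2, 35.0), ('West', 2, 65.0)]
conn.close()
```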
Module 2: Relational Databases and SQL
- Introducing Table Schemas
- NULL Values
- Data Types
- Primary Keys
- Foreign Keys
- Two Strategies for Database Design
- Database Normalization
- Denormalization
- Differences
- Trade-offs
- Database Transactions
- ACID
- Enforcing Business Rules: Constraints and Triggers
- Business Rules and ACID for Analytics?
Module 3: Big Data
- How Big Is Big Data?
- Distributed Storage
- Distributed Processing
- Structured Data
- Unstructured Data
- Semi-Structured Data
- Strengths of Traditional RDBMSs
- Limitations of Traditional RDBMSs
- SQL and Structured Data
- SQL and Semi-structured Data
- SQL and Unstructured Data
Module 4: SQL Tools for Big Data Analysis
- Big Data Analytic Databases (Data Warehouses)
- NoSQL: Operational, Unstructured and Semi-structured
- Non-transactional, Structured Systems
- Big Data ACID-Compliant RDBMSs
- Search Engines
- Challenges
- What We Keep
- What We Give Up
- What We Add
- Where to Store Big Data
- Coupling of Data and Metadata
At this time, this course is available for private class and in-house training only. Please contact us for any inquiries.