Overview
Find out about the range of statistical functions and techniques that are available for analyzing data stored within a PostgreSQL database.
In this course, database expert Adam Wilbert introduces a number of advanced query techniques that you can use to better understand your data. Statistical analysis starts with understanding the shape and size of the full dataset; grouping the data then yields deeper insights through useful summary metrics. Adam shows you how to create basic groups and apply aggregate calculations, then moves into window functions that create subgroups for more granular analysis. He goes over statistics that are based on sorted data within groups, such as the median value, the first and third quartiles of a dataset, the most frequent value, and more. Adam also covers ranking, hypothetical sets, percentile functions, and conditional expressions for further manipulating query result sets. He concludes with some additional querying techniques that you may find useful in solving common problems.
Syllabus
Introduction
- Gain additional insights from your PostgreSQL data
- What you should know
- Using the exercise files
1. Obtain Summary Statistics by Grouping Rows
- Using GROUP BY to aggregate data rows
- Obtain general-purpose aggregate statistics
- Evaluate columns with Boolean aggregates
- Find the standard deviation and variance of a dataset
- Include overall aggregates with ROLLUP
- Return all possible combinations of groups with CUBE
- Segmenting groups with aggregate filters
- Challenge: Group statistics
- Solution: Group statistics
2. Use Window Functions to Perform Calculations across Row Sets
- Create a window function with an OVER clause
- Partition rows within a window
- Streamline partition queries with a WINDOW clause
- Ordering data within a partition
- Calculate a moving average with a sliding window
- Return values at specific locations within a window
- Challenge: Leverage window functions
- Solution: Leverage window functions
3. Statistics Based on Sorted Data within Groups
- Calculate the median value of a dataset
- Calculate the first and third quartiles of a dataset
- Find the most frequent value within a dataset with MODE
- Determine the range of values within a dataset
- Challenge: Retrieve statistics of a dataset with groups
- Solution: Retrieve statistics of a dataset with groups
4. Ranking Data with Windows and Hypothetical Sets
- Rank rows with a window function
- Find a hypothetical rank
- View top performers with percentile ranks
- Evaluate probability with cumulative distribution
- Challenge: Evaluate rankings within a dataset
- Solution: Evaluate rankings within a dataset
5. Define Output Values with Conditional Expressions
- Define values with CASE statements
- Merge columns with COALESCE
- Convert values to null with NULLIF
6. Additional Querying Techniques for Common Problems
- Output row numbers with query results
- Cast values to a different data type
- Move rows within a result with LEAD and LAG
- Use an IN function with a subquery
- Define WHERE criteria with a series
- Challenge: Calculations across rows
- Solution: Calculations across rows
Conclusion
- Next steps
Taught by
Adam Wilbert