Database Engineering & Data Systems Topics
Database design patterns, optimization, scaling strategies, storage technologies, data warehousing, and operational database management. Covers database selection criteria, query optimization, replication strategies, distributed databases, backup and recovery, and performance tuning at the database layer. Distinct from Systems Architecture (which addresses service-level distribution) and Data Science (which addresses analytical approaches).
SQL Fundamentals and Query Writing
Comprehensive query writing skills from basic to intermediate level. Topics include SELECT and WHERE, joining tables with inner and outer joins, grouping with GROUP BY and filtering groups with HAVING, common aggregate functions such as COUNT, SUM, AVG, MIN, and MAX, ORDER BY and DISTINCT, subqueries and common table expressions, basic window functions such as ROW_NUMBER and RANK, union operations, and principles of readable and maintainable query composition. Also covers basic awareness of query execution, common performance pitfalls, and how to write correct, efficient queries that combine and summarize relational data.
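As a minimal sketch of how a join, grouping, and aggregation fit together in one query, the example below uses Python's built-in sqlite3 module with an in-memory database. The customers and orders tables and their rows are hypothetical, invented only for illustration.

```python
import sqlite3

# Hypothetical schema and data, used only to illustrate the query shape.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'Ada'), (2, 'Grace'), (3, 'Edsger');
    INSERT INTO orders VALUES (1, 1, 50.0), (2, 1, 70.0), (3, 2, 20.0);
""")

# LEFT JOIN keeps customers that have no orders. COUNT(o.id) skips the NULLs
# produced for those customers, and COALESCE turns their NULL sum into 0.
summary = conn.execute("""
    SELECT c.name,
           COUNT(o.id)                AS order_count,
           COALESCE(SUM(o.amount), 0) AS total_spent
    FROM customers AS c
    LEFT JOIN orders AS o ON o.customer_id = c.id
    GROUP BY c.id, c.name
    ORDER BY total_spent DESC, c.name
""").fetchall()

for name, order_count, total_spent in summary:
    print(name, order_count, total_spent)
```

Swapping LEFT JOIN for an inner JOIN would silently drop the customer with no orders, which is a common source of incorrect summaries.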
Query Optimization and Execution Plans
Focuses on diagnosing slow queries and reducing execution cost by analyzing query execution plans and applying systematic query rewrites. Candidates should be able to read and interpret EXPLAIN output and execution plans, including identifying potentially expensive operators such as sequential table scans, index scans, sorts, nested loop joins, hash joins, and merge joins, and explaining why those operators appear. Core skills include cost and cardinality estimation; understanding join order and predicate placement; predicate pushdown and selectivity reasoning; comparing EXISTS, IN, and JOIN patterns; and identifying common anti-patterns such as N+1 queries. The topic covers profiling and benchmarking approaches using EXPLAIN ANALYZE and runtime statistics; comparing estimated and actual row counts; proposing and validating query rewrites and configuration or schema changes; and reasoning about trade-offs when using materialized views, caching, denormalization, or partitioning to improve performance. Candidates should present step-by-step approaches to diagnose problems, measure improvements, and assess impact on other workloads.
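One way to practice reading plans is to compare the plan for the same query before and after a schema change. The sketch below assumes SQLite, whose EXPLAIN QUERY PLAN statement reports the chosen access path but not actual row counts (PostgreSQL's EXPLAIN ANALYZE would be needed for estimated versus actual rows). The orders table, its data, and the index name idx_orders_customer are hypothetical.

```python
import sqlite3

# Hypothetical table with 1000 rows spread over 100 customers.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, amount) VALUES (?, ?)",
    [(i % 100, float(i)) for i in range(1000)],
)

QUERY = "SELECT SUM(amount) FROM orders WHERE customer_id = 42"


def plan(sql):
    """Return the human-readable detail column of SQLite's query plan."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return [row[3] for row in rows]


# Without an index on customer_id, the filter requires a full table scan.
plan_before = plan(QUERY)

# After adding an index, the planner can search it for the matching rows.
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
plan_after = plan(QUERY)

print("before:", plan_before)
print("after: ", plan_after)
```

The exact wording of the plan text varies between SQLite versions, so it is safer to look for the operator kind (a scan versus an index search) than to match the full string. A rewrite or index should also be validated by confirming that the query still returns the same result.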
Data Aggregation and Filtering
Focuses on using query operations to filter and aggregate datasets efficiently and correctly. Candidates should demonstrate filtering rows by conditions, applying time-based filters, grouping by one or more dimensions, and using aggregate functions such as count, sum, average, minimum, and maximum. It includes correct use of pre-aggregation filters (WHERE, which removes rows before grouping) and post-aggregation filters (HAVING, which removes groups based on aggregated results), combining multiple aggregation levels, calculating distinct counts and percentiles, and composing queries that combine conditional logic and aggregation in a single statement. Performance and readability of queries, and choosing appropriate aggregation granularity for business questions, are also relevant.
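The difference between filtering before and after aggregation can be illustrated with a small sketch, again assuming SQLite through Python's sqlite3 module. The events table and its rows are hypothetical. The WHERE clause applies a time-based filter to individual rows, the CASE expressions perform conditional aggregation, and HAVING then discards whole groups.

```python
import sqlite3

# Hypothetical event data, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE events (
        user_id INTEGER, region TEXT, status TEXT, amount REAL, created_at TEXT
    );
    INSERT INTO events VALUES
        (1, 'EU',   'paid',     100.0, '2024-01-05'),
        (1, 'EU',   'refunded',  40.0, '2024-01-09'),
        (2, 'EU',   'paid',      60.0, '2024-01-20'),
        (3, 'US',   'paid',      30.0, '2024-01-11'),
        (3, 'US',   'paid',      25.0, '2023-12-30'),
        (4, 'APAC', 'paid',      10.0, '2024-01-15');
""")

rows = conn.execute("""
    SELECT region,
           COUNT(DISTINCT user_id)                                  AS buyers,
           SUM(CASE WHEN status = 'paid' THEN amount ELSE 0 END)    AS paid_total,
           SUM(CASE WHEN status = 'refunded' THEN 1 ELSE 0 END)     AS refund_count
    FROM events
    -- Pre-aggregation filter: only January 2024 rows reach GROUP BY.
    WHERE created_at >= '2024-01-01' AND created_at < '2024-02-01'
    GROUP BY region
    -- Post-aggregation filter: drop regions with less than 30 in paid amount.
    HAVING SUM(CASE WHEN status = 'paid' THEN amount ELSE 0 END) >= 30
    ORDER BY region
""").fetchall()

for row in rows:
    print(row)
```

Here the December row for the US is excluded before grouping, so it never contributes to the US total, while the APAC group is formed and then removed by HAVING. The half-open date range (>= start, < next month) avoids off-by-one mistakes at the month boundary.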
Advanced SQL: Window Functions & CTEs for Complex Analysis
Advanced SQL techniques using window functions (ROW_NUMBER, RANK, DENSE_RANK, etc.) and common table expressions (CTEs), including recursive queries, for complex data analysis, ranking and analytics patterns, cumulative totals, and multi-step data transformations within relational databases and data warehousing contexts.
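The sketch below illustrates how the three ranking functions differ on tied values and how a recursive CTE walks a hierarchy. It assumes SQLite 3.25 or later (the first version with window functions), accessed through Python's sqlite3 module. The scores and employees tables are hypothetical.

```python
import sqlite3

# Hypothetical data: two players tie at 90 points; employees form a chain.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE scores (player TEXT, points INTEGER);
    INSERT INTO scores VALUES ('ann', 90), ('bob', 90), ('cat', 80), ('dan', 70);

    CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, manager_id INTEGER);
    INSERT INTO employees VALUES (1, 'root', NULL), (2, 'mid', 1), (3, 'leaf', 2);
""")

# ROW_NUMBER is always unique, RANK leaves a gap after ties,
# and DENSE_RANK does not leave gaps.
ranked = conn.execute("""
    SELECT player,
           ROW_NUMBER() OVER (ORDER BY points DESC, player) AS row_num,
           RANK()       OVER (ORDER BY points DESC)         AS rnk,
           DENSE_RANK() OVER (ORDER BY points DESC)         AS dense_rnk
    FROM scores
    ORDER BY row_num
""").fetchall()

# Recursive CTE: start at the employee with no manager (the anchor member),
# then repeatedly join direct reports onto the rows found so far.
hierarchy = conn.execute("""
    WITH RECURSIVE chain(id, name, depth) AS (
        SELECT id, name, 0 FROM employees WHERE manager_id IS NULL
        UNION ALL
        SELECT e.id, e.name, chain.depth + 1
        FROM employees AS e
        JOIN chain ON e.manager_id = chain.id
    )
    SELECT name, depth FROM chain ORDER BY depth
""").fetchall()

print(ranked)
print(hierarchy)
```

Adding the player name as a tie-breaker in the ROW_NUMBER ordering keeps the numbering deterministic. Without it, the order of tied rows is up to the database.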
Advanced Querying with Structured Query Language
Covers authoring correct, maintainable, and high-quality Structured Query Language statements for analytical and transactional problems. Candidates should demonstrate writing SELECT, INSERT, UPDATE, and DELETE statements and using filtering, grouping, ordering, and aggregation correctly. Emphasis is on complex query constructs and patterns such as multi-table joins and join condition logic; self joins for hierarchical data; nested and correlated subqueries; common table expressions, including recursive common table expressions; window functions such as ROW_NUMBER, RANK, DENSE_RANK, LAG, and LEAD; set operations like UNION and UNION ALL; and techniques for calculating running totals, moving averages, cohort metrics, and consecutive event detection. Candidates should be able to break down and refactor complex requirements into composable queries for readability and maintainability while reasoning about performance implications on large data sets. Senior expectations may include mentoring on best practices for query composition and understanding how schema and configuration choices influence query performance.
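Running totals, moving averages, and consecutive event detection can all be expressed with window functions. The following is a sketch under stated assumptions: SQLite 3.25 or later through Python's sqlite3 module, and a hypothetical daily_sales table with one row per day and a deliberate gap on 2024-03-04. The streak query uses the common "gaps and islands" technique, in which the day number minus the row number is constant within a run of consecutive days.

```python
import sqlite3

# Hypothetical daily data; 2024-03-04 is missing on purpose.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE daily_sales (day TEXT PRIMARY KEY, amount INTEGER);
    INSERT INTO daily_sales VALUES
        ('2024-03-01', 10), ('2024-03-02', 20), ('2024-03-03', 30),
        ('2024-03-05', 40), ('2024-03-06', 50);
""")

# Running total, moving average over the current and two preceding rows
# (rows, not calendar days), and change versus the previous row via LAG.
trend = conn.execute("""
    SELECT day,
           SUM(amount) OVER (ORDER BY day) AS running_total,
           AVG(amount) OVER (
               ORDER BY day ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
           ) AS moving_avg_3,
           amount - LAG(amount) OVER (ORDER BY day) AS change
    FROM daily_sales
    ORDER BY day
""").fetchall()

# Gaps and islands: within a run of consecutive days, the integer day number
# minus the row number stays constant, so it can serve as a group key.
streaks = conn.execute("""
    WITH numbered AS (
        SELECT day,
               CAST(julianday(day) AS INTEGER)
                   - ROW_NUMBER() OVER (ORDER BY day) AS grp
        FROM daily_sales
    )
    SELECT MIN(day) AS start_day, MAX(day) AS end_day, COUNT(*) AS length
    FROM numbered
    GROUP BY grp
    ORDER BY start_day
""").fetchall()

print(trend)
print(streaks)
```

Splitting the streak logic into a named CTE keeps each step readable and testable on its own, which is the kind of composable structure the topic description asks for.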