Database Engineering & Data Systems Topics
Database design patterns, optimization, scaling strategies, storage technologies, data warehousing, and operational database management. Covers database selection criteria, query optimization, replication strategies, distributed databases, backup and recovery, and performance tuning at the database layer. Distinct from Systems Architecture (which addresses service-level distribution) and Data Science (which addresses analytical approaches).
SQL Scenarios
Advanced SQL query design and optimization scenarios, including complex joins, subqueries, window functions, common table expressions (CTEs), set operations, indexing strategies, explain plans, and performance considerations across relational databases.
Advanced SQL: Window Functions & CTEs for Complex Analysis
Advanced SQL techniques using window functions (ROW_NUMBER, RANK, DENSE_RANK, etc.) and common table expressions (CTEs), including recursive queries, for complex data analysis, ranking and analytics patterns, cumulative totals, and multi-step data transformations within relational databases and data warehousing contexts.
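As a minimal sketch of the ranking and cumulative-total patterns named above, the following uses Python's built-in sqlite3 module (assuming a bundled SQLite of 3.25 or later, which is when window functions were added). The `sales` table, its columns, and the sample rows are hypothetical and exist only for illustration. It contrasts ROW_NUMBER, RANK, and DENSE_RANK on tied values and adds a running total per region.

```python
import sqlite3

# Hypothetical sample data: the `sales` table and its columns are assumptions
# made for this illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales (rep TEXT, region TEXT, amount INTEGER);
INSERT INTO sales VALUES
  ('ana', 'east', 300), ('bo', 'east', 300), ('cy', 'east', 100),
  ('di', 'west', 200), ('ed', 'west', 150);
""")

query = """
WITH regional AS (          -- CTE: names an intermediate step
  SELECT rep, region, amount FROM sales
)
SELECT
  rep,
  region,
  -- ROW_NUMBER needs a tiebreaker (rep) to be deterministic on ties
  ROW_NUMBER() OVER (PARTITION BY region ORDER BY amount DESC, rep) AS row_num,
  -- RANK leaves a gap after ties; DENSE_RANK does not
  RANK()       OVER (PARTITION BY region ORDER BY amount DESC)      AS rnk,
  DENSE_RANK() OVER (PARTITION BY region ORDER BY amount DESC)      AS dense_rnk,
  -- Cumulative total within each region, in row_num order
  SUM(amount)  OVER (PARTITION BY region ORDER BY amount DESC, rep
                     ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS running_total
FROM regional
ORDER BY region, row_num
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

In the east region the two tied reps both get rank 1, the next rep gets RANK 3 but DENSE_RANK 2, and ROW_NUMBER stays unique because of the explicit tiebreaker. Other engines accept the same window syntax, though defaults for window frames are worth checking per database.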
Query Optimization and Execution Plans
Focuses on diagnosing slow queries and reducing execution cost through analysis of query execution plans and systematic query rewrites. Candidates should be able to read and interpret EXPLAIN output and execution plans, including identifying expensive operators such as sequential table scans, index scans, sorts, nested loop joins, hash joins, and merge joins, and explaining why those operators appear. Core skills include cost and cardinality estimation; understanding join order and predicate placement; predicate pushdown and selectivity reasoning; comparing EXISTS versus IN versus JOIN patterns; and identifying common anti-patterns such as N+1 queries. The topic covers profiling and benchmarking approaches using EXPLAIN ANALYZE and runtime statistics, comparing estimated and actual row counts, proposing and validating query rewrites and configuration or schema changes, and reasoning about trade-offs when using materialized views, caching, denormalization, or partitioning to improve performance. Candidates should present step-by-step approaches to diagnose problems, measure improvements, and assess impact on other workloads.
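The diagnose-then-validate loop described above can be sketched with SQLite's EXPLAIN QUERY PLAN, used here only as a lightweight stand-in for a server database's EXPLAIN or EXPLAIN ANALYZE; its output is much coarser and reports no cost or row estimates. The `orders` table, the `idx_orders_customer` index, and the `plan` helper are hypothetical names chosen for this example.

```python
import sqlite3

# Hypothetical schema and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(i % 100, float(i)) for i in range(1000)],
)

def plan(sql):
    """Return the human-readable 'detail' column of EXPLAIN QUERY PLAN."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

sql = "SELECT id, total FROM orders WHERE customer_id = 42"

# Step 1: inspect the plan before any change (expect a full table scan).
before = plan(sql)

# Step 2: apply a candidate fix, here an index on the filter column.
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")

# Step 3: re-inspect the plan to confirm the optimizer uses the index.
after = plan(sql)

# Step 4: confirm the rewrite or schema change did not alter the result.
matching = conn.execute(sql).fetchall()

print("before:", before)
print("after: ", after)
print("rows returned:", len(matching))
```

The exact plan wording varies by SQLite version, but the first plan reports a scan of `orders` and the second a search using the new index. On PostgreSQL or a similar engine, the same loop would compare estimated and actual row counts from EXPLAIN ANALYZE, and would also weigh the index's write cost on other workloads.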
Advanced Querying with Structured Query Language
Covers authoring correct, maintainable, and high-quality Structured Query Language statements for analytical and transactional problems. Candidates should demonstrate writing SELECT, INSERT, UPDATE, and DELETE statements and using filtering, grouping, ordering, and aggregation correctly. Emphasis is on complex query constructs and patterns such as multi-table joins and join condition logic; self joins for hierarchical data; nested and correlated subqueries; common table expressions, including recursive common table expressions; window functions such as ROW_NUMBER, RANK, DENSE_RANK, LAG, and LEAD; set operations like UNION and UNION ALL; and techniques for calculating running totals, moving averages, cohort metrics, and consecutive event detection. Candidates should be able to break down and refactor complex requirements into composable queries for readability and maintainability while reasoning about performance implications on large data sets. Senior expectations may include mentoring on best practices for query composition and understanding how schema and configuration choices influence query performance.
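Two of the patterns listed above, a recursive common table expression over hierarchical data and consecutive event detection, can be sketched as follows using Python's built-in sqlite3 module. The `employees` and `logins` tables, their columns, and the integer `day` values are hypothetical simplifications. The streak query uses the common "gaps and islands" approach, which is one of several ways to detect consecutive events.

```python
import sqlite3

# Hypothetical tables and rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, manager_id INTEGER);
INSERT INTO employees VALUES
  (1, 'root', NULL), (2, 'a', 1), (3, 'b', 1), (4, 'c', 2);

CREATE TABLE logins (user_id TEXT, day INTEGER);
INSERT INTO logins VALUES
  ('u1', 1), ('u1', 2), ('u1', 3), ('u1', 5), ('u1', 6), ('u2', 10);
""")

# Recursive CTE: start at the root (no manager) and walk down the tree,
# tracking depth at each level.
hierarchy = conn.execute("""
WITH RECURSIVE chain(id, name, depth) AS (
  SELECT id, name, 0 FROM employees WHERE manager_id IS NULL
  UNION ALL
  SELECT e.id, e.name, c.depth + 1
  FROM employees AS e
  JOIN chain AS c ON e.manager_id = c.id
)
SELECT name, depth FROM chain ORDER BY depth, name
""").fetchall()

# Gaps and islands: within a run of consecutive days, day minus its
# row number is constant, so that difference identifies each streak.
streaks = conn.execute("""
WITH numbered AS (
  SELECT user_id, day,
         day - ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY day) AS grp
  FROM logins
)
SELECT user_id, MIN(day) AS start_day, MAX(day) AS end_day, COUNT(*) AS length
FROM numbered
GROUP BY user_id, grp
ORDER BY user_id, start_day
""").fetchall()

print(hierarchy)
print(streaks)
```

Splitting each requirement into a named CTE step keeps the final SELECT short and makes each stage testable on its own. With real calendar dates, the subtraction would use the engine's date arithmetic rather than integer days, and duplicate events per day would need deduplicating first.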
Storage Services and Data Management
Know the primary storage options. Object storage (S3, Azure Blob, GCS) holds unstructured data at scale and is highly available and cost-effective. Block storage (EBS, Azure Managed Disks) provides VM storage optimized for IOPS and throughput. Databases divide into relational (RDS, Azure SQL, Cloud SQL) for structured data with relationships, and NoSQL (DynamoDB, Cosmos DB, Firestore) for flexible schemas and scale. Understand the durability and consistency models of each, and know when to use each storage type based on data characteristics and access patterns.
Technology Selection & Deep Technical Knowledge
Deep understanding of specific technologies relevant to complex system design. Master databases (PostgreSQL, Cassandra, DynamoDB, Elasticsearch), message queues (Kafka, RabbitMQ), caching systems (Redis), search engines, and frameworks. Understand their strengths, weaknesses, trade-offs, operational characteristics, scaling patterns, and common pitfalls. Be able to justify technology choices based on specific system requirements.