Database Engineering & Data Systems Topics

Database design patterns, optimization, scaling strategies, storage technologies, data warehousing, and operational database management. Covers database selection criteria, query optimization, replication strategies, distributed databases, backup and recovery, and performance tuning at the database layer. Distinct from Systems Architecture (which addresses service-level distribution) and Data Science (which addresses analytical approaches).

Database Performance and Query Optimization

Evaluate the ability to identify and remediate database performance bottlenecks, including the N+1 query problem and expensive queries. Candidates should explain how to discover problematic queries through query plan inspection and profiling, and propose remedies such as appropriate indexing, query rewriting to use set-based operations or joins, request batching and eager loading, pagination strategies, caching and denormalization when appropriate, and the trade-offs of read replicas or sharding. Interviewers expect discussion of measurement, monitoring, and the operational costs and consistency trade-offs introduced by each optimization.
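The N+1 pattern and its join-based rewrite can be illustrated with a minimal sketch. It uses Python's built-in `sqlite3` module with an in-memory database; the `authors` and `books` tables and both function names are hypothetical, chosen only for illustration.

```python
import sqlite3

# Hypothetical schema and data, used only to illustrate the pattern.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE books (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
    INSERT INTO authors VALUES (1, 'Ada'), (2, 'Grace');
    INSERT INTO books VALUES (1, 1, 'Notes'), (2, 1, 'Sketches'), (3, 2, 'Compilers');
""")


def titles_n_plus_one(conn):
    """One query for the authors, then one more query per author (N+1)."""
    queries = 0
    result = {}
    authors = conn.execute("SELECT id, name FROM authors ORDER BY id").fetchall()
    queries += 1
    for author_id, name in authors:
        rows = conn.execute(
            "SELECT title FROM books WHERE author_id = ? ORDER BY id",
            (author_id,),
        ).fetchall()
        queries += 1
        result[name] = [title for (title,) in rows]
    return result, queries


def titles_joined(conn):
    """The same data fetched with a single set-based query using a join."""
    result = {}
    rows = conn.execute("""
        SELECT a.name, b.title
        FROM authors AS a
        LEFT JOIN books AS b ON b.author_id = a.id
        ORDER BY a.id, b.id
    """).fetchall()
    for name, title in rows:
        result.setdefault(name, [])
        if title is not None:  # LEFT JOIN yields NULL for authors without books
            result[name].append(title)
    return result, 1
```

Both functions return the same mapping, but the first issues one query per author on top of the initial query, so its round trips grow with the number of authors, while the second always issues one.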

0 questions

Database and Query Basics

Focuses on foundational database knowledge and simple query skills, including writing basic Structured Query Language select statements, performing joins, filtering and aggregation, understanding transactions and constraints, recognizing connectivity problems, and knowing when queries are appropriate versus when to escalate. Interviewers also assess awareness of basic performance considerations and safe practices for querying production data.
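As a small illustration of transactions and constraints working together, the sketch below uses Python's built-in `sqlite3` module. The `accounts` table, its `CHECK` constraint, and the `transfer` helper are assumptions made for the example, not part of any particular system.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE accounts (
        id INTEGER PRIMARY KEY,
        balance INTEGER NOT NULL CHECK (balance >= 0)  -- no overdrafts allowed
    );
    INSERT INTO accounts VALUES (1, 100), (2, 50);
""")


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst atomically; return False if a constraint fails."""
    try:
        # `with conn` commits on success and rolls back if an exception escapes.
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected the debit; the earlier credit was rolled back too.
        return False


def balances(conn):
    """Return the current balances keyed by account id."""
    return dict(conn.execute("SELECT id, balance FROM accounts ORDER BY id"))
```

The credit is deliberately applied before the debit so that a failed transfer shows the rollback undoing work that had already succeeded inside the transaction.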

0 questions

Databases and Data Persistence

Thorough knowledge of how to store, model, and protect application data at scale. Evaluate trade-offs between relational and non-relational databases and select appropriate store types such as key-value stores, document databases, column-oriented stores, and graph databases based on access patterns. Design schemas and data models that balance normalization and denormalization, choose effective indexing strategies, and understand partitioning and sharding approaches for scaling. Explain transaction semantics and guarantees including atomicity, consistency, isolation, and durability (ACID), and explain trade-offs between strong consistency and eventual consistency in distributed deployments. Cover replication, leader election, multi-region deployments, failover strategies, and mitigation of replication lag. Discuss query optimization and execution plans, secondary indexing, caching versus persistent storage, backup and restore strategies including point-in-time recovery, schema migration techniques, retention and archiving, and handling common failure scenarios and data corruption. Be able to justify database choice and design decisions with respect to latency, throughput, availability, consistency, cost, and operational complexity.
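One of the simplest sharding approaches, hash-based routing, can be sketched as follows. The function name `shard_for` and the choice of SHA-256 are illustrative assumptions; the sketch only shows how a key maps deterministically to a shard.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index in [0, num_shards) using a stable hash.

    A cryptographic digest is used instead of Python's built-in hash(),
    which is randomized per process for strings and so is not stable
    across restarts or machines.
    """
    if num_shards <= 0:
        raise ValueError("num_shards must be positive")
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards
```

Plain modulo routing remaps most keys whenever `num_shards` changes, which is one reason designs that expect to reshard often consider consistent hashing or directory-based lookup instead.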

0 questions

Database Design and Query Optimization

Principles of database schema design and performance optimization, including relational and non-relational trade-offs, normalization and denormalization, indexing strategies and index types, clustered and non-clustered indexes, query execution plans, common table expressions for readable complex queries, detecting missing or redundant indexes, sharding and partitioning strategies, and consistency and availability trade-offs. Candidates should demonstrate knowledge of optimizing reads and writes, diagnosing slow queries, and selecting the appropriate database model for scale and consistency requirements.
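A minimal sketch of detecting a missing index through query plan inspection, assuming SQLite via Python's built-in `sqlite3` module. The `orders` table, the index name, and the `plan_details` helper are hypothetical; other engines expose plans through their own `EXPLAIN` variants with different output.

```python
import sqlite3


def plan_details(conn, sql, params=()):
    """Return the human-readable detail column of SQLite's EXPLAIN QUERY PLAN."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[3] for row in rows]  # the fourth column holds the plan text


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(i % 100, i * 1.5) for i in range(1000)],
)

query = "SELECT id, total FROM orders WHERE customer_id = ?"

# Without an index on customer_id, the plan reports a full table scan.
plan_before = plan_details(conn, query, (42,))

# After adding the index, the plan reports an index search instead.
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
plan_after = plan_details(conn, query, (42,))
```

The exact plan wording differs between SQLite versions, so the sketch only relies on the scan-versus-index distinction rather than a specific string.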

0 questions

Database Performance Tuning and Scaling

Addresses database system-level performance and scaling strategies and how they interact with query design. Candidates should describe approaches for identifying and resolving database-level bottlenecks, including slow query diagnosis using logs and profiling, instrumenting metrics, and establishing baselines and targets for latency and throughput. Topics include caching strategies, materialized views, partitioning and sharding, replication and read replica architectures, connection management, and improving cache utilization, as well as trade-offs when denormalizing a schema or adopting alternative data models. Candidates should be able to propose step-by-step remediation plans, measure the impact of changes, and reason about operational concerns such as index maintenance windows, monitoring, and capacity planning.
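Establishing a latency baseline and measuring the impact of a change can be sketched with a simple percentile comparison. The nearest-rank method, the 10% tolerance, and the function names `percentile` and `regressed` are illustrative assumptions rather than a standard.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile of the samples, with p in (0, 100]."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


def regressed(baseline_ms, current_ms, p=95, tolerance=0.10):
    """Return True if the current p-th percentile exceeds the baseline by more than tolerance."""
    return percentile(current_ms, p) > percentile(baseline_ms, p) * (1 + tolerance)
```

Comparing a high percentile rather than the mean keeps tail latency visible, since averages tend to hide the slow requests that users actually notice.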

0 questions

Storage and Database Infrastructure

Covers storage concepts such as SSDs versus HDDs, RAID configurations, and storage protocols, along with database troubleshooting basics, replication concepts, backup and recovery strategies, query performance and index behavior, and storage at scale.

0 questions

Data Consistency and Recovery

Covers the spectrum of data consistency models used in distributed systems and the operational practices for detecting and recovering from inconsistency. Topics include strong consistency guarantees provided by ACID-style transactions (atomicity, consistency, isolation, and durability) and synchronous replication, and weaker models such as eventual consistency and causal consistency, along with their read guarantees such as read-your-writes and monotonic reads. Explain the trade-offs between consistency, availability, and latency and how those trade-offs influence architecture decisions, user experience, and cost. Discuss replication strategies including synchronous replication, asynchronous replication, and read replicas, and how replication modes affect staleness and failure behavior. Include coordination and consensus mechanisms for achieving stronger guarantees, for example leader-based replication and consensus protocols, and distributed transaction approaches such as two-phase commit. Cover operational concerns: how consistency choices change testing, deployment, monitoring, and incident response. Describe detection and recovery techniques for inconsistency such as validation checks, reconciliation and anti-entropy processes, tombstones and conflict resolution strategies, use of vector clocks or conflict-free replicated data types to resolve concurrent updates, point-in-time recovery and backups, and procedures for partial repairs, rollbacks, and replays. At senior levels, also address how consistency decisions shape runbooks, alerting, and post-incident analysis.
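The role of vector clocks in detecting concurrent updates can be shown with a minimal sketch. Clocks are represented here as plain dictionaries from a node name to a counter; the function names `compare` and `merge` are hypothetical, and a real system would attach these clocks to stored values and apply its own conflict resolution policy.

```python
from typing import Dict

VectorClock = Dict[str, int]


def compare(a: VectorClock, b: VectorClock) -> str:
    """Classify the causal relationship between two vector clocks."""
    nodes = set(a) | set(b)
    # Missing entries count as zero: that node has not been observed yet.
    a_le_b = all(a.get(n, 0) <= b.get(n, 0) for n in nodes)
    b_le_a = all(b.get(n, 0) <= a.get(n, 0) for n in nodes)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"      # a happened before b
    if b_le_a:
        return "after"       # a happened after b
    return "concurrent"      # neither dominates: a conflict to resolve


def merge(a: VectorClock, b: VectorClock) -> VectorClock:
    """Element-wise maximum, used after reconciling two versions."""
    return {n: max(a.get(n, 0), b.get(n, 0)) for n in set(a) | set(b)}
```

Only the "concurrent" result signals a true conflict; "before" and "after" mean one version supersedes the other and can be resolved without application-level merging.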

0 questions