InterviewStack.io

Database Engineering & Data Systems Topics

Database design patterns, optimization, scaling strategies, storage technologies, data warehousing, and operational database management. Covers database selection criteria, query optimization, replication strategies, distributed databases, backup and recovery, and performance tuning at the database layer. Distinct from Systems Architecture (which addresses service-level distribution) and Data Science (which addresses analytical approaches).

Database Scalability and High Availability

Architectural approaches and operational practices for scaling databases and keeping them available. Topics include vertical versus horizontal scaling trade-offs; replication topologies, leader and follower roles, read replicas, and replica lag; read/write splitting and connection pooling; sharding and partitioning strategies, including range-based, hash-based, and consistent hashing approaches; handling hot partitions and data skew; federation patterns across multiple databases; cache layers and cache invalidation; rebalancing and resharding strategies; distributed concurrency control and transactional guarantees across shards; multi-region deployment strategies, cross-region failover, and disaster recovery; monitoring, capacity planning, automation for failover and backups, and cost optimization at scale. Candidates should be able to pick scaling approaches based on read and write patterns and explain the operational complexity and trade-offs that distributed data introduces.
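As an illustration of the consistent hashing approach named above, here is a minimal sketch in Python. The class name `ConsistentHashRing`, the shard names, and the choice of 100 virtual nodes per shard are assumptions made for this example, not part of any particular database. It shows the property interviewers usually probe: when a shard is added, only the keys that land on the new shard move.

```python
import bisect
import hashlib


def _hash(value: str) -> int:
    """Map a string to a stable 64-bit integer position on the ring."""
    digest = hashlib.sha256(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class ConsistentHashRing:
    """Hypothetical ring that maps keys to shard names (illustrative only)."""

    def __init__(self, shards, vnodes=100):
        self._vnodes = vnodes      # virtual nodes per shard, smooths out skew
        self._positions = []       # sorted ring positions
        self._owners = {}          # ring position -> shard name
        for shard in shards:
            self.add_shard(shard)

    def add_shard(self, shard: str) -> None:
        for i in range(self._vnodes):
            pos = _hash(f"{shard}#{i}")
            self._owners[pos] = shard
            bisect.insort(self._positions, pos)

    def shard_for(self, key: str) -> str:
        pos = _hash(key)
        # First virtual node clockwise from the key, wrapping past the end.
        idx = bisect.bisect_right(self._positions, pos) % len(self._positions)
        return self._owners[self._positions[idx]]


ring = ConsistentHashRing(["shard-a", "shard-b", "shard-c"])
keys = [f"user:{n}" for n in range(1000)]
before = {k: ring.shard_for(k) for k in keys}

ring.add_shard("shard-d")
after = {k: ring.shard_for(k) for k in keys}

# Keys whose owner changed after the new shard joined.
moved = [k for k in keys if before[k] != after[k]]
```

With plain modulo hashing (`hash(key) % shard_count`), changing the shard count would remap most keys; here every moved key ends up on `shard-d` and the rest stay put, which is what keeps resharding traffic bounded.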

0 questions

Database Selection and Trade Offs

How to evaluate and choose data storage systems and architectures based on workload characteristics and business constraints. Coverage includes differences between relational and non-relational families such as document stores, key-value stores, wide-column stores, graph databases, time series databases, and search engines; mapping query patterns and latency requirements to storage options; trade-offs between strong consistency and eventual consistency and their impact on availability and complexity; partition key design, replication strategies, and high availability considerations; operational concerns including backups, monitoring, vendor and cost trade-offs, and migration or hybrid strategies; and when to adopt polyglot persistence. Senior-level discussion includes selecting specific managed services and reasoning about expected load patterns, failure modes, and operational burden.
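One common way to reason about the strong versus eventual consistency trade-off is the quorum rule used by Dynamo-style replicated stores: with N replicas, a read quorum of R and a write quorum of W are guaranteed to overlap when R + W > N. The sketch below assumes that simplified model; the function names are hypothetical, and real systems add caveats (sloppy quorums, clock skew, concurrent writes) that it ignores.

```python
def quorums_overlap(n: int, r: int, w: int) -> bool:
    """True when every read quorum shares at least one replica with every write quorum."""
    if not (1 <= r <= n and 1 <= w <= n):
        raise ValueError("r and w must each be between 1 and n")
    return r + w > n


def tolerated_failures(n: int, r: int, w: int) -> int:
    """Replicas that can be down while both reads and writes still reach their quorum."""
    return n - max(r, w)


# Three replicas, majority reads and writes: quorums overlap, one replica may fail.
strong_config = (3, 2, 2)

# Three replicas, single-replica reads and writes: faster and more available,
# but a read may miss the latest write.
eventual_config = (3, 1, 1)
```

The second configuration tolerates more failures and answers from a single replica, which is the availability and latency gain that eventual consistency buys at the cost of possibly stale reads.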

0 questions

Data Migration and Consistency

Plan and execute data migrations while preserving correctness and availability. Topics include zero-downtime migration techniques, schema evolution patterns, backward and forward compatibility, dual writes and shadow writes, incremental and bulk migration strategies, data validation and reconciliation, canary migrations, rollbacks and fallback plans, and how to minimize user impact during transitions. Understand the trade-offs between consistency and speed of migration, and the techniques for detecting and correcting drift after migration.
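The validation and reconciliation step can be sketched as a checksum comparison between the old and new stores. This is a minimal in-memory example under stated assumptions: rows are plain dictionaries, `id` is the primary key, and the helper names `row_checksum` and `find_drift` are hypothetical. A real migration would page through both systems rather than load everything at once.

```python
import hashlib


def row_checksum(row: dict) -> str:
    """Hash a row in a column-order-independent way."""
    canonical = "|".join(f"{k}={row[k]!r}" for k in sorted(row))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def find_drift(source_rows, target_rows, key="id"):
    """Report rows that are missing, unexpected, or different in the target."""
    source = {r[key]: row_checksum(r) for r in source_rows}
    target = {r[key]: row_checksum(r) for r in target_rows}
    shared = source.keys() & target.keys()
    return {
        "missing_in_target": sorted(source.keys() - target.keys()),
        "extra_in_target": sorted(target.keys() - source.keys()),
        "mismatched": sorted(k for k in shared if source[k] != target[k]),
    }


# Illustrative data: row 2 diverged and row 3 was never copied.
source_rows = [
    {"id": 1, "email": "a@example.com"},
    {"id": 2, "email": "b@example.com"},
    {"id": 3, "email": "c@example.com"},
]
target_rows = [
    {"id": 1, "email": "a@example.com"},
    {"id": 2, "email": "B@example.com"},
]

report = find_drift(source_rows, target_rows)
```

Running a report like this repeatedly during a dual-write period gives a concrete signal for when drift has reached zero and cutover is safe.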

0 questions

Database Fundamentals and Storage Engines

Core principles and components of data storage and persistence systems. This includes storage engine architectures and how they affect query processing and performance; transactions and their atomicity, consistency, isolation, and durability (ACID) properties; concurrency control and isolation levels; indexing strategies and how indexes affect read and write amplification; physical versus logical storage, and the characteristics of object, block, and file storage; caching layers and cache invalidation patterns; replication basics and how replication affects durability and read performance; backup and recovery techniques, including snapshots and point-in-time recovery; the trade-offs captured by consistency, availability, and partition tolerance (CAP) reasoning; compression, cost versus performance trade-offs, data retention, archival, and compliance concerns. Candidates should be able to reason about durability, persistence guarantees, operational recovery, and storage choices that affect latency, throughput, and cost.
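Atomicity, one of the transactional properties listed above, can be demonstrated with Python's built-in `sqlite3` module. The `accounts` table and the `transfer` helper are assumptions made for this sketch. The second transfer credits the destination first and then fails a `CHECK` constraint on the debit, so the whole transaction rolls back.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    " id INTEGER PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move money between accounts; both updates apply or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False  # the debit violated the CHECK constraint


def balances(conn):
    return dict(conn.execute("SELECT id, balance FROM accounts ORDER BY id"))


ok = transfer(conn, 1, 2, 30)        # succeeds
after_ok = balances(conn)

failed = transfer(conn, 1, 2, 500)   # credit applied, debit rejected, all undone
after_failed = balances(conn)
```

The balances after the failed transfer match the balances before it, even though the credit statement had already executed inside the transaction.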

0 questions

Database and Data Platform Selection

Evaluation and selection of database and data platform technologies to meet analytical and operational needs. Covers assessment of relational, non-relational, columnar, and specialized systems such as time series databases and search engines; data warehouse platforms and cloud analytics platforms; query patterns and workload characteristics; consistency and transactional guarantees; partitioning and clustering strategies; storage formats and compression; performance and scalability trade-offs; operational complexity and administration overhead; data ingestion and incremental loading patterns; pricing and cloud platform considerations; and how to choose the right solution based on data volume, concurrency, latency requirements, and total cost of ownership.
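To make the columnar and compression points above concrete, the following sketch stores the same hypothetical order data in a row layout and a column layout. The data, field names, and helper functions are assumptions for illustration; production columnar formats are far more elaborate.

```python
# Row-oriented layout: each record is stored together (good for point lookups).
rows = [
    {"order_id": 1, "region": "eu", "amount": 20},
    {"order_id": 2, "region": "eu", "amount": 35},
    {"order_id": 3, "region": "eu", "amount": 10},
    {"order_id": 4, "region": "us", "amount": 50},
    {"order_id": 5, "region": "us", "amount": 5},
]

# Column-oriented layout: each column is stored together (good for scans).
columns = {name: [row[name] for row in rows] for name in rows[0]}


def total_amount_row_store(rows):
    """Aggregate by visiting every full record, touching all fields."""
    return sum(row["amount"] for row in rows)


def total_amount_column_store(columns):
    """Aggregate by reading only the one column the query needs."""
    return sum(columns["amount"])


def run_length_encode(values):
    """Compress runs of repeated values into [value, count] pairs."""
    encoded = []
    for value in values:
        if encoded and encoded[-1][0] == value:
            encoded[-1][1] += 1
        else:
            encoded.append([value, 1])
    return encoded


row_total = total_amount_row_store(rows)
column_total = total_amount_column_store(columns)
compressed_region = run_length_encode(columns["region"])
```

Both layouts return the same total, but the columnar version reads one list instead of every record, and a sorted, low-cardinality column such as `region` collapses to a handful of runs. That is the intuition behind choosing columnar storage for analytical scans and row storage for transactional access.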

0 questions

Database Selection and Optimization

Assess and justify data storage choices and optimizations across a cloud architecture. Topics include selecting managed relational databases such as Amazon Relational Database Service (RDS) and Amazon Aurora, non-relational key-value and document stores such as Amazon DynamoDB, data warehousing solutions such as Amazon Redshift, and in-memory caches such as Amazon ElastiCache. Candidates should explain trade-offs between consistency, latency, and cost; choose row-oriented versus columnar storage for online transaction processing versus analytics; design schemas that balance normalization and denormalization; apply indexing and query plan analysis; and use partitioning or sharding to scale throughput. Also covers caching strategies, storage tiering and lifecycle policies, backup and replication modes, read replica and failover choices, capacity planning and autoscaling behavior, and operational monitoring and tuning techniques.
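Indexing and query plan analysis can be practiced locally before touching a managed service. The sketch below uses SQLite through Python's built-in `sqlite3` module as a stand-in; the `orders` table and index name are assumptions for this example, and the plan text of other engines (PostgreSQL, MySQL, Redshift) looks different even though the workflow is the same.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    " id INTEGER PRIMARY KEY,"
    " customer_id INTEGER,"
    " total INTEGER)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(n % 100, n) for n in range(1000)],
)

QUERY = "SELECT total FROM orders WHERE customer_id = ?"


def plan(conn):
    """Return SQLite's human-readable plan for QUERY."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + QUERY, (42,)).fetchall()
    return " ".join(row[3] for row in rows)  # fourth column holds the detail text


plan_before = plan(conn)  # no index on customer_id: full table scan

conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")

plan_after = plan(conn)   # the planner can now search via the new index
```

Comparing `plan_before` and `plan_after` shows the planner switching from a scan of the whole table to a search that names `idx_orders_customer`. The cost of that faster read is extra work on every write to keep the index current.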

0 questions

Database Design and Architecture

Designing and architecting databases for production and cloud environments, with attention to data modeling, schema design, and access pattern optimization. Topics include normalization and denormalization trade-offs, schema-driven versus query-driven modeling, entity and relationship design for transactional and analytical workloads, indexing and query optimization techniques, partitioning and sharding design decisions, schema evolution and migration strategies, materialized views and caching strategies, selection of storage layers for different data shapes, and practical operational runbooks for provisioning, monitoring, alerting, backups, disaster recovery, and capacity planning. Candidates should justify schema and architecture choices with respect to latency, throughput, development and operational complexity, maintainability, and cost.

0 questions

Managed Databases and Data Services

Covers choosing and operating managed database offerings and complementary cloud data services. Candidates should understand managed relational database services such as Amazon Relational Database Service (RDS) for MySQL, PostgreSQL, MariaDB, Microsoft SQL Server, and Oracle, and NoSQL document and key-value stores such as Amazon DynamoDB, Azure Cosmos DB, and Google Cloud Firestore and Datastore. Expect to explain when to choose relational versus NoSQL based on data shape, query complexity, transactional guarantees (atomicity, consistency, isolation, and durability), read and write patterns, latency, and scalability requirements. Understand scaling techniques, including vertical scaling, read replicas for read scaling, horizontal scaling via partitioning or sharding, and multi-region replication and failover strategies. Be familiar with backup and restore approaches, including snapshots, point-in-time recovery, cross-region replication, and disaster recovery planning. Know consistency models and their trade-offs, such as strong, eventual, and causal consistency, and understand provisioned capacity versus serverless autoscaling models and their cost and operational implications. Candidates should also be able to discuss performance tuning topics such as indexing, query optimization, caching, connection pooling, storage and input/output optimization, and monitoring and alerting, as well as security and compliance considerations including encryption, access control, and network isolation. Finally, be prepared to recommend a database solution given workload characteristics such as data size, read-to-write ratio, latency targets, and operational constraints.

0 questions