InterviewStack.io

Database Engineering & Data Systems Topics

Database design patterns, optimization, scaling strategies, storage technologies, data warehousing, and operational database management. Covers database selection criteria, query optimization, replication strategies, distributed databases, backup and recovery, and performance tuning at the database layer. Distinct from Systems Architecture (which addresses service-level distribution) and Data Science (which addresses analytical approaches).

Database Scalability and High Availability

Architectural approaches and operational practices for scaling databases and keeping them available. Topics include vertical versus horizontal scaling trade-offs; replication topologies, leader and follower roles, read replicas and replica lag; read-write splitting and connection pooling; sharding and partitioning strategies including range-based, hash-based, and consistent hashing approaches; handling hot partitions and data skew; federation and multi-database federation patterns; cache layers and cache invalidation; rebalancing and resharding strategies; distributed concurrency control and transactional guarantees across shards; multi-region deployment strategies, cross-region failover and disaster recovery; monitoring, capacity planning, automation for failover and backups, and cost optimization at scale. Candidates should be able to pick scaling approaches based on read and write patterns and explain the operational complexity and trade-offs introduced by distributed data.
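As a concrete reference point for the consistent hashing approach named above, here is a minimal Python sketch. The class name `HashRing`, the shard names, the choice of MD5 as the ring hash, and the default of 100 virtual nodes are all illustrative assumptions, not a prescribed design.

```python
import bisect
import hashlib


class HashRing:
    """Consistent hash ring with virtual nodes (illustrative sketch)."""

    def __init__(self, nodes=(), vnodes=100):
        self.vnodes = vnodes   # virtual nodes per physical shard (assumed default)
        self._hashes = []      # sorted positions on the ring
        self._owners = {}      # ring position -> shard name
        for node in nodes:
            self.add_node(node)

    @staticmethod
    def _hash(key: str) -> int:
        # MD5 is used only to spread keys evenly, not for security.
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def add_node(self, node: str) -> None:
        for i in range(self.vnodes):
            position = self._hash(f"{node}#{i}")
            bisect.insort(self._hashes, position)
            self._owners[position] = node

    def get_node(self, key: str) -> str:
        if not self._hashes:
            raise ValueError("ring is empty")
        # First ring position clockwise from the key, wrapping around.
        idx = bisect.bisect(self._hashes, self._hash(key)) % len(self._hashes)
        return self._owners[self._hashes[idx]]


# Usage: adding a shard should only move keys onto the new shard.
ring = HashRing(["shard-a", "shard-b", "shard-c"])
keys = [f"user:{i}" for i in range(1000)]
before = {k: ring.get_node(k) for k in keys}
ring.add_node("shard-d")
after = {k: ring.get_node(k) for k in keys}
moved = [k for k in keys if before[k] != after[k]]
```

In this sketch every key whose owner changes lands on the newly added shard, which is the property that makes resharding cheaper than with plain modulo hashing. Virtual nodes are one common way to reduce skew between shards; they do not by themselves solve hot keys.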

0 questions

Database Architecture and Optimization

Designing and tuning data storage systems to meet requirements for availability, latency, throughput, and cost. Topics include choosing between managed relational services and NoSQL key-value or document stores, data modelling and schema design, partitioning and sharding strategies, replication and read replica patterns, indexing and query optimization, transaction and consistency trade-offs, connection pooling and resource management, caching and cache invalidation strategies, backup and retention policies, capacity planning and monitoring, and approaches for migrating or scaling databases in production. Candidates should be able to discuss concrete techniques for improving performance, diagnosing slow queries, and balancing operational complexity against performance and cost.
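One of the caching and cache invalidation strategies mentioned above is cache-aside with invalidate-on-write. The following is a minimal Python sketch of that pattern; the class `CacheAside`, its callback parameters, and the 60-second TTL are hypothetical, and an in-memory dict stands in for the database.

```python
import time


class CacheAside:
    """Cache-aside reads with invalidate-on-write (illustrative names)."""

    def __init__(self, load_from_db, write_to_db, ttl_seconds=60.0):
        self._load = load_from_db
        self._write = write_to_db
        self._ttl = ttl_seconds
        self._entries = {}  # key -> (value, expires_at)

    def get(self, key):
        entry = self._entries.get(key)
        if entry is not None and entry[1] > time.monotonic():
            return entry[0]  # cache hit
        value = self._load(key)  # miss or expired: read from the database
        self._entries[key] = (value, time.monotonic() + self._ttl)
        return value

    def put(self, key, value):
        self._write(key, value)       # update the source of truth first
        self._entries.pop(key, None)  # then drop the stale cached copy


# Usage: a dict stands in for the database (assumption for the sketch).
db = {"user:1": "Ada"}
reads = []


def load(key):
    reads.append(key)  # record every database read
    return db[key]


cache = CacheAside(load, db.__setitem__)
cache.get("user:1")
cache.get("user:1")            # served from the cache, no database read
cache.put("user:1", "Grace")   # writes to the database and invalidates
latest = cache.get("user:1")   # reloads the fresh value
```

Invalidating rather than updating the cached entry on write keeps the write path simple, at the cost of one extra database read afterwards. In a multi-process deployment the cache would be a shared store and the race between concurrent readers and writers would need separate handling.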

0 questions

Data Management and Storage

Knowledge of data storage and management strategies for large-scale systems. Includes choosing between relational and non-relational stores, understanding consistency models and transactional guarantees, replication and partitioning strategies, indexing and query patterns, caching approaches, data retention and backup policies, and the operational trade-offs between latency, throughput, durability, and cost. Candidates should explain how data choices constrain application design and influence program decisions.

0 questions

Azure Storage and Database Options

Be able to compare Azure storage services and managed database offerings and explain when each is appropriate. Cover object storage for unstructured data, file shares for lift-and-shift legacy workloads, queue storage for messaging patterns, and table storage for simple NoSQL key-value needs. For databases, describe managed relational options such as Azure SQL Database and Azure Database for PostgreSQL or MySQL, and NoSQL options such as Cosmos DB, including differences in consistency, global distribution, latency, and operational trade-offs. Discuss redundancy and durability options such as locally redundant, geo-redundant, and read-access geo-redundant storage, and touch on performance tuning, backup and restore, lifecycle management, and security considerations that influence selection.

0 questions

Database Selection and Trade Offs

How to evaluate and choose data storage systems and architectures based on workload characteristics and business constraints. Coverage includes differences between relational and non-relational families such as document stores, key-value stores, wide-column stores, graph databases, time-series databases, and search engines; mapping query patterns and latency requirements to storage options; trade-offs between strong consistency and eventual consistency and their impact on availability and complexity; partition key design, replication strategies, and high availability considerations; operational concerns including backups, monitoring, vendor and cost trade-offs, migration or hybrid strategies, and when to adopt polyglot persistence. Senior-level discussion includes selecting specific managed services and reasoning about expected load patterns, failure modes, and operational burden.

0 questions

Data Migration and Consistency

Plan and execute data migrations while preserving correctness and availability. Topics include zero-downtime migration techniques, schema evolution patterns, backward and forward compatibility, dual writes and shadow writes, incremental and bulk migration strategies, data validation and reconciliation, canary migrations, rollbacks and fallback plans, and how to minimize user impact during transitions. Understand the trade-offs between consistency and speed of migration, and techniques to detect and correct drift after migration.
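Data validation and drift detection after a migration can be approached by comparing per-row checksums between the old and new stores. The Python sketch below shows one possible shape; the function names, the `id` key column, and the sample rows are hypothetical, and a real reconciliation job would stream rows in batches rather than hold both tables in memory.

```python
import hashlib
import json


def row_checksum(row: dict) -> str:
    """Stable checksum of a row; keys are sorted so column order is ignored."""
    payload = json.dumps(row, sort_keys=True, default=str)
    return hashlib.sha256(payload.encode()).hexdigest()


def find_drift(source_rows, target_rows, key="id"):
    """Compare two row sets and report missing, extra, and mismatched keys."""
    src = {r[key]: row_checksum(r) for r in source_rows}
    tgt = {r[key]: row_checksum(r) for r in target_rows}
    return {
        "missing": sorted(src.keys() - tgt.keys()),   # in source only
        "extra": sorted(tgt.keys() - src.keys()),     # in target only
        "mismatched": sorted(
            k for k in src.keys() & tgt.keys() if src[k] != tgt[k]
        ),
    }


# Usage with made-up sample rows.
source = [
    {"id": 1, "email": "a@example.com"},
    {"id": 2, "email": "b@example.com"},
    {"id": 3, "email": "c@example.com"},
]
target = [
    {"id": 1, "email": "a@example.com"},
    {"id": 2, "email": "B@example.com"},  # value drifted during migration
    {"id": 4, "email": "d@example.com"},  # row exists only in the target
]
report = find_drift(source, target)
```

Here `report` lists key 3 as missing, key 4 as extra, and key 2 as mismatched. A report like this can feed a repair step or gate the cutover; it does not by itself decide which side is correct while dual writes are still in flight.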

0 questions

Storage Services and Data Management

Know the primary storage options. Object storage (S3, Azure Blob, GCS) holds unstructured data at scale and is highly available and cost-effective. Block storage (EBS, Azure Managed Disks) backs VM disks and is optimized for IOPS and throughput. Among databases, relational services (RDS, Azure SQL, Cloud SQL) suit structured data with relationships, while NoSQL services (DynamoDB, Cosmos DB, Firestore) suit flexible schemas and scale. Understand access patterns, durability, and consistency models, and know when to use each storage type based on data characteristics and access patterns.

0 questions

Database Design and Architecture

Designing and architecting databases for production and cloud environments with attention to data modeling, schema design, and access pattern optimization. Topics include normalization and denormalization trade-offs, schema-driven versus query-driven modeling, entity and relationship design for transactional and analytical workloads, indexing and query optimization techniques, partitioning and sharding design decisions, schema evolution and migration strategies, materialized views and caching strategies, selection of storage layers for different data shapes, and practical operational runbooks for provisioning, monitoring, alerting, backups, disaster recovery, and capacity planning. Candidates should justify schema and architecture choices with respect to latency, throughput, development and operational complexity, maintainability, and cost.

0 questions

Managed Databases and Data Services

Covers choosing and operating managed database offerings and complementary cloud data services. Candidates should understand managed relational database services such as Amazon Relational Database Service for MySQL, PostgreSQL, MariaDB, Microsoft SQL Server, and Oracle, and NoSQL document and key-value stores such as Amazon DynamoDB, Azure Cosmos DB, and Google Cloud Firestore and Datastore. Expect to explain when to choose relational versus NoSQL based on data shape, query complexity, transactional guarantees (atomicity, consistency, isolation, and durability), read and write patterns, latency, and scalability requirements. Understand scaling techniques including vertical scaling, read replicas for read scaling, horizontal scaling via partitioning or sharding, and multi-region replication and failover strategies. Be familiar with backup and restore approaches including snapshots, point-in-time recovery, cross-region replication, and disaster recovery planning. Know consistency models and trade-offs such as strong, eventual, and causal consistency, and understand provisioned capacity versus serverless autoscaling models and their cost and operational implications. Candidates should also be able to discuss performance tuning topics such as indexing, query optimization, caching, connection pooling, storage and I/O optimization, and monitoring and alerting, as well as security and compliance considerations including encryption, access control, and network isolation. Finally, be prepared to recommend a database solution given workload characteristics such as data size, read-to-write ratio, latency targets, and operational constraints.

0 questions