Site Reliability Engineer - OCI Data Services
Oracle
Description
Who We Are
The OCI GoldenGate and Database Migration Service team is responsible for operating and scaling Oracle's cloud-native data movement services that help customers migrate, replicate, and synchronize mission-critical workloads across cloud and hybrid environments.
Our team owns the reliability, availability, operational excellence, and customer experience of these services. We work closely with software engineering, cloud infrastructure, and customer-facing teams to ensure our services remain secure, resilient, and highly available as adoption continues to grow across commercial and sovereign cloud environments.
This role offers the opportunity to solve complex distributed systems challenges, influence service reliability at scale, and contribute to cloud services used by enterprise customers around the world.
What You'll Do
As a Senior Site Reliability Engineer, you will be responsible for the operational health, reliability, and continuous improvement of OCI GoldenGate and OCI Database Migration Service.
You will:
- Operate and support production cloud services that power critical customer migration and data replication workloads.
- Monitor service health, investigate incidents, and lead troubleshooting efforts for complex production issues.
- Participate in incident response, root cause analysis, and post-incident reviews while driving corrective and preventative actions.
- Act as an escalation point for critical service issues and customer-impacting events.
- Partner with software engineering teams to improve service architecture, reliability, scalability, and operational readiness.
- Design and implement automation to reduce operational overhead and improve service efficiency.
- Build and maintain observability solutions, including monitoring, alerting, logging, dashboards, and operational metrics.
- Contribute to capacity planning, performance analysis, disaster recovery readiness, and service resilience initiatives.
- Support services operating across Oracle Cloud commercial and sovereign cloud environments.
- Leverage Infrastructure-as-Code and CI/CD practices to improve deployment consistency, scalability, and operational efficiency.
- Drive continuous service improvement through operational reviews, reliability initiatives, and engineering best practices.
This role provides the opportunity to work on large-scale distributed systems supporting enterprise customers worldwide, helping shape the future of Oracle's cloud-native data movement platform.
What You'll Bring
We are looking for engineers who are passionate about reliability, automation, and solving complex operational challenges.
Preferred qualifications include:
- Experience operating and supporting production cloud services.
- Strong troubleshooting, incident management, and root cause analysis skills.
- Experience with at least one major public cloud platform (OCI, AWS, Azure, GCP).
- Experience with automation and Infrastructure-as-Code (Terraform or similar).
- Scripting experience (Python, Shell, Go, or similar).
- Experience with monitoring, alerting, logging, and observability tools.
- Understanding of Linux, networking fundamentals, and distributed systems.
- Ability to work cross-functionally with engineering and infrastructure teams.
- Willingness to participate in on-call rotations.
Nice to Have
- Experience with Oracle Cloud Infrastructure (OCI) services and cloud operations.
- Familiarity with data replication technologies such as Oracle GoldenGate or other large-scale data integration platforms.
- Experience with Kubernetes, containerized workloads, and cloud-native architectures.
- Knowledge of high-availability architectures, disaster recovery strategies, and service resilience best practices.
- Experience supporting regulated or sovereign cloud environments.
What We Offer
- The opportunity to work on cloud services that support enterprise customers at global scale.
- Exposure to complex distributed systems, cloud-native technologies, and large-scale operational challenges.
- A collaborative environment where reliability, automation, and engineering excellence are core priorities.
- Opportunities to influence service architecture, operational strategy, and platform evolution.
- Career growth through ownership, technical leadership, and collaboration with world-class engineering teams.
- Competitive compensation, comprehensive benefits, and flexible work arrangements.
Career Level - IC4
About Oracle
Oracle offers integrated suites of applications plus secure, autonomous infrastructure in the Oracle Cloud.