Site Reliability Engineer

Jolera

Colombo, Western Province, Sri Lanka
1 week ago

Job Type

Full-time

Description

Job Purpose

Operate as a hands-on site reliability engineer supporting our North American customers: deliver after-hours incident detection, triage, and runbook-based remediation for production cloud-native environments, and escalate cleanly when an action exceeds agreed authority.

Key Responsibilities

• Monitor and respond to alerts for our North American customers; perform initial triage to determine severity, scope, and impact.

• Execute approved runbooks: workload/node restarts, scaling within agreed bounds, rollbacks, and database stabilization.

• Investigate and contain availability, performance, scaling, access, and replication issues within defined permissions.

• Engage cloud-provider support (AWS / GCP) when required.

• Prepare clear escalation summaries and shift-handoff notes.

• Contribute to runbook improvements and flag automation and detection gaps.

People Management

• Individual contributor; no direct reports.

Financial Responsibility

• No budget responsibility.

• Responsible for disciplined, in-scope operations that protect service levels.

Key Performance Indicators (KPIs)

• Service-level (SLO/SLA) attainment

• Runbook-execution accuracy

• Quality and timeliness of escalations and handoffs

• Ticket hygiene and documentation

• Contribution to runbook / automation improvements

Requirements

Education & Certifications

• Bachelor’s degree in Computer Science/Engineering or equivalent experience.

• CKA or CKAD preferred; AWS/GCP associate-level certification an asset.

Experience

• 3–5 years in SRE/DevOps/cloud operations, including hands-on Kubernetes.

• Exposure to incident response and on-call / shift operations preferred.

Skills & Competencies

Technical Skills

• Kubernetes operations on AWS EKS and/or GCP GKE

• Working knowledge of AWS and GCP core services

• Relational database operational basics (e.g., PostgreSQL)

• Observability platforms (e.g., Datadog)

• Scripting in Bash and Python

Soft Skills

• Composure under pressure

• Problem-solving

• Clear English communication

• Team collaboration

• Time management

Tools / Software

• Datadog

• Jira / ServiceNow

• Confluence / GitHub Wiki

• AWS & GCP consoles

• Slack / Microsoft Teams

Benefits

What We Offer

  • Competitive compensation package
  • Competitive benefits package
  • Company perks, Good Life gym, and various brand discounts
  • Company events, recognitions, and celebrations
  • Career development and growth opportunities

Skills

Node.js, AWS, GCP, automation, Kubernetes, EKS, PostgreSQL, observability, Datadog, Bash, Python, Jira, incident response, people management

About Jolera

Jolera is a Global Systems Integrator (GSI) dedicated to transforming IT operations into secure, efficient environments. With a diverse team of over 500 professionals across 24 countries, we combine global reach with localized expertise.

IT services, cybersecurity