
DevOps/SRE

Io Tech Solutions Limited

Hong Kong Island, Hong Kong
1 month ago

Job Type

Full-time

Description

About the Role:

We are seeking a skilled and motivated DevOps / Site Reliability Engineer (SRE) with 2+ years of experience to help us build, scale, and maintain robust, secure, and high-availability infrastructure. As a DevOps/SRE team member, you will work closely with development, QA, and operations teams to automate processes, monitor system health, and ensure the reliability of our services.

This is a hands-on role that requires strong technical skills, a deep understanding of modern DevOps tools and practices, and a problem-solving mindset.

Key Responsibilities:

  • Design, implement, and maintain CI/CD pipelines for reliable code deployment
  • Monitor application performance and system reliability using tools like Prometheus, Grafana, or Datadog
  • Maintain and improve cloud infrastructure (e.g., AWS, GCP, Azure) following best practices
  • Manage infrastructure as code using tools such as Terraform, Ansible, or CloudFormation
  • Troubleshoot infrastructure and application issues, ensuring minimal downtime and fast resolution
  • Automate repetitive operational tasks and improve development workflows
  • Implement and enforce security, backup, and disaster recovery strategies
  • Participate in the on-call rotation and respond to incidents, including root cause analysis and postmortem reviews
  • Work closely with development teams to ensure applications are designed for performance, availability, and scalability
  • Optimize resource usage and costs across cloud environments

Qualifications:

Required:

  • Bachelor's degree in Computer Science, Engineering, or a related field
  • 2+ years of experience in a DevOps, SRE, or Systems Engineering role
  • Hands-on experience with Linux/Unix system administration
  • Experience with CI/CD tools such as Jenkins, GitHub Actions, CircleCI, or GitLab CI
  • Working knowledge of cloud platforms (AWS, GCP, Azure)
  • Familiarity with containerization and orchestration tools (e.g., Docker, Kubernetes)
  • Experience with infrastructure as code tools like Terraform, Ansible, or similar
  • Proficiency in at least one scripting or programming language (e.g., Bash, Python, Go)
  • Strong understanding of monitoring, logging, and alerting systems
  • Experience with version control using Git

Preferred:

  • Experience with Kubernetes administration in production environments
  • Familiarity with security best practices and compliance standards
  • Understanding of networking, load balancing, and DNS configurations
  • Exposure to incident management and SLA/SLO/SLI concepts
  • Experience working in Agile environments

Skills

ci/cd, prometheus, grafana, datadog, aws, gcp, azure, infrastructure as code, terraform, ansible, cloudformation, scalability, linux, jenkins, github actions, circleci, gitlab, containerization, docker, kubernetes, bash, python, monitoring, git, dns, agile, incident management, root cause analysis, systems engineering, disaster recovery, high availability, load balancing, system administration