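[If you are interested, please send your resume to huchen@redhat.com]
OpenShift is currently Red Hat's most popular enterprise container orchestration platform, and we are hiring for several related development, operations development, and testing positions. If you want to join the Red Hat open source family, contribute to upstream projects, exchange ideas with top engineers overseas, and are interested in technologies such as k8s and containers, come and join us!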
Job summary
The Red Hat OpenShift Site Reliability Engineering (SRE) team is looking for a Senior Site Reliability Engineer to join our team in Beijing, China. In this role, you will work with Red Hat OpenShift, which is a leading enterprise Kubernetes container platform, as part of the first team to host and manage the code in the public cloud. You’ll play a key part within the team, as you’ll be responsible for keeping the Red Hat OpenShift Container Platform environment available and secure. Along with the rest of your team, you will interact with other service reliability engineers and product engineering associates around the world to deliver large, containerized cluster environments. You'll be responsible for provisioning, upgrades, problem detection and automated recovery scenarios, incident management, and understanding complicated, interconnected data points to resolve faults when issues arise. As a Senior Site Reliability Engineer, you’ll need to be able to work in a complicated and fast-paced environment while quickly learning new skills. In addition, you’ll create ways to consistently meet service-level agreements (SLAs) and keep the globally distributed, cloud-based, and containerized enterprise Kubernetes running smoothly for our customers.
Primary job responsibilities
- Interact with automated monitoring and healing infrastructure to ensure healthy environments
- Design and develop highly available Red Hat OpenShift infrastructure components to meet the needs of our growing and evolving offering
- Join a development team on a rotation to help them reduce toil and increase availability
- Develop automation to autocorrect or completely prevent issues in our online solutions
- Participate in release cycles of our offerings, deploying code to integration, staging, and production environments, integrating with continuous integration (CI) and continuous delivery (CD) tools, monitoring, and providing change management
- Perform software updates, peer code reviews, testing, and common vulnerabilities and exposures (CVE) analysis; respond to security threats
- Identify single points of failure and other high-risk architecture issues; propose and implement more resilient designs
- Resolve customer issues in cooperation with Red Hat's global customer support team
- Create and maintain standard operating procedures (SOPs) for performing maintenance tasks, applying configuration changes, and remediating problems in our environment
- Participate in a regular shift and on-call rotation; this will include a weekend working schedule
Required skills
- 2+ years of experience with programming languages like Go, C#, Java, PHP, Python, or Ruby
- 5+ years of experience managing Linux servers running Red Hat Enterprise Linux (RHEL), CentOS, or Fedora hosted at a cloud provider like Amazon Web Services (AWS), Google Compute Engine (GCE), or Microsoft Azure
- 3+ years of experience with enterprise system monitoring; knowledge of Zabbix or Nagios is a plus
- 3+ years of experience with enterprise configuration management software like Red Hat Ansible Automation, Puppet, or Chef
- Experience delivering a hosted service
- Demonstrated ability to quickly and accurately troubleshoot system issues
- Solid understanding of standard TCP/IP networking and common protocols like DNS and HTTP
- Solid communication skills; experience working directly with and presenting to customers
- Experience with Kubernetes is a plus
- Experience with Docker-based containers is a plus
[If you are interested, please send your resume to huchen@redhat.com. Salary range: 18k-28k/month]