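Key Information
- Experience: 5+ years
- Education: Bachelor's degree (4-year university) or higher
- Employment type: Full-time (permanent)
- Salary: To be decided after the interview
- Location: Gangnam-gu, Seoul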
Job Details
Gauss Labs is seeking a highly skilled Site Reliability Engineer to join our team. As an SRE at Gauss Labs, you will play a critical role in ensuring our industrial AI platform's reliability, performance, and scalability. You will be responsible for building and maintaining a robust solution that supports our growing business at the customer site.
[Responsibilities]
• Monitoring and Alerting: Creating and maintaining robust monitoring systems to proactively identify and resolve issues before they impact customers. Implementing effective alerting mechanisms to ensure timely response to critical events.
• Incident Response: Participating in on-call rotations and leading incident response efforts to minimize downtime and restore service quickly.
• Automation: Developing and implementing automation tools and scripts to streamline operations, reduce manual effort, and improve efficiency.
• Capacity Planning: Forecasting resource needs, optimizing resource utilization, and ensuring the customers’ infrastructure can handle increasing workloads.
• Performance Optimization: Identifying and resolving performance bottlenecks, optimizing system performance, and improving response times.
• Collaboration: Partnering with software engineers, data scientists, and other teams to ensure alignment and efficient operations.
• Customer Focus: Working closely with the AI program manager and Technical Account Manager to understand customer issues, provide technical support, and improve customer satisfaction.
• Continuous Improvement: Driving a culture of continuous improvement by identifying opportunities to enhance system reliability, performance, and efficiency.
[Basic Qualifications]
• Bachelor's degree in computer science, engineering, or a related discipline.
• 5+ years of industry experience as a Site Reliability Engineer.
• Experience with cloud platforms (e.g., AWS, GCP, Azure).
• Experience with scripting languages (e.g., Python).
• Experience with monitoring and alerting tools (e.g., Prometheus, Grafana).
• Experience in ticket management, issue resolution, and troubleshooting.
• Strong problem-solving and troubleshooting skills.
• Ability to work independently and as part of a team.
• Excellent customer communication and interpersonal skills.
[Preferred Qualifications]
• Knowledge of containerization technologies (Docker, Kubernetes).
• Knowledge of AI/ML infrastructure and workloads.
• Knowledge of big data technologies (Hadoop, Spark).
• Fluency in verbal and written English.
[Notes]
- Recruiting Process: Application Review - Phone interview - Onsite interview - CEO interview/Core Value interview
- For inquiries: recruiting@gausslabs.ai
- Job link: https://jobs.lever.co/gausslabs/2b0910f0-6cf3-487a-8b55-3a6d54dcad1a
How to Apply
- Application period: Friday, August 2, 2024, 13:00 to Sunday, September 1, 2024, 23:59
- Application method: Apply via the company website
- Resume format: Free format
Hiring Process
- Application review
- Phone interview
- Onsite interview
- CEO/Core Value interview
- Hiring
Application Period and Method
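- Start date: 2024.08.02 13:00
- Closing date: 2024.09.01 23:59
- Application method: Apply via the company website
- Application form: Free format
- Contact: Talent Acquisition Team
The closing date may change depending on the company's circumstances, including early closing.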
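Company Information
- CEO: 김영한
- Company type: Small and medium-sized enterprise (SME)
- Industry: Computer systems integration consulting and implementation services
- Employees: 45 (as of 2024)
- Revenue: KRW 741.54 million (as of 2021)
- Address: 6F, 201 Teheran-ro, Gangnam-gu, Seoul (Yeoksam-dong, Aju Building)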