Business Segment: Digital Predix Products & Technology
Location(s): United States; California; San Ramon
About Us: GE is the world's Digital Industrial Company, transforming industry with software-defined machines and solutions that are connected, responsive and predictive. Through our people, leadership development, services, technology and scale, GE delivers better outcomes for global customers by speaking the language of industry. GE offers a great work environment, professional development, challenging careers, and competitive compensation. GE is an Equal Opportunity Employer. Employment decisions are made without regard to race, color, religion, national or ethnic origin, sex, sexual orientation, gender identity or expression, age, disability, protected veteran status or other characteristics protected by law.
Role Summary: The Site Reliability Engineering team at GE Digital is responsible for the reliability and performance of Predix worldwide. We obsess over availability by building tools and engineering new systems to automate our platform. We are software engineers with full visibility and influence across the entire stack.
Essential Responsibilities: In this role, you will:
Develop automated solutions to predict and address potential problems before they result in a service interruption
Oversee and adapt monitoring and alerting systems
Collaborate with all GE business units worldwide, providing a bastion of technical expertise
Identify potential process improvements across the entire engineering organization
Define and drive architectural enhancements into the system to mitigate potential failure points
Provide impact assessments and mitigation plans for changes going into the production environment
Investigate the root causes of severe and systemic outages and identify corrective actions
Qualifications/Requirements:
Able to troubleshoot and debug applications (C, Java, Go)
Proficient in configuration management systems (Chef, Terraform, Ansible, Puppet, Salt)
Experience with configuring, customizing, and extending monitoring tools (Sensu, Grafana, Prometheus, Graphite, Splunk, etc.)
Experience deploying and managing infrastructure on public clouds (AWS, GCP, or Azure)
Comfortable using Git on the command line
Desired Characteristics:
Influences through others; builds direct and "behind the scenes" support for ideas. Pre-emptively sees downstream consequences and effectively tailors influencing strategy to support a positive outcome.
Able to verbalize what is behind decisions and their downstream implications. Continuously reflects on successes and failures to improve performance and decision-making. Understands and encourages change when needed.
Proactively identifies and removes project obstacles or barriers on behalf of the team. Able to navigate accountability in a matrixed organization.
Self-starter; communicates and demonstrates a shared sense of purpose. Learns from failure.
Critical thinker; able to quickly adapt to changing environments
A hacker or tinkerer at heart
Risk taker, not afraid to think outside the box or challenge the status quo
Emotionally intelligent; able to influence up and out and to work independently
Must be a team player with a strong desire to win
Passionate about continuously learning
Highly organized and efficient; able to balance competing priorities and execute accordingly