22.06.2024
Sr Engineer - Observability
Rockwell Automation (USA)
Mexico City, Mexico (remote)
JavaScript, Bachelor's degree
Rockwell Automation is a global technology leader focused on helping the world’s manufacturers be more productive, sustainable, and agile. With more than 28,000 employees who make the world better every day, we know we have something special. Behind our customers - amazing companies that help feed the world, provide life-saving medicine on a global scale, and focus on clean water and green mobility - our people are energized problem solvers that take pride in how the work we do changes the world for the better. We welcome all makers, forward thinkers, and problem solvers who are looking for a place to do their best work. And if that’s you, we would love to have you join us!

Job Description

Sr Engineer - Observability

Executive Summary

As a Senior Engineer specializing in observability, you will play a critical role in ensuring the stability, performance, and reliability of our systems through advanced monitoring, logging, and tracing practices. You will collaborate with cross-functional teams to design and implement observability solutions, empowering our organization to gain deep insights into the behavior of our systems and applications.
Your expertise will drive improvements in system observability, enabling proactive identification and resolution of issues before they impact our customers.

Key Responsibilities:

Analyze, design, program, debug, and modify observability tools and interfaces. Code may be used to enrich and correlate telemetry from many data sources in order to isolate events that indicate future or immediate IT availability issues. Interact with users to define system requirements and/or necessary modifications.

Design and Implement Observability Solutions: Develop and implement comprehensive observability solutions utilizing industry-standard tools and technologies such as Prometheus, Grafana, ELK Stack (Elasticsearch, Logstash, Kibana), Jaeger, and OpenTelemetry.

System Monitoring and Alerting: Establish robust monitoring and alerting mechanisms to provide real-time visibility into system performance, resource utilization, and application behavior. Configure alerts to effectively notify relevant stakeholders of potential issues or anomalies.

Log Management: Architect centralized logging solutions to collect, store, and analyze logs from various components of our infrastructure and applications. Ensure logs are structured effectively to facilitate efficient querying and troubleshooting.

Distributed Tracing: Implement distributed tracing techniques to trace and visualize the flow of requests across microservices architectures. Utilize tracing data to identify performance bottlenecks and optimize system performance.

Performance Analysis and Optimization: Analyze system performance metrics and identify opportunities for optimization. Collaborate with development teams to implement performance improvements and ensure scalability of systems.

Incident Response and Post-Mortems: Actively participate in incident response activities, providing expertise in diagnosing and resolving complex issues. Conduct thorough post-incident reviews to identify root causes and recommend preventive measures.

Documentation and Knowledge Sharing: Document observability best practices, standards, and procedures. Share knowledge and insights with team members through presentations, workshops, and documentation to foster a culture of continuous learning and improvement.

Cross-Functional Collaboration: Collaborate with cross-functional teams including DevOps, SRE, and software engineering to drive observability initiatives and ensure alignment with organizational goals and objectives.

Qualifications:

Bachelor's or Master's degree in Computer Science, Information Technology, or a related field.

2+ years of experience in software engineering, with a focus on observability, monitoring, and/or site reliability engineering.

1-2 years of experience with one or more of the following: Application Performance Management (APM), Monitoring / Alerting, New Relic, Dynatrace, AppDynamics, Zabbix, BigPanda, and ServiceNow.

Proficiency in designing and implementing observability solutions using tools such as Prometheus, Grafana, ELK Stack, Jaeger, and OpenTelemetry.

Strong understanding of distributed systems, microservices architectures, and cloud computing platforms (e.g., AWS, Azure, GCP).

Experience with containerization technologies such as Docker and Kubernetes.

Ideally 2+ years of development experience with programming languages such as C#, .NET or JavaScript.

Excellent analytical and problem-solving skills, with strong attention to detail.

Effective communication and collaboration skills, with the ability to work across teams and influence stakeholders.

Experience working in an Agile/Scrum environment is preferred.

Rockwell Automation, Inc. (NYSE: ROK) is a global leader in industrial automation and digital transformation. We connect the imaginations of people with the potential of technology to expand what is humanly possible, making the world more productive and more sustainable. Headquartered in Milwaukee, Wisconsin, Rockwell Automation employs approximately 23,000 problem solvers dedicated to our customers in more than 100 countries. To learn more about how we are bringing The Connected Enterprise to life across industrial enterprises, visit www.rockwellautomation.com.