This job posting has been archived and is no longer available.

Site Reliability Engineer

Posted by Gazelle Global Consulting

Skills sought: Engineer, Engineering, .NET, Python

Project Description

SITE RELIABILITY ENGINEER

A Site Reliability Engineer is urgently required for an immediate start with a major financial services client in Amsterdam.

The Site Reliability Engineering (SRE) team is a multidisciplinary team whose primary goal is to continuously and structurally improve the reliability and maintainability of IT environments across different domains. The SRE team also provides temporary support to other teams when required, focusing its available capacity on restoring their services to normal. Once that is done, the SRE team moves on to the next team.

The Site Reliability Engineer will ideally have at least 65% of the following skills and experience:

- Mastery of at least one programming language; Java and/or .NET are a plus
- Working knowledge of configuration management tools such as Puppet, Chef or Ansible
- Scripting experience in at least one of the following: Ruby, Python, Bash, PowerShell
- Experience with virtualization environments and tools, e.g. VMware, CloudForms
- Solid foundation in Linux or Windows administration and troubleshooting
- Proven experience with automation; knowledge of configuration management tools such as Puppet or Chef is a plus
- Additional experience in networking, security or storage is a plus
- Ability to understand and formulate meaningful business metrics
- Professional and intellectual IT ability at university level (at least a Bachelor's degree)
- Experience in Agile/Scrum
- Certifications and/or proven proficiency in one or more programming/scripting languages
- Fluent English; Dutch speakers very strongly preferred

The Site Reliability Engineer will have responsibilities including:

- Prevention of incident recurrence
- Improving incident mitigation capabilities
- Follow up on improvement actions after high-impact incidents (root cause)
- Support in creating structural solutions instead of workarounds
- Support in automation of the services (create consumable services)
- Continuous improvement of continuous delivery and software engineering practices
- Improve the MTTR (Mean Time To Repair) and MTBSF (Mean Time Between Service Failures) of service-impacting incidents
- Create and sustain excellent monitoring of chains
- Helping teams improve the resilience of their assets
- Create real-time, standardised insights into the production chain for faster incident analysis
- Help teams improve their data mining and analysis skills
- Develop consumable, standardised infrastructure services (resilient, high quality, highly automated and up-to-date)
- Support and educate other teams and consumers on using the standardised infrastructure services (the consumer is responsible for their own instances)
- Develop an ecosystem of tools to provide self-service capabilities to the consumer
- Manage all resources in version-controlled repositories (incl. code, scripts, configurations, artefacts, static resources)
- Refactor and reuse existing code/modules/functionality
- Write fully automated tests (e.g. unit, functional, non-functional and integration tests)
- Build, enhance and maintain tooling and scripts to automate repetitive or error-prone tasks
- Develop microservices and APIs
- Implement user stories from backlog as agreed with the Product Owner, without spending time on work outside the backlog
- Continuously improve yourself, your squad and the service
- Collaborate within and across teams to understand your services end-to-end

This is an urgent requirement and interviews are scheduled for next week.

Project Details

  • Location: Amsterdam, Netherlands
  • Project start: ASAP
  • Project duration: 6 - 18 months +
  • Contract type: Contract
  • Professional experience: Not specified
