This job posting has been archived and is no longer available.

Architect (H/U/M/a/N)* Data Science: Python, PySpark, Scikit-Learn, XGBoost, MLflow, Matplotlib

Posted by Cegeka Deutschland GmbH, Munich

Skills sought: Engineering, Python

Project description

- Good communicator who also excels at independent work; solution-oriented, with strong analytical thinking.
- Experience in data science, machine learning, statistical modelling, multivariate analysis, exploratory data analysis, software development, and data engineering.
- Ideally, the candidate holds a degree in mathematics, physics, computer science, or a related field.
- Several years of experience in Python programming are required, especially with PySpark, Scikit-Learn, XGBoost, MLflow, Matplotlib, and related libraries.
- Good familiarity with Azure, Git, GitLab, CI/CD, Docker, and Databricks is beneficial.
- The candidate should be familiar with, or open to, working in an agile environment.
- Prior experience with predictive maintenance tasks is a plus.

YOUR TASKS

Our customer has five different machine-learning-based solutions that identify weak points in the medium-voltage grid. Although they serve the same purpose, they differ in their data source systems, ML features, ML algorithms, and especially in the Distribution System Operator (DSO) for which each was developed. Currently, none of these five solutions can easily be applied to a different target DSO, and none is implemented in a scalable way. A newly developed data platform (iPEN) that collects, prepares, and provides data from all DSOs now makes it possible to develop a new solution that consumes all required data from a single source system. Based on unified data preparation, feature extraction, and machine learning steps, this new solution should be applicable to all of the customer's DSOs. Particular attention will be paid to the scalability, stability, and maintainability of the resulting software. The role also covers data science tasks related to projects, e.g. predictive maintenance solutions.

- Advise on and design business-critical data science use cases, from the business problem to delivery and operation.
- Statistical analysis and exploration of static, mixed, and time-series data.
- Writing exploration and production code in Python and PySpark on a Databricks tech stack.
- Design and implementation of ML algorithms related to failure prediction.
- Improvement and optimization of existing ML algorithms using various tuning techniques.
- Study of new features, feature importances, correlations, causations, data leakage, and up- and downsampling.
- Advise data engineers on writing production code for feature engineering.

YOUR CONTACT

Manuela Fentrohs
Phone: +49 89 74833 873
E-Mail: [email protected]

Project details

  • Project start:

    asap

  • Project duration:

    01.08.2023 - 31.12.2023 (part-time, 60%)
    remote

  • Contract type:

    Remote

  • Professional experience:

    Not specified

Required qualifications

Cegeka Deutschland GmbH

  • Street:

    Wilhelm-Wagenfeld-Str. 30

  • City:

    80807 Munich, Germany

  • Projects:

    8 projects