This job posting has been archived and is no longer available.
Open positions can be found under Projects.

Data Engineer

Posted by Masento

Skills sought: Engineer, SQL, Python, DataStage

Project description

Data Engineer whose core objectives will be:

Collect, clean, prepare and load the necessary data, structured or unstructured, onto Hadoop, our Big Data analytics platform, so that data scientists can use it to create insights and answer business challenges

Act as a liaison between the team and other stakeholders, and help support the Hadoop cluster and the compatibility of the different software that runs on the platform (Spark, R, Python)

Experiment with new tools and technologies related to data extraction, exploration or processing (e.g. OCR engines)

Job description

Identify the most appropriate data sources to use for a given purpose and understand their structures and contents
Extract structured and unstructured data from the source systems (relational databases, data warehouses, document repositories, file systems), prepare the data (cleanse, restructure, aggregate) and load it onto Hadoop
Actively support data scientists in the data exploration and data preparation phases. Where data quality issues are detected, liaise with the data supplier to carry out root cause analysis
Where a use case is meant to become a production application, contribute to the design, build and launch activities
Ensure the maintenance and support of production applications (watch duty)

YOUR PROFILE:

Experience in understanding and creating data flows, in data architecture, in ETL/ELT development (MS SQL Server SSIS, DataStage) and in processing structured and unstructured data
Proven experience in using data stored in RDBMSs, and experience with or a good understanding of NoSQL databases
Ability to write performant SQL statements
Understanding of the Hadoop ecosystem, including Hadoop file formats such as Parquet and ORC
Knowledge of Spark and Scala
Ability to write MapReduce and Spark jobs
Experience with open source technologies used in Big Data analytics, such as Pig, Hive, HBase and Kafka
Ability to design solutions that are fit for purpose whilst keeping options open for future needs

Project details

  • Location:

    Brussels, Belgium

  • Project start:

    ASAP

  • Project duration:

    3 - 6 months

  • Contract type:

    Contract

  • Professional experience:

    Not specified

Required qualifications

Masento