Müller's Solutions

Job Details

As an AI Infrastructure & MLOps Engineer at Müller's Solutions on a 6-month contract, you will take on a role that is primarily operations-focused (90%), with hands-on involvement in the implementation, configuration, and setup of AI infrastructure and MLOps workflows.
You will play a key role in managing, operating, and guiding the deployment of a strategic AI environment, working closely with the customer as a technical advisor and hands-on engineer.
What about the role responsibilities?
  Operate and maintain AI infrastructure and MLOps platforms in a production environment.
Monitor, manage, and troubleshoot Kubernetes-based AI workloads.
Plan and execute acceptance testing for AI infrastructure and platforms.
Ensure stability, performance, and availability of AI systems.
Support day-to-day operational tasks across compute, storage, and networking layers.
Install and configure the NVIDIA Enterprise AI Stack (NVAI).
Configure and manage MLOps platforms such as Kubeflow and MLflow (a brief sketch follows this list).
Assist in setting up end-to-end AI workflows, including data pipelines.
Support the initial implementation phase of the AI environment.
Act as a technical guide and advisor to the customer during the early stages of their AI adoption.
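To illustrate the kind of MLOps platform work listed above, here is a minimal sketch of logging a training run to an MLflow tracking server; the tracking URI, experiment name, and logged values are hypothetical placeholders rather than details taken from this role.

```python
# Minimal sketch, assuming an MLflow tracking server is already running.
# The URI, experiment name, and values below are hypothetical placeholders.
import mlflow

mlflow.set_tracking_uri("http://mlflow.internal.example:5000")  # assumed server endpoint
mlflow.set_experiment("gpu-workload-validation")                # assumed experiment name

with mlflow.start_run():
    mlflow.log_param("batch_size", 32)     # example hyperparameter
    mlflow.log_metric("accuracy", 0.93)    # example evaluation metric
```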
What should you have to fit in this role?
Technical Requirements

AI / MLOps Stack
•   Proficient experience with the NVIDIA Enterprise AI Stack
•   Familiarity with Ubuntu Linux
•   Experience with Kubernetes (a brief sketch follows this section)
•   Knowledge of Kubeflow / MLflow
•   Experience with QFLOW (an open-source AI data pipeline management tool)

Programming & Automation
•   4–6 years of practical experience in Python and Jupyter Notebook / JupyterLab
•   Competence in writing, testing, and maintaining operational scripts and AI workflows
Infrastructure Experience
Practical experience with enterprise infrastructure, encompassing:
•   Dell PowerScale (5 nodes)
•   XE Server (1 node)
•   Dell R570 Servers (5 nodes)
•   Dell Network Switches (2 switches)
•   GPU-based AI servers (in a small-scale environment)

Environment Overview
•   Initial implementation of AI
•   Compact configuration: 1 GPU server, 1 PowerScale, 5 control plane servers
•   Opportunity to shape best practices from the ground up

To succeed in this role, it's nice to have:
•   Familiarity with data frameworks like Apache Spark or Hadoop for data processing.
•   Understanding of ML model monitoring and logging practices to ensure system reliability.
•   Experience with security best practices in AI systems.
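As an illustration of the Kubernetes experience called for above, the sketch below uses the official Python client to list the pods in an AI workload namespace and print their phase, the kind of routine health check this role would perform; the namespace name is a hypothetical placeholder.

```python
# Minimal sketch, assuming cluster access via a local kubeconfig.
# The namespace name is a hypothetical placeholder.
from kubernetes import client, config

config.load_kube_config()  # use config.load_incluster_config() when running inside the cluster
v1 = client.CoreV1Api()

# Report each AI workload pod and its current phase (Pending, Running, Failed, ...).
for pod in v1.list_namespaced_pod("ai-workloads").items:
    print(f"{pod.metadata.name}: {pod.status.phase}")
```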
