Senior Cloud Engineer (SD/TX/DC/Boston)

2 Months ago • 4 Years + • $110,646 PA - $165,970 PA

DevOps

Job Description

Shield AI is looking for a Cloud Engineer to support its leadership in applied artificial intelligence development. In this role, you will be responsible for engineering, deploying, provisioning, and managing critical cloud systems that drive innovation across Shield AI’s public and private cloud environments, both domestically and internationally. As part of the Cloud and Infrastructure team within Enterprise Operations, you will play a key role in ensuring the performance, scalability, and reliability of these systems to support various business units. This position may involve occasional travel to Shield AI locations.

Good To Have:

Proven engineering experience deploying and maintaining workloads in the Azure public cloud.
Fundamental understanding of at least one virtualization platform for private cloud (e.g., VMware, Hyper-V, KVM).
Experience in DevOps, Site Reliability Engineering, or cloud infrastructure roles.
Familiarity with configuration management tools like Ansible, Chef, or Puppet.
Experience building robust monitoring and alerting systems for mission-critical applications.
Solid understanding of CI/CD pipelines and the ability to optimize them.

Must Have:

Oversee the day-to-day management and optimization of cloud-based infrastructure (e.g., Azure, AWS).
Support and optimize cloud and virtual machine environments, assisting with capacity planning, performance monitoring, security compliance, and vulnerability remediation.
Assist in implementing and maintaining infrastructure systems, including servers, storage, backup solutions, and disaster recovery processes, for both public and private clouds.
Demonstrate a willingness to learn and work with familiar or unfamiliar operating systems and workloads, with the desire to automate repeatable tasks.
Author the necessary documentation for engineered and maintained systems, along with associated processes, for supporting teams to leverage.
Assist in researching, recommending, and developing innovative solutions for complex requirements and issue resolution.
Participate in Agile methodologies and sound engineering principles.
Perform daily system monitoring, verifying the integrity and availability of all server resources, systems, and key processes, and reviewing system and application logs.
Support system maintenance and upgrades, including OS patching, software configuration, hardware updates, and performance tuning to ensure optimal cloud infrastructure performance.
Provide escalated support for operational issues, possibly both during and after normal business hours, for systems, workloads, and Kubernetes AI infrastructure.
Analyze, troubleshoot and resolve system infrastructure and software issues.
Be able to participate in on-call, emergency, or maintenance roles.

Perks:

Bonus
Benefits
Equity
