Kubernetes Platform Engineer Job at Bay Systems Consulting Inc., Berkeley, CA

  • Bay Systems Consulting Inc.
  • Berkeley, CA

Job Description

We are seeking a Kubernetes Platform Engineer to join the Platform Engineering team as a hands-on individual contributor. This role focuses on day-to-day operations and administration of Kubernetes clusters, primarily on-premises (K3s/RKE2) with additional support for cloud environments on Google Cloud Platform (GCP) and Amazon Web Services (AWS). You will manage cluster lifecycle operations, implement and maintain Cilium-based networking, troubleshoot complex platform issues, and enable development teams to successfully deploy and operate their workloads. This position balances infrastructure operations with developer enablement, requiring both deep technical expertise and strong collaboration skills.

The Team

The Platform Engineering team is a small team within ESnet's Systems and Software department that is dedicated to streamlining the software development lifecycle by establishing standardized processes for building, configuring, and deploying applications. The team supports the engineering, implementation, and maintenance of ESnet's platform systems and services including GitLab, Ansible, and Kubernetes environments, with responsibility for both on-premises and cloud-based services deployed across Google Cloud Platform (GCP) and Amazon Web Services (AWS).

Major Responsibilities

Cluster Operations & Administration

  • Manage the full lifecycle of Kubernetes clusters (on-premises K3s/RKE2, GKE, and EKS), including upgrades, security patching, scaling, and capacity planning
  • Troubleshoot cluster-level issues including control plane problems, node failures, and resource constraints
  • Implement and maintain cluster security hardening based on CIS benchmarks and organizational security policies
  • Manage etcd cluster health, backup procedures, and disaster recovery capabilities
  • Monitor cluster performance and optimize resource utilization across multi-tenant workloads
  • Coordinate with the datacenter operations team for physical infrastructure changes and maintenance windows

Networking & Cilium CNI

  • Implement, configure, and maintain Cilium CNI across on-premises and cloud Kubernetes environments
  • Design and enforce network policies to achieve secure multi-tenant isolation
  • Troubleshoot complex pod networking issues including DNS resolution, service discovery, and connectivity problems
  • Configure and maintain BGP peering with physical network infrastructure for on-premises integration
  • Work with the network engineering team on firewall rules, VLANs, IPv6 networking, and network architecture

Internal Developer Platform & Enablement

  • Contribute to building a next-generation internal developer platform inspired by tools like Backstage, focused on increasing development efficiency and security
  • Work with the security team to define secure image baselines and automate the patching pipeline for container images
  • Assist development teams with deploying, configuring, and troubleshooting Kubernetes workloads
  • Review application deployment manifests and provide guidance on best practices and optimization
  • Develop and maintain platform documentation, runbooks, and self-service guides
  • Engage with development teams to understand platform needs and tailor the cluster experience to meet evolving requirements

Required Qualifications

  • Typically requires a minimum of 8 years of related experience with a Bachelor’s degree; or 6 years and a Master’s degree; or equivalent experience.
  • Demonstrated experience administering Kubernetes on on-premises infrastructure (K3s, RKE2, or similar bare-metal distributions)
  • Experience with cloud-managed Kubernetes (GKE and/or EKS)
  • Strong understanding of Linux networking fundamentals: iptables/nftables, routing tables, DNS, TCP/IP stack, network troubleshooting
  • Experience with GitOps methodologies and tools such as ArgoCD or Flux
  • Proficiency in scripting and automation: Bash, Python, Go
  • Production experience with Cilium CNI or an equivalent CNI
  • Ability to work collaboratively in a team environment and communicate technical concepts clearly
  • Understanding of Kubernetes security best practices including Pod Security Standards, RBAC, and secrets management
  • GCP (Google Cloud Platform) and/or AWS (Amazon Web Services) cloud platform experience

Preferred Qualifications

  • Go programming experience for operator maintenance and platform tooling development
  • CKA (Certified Kubernetes Administrator) or CKS (Certified Kubernetes Security Specialist) certification
  • Background in BGP routing protocols and network engineering concepts
  • IPv6 networking experience
  • Infrastructure as Code experience with Terraform or Ansible
  • Experience with internal developer platform (IDP) tools such as Backstage or similar
  • Experience with service mesh technologies (Istio, Linkerd)
  • Excellent understanding of code review and familiarity with GitHub and GitLab workflows
