
Job Details

Kafka Operations Administrator

2026-05-29     ACESTACK     Tyler, TX
Description:

Role: Kafka Operations Administrator
Location: Seattle, WA / St. Louis, MO / TX
FTE Only

Job Description

Must Have Technical/Functional Skills:
• Production-grade Apache Kafka operations experience: managing, maintaining, and upgrading Kafka clusters in production environments, with a focus on high availability, disaster recovery, failover, and overall reliability
• Proficiency in installing and configuring monitoring systems using Grafana (building dashboards), Prometheus, Splunk, and JMX metrics
• Automation and orchestration experience: Terraform, Ansible, Helm, Kubernetes (EKS/AKS/GKE)
• Strong Linux system administration experience, including troubleshooting, automation, and scripting for efficient infrastructure management
• Experience in production support (following ITIL processes), participating in 24x7 on-call rotations, and documenting incidents/postmortems
• Experience supporting JVM tuning, GC analysis, and network and disk I/O diagnostics
• Experience with TCP/IP, routing, switching, and firewall configurations relevant to Kafka operations

Good to Have:
• Deep Kafka performance tuning and capacity planning experience
• Knowledge of message delivery semantics and guarantees (at-least-once, exactly-once)
• Cloud-native security/compliance experience (IAM, VPC, KMS, Security Groups)
• Certifications: Confluent Certified Administrator, AWS/Azure/GCP certifications
• Experience with Apache Kafka in KRaft mode, including setup, configuration, troubleshooting, and cluster management
• Containerization and container orchestration tools experience: Docker, Kubernetes
• Experience with CI/CD pipelines and Git-based workflows
• Experience building custom Kafka Connect libraries and an understanding of data serialization formats (e.g., Avro, JSON)
• Knowledge of networking concepts across on-prem VMs and cloud environments, ensuring seamless integration and communication between services
• Strong understanding of topic management and security best practices for streaming platforms: TLS, ACLs, RBAC, encryption at rest/in transit
• Kafka ecosystem tooling experience: Kafka Connect, Schema Registry

Role and Responsibilities:
• Deploy, configure, and manage Kafka clusters and related services to meet SLA requirements
• Participate in a 24x7 on-call rotation to respond to incidents, alerts, and escalations
• Triage, diagnose, and remediate production incidents; coordinate with stakeholders, developers, and infrastructure teams
• Implement automation for provisioning, scaling, server/data backups, and disaster recovery
• Maintain monitoring, alerting thresholds, dashboards, and Kafka ecosystem health
• Harden Kafka deployments: configure TLS, ACLs, RBAC, and encryption, and remediate vulnerabilities
• Perform routine maintenance: Kafka ecosystem upgrades (controllers, brokers, Connect, and Schema Registry), rolling restarts, etc.
• Create and maintain runbooks, runbook automation, and post-incident reports
• Optimize performance and resource utilization; benchmark and tune clusters
• Support the Kafka Connect/Schema Registry service and troubleshoot connector issues
• Contribute to CI/CD pipeline improvements for infrastructure and deployment automation

