OpenShift Telemetry Engineer


The Role
We are seeking a skilled OpenShift Telemetry Engineer to join our team. In this role, you will be responsible for implementing, managing, and optimizing the observability stack within a Red Hat OpenShift Container Platform environment to ensure system health, performance, and security. You will act as a bridge between application monitoring and infrastructure observability, leveraging modern telemetry and data streaming tools.

Key Responsibilities
- Design, implement, and maintain data pipelines to ingest and process OpenShift telemetry data (metrics, logs, and traces) at scale.
- Stream OpenShift telemetry through Kafka (producers, topics, schemas) and build resilient consumer services for transformation and enrichment.
- Engineer data models and routing mechanisms for multi-tenant observability while ensuring data lineage, quality, and SLA adherence across streaming layers.
- Integrate processed telemetry into Splunk for dashboards, visualization, alerting, and analytics to achieve Observability Level 4 (proactive insights).
- Implement schema management, governance, and versioning using Avro or Protobuf for telemetry events.
- Build automated validation, replay, and backfill mechanisms to ensure data reliability and recovery.
- Instrument services with OpenTelemetry, standardizing tracing, metrics, and structured logging across platforms.
- Utilize LLM-based capabilities to enhance observability (e.g., query assistance, anomaly summarization, runbook generation).
- Collaborate with Platform, SRE, and Application teams to integrate telemetry, alerts, and SLOs.
- Ensure security, compliance, and best practices for telemetry data pipelines and observability platforms.
- Document data flows, schemas, dashboards, and operational runbooks.
Required Skills & Experience
- Hands-on experience building streaming data pipelines with Kafka (producers/consumers, schema registry, Kafka Connect, KSQL, Kafka Streams).
- Strong experience with OpenShift/Kubernetes telemetry, including OpenTelemetry and Prometheus.
- Experience integrating telemetry into Splunk (HEC, Universal Forwarder, source types, CIM) and building dashboards and alerts.
- Strong data engineering skills using Python (or similar languages) for ETL/ELT, enrichment, and validation.
- Experience with event schemas (Avro, Protobuf, JSON) and schema compatibility strategies.
- Familiarity with observability frameworks and maturity models, driving toward Level 4 observability (proactive monitoring and automated insights).
- Understanding of hybrid cloud and multi-cluster telemetry architectures.
Preferred Skills
- Security and compliance practices for data pipelines, including secret management, RBAC, and encryption in transit and at rest.
- Strong problem-solving and analytical skills.
- Ability to work effectively in cross-functional teams.
- Excellent communication and documentation skills.
Location:
United Kingdom
Job Type:
Full Time
Category:
IT; Engineering