Senior ITSMA Observability Engineer
Job in Fort Worth, Tarrant County, Texas, 76102, USA
Listed on 2026-05-31
Listing for: HedgeServ Corporation
Full Time position
Job specializations:
- IT/Tech: Cloud Computing, Data Engineer, Systems Engineer
Job Description & How to Apply Below
At HedgeServ, we’re redefining what’s possible in fund administration. With more than $700 billion in assets under administration, we partner with leading investment managers across private equity, private credit, endowments, hedge funds and more to deliver tech‑enabled solutions that drive performance.
Role Responsibilities:
- Design, build, secure, maintain and optimize the Elastic & Prometheus stack, including AWS observability tools, for critical applications and infrastructure.
- Create alerting mechanisms, escalation paths, dashboards and the overall framework to support infrastructure, systems and application monitoring.
- Lead IT infrastructure monitoring projects, vendor management, and daily operations, providing SME escalation support as needed.
- Collaborate with application owners, engineers and development teams to gather requirements and engineer solutions using existing monitoring capabilities or custom scripts.
- Architect and support an Elasticsearch stack solution, structuring queries to improve system performance and efficiency.
- Design and configure ETL data pipelines using the Elastic Common Schema for onboarding application logs and metrics; configure index templates and manage data lifecycle (ILM) for retention.
- Develop Ansible playbooks for automated deployment of Beat agents across on‑premises and AWS systems and use Terraform to manage production infrastructure as code.
- Create Elastic alerting solutions via Watcher and Kibana alerts integrated with ticketing tools and Microsoft Teams.
- Develop machine learning jobs to monitor metrics and KPIs, and build AI observability solutions that enable infrastructure engineering and operations teams to address production issues efficiently.
- Adhere to lifecycle processes for moving solutions from Development → QA → Production and participate in agile sprint meetings and collaborative group sessions.
- Technical degree in Information Technology or related field.
- Experience with Elastic Cloud (ELK Stack) and AWS‑managed Prometheus.
- Installation, system tasks, data collection, network troubleshooting, data pipelines and cluster administration skills.
- Proficiency in Python, Bash, PowerShell, Painless and other scripting languages.
- Extensive ELK Stack expertise: Elasticsearch, Logstash, Kibana, Beats, Machine Learning, APM, X‑Pack, REST API integration.
- Evaluation and tuning of Elastic clusters, configurations, indexing, search performance, security and administration.
- Strong knowledge of Prometheus, Grafana, AWS observability tools, performance, security and management.
- Experience with security integrations (Windows SAML, LDAP, Kerberos) in Elasticsearch.
- Expertise with AWS services: CloudWatch, CloudTrail, Kubernetes, Docker, Lambda.
- Integration of Elastic alerting with third‑party ticketing tools and observability AI agents/frameworks for automated analysis and incident detection.
We offer competitive salary & benefits packages and ongoing learning and development opportunities.
Position Requirements:
10+ years of work experience