Site Reliability Engineer
Listed on 2026-06-13
IT/Tech
SRE/Site Reliability, IT Support, Systems Engineer, Cloud Computing: Infrastructure & Operations
Employment Type
Full-time, W2 position with Cogent People Inc. This is a direct hire position with full benefits.
Location
Hybrid in Columbia, MD, 3 times per week, OR Remote (as applicable to role)
Work Authorization Requirements
To comply with government contracting requirements, candidates must meet all of the following:
- Must be a U.S. Citizen, Permanent Resident, or valid EAD holder
- Must have lived in the United States for at least 3 of the past 5 years
- Must be currently authorized to work in the U.S. without sponsorship
Sponsorship is not available for this position (now or in the future). Candidates who do not meet these requirements will not be considered.
Clearance Requirement
Public Trust required or ability to obtain, depending on assignment.
About Cogent People Inc.
Cogent People Inc. is a government consulting and technology services firm supporting mission-critical federal and commercial programs. We deliver secure, scalable, and modern digital solutions across complex IT environments. Our teams thrive at the intersection of engineering excellence and mission impact, building systems that matter.
Job Overview
Cogent People Inc. is seeking a Site Reliability Engineer to support system reliability, monitoring, and operational stability across environments. This role is responsible for implementing observability and automation practices, supporting production systems, and ensuring system performance and availability. The position plays a key role in incident response, root cause analysis, and ongoing system optimization in collaboration with DevOps and development teams.
The ideal candidate will bring experience in system monitoring, DevOps practices, and production support, along with the ability to collaborate across cross‑functional engineering teams in a fast‑paced environment. This position may be contingent upon contract award.
- Support system reliability, monitoring, and operational stability across environments
- Implement and maintain observability practices, including monitoring, logging, and alerting
- Contribute to automation efforts that improve system reliability and operational efficiency
- Participate in incident response activities and production support
- Perform root cause analysis for system issues and outages
- Support performance optimization and tuning of applications and infrastructure
- Work with DevOps and development teams to maintain production readiness
- Contribute to continuous improvement of deployment and operational processes
- Collaborate across engineering teams to support stable and scalable systems
- Bachelor’s degree in Computer Science, Information Systems, or a related field, or an equivalent combination of education and experience
- Experience in system reliability, DevOps, or production support roles
- Experience with monitoring, logging, and observability tools
- Understanding of incident management and root cause analysis processes
- Familiarity with cloud environments and infrastructure concepts
- Experience supporting automated deployment or operational workflows
- Strong problem‑solving and troubleshooting skills
- Excellent written and verbal communication skills
- Ability to work effectively in fast‑paced, production‑critical environments
- Strong collaboration skills across development and operations teams
- Experience with AWS or other cloud platforms
- Familiarity with infrastructure-as-code tools (e.g., Terraform or similar)
- Experience with tools such as Splunk, Datadog, Prometheus, or similar observability platforms
- Experience with CI/CD pipelines and DevOps automation tools
- Prior experience supporting enterprise‑scale or regulated environments
- Knowledge of application performance tuning and distributed systems behavior
At Cogent People, we combine technical excellence with a mission‑driven culture. Our teams work on meaningful, high‑impact projects that support government and enterprise transformation initiatives.
We Offer
- Competitive compensation
- Career growth and professional development opportunities
- Exposure to complex, mission‑critical systems
- A collaborative and supportive team environment
- Long‑term client…