Posted 2w ago

Engineering Team Lead - Cloud Stability

@ Bloomberg
New York, New York, United States
Onsite, Full Time
Responsibilities: lead the team, set strategy, drive capacity planning
Requirements Summary: 4+ years as a software team lead; BS/MS/PhD in CS, Engineering, or related field; strong collaboration and communication; experience with cloud infrastructure, automation, and SRE principles.
Technical Tools Mentioned: Ansible, Airflow, Flask, OpenStack, Ceph, VMware, CI/CD, Python, UNIX, Distributed Systems, Terraform, AWX, Kubernetes
Job Description
The Cloud Stability team manages Bloomberg’s private cloud, a critical infrastructure hosting over 100k virtual machines that underpin Bloomberg’s most vital applications, including the flagship Terminal and News systems. The team’s core mission is to efficiently deliver the highly available infrastructure that all clients depend on. Key responsibilities include ensuring the reliability of cloud services, establishing comprehensive observability across the fleet, managing infrastructure maintenance and failure response, and executing predictive capacity planning. The team drives this work through extensive automation built on tools like Ansible, Airflow, and Flask. The private cloud consists of two key offerings: Bloomberg Cloud Compute (BCC), an in-house cloud built on open source technologies such as OpenStack and Ceph, and a cloud built on commercial VMware software.

We are seeking an experienced Team Lead for the Cloud Stability team, based in New York. This leadership role is pivotal: you will define and drive the strategy for running the private cloud, ensuring it continues to support the vast majority of engineering workflows within Bloomberg. You will also be trusted to articulate a future-state vision for how the Bloomberg cloud operates and to forge critical partnerships with internal and external teams to realize that vision.

Some of our initiatives include:
Set a strategy for capacity management that takes into account changes in real-time capacity and planned demand

Develop orchestration to coordinate competing automation

Enable twice-annual reboots of a fleet of ~7k servers

Expand BCC’s global footprint with new production clusters

As Team Lead, you’ll be trusted to:
Inspire and motivate the team to achieve outstanding results, while supporting individual growth and development.

Build partnerships with internal teams and external stakeholders so that problems are well understood and solutions are aligned with expectations.

Organize and prioritize the backlog of work with the team and stakeholders so that the most important and impactful work is addressed first.

Work with the engineers on the team to deliver high-quality solutions that adhere to best practices.

Develop a vision of an optimally run cloud and a roadmap for getting there.

You'll need to have:
At least 4 years of experience as a Team Lead of a software development team

BS/MS/PhD in Computer Science, Engineering, or a related technology field

Ability to foster a collaborative team environment by driving a strong culture of teamwork and drawing on team diversity

Ability to effectively listen, communicate, challenge, and influence team members, peers, and senior managers

Experience building trust-based relationships with stakeholders to pave the way for cross-team collaboration and alignment

A solid foundation in software development, including best practices, code quality, modular design, testing strategies, CI/CD pipelines, and maintainability.

Ability to reason about system behavior, failure domains, and scaling characteristics, enabling effective guidance on stability, reliability, and performance.

Architectural fluency across compute, storage, networking, and orchestration technologies.

We'd love to see experience in:
OpenStack

Cloud infrastructure or SRE team

Virtual networking and software-defined networking (SDN)

Workflow orchestrators such as Airflow and AWX/Ansible

Unix and distributed systems

Agile/Scrum