Posted 3mo ago

Senior Storage Architect

@ CommonAI
Cambridge, England, United Kingdom
Onsite, Full Time
Responsibilities: Design storage, Manage storage, Troubleshoot data
Requirements Summary: Strong Linux administration; Ceph/Lustre/BeeGFS in production; distributed filesystems and cloud storage; tiered storage, scripting (Bash/Python/Terraform/Ansible); security and compliance; container tech (Docker/Kubernetes); observability; communications and documentation.
Technical Tools Mentioned: Linux, Ceph, Lustre, BeeGFS, EC2, Bash, Python, Terraform, OpenTofu, Ansible, Docker, Kubernetes, Prometheus, Grafana, S3, RADOS Gateway, MinIO, AWS, Azure, GCP
Job Description

CommonAI CIC is a non-profit membership organisation, founded on a belief in collaborative engineering for the safe and responsible development of foundational AI technologies. It is a place where AI startups, enterprises large and small, public sector bodies and academia can share resources and knowledge to co-develop and grow businesses, fast.

We support technology-focused startups, each with unique data management challenges, and are seeking an experienced Storage Architect to help them design, deploy and maintain high-performance storage systems for their AI and data-driven workloads. The successful candidate will combine deep experience architecting and managing distributed, cloud, and tiered storage solutions with strong Linux and automation skills.

In this role you will:

  • Design, implement, and maintain storage platforms that support large-scale AI and data pipelines.
  • Manage distributed storage systems such as Ceph, Lustre, or BeeGFS.
  • Oversee tiered storage architectures, optimising data movement across high-performance, object, and archival tiers.
  • Ensure data integrity, availability, and security across on-premises and cloud environments.
  • Develop automation and monitoring tools using Bash, Python, or similar scripting languages.
  • Manage and secure container images and related storage used for AI and ML workloads.
  • Integrate storage systems with public cloud services (AWS, Azure, GCP) and hybrid environments.
  • Troubleshoot complex storage and data flow issues, collaborating closely with AI platform and infrastructure teams.
  • Contribute to ongoing architecture improvements, performance tuning, and capacity planning.