Staff platform engineer and site reliability engineer with 9+ years of experience at startups in various stages of growth, from being the first engineer at a company to designing and managing a developer platform for 150+ engineers. I'm experienced in leading cross-functional teams and organization-wide initiatives that improve reliability, scale systems, and accelerate engineering velocity. I have a proven track record of managing complex vendor relationships, driving platform strategy, and building organizational trust through transparent communication and data-driven decision-making. I have led the migration and modernization of three companies' infrastructure to the cloud, leveraging containerization and Kubernetes. I bring deep expertise in AWS, Kubernetes, and observability practices, especially OpenTelemetry (OTEL), with experience leading incident response, implementing SLOs, and balancing reliability goals with business priorities. My previous experience as a software engineer gives me a deep understanding of software engineering principles, enabling effective communication and collaboration with development teams.

Skills

General

  • Incident Command & Response
  • SLO/SLI Design
  • Vendor Relationships
  • Developer Experience
  • Self-Service Platforms
  • Process Automation
  • Technical Mentoring

Cloud

  • AWS
  • Kubernetes
  • Helm
  • Terraform
  • Kustomize
  • Cloudflare
  • Postgres
  • Redis
  • Kafka

Code

  • Python
  • Bash
  • Ruby
  • Java

Monitoring & CI/CD

  • OpenTelemetry (OTEL)
  • Honeycomb
  • Sumo Logic
  • Prometheus & Grafana
  • GitHub Actions
  • Buildkite
  • ArgoCD

Work Experience

Staff Platform Engineer
Wrapbook
Jan 2022 - Current
  • Led design and execution of migration from Heroku to AWS EKS, an organization-wide infrastructure transformation affecting 40+ engineers across 8 product teams.

  • Reduced CI/CD costs by 60% by moving to self-hosted GitHub Actions runners and leveraging ARC and Karpenter for dynamic provisioning and auto-scaling.

  • Architected SQL-based monitoring solution integrated with Honeycomb, providing real-time visibility into critical business operations and early warning capabilities for incident prevention for product engineering teams.

  • Designed and coded an incident response Slack bot that democratized access to common incident response tools, reducing dependency on platform engineers being available for every incident and shortening time to resolution.

  • Led incident response as de facto Incident Commander, coordinating teams during production issues and managing post-incident retrospectives and action items.

  • Reduced application build times by 50% through optimization of Docker builds, caching strategies, and parallel execution workflows.

  • Pioneered adoption of the OpenTelemetry logging SDK with a custom Rails logger and formatter. Implemented structured logging with additional business metadata sent to Honeycomb, enabling unified correlation between logs and traces.

  • Designed a real-time data pipeline from PostgreSQL to Snowflake via DMS and S3, establishing the foundation for organization-wide analytics capabilities.

  • Provide technical guidance and mentorship to team members, and lead company-wide demos and training sessions for internal tooling.

DevOps Engineer / Manager
Create Music Group
Jun 2020 - Dec 2021
  • Led and managed a team of 3, covering all company infrastructure, IT, and QA operations.

  • Successfully migrated legacy architecture to Kubernetes, improving both Laravel and Node.js application deployment and management lifecycle.

  • Led the transformation of operations practices from legacy methods to continuous deployment and containerization, enabling faster and more efficient software releases.

  • Developed and implemented guidelines and standard practices for cloud security and governance, ensuring compliance with industry regulations and best practices.

  • Collaborated cross-functionally to automate infrastructure provisioning, configuration, and monitoring, resulting in streamlined operations and reduced manual efforts.

Director of Engineering
Cognitive3D
Feb 2016 - Apr 2020
  • Led and managed an engineering team of 5.

  • Designed and implemented ETL systems, including an analytics pipeline built with Flink, Kafka, and Cassandra. The pipeline efficiently processed tens of thousands of requests per second from consumer devices, providing the core product offering.

  • Collaborated with Fortune 100 clients on large-scale cloud integrations and deployments of PaaS (Platform as a Service) solutions, ensuring seamless integration and meeting client requirements for security and compliance.

  • Spearheaded the migration of the company's legacy infrastructure to containers and Kubernetes, leveraging kOps on AWS.

  • Developed microservices with Java and the Play Framework, actively contributing critical components and features.

  • Developed a 3D data viewer with Three.js and AngularJS, contributing to front-end development and creating an immersive visualization experience for users.

Education

Bachelor: Business Technology Management / E-Business
University of British Columbia
2011 - 2015

Certifications

Certified Solutions Architect – Associate
Amazon Web Services (AWS)
May 2020 - Current

Web Development
Lighthouse Labs
Aug 2015

Hobbies & Projects

Home Network / Home Lab
  • pfSense. Proxmox cluster. 4-node Talos cluster. VLANs for IoT, servers, and more. NAS with ZFS and OMV, Backblaze B2, 3-2-1 backup strategy.

Electronics
  • Love for retro. RGB-modded CRTs. Modchipped consoles with Raspberry Pi Picos. Custom cart flasher built on Arduino.

3D Printing
  • Heavily modified CR-10s with new hotends, fans, and custom firmware. Love of tinkering and building things. Focus on functional prints for my home and electronics projects.