Resume

2025-12-10 Current

SRE with over 6 years of experience leading the build-out of observability for large-scale internal IaaS/PaaS. Led the design, construction, and operation of scalable monitoring infrastructure for over 10,000 physical servers and 100,000+ VMs. Proven track record of improving service reliability, from alert notification to automatic recovery. Experienced in team management, recruitment, and leading cross-team technical standardization.

  • Cloud & DevOps: Proficient in designing DevOps systems and working with automation tools (Ansible, Jenkins, StackStorm). Expertise in OpenStack networking and virtualization.
  • Technical Stack: Kubernetes, OpenTelemetry, Grafana Loki, Prometheus, Terraform, Go, Python.
  • Leadership: Led system development and incident handling as Tech Lead. Experience in team management and recruitment.
  • Communication: Strong communication skills for clarifying customer problems and proposing solutions. English (Business), Japanese (Native).
  • Achievements:
    • StackStorm User Group Japan - Board Member
    • Winner of Red Hat’s Ansible Blog Contest
  • Certifications:
    • Network Specialist (2019)
    • Applied Information Technology Engineer (2017)

Senior Engineer, IaaS Dev Department Oct 2023 ~ Present

Led the monitoring design for a new internal cloud platform. Directed technical selection, architecture design, and implementation for the company’s first joint product following the merger.

  • Platform Design: Led the formulation of standard configurations for the company-wide monitoring platform after the merger.
  • Scalability: Designed, implemented, and operated a metrics/log storage and high-speed query platform for physical servers at the 100,000-server scale.
  • Automation: Designed, implemented, and operated a physical server recovery automation system (Kubernetes operator).
  • Observability: Designed and implemented long-term SLI calculation procedures. Executed the introduction of new monitoring components to the existing cloud platform.
  • Process: Led the standardization of monitoring platform usage processes and internal coordination.
  • Tech Stack: Kubernetes, OpenTelemetry, Grafana Loki, Prometheus, Terraform, Go, Python.

Reliability Engineer (Manager / Engineer) Jun 2019 ~ Sep 2023

Led improvements and new development of IaaS monitoring and deployment systems for the Verda Reliability Engineering Team.

  • Management (Jul 2021 ~):
    • Led the identification and resolution of cross-team issues, including system design for storing/querying metrics/logs at a TB/day scale.
    • Standardized documentation policies and templates.
    • Managed recruitment strategies, designing technical/behavioral interviews.
    • Spoke publicly and supported team members in speaking at conferences.
    • Led data center-level compute resource migration.
  • Engineering:
    • Improved scalability and high availability of metrics monitoring infrastructure for tens of thousands of physical servers.
    • Reduced physical server deployment time from 9 hours to 1 hour.
    • Built an automatic API metrics collection system by introducing a Kubernetes sidecar API gateway.
    • Designed and implemented SLO definitions and SLI measurement systems, including for OpenStack Nova.
    • Acted as a solution architect for users and directed incident handling.
  • Tech Stack: Kubernetes, Prometheus, Grafana, Terraform, Go, Python.

NTT Technocross Corporation (Tokyo, Japan)

Engineer, Cloud & Security Division Apr 2016 ~ May 2019

  • Maintained and operated NTT Group’s OpenStack-based cloud services, and designed and developed a new internal cloud platform.
  • Led the design of network and deployment methods for the cloud platform.
  • Conducted R&D with NTT Laboratories on abstracting construction interfaces and codifying construction procedures for numerous network devices.
  • Developed and supported CI/CD systems for the OpenStack-based cloud and low-layer infrastructure.
  • Lectured and wrote technical content on DevOps as the youngest instructor in the company.
  • Tech Stack: OpenStack, Ansible.