Staff Platform Resilience Event Manager
Apex Fintech Solutions
Who We Are
Apex Fintech Solutions (AFS) powers innovation and the future of digital wealth management by processing millions of transactions daily, to simplify, automate, and facilitate access to financial markets for all. Our robust suite of fintech solutions enables us to support clients such as Stash, Betterment, SoFi, and Webull, and more than 20 million of our clients' customers.
Collectively, AFS creates an environment in which companies with the biggest ideas in fintech are empowered to change the world. As a global organization, we have offices in Austin, Dallas, Chicago, New York, Portland, Belfast, and Manila.
If you are seeking a fast‑paced and entrepreneurial environment where you'll have the opportunity to make an immediate impact, and you have the guts to change everything, this is the place for you.
Awards
2021, 2020, 2019, and 2018 Best Wealth Management Company – presented by FinTech Breakthrough Awards
2021 Most Innovative Companies – presented by Fast Company
2021 Best API & Best Trading Technology – presented by Global FinTech Awards
About This Role
The Staff Platform Resilience Event Manager is responsible for the strategic planning, coordination, and/or execution of platform resilience events across our technology ecosystem. This includes game day events, disaster recovery testing, business continuity exercises, vendor disaster recovery coordination, and regulatory‑driven resilience demonstrations.
This role serves as the central orchestrator across the organization to ensure our resilience posture is continuously validated, documented, and improved. You will transform resilience testing into a strategic capability that demonstrates enterprise operational maturity.
This is not a traditional event planning role—it requires deep understanding of distributed systems, financial services regulations, incident command structures, and cross‑functional program management in a high‑stakes environment.
Resilience Event Strategy & Annual Planning
Develop and maintain the annual Platform Resilience Event Calendar spanning all disaster recovery tests, business continuity exercises, game days, and vendor coordination events.
Align event schedule with regulatory examination cycles, customer audit requests, and internal risk assessment priorities.
Define success criteria and maturity progression for resilience events (e.g., tabletop → walkthrough → full failover → automated chaos).
Maintain our risk register with updates based on resilience event findings.
Game Day & Chaos Engineering Program Leadership
Design and facilitate “game day” exercises that inject controlled failures into production or staging environments to validate system resilience.
Partner with Engineering, SRE, Product and Ops teams to develop realistic failure scenarios (database outages, network partitions, dependency failures, traffic spikes).
Build game day playbooks, observer guides, and scoring rubrics to measure system and team response effectiveness.
Evolve game day maturity from scheduled events to surprise/unannounced exercises with appropriate stakeholder buy‑in.
Vendor Disaster Recovery Coordination
Coordinate Platform participation in, and preparation for, DR/BC events together with the Enterprise Risk team.
In partnership with the Enterprise Risk Team, organize and coordinate vendor‑led DR tests, ensuring Apex participation and validation of vendor recovery capabilities.
Ensure vendor DR documentation (runbooks, RTO/RPO commitments, contact lists) is current and accessible during incidents.
Ensure the inventory of critical third‑party vendors (cloud providers, tech vendors, service providers, SaaS/IaaS/PaaS services) is maintained along with their contractual DR/BC obligations.
Cross‑Functional Stakeholder Alignment
Serve as primary liaison between Platform and Compliance, Legal, Enterprise Risk Management, and Internal Audit.
Integrate Security incident response scenarios into resilience events (e.g., ransomware recovery, insider threat).
Translate technical resilience outcomes into compliance artifacts, audit evidence, and regulatory examination responses.
Coordinate with Legal on customer contractual obligations for DR demonstrations and availability SLAs.
Regulatory, Audit and Client Contractual Readiness
Maintain compliance with FINRA Rule 4370 (Business Continuity Plans), SEC regulations, and state‑level financial services resilience requirements.
Produce request‑ready documentation: game day results and findings, resilience metrics, and improvement tracking.
Support regulatory examinations by providing examiner‑requested evidence of resilience testing and improvement trends.
Track and report findings.
Metrics, Reporting & Continuous Improvement
Define and track key resilience metrics: RTO/RPO actuals, DR test success rates, mean time to failover, game day findings, vendor DR SLA compliance.
Produce quarterly executive dashboards on resilience posture, event outcomes, and improvement initiatives.
Maintain centralized repository of runbooks, event after‑action reports, and lessons learned.
Drive continuous improvement by converting event findings into actionable engineering backlogs and process improvements.
Benchmark Apex resilience maturity against industry standards (e.g., Gartner, NIST, financial services peers).
Incident Response Integration
Ensure resilience events validate and improve actual incident response capabilities (not just technical recovery).
Integrate platform events with ITSM Incident Management training to build muscle memory for real outages.
Validate incident communication plans during events (customer notifications, executive escalations, status pages).
Use real incidents as inputs for future game day scenarios (e.g., replay last quarter’s outage in a controlled environment).
Education And/or Experience
Bachelor’s degree in a technical field (or equivalent work experience) required.
10+ years in technology operations, site reliability engineering (SRE), DevOps, or infrastructure roles.
3+ years in financial services technology (preferably broker‑dealer, clearing, custody, or payments).
Hands‑on experience with disaster recovery planning and execution in complex, distributed systems environments.
Experience supporting regulatory examinations and producing compliance documentation.
Required Skills/Abilities
Financial Services & Regulatory Knowledge
Working knowledge of FINRA, SEC, and financial services regulatory requirements for business continuity and disaster recovery.
Understanding of third‑party risk management in regulated environments.
Technical Knowledge and Program & Project Management
Understanding of cloud infrastructure (AWS, Azure, GCP), database failover, load balancing, and multi‑region architectures.
Familiarity with incident command systems, runbook automation, and monitoring/observability platforms.
Proven ability to manage complex, cross‑functional programs with multiple stakeholders and competing priorities.
Experience leading high‑stakes, time‑sensitive events requiring real‑time coordination and decision‑making.
Strong project management skills: planning, scheduling, resource coordination, status reporting.
Comfort with ambiguity and ability to build new programs from the ground up.
Communication & Leadership
Executive presence: able to brief C‑suite and board on resilience posture and event outcomes.
Exceptional written communication: producing regulatory reports, audit evidence, executive summaries.
Incident command or crisis management experience preferred.
Ability to influence without authority across technical and non‑technical teams.
Other Preferred Qualifications
Certifications: CBCP, CISSP, GCP/AWS/Azure certifications, ITIL.
Chaos engineering experience (Chaos Monkey, Gremlin, etc.).
Background in internal audit, GRC, or compliance roles.
Experience with tabletop exercises and red team/blue team scenarios.
Work Environment
This is a hybrid role, with 3 days per week in the office.
Location: Austin, TX
Seniority Level: Mid‑Senior level
Employment Type: Full‑time
Job Function: Information Technology
Industries: IT Services and IT Consulting, Information Services
Our Rewards
We offer a robust package of employee perks and benefits, including healthcare benefits (medical, dental and vision, EAP), competitive PTO, 401k match, parental leave, and HSA contribution match. We also provide our employees with a paid subscription to the Calm app and offer generous external learning and tuition reimbursement benefits. At AFS, we offer a hybrid work schedule for most roles that gives employees the flexibility of working from home and from one of our primary offices.
EEO Statement
Apex Fintech Solutions is an equal opportunity employer that does not discriminate on the basis of race, color, religion, sex (including pregnancy, sexual orientation, and gender identity), national origin, age, disability, veteran status, marital status, or any other protected characteristic. Our hiring practices ensure that all qualified applicants receive fair consideration without regard to these characteristics.
Disability Statement
Apex Fintech Solutions is committed to creating an inclusive and accessible workplace for all candidates, including those with disabilities. We are dedicated to ensuring equal employment opportunities and providing reasonable accommodations to qualified individuals with disabilities. If you require reasonable accommodations to participate in the application or interview process, please submit your request via the Candidate Accommodation Requests Form.
Additional Information
Please note this job description is not designed to cover or contain a comprehensive listing of activities, duties, or responsibilities required of the employee for this job. Duties, responsibilities, and activities may change at any time with or without notice.