Morgan Stanley

Director, Software Engineer

Morgan Stanley, New York

Job Title

Design and develop natural language processing methodologies for information extraction from textual data. Utilize specialized knowledge of fixed income classes and financial documents to develop natural language processing models. Collaborate with internal technical stakeholders to translate complex customer requirements into tailored natural language processing models. Design, implement, and integrate natural language processing models with existing systems. Design, implement, and support scalable, reliable, high-performance services. Perform exploratory data analysis to build high-quality training and validation datasets for model training and evaluation. Apply the latest advances in deep learning and natural language processing to improve existing data models, data pipelines, and data featurization. Design, implement, test, and maintain distribution of natural language processing components for production use. Manage project priorities, deadlines, and deliverables. Telecommuting permitted up to 1 day per week.

Salary: Expected base pay rates for the role will be between $149,000 and $165,000 per year at the commencement of employment. However, base pay if hired will be determined on an individualized basis and is only part of the total compensation package, which, depending on the position, may also include commission earnings, incentive compensation, discretionary bonuses, other short and long-term incentive packages, and other Morgan Stanley sponsored benefit programs.

Requirements: Requires a Bachelor's degree in Computer Engineering, Computer Science, or a related field and two (2) years of experience in the position offered or two (2) years as a Manager, Associate, Analyst, or a related role in the technology field. Requires two (2) years of experience with: using Python to implement end-to-end workflows, from data collection and preprocessing to model training and evaluation; data analytics utilizing NumPy and Pandas; data manipulation, data cleaning, transformation, and exploratory data analysis (EDA); Natural Language Processing (NLP) including supervised and unsupervised learning algorithms; feature engineering, model training, and evaluation for text classification, summarization, question answering, named entity recognition, and sentiment analysis; using NLP toolkits including NLTK, SpaCy, and Gensim; deep learning frameworks including PyTorch and TensorFlow for building, training, and deploying neural networks; Hugging Face Transformers library to leverage and fine-tune pre-trained models including BERT, BART, and T5; model evaluation and hyperparameter tuning with Scikit-Learn; data visualization libraries including Matplotlib, Seaborn, and Plotly to create insightful and interactive visualizations; Flask; fixed-income securities and structured finance documentation for Commercial Mortgage-Backed Securities (CMBS), Residential Mortgage-Backed Securities (RMBS), Asset-Backed Securities (ABS), and Collateralized Loan Obligations (CLO); and reviewing and interpreting credit rating reports, EDGAR forms, and EDGAR exhibits.