Writings and Perambulations
New Post! Summer of AI - An AgBiome Perspective (interview)
My expertise is in building and leading teams of Data Scientists, Machine Learning Engineers, Data Engineers, and Data Product experts of diverse experience levels, distributed across cultural and knowledge bases, to drive innovation, product development, and cutting edge, actionable research. I have a proven track record of mentoring, growing, executing, and building strategy around ambitious goals across the Data Science technical platform to provide business value.
My background is in Quantitative Cell Biology and I have a PhD in Biochemistry and Molecular Biology. I spent much of my career researching novel cancer therapeutics through live cell imaging and machine learning. You can now find me developing and implementing machine learning platforms and algorithms for everything from optimization problems to personalized medicine and medicinal AI. I have researched and built novel recommenders, 24/7 production machine learning solutions, and received patents for personalization solutions in the consumer sales space and chemical profile recommenders.
Most recently, I have been working in the Biotechnology space where I have developed the long-term strategy and vision for Data Science, led a growing team of Data Scientists to develop novel algorithms in the predictive modeling and Bioinformatics spaces, and accelerated biological discovery. Moreover, I designed and implemented novel algorithms to detect differential modes of action in both natural products and whole microbe systems, and engineered a proprietary online learning platform to serve as an artificial intelligence decision guidance system for our extensive microbial collection and proprietary product discovery platform (Genesis), a digital twin. I rebuilt global research and enablement programs across domains for knowledge generation and product creation and delivery using cutting edge deep learning techniques on unstructured records, genomics, omics, and image data as the global lead for digital science discovery and delivery programs. I established, built, and implemented data governance and provenance at both the discovery and enterprise levels, and built multimodal knowledge graphs using LLM guidance to develop GraphRAG technologies in-house for production use in the global pipeline.
I am now pursuing my passion work in improving patient outcomes utilizing cutting edge, clinical AI. The company is in early inception phases, but we are already building out the next generation of clinical AI technology. I would love to discuss with you the exciting work we are doing, so please reach out!
Currently
Mayo Clinic (Rochester, MN)
SENIOR ARCHITECT AND TECHNICAL DEVELOPMENT LEAD
More about my work to come. Reach out to me through email or LinkedIn to chat.
Experience
17 years of production Data Science/ML/AI experience and leadership
17 years of rigorous scientific training in academic programs - healthcare and precision medicine
11+ years solutioning ML/AI architecture
8+ years of Data Science and Machine Learning Engineering Leadership and Strategy Development at the Director or VP level
5 patents awarded or in final status for the development and application of novel applications of ML/AI
Teams of 25+ members across data and product disciplines
21 years of production Software Engineering experience
8 years of ML/AI and advanced analytics product management
Industry proven leader in Data Science and Machine Learning Innovation across the AI landscape
Previously
GLOBAL TRAITS DIGITAL SCIENCE LEAD, GLOBAL TRAITS DISCOVERY AND DELIVERY PROGRAM LEAD
BENCHLING RESEARCH PIPELINE TECHNICAL LEAD
Led Trait (gene, phenotype) Discovery and Delivery efforts within RD and IT Digital Science across the entire Syngenta global space
Created, built, and led the Generative AI program to automate and optimize gene delivery processes to target specific traits (phenotypes) or gene expression objectives
Created and led the program to use multimodal modeling with structured and unstructured data, and foundation LLMs for knowledge generation, product discovery and delivery pipeline performance guidance, regulatory audits, and prompt-based experimental design
Owned the stack, spanning data model, application landscape, and AI research initiatives, from gene discovery to trait (phenotype) introgression, including building API integrations and MLOps and DevOps workflows
Team comprised business (Analysts, Architects, Delivery Managers, Scrum Masters), IT (Developers, QA, MLOps, DevOps), and science (SMEs, Product Owners, Wet Lab Researchers) domains - both domestic and distributed teams
Mentored junior team members with a focus on professional development and upskilling opportunities
Led Data Science efforts to research and deploy foundational Large Language Models (LLMs) for protein design to predict expression levels of various molecular biology constructs as a software workbench for bench researchers - work emphasized multimodal data and contextualization/fine tuning of foundation models
Built strategy and vision across discovery and delivery to streamline scientific application portfolio, create single source of truth data streams, and initiated foundational work to bring cutting edge ML/AI technologies to the scientific pipeline
Communicated strategy and work initiatives to secure funding and oversee resourcing of execution
Collaborated with stakeholders across research and business domains to ensure cooperative acceleration and growth
Previously
DIRECTOR OF DATA SCIENCE/ML/AI AND TECHNICAL LEAD/STRATEGIC VISION
Served as director of Data Science/Machine Learning/Artificial Intelligence strategy and innovation initiatives.
Created, built, deployed, then led the genomic AI program for disease prediction, novel mode of action identification, biomarker discovery, and understanding population variation from DNA to signaling pathways/omics.
Served as technical lead for the development and utilization of mixed effects models to explain the impact of environmental factors on phenotypic outcomes in the background of genomic models.
Led a growing team of Data Scientists working to drive biological research and accelerate product development.
Team size ranged from 7 to over 20, across disciplines of Data and Product.
Developed and drove a 4-year strategy and vision for Data Science, Data Engineering, Bioinformatics, and Data Product for the entire organization.
Onboarded artificial intelligence methodologies like Generative AI and Large Language Models for genomic information and protein variant creation.
Owned the stack, spanning data model, application landscape, and AI research initiatives, across all of ML/AI and Data Science, including building API integrations and MLOps and DevOps workflows.
Advocated for and led education on Data Science/ML/AI products and platforms for internal teams, the executive team and board members, and external partners and collaborators (customers).
Directly contributed to high impact publications and was invited to speaking engagements.
Served as Data Science liaison across the company, communicating strategy, accomplishments, initiatives, and best practices. Developed the long-term Data Science strategy and vision, hiring plan, technological innovation pipeline, and overall project management for the department.
Developed a proprietary predictive artificial intelligence decision guidance system to serve as a core to our proprietary natural and whole microbe product discovery platform - GENESIS, a digital twin technology in part combining genomics and geospatial data modeling. This digital twin technology has already generated a large corpus of actionable research, product leads, and accelerated screening paradigms to put the best leads in the field. Excitingly, my team was able to harness GENESIS to generate cross-indication predictive models to drive and accelerate lead identification across the research platform.
Led Data Science contributions to external manuscripts and internal white papers covering research, Data Science, and SOPs.
Sat on a team of technical leaders that drove research initiatives across the company and served as a hive mind to solve challenges across domains. Identified and addressed gaps in research and Data Science.
Developed a novel algorithm to use interpretable machine learning model ensembles to traverse genomic annotations and drive mode of action discovery from small datasets. Designed and developed a predictive modeling platform to accelerate product discovery in indication screening. Designed and developed a cross-indication prediction platform to enable multi-target identification.
Two publications and multiple patentable IP technologies developed within first year.
Previously
DIRECTOR OF DATA SCIENCE/ML/AI AND PRINCIPAL, RESEARCH AND MACHINE LEARNING
Envisioned and created the Data Science function from scratch; my AI-driven personalization program drove over 10x user growth and rapid gains in key KPIs like LTV, spend, and user ratings.
Created and maintained the core, patented algorithms behind the Firstleaf wine club using shallow, rules-based, and deep learning machine learning and AI technologies on both molecular data and marketing big data, and ran the DevOps and MLOps for the real-time, 24/7 AI platform.
Designed, built, deployed, then led a set of interpretable model algorithms to generate industry first user profiles built on billions of data points per user.
Designed, built, deployed, then led a data-driven product creation AI built on molecular and consumer profile data, optimized through MCMC parameterization, that could scope down to zip code level targets and was used both in standard creation workflows and in running regional wine clubs like the LA Times.
I built and led a local and distributed team of Data Scientists and Machine Learning engineers (The Research and Machine Learning Team) developing machine learning and AI platforms to drive real time recommendations, inform business strategy, and create/integrate with product design life cycle. This included the end-to-end development of the ML stack using both traditional ML and cutting edge deep learning techniques including computer vision, generative AI, and NLP, along with novel algorithm development.
Team size ranged from 5 to 15, across disciplines of Data and Product.
I was further responsible for the continual growth of the team, maintaining stakeholder communication, driving Data Science strategy across the organization, and directly working with C-level executives to maintain a vision and communicate strategy and execution plans aligned with business executives.
The Research and Machine Learning team was responsible for identifying and developing key Data Science and Machine Learning technologies for Firstleaf. We utilized cutting edge approaches to empower both internal company functions and customer-facing products. The team was also responsible for the design and implementation of the patented algorithms that drove the Firstleaf experience (we developed the technologies, then co-wrote and secured the patents).
The team was responsible for initiating, driving, and executing on data science strategies across Marketing, Finance, Business Intelligence, and Wine Making functions – the Research and Machine Learning team was a full spectrum B2B and B2C solution within Firstleaf.
5 patents (3 awarded, 2 in final review), multiple interviews, blog posts, and department awards providing high visibility to intellectual property and achievements of team.
Previously
POSTDOCTORAL RESEARCH FELLOW (Biological Machine Learning and AI) - University of North Carolina, Lineberger Cancer Center
American Heart Association funded research fellow in AI-driven precision medicine
Identifying, modeling, and understanding noise in single cell signaling during stress responses focusing on utilization of live cell imaging and machine learning on big data
Built and ran Data Science and Data Engineering capabilities in the areas of computer vision, natural language processing, artificial intelligence, infrastructure, high performance computing, predictive modeling, and mathematical modeling
Built and maintained live cell imaging infrastructure, developed a suite of machine learning algorithms to understand noise in single cell signaling, wrote and secured two fellowships, directly contributed to high impact publications, was invited to speaking engagements, and mentored several graduate students and postdoctoral fellows.
Extensive experience working with time series signal data, from pipeline engineering and early data processing to complex machine learning algorithm development and implementation for decision-making and novel hypothesis generation
Expertise
Predictive Modeling | Medicinal AI | Machine Learning | Strategy and Vision Building | Resource Allocation Logistics | Algorithm Development | Technical Writing | Leadership | Mentorship | Team Building | Cross-departmental Collaboration | Data Science | Data Engineering | Artificial Intelligence | Generative AI | Python/Software Engineering | Cell Biology | Biophysics | Microscopy | Cancer Therapeutics | E-commerce | Manuscript and Grant Preparation | Patent Development | Bioinformatics | Microbiology | Digital Twins | Biotechnology | Precision Medicine | Multimodal Modeling | Fine-Tuning | AI Trust and Safety | Digital Product Management | AI Validation | Testing and Experimentation | B2B and B2C | Data and Analytics | Data Governance | Regulations and Compliance
LANGUAGES, TOOLING, INFRASTRUCTURE
Python | Amazon Web Services (AWS) | Pandas | NumPy | Scikit-learn | Matplotlib | Plotly | Tableau | SciPy | SQL | PyTorch | SymPy | Flask | Gunicorn | FastAPI | Bash | Linux | Git | GitHub | Jupyter | HTML | JavaScript | Microsoft Office | Google Suite | MATLAB | Full Stack Dev | APIs | Agile | Jira | Time Series Modeling | Simulations | A/B Testing | Multi-armed Bandits | Causal Inference | MLOps | DevOps | Generative AI | LLMs | NLP (OpenAI, Hugging Face, Vertex) | MultiModal | GenAI | Fine Tuning | Knowledge Graphs | Computer Vision | Snowflake | Databases | DBT | Cloud Computing and Infrastructures (multiple)
Patents
Systems and methods for labeling and distributing products having multiple versions with recipient version correlation on a per user basis
Method, system, and computer readable medium for labeling and distributing products having multiple versions with recipient version correlation on a per user basis
Systems and methods for controlling production and distribution of consumable items based on their chemical profiles
Using FI-RT to build wine classification models
Using FI-RT to generate wine shopping and dining recommendations
Selected Press
Selected Publications
Laura K. Potter, Matthew K. Martz*, Douglas Lawton*. *These authors contributed equally to the work. Ground Truthed Models to Inform Tangible Guides of Global Microbial Diversity Using Deep Neural Network Computer Vision. In Preparation.
Yong Jun Goh*, Brody J. DeYoung, Nicholas C. Dove, Brant R. Johnson, Matthew K. Martz, Patrick Videau. AgBiome: Harnessing the Microbial World for Human Benefit. Trends in Biotechnology. 2023.
McCarter PC, Vered L, Martz MK, Errede BE, Dohlman HG, Elston TC. Temporal separation of opposing MAPK feedback loops leads to robust stress adaptation. In preparation.
Ramona Schrage, …, Matthew Martz, …, Evi Kostenis. The experimental power of FR900359 to study Gq-regulated biological processes. Nature Communications 6, Article number: 10156. 14 December 2015.
Michelle C Helms, Elda Grabocka, Matthew K Martz, Christopher C Fischer, Nobuchika Suzuki, Philip B Wedegaertner. Mitotic-dependent phosphorylation of leukemia-associated RhoGEF (LARG) by Cdk1. Cellular Signalling, Volume 28, Issue 1, January 2016, Pages 43-52.
Martz MK, Grabocka E, Beeharry N, Yen TJ, Wedegaertner PW. Leukemia-Associated RhoGEF (LARG) is a Novel RhoGEF in Cytokinesis and Required for the Proper Completion of Abscission. Mol. Biol. Cell September 15, 2013 vol. 24 no. 18 2785-2794.
Matthew Martz and Philip Wedegaertner: Faculty of 1000 Biology, 23 Jul 2010 F1000Prime.com/4242964#eval4039063
Carkaci-Salli N, Flanagan JM, Martz MK, Salli U, Walther DJ, Bader M, Vrana KE. Functional domains of human tryptophan hydroxylase 2 (hTPH2). J Biol Chem. 2006 Sep 22;281(38):28105-12. Epub 2006 Jul 24.
Here is a list of most recent posts:
09 November - Summer of AI - An AgBiome Perspective
This is an interview that served as the starting point for a podcast wherein we discussed Artificial Intelligence with an AgBiome perspective. The podcast went beyond what is below and looked at the broader societal perspective; I will link the podcast shortly.
17 April - Notes on MLOps - One
This is a short piece I wrote while at Firstleaf as a response to a really great article on the state of MLOps. I used several strong points in the article to articulate my thoughts on where we did things well and directions I would like to see us take.
13 November - Python Generators and Comprehension
Digging into generators and comprehension - from basics to implementation in a comprehensive tutorial. This is a walkthrough for beginners that will build up to real world examples.
13 November - Dictionary Lookup - Exploring the Depths
Exploring methods of performant Python dictionary lookups