Machine learning for rare disease clinics
To identify and treat/cure rare diseases
Our solution uses algorithms that help clinicians identify patients with rare diseases, who are then counselled about cutting-edge clinical trials, e.g. CRISPR, which could lead to a cure.
We mostly serve patients with familial hypercholesterolaemia and amyloidosis, but our services more broadly provide care to an impoverished, disadvantaged community.
We are already doing it!
- Optimize holistic care for people with rare diseases—including physical, mental, social, and legal support
- Mitigate barriers to accessing medical care after diagnosis which disproportionately affect disinvested communities and historically underrepresented identity groups
- Promote community and connection among rare disease patients and their advocates
- Pilot
We have no sustainable revenue stream to keep us going, and we lack a permanent data scientist to manage our databases.
The term machine learning describes a range of pattern recognition tools that hold considerable promise in health care diagnostics, prognostics and therapeutics. Machine learning may have its greatest impact in areas that have previously received limited attention, such as the identification of rare diseases and addressing inequities in health. In parallel, the pharmaceutical industry's drug development pipelines have steered towards rare disorders, guided by new knowledge gained from genomic studies. Just as genome-wide association, sequencing and proteomics studies have informed the biopharmaceutical industry of the causal mediators of disease (1-3), phenome-wide association studies and data mining interrogations of health databases are demonstrating the ability to identify both the important features of disease states and markers of therapeutic efficacy and toxicity (4, 5). However, one of the barriers to the realisation of rare disorder therapeutics is the poor identification of rare disease in clinical practice (6). Time pressure, information overload and a tendency to apply a one-size-fits-all approach to practice mean that rare diseases frequently go undiagnosed (7). Cardiovascular diseases where this has the greatest impact include familial hypercholesterolaemia and heart failure with preserved ejection fraction (HFpEF), of which cardiac amyloidosis is a subset (8-10). Whilst new technologies such as nuclear scintigraphy (pyrophosphate and DPD scans) have improved access to diagnostic testing, both cost barriers and prior diagnostic assumptions need to be overcome to drive testing (11).
Machine learning has been applied to a number of imaging and diagnostic modalities, including retinal photography (12), electrocardiography (13, 14) and echocardiography (15), to enable cardiovascular disease prediction. Deep learning, a form of machine learning, applied to echocardiography cine images has been shown to be highly accurate at identifying cardiac amyloidosis (15). Integrating ECG and echocardiography artificial intelligence models provides an even more accurate prediction of the presence of cardiac amyloidosis (16). Although laboratory testing is one of the most ubiquitous, high-volume and low-cost forms of diagnostic information, the literature on machine learning applied to laboratory data has only recently begun to emerge (17). One of the most highly evolved tools available, based on machine learning applied to full blood count testing, is the ColonFlag Test, designed to flag the presence of colorectal cancer (18). This predictive test has gone from a research discovery to a clinically implemented electronic decision support tool (19). In heart failure, several machine learning studies have been performed using data extracted from electronic health care records (20-22). Many of these have demonstrated the value of laboratory data for accurately predicting not only the presence of heart failure but also prognosis (22, 23). Similarly, machine learning has been applied to heart failure with preserved ejection fraction to identify its subtypes (24). This work has demonstrated not only the importance of certain diagnostic modalities but also the necessity of including a wide breadth of multimodal data.
Cardiac amyloidosis has been identified with a high degree of accuracy using machine learning applied to medical claims data (25). Whilst intriguing, this method depends on adequate identification of ICD-10 case data, completeness of clinical records and clinician behaviour, which may not be translatable to other clinical systems or other countries. Whilst the potential for this type of bias is considerable, particularly with black-box, uninterpretable methods such as deep learning, there are several methods that provide transparency and explainability with equivalent levels of accuracy. Transferable machine learning models are preferably generated from data that come from a similar source, are complete and are resilient to clinician biases. Haematology and, to a lesser degree, biochemistry data fulfil these requirements and have been used in a machine learning project to identify heart failure due to cardiac amyloidosis (26). Although this particular model was highly dependent on cholinesterase, a biomarker not frequently used in general care, the approach showed merit. We and others have similarly shown that machine learning applied to haematology data can predict the presence of heart failure (27, 28). Whilst this was not specifically a method for identifying HFpEF, cardiac amyloidosis or familial hypercholesterolaemia, the same method would be expected to have merit when integrating additional laboratory and clinical data, e.g. longitudinal lipids, ECG and imaging data. Such an approach would benefit primarily from the more than 10 years of longitudinal electronic health care records and prescribing data available in New Zealand. With respect to heart failure therapeutics such as SGLT2 inhibitors, many of the features used in the machine learning prediction are biomarkers of treatment efficacy (29, 30).
We propose funding a data scientist position within a public hospital, both to build on this preliminary machine learning pipeline and to implement tools that identify patients in real time, who would then receive appropriate care and be counselled regarding participation in clinical trials.
References
1. Cohen JC, Boerwinkle E, Mosley TH, Hobbs HH. Sequence Variations in PCSK9, Low LDL, and Protection against Coronary Heart Disease. New England Journal of Medicine. 2006;354(12):1264-72.
2. Dewey FE, Gusarova V, Dunbar RL, O’Dushlaine C, Schurmann C, Gottesman O, et al. Genetic and Pharmacologic Inactivation of ANGPTL3 and Cardiovascular Disease. New England Journal of Medicine. 2017;377(3):211-21.
3. Bulawa CE, Connelly S, DeVit M, Wang L, Weigel C, Fleming JA, et al. Tafamidis, a potent and selective transthyretin kinetic stabilizer that inhibits the amyloid cascade. Proceedings of the National Academy of Sciences. 2012;109(24):9629-34.
4. Diogo D, Tian C, Franklin CS, Alanne-Kinnunen M, March M, Spencer CCA, et al. Phenome-wide association studies across large population cohorts support drug target validation. Nature Communications. 2018;9(1):4285.
5. Robinson JR, Denny JC, Roden DM, Van Driest SL. Genome-wide and Phenome-wide Approaches to Understand Variable Drug Actions in Electronic Health Records. Clin Transl Sci. 2018;11(2):112-22.
6. Cortese A, Vegezzi E, Lozza A, Alfonsi E, Montini A, Moglia A, et al. Diagnostic challenges in hereditary transthyretin amyloidosis with polyneuropathy: avoiding misdiagnosis of a treatable hereditary neuropathy. Journal of Neurology, Neurosurgery & Psychiatry. 2017;88(5):457-8.
7. Nativi-Nicolau JN, Karam C, Khella S, Maurer MS. Screening for ATTR amyloidosis in the clinic: overlapping disorders, misdiagnosis, and multiorgan awareness. Heart Failure Reviews. 2021.
8. Deaton C, Edwards D, Malyon A, S Zaman MJ. The tip of the iceberg: finding patients with heart failure with preserved ejection fraction in primary care. An observational study. BJGP Open. 2018;2(3):bjgpopen18X101606.
9. Banda JM, Sarraju A, Abbasi F, Parizo J, Pariani M, Ison H, et al. Finding missed cases of familial hypercholesterolemia in health systems using machine learning. npj Digital Medicine. 2019;2(1):23.
10. Lachmann HJ, Booth DR, Booth SE, Bybee A, Gilbertson JA, Gillmore JD, et al. Misdiagnosis of Hereditary Amyloidosis as AL (Primary) Amyloidosis. New England Journal of Medicine. 2002;346(23):1786-91.
11. Rapezzi C, Quarta CC, Guidalotti PL, Pettinato C, Fanti S, Leone O, et al. Role of 99mTc-DPD Scintigraphy in Diagnosis and Prognosis of Hereditary Transthyretin-Related Cardiac Amyloidosis. JACC: Cardiovascular Imaging. 2011;4(6):659-70.
12. Yim J, Chopra R, Spitz T, Winkens J, Obika A, Kelly C, et al. Predicting conversion to wet age-related macular degeneration using deep learning. Nature Medicine. 2020;26(6):892-9.
13. Johnson K, Neilson S, To A, Amir N, Cave A, Scott T, et al. Advanced Electrocardiography Identifies Left Ventricular Systolic Dysfunction in Non-Ischemic Cardiomyopathy and Tracks Serial Change over Time. J Cardiovasc Dev Dis. 2015;2(2):93-107.
14. Gladding PA, Loader S, Smith K, Zarate E, Green S, Villas-Boas S, et al. Multiomics, virtual reality and artificial intelligence in heart failure. Future cardiology. 2021.
15. Zhang J, Gajjala S, Agrawal P, Tison GH, Hallock LA, Beussink-Nelson L, et al. Fully Automated Echocardiogram Interpretation in Clinical Practice. Circulation. 2018;138(16):1623-35.
16. Goto S, Mahara K, Beussink-Nelson L, Ikura H, Katsumata Y, Endo J, et al. Artificial Intelligence-Enabled, Fully Automated Detection of Cardiac Amyloidosis Using Electrocardiograms and Echocardiograms. medRxiv. 2020:2020.07.02.20141028.
17. Ronzio L, Cabitza F, Barbaro A, Banfi G. Has the Flood Entered the Basement? A Systematic Literature Review about Machine Learning in Laboratory Medicine. Diagnostics. 2021;11(2):372.
18. Kinar Y, Kalkstein N, Akiva P, Levin B, Half EE, Goldshtein I, et al. Development and validation of a predictive model for detection of colorectal cancer in primary care by analysis of complete blood counts: a binational retrospective study. Journal of the American Medical Informatics Association : JAMIA. 2016;23(5):879-90.
19. Goshen R, Choman E, Ran A, Muller E, Kariv R, Chodick G, et al. Computer-Assisted Flagging of Individuals at High Risk of Colorectal Cancer in a Large Health Maintenance Organization Using the ColonFlag Test. JCO clinical cancer informatics. 2018;2:1-8.
20. Adler ED, Voors AA, Klein L, Macheret F, Braun OO, Urey MA, et al. Improving risk prediction in heart failure using machine learning. European journal of heart failure. 2020;22(1):139-47.
21. Choi D-J, Park JJ, Ali T, Lee S. Artificial intelligence for the diagnosis of heart failure. npj Digital Medicine. 2020;3(1):54.
22. Adler ED, Voors AA, Klein L, Macheret F, Braun OO, Urey MA, et al. Improving risk prediction in heart failure using machine learning. European journal of heart failure. 2020;22(1):139-47.
23. Adler E, Greenberg B, Braun O, Macheret F, Campagnari C. MARKER-HF (Machine Learning Assessment of RisK and EaRly mortality in Heart Failure): Development and Validation of a Novel Model that Accurately Identifies High Risk Heart Failure Patients. Journal of Cardiac Failure. 2018;24(8, Supplement):S12-S3.
24. Shah SJ, Katz DH, Selvaraj S, Burke MA, Yancy CW, Gheorghiade M, et al. Phenomapping for novel classification of heart failure with preserved ejection fraction. Circulation. 2015;131(3):269-79.
25. Huda A, Castaño A, Niyogi A, Schumacher J, Stewart M, Bruno M, et al. A machine learning model for identifying patients at risk for wild-type transthyretin amyloid cardiomyopathy. Nature Communications. 2021;12(1):2725.
26. Agibetov A, Seirer B, Dachs T-M, Koschutnik M, Dalos D, Rettl R, et al. Machine Learning Enables Prediction of Cardiac Amyloidosis by Routine Laboratory Parameters: A Proof-of-Concept Study. J Clin Med. 2020;9(5):1334.
27. Truslow JG, Goto S, Homilius M, Mow C, Higgins JM, MacRae CA, et al. Scalable cardiovascular risk assessment using artificial intelligence-enabled event adjudication and widely available hematologic predictors. medRxiv. 2021:2021.01.12.21249662.
28. Gladding PA, Ayar Z, Smith K, Patel P, Pearce J, Puwakdandawa S, et al. A machine learning PROGRAM to identify COVID-19 and other diseases from hematology data. Future Science OA. 2021:FSO733.
29. Lawler PR, Liu H, Frankfurter C, Lovblom LE, Lytvyn Y, Burger D, et al. Changes in Cardiovascular Biomarkers Associated With the Sodium–Glucose Cotransporter 2 (SGLT2) Inhibitor Ertugliflozin in Patients With Chronic Kidney Disease and Type 2 Diabetes. Diabetes Care. 2021;44(3):e45-e7.
30. Packer M. Critical examination of mechanisms underlying the reduction in heart failure events with SGLT2 inhibitors: identification of a molecular link between their actions to stimulate erythrocytosis and to alleviate cellular stress. Cardiovascular research. 2021;117(1):74-84.
31. Besseling J, Reitsma JB, Gaudet D, Brisson D, Kastelein JJ, Hovingh GK, et al. Selection of individuals for genetic testing for familial hypercholesterolaemia: development and external validation of a prediction model for the presence of a mutation causing familial hypercholesterolaemia. European heart journal. 2017;38(8):565-73.
We aim to develop machine learning models that can identify rare diseases and then implement them clinically. We have already done this with an algorithm that identifies patients with familial hypercholesterolaemia from electronic clinical records. Patients are then referred to a specialised clinic for this rare disorder and enrolled into an siRNA or CRISPR-Cas9 trial. We intend to apply the same methods to amyloidosis, for which we are also involved in CRISPR-Cas9 trials.
We are monitoring accuracy of our predictive algorithms through the use of audit, and the algorithm outputs are constantly monitored by a nurse specialist. We have audited our clinic processes and shown both increased efficiency and cost savings by implementing our new model of care.
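As a minimal sketch of what such an audit could report, the Python below computes standard screening metrics from review counts. The `AuditCounts` structure and its field names are hypothetical, introduced only for illustration; they do not describe our actual audit tooling, and no real audit figures are shown.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class AuditCounts:
    """Counts from a review of algorithm output (hypothetical structure)."""
    true_positives: int   # flagged by the algorithm, confirmed on review
    false_positives: int  # flagged by the algorithm, not confirmed on review
    false_negatives: int  # not flagged, but later found to have the disease
    true_negatives: int   # not flagged, disease not found


def safe_ratio(numerator: int, denominator: int) -> Optional[float]:
    """Return numerator / denominator, or None when the denominator is zero."""
    return numerator / denominator if denominator else None


def audit_metrics(c: AuditCounts) -> Dict[str, Optional[float]]:
    """Compute sensitivity, specificity, PPV and NPV from audit counts."""
    return {
        "sensitivity": safe_ratio(c.true_positives,
                                  c.true_positives + c.false_negatives),
        "specificity": safe_ratio(c.true_negatives,
                                  c.true_negatives + c.false_positives),
        "ppv": safe_ratio(c.true_positives,
                          c.true_positives + c.false_positives),
        "npv": safe_ratio(c.true_negatives,
                          c.true_negatives + c.false_negatives),
    }
```

Tracking the positive predictive value over successive audit periods is one simple way to notice whether an algorithm's flags are becoming less reliable; a metric is returned as `None` when an audit period contains no cases in its denominator.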
We use an IMLA (Implement, Monitor, Learn, Adapt) approach, which is similar to Agile change management, to implement change. We do this in a well-circumscribed, protected environment where discourse and commentary are welcomed and listened to. This environment is also accepted as a sandbox in which new ideas can be trialled and stopped if considered of no value.
We use a range of machine learning and AI methods, including logistic regression, decision trees and XGBoost applied to numeric data, natural language processing (NLP) applied to textual data, and machine vision applied to medical imaging.
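To illustrate the simplest of these methods, the sketch below fits a logistic regression to numeric features using plain gradient descent in standard-library Python. It is an illustration under stated assumptions, not our production pipeline: the function names are hypothetical, features are assumed to be standardised beforehand, and in practice an established library would normally be used instead.

```python
import math
from typing import List, Sequence, Tuple


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(row: Sequence[float], weights: Sequence[float],
                  bias: float) -> float:
    """Predicted probability of the positive class for one feature row."""
    return sigmoid(bias + sum(w * v for w, v in zip(weights, row)))


def fit_logistic(X: Sequence[Sequence[float]], y: Sequence[int],
                 lr: float = 0.1, epochs: int = 2000
                 ) -> Tuple[List[float], float]:
    """Fit weights and bias by batch gradient descent on the log-loss."""
    n, n_features = len(X), len(X[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for row, label in zip(X, y):
            # Gradient of the log-loss w.r.t. the linear predictor
            err = predict_proba(row, weights, bias) - label
            for j, v in enumerate(row):
                grad_w[j] += err * v
            grad_b += err
        # Average the gradients over the batch and take one step
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias
```

Each row of `X` would hold one patient's standardised numeric values (for example laboratory results) and `y` the confirmed diagnosis as 0 or 1. The fitted weights are directly interpretable, which is one reason logistic regression is a useful transparent baseline alongside tree-based methods.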
See here:
- A new application of an existing technology
- Artificial Intelligence / Machine Learning
- Big Data
- 9. Industry, Innovation, and Infrastructure
- 10. Reduced Inequalities
- New Zealand
- New Zealand
- Hybrid of for-profit and nonprofit
We welcome diversity, equity and inclusiveness. For this reason we have set up our clinic services in one of the poorest regions of Auckland, New Zealand, and have substantive input from our indigenous population and other ethnic groups. We have run patient focus groups and hui for rare diseases, e.g. familial hypercholesterolaemia, discussing the ethical and societal implications of such technology. These have included patients in the LGBT community who suffer from rare diseases.
Our clinics are run through a public health care system; however, we welcome sponsors, e.g. the Eko digital stethoscope company, with whom we share data, which generates revenue. We have numerous other potential partners and biopharma sponsors who are interested in running enrichment trials using our services. We have formed two spinout companies, which have attracted external investment. Ultimately this all benefits the patients, who pay no fees to attend our clinic.
- Individual consumers or stakeholders (B2C)
We hope to attract sufficient grants, government contracts, sponsorship and revenue from services to become self-sufficient.
We have received grants from Edwards Lifesciences ($60,000), Pfizer ($50,000) and Verve Therapeutics ($140,000), and have received revenue from the Eko digital stethoscope company.