ThinkMobile: Human-Centered Cognitive Assessment
Cognitive assessment is a crucial foundation for both rare and common neurological disease detection, drug development, efficacy determination, and side-effects monitoring, but the current practice suffers from a number of problems.
- Screening and testing is expensive and/or logistically very difficult, in part due to the need for trained test administrators, and the cost of reaching out to sub-populations that are ethnically, culturally, and geographically diverse.
- Especially in rural communities, there is an undersupply of trained clinicians.
- Current practice, particularly among underserved populations, is too often distorted by bias, frequently unconscious, and hampered by a lack of relevant norms.
- There is insufficient use of existing data. We believe this may arise because of difficulties in accessing the data, varied data collection methods across global populations, language/cultural bias inherent in traditional standardized tests, and data accuracy issues associated with subjective scoring.
- Given the speed with which technology and medical practice improve, even data carefully gathered in previous studies may still not be usable [Uegaki 22].
- Calibrating cognitive state is often error-based, i.e., counting the number of mistakes made. But human behavior is much more nuanced than a simple right/wrong judgment can capture.
Our solution addresses all six of these problems. ThinkMobile is self-contained, highly portable, and has embedded some of the skill that human clinicians use when testing people. This vastly lowers the cost and manpower requirements of screening and testing. ThinkMobile uses the person as their own control, reducing the impact of confounding conditions such as fatigue, which is especially important for the small samples common with rare diseases. ThinkMobile evaluates results using criteria expressed in code that is applied identically from one person to the next and is easily audited for bias. The system collects data using precisely specified processes that are auditable and repeatable. This significantly increases the ease of data reuse by collaborative or future research efforts. Because the behavior captured is preserved indefinitely, it is easy to test novel ideas by retrospectively reanalyzing data long after the study was performed. This reduces the number and cost of new studies needed. Finally, we use a much richer model of assessment, one that examines a broad range of behaviors, including, e.g., how hard the person is working to get their answer, not just the accuracy of the final answer.
The ability to capture behavior facilitates the construction of models for rare diseases. As one example, while Alzheimer's isn’t rare (55 million people worldwide), access to the newest treatments is limited to a very small subset, due in part to the limited availability and cost of cognitive prescreening and monitoring, which are essential because of the high risk of adverse cerebral events (e.g., ARIA). According to Alzheimer's International, much of the growth in the disease will be in low- and middle-income countries: “The fastest growth in the elderly is in China, India, and their south Asian and western Pacific neighbors.” We believe ThinkMobile will enable the development of models that can predict treatment efficacy and calibrate the risk of adverse effects (e.g., ARIA), and help get the right treatments to the right people, safely.
ThinkMobile is a tablet-based assessment platform that enhances efficiency of clinical trials because: It is human centered at multiple levels; it focuses on capturing and analyzing a broad range of behaviors, not just the final answers people give; and it uses AI to embed into the system some of the skills of an experienced clinician.
Human centered. Our platform makes wide use of an insight pioneered in the Clock Drawing Test: present people with the same task under different cognitive loads, then attend to the difference this makes in their behavior. In our Digital Maze Test, for example, the subject is given two mazes to solve. The first presents no choices; just a path from start to end, while the second maze has choices. The person does not know it, but the solution paths are identical, so the physical task is identical but the cognitive load differs. Everyone is slower on the choice maze, but the difference between no choice and choice is significantly larger for those with cognitive impairment. This allows us to use each person as their own control. We are thus human centered down to the level of a single person.
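The own-control contrast described above can be illustrated with a minimal sketch. It assumes that completion time in seconds is the measured quantity; the names `MazeTrial`, `choice_cost`, and `relative_choice_cost` are hypothetical and are not taken from the ThinkMobile code, which analyzes a much wider range of behaviors.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MazeTrial:
    """Completion times for one person on the two mazes (hypothetical structure)."""
    no_choice_seconds: float  # maze with a single path, low cognitive load
    choice_seconds: float     # maze with decision points, same solution path


def choice_cost(trial: MazeTrial) -> float:
    """Extra time attributable to the added cognitive load.

    The solution paths are identical, so the motor demand is the same in both
    mazes and cancels out in the difference.
    """
    return trial.choice_seconds - trial.no_choice_seconds


def relative_choice_cost(trial: MazeTrial) -> float:
    """Choice cost as a fraction of the person's own no-choice baseline."""
    if trial.no_choice_seconds <= 0:
        raise ValueError("baseline time must be positive")
    return choice_cost(trial) / trial.no_choice_seconds


if __name__ == "__main__":
    trial = MazeTrial(no_choice_seconds=20.0, choice_seconds=26.0)
    print(choice_cost(trial), relative_choice_cost(trial))
```

Normalizing by the person's own baseline is one possible design choice: it lets a generally slow but unimpaired person be compared with a generally fast one.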
ThinkMobile gives verbal instructions. These can be in any language or dialect, and can even be recorded by the local clinician, lending familiarity to the testing process. ThinkMobile assessment is customizable as a screener or as a test battery, and is designed to be quick (5–15 minutes), easy, and even enjoyable, promoting engagement and reducing drop-outs.
Embedded Expertise. We embed into our system some of the expertise of an experienced clinician: the system knows, e.g., how to react if the person disobeys the instructions for a test. This can sharply reduce the need for trained administrators of tests.
The ability to deploy remotely (ultimately to a personal device) with automatic, secure data transfer of test data back to the study database enables assessment almost anywhere, sharply reducing site and travel costs, enabling cost-effective large scale pre-screenings.
Capture Behavior. ThinkMobile is distinguished by being a behavior capture system that uses AI and ML techniques to analyze naturally occurring inadvertent interactions. In our maze test, for example, we capture a wide variety of aspects of subject problem-solving behavior (e.g., speed at decision points), not just the final answer (whether they found a path).
Capturing behavior facilitates the creation of customized algorithms, as well as the subsequent re-analysis of behavior, e.g., from rare disease subjects who may have been misdiagnosed by then-current ideas about disease detection.
All of this, combined with the culture/language-free design of our tests, allows the assessment essential for recruiting sufficient patient numbers for rare disease studies. Finally, behavior capture enables us to train algorithms for biomarkers, tests and features for different diseases, both rare and common, potentially impacting a very large number of people. In prior work, for example, digital features from DCTclock™ were correlated to regional distributions of Amyloid and Tau PET imaging markers in unimpaired older adults, demonstrating the ability to create digital Alzheimer’s biomarkers.
We aim at people who are worried about cognitive change or neurological symptoms, but have little or no access to providers with neurological expertise. This includes people for whom the providers are geographically distant, as for example rural populations where travel costs, missed work and logistics present barriers to care. We aim at those whose culture and language reduce the effectiveness of assessments due to the paucity of clinicians who are fluent in that language and familiar with that culture. These barriers disadvantage access to care, making it extraordinarily difficult for those with rare disorders.
We want to make it possible to harmonize studies, to pool resources across diverse countries, cultures and languages. This is especially important in the sporadic populations of those with rare disorders.
We solicit feedback from assessment takers upon completing the session using a set of questions we have found elicit useful feedback.
We also work to understand the needs of all of our target populations by involving them early on in the development of our systems. From them we get insight about what they need and immediate feedback about how well our tools meet their needs.
As we have illustrated with our Digital Clock Drawing Test (winner of multiple awards, including a winner in the MITSolve Brain Health Challenge in 2017) we can train algorithms for relevant biomarkers, a capability that will be especially useful for rare diseases. This will enable cost-effective pre-screening and reduce biomarker costs currently associated with screen failures.
As a member of a large neurology division within a tertiary care teaching hospital, Dana Penney has experience in rare neurological diseases, including those that are misdiagnosed as common neurological disorders, as for example prion disease, progressive supranuclear palsy, and frontotemporal lobe degeneration.
We have ongoing collaboration with MGH in their study of a rural Colombian Kindred cohort that carries a rare genetic mutation that causes early-onset Autosomal Dominant Alzheimer’s Disease (ADAD). This condition results in a near-certain chance of developing memory problems associated with Alzheimer’s by the third decade of life. High incidence of learning disabilities in cohort children has raised some concern for an even earlier disease onset variant. This cohort presents a number of challenges including minimal education, cultural diversity, geographically remote location and a need for lifespan cognitive assessment. A model specific to this cohort is likely to be quite revealing.
We previously partnered with the Diver's Alert Network in a pilot study of the famous Ama pearl divers in Japan, who routinely free dive to considerable depths (40–100 feet) and stay down for up to 7 minutes. There is concern about whether these conditions (pressure and repetitive hypoxia) have cognitive consequences. The Ama divers were reluctant to take traditional western tests, but were accepting of a behavior capture drawing test also in use with the Kindred cohort. Results showed declines in cognitive function even over the course of a single day of repetitive diving similar to cognitive issues seen in vascular impairment. Subsequent research by DAN has led to improved safety guidelines for breath-holding divers.
We have successfully collaborated with small rural groups with diverse cultures and unique backgrounds and needs, using beta versions of our system. These experiences helped to inform the design and evaluation of ThinkMobile.
- Enhance efficiencies in clinical trials and research, including data collection and sharing.
- United States
- Pilot: An organization testing a product, service, or business model with a small number of users
We see several ways in which the Prize will help. Additional funding will of course be useful, but we believe that assistance addressing cultural/international, technical, legal, and market challenges will be equally valuable.
Cultural: It will be valuable to have access to global partners with diverse rare disease interests. This would be a major step toward further testing and developing our ability to develop algorithms specific to diverse rare diseases. The Harvard/Colombia and Ama pearl/DAN connections noted above are to date our only entrée to rare diseases of international scope.
Technical: We would benefit significantly from access to an experienced biostatistician. We do not have enough work to keep someone busy full-time so have never hired one, but ongoing occasional access to an experienced practitioner would be quite valuable.
Markets: In our prior experience as founders of a medical device start-up company, we gained great appreciation for the complexity and expertise necessary for FDA clearance, marketing and distribution on a national scale. Our prior Solve solution (DCTclock) was initially designed as a medical diagnostic tool, but marketing and distribution reasons led to its reconfiguration as a screening tool (a change that was easily accomplished). ThinkMobile is aimed at international distribution - adding considerable complexity for positioning to address the needs of various markets. As academic researchers we clearly do not have the knowledge necessary for effective marketing and distribution, or for the barriers that might be encountered.
Legal: We would welcome legal support to help deal with the issues that may arise from trans-border movement of data, given the patchwork of different regulations across different geographic locations/countries.
In our collaborative research we have been involved with a number of communities around the world with unique experiences and needs that are not met by traditional assessment tools.
As a neurodiagnostician in a metropolitan teaching hospital, it is not uncommon for Dana Penney to encounter rare disorders, as for example PSP, ALS, FTD, prion disease, and posterior cortical atrophy. Few available specialists and limited resources produce costly delays in diagnosis that contribute to patient and family distress and isolation. Delays and misdiagnosis contribute to the use of poorly matched treatments that can make a difference in life quality even for people with diseases for which no cure is available. To worry about the unknowns inherent in rare diseases can be stressful and frightening, particularly when it involves your ability to think and remember. To live with symptoms without a diagnosis - simply waiting for access - can be nearly unbearable.
Even in the New England region, there are considerable barriers to accessing specialty neurocognitive care. In Dr. Penney's practice, for example, wait times for cognitive assessment appointments can be a year or more. Patients need to travel considerable distances over multiple appointments and involve family members who lose time from work, creating geographic, time, and cost barriers. Remote assessment tools are few and insensitive to cognitive change. ThinkMobile is designed to change this.
According to NIH there are approximately "6,000–8,000 rare diseases, with 250–280 new diseases described annually, affecting an estimated 6–8% of the human population". Developing new tools specific for disease detection and treatment monitoring for every new rare disease is impossible - there are just too many. But an intelligent behavior capture system that can be deployed remotely and facilitates the creation of novel algorithms to detect and monitor new rare diseases could have huge impact on many people. ThinkMobile is designed to make this possible.
Dr. Davis has been involved in developing innovative AI approaches to medicine for most of his career. Among other things, he conceived of and implemented a model of explanation that allowed early AI/Medicine programs to both explain their reasoning that led to their diagnosis, and explain why a different diagnosis was not chosen.
In later work he created systems that reasoned from structure and behavior, making it possible to diagnose problems in any system that could be described in those terms. Most recently he has been the technical end of the development of ThinkMobile and all of its predecessor systems.
As noted earlier, we focus on all the behaviors people exhibit during assessment. This is inspired by the observation of how much a clinician can learn about a person simply from observing them get up from the waiting room and come into the office. We aim to capture a small but revealing part of that by focusing on, capturing, and analyzing subject behaviors during assessment.
It is clearly sustainable in the sense that it reduces many of the costs of assessment, increasing the possibility of reaching under-served communities.
Our platform will reduce the time and effort needed for clinical studies because it embeds knowledge about the task, reducing the manpower needed. It will make assessment more informative because it attends to many of the subject’s behaviors.
Reduced manpower will in turn reduce the startup cost barrier of clinical studies, enabling attention to rarer diseases, i.e., those that affect smaller groups of people.
While we do not yet have formal evidence for this, Dr. Penney’s experience in her clinic and both of our experience with the development, FDA approval, and international distribution and use of DCTclock™ strongly indicate its plausibility.
Our platform is motivated by the extensive information clinicians gather from all of a participant’s behaviors, not just those scored by the test. We view all participant-platform interactions as informative and have designed our platform to capture a wide variety of inadvertent and natural behaviors.
At the most basic level, the system relies on the ability to track human movement precisely, currently hand motion, measured in terms of stylus position, pressure, angle, and orientation, and eye movement, i.e., gaze position on the testing form. These data are then analyzed at increasing levels of abstraction, leading to a final assessment. For hand motion stylus positions become strokes, strokes become digits, and digits are recognized. For eye motion we have gaze position, fixations, and gaze trails, as well as blinks, pupil size, and patterns of fixations. At the next level of analysis we use machine learning to explore the information available from a broad range of these properties (e.g., pen speed, delays between answers, the coordination of eye and stylus movement). At yet another level we detect common patterns of eye movements that indicate in real-time whether the person is learning as they work through the test.
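The lowest step of this abstraction ladder for hand motion (stylus positions become strokes) can be sketched as follows. This is a minimal illustration under stated assumptions, not the platform's implementation: it assumes each sample carries a timestamp, position, and pressure, and that a stroke is a maximal run of samples with nonzero pressure. `StylusSample`, `segment_strokes`, `mean_speed`, and `inter_stroke_latencies` are hypothetical names.

```python
from dataclasses import dataclass
from math import hypot
from typing import List


@dataclass(frozen=True)
class StylusSample:
    """One stylus reading (hypothetical layout; angle and orientation omitted)."""
    t: float         # seconds since the start of the test
    x: float
    y: float
    pressure: float  # 0.0 means the pen is lifted


def segment_strokes(samples: List[StylusSample]) -> List[List[StylusSample]]:
    """Group consecutive pen-down samples into strokes."""
    strokes: List[List[StylusSample]] = []
    current: List[StylusSample] = []
    for s in samples:
        if s.pressure > 0:
            current.append(s)
        elif current:  # pen lifted: close the stroke in progress
            strokes.append(current)
            current = []
    if current:
        strokes.append(current)
    return strokes


def mean_speed(stroke: List[StylusSample]) -> float:
    """Path length divided by duration; 0.0 for a degenerate stroke."""
    if len(stroke) < 2:
        return 0.0
    distance = sum(hypot(b.x - a.x, b.y - a.y) for a, b in zip(stroke, stroke[1:]))
    duration = stroke[-1].t - stroke[0].t
    return distance / duration if duration > 0 else 0.0


def inter_stroke_latencies(strokes: List[List[StylusSample]]) -> List[float]:
    """Pause between the end of each stroke and the start of the next."""
    return [nxt[0].t - prev[-1].t for prev, nxt in zip(strokes, strokes[1:])]
```

Features such as pen speed and delays between strokes, computed this way, are the kind of properties that a later machine learning stage could take as input.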
This last item in particular provides an opportunity to explore our hypothesis that impaired learning precedes impaired memory, providing a possible neuropsychological explanation for the ability of these tests to spot impairment sooner than traditional testing approaches.
At the high level, ThinkMobile functions as a partner in the assessment process. It measures cognitive abilities like information processing, executive functions, learning-to-learn and reasoning as subjects interact with our platform. Our test instructions use interactive examples, allowing us to measure aspects of learning-to-learn (how to take the tests). We believe declines in the learning-to-learn process precede memory impairment, and are measurable even when test performance appears normal.
We embed in our platform guidance of the sort exhibited by experienced clinicians, which required understanding what they look for (e.g., subject confusion) and how they respond (e.g., reassure participant, encourage response). We use inference rules that combine pattern detection and signal processing to detect the relevant indicators and then carry out clinician-like interactions.
At a systems/data handling level, the platform is designed around 3 classes of users: Administrators can set a number of basic parameters for the platform and create accounts for Clinicians, who in turn can create accounts for Subjects. Clinicians select tests they want their subjects to take; subjects are then notified by email.
Subjects can take the tests after receiving notification. Their personal information is encrypted using a password set by their clinician; their test data (e.g., stylus and eye tracking information) is both stored in the iPad’s Secure Enclave and transmitted to our Test Data server, where it is encrypted via public key encryption (this permits multiple different sites to upload data with the same (public) key, without compromising security).
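The account hierarchy just described can be sketched as a small permission table. This is an illustrative sketch only; `Role`, `CAN_CREATE`, and `may_create` are hypothetical names, and the actual platform may enforce these rules differently.

```python
from enum import Enum


class Role(Enum):
    ADMINISTRATOR = "administrator"
    CLINICIAN = "clinician"
    SUBJECT = "subject"


# Which account types each role may create, per the description above:
# Administrators create Clinicians, Clinicians create Subjects.
CAN_CREATE = {
    Role.ADMINISTRATOR: {Role.CLINICIAN},
    Role.CLINICIAN: {Role.SUBJECT},
    Role.SUBJECT: set(),
}


def may_create(creator: Role, new_account: Role) -> bool:
    """Return True if `creator` is allowed to create an account of type `new_account`."""
    return new_account in CAN_CREATE[creator]
```

Keeping the rules in a single table makes them easy to audit, in the same spirit as the scoring criteria described earlier.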
- A new technology
Substantial evidence of the value of capturing and analyzing behavior is provided by the demonstrated ability to detect cognitive impairment sooner than current approaches. This has been documented in a number of publications, including:
Learning classification models of cognitive conditions from subtle behaviors in the digital clock drawing test, William Souillard-Mandar, Randall Davis, Cynthia Rudin, Rhoda Au, David J Libon, Rodney Swenson, Catherine C Price, Melissa Lamar, Dana L Penney, Machine Learning 102, 393-441
[Selected as winner of the 2016 Innovative Applications in Analytics Award by INFORMS, The Institute for Operations Research and the Management Sciences.]
Cognitive and connectome properties detectable through individual differences in graphomotor organization, Melissa Lamar, Olusola Ajilore, Alex Leow, Rebecca Charlton, Jamie Cohen, Johnson GadElkarim, Shaolin Yang, Aifeng Zhang, Randall Davis, Dana Penney, David J. Libon, Anand Kumar. Neuropsychologia, vol 85, May 2016, pp 301-309. doi: 10.1016/j.neuropsychologia.2016.03.034.
Clock Drawing Performance Slows for Older Adults After Total Knee Replacement Surgery. Hizel LP, Warner ED, Wiggins ME, Tanner JJ, Parvataneni H, Davis R, Penney DL, Libon DJ, Tighe P, Garvan CW, Price CC. Anesth & Analg, July 2019, Vol 128, Issue 1, pp 212-219.
Total clock drawing and inter-stroke latencies or information revealed between the lines. D. L. Penney, D. Libon, C. Price, M. Lamar, R. Swenson, K. Garrett, R. Davis. (2011). Digital Clock Drawing Test (dCDT) – IV: Abstract presented at the 39th annual meeting of the International Neuropsychological Society. (2011) Boston, MA.
Digitized Clock Drawing Performance And Its Relationship To Amyloid And Tau PET Imaging Markers In Unimpaired Older Adults, Alzheimer's & Dementia: The Journal of the Alzheimer's Association, Volume 14, Issue 7, P236-P237. Papp K.V., Rentz D.M., Burnham S., Orlovsky I., Souillard-Mandar W., Penney D., Davis R., Sperling R.A., Johnson K.A.
Hizel, L., Marion, A., Amini, S., Davis, R., Penney, D., Libon, D.J., Price, C.C. (2018). Post clockface latency as an indicator of executive functioning in Parkinson’s disease. Journal of the International Neuropsychological Society 24(S1), A-325. P. 217, doi:10.1017/S1355617718000528
- Artificial Intelligence / Machine Learning
- Imaging and Sensor Technology
- Software and Mobile Applications
- Not registered as any organization
One of our project leads (Davis) is a long-time MIT faculty member, and as such he has been involved in and participated in the culture described by MIT’s DEI statement, notably “… Solve’s core values of optimism, partnership, open innovation, human-centered solutions, and inclusive technology."
He also helped to define the goals and structure of the new Schwarzman College of Computing, particularly its commitment to studying the social and ethical responsibilities of computing.