The Key Takeaway
Minnetronix’s DepiCT is an example of how powerful deep learning algorithms can be developed for medical applications while mitigating the cost of input data and optimizing for overall program schedule and budget efficiency.
In the case of DepiCT, our robust data science selection and development processes resulted in:
- Improved accuracy by incorporating physiologic knowledge into the midline shift algorithm, leading to a 75% reduction in the mean squared error of midline shift predictions.
- Saved cost and timeline by considering the application circumstances, deviating from our standard process of device and algorithm integration, and mitigating the resulting risk through a student-generated-data approach that prioritized data quantity and cost.
- Minimized development risks by choosing a deep learning approach early, minimizing the risks of this approach, and partnering with clinical experts to verify the algorithm was producing qualitatively acceptable results.
The Importance of Data Science
Today, data science is becoming a critical part of any medical device development program because of the algorithmic power it can unlock for a defined clinical application. In particular, artificial intelligence (AI) is becoming more widely adopted. But because the medical industry has strict requirements and regulations around data science are still evolving, it can be challenging for companies to navigate the biggest barriers. Specifically, the cost of change and the cost of input data are two areas that, when not addressed correctly, could drive a development team over schedule and over budget.
To address the cost of change, companies must carefully plan for the future at the outset. From a data science perspective, this means selecting the optimal data science approach for the device based on both the current design and future plans in the product roadmap. The data science approach in this case study, neural networks (a.k.a. deep learning), is a great choice when a company has a specific use case in mind, already possesses or can gather large amounts of data, and needs a near-human level of performance from its algorithm.
To address the cost of input data, companies must find ways to do more with less. From a data science perspective, this means following a rigorous algorithm development process that incorporates the clinical context, knowledge of clinical applications, hardware design implications, and regulatory strategy to optimize the amount of insight that can be extracted from costly and often scarce input data. When developing deep learning algorithms in particular, this area is even more pivotal to the success or failure of a program due to the high volume of input data these applications require.
Minnetronix offers full-scale data science capabilities as a part of our core services, and our team has built highly effective deep learning models in the past. We are a single partner that excels at the intersection of medical devices and data science, meaning that our clients don’t need to manage the algorithm development themselves or seek additional support from an outside data science vendor. When partnering with Minnetronix, our data science processes are built into the core medical device development program and handled by our team of data scientists. Instead of running siloed workstreams that merge at the end, this integrated approach ensures better tracking to the overall schedule and budget targets of the program.
Every project has different data science needs. One prevalent request comes from customers who recognize the potential of AI and neural networks to add value to their business but may not be aware of the development practices needed to realize that value, especially in the highly regulated medical device space. To better illustrate this situation and our proficiency in data science, the development process behind the DepiCT CT classification software serves as an excellent example.
“Specifically, the cost of change and the cost of input data are two areas that, when not addressed correctly, could drive a development team over schedule and over budget.”
Demonstrating Data Excellence with DepiCT
DepiCT is a deep learning capability developed by Minnetronix’s data science team that segments CT scans of patients who have had an intracerebral hemorrhage. The software is designed to automate the characterization of blood, swelling, and CSF (cerebrospinal fluid) volumes in the brain over time, as well as to predict the midline shift of each scan. This information may inform physicians in their clinical workflow when evaluating post-hemorrhage patients.
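As a rough illustration of the volume-characterization step, the sketch below converts a voxel-wise segmentation mask into milliliter volumes. The label values, voxel spacing, and function names here are assumptions for illustration, not details of the DepiCT implementation.

```python
import numpy as np

# Hypothetical label values for a voxel-wise segmentation mask.
BLOOD, SWELLING, CSF = 1, 2, 3

def tissue_volumes_ml(mask, voxel_spacing_mm=(0.5, 0.5, 5.0)):
    """Convert voxel counts in a labeled CT mask to volumes in milliliters.

    mask: 3-D integer array where each voxel holds a tissue label.
    voxel_spacing_mm: physical size of one voxel (row, col, slice) in mm.
    """
    voxel_ml = np.prod(voxel_spacing_mm) / 1000.0  # 1 mL = 1000 mm^3
    return {
        name: float(np.count_nonzero(mask == label) * voxel_ml)
        for name, label in [("blood", BLOOD), ("swelling", SWELLING), ("csf", CSF)]
    }

# Toy example: a tiny 4x4x2 mask containing 8 blood voxels.
toy = np.zeros((4, 4, 2), dtype=int)
toy[:2, :2, :] = BLOOD  # 2 * 2 * 2 = 8 voxels
vols = tissue_volumes_ml(toy)  # 8 voxels * 0.00125 mL/voxel = 0.01 mL of blood
```

In practice the voxel spacing would come from the scan metadata rather than a default argument, since CT slice thickness varies between scanners and protocols.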
The general strategy for this development effort was to collect large amounts of CT data from post-intracerebral hemorrhage patients and build an algorithm that could characterize this critical information. After the algorithm was finished, we then aimed to create a user interface for clinicians before validating and commercializing the algorithm.
“DepiCT is an algorithm using large amounts of CT data from post-intracerebral hemorrhage patients to characterize critical information.”
Selecting the Right Approach
One critical aspect of any medical algorithm is selecting the data science approach that minimizes the high potential cost of change. The choice may seem trivial, but each approach introduces different trade-offs, both immediately and in later development stages.
Minnetronix has developed a framework to aid in the selection process and provide an approach to data that minimizes future risk. We generally consider four approaches to data science. Each provides benefits and drawbacks, according to six competing characteristics.
To illustrate these characteristics and how they play into the selected approach to data science, we will consider the decisions made during the DepiCT project, for which some critical factors included clinical application, ease or difficulty of deployment, regulatory hurdles, and business or market constraints.
After strategically analyzing the data science and commercialization tradeoffs based on our model, we determined that the AI-Neural Network approach was optimal. This approach would maximize performance levels and leverage the large dataset, but came with some cost and timeline risks. These risks were reduced with the knowledge that segmentation of CT images is a well-trodden application of neural networks.
Assessing the Needs for DepiCT
We chose the proper approach based on six competing variables and tweaked it to fit our desired goal.
Input Data Efficiency
The choice of approach relies heavily on the amount of data available. In this case, a large dataset of annotated CT scans was collected and available for algorithm development. The dataset was large enough to meet the input data needs of any data science approach.
Explainability
The DepiCT algorithm simply needed to label the amount of blood, swelling, and CSF in the brain. Any additional knowledge or insight into how it makes conclusions was not critical for the project.
Computation Speed
The algorithm’s performance was more important than its speed. It was assumed that the algorithm would be deployed on a machine with high computational resources. As a result, algorithm performance was prioritized over computation speed.
Cost and Timeline
The cost and timeline constraints for this project were very tight. There were significant cost constraints relative to the scope of the project, and the timeline was aligned with the release of another product.
Capability Ceiling
The algorithm needed to segment blood, swelling, and CSF, and predict MLS at a performance level comparable to that of human radiologists. As such, the performance requirement was very high.
Scalability
There was opportunity for this product to grow into new markets post-commercialization. Beyond intracerebral hemorrhage patients, potential new markets included other types of hemorrhaging or hydrocephalus. Therefore, this step in the Minnetronix process was given some weight but not considered strictly necessary.
Minnetronix’s Proven Process
A standardized algorithm development process is vital to a program’s success. This importance is amplified for medical applications because of the high cost of input data. Regardless of which approach to data science is selected, every project follows Minnetronix’s trusted seven-step process. For the DepiCT project, our process helped identify gaps and maintain focus while developing the algorithm. In particular, some steps proved most critical to the success of DepiCT.

The DepiCT team valued speed and minimized expenses over our own effort level. That is, we wanted to work quickly and keep external costs low, and we were willing to work harder internally to make that happen. This philosophy greatly influenced the decisions we made during the process.

Throughout the Plan and Understand phase, we used a global network of physician connections. This network allowed Minnetronix to understand physicians’ needs for DepiCT in a clinical setting. Additionally, it allowed us to build a team of domain experts who understood the value of the product and could generate ground truth segmentations in the Collect phase.
We also developed a deep understanding of potential risks. One such risk was the lack of control Minnetronix had over the medical device itself. Normally, Minnetronix’s data science capabilities are integrated with our device development, and we can design the device’s outputs to be optimal inputs for the algorithm. Minnetronix, however, did not design any of the CT scanners used for DepiCT. To minimize this risk, we gathered a much larger dataset than we would normally consider. This was feasible because:
- Different CT scanners produce broadly similar images, and many successful algorithms have already been shown to work with data from multiple scanners.
- CT scans are already commonly ordered and available in the standard of care for our target population.
When we compared a model trained on our large multi-scanner dataset against a model trained on a smaller dataset of scans from a single scanner, the model trained on the larger dataset had 40% less volumetric error. This validated our decision to focus on data quantity in this instance.
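The source does not define the volumetric error metric used in that comparison. One plausible definition is the absolute volume difference relative to ground truth, sketched below with hypothetical names; because voxel spacing cancels in the ratio, raw voxel counts suffice.

```python
import numpy as np

def volumetric_error(pred_mask, true_mask, label):
    """Relative volumetric error for one tissue label:
    |V_pred - V_true| / V_true.

    Voxel spacing cancels in the ratio, so voxel counts are enough.
    """
    v_pred = np.count_nonzero(pred_mask == label)
    v_true = np.count_nonzero(true_mask == label)
    if v_true == 0:
        raise ValueError("ground truth contains no voxels of this label")
    return abs(v_pred - v_true) / v_true

# Toy example: the prediction finds 90 blood voxels where ground truth has 100.
truth = np.zeros(1000, dtype=int)
truth[:100] = 1
pred = np.zeros(1000, dtype=int)
pred[:90] = 1
err = volumetric_error(pred, truth, label=1)  # 10 / 100 = 0.10
```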
Overall, this phase created a better understanding of what capabilities the final product would need, and which features we needed to prioritize during development. The phase also gave Minnetronix detailed inputs for cost and timeline estimates, which decreased the uncertainty and risk of those estimates.
The Collect phase for DepiCT was two-pronged. One side was the collection of de-identified CT scans from nine hospitals. To get scans quickly and inexpensively, Minnetronix negotiated separate business agreements with each hospital. This required working through the different bureaucracies at clinical sites, identifying the appropriate stakeholders within each site, and remaining flexible about how the data was delivered, provided it complied with the clinical and technical requirements.
On the other side was the collection of the ground truth segmentations generated by radiologists, who used a custom segmentation tool to view the scans and place their labels directly on the scans themselves.
As mentioned above, we developed a custom segmentation tool. By using custom software, Minnetronix reduced the time needed to complete segmentations and had greater control over the creation of our ground truth dataset.
To further save time and reduce cost, radiologists worked with trained engineering and medical students. Radiologists provided initial ground truth segmentations, which were then assigned to students for cleaning before quality-check sign-offs. This system produced higher-quality scan labels at lower cost by playing to the strengths of the more pragmatic radiologists and the detail-oriented medical and engineering students.
After a small number of scans were collected and segmented, we developed a proof-of-concept algorithm. Prediction analysis of this proof-of-concept showed the algorithm’s strengths and weaknesses. The strengths were enough evidence to move forward with development, while the weaknesses helped clarify what additional work was needed to achieve the necessary performance in a timely and cost-sensitive manner.
After the proof-of-concept model, we trained an updated model on a larger set of scans. The performance of this model was then tested on a set of scans the algorithm had never seen. Physicians received prediction results and gave feedback on the algorithm’s qualitative performance. We evaluated quantitative performance against the ground truth segmentations from the Collect and Organize and Clean phases.
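The document does not name the quantitative metric used against the ground truth segmentations. For segmentation tasks, the Dice overlap score is a common choice; the sketch below is an assumed example, not a description of the DepiCT evaluation pipeline.

```python
import numpy as np

def dice_score(pred, truth, label):
    """Dice overlap for one label: 2|P ∩ T| / (|P| + |T|), in [0, 1].

    1.0 means the predicted and ground truth regions coincide exactly;
    0.0 means they do not overlap at all.
    """
    p = np.asarray(pred) == label
    t = np.asarray(truth) == label
    denom = p.sum() + t.sum()
    if denom == 0:
        return 1.0  # both regions empty: perfect agreement by convention
    return float(2.0 * np.logical_and(p, t).sum() / denom)

# Toy 1-D example: 3 predicted voxels, 3 true voxels, 2 overlapping.
pred = np.array([0, 1, 1, 1, 0])
truth = np.array([0, 0, 1, 1, 1])
score = dice_score(pred, truth, label=1)  # 2 * 2 / (3 + 3) ≈ 0.667
```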
During this stage, the midline shift predictions were found to be suboptimal, and the midline shift algorithm was modified to mirror the steps a physician would take when calculating midline shift by hand. By embedding this physiologic knowledge into the algorithm, the mean squared error of the midline shift predictions was reduced by 75%, and the model’s explainability to physicians improved.
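One common manual method draws an "ideal" midline between the anterior and posterior falx attachment points and measures the perpendicular deviation of the septum pellucidum from that line. The sketch below implements that measurement as an assumed illustration of the general technique; it is not Minnetronix’s actual algorithm, and the landmark names are assumptions.

```python
import numpy as np

def midline_shift_mm(anterior_falx, posterior_falx, septum_pellucidum):
    """Perpendicular distance (mm) from the septum pellucidum to the ideal
    midline through the anterior and posterior falx attachment points,
    mimicking the manual radiologist measurement on an axial slice.

    All arguments are (x, y) coordinates in mm.
    """
    a = np.asarray(anterior_falx, dtype=float)
    b = np.asarray(posterior_falx, dtype=float)
    p = np.asarray(septum_pellucidum, dtype=float)
    line = b - a          # direction of the ideal midline
    d = p - a             # vector from the line's anchor to the landmark
    # 2-D cross product magnitude / line length = point-to-line distance.
    cross = line[0] * d[1] - line[1] * d[0]
    return float(abs(cross) / np.linalg.norm(line))

# Toy axial slice: the ideal midline is the vertical line x = 0, and the
# septum pellucidum is displaced 4 mm to one side.
mls = midline_shift_mm((0.0, 100.0), (0.0, -100.0), (4.0, 0.0))  # 4.0 mm
```

Structuring the computation around anatomical landmarks, rather than regressing a shift value end-to-end, is what makes the result easy to explain to a physician: each intermediate quantity corresponds to a step of the manual measurement.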
Validation of the DepiCT algorithm will be performed using an even larger set of scans the algorithm has not seen. The goal of this phase is to demonstrate the algorithm’s ability to generate segmentations at a performance level similar to radiologists and to predict accurate volumes of blood, swelling, and CSF, as well as midline shift distances.
Implementation will entail the completion of a viewing app that will present the volumes and segmentations to physicians. This app will allow physicians to quickly absorb the information provided by the DepiCT algorithm and inform clinical management.
The results from the Validate and Implement phase will be used to seek regulatory approval. Commercialization will require developing infrastructure to run the algorithm and interface with the medical image databases of healthcare providers.
A neural network was chosen from the outset of the project, primarily due to the high Capability Ceiling needed and large dataset available. However, neural networks come with plenty of cost and timeline uncertainty. To reduce the effect of this tradeoff, we were flexible in our data collection, partnered radiologists with students, and regularly measured the quantitative and qualitative performance of the algorithm to stay focused on the final product.
Deciding to use a neural network from the outset of the project allowed for a very clear and organized timeline to be drawn up for each stage of the Minnetronix data science process.
Why Work with Minnetronix?
DepiCT represents a case where planning and data collection occurred early in the development process to reduce cost and timeline uncertainty. Proper planning led to selecting a data science approach that would meet the lofty performance goals. Collecting a large input dataset and controlling its quality allowed Minnetronix to overcome the lack of device design control.
Minnetronix opens the door to data science, alongside our other core services. When you partner with Minnetronix, you’ll know that we can handle all your data needs and incorporate them seamlessly with your timeline, device, and business constraints. We approach data science as a fluid process that can integrate new data when it becomes available, delivering a better overall product at a lower cost. Best of all, this expert approach to data science comes standard with our partnership and translates seamlessly to the rest of device development.
In the case of DepiCT, our robust data science selection and development processes resulted in:
- Improved accuracy by incorporating physiologic knowledge into the midline shift algorithm, leading to a 75% reduction in the mean squared error of midline shift predictions.
- Saved cost and timeline by considering the application circumstances, deviating from our standard process of device and algorithm integration, and mitigating the resulting risk through a student-generated-data approach that prioritized data quantity and cost.
- Minimized development risks by choosing a deep learning approach early, minimizing the risks of this approach, and partnering with clinical experts to verify the algorithm was producing qualitatively acceptable results.
With Minnetronix as a partner during development, you can build a powerful algorithm while still adhering to pragmatic business constraints by utilizing our proven process, knowledge of clinical applications, and in-house data scientists.
Interested in learning more? Click here to contact us!