Machine learning in scientific fields

This entry was posted on Tuesday, April 24th, 2018.

As machine learning and artificial intelligence (AI) become more sophisticated, the technology finds more practical uses.

Machine learning can advance scientific achievement, especially where medicine and space exploration are concerned. Because it can handle large volumes of data efficiently and automate repetitive, data-driven tasks, machine learning is a logical step forward for bleeding-edge scientific fields.

Machine Learning In Medicine

One might not expect Barbara Engelhardt, a computer scientist at Princeton University, to be working in medicine, but she has been developing statistical tools that seek out expected biological patterns in order to map out the genome’s real but elusive “ground truth.” Techniques in AI and machine learning are dramatically changing the landscape of biological research, but Engelhardt does not think traditional “black box” methods are enough to provide the insights necessary for understanding, diagnosing, and treating disease.

In an interview with Wired, Engelhardt said that traditional machine learning models lack interpretability. While such models can accumulate data and process them in a box, they do not allow people to “open the box” to understand which genes are differentially regulated in particular cell types or which mutations lead to a higher risk of disease. She states in the interview that she “can’t just have something that gives an answer without explaining why.”

The group she works with, the Genotype-Tissue Expression Consortium (GTEx), relies heavily on sparse latent factor models, which partition all the variation observed in the samples with respect to a very small number of features. One of these partitions might include 10 genes or 20 mutations. A scientist can look at those 10 genes or 20 mutations, figure out what they have in common, and determine what the partition represents in terms of the biological signal that affects sample variance.

Engelhardt thinks of the solution as a two-step process. First, build a model that separates all the sources of variation as carefully as possible. Then, go in as a scientist to understand what those partitions represent as a biological signal. The scientists can then validate those conclusions in other data sets and corroborate what else they know about those samples.
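To make the second step concrete, here is a minimal sketch of how a scientist might read the partitions off a model once it has been fitted. It assumes the model has already produced a sparse loadings matrix; the gene names, the numbers, and the helper functions genes_in_factor and summarize are all hypothetical and are not taken from GTEx or from Engelhardt's code.

```python
# Hypothetical loadings from an already-fitted sparse latent factor model.
# Rows are factors (partitions), columns are genes; all values are made up.
GENES = ["geneA", "geneB", "geneC", "geneD", "geneE", "geneF"]
LOADINGS = [
    [0.9, 0.0, 0.7, 0.0, 0.0, 0.0],   # factor 0
    [0.0, 0.0, 0.0, -0.8, 0.6, 0.0],  # factor 1
]


def genes_in_factor(loadings_row, gene_names, tol=1e-8):
    """Return (gene, loading) pairs with nonzero loading, largest magnitude first."""
    active = [(g, w) for g, w in zip(gene_names, loadings_row) if abs(w) > tol]
    return sorted(active, key=lambda pair: abs(pair[1]), reverse=True)


def summarize(loadings, gene_names):
    """Map each factor index to the short gene list a scientist would inspect."""
    return {
        k: [g for g, _ in genes_in_factor(row, gene_names)]
        for k, row in enumerate(loadings)
    }


if __name__ == "__main__":
    for k, genes in summarize(LOADINGS, GENES).items():
        print(f"factor {k}: {genes}")
```

Because most loadings are exactly zero, each factor reduces to a handful of genes, which is what lets a person examine a partition instead of treating the model as a black box.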

Other examples of Engelhardt’s work include a model that determined how mutations relate to the regulation of genes on other chromosomes in 44 human tissues. Among the findings were results that pointed to a potential genetic target for thyroid cancer therapies. She also built a model that makes recommendations to doctors about when to remove their patients from a ventilator and allow them to breathe on their own.

Machine Learning In Space

Thanks to a recent effort by researchers at the University of Toronto Scarborough, the same machine learning technology used in autonomous vehicles is being used to measure the size and location of impact craters on the moon.

Previously, craters on the moon were counted and measured using an “archaic method,” according to Mohamad Ali-Dib, a postdoctoral fellow in the Centre for Planetary Sciences (CPS). Ali-Dib developed the machine learning technology along with Ari Silburt, Chenchong Charles Zhu, and a group of researchers at CPS and the Canadian Institute for Theoretical Astrophysics (CITA).

“Basically, we need to manually look at an image, locate and count the craters and then calculate how large they are based off the size on the image,” Ali-Dib said. Thanks to AI, the team has “developed a technique from artificial intelligence that can automate this entire process that saves significant time and effort.”

Ali-Dib has said that this is the first time that they have an algorithm that can accurately detect craters not only on parts of the moon, but also on some areas of Mercury. To determine accuracy, the researchers first trained the neural network on a large data set taken from satellite elevation maps covering two-thirds of the moon, then tested the trained network on the remaining third. The algorithm worked so well that it was able to identify 6,000 previously unidentified craters on the moon.
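The two-thirds and one-third evaluation described above can be sketched as a simple geographic hold-out. This is only an illustration: the post does not say how the team divided the moon, so splitting by longitude, the 60-degree boundary, the tile records, and the function name split_by_longitude are all assumptions.

```python
def split_by_longitude(tiles, boundary_deg=60.0):
    """Split map tiles into training and test sets by longitude.

    Assumes longitudes lie in [-180, 180). A boundary at 60 degrees puts
    two-thirds of the longitude range in training and one-third in testing.
    """
    train = [t for t in tiles if t["lon"] < boundary_deg]
    test = [t for t in tiles if t["lon"] >= boundary_deg]
    return train, test


# Hypothetical elevation-map tiles, one every 30 degrees of longitude.
TILES = [{"id": i, "lon": lon} for i, lon in enumerate(range(-180, 180, 30))]

if __name__ == "__main__":
    train, test = split_by_longitude(TILES)
    print(len(train), "training tiles;", len(test), "test tiles")
```

Holding out a whole region, instead of scattering test tiles at random, means the network is scored on terrain it has never seen, which is closer to how it would be used on a new body such as Mercury.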

By studying craters, Ali-Dib believes that they can better understand the distribution of material and the physics that occurred during the early stages of the solar system. In other words, craters might offer a window into the history of the solar system.

There are plans to further improve the algorithm to allow researchers to find more craters, and to expand operations into other parts of the solar system like Mars, Ceres, and the moons of Jupiter and Saturn.


Machine learning has found its way into various use cases, not just in practical fields like autonomous vehicles, but also in scientific research. Machine learning has helped advance medical research, and although Engelhardt maintains that a “black box” solution will not suffice for better understanding biology, people like her are nevertheless attempting to create models that can better analyze biological data.

Using their machine learning solution, Ali-Dib’s team has detected double the number of craters on the moon. So successful is the model that there are plans to expand the algorithm’s use to other bodies in the solar system. Machine learning is not only saving users time and effort on tedious tasks; it is also making things better.