Algorithm As Accurate As Dermatologists in Diagnosing Skin Cancer

Computer algorithm for skin cancer matched performance of human dermatologists.

An artificial intelligence diagnosis algorithm for skin cancer is as accurate as a board-certified dermatologist, according to a recent study.

To diagnose skin cancer, dermatologists examine the lesion with the naked eye and with the aid of a dermatoscope. If these methods are inconclusive, or the dermatologist believes the lesion may be cancerous, a biopsy is taken.

In a study published in Nature, investigators developed a database of nearly 130,000 skin disease images and trained the algorithm to visually diagnose potential cancer. The results showed the algorithm performed with impressive accuracy, according to the authors.

“We realized it was feasible, not just to do something well, but as well as a human dermatologist,” said investigator Sebastian Thrun. “That’s when our thinking changed. That’s when we said, ‘Look, this is not just a class project for students, this is an opportunity to do something great for humanity.’”

The algorithm was tested against 21 board-certified dermatologists. The results of the study showed that its diagnoses of skin lesions matched the performance of the dermatologists.

Although deep learning has a history in computer science, it has only recently been applied to visual processing tasks. According to the authors, the core of machine learning is that a computer is trained to figure out a problem, as opposed to having the answers programmed into it.

“We made a very powerful machine learning algorithm that learns from data,” said co-lead author Andre Esteva. “Instead of writing into computer code exactly what to look for, you let the algorithm figure it out.”

The investigators fed the algorithm each image as raw pixels with an associated disease label. This method required very little processing or sorting of images prior to classification, meaning the algorithm can work with a wider range of data than other methods.
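The idea of pairing raw pixels with a disease label can be sketched in a few lines. This is an illustrative toy example with made-up images and labels, not the study's actual pipeline: the point is that each training example is just pixel values plus a diagnosis, with no hand-engineered features.

```python
# Illustrative sketch (not the study's code): build training examples
# directly from raw pixel values and disease-label strings.

def build_dataset(images, label_names):
    """Map each (pixels, disease label) pair to (pixels, label index)."""
    # The label set is whatever diagnoses appear in the data.
    index = {name: i for i, name in enumerate(sorted(set(label_names)))}
    examples = [(pixels, index[name]) for pixels, name in zip(images, label_names)]
    return examples, index

# Two tiny 2x2 RGB "images" flattened to raw pixel values.
images = [[210, 180, 160] * 4, [90, 60, 50] * 4]
labels = ["benign nevus", "melanoma"]
examples, index = build_dataset(images, labels)
```

Because nothing beyond this pairing is required, the same scheme works for any labeled image collection, which is what lets the approach scale across many diseases.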

To build their algorithm, the investigators started from an algorithm developed by Google that had already been trained on 1.28 million images from 1000 object categories. Although it could differentiate between common objects, it could not differentiate between skin lesions.

“There’s no huge dataset of skin cancer that we can just train our algorithms on, so we had to make our own,” said co-lead author Brett Kuprel. “We gathered images from the internet and worked with the medical school to create a nice taxonomy out of data that was very messy; the labels alone were in several languages, including German, Arabic, and Latin.”

To classify each of the internet images, the investigators teamed up with co-author Helen M. Blau and dermatologists from Stanford Medicine. Overall, the team was able to collect approximately 130,000 images of skin lesions representing more than 2000 different diseases.

Only high-quality, biopsy-confirmed images provided by the University of Edinburgh and the International Skin Imaging Collaboration Project were used during testing. Each dermatologist was asked whether, based on each image, they would proceed with a biopsy or treatment, or reassure the patient.

Success was evaluated by how well the dermatologists correctly diagnosed both cancerous and non-cancerous lesions in more than 370 images. The algorithm's performance was measured by constructing a sensitivity-specificity curve, where sensitivity represented its ability to correctly identify malignant lesions and specificity represented its ability to correctly identify benign lesions, according to the study.

The algorithm was then assessed on 3 diagnostic tasks: keratinocyte carcinoma classification, melanoma classification, and melanoma classification using dermoscopy images. In each of the 3 tasks, the algorithm matched the performance of the 21 dermatologists, with the area under the sensitivity-specificity curve amounting to at least 91% of the total area of the graph, according to the study.

An advantage of the AI algorithm is that its sensitivity can be adjusted up or down, allowing investigators to tune its response depending on what they want to assess.
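Tuning the sensitivity amounts to choosing where on the curve the classifier operates. The sketch below, again with made-up scores rather than anything from the study, picks the highest threshold that still achieves a target sensitivity; lowering the target trades missed cancers against benign lesions flagged for biopsy.

```python
import math

# Illustrative sketch: choose a decision threshold that guarantees a
# target sensitivity on held-out scores (1 = biopsy-confirmed malignant).

def pick_threshold(scores, labels, target_sensitivity):
    """Return the highest threshold that still achieves the target sensitivity."""
    malignant_scores = sorted((s for s, y in zip(scores, labels) if y == 1),
                              reverse=True)
    # To catch at least the target fraction of malignant cases, the
    # threshold must sit at or below the k-th highest malignant score.
    k = math.ceil(target_sensitivity * len(malignant_scores))
    return malignant_scores[k - 1]

scores = [0.95, 0.80, 0.60, 0.40, 0.30, 0.10]
labels = [1,    1,    0,    1,    0,    0]
threshold = pick_threshold(scores, labels, 1.0)  # catch every malignant lesion
```

In a screening setting one would typically demand very high sensitivity (miss as few cancers as possible) and accept the resulting drop in specificity; for other uses the balance can be shifted the other way.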

As of now, the algorithm only exists on a computer, but investigators hope to make it compatible with a smartphone in the near future.

“My main eureka moment was when I realized just how ubiquitous smartphones will be,” Esteva said. “Everyone will have a supercomputer in their pockets with a number of sensors in it, including a camera. What if we could use it to visually screen for skin cancer? Or other ailments?”

The investigators are confident that transitioning the algorithm to smartphones will be relatively easy. However, they noted that more research needs to be done in a real-world clinical setting.

“Advances in computer-aided classification of benign versus malignant skin lesions could greatly assist dermatologists in improved diagnosis for challenging lesions and provide better management options for patients,” said co-author Susan Swetter. “However, rigorous prospective validation of the algorithm is necessary before it can be implemented in clinical practice, by practitioners and patients alike.”