Reply to the Letter to the Editor: “COVID-19, AI enthusiasts, and toy datasets: radiology without radiologists”

by H.R. Tizhoosh, Jennifer Fratesi (tizhoosh@uwaterloo.ca)

Dear Editor,

In a reply to our letter “COVID-19, AI enthusiasts, and toy datasets: radiology without radiologists” [1], Sérgio D. Dias and Pedro E. M. Lopes (FastCompChem, Lda) “advocate a two-way communication between the clinical and technological worlds.” We certainly support this suggestion. The critique in our letter is directed at works that do precisely not value this “communication” and whose non-radiologist authors publish sketchy results based on toy datasets, in violation of all established good practices for curating reliable datasets. Dias and Lopes state that their “view aligns in general with the opinion” expressed in our letter. However, they also say that in their “experience, toy datasets are still of value, even when not following all accepted standards.” This becomes a major concern when works using toy datasets are published amid a pandemic under aggrandizing names such as “COVID-Net,” with implied or direct claims of diagnostic capability. It seems Dias and Lopes would agree with this as well. They continue that they “used the compilation of Cohen et al. … as source of radiographs. The dataset is certainly not the well curated…but it was quickly available allowing the initial development effort.” Although there may be some value in using toy datasets for initial development in the lab, what we categorically reject is reporting “faulty results based on creating amateur datasets and training sketchy AI solutions hastily to publish online.” Later improving such faulty results and publishing them in reputable venues does not help either, because the damage has already been done.

Dias and Lopes “fully agree with the ideas” we expressed in our letter, but they “would not discard the use of toy datasets if that is the only avenue of development, including training of deep learning algorithms.” Initial experiments in a lab may be fine with toy datasets. However, training deep networks on amateur data is risky if those networks are made publicly available. We take comfort in Dias and Lopes’ further agreement when they state: “the true validation can only occur in controlled clinical trials under the supervision of radiologists though.”