Detection and removal of texts burned in medical images using deep learning
Files
Szelagowski_21661800_2023.pdf
Open access - Adobe PDF
- 25.31 MB
Details
- Abstract
- This master's thesis addresses a critical yet under-explored aspect of medical imaging: the removal of burned-in text that can bias AI algorithms and threaten patient confidentiality. While anonymization of DICOM tags is a well-established practice, removing patient identifiers burned into the pixels of medical images remains a challenge because accessible, open-source tools are lacking. To address this, the study presents a new approach that applies a scene text detection algorithm called TextBoxes. Using a synthetic dataset derived from The Cancer Imaging Archive, we trained different models and obtained impressive results. This achievement led to the creation of MedTextCleaner (MTC), an open-source and user-friendly plugin for Orthanc. MTC, which incorporates our deep learning model, automates the de-identification process by suggesting potential text regions within an image for removal while allowing manual user validation and adjustment. The integration of MedTextCleaner into Orthanc represents a valuable development in DICOM image anonymization, which can be especially useful for teaching and research applications.
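The suggest-then-validate workflow described in the abstract can be illustrated with a minimal sketch. This is not the MedTextCleaner code and does not use the Orthanc plugin API; the function name `redact_regions`, the box format `(x_min, y_min, x_max, y_max)`, and the toy image are all assumptions made for illustration. It shows only the final step: once a detector has proposed text boxes and a user has confirmed some of them, the confirmed pixel regions are overwritten.

```python
from typing import List, Tuple

# Hypothetical box format: (x_min, y_min, x_max, y_max), max edges exclusive.
Box = Tuple[int, int, int, int]


def redact_regions(pixels: List[List[int]], boxes: List[Box], fill: int = 0) -> List[List[int]]:
    """Return a copy of `pixels` with every box overwritten by `fill`."""
    height = len(pixels)
    width = len(pixels[0]) if height else 0
    cleaned = [row[:] for row in pixels]  # leave the input image untouched
    for x_min, y_min, x_max, y_max in boxes:
        # Clamp each box to the image bounds so oversized boxes are safe.
        x0, x1 = max(0, x_min), min(width, x_max)
        y0, y1 = max(0, y_min), min(height, y_max)
        for y in range(y0, y1):
            for x in range(x0, x1):
                cleaned[y][x] = fill
    return cleaned


# Toy 4x6 grayscale image standing in for decoded pixel data.
image = [[9] * 6 for _ in range(4)]

# Boxes a text detector might suggest (made-up values).
suggested: List[Box] = [(1, 0, 4, 2), (4, 3, 10, 9)]

# Manual validation step: here the user keeps only the first suggestion.
validated = [suggested[0]]

cleaned = redact_regions(image, validated)
```

In a real pipeline the boxes would come from the trained detector and the pixel data from a decoded DICOM instance; filling with a constant value is one simple choice among several possible redaction strategies.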