TY  - JOUR
AU  - Bourdois, Loick
AU  - Avalos, Marta
AU  - Chenais, Gabrielle
AU  - Thiessard, Frantz
AU  - Revel, Philippe
AU  - Gil-Jardine, Cedric
AU  - Lagarde, Emmanuel
PY  - 2021/04/18
Y2  - 2024/03/28
TI  - De-identification of Emergency Medical Records in French: Survey and Comparison of State-of-the-Art Automated Systems
JF  - The International FLAIRS Conference Proceedings
JA  - FLAIRS
VL  - 34
IS  - 0
SE  - Special Track: AI in Healthcare Informatics
DO  - 10.32473/flairs.v34i1.128480
UR  - https://journals.flvc.org/FLAIRS/article/view/128480
SP  - 
AB  - In France, structured data from emergency room (ER) visits are aggregated at the national level to build a syndromic surveillance system for several health events. For visits motivated by a traumatic event, information on the causes is stored in free-text clinical notes. To exploit these data, an automated de-identification system guaranteeing protection of privacy is required. In this study we review available tools for de-identifying free-text clinical documents in French. A key point is how to overcome the resource barrier that hampers NLP applications in languages other than English. We compare rule-based, named entity recognition, new Transformer-based deep learning, and hybrid systems using, when required, a fine-tuning set of 30,000 unlabeled clinical notes. The evaluation is performed on a test set of 3,000 manually annotated notes. Hybrid systems, combining capabilities in complementary tasks, show the best performance. This work is a first step toward a national surveillance system based on the exhaustive collection of ER visit reports for automated trauma monitoring.
ER  - 