Explainable and generalizable audio deepfake detection for forensic applications using foundation models

Position: PhD student - Thesis offer
Department: Digital Security
Date: 07-2026
Reference: SN/MT/deform/062026

EURECOM is opening a PhD position in the Digital Security Department in the area of audio deepfake detection, speech forensics, and explainable artificial intelligence. The position is part of the Horizon Europe project DEFORM — Deepfake detection and Explainability using FOundation Representation Models — which aims to develop trustworthy AI-based forensic tools for the detection, explanation, and interpretation of manipulated or synthetic audio, image, and video evidence.

The PhD candidate will work on explainable and generalizable methods for detecting synthetic or manipulated speech in realistic forensic scenarios. Current audio deepfake detectors often perform well on known benchmarks but remain vulnerable to unseen speech synthesis methods, voice conversion systems, codec and channel variability, short-duration utterances, background noise, and adversarial manipulations. Moreover, most systems operate as black boxes and provide limited explanations of why a given recording is considered authentic or manipulated.

The objective of the PhD is to develop robust, interpretable, and forensic-ready audio deepfake detection methods. The research will investigate the use of self-supervised and foundation audio models, such as WavLM, wav2vec 2.0, HuBERT, Whisper, and related multimodal models, together with few-shot learning, domain adaptation, and adversarial/data augmentation strategies. A central goal will be to improve generalization to unseen attacks and real-world acoustic conditions while producing explanations that are meaningful to forensic practitioners.

The PhD work will include the following research directions:

Development of audio deepfake detection methods robust to unseen synthesis and voice conversion techniques;
Adaptation and fine-tuning of foundation audio models for forensic manipulation detection;
Few-shot and low-resource learning strategies for rapid adaptation to new deepfake generation methods;
Robustness analysis under realistic conditions, including compression, transmission artefacts, background noise, reverberation, short utterances, and adversarial perturbations;
Explainable AI methods for audio forensics, including time-frequency saliency, artefact localization, SHAP-style explanations, and human-interpretable forensic rationales;
Contribution to multimodal forensic benchmarks and possible integration with the broader DEFORM framework for audio, image, and video deepfake detection;
Evaluation of the proposed methods in line with forensic requirements such as reliability, traceability, interpretability, and reproducibility.

The successful candidate will join EURECOM’s research activities in speech security, biometric security, audio forensics, and deepfake detection. The work will be carried out in close collaboration with academic, industrial, forensic, and law-enforcement partners involved in the DEFORM project. The PhD candidate will contribute to scientific publications, international benchmarks, open research outputs, and project deliverables.

Requirements

Education Level / Degree:

Master’s degree or equivalent in computer science, electrical engineering, signal processing, machine learning, artificial intelligence, data science, or a related field.

Field / specialty:

Speech processing, audio signal processing, machine learning, deep learning, biometric security, multimedia forensics, or trustworthy AI.

Technologies:

Deep learning for audio and speech;
Self-supervised and foundation models for speech/audio;
Audio deepfake detection, anti-spoofing, or speaker verification;
Explainable AI and model interpretability;
Python and common machine learning frameworks such as PyTorch;
Experience with speech/audio toolkits and datasets is an advantage.

Languages / systems:

Excellent programming skills in Python, including experience with deep learning frameworks such as PyTorch or TensorFlow, with CUDA, and with GPU-based training environments;
Good command of Linux-based development environments;
Good written and spoken English.

Other skills / specialties:

Strong analytical and problem-solving skills;
Ability to work independently and as part of an international research team;
Interest in applied research with societal, legal, and forensic impact;
Good scientific writing and communication skills;
Previous experience with speech processing, ASVspoof-style tasks, audio deepfake detection, or explainable AI will be considered a strong asset.

Other important elements:

The position is based at EURECOM, Sophia Antipolis, France.
The PhD will be supervised within the Digital Security Department.
The candidate will participate in the scientific activities of the DEFORM project and collaborate with European partners.
The position involves publishing in leading international conferences and journals in speech processing, multimedia forensics, biometrics, and AI security.

Application

The application must include:

Detailed curriculum vitae;
Motivation letter describing the candidate’s background, research interests, and suitability for the position;
Academic transcripts for Bachelor’s and Master’s degrees;
Master's thesis, if available;
List of publications, if applicable;
Names and contact details of at least two references.

Applications should be submitted by e-mail to todisco@eurecom.fr, with secretariat@eurecom.fr in cc, quoting the reference SN/MT/deform/062026.

Start date: September 1, 2026
Type of employment contract: Fixed-term doctoral contract under private law

More info

SN_MT_DEFORM_062026_US.pdf (99.79 KB)