Dipjyoti Paul

Ph.D. Student | Research Scientist | Machine Learning | Deep Learning | Audio Signal Processing | Computer Vision | Conversational AI

Computer Science Department, University of Crete


I am a PhD researcher at the Department of Computer Science, University of Crete, Greece, supervised by Dr. Yannis Stylianou and co-advised by Dr. Yannis Pantazis and Dr. Simon King. My research interests span the theory and applications of machine learning algorithms, especially deep learning. My research agenda is to establish a thorough understanding of the theoretical concepts and then apply them to real-world problems. My interests cover diverse areas such as applied probability theory, machine/deep learning, computer vision, signal/speech processing, and optimization theory. I also have a broad interest in probabilistic machine learning methods and generative models.

  • Machine/Deep Learning
  • Audio Signal Processing
  • Speech Synthesis & Voice Conversion
  • Image Processing & Computer Vision
  • Spoofing & Anti-spoofing
  • Speaker Recognition

  • PhD in Computer Science, 2018 - Present

    University of Crete

  • MS in Electronics & Electrical Communication Engineering, 2014 - 2017

    Indian Institute of Technology Kharagpur

  • BTech in Electronics & Communication Engineering, 2009 - 2013

    West Bengal University of Technology


Statistics & Signal Processing
Machine/Deep Learning


Senior Research Scientist (Part-time)
Stealth Startup, London, UK (Remote)
Oct 2020 – Present Greece

Responsibilities include:

  • Development of voice morphing algorithms.
  • Development of Text-to-Speech (TTS) systems.
  • Working on improving speech quality from lower bandwidth and lower bit-depth samples.
ML/AI/Audio/CV Freelancer
Oct 2017 – Sep 2020 Greece
  • My work ranged from general task formulation and consulting to the full design-development-deployment cycle. I helped identify and develop viable solutions for ML-based applications in areas such as audio and computer vision.
Visiting Researcher
Chania General Hospital, Greece
Sep 2020 – Sep 2020 Greece
  • Analysed voice pathology data and designed novel deep learning algorithms to enhance speech intelligibility.
Visiting Researcher
Voxygen, France
Mar 2019 – May 2019 France
  • Developed a pipeline for expressive text-to-speech synthesis using sequence-to-sequence learning.
Visiting Researcher
Foundation for Research and Technology - Hellas, Greece
Aug 2018 – Oct 2018 Greece
  • Developed new training algorithms for Generative Adversarial Networks (GANs).
Marie Skłodowska-Curie Fellow
ENRICH European Union’s Training Network (ETN).
Oct 2017 – Sep 2020 Greece
  • Introduced universal multi-speaker, multi-style expressive TTS systems.
  • Developed speech intelligibility enhancement algorithms in speech synthesis.
  • Analyzed the feasibility of incorporating variational disentangled representation learning in real-world scenarios.
  • Proposed novel voice morphing algorithms.
Junior Project Officer
Indian Institute of Technology, Kharagpur & Indian Space Research Organization (ISRO), Govt. of India.
Dec 2013 – Aug 2017 India
  • Built authentication systems that enhance the security of automatic speaker verification systems against intentional circumvention using fake audio recordings.


Invited Talk at Apple Siri.
Presented my research work on Text-to-Speech Synthesis.
Public understanding event at the Royal Institution in London.
Presented my research work at the public understanding event.
Google’s Speech Technology Summit.
Invited to attend Google’s 3rd Speech Technology Summit at Google London.
Marie Skłodowska-Curie Fellowship
Awarded a Marie Skłodowska-Curie Fellowship for 2017–2020 by the European Union’s training network (ETN).
Travel Grant.
Received full financial assistance to present a research paper at ICASSP 2017, held in New Orleans, Louisiana, USA.
Reviewer (Journals and Conferences)
IEEE/ACM Transactions on Audio, Speech, and Language Processing
IEEE Journal of Selected Topics in Signal Processing
IEEE Access
Computer Speech and Language
International Conference on Acoustics, Speech, and Signal Processing (ICASSP)
Conference of the International Speech Communication Association (INTERSPEECH)
Spoken Language Technology (SLT)
Research Fellowship
Awarded a fellowship for 2014–2017 by the Indian Space Research Organization (ISRO), Govt. of India.


Image Translation

Image-to-Image translation using GANs.

Image Generation

Generating high-quality images using GANs.

Text-to-Speech Synthesis

Building voice from text.

Voice Conversion

Modifying the speech of a source speaker so that it sounds like that of a target speaker.

Spoofing Countermeasure

Developing a bona fide-spoofed classifier (spoofing countermeasure) for speech data.