Work Experience

Software Engineer - REVE Systems

February 2023 - Present
Dhaka, Bangladesh

Deployed a speech-based gender and age prediction model, achieving 93% and 74% accuracy, respectively.
Optimized a VITS text-to-speech model, custom-trained on Bengali datasets, for ONNX, doubling inference speed for rapid cross-platform deployment.
Developed a YourTTS-based multi-speaker model, enabling diverse voice synthesis with a single model.
Enhanced ASR by training NVIDIA NeMo's QuartzNet on Bengali datasets and decoding with beam search, improving prediction accuracy.
Built a speaker diarization system using NVIDIA TitaNet-L, identifying speakers in multi-party conversations with over 80% accuracy.
Implemented a 3x faster FastSpeech 2 pipeline integrating TorchServe, WebSocket, and Redis for efficient batch processing.
Conducted unsupervised clustering of noise samples using K-means, t-SNE, and silhouette scoring to inform future noise augmentation strategies.
Developed a three-layer biometric authentication system with speaker, facial, and fingerprint verification, supported by ROC and AUC analysis for threshold setting.
Created a NISQA-based speech quality evaluation tool for quantitative assessment of speech signal integrity and clarity.
Developed a Keycloak SPI for integrating face and fingerprint authentication, enhancing security with biometric verification.

Technologies: Python, FastAPI, scikit-learn, TensorFlow, PyTorch, TensorBoard, MLflow, Redis, TorchServe, WebSocket, MongoDB, Linux, Lambda Labs, Java, Prompt Engineering

Junior Software Engineer - REVE Systems

September 2022 - January 2023
Dhaka, Bangladesh

Implemented audio analysis techniques for pitch detection, volume normalization, and energy calculation.
Doubled voice activity detection accuracy across use cases by switching from WebRTC VAD to Silero VAD.
Enhanced audio quality using DeepFilterNet and Meta's pretrained speech enhancement models for noise reduction.
Optimized audio data processing speed by a factor of 5 using Python's multiprocessing.
Configured Azure and Google Cloud STT/TTS APIs for comprehensive speech-to-text and text-to-speech evaluations.
Applied multi-speaker speech separation using pretrained Conv-TasNet and DPRNN models.

Technologies: Python, FastAPI, Matplotlib, Seaborn, NumPy, Pandas, GCP

Trainee Software Engineer - REVE Systems

April 2022 - August 2022
Dhaka, Bangladesh

Engineered a Bengali Text Normalizer and trained a FastSpeech 2 TTS model to support Bengali language processing.
Gained proficiency in containerization with Docker and API development with Flask.
Performed audio data analysis and signal processing, focusing on data visualization for insights.

Technologies: JavaScript, Python, Docker, Flask, Postman, GitHub

Intern - SELISE rockin' software

May 2019 - June 2019
Dhaka, Bangladesh

Built an online quiz platform; front-end in HTML, CSS, JavaScript, Bootstrap; back-end in PHP, Ajax, MySQL.
Learned Git, Scrum, and SDLC practices; built admin controls for question curation.
Implemented registration for timed student tests and performance comparison.

Technologies: HTML, CSS, JavaScript, Bootstrap, PHP, Ajax, MySQL