Open to opportunities

Sheikh Khairul Momin Mohammad Tahmid

MSc Data Science (Kingston University) · 92%+ accuracy on transformer NLP models · SHAP & LIME explainability · 6+ years of data-informed professional experience across analytics, operations, and business development.

Over 6 years of professional experience delivering measurable outcomes: 35% revenue growth, 40% repeat business uplift, and 12% cost reduction. MSc in Data Science from Kingston University, with independent portfolio projects spanning fintech, healthcare, legal NLP, HR tech, and crime analytics.

Python · SQL · MySQL · PostgreSQL · Machine Learning · NLP · BERT · RoBERTa · XGBoost · SHAP · LIME · FastAPI · Docker · AWS
Highlights

Key Achievements

Quantified outcomes across machine learning, explainable AI, and commercial performance, combining production-grade technical delivery with measurable business impact.

92%+ Transformer Accuracy: BERT and RoBERTa fine-tuned on real-world NLP classification tasks
99.8% Classifier Accuracy: TF-IDF model across 5 crime categories on 280k+ records
R²=0.9996 Regression Performance: XGBoost trained on 4,137 features for lead quality scoring
35% Revenue Growth: Delivered through data-informed partnership and business strategy
40% Repeat Business Uplift: Driven by 50+ analytical client presentations and insight delivery
12% Cost Reduction: Achieved by leading offshore teams with data-driven performance metrics
Explainable AI: Built SHAP and LIME pipelines for token-level interpretation and model transparency across NLP, fintech, healthcare, legal, and HR analytics projects.
HR Tech: Architected an AI hiring intelligence platform with SBERT candidate ranking, SHAP/LIME explainability per match, and fairness auditing across gender, ethnicity, and age groups.
🏆 Recognition: Awarded "Best Employee of the Year" 2021 by Quantanite for data-driven leadership and consistent operational excellence.
Portfolio

Featured Projects

Real-world applications of data science, machine learning, and explainable AI.

HR Tech / NLP · Explainable AI Recruitment Intelligence System

Explainable AI Hiring Intelligence Platform

Modular 12-app Django REST API with JWT auth and role-based permissions, generating 384-dim SBERT embeddings to cosine-rank 3,000+ candidates per job posting. SHAP and LIME explanations are fused 60/40 into a unified explainability report per match, with disparate impact ratio and 4/5 rule computed across gender, ethnicity, and age groups. An async Celery pipeline orchestrates CV parsing, embedding generation, and batch matching, while a trainable GradientBoosting scorer ships with zero-downtime fallback to fixed weights.
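
The ranking and fairness steps described above can be illustrated with a minimal standard-library sketch. It is not the platform's code: the function names are hypothetical, the toy vectors stand in for 384-dimensional SBERT embeddings, and the fairness check assumes the disparate impact ratio is the lowest group selection rate divided by the highest, compared against the 0.8 threshold of the 4/5 rule.

```python
import math
from collections import defaultdict


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_candidates(job_vec, candidate_vecs):
    """Return (candidate_id, score) pairs sorted from best to worst match."""
    scored = [(cid, cosine_similarity(job_vec, vec))
              for cid, vec in candidate_vecs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def disparate_impact(outcomes):
    """outcomes: iterable of (group, selected) pairs.

    Returns (ratio, passes_four_fifths), where ratio is the lowest group
    selection rate divided by the highest group selection rate.
    """
    totals, selected = defaultdict(int), defaultdict(int)
    for group, was_selected in outcomes:
        totals[group] += 1
        selected[group] += int(was_selected)
    rates = {group: selected[group] / totals[group] for group in totals}
    highest = max(rates.values())
    ratio = min(rates.values()) / highest if highest > 0 else 1.0
    return ratio, ratio >= 0.8
```

With selection rates of 0.5 for one group and 0.3 for another, the ratio is 0.6, which falls below 0.8 and would be flagged by the 4/5 rule.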

Python · Django · DRF · SBERT · spaCy · SHAP · LIME · Celery · Redis · PostgreSQL · React · Docker
384 Embedding Dims · 3K+ Candidates Ranked · 60/40 SHAP/LIME Fusion · 12 Django Apps
View on GitHub · View Demo
Healthcare AI / Survival Analysis · Clinical SaaS for Patient Retention Intelligence

TrialGuard: Clinical Trial Patient Dropout Prediction Platform

Django SaaS platform predicting clinical trial dropout up to 60 days in advance. XGBoost classifier with per-visit SHAP waterfall plots identifies at-risk patients at the individual level. Cox Proportional Hazards model delivers hazard ratios and 30/60/90-day retention probabilities, with stratified Kaplan-Meier curves segmented by risk tier (Low to Critical) via Lifelines. Cohort forecasts with 95% confidence intervals served through a DRF REST API; branded ReportLab PDFs generated per patient including SHAP feature drivers, survival curves, and clinical action logs.
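
For readers unfamiliar with the Kaplan-Meier curves mentioned above, the estimator can be sketched in plain Python. This is an illustration only: the project itself uses Lifelines, and `kaplan_meier` and `retention_at` are hypothetical helper names.

```python
def kaplan_meier(durations, events):
    """Return [(time, survival_probability)] at each time with a dropout.

    durations: days until dropout or censoring for each patient.
    events: True if dropout was observed, False if the patient was censored.
    """
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(set(durations)):
        dropouts = sum(1 for d, e in zip(durations, events) if d == t and e)
        leaving = sum(1 for d in durations if d == t)
        if dropouts:
            # Multiply by the fraction of at-risk patients who stayed at time t.
            survival *= 1 - dropouts / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


def retention_at(curve, day):
    """Estimated retention probability at a given day (step function)."""
    probability = 1.0
    for t, survival in curve:
        if t <= day:
            probability = survival
    return probability
```

Calling `retention_at(curve, 30)`, `retention_at(curve, 60)`, and `retention_at(curve, 90)` reads the three time horizons off a single fitted curve.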

Python · Django · DRF · XGBoost · SHAP · Lifelines · Cox PH · ReportLab · MySQL
60 Days Ahead · 4 Risk Tiers · 95% CI Forecasts · 3 Time Horizons
View on GitHub · View Demo
Legal AI / NLP · Multi-Model Legal Intelligence System

Contract Intelligence & Power Imbalance Platform

Fine-tuned LegalBERT and Legal-RoBERTa-large ensemble classifying every clause across a 100-type unified taxonomy (CUAD, LEDGAR, and MAUD; 132k clauses; macro F1 0.606). Anomaly detection fuses Isolation Forest with a 1024-dim autoencoder; bilateral power imbalance is scored −100 to +100 via sentiment, modal verb, and obligation NLP. Every prediction is explained at token level with SHAP, served through a FastAPI REST API and dark-theme dashboard with downloadable PDF reports.
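
Macro F1, the metric quoted above, averages per-class F1 scores with equal weight, so rare clause types count as much as common ones. A minimal sketch of the calculation, using hypothetical labels rather than the project's taxonomy:

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over every label seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
        denominator = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); a class with no hits at all scores 0.
        scores.append(2 * tp / denominator if denominator else 0.0)
    return sum(scores) / len(scores)
```

With 100 clause types, a handful of poorly predicted rare classes pulls the macro average down even when overall accuracy is high.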

Python · PyTorch · LegalBERT · Legal-RoBERTa · SHAP · FastAPI · UMAP · SQLite
0.606 Macro F1 · 100 Clause Types · 132K Training Clauses · ±100 Imbalance Scale
View on GitHub · View Demo
Fintech / AI · Production AI Risk Platform

AI Revenue Leakage Detection Platform

Production contract intelligence platform with XGBoost/Logistic Regression ensemble (70/30) scoring invoice leakage probability, Isolation Forest anomaly detection per record, and Prophet forecasting with 90% CI bands over a 24-month revenue horizon. SHAP attribution pipeline exposes top risk drivers per invoice; a 7-rule leakage engine generates 13k+ alerts covering missing payments, underbilling, and duplicates. Predictions served via DRF API with a Chart.js dashboard and per-invoice drill-down modals.
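
A rough sketch of two pieces described above: the weighted ensemble and one rule of the kind a leakage engine might contain. It assumes the 70% weight sits on the XGBoost probability, and the invoice field names (`id`, `customer`, `amount`, `date`) are hypothetical.

```python
from collections import Counter


def blended_leakage_probability(p_xgb, p_logreg, xgb_weight=0.7):
    """Weighted average of two model probabilities (70/30 by default)."""
    return xgb_weight * p_xgb + (1 - xgb_weight) * p_logreg


def duplicate_invoice_alerts(invoices):
    """Return ids of invoices that share a (customer, amount, date) key."""
    keys = [(inv["customer"], inv["amount"], inv["date"]) for inv in invoices]
    counts = Counter(keys)
    return [inv["id"] for inv, key in zip(invoices, keys) if counts[key] > 1]
```

Keeping rule-based alerts separate from the model score means each alert can be traced to a single, human-readable condition.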

Python · Django · XGBoost · Prophet · SHAP · Isolation Forest · MySQL · Chart.js
70/30 Ensemble Split · 13k+ Leakage Alerts · 24mo Forecast Horizon · 7 Leakage Rules
View on GitHub · View Demo
Fintech / AI · Production AI Risk Platform

AI Portfolio Stress Testing Platform

Production risk platform with GMM regime classification, Ledoit-Wolf shrinkage, and XGBoost/ElasticNet trained on 50+ macro features, delivering Sharpe 4.42, VaR -12.07%, and CVaR -17.48%. Stress-tested 20+ historical crises with SHAP attribution and plain-English narratives, plus reverse stress-testing for threshold-based shock back-solving. FastAPI dashboard refreshes 20 live instruments hourly.
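
VaR and CVaR, quoted above, can be estimated in several ways. The sketch below shows the simplest, historical simulation, and is an assumption rather than the platform's method; the confidence level is a parameter because the one used for the reported figures is not stated here.

```python
import math


def historical_var_cvar(returns, confidence=0.95):
    """Historical-simulation VaR and CVaR, both expressed as returns.

    VaR is the least severe return inside the worst (1 - confidence) tail;
    CVaR is the mean of that tail.
    """
    if not returns:
        raise ValueError("returns must not be empty")
    ordered = sorted(returns)
    # The small epsilon guards against floating-point error in 1 - confidence.
    tail_size = max(1, math.floor(len(ordered) * (1 - confidence) + 1e-9))
    tail = ordered[:tail_size]
    return tail[-1], sum(tail) / len(tail)
```

CVaR is always at least as severe as VaR because it averages the losses beyond the VaR cut-off, which matches the ordering of the two figures reported above.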

Python · FastAPI · XGBoost · ElasticNet · SHAP · GMM · SciPy · yfinance · Chart.js
4.42 Sharpe Ratio · -12.07% VaR · 20+ Crises Stress-Tested · 50+ Macro Features
View on GitHub · View Demo
AI / LLMOps · Declarative LLM Agent Action Firewall

PolicyGate: AI Agent Action Firewall

Deployed a framework-agnostic LLM agent firewall with a YAML policy DSL (6 operators, hot-reload on file mtime), dual-layer safety enforcement via system prompt and mechanical gate, and per-response confidence scoring (High/Medium/Low) via a constrained secondary LLM call. Supports LangChain, CrewAI, and AutoGen via POST /gate. All gate decisions are logged to SQLite with WAL mode and threading.Lock() for concurrent write safety.
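
The mechanical gate can be pictured as a small rule evaluator. The sketch below is hypothetical throughout: the six operator names, the rule fields (`name`, `field`, `op`, `value`), and the deny-on-first-match behaviour are assumptions, since PolicyGate's actual DSL is not shown here. Rules are plain dicts of the shape a YAML loader would return.

```python
import operator

# Hypothetical operator names; PolicyGate's real six operators may differ.
OPERATORS = {
    "eq": operator.eq,
    "ne": operator.ne,
    "gt": operator.gt,
    "lt": operator.lt,
    "in": lambda value, options: value in options,
    "contains": lambda value, needle: needle in value,
}


def gate(action, rules):
    """Return ("deny", rule_name) for the first matching rule, else ("allow", None)."""
    for rule in rules:
        field_value = action.get(rule["field"])
        if field_value is None:
            continue  # The rule does not apply to actions lacking this field.
        if OPERATORS[rule["op"]](field_value, rule["value"]):
            return "deny", rule["name"]
    return "allow", None
```

Because the check is ordinary code rather than a prompt instruction, the decision does not depend on the model following its system prompt.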

Python · FastAPI · Pydantic · OpenAI SDK · OpenRouter · PyYAML · SQLite · Chart.js · Docker · HuggingFace Spaces
3 Frameworks Supported · 6 Policy Operators · 16 Pytest Tests · Live HF Spaces Demo
AI / ML · NLP-Powered Lead Scoring & Spam Classification

AI Lead Quality Scoring & Spam Detection

Fine-tuned BERT (99.12%) and RoBERTa (99.41%) for spam detection, paired with XGBoost scoring at R²=0.9996 across 4,137 engineered features from UK Companies House, firmographics, and DNS/SMTP signals. Deployed via FastAPI with live SHAP explanations, a Tailwind SPA, and an automated nightly retraining pipeline.
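
For reference, R² (the coefficient of determination quoted above) compares a model's squared error with that of always predicting the mean. A minimal sketch with a hypothetical function name:

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R² is undefined when all true values are identical")
    return 1 - ss_res / ss_tot
```

A value of 1 means the predictions match the targets exactly, and 0 means the model does no better than the mean.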

Done in collaboration with Prince Kumar Nath.

Python · PyTorch · Hugging Face · BERT · RoBERTa · XGBoost · FastAPI · SHAP · SQLite
99.41% Spam Detection · 0.9996 Lead Score R² · 30K Leads Engineered · 4,137 Features Built
View on GitHub
Game AI / Backend · Grounded AI Game Master Engine

Ironwood Dungeon: AI Game Master

Server-authoritative D&D 5E engine powered by NVIDIA NIM, with an LLM locked to 10 registered server-side tools, so inventory, HP, and room contents are verified before narration. Structured output enforced via game_response() schema across narrative, HP delta, and action list. Procedurally generates a 4-floor, 20-room dungeon with 7 room types and 4 unique floor bosses. Multiplayer for up to 4 players via Socket.IO with shared AI history per room, full conversation history persisted to disk across restarts, and deployed to HuggingFace Spaces via Docker with Git LFS for 100 MB+ assets.
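
The idea of locking a model to registered server-side tools can be sketched as a registry plus a dispatcher. The tool names and state layout below are hypothetical and are not taken from the Ironwood Dungeon codebase.

```python
TOOLS = {}


def register_tool(func):
    """Add a function to the registry of tools the model may call."""
    TOOLS[func.__name__] = func
    return func


@register_tool
def use_item(state, item):
    # The server checks the real inventory before anything is narrated.
    if item not in state["inventory"]:
        return {"ok": False, "reason": f"{item} is not in the inventory"}
    state["inventory"].remove(item)
    return {"ok": True, "used": item}


@register_tool
def apply_damage(state, amount):
    state["hp"] = max(0, state["hp"] - amount)
    return {"ok": True, "hp": state["hp"]}


def dispatch(state, tool_call):
    """Run a tool call requested by the model, rejecting unregistered names."""
    tool = TOOLS.get(tool_call["name"])
    if tool is None:
        return {"ok": False, "reason": f"unknown tool: {tool_call['name']}"}
    return tool(state, **tool_call["arguments"])
```

Because game state changes only inside these functions, the narration can be grounded in what the server actually holds rather than in what the model claims.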

Python · Flask · Flask-SocketIO · NVIDIA NIM · OpenAI SDK · JavaScript · Docker · HuggingFace Spaces
10 Server Tools · 20 Rooms / Run · 4 Floor Bosses · 4 Multiplayer
View on GitHub · View Demo
Featured · MSc Dissertation

Fake News Detection

MSc dissertation on fake news detection using large language models and explainable AI. Fine-tuned BERT and RoBERTa, achieving 92%+ accuracy on the LIAR dataset. Applied LIME and SHAP for token-level prediction transparency and interpretability.

Python · PyTorch · Hugging Face · BERT · RoBERTa · LIME · SHAP · NLP
92% Model Accuracy · 3 Datasets Integrated · 2 XAI Techniques
View on GitHub · Read Thesis Paper
Analytics · End-to-End Analytics Project

Montgomery County Crime Analytics

Engineered and analysed 280k+ crime records across geospatial, temporal, and behavioural dimensions. Built TF-IDF classification, crime severity prediction, KMeans clustering, and 74-month SARIMA forecasting pipelines, then surfaced insights through an interactive Plotly Dash dashboard.
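
The TF-IDF weighting behind the classifier can be shown with a plain-Python sketch using the textbook form tf × log(N / df). The function name and sample texts are hypothetical; library implementations such as scikit-learn's `TfidfVectorizer` add IDF smoothing and vector normalisation on top of this.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per document using tf * log(N / df)."""
    tokenised = [doc.lower().split() for doc in documents]
    # Document frequency: the number of documents containing each term.
    doc_freq = Counter(term for tokens in tokenised for term in set(tokens))
    n_docs = len(tokenised)
    vectors = []
    for tokens in tokenised:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors
```

Terms that appear in every description receive a weight of zero, so the classifier's signal comes from words that distinguish one crime category from another.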

Python · Pandas · Scikit-learn · Plotly Dash · NLP · SARIMA
280k+ Records Cleaned · 99.8% Classifier Accuracy · 0.998 Prediction Accuracy · 6 Clusters Identified
View on GitHub
Assessment · Geospatial Data Project

HMLR Boundary Extraction

Land Registry assessment on geospatial data handling and boundary extraction. Processed HM Land Registry boundary datasets, converting spatial data into structured GeoPackage format with automated cleaning, projection, and export workflows for GIS compatibility.

Python · GeoPandas · Shapely · Fiona · GIS · GeoPackage
View on GitHub
Technical Expertise

Skills & Technologies

A recruiter-focused technical stack spanning machine learning, NLP, explainable AI, analytics, visualisation, and geospatial data processing.

⌨️ Programming Languages
Python · SQL · MATLAB · Java · C++ · C · HTML · R Programming · JavaScript · React · Tailwind CSS
🤖 Data Science & AI
Machine Learning · Deep Learning · NLP · BERT · LegalBERT · RoBERTa · Legal-RoBERTa · SHAP · LIME · Time Series · SARIMA · TF-IDF · KMeans · Pandas · NumPy · scikit-learn · PyTorch · XGBoost · Hugging Face Transformers · Matplotlib · Seaborn · SciPy · SBERT · Isolation Forest · Prophet · Gradient Boosting · Survival Analysis · Lifelines · spaCy
📊 Data Analytics & Visualisation
Tableau · Excel · Jupyter Notebook · Plotly Dash · Dashboarding · Quantitative Research
🗺️ Geospatial Analysis
GeoPandas · Shapely · Fiona · GDAL/OGR · GIS · GeoPackage · Mapbox
🔧 Tools & Cloud
Git · GitHub · AWS · FastAPI · SQLite · Uvicorn · Pydantic · joblib · Chart.js · Jinja · Django · DRF · PostgreSQL · MySQL · Redis · Celery · Docker · JWT · Vite · ReportLab
Career

Professional Experience

Over 6 years of professional experience across analytics, operations, and business development.

1
Business Development Manager
Ideal PCO Licence, London, UK
Mar 2023 – Jan 2025
  • Defined and executed business goals using data-driven strategies to meet key performance targets.
  • Analysed industry patterns using Python, SQL, and Excel to support forecasting and commercial decision-making, contributing to 15% revenue growth.
  • Built high-value partnerships through data-informed strategy, contributing to 35% annual revenue growth.
  • Delivered 50+ client presentations supported by analytical insights, driving 40% more repeat business.
2
Senior Associate
Quantanite (formerly Taskeater), Dhaka, Bangladesh
Jul 2021 – Aug 2022
  • Tracked team tasks with performance data, achieving 99.8% on-time delivery and 98%+ quality.
  • Analysed team KPIs to identify process inefficiencies and implement data-informed improvements.
  • Used performance metrics to design and deliver training, reducing new hire ramp-up time by 20%.
  • Managed daily, weekly, and monthly client reporting cycles using performance data, achieving 100% on-time delivery and 98%+ quality.
  • Evaluated outputs to identify quality gaps, reducing error rates by 15% through data-based performance improvements.
  • Recognised as Best Employee of the Year for data-driven leadership and consistent operational excellence.
3
Associate
Quantanite (formerly Taskeater), Dhaka, Bangladesh
Jun 2017 – Jul 2021
  • Processed and validated 5K+ weekly fashion industry datasets with high accuracy and timeliness.
  • Collaborated with lead generation teams, analysing data trends to boost qualified leads by 15%.
  • Monitored data categorisation and team efficiency, partnering with QA to keep error rates under 4% through data quality checks and compliance tracking.
  • Applied data analysis skills to identify trends, improve reporting accuracy, and strengthen client satisfaction.
4
Analyst
Quantanite (formerly Taskeater), Dhaka, Bangladesh
Dec 2016 – Jun 2017
  • Built a track record of data accuracy, quality-control discipline, and dependable delivery in fast-paced operations.
  • Recognised internally for consistent quality standards, operational reliability, and readiness for promotion into broader analytical responsibilities.
  • Established the performance foundation that later supported KPI reporting, training design, and cross-functional process improvement.
Professional Strengths

Core Capabilities

A balanced mix of analytical thinking, operational excellence, and stakeholder communication developed through data science projects and commercial leadership roles.

🎯 Analytical & Strategic
Data-Driven Decision Making
Analytical Thinking
Critical Thinking
Portfolio Optimisation
Continuous Improvement Mindset
Change Management
⚙️ Operational & Performance
Performance Optimisation
Performance Management
Process Improvement
Operational Efficiency
KPI Tracking
Time Management
🤝 Communication & Collaboration
Cross-Functional Collaboration
Client Relationship Management
Mentoring & Training
Data Visualisation & Reporting
Academic Background

Education

🎓

MSc Data Science

Kingston University, London
Jan 2025 – Mar 2026

Dissertation: Enhancing Fake News Detection with Explainable AI

🎓

MBA Business Administration

Cardiff Metropolitan University
Jun 2015 – Jul 2016

Project: New Start-up Proposal

🎓

MSc Electronics & Computer Engineering

University of Birmingham
Sep 2013 – Oct 2014

Dissertation: Highlighting Important Data in Visual Analytics

🎓

BSc Computer Engineering

Abu Dhabi University
Jan 2009 – Jun 2013

Minor in Management. Certificates of Excellence for Academic Achievement.

Professional Development

Certifications

🏆

AWS Cloud Bootcamp

ThinkCloudly
Mar 2026  •  5 CPE Hours

Certificate No: TC-032026-2PN03Y7-23388

🏆

Security Operation Center Bootcamp

ThinkCloudly
Apr 2026  •  5 CPE Hours

Certificate No: TC-042026-07HV84M-25960

🏆

IT Auditing & GRC Bootcamp

ThinkCloudly
Apr 2026  •  5 CPE Hours

Certificate No: TC-042026-PI5117K-26914

Get In Touch

Let's Connect

MSc Data Science graduate with 6+ years of professional experience delivering measurable business outcomes. Seeking junior data scientist, analytics engineer, or graduate scheme roles where I can apply transformer models, explainable AI, and end-to-end ML project experience across fintech, healthcare, legal, HR tech, or any data-driven organisation.

Open to: Junior Data Scientist · Data Analyst · Analytics Engineer · ML Engineer (Graduate) · Data Science Graduate Scheme