I'm Farrukh, an ML engineer who enjoys building production-grade ML systems and squeezing models into places they probably shouldn't fit.
I work primarily with PyTorch, TensorFlow, and Hugging Face Transformers, focusing on model optimization, deployment, and efficient AI systems. My background is in mechanical engineering, but I’ve spent the past year designing and deploying ML pipelines—turns out optimizing fluid flow equations isn’t that different from optimizing neural networks.
Right now, I'm focusing on projects that make ML systems leaner and easier to ship. The main one is a data-centric MLOps pipeline I'm building in public: starting from raw NYC taxi data and gradually shaping it into a production-grade ML workflow.
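To make the "data-first" part concrete, here's a minimal sketch of the kind of validation gate such a pipeline needs before raw trips enter training. The column names follow the public NYC Green Taxi schema, but the specific rules are illustrative assumptions, not the project's actual checks:

```python
import pandas as pd

# Columns from the public NYC Green Taxi schema (assumed subset)
REQUIRED = ["lpep_pickup_datetime", "lpep_dropoff_datetime",
            "trip_distance", "fare_amount"]

def validate_trips(df: pd.DataFrame) -> pd.DataFrame:
    """Drop rows that violate basic expectations before they enter the pipeline."""
    missing = set(REQUIRED) - set(df.columns)
    if missing:
        raise ValueError(f"missing columns: {missing}")
    for col in ("lpep_pickup_datetime", "lpep_dropoff_datetime"):
        df[col] = pd.to_datetime(df[col])
    ok = (
        (df["trip_distance"] > 0)
        & (df["fare_amount"] >= 0)
        & (df["lpep_dropoff_datetime"] > df["lpep_pickup_datetime"])
    )
    return df.loc[ok].reset_index(drop=True)
```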
Production Inference API – Built a DistilBERT sentiment-analysis service on AWS EC2 using FastAPI, containerized with Docker and deployed through a GitHub Actions CI/CD pipeline; dynamic quantization cut model size and latency by roughly 50%.
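A minimal sketch of what that service can look like, combining the quantization and serving pieces; the checkpoint name and endpoint shape are assumptions, not the deployed code:

```python
import torch
from fastapi import FastAPI
from pydantic import BaseModel
from transformers import AutoModelForSequenceClassification, AutoTokenizer

MODEL_ID = "distilbert-base-uncased-finetuned-sst-2-english"  # assumed checkpoint
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_ID).eval()
# Dynamic int8 quantization of the linear layers: roughly halves size and
# CPU latency, the kind of gain reported above.
model = torch.quantization.quantize_dynamic(model, {torch.nn.Linear}, dtype=torch.qint8)

app = FastAPI()

class TextIn(BaseModel):
    text: str

@app.post("/predict")
def predict(req: TextIn):
    inputs = tokenizer(req.text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits
    probs = logits.softmax(dim=-1)[0]
    label_id = int(probs.argmax())
    return {"label": model.config.id2label[label_id], "score": float(probs[label_id])}
```

Run locally with `uvicorn main:app`, then POST `{"text": "great service"}` to `/predict`.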
I was part of the UraanAI Techathon 2025, where our team built an integrated AI framework for manufacturing: computer vision for defect detection (99.6% accuracy), BiLSTM-GRU for predictive maintenance, and LightGBM for demand forecasting. The focus was deployment under real industrial constraints: limited compute, bandwidth, and cost.
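For flavor, here's a hypothetical PyTorch sketch of a BiLSTM-GRU stack for RUL regression on sensor windows; layer sizes and input shapes are illustrative, not the Techathon model:

```python
import torch
import torch.nn as nn

class BiLSTMGRU(nn.Module):
    """Bidirectional LSTM encoder followed by a GRU, regressing remaining useful life."""
    def __init__(self, n_features: int, hidden: int = 64):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True, bidirectional=True)
        self.gru = nn.GRU(2 * hidden, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)  # scalar RUL estimate

    def forward(self, x):  # x: (batch, time, n_features)
        x, _ = self.lstm(x)   # -> (batch, time, 2 * hidden)
        x, _ = self.gru(x)    # -> (batch, time, hidden)
        return self.head(x[:, -1])  # regress from the last time step

model = BiLSTMGRU(n_features=14)      # e.g. 14 sensor channels
rul = model(torch.randn(8, 30, 14))   # 8 windows of 30 time steps each
```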
I've also worked on model compression, taking a ResNet-based model from 45M parameters down to 180K (a 99.6% reduction) through knowledge distillation while keeping 94% accuracy. The resulting 4× speedup made real-time inference viable on resource-limited hardware.
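The objective behind numbers like these is typically the standard Hinton-style blend of a softened teacher-matching term and the ordinary hard-label loss; a minimal sketch, with temperature and weighting chosen for illustration:

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.7):
    """KL between softened student/teacher distributions, plus hard-label CE.

    The T**2 factor keeps the soft term's gradient scale comparable to CE.
    """
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T ** 2)
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard
```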
| Project | Description | Tech Stack | Highlights | 
|---|---|---|---|
| Data-Centric-MLOps-Pipeline | Active project — building an end-to-end, data-first MLOps pipeline using NYC Green Taxi data. Focused on data ingestion, validation, and reproducible pipelines. | FastAPI • Docker • GitHub Actions • Pandas • DVC (upcoming) | Learning and building in public | 
| PakIndustry-4.0 | Integrated AI system for manufacturing — computer vision for defects, predictive maintenance, and demand forecasting. | PyTorch • LightGBM • FastAPI | 99.6% defect detection • Predictive RUL (MAE = 13.4) • Edge deployment | 
| Sentiment-MLOps | Production-ready DistilBERT inference API deployed on AWS with CI/CD automation. | Hugging Face • FastAPI • Docker • AWS • GitHub Actions | Quantized model (−50% size / latency) • End-to-end deployment pipeline | 
| Model-Compression | Knowledge distillation and quantization pipeline for compact, high-performance models. | PyTorch • ONNX • NumPy | 99.6% parameter reduction (45M → 180K) • 4× faster inference | 
📧 smfarrukhm@gmail.com • 💼 LinkedIn • 💡 Open to ML engineering opportunities