cv

Basics

Name Ahmad Faraz Khan
Label Ph.D. Candidate in Computer Science
Email ahmadfk@vt.edu
Url https://afkd98.github.io
Summary Ph.D. candidate specializing in Machine Learning Systems, with a focus on Federated Learning optimization. Experienced across multiple programming languages and development tools, with contributions to both academic research and practical applications.

Education

  • 2020.12 - Present
    Ph.D.
    Virginia Tech
    Computer Science
    • Machine Learning Systems
    • Distributed Systems
    • Deep Learning
    • Machine Learning
    • Cloud Development
    • Computer Systems
  • 2016.01 - 2020.01
    B.S.
    LUMS
    Computer Science
    • Distributed Systems
    • Deep Learning
    • Machine Learning
    • Cloud Development
    • Computer Systems

Work

  • 2020.12 - Present
    Graduate Research Assistant
    Virginia Tech, DSSL
    Mentored by Dr. Ali Butt, I develop solutions for resource-constrained learning. My research spans the design of distributed systems, enhanced learning schedulers, and the fine-tuning of Large Language Models (LLMs), aiming to optimize resource utilization, accuracy, and efficiency in privacy-aware learning environments.
    • Built a distributed learning system in PyTorch for resource-constrained privacy-aware learning, enhancing resource utilization by 81x, scalability by 78x, and accuracy by 53%.
    • Designed a distributed learning parameter server on Hadoop Spark to support over one million learning nodes, increasing scalability by 4x, reducing latency by 8x, and cutting costs by 2x.
    • Developed a scheduler for distributed learning systems in PyTorch, which improved accuracy by 57% and reduced training time by 40%.
    • Engineered an efficient, infinitely scalable, and cost-effective cache on AWS Lambda, ElastiCache, SageMaker, and EC2 for non-training workloads, decreasing latency by 99.9% and costs by 99.6%.
    • Improved distributed ML schedulers in PyTorch to identify and eliminate adversarial data sources, mitigating 100% of malicious data sources and increasing accuracy by 7%.
    • Developed clustering-based personalized learning solutions in PyTorch for distributed ML systems, enhancing personalized accuracy by up to 45%.
    • Designed a RAG-based context-aware LLM framework using Hugging Face and PyTorch, automating the adaptive online configuration of distributed cloud services to enhance resource efficiency.
    • Implemented a Direct Preference Optimization (DPO)-based approach to mitigate sycophancy by fine-tuning LLMs on our curated dataset, reducing sycophancy by 64% in persona-based tests and 44% in preference-driven tests.
    • Developed a DPO-based approach for prompt optimization without separate reward modeling for LLMs, improving score by 27% compared to supervised fine-tuning.
  • 2020.05 - 2020.12
    Associate Data Engineer
    i2c Inc.
    Spearheaded the development and maintenance of distributed databases, focusing on performance optimization and scalability.

Publications

Skills

Programming Languages
Python
JavaScript
C/C++
Java
Go
Tools & Libraries
PySpark
AWS Suite
Pandas
Numba
Dask
Docker
PyTorch
TensorFlow
IBM Federated Learning (ibmfl)
FedScale
Selenium
Appium
gnuplot
ES6+
TypeScript
React/Redux
Node.js
Express
MongoDB
SQL
FLSim
Spark MLlib
Hadoop
Kubernetes
OpenFaaS
CUDA

Languages

English
Fluent

Interests

Federated Learning
Resource Optimization
Model Performance
System and Data Heterogeneity
Machine Learning Systems
Scalability
Efficiency
Cloud Development

Projects

  • 2020.12 - Present
    Federated Learning Frameworks
    Led the design and development of both Horizontal and Vertical Federated Learning frameworks, integrating MLOps pipelines with AWS cloud resources.
    • HFL & VFL framework development
    • MLOps pipeline integration with AWS
    • Contributions to work at AAAI'24 and AAMAS'24
  • 2020.12 - Present
    ML System Optimization
    Developed algorithms to enhance ML system architectures for improved resource allocation, scalability, and efficiency.
    • Algorithm development for system optimization
    • Contributions leading to publications at IEEE and ACM conferences