CV
Basics
Name | Ahmad Faraz Khan |
Label | Ph.D. Candidate in Computer Science |
Email | ahmadfk@vt.edu |
Url | https://afkd98.github.io |
Summary | Ph.D. candidate specializing in Machine Learning Systems, with a focus on Federated Learning optimization. Experienced across distributed systems, ML frameworks, and cloud tooling, with contributions to both academic research and practical applications. |
Education
Work
- 2020.12 - Present
Graduate Research Assistant
Virginia Tech, DSSL
Mentored by Dr. Ali Butt, with a focus on developing solutions for resource-constrained learning. My research spans the design of distributed systems, learning schedulers, and the fine-tuning of Large Language Models (LLMs), aiming to optimize resource utilization, accuracy, and efficiency in privacy-aware learning environments.
- Built a distributed learning system in PyTorch for resource-constrained, privacy-aware learning, improving resource utilization by 81x, scalability by 78x, and accuracy by 53%.
- Designed a distributed learning parameter server on Hadoop/Spark to support over one million learning nodes, increasing scalability by 4x, reducing latency by 8x, and cutting costs by 2x.
- Developed a scheduler for distributed learning systems in PyTorch, improving accuracy by 57% and reducing training time by 40%.
- Engineered an efficient, elastically scalable, and cost-effective cache on AWS Lambda, ElastiCache, SageMaker, and EC2 for non-training workloads, decreasing latency by 99.9% and costs by 99.6%.
- Improved distributed ML schedulers in PyTorch to identify and eliminate adversarial data sources, mitigating 100% of malicious data sources and increasing accuracy by 7%.
- Developed clustering-based personalized learning solutions in PyTorch for distributed ML systems, enhancing personalized accuracy by up to 45%.
- Designed a RAG-based, context-aware LLM framework using Hugging Face and PyTorch, automating the adaptive online configuration of distributed cloud services to enhance resource efficiency.
- Implemented a Direct Preference Optimization (DPO)-based approach to mitigate sycophancy by fine-tuning LLMs on a curated dataset, reducing sycophancy by 64% in persona-based tests and 44% in preference-driven tests.
- Developed a DPO-based approach for LLM prompt optimization without separate reward modeling, improving evaluation scores by 27% over supervised fine-tuning.
- 2020.05 - 2020.12
Associate Data Engineer
i2c Inc.
Spearheaded the development and maintenance of distributed databases, focusing on performance optimization and scalability.
Publications
-
2025.07.31 FLStore: A Cache for Non-Training Workloads in Federated Learning
Published: MLSys'25
Designed 'FLStore', a locality-aware processing cache that handles non-training workloads of privacy-aware distributed learning efficiently and at low cost. Decreased latency by up to 99.9% and cost by up to 99.6%.
-
2025.06.12 PETER: Privacy-Preserving Vertical Federated Learning Against Feature Inference Attacks
Submitted: TIFS'24
Proposed a lossless and efficient defense mechanism against feature inference attacks in Vertical Federated Learning environments.
-
2025.03.06 IP-FL: Incentive-driven Personalization in Federated Learning
Published: IPDPS'25
Designed an incentive-driven distributed learning framework for personalized training and fine-tuning on resource-constrained, data-heterogeneous Edge devices. Enhanced personalized accuracy by up to 45%.
-
2025.02.15 LADs: Leveraging LLMs for AI-Driven DevOps
Submitted: ACL'25
Developed a first-of-its-kind reasoning-based, Agentic AI-driven DevOps platform for adaptive online configuration of cloud systems, using context-aware prompting to improve resource efficiency and reduce human effort and cost.
-
2024.07.09 Prompt Optimization for LLMs
Submitted: COLM'24
Developed a Direct Preference Optimization approach harnessing human preferences for prompt optimization in text-to-image tasks. Improved evaluation scores by 27% compared to supervised fine-tuning.
-
2024.04.22 FLOAT: Federated Learning Optimizations with Automated Tuning
Published: ACM EuroSys'24
Designed 'FLOAT', a framework for enabling distributed learning and fine-tuning with high efficiency and resource utilization at low cost on constrained and heterogeneous Edge devices, leveraging Reinforcement Learning with Human Feedback. Improved resource utilization by 81x, scalability by 78x, and accuracy by 53%.
-
2024.12.15 ICL: An Incentivized Collaborative Learning Framework
Published: IEEE BigData'24
Proposed a framework to incentivize collaboration in distributed learning environments.
-
2024.12.15 Mitigating Sycophancy in Large Language Models via Direct Preference Optimization
Published: IEEE BigData'24
Introduced a Direct Preference Optimization approach to mitigate sycophancy by fine-tuning LLMs on a curated dataset. Reduced sycophancy by 64% in persona-based tests and 44% in preference-driven tests.
-
2024.12.15 Personalized Federated Learning Techniques: Empirical Analysis
Published: IEEE BigData'24
Conducted a thorough empirical analysis of personalized federated learning algorithms, highlighting the trade-offs between privacy and performance.
-
2024.12.15 DynamicFL: Federated Learning with Dynamic Communication Resource Allocation
Published: IEEE BigData'24
Designed 'DynamicFL', which dynamically allocates communication resources in distributed learning based on data heterogeneity, enhancing model accuracy by up to 10% compared to standard methods.
-
2023.12.15 Towards cost-effective and resource-aware aggregation at Edge for Federated Learning
Published: IEEE BigData'23
Developed an adaptive aggregator that significantly enhances scalability and time efficiency for Federated Learning on Edge and IoT devices. Increased scalability by 4x, reduced latency by 8x, and cut costs by 2x.
-
2023.10.20 A survey on attacks and their countermeasures in deep learning: Applications in deep neural networks, federated, transfer, and deep reinforcement learning
Published: IEEE Access'24
Conducted a survey on adversarial tactics in deep learning models, emphasizing their applications and distinct features.
-
2022.07.09 Privacy Preserving and Feature Importance Based Incentive Mechanism in Vertical Federated Learning
Submitted: ICML'25
Proposed 'PERFACY-FL', an incentive mechanism that monetizes Vertical Federated Learning by valuing data quality and privacy using Homomorphic Encryption, boosting participation and profitability.
-
2022.07.10 Distributed Learning Schedulers
Published: FL-AAAI’22, IEEE CLOUD’22, AAAI-AIES’24
Developed a scheduler for distributed learning systems in PyTorch, improving accuracy by 57% and reducing training time by 40%.
-
2022.12 Heterogeneity-Aware Adaptive Federated Learning Scheduling
Published: IEEE BigData'22
Proposed an adaptive scheduling method for federated learning that addresses resource and data heterogeneity, improving training efficiency and fairness.
-
2022.07 Tokenized Incentive for Federated Learning
Published: AAAI'22
Presented a tokenized incentive framework designed to motivate client participation and data sharing in federated learning environments.
-
2022.07 TIFF: Tokenized Incentive for Federated Learning
Published: IEEE BigData'22
Introduced a token-based incentive mechanism to encourage high-quality data contributions and sustained participation in federated learning systems.
Skills
Programming Languages | Python, JavaScript, C/C++, Java, Go |
Tools & Libraries | LangChain, Ray, Spark MLlib, Hugging Face, Ollama, PyTorch, TensorFlow, PySpark, AWS Suite, Pandas, Numba, Dask, Docker, IBM FL, Flower, FedScale, Hadoop, Kubernetes, OpenFaaS, CUDA |
Languages
English | Fluent |
Interests
Federated Learning | Resource Optimization, Model Performance, System and Data Heterogeneity |
Machine Learning Systems | Scalability, Efficiency, Cloud Development |
Projects
- 2020.12 - Present
ML System Optimization
Developed algorithms to enhance ML system architectures for improved resource allocation, scalability, and efficiency.
- Designed and implemented DynamicFL to address heterogeneity in federated learning, published in IEEE BigData'24 (Best Paper).
- Created algorithms for personalized federated learning techniques, leading to empirical insights published in IEEE BigData'24.
- Optimized distributed ML systems for resource-constrained environments, contributing to impactful publications in EuroSys'24 and IEEE BigData'23.
- 2020.12 - Present
Federated Learning Frameworks
Led the design and development of both Horizontal and Vertical Federated Learning frameworks, integrating MLOps pipelines with AWS cloud resources.
- Developed HFL & VFL frameworks with tokenized incentives for participation, resulting in publications in AAAI and IEEE CLOUD.
- Integrated MLOps pipelines with AWS resources to ensure scalability and reliability.
- Contributed to the development of incentive mechanisms for collaborative learning frameworks, as published in IEEE BigData'24 and IPDPS'25.
- 2021.01 - Present
LLM Fine-Tuning and Optimization
Developed methods for fine-tuning large language models (LLMs) to reduce sycophancy, enhance privacy, and optimize prompts for specific tasks.
- Mitigated sycophancy in LLMs using Direct Preference Optimization, published in IEEE BigData'24.
- Fine-tuned LLMs with privacy-aware data for use in federated learning systems.
- Designed prompt optimization techniques tailored for text-to-image synthesis, currently under review at COLM'24.
- 2021.06 - Present
Privacy-Preserving Machine Learning
Developed privacy-preserving mechanisms for federated learning to counter feature inference attacks and enhance security.
- Designed PETER: a privacy-preserving framework for vertical federated learning, under review at TIFS'24.
- Proposed feature importance-based incentive mechanisms for vertical federated learning, submitted to ICML'25.
- Surveyed security threats in deep learning and proposed countermeasures, published in IEEE Access.