Optimizing Large Language Models for Resource-Constrained Environments: A Parameter-Efficient Approach Using QLoRA and Prompt Tuning

Authors

  • Shivay Shakti, ComScore, New Delhi
  • Drishti Hajong, Indian Institute of Technology
  • Priyanshi Dubey, Government Medical College

Keywords

parameter-efficient fine-tuning, large language models, QLoRA, prompt tuning, resource-constrained environments, NLP, memory optimization, deployment cost reduction, text classification, quantization, low-rank adaptation

Abstract

As the deployment of AI solutions grows, particularly in resource-constrained environments, efficient and cost-effective methods become increasingly critical. Large Language Models (LLMs) impose computational demands that often make their deployment impractical for real-world applications. This study evaluates parameter-efficient fine-tuning methods, specifically QLoRA and Prompt Tuning, in combination with DistilBERT, to address these challenges. The combined approach achieved a 36.2% reduction in memory usage and a 50% reduction in inference cost while maintaining 87.75% accuracy relative to baseline models. The results demonstrate that stacking these techniques can yield multiplicative resource savings without significant performance degradation, offering practical solutions for resource-constrained deployments.
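To make the approach concrete, the following is a minimal sketch of a QLoRA setup for DistilBERT using the Hugging Face transformers, peft, and bitsandbytes libraries. It is illustrative only: the checkpoint (distilbert-base-uncased), the binary-classification head (num_labels=2), and the adapter hyperparameters (r, lora_alpha, num_virtual_tokens) are assumptions for this sketch, not the values reported in the study.

    import torch
    from transformers import AutoModelForSequenceClassification, BitsAndBytesConfig
    from peft import (LoraConfig, PromptTuningConfig, get_peft_model,
                      prepare_model_for_kbit_training)

    # Quantize the frozen backbone to 4-bit NF4 (the "Q" in QLoRA);
    # this is where most of the memory reduction comes from.
    bnb_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.bfloat16,
    )

    model = AutoModelForSequenceClassification.from_pretrained(
        "distilbert-base-uncased",   # assumed checkpoint
        num_labels=2,                # assumed binary classification task
        quantization_config=bnb_config,
    )
    model = prepare_model_for_kbit_training(model)

    # Train only small low-rank adapters on the attention projections;
    # the quantized base weights stay frozen.
    lora_config = LoraConfig(
        r=8, lora_alpha=16, lora_dropout=0.1,
        target_modules=["q_lin", "v_lin"],  # DistilBERT attention layers
        task_type="SEQ_CLS",
    )
    model = get_peft_model(model, lora_config)
    model.print_trainable_parameters()  # roughly <1% of the ~66M base weights

    # Prompt Tuning, the second technique stacked with QLoRA in the study,
    # learns a small set of virtual tokens instead of weight adapters;
    # in peft it is configured analogously:
    prompt_config = PromptTuningConfig(
        task_type="SEQ_CLS",
        num_virtual_tokens=20,  # assumed prompt length
    )

In this kind of setup, quantization delivers the memory reduction while the adapters and virtual tokens keep the trainable-parameter count small, which is consistent with the compounding savings the abstract reports for the stacked configuration.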

Published

2025-10-05

How to Cite

Shakti, S., Hajong, D., & Dubey, P. (2025). Optimizing Large Language Models for Resource-Constrained Environments: A Parameter-Efficient Approach Using QLoRA and Prompt Tuning. American Journal of Management, 25(4). Retrieved from https://articlearchives.co/index.php/AJM/article/view/7266

Section

Articles