Dr. Dhruv Sharma | Computer Vision | Best Researcher Award
Amity University | India
Dr. Dhruv Sharma has made extensive contributions to artificial intelligence, deep learning, and multimodal systems through a broad range of publications. His research spans visual data captioning, adaptive attention mechanisms, and transformer-based models that improve image understanding and description generation. Notable works include "Evolution of Visual Data Captioning Methods, Datasets, and Evaluation Metrics: A Comprehensive Survey," "Automated Image Caption Generation Framework using Adaptive Attention and Bi-LSTM," and "XGL-T Transformer Model for Intelligent Image Captioning," which collectively advance vision-language integration.
Studies such as "Lightweight Transformer with GRU Integrated Decoder for Image Captioning" and "Control With Style: Style Embedding-based Variational Autoencoder for Controlled Stylized Caption Generation Framework" propose architectures for efficient and stylistic captioning. He has also developed frameworks such as "FDT–Dr2T: A Unified Dense Radiology Report Generation Transformer Framework for X-ray Images" and "UnMA-CapSumT: Unified and Multi-Head Attention-Driven Caption Summarization Transformer," reflecting his interest in medical AI and caption summarization. His earlier works, including "Memory-Based FIR Digital Filter using Modified OMS-LUT Design" and "Modified Efficient OMS LUT-Design for Memory-Based Multiplication," show foundational expertise in signal processing and hardware-efficient algorithms. Contributions such as the Obscenity Detection Transformer and DVRGNet reflect his commitment to socially responsible AI for content moderation.
Overall, Dr. Sharma's scholarly output traces a consistent trajectory from traditional signal processing to multimodal AI, bridging research innovation with practical applications in intelligent computing and human-centered artificial intelligence.
Profile: Google Scholar
Featured Publications
- Sharma, D., Dhiman, C., & Kumar, D. (2023). Evolution of visual data captioning methods, datasets, and evaluation metrics: A comprehensive survey. Expert Systems with Applications, 221, 119773.
- Sharma, D., Dhiman, C., & Kumar, D. (2024). XGL-T transformer model for intelligent image captioning. Multimedia Tools and Applications, 83(2), 4219–4240.
- Sharma, D., Dhiman, C., & Kumar, D. (2024). Control with style: Style embedding-based variational autoencoder for controlled stylized caption generation framework. IEEE Transactions on Cognitive and Developmental Systems, 1–11.
- Sharma, D., Dhiman, C., & Kumar, D. (2024). FDT–Dr2T: A unified dense radiology report generation transformer framework for X-ray images. Machine Vision and Applications, 35, 1–13.
- Sharma, D., Dhiman, C., & Kumar, D. (2022). Automated image caption generation framework using adaptive attention and Bi-LSTM. In 2022 IEEE Delhi Section Conference (DELCON) (pp. 1–5). IEEE.