Computer Vision, Natural Language Processing, Audio Processing
Developing models that understand and generate content across multiple modalities (text, image, video, audio), enabling more human-like, context-aware AI systems.
Daffodil International University