About
I'm a Computer Science and Engineering graduate from United International
University who is interested in working in research.
My main area of research is multimodal deep learning. In particular, I aim to study how we can
represent multimodal data as a single representation/language to create more robust expert systems.
Awards
- Gold Award Winner (Champion) of the International Blockchain Olympiad 2023
- Champion of the Bhashamul: Bengali Regional Text to IPA challenge
- 1st Runner Up of the 4th National Blockchain Olympiad, Bangladesh
- Champion of the CSE Dept. Project Show (Software Engineering) || Summer '23
- Champion of the CSE Dept. Project Show (DBMS) || Spring '22
- Champion of the CSE Dept. Project Show (AOOP) || Fall '21
- Selected for the Hult Prize Global Round in Boston, USA, on 14th-15th June 2024
- Champion Hult Prize OnCampus Round
- 1st Runner Up Intra-University Deep Learning Spring
- 1st Runner Up Season 1 Project Show
- Best Performer of the Year 2020 || UIU EDF
- 1st Runner Up PPT Genius || UIU EDF
Interests
My fields of interest:
- Machine Learning
- Data Science
- Back-end development
- Blockchain
Computer Science:
- Networking
- Hackathons & Datathons
- Olympiads
Other:
Skills
Soft skills:
Hard Skills:
Resume
Experience
Software Engineer
November 2024 - Present
Apurba Technologies Ltd.
Lecturer
June 2024 - October 2024
Department of Computer Science and Engineering, United International University
- Courses: Object-Oriented Programming Lab, Data Structures and Algorithms 1 Lab, Database Management Systems Theory
- Taught students computer science core concepts
Undergraduate Assistant
September 2023 - May 2024
Department of Data Science, United International University
- Prepared question sets and course contents jointly with the course faculties
- Counselled students jointly with the course faculties
- Worked closely with the Data Science Department to facilitate the course materials and content
Education
Bachelor of Science in Computer Science & Engineering
2020 - 2024
United International University, Dhaka, Bangladesh
Major: Data Science
Notable courses:
Machine Learning, Digital Image Processing, Bioinformatics,
Software Engineering, Human-Computer Interaction.
International A Levels
2019
Academia, Dhaka, Bangladesh
Result: A in Mathematics; B in Physics; B in Chemistry
Extracurriculars
General Secretary
2022 - 2024
UIU Computer Club
- Jointly managed the outreach of the Club with the Vice-President
- Worked closely with the University CSE Department for and during events
- Led the departments during live and online events
- Helped groom members for their next roles.
- Supervised the performances of the departments and the department Heads
Organising Secretary
2020 - 2022
UIU Entrepreneur Development Forum
- Led the departments during live and online events
- Helped groom members for their next roles.
- Supervised the performances of the departments and the department Heads
- Jointly managed the Club Fund with the President and the Treasurer
Team Member of Marketing
2021 - 2022
AIESEC in Bangladesh
- Developed numerous marketing materials.
- Kept a close community active with my team
- Attended and participated in conferences
Notable Projects
From C/C++ to web apps, here are some of my notable projects.
- C/C++
- Java
- Web Apps
- Mobile Apps
- Data Science and ML
- Embedded Systems
Publications
Published
E-MedViTR: Enhanced Vision Transformers with Registers for Biomedical Image Classification
Authors: Sadia Ahmmed, Taimur Rahman, S M Jishanul Islam, Al-Momen Reyad, Sonjoy Dey, James Anthony Purification and Dewan Md. Farid
6th International Conference on Electrical Engineering and Information & Communication Technology 2024 [Scopus Indexed]
DOI: 10.1109/ICEEICT62016.2024.10534573
About: Medical image classification is critical in clinical decision-making, requiring efficient and precise evaluation of images. We propose an E-MedViTR model that boosts the performance of the vision transformer with registers by adding an enhanced classifier head. Research in this field could pave the way for more personalized, accurate, and efficient diagnostic procedures in medical imaging, benefiting both clinicians and patients.
Accepted
MIMIC: Multimodal Islamophobic Meme Identification and Classification
Authors: S M Jishanul Islam, Sahid Hossain Mustakim, Sadia Ahmmed, Md. Faiyaz Abdullah Sayeedi, Swapnil Khondoker, Syed Tasdid Azam Dhrubo, and Nahid Hossain.
Muslims in ML Workshop, 38th Conference on Neural Information Processing Systems (NeurIPS 2024)
arXiv: arxiv.org/abs/2412.00681 || OpenReview || Code & Data
Abstract: Anti-Muslim hate speech has emerged within memes, characterized by context-dependent and rhetorical messages using text and images that seemingly mimic humor but convey Islamophobic sentiments. This work presents a novel dataset and proposes a classifier based on the Vision-and-Language Transformer (ViLT) specifically tailored to identify anti-Muslim hate within memes by integrating both visual and textual representations. Our model leverages joint modal embeddings between meme images and incorporated text to capture nuanced Islamophobic narratives that are unique to meme culture, providing both high detection accuracy and interoperability.
In Review
An audio video-based multi-modal fusion approach for emotion recognition
Authors: S M Jishanul Islam (FA), Sahid Hossain Mustakim, Musfirat Hossain, Mysun Mashira, Nur Islam Shourav, Md. Rayhan Ahmed (Corresponding Author), Salekul Islam (Corresponding Author), Swakkhar Shatabda, and A.K.M Muzahidul Islam
Under Review, Knowledge-Based Systems (Elsevier)
About: A multi-modal fusion approach that takes audio and video input in parallel to predict human emotions. It performs competitively with state-of-the-art models and has potential applications in human-computer interaction (HCI), psychological evaluation, driver safety in vehicles, and so on.
Research Experience
AVLoS: Audio-Visual Long Text Scene Summarization (interim title)
Supervisor: Swakkhar Shatabda
Type: Undergraduate Thesis
In Progress
About: We are experimenting with fine-grained long-text scene summarization from videos using video and audio inputs. Most approaches use queries to guide text generation; we are exploring how to generate text without these queries.
An audio video-based multi-modal fusion approach for emotion recognition
Supervisors: Md. Rayhan Ahmed, Salekul Islam, Swakkhar Shatabda, and A.K.M Muzahidul Islam
Type: Research Project
Under Review, Knowledge-Based Systems (Elsevier) [Code]
About: A multi-modal fusion approach that takes audio and video input in parallel to predict human emotions. It performs competitively with state-of-the-art models and has potential applications in human-computer interaction (HCI), psychological evaluation, driver safety in vehicles, and so on.
MIMIC: Multimodal Islamophobic Meme Identification and Classification
Supervisor: Nahid Hossain
Type: Research Project
Accepted at NeurIPS'24 MusIML Workshop | Work Expanding [Code] [arXiv draft] [OpenReview]
About: We present the first Islamophobic memes dataset, along with VLMs that identify and classify hate in those memes.
BhaShammo: IPA Transcription of Bengali Regional Dialect using Dialect Guided Tokens
Supervisors: Swakkhar Shatabda and Farig Sadeque
Type: Research Project | Competition Work
In Progress [Code] [arXiv draft]
About: We present an approach to transcribe regional Bengali text to IPA by introducing the Dialect Guided Tokens (DGT) technique on a new dataset spanning six districts of Bangladesh. We provide the model with information on the regional dialect of the input text before generating the IPA transcription. To our knowledge, this is the first work to address this problem.
E-MedViTR: Enhanced Vision Transformers with Registers for Biomedical Image Classification
Supervisor: Dewan Md Farid
Type: Research Project
Published at ICEEICT. [Paper]
About: We investigated the effectiveness of the ViT with registers (DINOv2 with registers) in classifying medical pathology images. Out of the box, it falls short of SOTA models, but with an enhanced classifier head and data augmentation, it performs close to them.
Curriculum Vitae
Find my CVs tailored to your preferences.
Follow me @ various platforms
smji