About

I'm a Computer Science and Engineering graduate from United International University who is interested in working in research.
My main research area is multimodal deep learning. In particular, I aim to study how we can represent multimodal data as a single representation/language to create more robust expert systems.

Awards

  • Gold Award Winner (Champion) of the International Blockchain Olympiad 2023
  • Champion of the Bhashamul: Bengali Regional Text to IPA challenge
  • 1st Runner Up of the 4th National Blockchain Olympiad, Bangladesh
  • Champion of the CSE Dept. Project Show (Software Engineering) || Summer '23
  • Champion of the CSE Dept. Project Show (DBMS) || Spring '22
  • Champion of the CSE Dept. Project Show (AOOP) || Fall '21
  • Selected for the Hult Prize Global Round in Boston, USA, from 14th-15th June 2024
  • Champion, Hult Prize OnCampus Round
  • 1st Runner Up, Intra-University Deep Learning Spring
  • 1st Runner Up, Season 1 Project Show
  • Best Performer of the Year 2020 || UIU EDF
  • 1st Runner Up, PPT Genius || UIU EDF

Interests

My fields of interest:

    Computer Science:

  • Machine Learning
  • Data Science
  • Back-end development
  • Blockchain

    Other:

  • Networking
  • Hackathons & Datathons
  • Olympiads

Skills

Soft skills:

Communication
Team Leading
Design Thinking and Planning
Presentation and Public Speaking
Time Management


Hard Skills:


C
C++
Java
Python
HTML
CSS
JavaScript
PHP
Dart
MySQL
MongoDB
Node.js
Express
Spring Boot
FastAPI
React
React Native
Flutter
GitHub
Git
Firebase
Docker
PyTorch
TensorFlow
Hugging Face
NumPy
pandas
OpenCV
Bash
Ubuntu
Embedded C
Arduino
Qiskit

Resume

Experience

Software Engineer

November 2024 - Present

Apurba Technologies Ltd.

Lecturer

June 2024 - October 2024

Department of Computer Science and Engineering, United International University

  • Courses: Object-Oriented Programming Lab, Data Structures and Algorithms 1 Lab, Database Management Systems Theory
  • Taught students core computer science concepts

Undergraduate Assistant

September 2023 - May 2024

Department of Data Science, United International University

  • Prepared question sets and course content jointly with the course faculty members
  • Counselled students jointly with the course faculty members
  • Worked closely with the Data Science Department to facilitate the course materials and content

Education

Bachelor of Science in Computer Science & Engineering

2020 - 2024

United International University, Dhaka, Bangladesh

Major: Data Science

Notable courses:
Machine Learning, Digital Image Processing, Bioinformatics, Software Engineering, Human-Computer Interaction.

International A Levels

2019

Academia, Dhaka, Bangladesh

Result: A in Mathematics; B in Physics; B in Chemistry

Extracurriculars

General Secretary

2022 - 2024

UIU Computer Club

  • Jointly managed the outreach of the Club with the Vice-President
  • Worked closely with the University CSE Department for and during events
  • Led the departments during live and online events
  • Helped groom members for their next roles.
  • Supervised the performances of the departments and the department Heads

Organising Secretary

2020 - 2022

UIU Entrepreneur Development Forum

  • Led the departments during live and online events
  • Helped groom members for their next roles.
  • Supervised the performances of the departments and the department Heads
  • Jointly managed the Club Fund with the President and the Treasurer

Marketing Team Member

2021 - 2022

AIESEC in Bangladesh

  • Developed numerous marketing materials.
  • Kept a close community active with my team
  • Attended and participated in conferences

Publications

Published

E-MedViTR: Enhanced Vision Transformers with Registers for Biomedical Image Classification
Authors: Sadia Ahmmed, Taimur Rahman, S M Jishanul Islam, Al-Momen Reyad, Sonjoy Dey, James Anthony Purification and Dewan Md. Farid

6th International Conference on Electrical Engineering and Information & Communication Technology 2024 [Scopus Indexed]

DOI: 10.1109/ICEEICT62016.2024.10534573

About: Medical image classification is critical in clinical decision-making, requiring efficient and precise evaluation of images. We propose an E-MedViTR model that boosts the performance of the vision transformer with registers by adding an enhanced classifier head. Research in this field could pave the way for more personalized, accurate, and efficient diagnostic procedures in medical imaging, benefiting both clinicians and patients.




Accepted

MIMIC: Multimodal Islamophobic Meme Identification and Classification
Authors: S M Jishanul Islam, Sahid Hossain Mustakim, Sadia Ahmmed, Md. Faiyaz Abdullah Sayeedi, Swapnil Khondoker, Syed Tasdid Azam Dhrubo, and Nahid Hossain.

Muslims in ML Workshop, 38th Conference on Neural Information Processing Systems (NeurIPS 2024)

arXiv: arxiv.org/abs/2412.00681 || OpenReview || Code & Data

Abstract: Anti-Muslim hate speech has emerged within memes, characterized by context-dependent and rhetorical messages using text and images that seemingly mimic humor but convey Islamophobic sentiments. This work presents a novel dataset and proposes a classifier based on the Vision-and-Language Transformer (ViLT) specifically tailored to identify anti-Muslim hate within memes by integrating both visual and textual representations. Our model leverages joint modal embeddings between meme images and incorporated text to capture nuanced Islamophobic narratives that are unique to meme culture, providing both high detection accuracy and interoperability.




In Review

An audio video-based multi-modal fusion approach for emotion recognition
Authors: S M Jishanul Islam (FA), Sahid Hossain Mustakim, Musfirat Hossain, Mysun Mashira, Nur Islam Shourav, Md. Rayhan Ahmed (Corresponding Author), Salekul Islam (Corresponding Author), Swakkhar Shatabda, and A.K.M Muzahidul Islam

Under Review, Knowledge-Based Systems (Elsevier)

About: We propose a multi-modal fusion approach that takes audio and video input in parallel to predict human emotions. It performs competitively with state-of-the-art models and has potential applications in human-computer interaction (HCI), psychological evaluation, driver safety in vehicles, and similar areas.




Research Experience

AVLoS: Audio-Visual Long Text Scene Summarization (interim title)
Supervisor: Swakkhar Shatabda
Type: Undergraduate Thesis

In Progress

About: We are experimenting with fine-grained long-text scene summarization from videos using both video and audio inputs. Most approaches use queries to guide text generation; we are exploring how to generate text without these queries.




An audio video-based multi-modal fusion approach for emotion recognition
Supervisors: Md. Rayhan Ahmed, Salekul Islam, Swakkhar Shatabda, and A.K.M Muzahidul Islam
Type: Research Project

Under Review, Knowledge-Based Systems (Elsevier) [Code]

About: We propose a multi-modal fusion approach that takes audio and video input in parallel to predict human emotions. It performs competitively with state-of-the-art models and has potential applications in human-computer interaction (HCI), psychological evaluation, driver safety in vehicles, and similar areas.




MIMIC: Multimodal Islamophobic Meme Identification and Classification
Supervisor: Nahid Hossain
Type: Research Project

Accepted at NeurIPS'24 MusIML Workshop | Work Expanding [Code] [arXiv draft] [OpenReview]

About: We present the first dataset of Islamophobic memes, along with vision-language models (VLMs) that identify and classify hate in those memes.




BhaShammo: IPA Transcription of Bengali Regional Dialect using Dialect Guided Tokens
Supervisors: Swakkhar Shatabda and Farig Sadeque
Type: Research Project | Competition Work

In Progress [Code] [arXiv draft]

About: We present an approach to transcribe regional Bengali text to IPA by introducing the Dialect Guided Tokens (DGT) technique on a new dataset spanning six districts of Bangladesh. We provide the model with information on the regional dialect of the input text before generating the IPA transcription. To our knowledge, this is the first work to address this problem.




E-MedViTR: Enhanced Vision Transformers with Registers for Biomedical Image Classification
Supervisor: Dewan Md Farid
Type: Research Project

Published at ICEEICT. [Paper]

About: We investigated the effectiveness of the ViT with registers (aka DINO v2) in classifying medical pathology images. Out of the box, it falls short of SOTA models, but with an enhanced classifier head and data augmentation, it performs close to them.




Curriculum Vitae

Find my CVs tailored to your preferences.


Follow me on various platforms

Contact

Let's connect via LinkedIn

Follow me on X

Shoot me an email at: jishanlion@gmail.com