F24 Project Listings

Instructions for Faculty Members

Students:


Automotive Knowledge Graph Construction and LLM Integration

[added 2024-09-03]

Course: EECS4080

Supervisor: Aijun An

Supervisor's email address: aan@yorku.ca

Project Description: This project focuses on the construction and application of knowledge graphs in the automotive domain, combining natural language processing, information extraction, and large language models (LLMs). Students will explore recent advances in the field of automatic knowledge graph construction and evaluate its impact on enhancing LLM performance.

The project involves the following steps:

  1. Conduct a literature review on recent advances in automatic knowledge graph construction.
  2. Select and implement an existing knowledge graph construction method, making use of available open-source code.
  3. Create an automotive knowledge graph by applying the chosen method to one or more car owner's manuals.
  4. Evaluate the accuracy of an LLM's answers with and without the constructed knowledge graph.
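
As a rough illustration of the final evaluation step, the sketch below compares answer accuracy with and without knowledge graph context. Everything in it is an assumption made for illustration: the triples, the questions, the naive string-match retrieval, and `stub_llm`, which merely stands in for a real model. It is not the method the project will necessarily use.

```python
from typing import Callable, Iterable, Optional

Triple = tuple[str, str, str]  # (subject, relation, object)

# Hypothetical triples of the kind that might be extracted from an owner's manual.
TRIPLES: list[Triple] = [
    ("front tires", "recommended_pressure", "35 psi"),
    ("engine oil", "recommended_grade", "0W-20"),
]

# Hypothetical evaluation set: (question, expected answer substring).
QA_PAIRS: list[tuple[str, str]] = [
    ("What is the recommended pressure for the front tires?", "35 psi"),
    ("Which grade of engine oil is recommended?", "0W-20"),
]


def retrieve(question: str, triples: Iterable[Triple]) -> list[Triple]:
    """Return triples whose subject appears in the question (naive string match)."""
    q = question.lower()
    return [t for t in triples if t[0].lower() in q]


def build_prompt(question: str, facts: list[Triple]) -> str:
    """Prepend retrieved facts to the question; with no facts, ask the question alone."""
    if not facts:
        return f"Question: {question}"
    context = "\n".join(f"{s} {r.replace('_', ' ')} {o}" for s, r, o in facts)
    return f"Facts:\n{context}\n\nQuestion: {question}"


def accuracy(llm: Callable[[str], str],
             qa_pairs: list[tuple[str, str]],
             triples: Optional[list[Triple]] = None) -> float:
    """Fraction of questions whose answer contains the expected string."""
    correct = 0
    for question, expected in qa_pairs:
        facts = retrieve(question, triples) if triples else []
        answer = llm(build_prompt(question, facts))
        if expected.lower() in answer.lower():
            correct += 1
    return correct / len(qa_pairs)


def stub_llm(prompt: str) -> str:
    """Stand-in for a real model: it can only repeat facts supplied in the prompt."""
    if prompt.startswith("Facts:"):
        return prompt.split("\n\n", 1)[0]
    return "I don't know."


if __name__ == "__main__":
    print("with KG:   ", accuracy(stub_llm, QA_PAIRS, TRIPLES))
    print("without KG:", accuracy(stub_llm, QA_PAIRS))
```

In a real study, `stub_llm` would be replaced by a call to the model under test, and substring matching would likely give way to a more careful correctness check.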

Required skills or prerequisite courses:

  1. Good programming skills in Python
  2. Familiarity with natural language processing concepts
  3. Basic understanding of graph theory and data structures

Recommended skills or prerequisite courses:

  1. Experience with machine learning libraries (PyTorch)
  2. Familiarity with knowledge representation and ontologies
  3. Basic understanding of large language models and their applications

Instructions: Send your transcript and a short statement of motivation to Aijun An.

Optimizing Regulatory Document Summarization for Automated Compliance Analysis in Healthcare

Course: {EECS4080 | EECS4088}

Supervisor: Maleknaz Nayebi

Supervisor's email address: mnayebi@yorku.ca

Project Description:

Healthcare regulatory documents are notoriously complex and voluminous, posing significant challenges for effective summarization. This project addresses these challenges by developing a multi-step pipeline that integrates extractive and abstractive summarization techniques specifically tailored to healthcare regulations. The research explores the impact of different model architectures on summarization quality, particularly focusing on how decoder-only models benefit from a two-step process. For healthcare regulations, the effectiveness of the extractive step varies depending on the context length and the type of encoder-decoder model used. Additionally, the project examines the difficulties in evaluating generated summaries, highlighting the discrepancies between human expert assessments and automated metrics. This study aims to optimize summarization strategies for healthcare regulatory texts, ensuring that summaries are both accurate and contextually relevant, thereby supporting better compliance and decision-making in the healthcare industry.
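
The two-step idea described above (an extractive pass that shortens the input, followed by an abstractive model) can be sketched as follows. This is a minimal illustration under stated assumptions: the word-frequency scoring, the sample text, and the `abstractive` callable are all hypothetical placeholders, not the pipeline used in the project.

```python
import re
from collections import Counter
from typing import Callable

# Hypothetical sample text; not taken from any real regulation.
SAMPLE = (
    "Providers must retain patient records for ten years. "
    "Records must be encrypted at rest. "
    "The cafeteria opens at noon."
)


def split_sentences(text: str) -> list[str]:
    """Split on whitespace that follows sentence-ending punctuation."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def extract(text: str, k: int) -> list[str]:
    """Extractive step: keep the k sentences with the highest average
    word frequency, returned in their original order."""
    sentences = split_sentences(text)
    freq = Counter(re.findall(r"[a-z]+", text.lower()))

    def score(sentence: str) -> float:
        words = re.findall(r"[a-z]+", sentence.lower())
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    return [sentences[i] for i in sorted(ranked[:k])]


def summarize(text: str, abstractive: Callable[[str], str], k: int = 2) -> str:
    """Two-step pipeline: shorten the input extractively, then pass the
    result to an abstractive model (any callable from str to str)."""
    return abstractive(" ".join(extract(text, k)))


if __name__ == "__main__":
    # str.upper is only a placeholder for a real abstractive model.
    print(summarize(SAMPLE, str.upper, k=2))
```

The extractive step matters most when the source document exceeds the abstractive model's context length; the value of `k` here is arbitrary.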

Required skills or prerequisite courses:

  1. Python Programming
  2. Machine Learning
  3. Natural Language Processing

Recommended skills or prerequisite courses:

  1. Prompt engineering
  2. Image processing
  3. Generative AI

Instructions: Email Maleknaz both your CV and transcripts in one email.


Image Processing for Enhanced Software Development

Course: { EECS4080 | EECS4088}

Supervisor: Maleknaz Nayebi

Supervisor's email address: mnayebi@yorku.ca

Project Description:

Developers often seek help online by posting questions accompanied by screenshots, but the responses they receive typically rely on textual explanations, requiring the developers to manually parse and interpret the details from the images. This project aims to bridge this gap by integrating various image processing techniques to enhance the quality and clarity of responses. By automatically analyzing and extracting relevant information from screenshots, we can create more precise and visually guided answers, improving the efficiency of problem-solving in developer communities.

Required skills or prerequisite courses:

  1. Python Programming
  2. Machine Learning
  3. Image processing
  4. Software Engineering

Recommended skills or prerequisite courses:

  1. Prompt engineering
  2. Natural Language Processing
  3. Generative AI

Instructions: Email Maleknaz both your CV and transcripts in one email.


LLM Native Operating Systems

Added: 2024-08-26

Course: EECS4088, EECS4080 or EECS4070

Supervisor: Hamzeh Khazaei

Supervisor's email address: hkh@yorku.ca

Project Description: We stand at the precipice of a paradigm shift in the ever-evolving landscape of computing systems. Large Language Models (LLMs) have emerged as powerful tools, revolutionizing natural language understanding, translation, and code generation. But what if we could harness their potential to reshape the very foundations of operating systems? This project embarks on an audacious journey—to design and implement operating systems that natively integrate LLMs. Imagine an OS that executes commands and converses with users, predicts their needs, and adapts dynamically. We aim to create an ecosystem where LLMs become native inhabitants, enhancing efficiency, security, and user experience. This is a large-scale project, and students will be assigned specific topics depending on their interests and specialty.

Required skills or prerequisites:

  1. B+ or higher grade in EECS3221: Operating Systems
  2. Good understanding of Machine Learning concepts
  3. Good understanding of Computing Systems
  4. Good command of Python language

Recommended skills or prerequisites:

  1. Familiarity with microservices architecture, Docker, and virtualization in general is a plus.
  2. Familiarity with Cloud Computing is desirable.
  3. Familiarity with Go is a plus.

Instructions: Please email the supervisor your CV and latest transcripts.


Sims for University Life

Course: EECS4080

Supervisor: Meiying Qin

Supervisor's email address: mqin@yorku.ca

Project Description: One of the biggest challenges that first-year students face is the transition from high school to university. This is expected to be more pronounced once the York Markham campus opens, as all courses will use the flipped-class model. In this model, students are required to be more active in learning and preview the content before each class in order to stay on track. In order to assist first-year students in making a smoother transition even before school starts, we plan to release a game that simulates the life of a computer science student at the Markham campus to provide students with a preview of university life. In this project, students have the opportunity to gain hands-on experience in both designing and implementing a game.

Required skills or prerequisites:

  1. Strong software engineering skills;
  2. An interest in helping first-year students and suggesting game components based on your own experience.

Recommended skills or prerequisites:

  1. Familiarity with game development and Unity

Instructions: Please email me (mqin@yorku.ca) your CV and transcript, with a statement of why you are interested in the project.


Building Robot Tutors

Course: EECS4080

Supervisor: Meiying Qin

Supervisor's email address: mqin@yorku.ca

Project Description: This research aims to develop cost-effective robot tutors that are accessible on a broader scale, fostering inclusive and impactful learning experiences. Robot tutors have demonstrated effectiveness in aiding students, yet their widespread adoption faces hurdles due to high costs and limited scalability. Current robot tutors are often impractical for widespread use in universities because of their cost. This project seeks to overcome these limitations by developing an affordable robot tutor. The objective is to create a solution that meets the educational needs of university students without imposing financial constraints. To evaluate our cost-effective robot, we will recruit participants around campus to interact with our robot and a commercial robot. We are targeting a deadline in November, so the workload will be heavy before November and light afterwards.

Required skills or prerequisites:

  1. Familiarity with hardware

Recommended skills or prerequisites:

  1. Experience with running user studies/experiments
  2. ROS/ROS2

Instructions: Please email me (mqin@yorku.ca) your CV and transcript, with a statement of why you are interested in the project.


Using Mixed Reality to Support Programming in CS1

Course: EECS4080

Supervisor: Meiying Qin

Supervisor's email address: mqin@yorku.ca

Project Description: Debugging is one of the most important skills for computer science students. However, first-year students are usually not comfortable working with a debugger. To ease the process for first-year students, we plan to write an application that visualizes program execution by animating the variables being manipulated, either on a screen or using mixed reality. In this project, students will have the opportunity to gain hands-on experience in both designing and implementing a software application, as well as experience in mixed reality.

Required skills or prerequisites:

  1. Strong programming skills

Recommended skills or prerequisites:

  1. Familiarity with Unity

Instructions: Please email me (mqin@yorku.ca) your CV and transcript, with a statement of why you are interested in the project.


LLM4SE (Large Language Models for Software Engineering)

[added 2024-08-26]

Course: EECS4070/EECS4080

Supervisor: Zhen Ming (Jack) Jiang

Supervisor's email address: zmjiang@yorku.ca

Project Description: Software engineering data (e.g., source code repositories and bug databases) contain a wealth of information about a project's status and history. With the recent advances of large language models (e.g., GPT and BERT) as well as their applications (e.g., ChatGPT or GitHub Copilot), many software engineering tasks can be automated or optimized. In this project, the student(s) will explore and investigate various software engineering applications which can benefit from the use of LLMs.

Required skills or prerequisites:

Recommended skills or prerequisites: Some knowledge in AI would be preferred but not required

Instructions: Send c.v. and unofficial transcript to the supervisor.


FMOps

[added 2024-08-26]

Course: EECS4070/EECS4080

Supervisor: Zhen Ming (Jack) Jiang

Supervisor's email address: zmjiang@yorku.ca

Project Description: Artificial Intelligence is gaining rapid popularity in both research and practice, due to the recent advances in machine learning (ML) research and development. Many ML applications (e.g., Tesla’s autonomous vehicle and Apple’s Siri) are already being used widely in people’s everyday lives. McKinsey recently estimated that ML applications have the potential to create between $3.5 and $5.8 trillion in value annually. Foundation models are large AI models trained on a vast quantity of data at scale. FMs can be used to power a wide range of downstream tasks (e.g., chat bots, code assistants, tutors, etc.). However, there remain many challenges in efficiently training, deploying and monitoring such FM infrastructure. In addition, there is a lack of tools and processes to further develop applications or services on top of such FMs. The goal of this project is to develop engineering tools and best practices to support effectively operationalizing FMs.

Required skills or prerequisites:

Recommended skills or prerequisites: Some knowledge in AI would be preferred but not required

Instructions: Send c.v. and unofficial transcript to the supervisor.


Learning Analytics Application (LAApp) - Early Intervention System

Added on Aug 27, 2024

Course: { EECS4088}

Supervisor: Pooja Vashisth

Supervisor's email address: vashistp@yorku.ca

Project Description:

Problems and Needs: Moodle - York University Learning Management System (LMS) - provides instructors with rich data sets for students’ activities and performance. However, while the data comes in bulk, some important information (e.g., course activities) is automatically deleted by the system and replaced with new data each week. This issue prevents instructors from using such data effectively to enhance their teaching. Moreover, the bulk data may prevent professors from gaining useful insights to improve course quality.

Ideas: This project aims to address the problems above with the following solutions:

• Retrieve the full data set using scripts/integrations/apps that automatically pull course data from Moodle.
• Provide instructors with useful data visualizations from Moodle's enormous data to support their teaching.
• Develop a machine learning model to predict students' final grades based on their performance so far and their engagement with course materials.
• Offer students insights into their standing in the class and provide personalized feedback.
• Facilitate early intervention sessions between instructors and students identified as at risk of poor performance, helping to decide if they should drop the course or improve their engagement.

Implementation:

• Data Collection
• Data Cleaning
• Data Visualization
• Machine Learning Model
• Intervention System

Required skills or prerequisite courses:

• Web development (applying OOP design, design patterns, and SOLID principles in developing a backend API using Python Flask).
• Data analysis using Python frameworks and libraries (pandas, NumPy, seaborn, matplotlib, statsmodels).
• Full-stack development:

      Front-end: Building user interfaces with JavaScript technologies.
      Back-end: Implementing RESTful and GraphQL APIs using JavaScript technologies.

• DBMS: Database design and using NoSQL/SQL databases in web development.

Recommended skills or prerequisite courses:

• Research and statistics skills
• Natural Language Processing
• Generative AI

Instructions: Email Pooja a statement of your interest in the project, your CV, and your transcripts.


Mentor/Mentee Matching Tool

Added on Aug 27, 2024

Course: { EECS4080}

Supervisor: Pooja Vashisth

Supervisor's email address: vashistp@yorku.ca

Project Description:

1. Problem Statement: Effective mentorship is crucial for personal and professional growth, but matching mentors and mentees can be challenging. A manual process, though effective, is often time-consuming and inefficient.

2. Project Objectives:

Objective 1: The tool creates groups of 15-20 mentees with similar interests and diverse backgrounds.
Objective 2: The tool automates the matching of mentors and a group of mentees based on availabilities, passions, and interests. Some mentors can be assigned to multiple groups.

3. Methodology:

Data Collection: Past data (if any). Develop surveys for mentors and mentees to gather detailed information about their interests, goals, and passions.
Matching Algorithm: The algorithm will likely utilize clustering techniques to match mentors and mentees effectively.

4. Summary: This project aims to create a tool that will automate the process of forming groups of mentees and assigning mentors to each group, making it faster and more efficient than the current manual method. The tool will consider factors like schedules, interests, and passions to create groups of 15-20 mentees, each with a mentor. It will also ensure that these groups are diverse, including students from different academic programs, genders, and backgrounds, such as both international and local students. By automating the group creation and mentor assignment process, the tool will save time and enhance the quality of mentorship experiences. Students will benefit from interacting with peers who have similar passions but different perspectives and experiences, creating a more inclusive and well-rounded learning environment.
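
To make the grouping and assignment idea concrete, here is a small sketch. It deliberately swaps in a simple greedy interest-overlap heuristic (Jaccard similarity) in place of the clustering techniques mentioned above, and it ignores the diversity objective entirely. The `Person` fields, the sample people, and the group size of 2 are hypothetical and chosen only to keep the example short.

```python
from dataclasses import dataclass


@dataclass
class Person:
    name: str
    interests: set[str]
    slots: set[str]  # availability, e.g. {"Mon-AM", "Wed-PM"}


def jaccard(a: set[str], b: set[str]) -> float:
    """Overlap between two sets, from 0.0 (disjoint) to 1.0 (identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def group_interests(group: list[Person]) -> set[str]:
    """Union of the interests of everyone in a (non-empty) group."""
    return set().union(*(p.interests for p in group))


def form_groups(mentees: list[Person], max_size: int) -> list[list[Person]]:
    """Greedily add each mentee to the most similar group that still has room;
    start a new group when no group with room shares any interest."""
    groups: list[list[Person]] = []
    for mentee in mentees:
        best, best_score = None, 0.0
        for group in groups:
            if len(group) >= max_size:
                continue
            score = jaccard(mentee.interests, group_interests(group))
            if score > best_score:
                best, best_score = group, score
        if best is None:
            groups.append([mentee])
        else:
            best.append(mentee)
    return groups


def assign_mentor(group: list[Person], mentors: list[Person]) -> Person:
    """Prefer the mentor who shares a time slot with the most group members;
    break ties by interest overlap. A mentor may serve several groups."""
    def score(mentor: Person) -> tuple[int, float]:
        shared = sum(1 for p in group if p.slots & mentor.slots)
        return shared, jaccard(mentor.interests, group_interests(group))
    return max(mentors, key=score)


# Hypothetical sample data.
MENTEES = [
    Person("A", {"ai", "robotics"}, {"Mon"}),
    Person("B", {"ai", "nlp"}, {"Mon"}),
    Person("C", {"games", "unity"}, {"Tue"}),
    Person("D", {"games"}, {"Tue"}),
]
MENTORS = [
    Person("M1", {"ai"}, {"Mon"}),
    Person("M2", {"games"}, {"Tue"}),
]

if __name__ == "__main__":
    for group in form_groups(MENTEES, max_size=2):
        mentor = assign_mentor(group, MENTORS)
        print([p.name for p in group], "->", mentor.name)
```

A greedy pass like this depends on the order in which mentees are processed, which is one reason a proper clustering method with size and diversity constraints may be preferable.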

Required skills or prerequisite courses:

• Web development (applying OOP design, design patterns, and SOLID principles in developing a backend API using Python Flask).
• Data analysis using Python frameworks and libraries (pandas, NumPy, seaborn, matplotlib, statsmodels).
• Full-stack development:

o	Front-end: Building user interfaces with JavaScript technologies.
o	Back-end: Implementing RESTful and GraphQL APIs using JavaScript technologies.

• DBMS: Database design and using NoSQL/SQL databases in web development.

Recommended skills or prerequisite courses:

• Research and statistics skills
• Natural Language Processing
• Generative AI

Instructions: Email Pooja a statement of your interest in the project, your CV, and your transcripts.


Generative AI for Software Architecture

[added 2024-08-27]

Course: EECS4080, EECS4070

Supervisor: Marios Fokaefs

Supervisor's email address: fokaefs@yorku.ca

Project Description: The goal of the project is to create interactive AI agents to assist with design and architecture tasks for complex software systems. At first, we will evaluate the ability of generic large language models (such as ChatGPT and Gemini) to propose complex software designs and architectures using established patterns and styles, and to justify the appropriateness of the proposed solution using both quality and structural criteria. Our interest is to identify the problems or details that cause the model to hallucinate, which may lead to problematic or suboptimal designs. Next, we will engineer the prompts so that eventually they are provided by the users in a more structured manner, as a configuration file. At the same time, we will investigate whether a domain-specific foundation model will need to be developed, which would surpass the results of the generic model. Finally, the agents will be evaluated on a data science platform for epidemiological research. At the moment, the platform supports a data lake as the backend and a web front-end for data analytics. The objective of the agents will be to generate customized analytics environments based on the specific needs of the researcher. This platform was selected specifically to support non-CS experts in complex tasks such as designing and deploying software infrastructure.

Required skills or prerequisites:

  1. Major in Computer Science/Software Engineering/Computer Engineering
  2. Third year and up
  3. At least B+ for EECS 3311
  4. Proficient in Python and Java-based programming

Recommended skills or prerequisites:

  1. Basic use of tools like ChatGPT
  2. Some knowledge of AI

Instructions: Email the instructor your CV and your unofficial transcript. Use EECS4070 or EECS4080 as part of the subject of the email.


Generative AI for Self-adaptive Systems

[added 2024-08-27]

Course: EECS4080, EECS4070

Supervisor: Marios Fokaefs

Supervisor's email address: fokaefs@ualberta.ca

Project Description: The goal of the project is to allow site reliability engineers (SREs) and DevOps engineers to handle the large volume of events in complex systems to ensure reliability and minimize cost and revenue loss. In this context, the foundation model will play the role of the mediator between the SRE and a complex backend with large volumes of multi-dimensional logs and events. The model will allow the integration of data from multiple sources related to an event, summarize it into useful information, and present it through visualizations. Finally, the model will serve as an interface to act upon the event and issue mitigating and adaptive actions. The project will evaluate large language models and generative AI methods with respect to their ability in: a) integrating and summarizing data from multiple sources related to an event, b) planning and proposing responses to events, c) executing and evaluating the response for future events.

Required skills or prerequisites:

  1. Major in Computer Science/Software Engineering/Computer Engineering
  2. Third year and up
  3. At least B+ for EECS 3311
  4. Proficient in Python and Java-based programming

Recommended skills or prerequisites:

  1. Basic use of tools like ChatGPT
  2. Some knowledge of AI

Instructions: Email the instructor your CV and your unofficial transcript. Use EECS4070 or EECS4080 as part of the subject of the email.


AI in constrained environments (IoT and mobile)

[added 2024-08-27]

Course: EECS4080

Supervisor: Marios Fokaefs

Supervisor's email address: fokaefs@yorku.ca

Project Description: In the spirit of ubiquitous computing, there is a need or tendency to integrate AI in most software applications. However, in domains such as the Internet of Things, mobile computing, and edge computing, resources are constrained, while AI modules are often demanding in terms of storage, memory, and computation. Foundation models may be even more demanding, which creates new challenges in embedded and self-adaptive systems. First, the project will explore, through performance testing and benchmarking, the requirements of large language models and generative AI methods in terms of computation and storage resources. Second, the project will devise optimization methods to reduce the model’s requirements and allow its deployment in constrained environments. Finally, it will also develop adaptive methods to retrain, repackage, and redeploy AI models across heterogeneous infrastructures as they become available.

Required skills or prerequisites:

  1. Major in Computer Science/Software Engineering/Computer Engineering
  2. Third year and up
  3. At least B+ for EECS 3311
  4. Proficient in Java-based programming

Recommended skills or prerequisites:

  1. Familiarity with Android Studio
  2. At least B+ for EECS4443

Instructions: Email the instructor your CV and your unofficial transcript. Use EECS4070 or EECS4080 as part of the subject of the email.


Mapping Stong Pond

[added 2024-08-29]

Course: { EECS4080}

Supervisor: Michael Jenkin

Supervisor's email address: jenkin@yorku.ca

Project Description: Much of the surface of the planet is covered by water. Mapping and performing other tasks on these environments can be augmented through the deployment of unmanned surface vessels (USV) that can perform these tasks autonomously. This project involves using a simulator for one of the USVs in the lab to develop a strategy to perform measurements over real and simulated bodies of water (Stong Pond).

Required skills or prerequisites:

  1. Ability to work independently and in groups
  2. Good Python programming skills
  3. Knowledge of/interest in ROS2 would be helpful

Recommended skills or prerequisites:

  1. None beyond 4080 prerequisites

Instructions: Contact Michael Jenkin at jenkin@yorku.ca if interested.


Enhanced avatar for human-robot interaction

[added 2024-08-29]

Course: { EECS4080 }

Supervisor: Michael Jenkin

Supervisor's email address: jenkin@yorku.ca

Project Description: Avatars have been proposed as a key element in user interface designs since the development of Microsoft's Clippy, if not before. In the lab we have been developing a Unity-based avatar that operates as the front end of an LLM-based avatar that can be deployed in various environments. This forward-facing avatar provides a natural interaction with individuals in the environment, providing audio-based input and output and literally putting a face on the underlying system. The basic goal of the project is to take the operational system and to enhance it in a number of ways, perhaps most critically through the addition of canned animation scripts that the avatar can use to appear natural both while interacting and while idle.

Required skills or prerequisites:

  1. Ability to work independently and as part of a team.
  2. Knowledge/interest in Unity and C# programming
  3. Ability to work with external partners.

Recommended skills or prerequisites:

  1. none beyond those required for 4080

Instructions: Contact Michael Jenkin by email (jenkin@yorku.ca) if interested.


Evaluating Planning Domain Validation Tools

[added 2024-09-18]

Course: { EECS4080 }

Supervisor: Yves Lesperance

Supervisor's email address: lesperan@eecs.yorku.ca

Project Description: In this project, the student will evaluate software tools (such as Val, FastDownward, and the “unquestionable parser for PDDL 3.1”) that are used to validate planning domains and planning problems specified in the Planning Domain Definition Language (PDDL), focussing on the STRIPS fragment. This is part of a larger project to use Large Language Models to generate abstract planning domain models to support more efficient planning and provide explanations at an abstract level by suppressing uninteresting details from domain models. The validation tool selected will become part of a neuro-symbolic system to help users develop such abstract planning models.

Required skills or prerequisites:

  1. EECS 3401
  2. Python

Recommended skills or prerequisites:

  1. basic knowledge of AI planning techniques and PDDL
  2. some experience using large language models

Instructions: Contact instructor by email


TEMPLATE ENTRY 10 - PUT PROJECT TITLE HERE

[added YYYY-MM-DD]

Course: { EECS4080 | EECS4088 | EECS4480}

Supervisor: NAME

Supervisor's email address: EMAIL

Project Description: lorem ipsum…

Required skills or prerequisites:

  1. pre-req 1… do not add pre-reqs that already exist for the course, see Course Descriptions
  2. pre-req 2…

Recommended skills or prerequisites:

  1. recommended skill/prereq 1…

Instructions: state how you wish to receive inquiries of interest

