Proposed Projects for Summer 2023

Building a Personal Finance Planning App

Course: EECS4080

Supervisor: Uyen Trang Nguyen

Supervisor's email address: utn@eecs.yorku.ca

Project Description: How much should I spend and how much should I save (for a car/house and my children’s education)? How much do I need to save for a comfortable retirement? What types of investment should I use for a specific saving goal? What is the investment return rate needed to reach my saving goals? How much will my savings grow? In this project we will build an app to assist people with their personal finance planning. The students will read books/articles on personal finance (provided by the supervisor) to acquire background knowledge for the project.
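
Two of the questions above ("How much will my savings grow?" and "What is the investment return rate needed to reach my saving goals?") reduce to compound-interest arithmetic. The sketch below is only an illustration, not the project's design. It assumes a fixed annual return compounded once a year and equal contributions made at the end of each year; the function names and sample figures are hypothetical.

```python
def future_value(initial, annual_contribution, annual_rate, years):
    """Savings balance after `years`, with a contribution added at each year end."""
    balance = initial
    for _ in range(years):
        balance = balance * (1 + annual_rate) + annual_contribution
    return balance


def required_rate(initial, annual_contribution, goal, years, tolerance=1e-9):
    """Annual return needed to reach `goal`, found by bisection on [0, 1].

    Returns 0.0 if the goal is met without any investment return and raises
    ValueError if even a 100% annual return is not enough (assumed upper bound).
    """
    if future_value(initial, annual_contribution, 0.0, years) >= goal:
        return 0.0
    if future_value(initial, annual_contribution, 1.0, years) < goal:
        raise ValueError("goal unreachable with an annual return of 100% or less")
    low, high = 0.0, 1.0
    while high - low > tolerance:
        mid = (low + high) / 2
        if future_value(initial, annual_contribution, mid, years) < goal:
            low = mid   # rate too small, search the upper half
        else:
            high = mid  # rate sufficient, search the lower half
    return high


# Hypothetical example: 10,000 saved today plus 5,000 per year for 2 years at 5%.
balance = future_value(10_000, 5_000, 0.05, 2)
rate = required_rate(10_000, 5_000, balance, 2)
```

Bisection works here because the final balance grows monotonically with the rate when the starting balance and contributions are non-negative.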

Required skills or prerequisites: Experience in software development; completion of a co-op or internship in software development.

Instructions:

Learning Analytics Application (LAApp)

Course: EECS4080

Supervisor: Pooja Vashisth; Lab Website: https://lassonde.yorku.ca/users/pvashisth

Supervisor's email address: vashistp@yorku.ca

Project Description:

Problems and Needs: Moodle, York University's Learning Management System (LMS), provides instructors with rich data sets on students’ activities and performance. However, while the data comes in bulk, some important information (e.g. course activities) is automatically deleted by the system and replaced with new data each week. This prevents instructors from using the data effectively to enhance their teaching. Moreover, the sheer volume of data may prevent professors from drawing useful insights to improve course quality.

Ideas: This project aims to address these problems with the following solutions: retrieve the full data set using scripts/integrations/apps that automatically pull course data from Moodle; provide instructors with useful visualizations of Moodle’s enormous data to support their teaching; and provide a quick summary and insights from those data sets. The issues to identify and address are listed below.

(1) Pull data: quiz statistics, proposal, weekly course activity, new analytics → report, course activity, course grades.

(2) Data visualization:

Visualize general data.

Data of assessments (quizzes, assignments, …): summary for all assessments, for each assessment, and for each question; mean, mode, median; average time overall and for each assessment; average number of attempts; grade distribution; time distribution (when and how long students spend on assessments) → clusters; respondents distribution (based on best attempt and compared to class) for each question; correlation between the number of attempts and accuracy; highlight hard questions and the corresponding materials/reading views and engagement; extract question labels/keywords; the types of questions (MCQ, pseudocode, …) that students generally do not perform well on.

Data of exam: mean, mode, median; grade distribution.

Data of engagement: time distribution (when and how long students spend) per course material; most and least viewed/engaged materials; average views and engagement per material; whether there is any correlation between students’ engagement and performance (i.e. do students actually achieve higher scores if they engage with course materials more frequently).

Visualize individual data: current avg. mark; all marks of that student up to that moment; performance as a graph (quizzes, assignments, test 1, test 2, final); attempts and accuracy per quiz; time spent on each assessment; the time a student started an assessment and the last submission timestamp; questions/topics that each student is doing well on or struggling with. Engagement: what content does this student engage with? How often? When? For how long? How many times was a document opened? Is there a correlation between engagement and performance?

Feedback and insights: individual feedback (assess whether a student is at risk or not); if a student does well on assignments, how do they perform on midterms and finals? How about the opposite case? [tentative] Predict examination results?

Interventions: when and how should instructors deploy their interventions to students, based on the results provided? [tentative] Option to email/inform low-performing students?
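
As a small illustration of the summary statistics and the attempts-versus-accuracy correlation listed above, the following sketch uses only the Python standard library. The quiz records are invented sample data, not Moodle exports, and the field and function names are assumptions; a real implementation would more likely use pandas and related libraries.

```python
from math import sqrt
from statistics import mean, median, multimode

# Hypothetical records: one row per student for a single quiz.
records = [
    {"student": "s1", "attempts": 1, "score": 60},
    {"student": "s2", "attempts": 2, "score": 70},
    {"student": "s3", "attempts": 3, "score": 80},
    {"student": "s4", "attempts": 2, "score": 70},
]


def summarize(values):
    """Mean, median, and mode(s) of a list of numbers."""
    return {
        "mean": mean(values),
        "median": median(values),
        "mode": multimode(values),  # a list, since ties are possible
    }


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant data")
    return cov / sqrt(var_x * var_y)


scores = [r["score"] for r in records]
attempts = [r["attempts"] for r in records]

score_summary = summarize(scores)
attempts_vs_score = pearson(attempts, scores)
```

A coefficient near +1 or -1 indicates a strong linear relationship in the sample; it does not by itself show that more attempts cause higher scores.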

Deliverables: Research and proposal for a website to show statistical visualizations (high priority): sign in with Moodle; graphs of all students’ data; graphs of individuals’ data; written feedback and insights for each student. Automation of data collection from Moodle (medium priority). Instructors will be able to share those results with students (optional, low priority).

The extent of implementation depends on the complexity and scope of the tasks mentioned above.

Required skills or prerequisites: The position is open to third- and fourth-year students in a computer science, statistics, data science, or engineering degree.

Recommended skills or prerequisites: Good programming, statistics, and research skills, and a willingness to explore the unknown.

Research and statistics skills; web development (applying OOP design, design patterns, and SOLID principles in developing a backend API using Python Flask); data analysis using Python frameworks and libraries (pandas, NumPy, seaborn, matplotlib, statsmodels).

Instructions: Send your CV, transcript, and a statement of your interest in and suitability for the project to Professor Vashisth.

Designing Privacy-preserving Systems

Course: EECS4080 or EECS4480

Supervisor: Yan Shvartzshnaider

Supervisor's email address: rhythm.lab@yorku.ca

Project Description: Modern sociotechnical systems share and collect vast amounts of information. These systems violate users’ privacy by ignoring the context in which the information is shared and failing to incorporate contextual information norms.

Using techniques in natural language processing, machine learning, network, and data analysis, this project is set to explore the privacy implications of mobile apps, online platforms, and other systems in different social contexts/settings.

To tackle this challenge, the project will operationalize a cutting-edge privacy theory and methodologies to conduct an analysis of existing technologies and design privacy-enhancing tools.

Students will help analyze information handling practices of online services and design privacy-enhancing tools.

Specific tasks include: comprehensive literature review of existing methodologies and tools, analysis of privacy policies and regulations, visualization of information collection practices, and design of a web-based interface for analyzing extracted privacy statements to identify vague, misleading, or incomplete privacy statements.

For prior project, see this link

Required skills or prerequisites: Good programming and data analysis skills overall, and experience in using Jupyter and/or R for data analysis. Ability to work independently. Interest in usable privacy, critical analysis of privacy policies, and privacy-related regulation.

Recommended skills or prerequisites: Experience with Machine Learning, Natural Language Processing techniques, HCI design. Students with diverse backgrounds, including in technical fields, social sciences and humanities are encouraged to apply.

Instructions: Please fill in this form

Identifying the Relationship between Course Performance and Career-Readiness

Course: EECS4080

Supervisor: Marzieh Ahmadzadeh

Supervisor's email address: marzieha@yorku.ca

Project Description: The proposed study uses data-mining techniques to explore the relationship between course performance and career readiness in Electrical Engineering and Computer Science. The study includes gathering data on students' online performance (e.g., StackOverflow) as well as their academic performance. It then aims to analyze the relationship between these and the career readiness level of individuals as measured by surveys, standardized tests, and other relevant metrics.

The project follows a mixed-methods approach. First, the student applies data mining and machine learning techniques to identify the relationship between multiple factors, patterns, trends, and correlations within the collected data. This analysis involves various aspects, including but not limited to the impact of demographic and other relevant factors on the results.

Required skills or prerequisites:

Recommended skills or prerequisites:

Instructions: Email your full CV and transcript to marzieha@yorku.ca

Robot Tutors in Higher Education

Course: EECS4070

Supervisor: Meiying Qin

Supervisor's email address: mqin@yorku.ca

Project Description: In this reading course, you will survey the literature on robot tutoring systems, which lies in the field of human-robot interaction (HRI). In particular, you will read literature on robot tutors for learners of different ages, ranging from elementary school students to university students. You will also learn reinforcement learning techniques relevant to modelling robot tutors. To gain a deeper understanding of the material, you may design a relevant project based on what you have learned, though you will not implement it. You are expected to compile a survey of robot tutoring systems as an outcome of this course. Depending on the quality of the survey, we may publish it, giving you experience with formal publication.

Required skills or prerequisites:

Recommended skills or prerequisites:

Instructions: Please send your CV and transcript. Optional: an e-portfolio that demonstrates previous projects you have worked on.

How to use ChatGPT in Classrooms

Course: EECS4080

Supervisor: Hadi Hemmati

Supervisor's email address: hemmati@yorku.ca

Project Description: ChatGPT and similar AI-based generative models are becoming increasingly capable, and one concern is their misuse in classrooms (i.e., for cheating). However, they can also be seen as a tool that instructors can leverage in assignments and class activities. In this project we will try to build tools that help university instructors use AI in the classroom. Note: students will be expected to do a portion of the project, not the whole.

Required skills or prerequisites:

Recommended skills or prerequisites:

Instructions: Send c.v. and transcript to Prof. Hemmati.

ChatGPT-based tool for code defect detection

Course: EECS4080

Supervisor: Hadi Hemmati

Supervisor's email address: hemmati@yorku.ca

Project Description: In this project, you will work with graduate students and research assistants to help implement tools that use ChatGPT or other similar large language models to predict defects in code.

Required skills or prerequisites: Java, Python, basic knowledge of Machine Learning

Recommended skills or prerequisites:

Instructions: Send your CV and transcript to Prof. Hemmati.

Design and Implementation of a Mental Health Mobile App for University Students

Course: EECS4080

Supervisor: Kiemute Oyibo

Supervisor's email address: koyibo@yorku.ca

Project Description: Mental health issues are a growing public concern globally. University students are not left out of this global crisis, which became more prevalent during and after the COVID-19 pandemic. Mental health issues such as stress, anxiety, and depression adversely affect health and quality of life, including students’ academic performance. Research shows that 21% of student cases at counseling centers border on severe mental health issues. Despite the growing number of students experiencing mental health challenges, university support services are inadequate. This calls for additional support systems to help university students who may be in need, e.g., of social support and counselling. We propose to utilize a mental health app tailored to the university student community to support and promote students’ mental wellbeing on campus. In this project, we aim to design, implement, and evaluate a mental health mobile app that has the potential to foster students’ mental health and wellbeing.

Required skills or prerequisites:

Recommended skills or prerequisites:

Instructions: Send your CV and transcript to Prof. Oyibo.

Design and Implementation of an Interactive Website to Create Awareness about Dark Patterns

Course: EECS4080

Supervisor: Kiemute Oyibo

Supervisor's email address: koyibo@yorku.ca

Project Description: Dark patterns have become prevalent in the online environment. They are deceptive and/or manipulative interfaces crafted by UX designers in the best interest of the vendor or service. Examples include bait and switch, sneak into the basket, hidden cost, tricky question, and confirm shaming. Despite their prevalence in the online environment, public awareness of dark patterns is still low. In this project, we aim to create an interactive educational website in which the various types of dark patterns are illustrated, and users are made aware of the “underlying intentions” and the potential cost and effects on users.

Required skills or prerequisites:

Recommended skills or prerequisites:

Instructions: Send your CV and transcript to Prof. Oyibo.

Electric Load Forecasting via Deep Generative Models

Course: EECS4080

Supervisor: Michael Jenkin

Supervisor's email address: jenkin@yorku.ca

Project Description: With the fast increase in renewable energy generation and electric vehicles, electric load forecasting is becoming more and more important for power system operation. Based on the forecasting horizon, there are mainly three types of load forecasting, i.e., short-term, medium-term, and long-term. Short-term load forecasting mainly aims to predict the electric load in the next few seconds to the next few hours, which can be very helpful for real-world energy dispatching. In recent years, machine learning, especially deep learning, has shown impressive performance for short-term load forecasting. Generative models, e.g., generative adversarial networks, have shown great potential for computer vision and natural language processing. The potential of such generative models has not been well studied for load forecasting. In this project, we mainly aim to benchmark the performance of different types of deep generative models for short-term load forecasting. We will mainly work on OPEN EI data sets which consist of electric load consumption data sets for different buildings in the US.
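
Benchmarking forecasting models usually starts from a simple reference point. The sketch below is not one of the deep generative models the project targets. It assumes a persistence baseline (predict that the next reading equals the current one) scored with mean absolute error; both choices, the function names, and the load values are illustrative assumptions.

```python
def persistence_forecast(series):
    """One-step-ahead forecasts: each reading is predicted as the previous one.

    The result aligns with series[1:], since the first reading has no predecessor.
    """
    return list(series[:-1])


def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Made-up load readings in kW at consecutive time steps.
load_kw = [10.0, 12.0, 11.0, 15.0]

predicted = persistence_forecast(load_kw)  # forecasts for readings 2 to 4
actual = load_kw[1:]
baseline_mae = mean_absolute_error(actual, predicted)
```

Any candidate model can then be scored with the same error function on the same readings and compared against `baseline_mae`.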

Required skills or prerequisites: Good Python software skills. Interest in AI systems.

Recommended skills or prerequisites: Interest in GANs. Interest in AI software development.

Instructions: Send your CV, an (unofficial) transcript, and your GitHub repo address, if available, to Prof. Jenkin.

Analyze the Impacts of Ensemble Learning for Anomaly Detection

Course: EECS4080

Supervisor: Michael Jenkin

Supervisor's email address: jenkin@yorku.ca

Project Description: Hacking and false data injection by adversaries can cause significant financial loss. Accurate detection of anomalies is of significant importance for the safe and efficient operation of modern power grids. In recent years, different types of techniques, such as statistical methods, unsupervised learning methods, generative models, and prediction-based methods, have been applied to anomaly detection. However, most current work assumes that the data distribution is stable and ignores distribution drift, which often happens in the real world. In this work, we aim to utilize the benefits of ensemble learning to address real-world anomaly detection problems. Specifically, we plan to dynamically utilize the different base models via ensemble learning to tackle the challenges of distribution drift in the real world. For this project, we will mainly work on two data sets, i.e., the Secure Water Treatment (SWaT) Dataset and the ICS Cyber Attack Dataset. These two data sets are frequently used real-world data sets for anomaly detection.
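
One way to picture "dynamically utilizing" base models is an ensemble whose weights shift toward whichever detectors have been accurate recently. The sketch below uses a standard exponentially weighted update, which is an assumption chosen for illustration and not necessarily the method the project will adopt. It also assumes that each base model outputs an anomaly score in [0, 1] and that a true label (0 for normal, 1 for anomalous) becomes available after each observation; all names and numbers are hypothetical.

```python
import math


def combine(scores, weights):
    """Weighted average of the base models' anomaly scores."""
    return sum(w * s for w, s in zip(weights, scores)) / sum(weights)


def update_weights(weights, scores, label, learning_rate=1.0):
    """Shrink the weight of each model in proportion to its error on `label`.

    Uses a multiplicative exponential update and renormalizes to sum to 1.
    """
    updated = [
        w * math.exp(-learning_rate * abs(s - label))
        for w, s in zip(weights, scores)
    ]
    total = sum(updated)
    return [w / total for w in updated]


# Two hypothetical base detectors, equally weighted at the start.
weights = [0.5, 0.5]
scores = [0.9, 0.2]  # model 0 flags the point, model 1 does not

ensemble_score = combine(scores, weights)
# Suppose the point turns out to be a real anomaly (label 1).
new_weights = update_weights(weights, scores, label=1)
```

After the update, the detector that scored the anomaly highly carries more weight, so the ensemble can follow a drifting distribution as long as labels or a proxy for them keep arriving.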

Required skills or prerequisites: Interest in AI systems. Interest in AI software development.

Recommended skills or prerequisites: Good Python programming skills. Some course(s) in AI systems.

Instructions: Send your CV, an (unofficial) transcript, and your GitHub repo address, if available, to Prof. Jenkin.