projects
Differences
This shows you the differences between two versions of the page.
projects [2015/08/11 20:43] – jarek | projects [2015/08/26 21:59] (current) – jarek | ||
---|---|---|---|
Line 1: | Line 1: | ||
====== Proposed Projects for Fall 2015 ====== | ====== Proposed Projects for Fall 2015 ====== | ||
\\ | \\ | ||
+ | ======Clustering High-Dimensional Data Sets====== | ||
- | ===Genome-wide identification of plant micro RNAs=== | + | **Supervisor: |
+ | Clustering is a basic technique for analyzing data sets. Clustering is the process of grouping data points in such a way that points within a group are | ||
+ | more similar to each other than to points in other groups. Many clustering algorithms have been developed over the years. However, no single algorithm works well for all data sets. Further, most clustering algorithms have running times on the order of n^2 or n^3, which makes them infeasible for data sets with hundreds of thousands of points. In this project we will design good clustering algorithms for large real data sets. In particular, we are interested in | ||
+ | Biological data sets. | ||
- | **Supervisor: | + | Our data sets will include those obtained from Flow Cytometry. Flow Cytometry is a common technique in many areas of Biology, particularly Immunology. Typical usage involves testing a blood sample for 25 attributes on a per-cell basis, and thus typical data sets are arrays of 500,000 points in a 25-dimensional space. The aim is to identify clusters that correspond to a biologist' |
+ | |||
+ | No Biology knowledge is required. The student should be a strong programmer. Knowledge of C/C++ is desirable but not essential. The work involves reading and understanding existing algorithms and working with the supervisor to design and implement improved algorithms and to measure the performance of the proposed algorithm(s). | ||
+ | |||
+ | For more information, | ||
+ | |||
+ | Required Background: General CSE408x prerequisites | ||
+ | |||
+ | |||
+ | \\ | ||
+ | |||
+ | |||
+ | ======Metaheuristic-based Optimization techniques====== | ||
+ | |||
+ | **Supervisor: | ||
+ | |||
+ | Optimization is a crucial step in many computational problems. For computational problems that seem (or are known to be) intractable, | ||
+ | |||
+ | The student should be a strong programmer. A good grasp of algorithms and knowledge of C/C++ are desirable but not essential. The work involves reading and understanding existing algorithms and working with the supervisor to design and implement improved algorithms and to measure the performance of the proposed algorithm(s). | ||
+ | |||
+ | For more information, | ||
+ | |||
+ | Required Background: General CSE408x prerequisites | ||
+ | |||
+ | \\ | ||
+ | |||
+ | ======Data visualization in Skydive====== | ||
+ | |||
+ | **Supervisor: | ||
+ | |||
+ | Skydive is a prototype system designed for database visualization using the concept of the so-called | ||
+ | data pyramid. The system is composed of three modules (DB - Database Module, D2I - | ||
+ | Data-to-Image module, and VC - Visualization Client). Each is designed to use a different type | ||
+ | of computer memory. The DB module uses disk to store and manage the raw data and materialized | ||
+ | data pyramids. The D2I module works with a small subset of the aggregated dataset | ||
+ | and stores data in main memory (RAM). The VC module uses the graphics card’s capabilities to | ||
+ | perform more advanced operations – such as zooming, scaling, panning, and rotation – over the | ||
+ | graphical representation of the data. | ||
+ | Currently the system supports three presentation models implemented within the Visualization | ||
+ | Component, namely: | ||
+ | |||
+ | • a 2D heat-map; | ||
+ | |||
+ | • a 2.5D heat-map (by 3D bar chart); and | ||
+ | |||
+ | • a 2.5D terrain (by mesh and UV-mapping). | ||
+ | |||
+ | The goal of the project is to implement two additional ways of visualizing data and to | ||
+ | extend some of the existing ones, that is: | ||
+ | |||
+ | 1. Implement and test functions for data pyramid-based visualization of time series. | ||
+ | |||
+ | 2. Implement functions for visualization based on the cross-product of data pyramids. | ||
+ | |||
+ | 3. Add support for specular and normal maps to the 2.5D terrain presentation model. | ||
+ | |||
+ | Required Background: CSE 3421, a Java programming course (a C programming course is a plus) | ||
+ | |||
+ | |||
+ | \\ | ||
+ | |||
+ | ======Genome-wide identification of plant micro RNAs====== | ||
+ | |||
+ | |||
+ | **Supervisor: | ||
Line 50: | Line 118: | ||
\\ | \\ | ||
+ | ======Dynamic Interface Detection and Control Project====== | ||
- | ===Dynamic Interface Detection and Control Project=== | + | **Supervisor: |
- | + | ||
- | **Supervisor: | + | |
Line 79: | Line 146: | ||
\\ | \\ | ||
+ | |||
====== DDoS Attack using Google-bots ====== | ====== DDoS Attack using Google-bots ====== | ||
- | **Supervisor**: Ntalija | + | **Supervisor:** Natalija |
**Recommended Background**: | **Recommended Background**: | ||
Line 105: | Line 173: | ||
\\ | \\ | ||
+ | |||
====== Attentive Sensing for Better Two-Way Communication in Remote Learning Environments ====== | ====== Attentive Sensing for Better Two-Way Communication in Remote Learning Environments ====== | ||
Line 144: | Line 213: | ||
\\ | \\ | ||
- | ====== | + | ====== JPF in a Jar ====== |
**Supervisor: | **Supervisor: | ||
Description: | Description: | ||
- | Java PathFinder | + | JPF, which is short for Java PathFinder, is an open source |
- | The Java library Apache log4j allows developers | + | tool that has been developed at NASA's Ames Research Center. |
- | statements are output. | + | The aim of JPF is to find bugs in Java code. Instead of |
- | to detect bugs in log4j by means of JPF with very limited succes. | + | using testing to find those bugs, JPF uses model checking. |
+ | The facts that JPF is downloaded hundreds of times per month | ||
+ | and that some of the key papers on JPF have been cited more | ||
+ | than a thousand times reflect the popularity of JPF. In | ||
+ | fact, it is the most popular model checker for Java. | ||
+ | |||
+ | A study done by Cambridge University in 2014 found that the | ||
+ | global cost of debugging code has risen to $312 billion annually. | ||
+ | Furthermore, | ||
+ | programming time with finding and fixing bugs. | ||
+ | advocating | ||
+ | |||
+ | Installing JPF is far from trivial. | ||
+ | implemented in Java. Therefore, it should, in theory, be | ||
+ | feasible | ||
+ | This would significantly simplify the installation | ||
+ | process | ||
+ | accessible to its potential users. | ||
- | Recently, in collaboration with Shafiei (NASA) we have developed | + | The aim of this project |
- | an extension of JPF called jpf-nhandler. | + | Since JPF relies on a number of configuration files, so-called |
- | is to apply this extension | + | Java properties files, incorporating these properly into the |
+ | jar is one of the challenges. | ||
+ | another challenge. | ||
+ | our modifications | ||
+ | few classes, yet another challenge. | ||
- | [1] David A. Dickey, B. Sinem Dorter, J. Michael German, Benjamin D. Madore, Mark W. Piper, Gabriel L. Zenarosa. " | + | In this project you may collaborate with graduate students |
+ | of the DisCoVeri group (discoveri.eecs.yorku.ca) and | ||
+ | computer scientists of NASA. For more information, feel | ||
+ | free to send email to franck@cse.yorku.ca. | ||
**Required Background: | **Required Background: |
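
The JPF in a Jar description names the handling of Java properties files inside a jar as one of the challenges. As a rough illustration of the general technique only (not JPF's actual configuration code), the sketch below reads a properties file from the classpath, which also works when the file is packaged inside the jar, and falls back to defaults when the resource is missing. The class name `BundledConfig`, the resource name `example.properties`, and the key `example.key` are all hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

// Hypothetical helper, not part of JPF: reads configuration bundled on the classpath.
public class BundledConfig {

    // Loads a properties resource from the classpath (for example, from inside
    // the running jar). Keys missing from the resource fall back to `defaults`.
    public static Properties loadFromClasspath(String resourceName, Properties defaults) {
        Properties props = new Properties(defaults);
        ClassLoader loader = BundledConfig.class.getClassLoader();
        try (InputStream in = loader.getResourceAsStream(resourceName)) {
            if (in != null) {          // null means the resource was not found
                props.load(in);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    // Parses properties from a string; convenient for building defaults.
    public static Properties parse(String text) {
        Properties props = new Properties();
        byte[] bytes = text.getBytes(StandardCharsets.ISO_8859_1);
        try (InputStream in = new ByteArrayInputStream(bytes)) {
            props.load(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        // "example.key" and "example.properties" are made-up names for illustration.
        Properties defaults = parse("example.key = 10\n");
        Properties props = loadFromClasspath("example.properties", defaults);
        System.out.println(props.getProperty("example.key"));
    }
}
```

Reading through the class loader rather than `java.io.File` is the relevant design choice here: an entry inside a jar is not a file on disk, so code that opens configuration by file path generally needs to be adapted before it can run from a single jar.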
projects.1439325791.txt.gz · Last modified: 2015/08/11 20:43 by jarek