ongoing

Differences

This shows you the differences between two versions of the page.

ongoing [2011/05/30 17:17] bil
ongoing [2011/08/23 20:46] dymond
Line 1: Line 1:
-====== Ongoing projects ======
+====== Previous projects ======
  
 ====== Imputation of missing values in microarray data ======
Line 9: Line 9:
 __Description__
  
-    Microarrays are a relatively new technology that has had tremendous impact
-    on many areas within biology and bioinformatics.  Microarray technology
-    enables researchers to study the behaviour of many genes and/or conditions
-    in a single experiment.
+Microarrays are a relatively new technology that has had tremendous
+impact on many areas within biology and bioinformatics.  Microarray
+technology enables researchers to study the behaviour of many genes
+and/or conditions in a single experiment.
  
-    Due to technological limitations and experiment design issues, microarray
-    data sets typically have several missing values.  It has been shown [3] that
-    imputation of these values improves the accuracy of different processing
-    tasks, including clustering, that are typically done on these data sets.
-    Therefore, good imputation algorithms are required.
+Due to technological limitations and experiment design issues,
+microarray data sets typically have several missing values.  It has been
+shown that imputation of these values improves the accuracy of
+different processing tasks, including clustering, that are typically
+done on these data sets.  Therefore, good imputation algorithms are
+required.
  
-    In this project, we will explore fast and accurate imputation algorithms for
-    microarray data.  The student will first read the papers assigned and write
-    a short summary of them.  Then, he will study the performance of a few
-    algorithms from the literature (many algorithms are already implemented but
-    1 - 2 may need to be implemented).  Finally, he will work with the
-    supervisor on the design of better algorithms for the problem being
-    studied.  He will use publicly available data sets to compare the
-    performance (accuracy and speed) of the new algorithm(s) to the GMCImpute
-    algorithm and several other existing ones
-
-    Throughout the course, the student is required to maintain a course website
-    to report any progress and details about the project.
+In this project, we will explore fast and accurate imputation algorithms
+for microarray data.  The student will first read the papers assigned
+and write a short summary of them.  Then, he will study the performance
+of a few algorithms from the literature (many algorithms are already
+implemented but 1 - 2 may need to be implemented).  Finally, he will
+work with the supervisor on the design of better algorithms for the
+problem being studied.  He will use publicly available data sets to
+compare the performance (accuracy and speed) of the new algorithm(s) to
+the GMCImpute algorithm and several other existing ones.
 
+Throughout the course, the student is required to maintain a course
+website to report any progress and details about the project.
  
 ====== An Open Source Structural Equation Modeling Graph Drawing Application ======
Line 42: Line 42:
 __Description__
  
-Structural equation modeling (SEM) is a statistical technique that is becoming increasingly popular in the sciences. SEM allows researchers to test the validity of hypothesized models involving complex relationships among multiple variables. These models can include latent variables, which are not measured directly but are constructs inferred from observed variables. Structural equation models can be represented visually by graphs (Figure 1 - attached). To generate Figure 1 currently in R would require over 80 lines of code, which is not reusable and has to be rewritten each time a new graph has to be developed or analyzed (R is a UNIX-based, command-line-only program; however, it is a very powerful analytic research tool).
+Structural equation modeling (SEM) is a statistical technique that is becoming increasingly popular in the sciences. SEM allows researchers to test the validity of hypothesized models involving complex relationships among multiple variables. These models can include latent variables, which are not measured directly but are constructs inferred from observed variables. Structural equation models can be represented visually by graphs. To generate such graphs currently in R would require over 80 lines of code, which is not reusable and has to be rewritten each time a new graph has to be developed or analyzed (R is a UNIX-based, command-line-only program; however, it is a very powerful analytic research tool).
  
 Collected data is used to estimate the parameters of the equations and to assess the fit of the model. There are several SEM software options available to researchers; however, all have serious limitations (Windows only, Unix only, expensive licensing fees, text based or command line only, no GUI, etc).
ongoing.txt · Last modified: 2013/04/19 20:29 by mb