oxfordfeb162012:start
Differences
This shows you the differences between two versions of the page.
oxfordfeb162012:start [2015/11/07 15:38] – magiero | oxfordfeb162012:start [2015/11/09 19:59] (current) – magiero | ||
---|---|---|---|
Line 21: | Line 21: | ||
==== - Present Invention [0013]-[0020] ==== | ==== - Present Invention [0013]-[0020] ==== | ||
• [0013] The **present invention** shows a method for analyzing measurements dependent on the identity of k-mers consisting of...\\ | • [0013] The **present invention** shows a method for analyzing measurements dependent on the identity of k-mers consisting of...\\ | ||
- | • [0014] 1.) deriving a **feature vector** from the measurements | + | • [0014] 1.) deriving a **feature vector** from the measurements.\\ |
• [0015] 2.) determining the similarity to at least one other feature vector.\\ | • [0015] 2.) determining the similarity to at least one other feature vector.\\ | ||
• [0017] The present invention **does not try to extract the exact sequence**. | • [0017] The present invention **does not try to extract the exact sequence**. | ||
Line 203: | Line 203: | ||
* //abundance of biomarker miRNA//: 20-25-mer RNA circulating in blood; expression associated with disease/ | * //abundance of biomarker miRNA//: 20-25-mer RNA circulating in blood; expression associated with disease/ | ||
* //foetal copy number variation//: | * //foetal copy number variation//: | ||
- | * // | + | * // |
* //viral or bacterial load//: measure of infection severity; number of pathogen RNA and DNA per ml of blood is measured (possibly with enrichment); | * //viral or bacterial load//: measure of infection severity; number of pathogen RNA and DNA per ml of blood is measured (possibly with enrichment); | ||
* //probes//: e.g. aptamers to a biomarker panel some of which attach to a target molecule; count those that did/ | * //probes//: e.g. aptamers to a biomarker panel some of which attach to a target molecule; count those that did/ | ||
Line 212: | Line 212: | ||
=== - Measurement of Differences [0302]-[0305] === | === - Measurement of Differences [0302]-[0305] === | ||
- | • [0303] | + | • [0303] |
- | • [0304] | + | • [0304] |
+ | • [0305] __Example 3__: **Splice variants** and/or **translocation breakpoints**; | ||
=== - Identification of Presence (with Confidence) [0306]-[0311] === | === - Identification of Presence (with Confidence) [0306]-[0311] === | ||
+ | • [0308] __Example 1__: Identify populations that are related to some degree, but not identical, to a known molecule, e.g. homology of DNA or protein in rapidly mutating diseases.\\ | ||
+ | • [0309] __Example 2__: [[https:// | ||
+ | • [0310] __Example 3__: Diagnosis of [[https:// | ||
+ | • [0311] __Example 4__: Comparing plural parts of DNA to plural stored features; e.g. DNA sequences for known protein domains may be stored in a library. Part of the derived feature vector may match a catalytic domain and part may match a DNA binding domain, so the function of the protein may be deduced.\\ | ||
=== - Assembly Application [0312]-[0317] === | === - Assembly Application [0312]-[0317] === | ||
+ | • | ||
==== - Use Examples [0318]-[0351] ==== | ==== - Use Examples [0318]-[0351] ==== | ||
=== - Data Acquisition [0318]-[0325] === | === - Data Acquisition [0318]-[0325] === | ||
+ | • [0323] Single-channel currents were measured on an Axopatch 200B equipped with 1440A digitizers.\\ | ||
+ | • [0323] Pt electrodes were used: cis connected to the ground of the Axopatch headstage and trans connected to the active electrode of the headstage.\\ | ||
=== - Identification and Quantification of DNA [0326]-[0333] === | === - Identification and Quantification of DNA [0326]-[0333] === | ||
+ | • [0326] This example describes the process of identification of DNA molecules in a solution from a pre-determined library of feature vectors.\\ | ||
+ | • [0327] Library construction performed as follows: | ||
+ | - Take [[https:// | ||
+ | - Chop it up into 18 sequences of 400 bases each, with adjacent strands overlapping by 100 bases | ||
+ | - All strands contain a sequence at the beginning and end that is common to all strands and not part of the larger genome (presumably attached artificially?) | ||
+ | - Library feature vectors constructed for mean current (5-mer model) | ||
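The library construction steps in [0327] can be sketched roughly as follows. This is only an illustration under loud assumptions: the genome is treated as a linear string, the placeholder sequence is made up (its length of 5500 is chosen only so that it yields exactly 18 fragments), and `toy_kmer_model` is an invented stand-in for a real 5-mer current model, which these notes do not give. The common end sequences from the third step are not modelled.

```python
from itertools import product


def chop(sequence, length=400, overlap=100):
    """Split a linear sequence into fixed-length fragments that overlap their neighbours."""
    step = length - overlap
    return [sequence[i:i + length]
            for i in range(0, len(sequence) - length + 1, step)]


def toy_kmer_model(k=5):
    """Invented stand-in for a k-mer current model: one made-up pA value per k-mer."""
    return {"".join(p): 40.0 + 0.05 * idx
            for idx, p in enumerate(product("ACGT", repeat=k))}


def feature_vector(fragment, model, k=5):
    """Expected mean current for each successive k-mer in the fragment."""
    return [model[fragment[i:i + k]] for i in range(len(fragment) - k + 1)]


# Placeholder sequence (not a real genome), 5500 bases long.
genome = ("ACGTTGCAAGCTTAGC" * 344)[:5500]
fragments = chop(genome)
model = toy_kmer_model()
library = [feature_vector(f, model) for f in fragments]
```

Each 400-base fragment gives 396 overlapping 5-mers, so each library feature vector in this sketch has 396 values.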
+ | • [0329] Candidate molecules are reduced to feature vectors (by measurement)\\ | ||
+ | • [0330] Candidates compared against library using **alignment algorithm**. | ||
+ | • [0331] Comparison by alignment score used a **gap penalty** of -1 and a scoring function of reciprocal absolute difference (i.e. closer matches are higher scores). | ||
+ | {{alignscore.png? | ||
+ | • [0333] All test molecules used in this experiment were molecule 13. Occasionally (< 10% of the time by my guess) these are incorrectly identified as molecule 12 from the library.\\ | ||
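A minimal sketch of the comparison described in [0330]-[0331], assuming a plain global (Needleman-Wunsch style) dynamic programme over two vectors of mean currents. The gap penalty of -1 and the reciprocal-absolute-difference score come from the notes above; the `eps` term is my own assumption, added so that identical values do not divide by zero, and the function names are hypothetical.

```python
def match_score(x, y, eps=1.0):
    """Reciprocal absolute difference: closer current values score higher.

    eps is an assumed regulariser (not from the source) that avoids
    division by zero and caps a perfect match at 1 / eps.
    """
    return 1.0 / (abs(x - y) + eps)


def alignment_score(a, b, gap=-1.0):
    """Global alignment score of two feature vectors with a linear gap penalty."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            curr.append(max(
                prev[j - 1] + match_score(a[i - 1], b[j - 1]),  # align a[i-1] with b[j-1]
                prev[j] + gap,                                  # gap in b
                curr[j - 1] + gap,                              # gap in a
            ))
        prev = curr
    return prev[-1]


def identify(candidate, library):
    """Return the 0-based index of the library entry with the highest alignment score."""
    scores = [alignment_score(candidate, entry) for entry in library]
    return max(range(len(library)), key=scores.__getitem__)
```

For example, with a toy library `[[50, 60, 70, 80], [55, 45, 65, 75], [90, 40, 60, 50]]`, the candidate `[55.3, 44.8, 75.2]` (one value missing) is assigned to the second entry, since three close matches outweigh the single gap.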
=== - Measurement of SNPs [0334]-[0339] === | === - Measurement of SNPs [0334]-[0339] === | ||
+ | • [0335] The candidate molecule is a 19th pattern. | ||
+ | {{snpcurrent.png? | ||
+ | • [0336] Alignment based identification method used previously was repeated. | ||
+ | • [0337] For SNP calling: | ||
+ | * HMM and Viterbi path used for alignment | ||
+ | * This has better path constraint (i.e. will align better through mismatched SNP regions) than... | ||
+ | * Needleman-Wunsch | ||
+ | Alignments shown below compare well with idealized library mutations shown above.\\ | ||
+ | {{snpcurrent2.png? | ||
+ | • [0338] 176 molecules were aligned and their SNPs clearly identified. | ||
+ | • [0338] In the case of 335 and 357, changes were seen at **several positions** (i.e. not just 1 as intended). | ||
+ | {{snpcompare.png? | ||
=== - Identification of Population and Sub-Population [0340]-[0343] === | === - Identification of Population and Sub-Population [0340]-[0343] === | ||
+ | • [0341] This example is worked through with simulated data.\\ | ||
+ | • [0341] A set of 60 mean current feature vectors of library component 13 is simulated.\\ | ||
+ | • [0341] 10 of these contain a SNP.\\ | ||
+ | • [0341] Gaussian noise of std. dev. 1-pA is added to each value and 5% of values within each vector are deleted at random.\\ | ||
+ | • [0342] Using this dataset a consensus is constructed using the **landmark process** outlined earlier. | ||
+ | {{consensus.png? | ||
+ | • [0343] The SNP in population members 51-60, recovered from the consensus, is clear in the figure below.\\ | ||
+ | {{snppop.png? | ||
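The simulated dataset of [0341] could be generated along these lines. This is a sketch only: the reference vector and the SNP shifts below are invented placeholders (not library component 13), and I assume the SNP perturbs the five consecutive 5-mers that cover the changed base. The noise level (1 pA standard deviation), the 5% random deletions, and the 60/10 split follow the notes.

```python
import random


def simulate_reads(reference, snp_shift, n_reads=60, n_snp=10,
                   noise_sd=1.0, deletion_rate=0.05, seed=0):
    """Simulate noisy mean-current feature vectors; the last n_snp carry the SNP."""
    rng = random.Random(seed)
    n_delete = round(deletion_rate * len(reference))
    reads = []
    for r in range(n_reads):
        values = list(reference)
        if r >= n_reads - n_snp:
            # Apply the assumed SNP-induced current shifts (pA) at the given positions.
            for pos, delta in snp_shift.items():
                values[pos] += delta
        # Add Gaussian measurement noise to every value.
        values = [v + rng.gauss(0.0, noise_sd) for v in values]
        # Delete a fixed fraction of values at random positions.
        dropped = set(rng.sample(range(len(values)), n_delete))
        reads.append([v for i, v in enumerate(values) if i not in dropped])
    return reads


# Placeholder reference of 396 values (one per 5-mer of a 400-base strand).
reference = [50.0 + 3.0 * (i % 7) for i in range(396)]
# Hypothetical SNP: shifts on the five 5-mers covering one base.
snp_shift = {200: 4.0, 201: -3.0, 202: 5.0, 203: -2.0, 204: 3.0}
reads = simulate_reads(reference, snp_shift)
```

With these settings, reads 51-60 (indices 50-59) carry the SNP, matching the population numbering used in [0343].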
=== - Identification of a Number of Populations [0344]-[0347] === | === - Identification of a Number of Populations [0344]-[0347] === | ||
+ | • [0345] Experiments on 2 and 3 simulated species were done. That is, populations of 2 and 3 different DNA samples were fed into the machine. | ||
+ | {{tree2.png? | ||
+ | • [0347] Identifications (as in example 2) were run for both experiments identifying the population distributions (see histograms). | ||
+ | {{hist2.png? | ||
=== - Assembly of a Library [0348]-[0351] === | === - Assembly of a Library [0348]-[0351] === | ||
+ | • [0349] Using the 1-18 overlapping strands mentioned above (but without the extra common pattern tacked on to both ends of each strand).\\ | ||
+ | • [0350] A tree by **neighbour joining** on **pairwise alignment** scores was constructed.\\ | ||
+ | • [0350] Since relatively large non-similar regions were expected, a scoring function that does not penalize gaps at the beginning or end of the alignment as strongly as those within the alignment was used (see result below).\\ | ||
+ | {{libtree.png? | ||
+ | • [0350] All sequences have a similar relation to two other sequences, representing the ~100 base overlap on either side.\\ | ||
+ | • [0351] Progressing through the tree in order of relatedness: | ||
+ | - consensus landmarks for the aligned sequences are constructed | ||
+ | - the landmark from that alignment then serves as the feature vector for alignment with the next sequence | ||
+ | - output of the process is a fully assembled feature vector | ||
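The scoring idea in [0350], where gaps at the ends of an alignment cost less than gaps inside it, can be sketched as below. The notes do not give the actual end-gap penalty, so `end_gap=0.0` (free end gaps) is an assumption, as are the `eps` regulariser and the function name. Only the pairwise score is shown; the neighbour-joining tree and the landmark consensus steps are not implemented here.

```python
def overlap_score(a, b, gap=-1.0, end_gap=0.0, eps=1.0):
    """Alignment score where leading/trailing gaps cost end_gap and internal gaps cost gap.

    Matches are scored by reciprocal absolute difference, 1 / (|x - y| + eps).
    """
    n, m = len(a), len(b)
    # Row 0: b starts before a, so these are leading (cheap) gaps.
    prev = [j * end_gap for j in range(m + 1)]
    # No overlap at all: every element ends up in an end gap.
    best = prev[m] + n * end_gap
    for i in range(1, n + 1):
        curr = [i * end_gap]  # a starts before b: leading gaps again
        for j in range(1, m + 1):
            curr.append(max(
                prev[j - 1] + 1.0 / (abs(a[i - 1] - b[j - 1]) + eps),
                prev[j] + gap,
                curr[j - 1] + gap,
            ))
        # b is exhausted here; the rest of a trails at the cheap end-gap cost.
        best = max(best, curr[m] + (n - i) * end_gap)
        prev = curr
    # a is exhausted; the rest of b trails at the cheap end-gap cost.
    return max(best, max(prev[j] + (m - j) * end_gap for j in range(m + 1)))
```

For two toy vectors that share a three-value overlap, `overlap_score([1, 2, 3, 4, 5, 6], [4, 5, 6, 7, 8, 9])` gives 3.0, whereas setting `end_gap=-1.0` (an ordinary global alignment) drops the best score to 1.5, which illustrates why cheap end gaps help when only ~100 bases are shared.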
+ |
oxfordfeb162012/start.1446910681.txt.gz · Last modified: 2015/11/07 15:38 by magiero