Weaning myself off Excel
In some sense, the computing I did today isn’t really useful, since I already worked out these things using Microsoft Excel. But I’ve been ordered by my bioinformatics consultants to stop with the Excel already. So as practice, I worked out some of the expected features of degenerate oligos again, but this time using R.
The main motivation for doing this, besides practice, is that I am fairly sure we should be ordering degenerate oligos with more degeneracy than we have previously considered. I won’t make that argument here; I’ll just repeat some analytical graphs I’d previously made.
It took a while (since I’m learning), but was still much more straightforward than doing it in a spreadsheet. The exercise was extremely useful, as I learned a bunch of stuff (especially about plots in R) while doing the following:
Problem #1: Given a per-base degeneracy, d, in an oligo of length n, what proportion of oligos will have k mismatches?
Answer #1: Use the binomial distribution. For a 32mer with different levels of degeneracy (shown in legend):
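A minimal R sketch of this calculation, not the original code behind the plot. The 32mer length and the 0.12 level come from this post; the other degeneracy levels and all variable names are assumptions chosen for illustration.

```r
# Proportion of oligos with k mismatches, for several per-base degeneracy levels.
n <- 32                                   # oligo length (32mer, from the post)
k <- 0:n                                  # possible mismatch counts
degeneracy <- c(0.03, 0.06, 0.12, 0.24)   # assumed levels; only 0.12 appears in the post

# One row per mismatch count k, one column per degeneracy level d.
p_mismatch <- sapply(degeneracy, function(d) dbinom(k, size = n, prob = d))
dimnames(p_mismatch) <- list(k = k, d = degeneracy)

# Show the first few mismatch classes.
print(round(p_mismatch[1:6, ], 4))
```

Each column is a full binomial distribution over k, so it sums to 1.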
Problem #2: Given a million instances of such an oligo, how many times would each specific oligo with k mismatches be expected to turn up?
Answer #2: Multiply each of the above proportions by one million, then divide by the number of classes within each k mismatches (i.e. choose(n, k)):
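This adjustment might be sketched in R as follows. The variable names are hypothetical, and 0.12 is used as the degeneracy only because it is the level mentioned elsewhere in this post.

```r
# Expected count of each individual k-mismatch class among a million oligos.
n <- 32          # oligo length
d <- 0.12        # per-base degeneracy (assumed here for illustration)
n_oligos <- 1e6  # a million instances, as in the problem statement
k <- 0:n

expected_per_k <- n_oligos * dbinom(k, size = n, prob = d)  # all oligos with k mismatches
n_classes <- choose(n, k)                                   # ways to place k mismatches
expected_per_class <- expected_per_k / n_classes            # per individual class

print(signif(expected_per_class[1:6], 3))
```

Dividing by choose(n, k) cancels the binomial coefficient, so the per-class value is just n_oligos * d^k * (1 - d)^(n - k). Note that choose(n, k) counts which positions are mismatched, not which alternative base appears there; counting each distinct sequence separately would presumably need an extra factor of 3^k.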
Problem #3: If some number of bases, m, in the n-length oligo are “important”, what proportion of oligos with k mismatches will have x “hits”?
Answer #3: Use the hypergeometric distribution. The plot below is the same as for Problem #1 at 0.12 degeneracy, but with the number of hits broken down for each k:
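One way to sketch this breakdown in R is to weight the hypergeometric probability of x hits by the binomial probability of k mismatches. The value of m below is an assumption (the post gives no number of important bases), and the variable names are hypothetical.

```r
# Joint proportion of oligos with k mismatches, x of which hit important bases.
n <- 32    # oligo length
d <- 0.12  # per-base degeneracy
m <- 8     # number of "important" bases (assumed value, for illustration only)
k <- 0:n   # mismatch counts
x <- 0:m   # hit counts

# dhyper(x, m, n - m, k): chance that x of the k mismatched positions
# fall among the m important ones, the rest among the n - m others.
hits <- outer(x, k, function(x, k) {
  dhyper(x, m, n - m, k) * dbinom(k, size = n, prob = d)
})
dimnames(hits) <- list(x = x, k = k)

print(round(hits[1:4, 1:6], 4))
```

Summing each column over x recovers the binomial proportions from Problem #1, which is a handy sanity check.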
I didn’t try super-hard to make the perfect graphs, but it did take some effort to make a stacked bar plot…
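For reference, a stacked bar plot of this kind can be drawn with base R's barplot(), which stacks the rows of a matrix within each column by default. This is only a sketch, not the code used for the original figure; m = 8 and the truncation at k = 12 are assumptions.

```r
# Stacked bars: one bar per mismatch count k, segments split by number of hits x.
n <- 32; d <- 0.12
m <- 8     # assumed number of important bases
k <- 0:12  # truncate the x-axis where proportions become negligible (assumed cutoff)
x <- 0:m

hits <- outer(x, k, function(x, k) {
  dhyper(x, m, n - m, k) * dbinom(k, size = n, prob = d)
})

# With a matrix and beside = FALSE (the default), rows are stacked per column.
bar_mids <- barplot(hits, names.arg = k, col = gray.colors(length(x)),
                    xlab = "Mismatches (k)", ylab = "Proportion of oligos",
                    legend.text = paste(x, "hits"))
```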
This entry was posted on Wednesday, November 25th, 2009 at 9:20 pm and is filed under DNMA vitamins, Genomic DNA, blood test, dna, dna test.