I need a function that calculates the Levenshtein distance between two strings. Unfortunately, I'm not sure whether recursion between cases is best, or if mapping out each change in memory before choosing is best. I'm using it to find similarities in base sequences of RNA. Any advice?
Since you'll be calculating the differences between RNA strands, I don't think we'll need to account for differences in length. Since you're finding similarities between the two, we'll probably need a list to keep track of where they differ (or in this case, where they match). The easiest method would probably be to loop over each position of the strings and record the comparison in a list, like:
:length(str1→dim(L1
:for(A,1,length(str1
:sub(str1,A,1)=sub(str2,A,1→L1(A
:end
We can probably put a str1=str2 check at the very beginning in case the two strings are identical.
That's what I was thinking, since I guess there will not be a shorter Levenshtein distance this way… but wait, take "stop" and "tops". The Levenshtein distance is 2 (delete the leading s, then insert an s at the end), but this program would count all 4 positions as mismatches (L1 would be all zeros, since it records matches). Maybe I don't need Levenshtein distance at all for RNA… but I do want to account for mutations. Is there a low-memory way to do it for strings of different lengths?
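One possible low-memory approach, sketched here in Python for readability rather than in calculator code: the usual table-based Levenshtein calculation only ever looks at the previous row, so two lists of length len(shorter string)+1 are enough, and the strings can have different lengths. The function name `levenshtein` and the variables `prev` and `curr` are just illustrative names, and porting this to two calculator lists is an assumption I haven't tested on a device.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance (insert, delete, substitute) using only two rows."""
    # Keep the shorter string along the row so the lists stay small.
    if len(a) < len(b):
        a, b = b, a
    # prev[j] = distance between the empty prefix of a and the first j chars of b.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        # curr[0] = distance between the first i chars of a and the empty string.
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # match or substitute
            ))
        prev = curr  # the old row is no longer needed
    return prev[-1]


print(levenshtein("stop", "tops"))  # prints 2
```

Memory use grows with the length of the shorter strand only, while the running time is still proportional to the product of the two lengths, so long sequences may be slow on a calculator.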