系统发育学/比对准备/全局比对
外观
全局比对方法旨在比对两个序列的整个长度。如果没有序列演化的模型,这些方法是非常通用的,并且可以应用于任何序列(不仅仅是生物学的)相似性。
要讨论的第一个算法是 Needleman-Wunsch。
requires: seq1 = first sequence; seq2 = second sequence //scoring matrices values match = 1 mismatch = -1 gap = -1 sc_matrix //initialize scoring matrix the size of which is the largest sequence length p_matrix //initialize pointing matrix the size of which is the largest sequence length sc_matrix[0][0] = 0 p_matrix[0][0] = 0 // 0 = no where; 1 = left; 2 = right; 3 = down; 4 = up; diagonal = 5; for i=0 to length of seq1 sc_matrix[0][i] = gap * i p_matrix[0][i] = 1 end for for i=0 to length of seq2 sc_matrix[i][0] = gap * i p_matrix[i][0] = 1 end for for i=0 to length of seq2 for j=0 to length of seq1 subseq1 <= seq1[j] subseq2 <= seq2[i] if subseq1 = subseq2 then diag <= sc_matrix[i][j] + match else diag <= sc_matrix[i][j] + mismatch end if up <= sc_matrix[i][j] + gap left <= sc_matrix[i][j] + gap if diag >= up then if diag>= left then sc_matrix[i][j] <= diag p_matrix[i][j] <= 5 else sc_matrix[i][j] <= left p_matrix[i][j] <= 1 end if else if diag>= left then sc_matrix[i][j] <= up p_matrix[i][j] <= 4 else sc_matrix[i][j] <= left p_matrix[i][j] <= 1 end if end for end for al_seq1 <= "" al_seq2 <= "" keep_going <= true i <= seq1.length j <= seq2.length while keep_going = true if p_matrix = 0 keep_going = false end if if p_matrix[i][j] = 5 then al_seq1.add(seq1[i]) al_seq2.add(seq1[j]) i <= i - 1 j <= j - 1 else if p_matrix[i][j] = 1 then al_seq1.add(seq1[i]) al_seq2.add("-") i <= i - 1 else if p_matrix[i][j] = 4 then al_seq1.add("-") al_seq2.add(seq1[j]) j <= j - 1 end if end while al_seq1 <= al_seq1.reverse al_seq2 <= al_seq2.reverse