Principal components analysis of protein and nucleotide alignments in Jalview

October 29, 2019

In this Jalview Online Training video, I will
show how to perform principal components analysis on protein alignments. Principal components analysis is an alternative
to cluster analysis as a way to visualise groups of similar sequences. To demonstrate this feature, I will import
a group of sequences into Jalview that have already been aligned. The URL is
shown on the screen and is in the notes below. To analyse the alignment, I go to the Calculate
drop down menu and select Principal Component Analysis. When the calculation is completed, a PCA viewer
opens and displays the results. Each sequence is represented by a small square. If I place the cursor over a square, a tool
tip appears identifying the sequence. The axes can be rotated by clicking and dragging with
the left mouse button, and zoomed using the mouse’s scroll wheel or the up and down arrow
keys. I will generate a tree by going to the Calculate
drop down menu and selecting Calculate Tree, then Neighbour
Joining Using BLOSUM62. A tree will appear in a new window. Colouring groups of sequences in the Tree
window will also colour the same sequences in the PCA window. Sequences can be selected by holding the [CTRL] key and clicking on the sequence, and deselected from the alignment window. Similar sequences lie close to each other
in 3D space. Labels can be added by going to the View menu
and selecting the Show Labels option. The background colour can be changed by going
to the View menu, and clicking Background Colour and selecting
the colour of choice. By going to the File drop down menu, the PCA
graph can be exported as an EPS or PNG image by selecting the appropriate Save As option. Data and coordinates can be exported using
options available in the File menu. By default, the protein principal components
analysis uses BLOSUM62 pairwise substitution scores. Note: PCA can also be performed on nucleotide
alignments. The calculation method can be modified by
going to the Change Parameters drop down menu. A final point: the principal components analysis
calculation is computationally expensive, so there may be memory issues if the alignment
is large. For more information, please go to the Jalview
online documentation or read the Jalview manual, available on our
website. Goodbye.
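
For readers who want to see the general idea behind this kind of calculation outside Jalview, here is a minimal Python sketch. It is not Jalview's implementation, and the details of Jalview's own calculation may differ. It builds a pairwise similarity matrix from an alignment, double-centres it, and extracts the leading components by power iteration (the kernel form of PCA). All function names are hypothetical, and a simple identity score (1 for a match, 0 otherwise) stands in for BLOSUM62 to keep the example self-contained.

```python
import math

def identity_score(a, b):
    # Stand-in scoring (an assumption): 1 for identical residues, 0 otherwise.
    # This is NOT BLOSUM62; it only keeps the example self-contained.
    return 1.0 if a == b else 0.0

def pairwise_similarity(seqs, score=identity_score):
    """Sum the per-column score for every pair of aligned sequences."""
    n = len(seqs)
    return [[sum(score(a, b) for a, b in zip(seqs[i], seqs[j]))
             for j in range(n)] for i in range(n)]

def double_centre(m):
    """Subtract row and column means and add back the grand mean."""
    n = len(m)
    row_mean = [sum(r) / n for r in m]
    col_mean = [sum(m[i][j] for i in range(n)) / n for j in range(n)]
    grand = sum(row_mean) / n
    return [[m[i][j] - row_mean[i] - col_mean[j] + grand
             for j in range(n)] for i in range(n)]

def principal_coordinates(m, k=2, iters=500):
    """Top-k coordinates of a symmetric matrix via power iteration with deflation."""
    n = len(m)
    m = [row[:] for row in m]              # work on a copy
    coords = [[0.0] * k for _ in range(n)]
    for c in range(k):
        # Deterministic, non-constant start vector, normalised to unit length.
        v = [math.sin(i + 1.0 + c) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
        for _ in range(iters):
            w = [sum(m[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:               # nothing left to extract
                break
            v = [x / norm for x in w]
        # Rayleigh quotient gives the eigenvalue estimate for the unit vector v.
        lam = sum(v[i] * m[i][j] * v[j] for i in range(n) for j in range(n))
        if lam < 1e-9:
            break
        for i in range(n):
            coords[i][c] = v[i] * math.sqrt(lam)
            for j in range(n):
                m[i][j] -= lam * v[i] * v[j]   # deflate the found component
    return coords

# Two made-up groups of aligned sequences (illustrative data only).
aligned = ["ACDEFGHIK", "ACDEFGHIR", "ACDEFGHLK",
           "WYTSPQNMK", "WYTSPQNMR", "WYTSPQNLK"]
points = principal_coordinates(double_centre(pairwise_similarity(aligned)))
for seq, (x, y) in zip(aligned, points):
    print(f"{seq}  {x:8.3f} {y:8.3f}")
```

In this sketch, the similar sequences in each made-up group should land on the same side of the first component, which mirrors the clustering seen in the PCA viewer. The pairwise matrix has one entry for every pair of sequences, so its size grows with the square of the number of sequences, which is one reason large alignments are demanding.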
