Sequence analysis is the application of Information Technologies to Molecular Biology. It deals with biological sequences and processes them to extract significant information that may yield new insights into, and guidelines for, the understanding of biological organisms.
Basics for sequence analysis
Proteins
A protein is typically built from a series of basic building blocks called amino acids, chained together in a linear sequence. Amino acids come in a variety of shapes and properties: they may be small or bulky, hydrophobic or hydrophilic, electrically charged or neutral, and so on, which allows very complex shapes and interactions to be produced.
Amino acids are commonly referred to by name or by an abbreviation, usually of three letters or one letter. This allows for more efficient descriptions of how they are chained together to build a protein:
Amino acid | 3-letter | 1-letter |
Neutral-Nonpolar | ||
Glycine | Gly | G |
L-Alanine | Ala | A |
L-Valine | Val | V |
L-Isoleucine | Ile | I |
L-Leucine | Leu | L |
L-Phenylalanine | Phe | F |
L-Proline | Pro | P |
L-Methionine | Met | M |
Neutral-Polar | ||
L-Serine | Ser | S |
L-Threonine | Thr | T |
L-Tyrosine | Tyr | Y |
L-Tryptophan | Trp | W |
L-Asparagine | Asn | N |
L-Glutamine | Gln | Q |
L-Cysteine | Cys | C |
Acidic | ||
L-Aspartic acid | Asp | D |
L-Glutamic acid | Glu | E |
Basic | ||
L-Lysine | Lys | K |
L-Arginine | Arg | R |
L-Histidine | His | H |
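To illustrate how these abbreviations make a chain easier to write down and process, the following is a minimal Python sketch. It encodes the table above as a dictionary and converts a protein written in three-letter codes into the more compact one-letter form. The names `AMINO_ACIDS`, `to_one_letter` and `group_counts`, the hyphen-separated input format, and the short example peptide are assumptions made for illustration only; they do not come from any particular library or real protein.

```python
# Three-letter code -> (one-letter code, property group), following the table above.
AMINO_ACIDS = {
    "Gly": ("G", "neutral-nonpolar"),
    "Ala": ("A", "neutral-nonpolar"),
    "Val": ("V", "neutral-nonpolar"),
    "Ile": ("I", "neutral-nonpolar"),
    "Leu": ("L", "neutral-nonpolar"),
    "Phe": ("F", "neutral-nonpolar"),
    "Pro": ("P", "neutral-nonpolar"),
    "Met": ("M", "neutral-nonpolar"),
    "Ser": ("S", "neutral-polar"),
    "Thr": ("T", "neutral-polar"),
    "Tyr": ("Y", "neutral-polar"),
    "Trp": ("W", "neutral-polar"),
    "Asn": ("N", "neutral-polar"),
    "Gln": ("Q", "neutral-polar"),
    "Cys": ("C", "neutral-polar"),
    "Asp": ("D", "acidic"),
    "Glu": ("E", "acidic"),
    "Lys": ("K", "basic"),
    "Arg": ("R", "basic"),
    "His": ("H", "basic"),
}

# Reverse lookup: one-letter code -> property group.
ONE_LETTER_GROUP = {one: group for one, group in AMINO_ACIDS.values()}


def to_one_letter(three_letter_sequence):
    """Convert a hyphen-separated chain such as 'Met-Ala-Lys' into 'MAK'.

    The hyphen-separated input format is an assumption of this sketch.
    """
    codes = three_letter_sequence.split("-")
    return "".join(AMINO_ACIDS[code.strip().capitalize()][0] for code in codes)


def group_counts(one_letter_sequence):
    """Count how many residues of a one-letter sequence fall in each group."""
    counts = {}
    for residue in one_letter_sequence.upper():
        group = ONE_LETTER_GROUP[residue]
        counts[group] = counts.get(group, 0) + 1
    return counts


# Hypothetical five-residue peptide, used only as an example.
peptide = to_one_letter("Met-Ala-Lys-Asp-Gly")
print(peptide)                # MAKDG
print(group_counts(peptide))  # {'neutral-nonpolar': 3, 'basic': 1, 'acidic': 1}
```

A plain dictionary is enough here because the set of standard amino acids is small and fixed. An unknown code raises a `KeyError`, which is probably the desired behaviour for a sketch of this size.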