1.5 N- and C-terminal sequencing

Determining the N- and C-termini of biologics is an essential part of the biochemical characterization required by regulatory agencies. Gemini Bio provides LC-MS based protein N- and C-terminus determination services, which combine several techniques, including multiple enzyme digestions, N-terminal and C-terminal chemical labeling, and bioinformatics analysis, to achieve conclusive and quantitative determination of protein N- and C-termini. We are able to determine protein N- and C-termini even in difficult situations, such as a modified N- or C-terminus, heterogeneous N- or C-termini, or cases where only low amounts of protein sample are available.

(i) N-terminal determination

Although Edman degradation is widely used for protein N-terminal determination, there are situations in which a protein N-terminus is difficult or even impossible to determine by Edman degradation, and LC-MS becomes the method of choice:

  • Protein N-terminus is capped or modified

Many proteins in mammalian cells have capped N-termini. The primary amine at a protein N-terminus can be modified by N-myristoylation, N-acylation, and other post-translational modifications. When glutamine is the N-terminal amino acid of a mAb, it is often converted to pyroglutamic acid in vivo. In these situations, Edman degradation cannot be carried out directly, because it requires a free N-terminal amine.

  • Protein sample contains heterogeneous N-termini

Many proteins contain heterogeneous N-termini due to proteolytic cleavage at different amino acids. These N-termini may differ by only a few amino acids, so it is often difficult to separate the protein species from one another. Direct Edman degradation of such a mixture may give incomplete or incorrect N-terminal information, and LC-MS is often the preferred method.
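As a rough illustration of how heterogeneous N-termini might be reported quantitatively, the sketch below converts LC-MS peak areas of N-terminal peptide variants into relative percentages. The variant names and peak areas are made-up values for illustration only, and the function name `relative_abundance` is hypothetical; this is not a description of Gemini Bio's software. Because different peptides can ionize with different efficiency, area percentages of this kind are estimates rather than absolute amounts.

```python
def relative_abundance(peak_areas):
    """Convert peak areas of N-terminal variants into percentages of the total."""
    total = sum(peak_areas.values())
    if total <= 0:
        raise ValueError("total peak area must be positive")
    return {name: 100.0 * area / total for name, area in peak_areas.items()}


# Hypothetical peak areas for three N-terminal variants of one protein.
areas = {
    "full-length N-terminus": 6.0e6,
    "truncated by 2 residues": 3.0e6,
    "truncated by 3 residues": 1.0e6,
}

for name, percent in relative_abundance(areas).items():
    print(f"{name}: {percent:.1f}%")
```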

Methods of N-terminal determination by LC-MS

1. Peptide Mapping with Multiple Digestions

In this method, a protein sample is digested with multiple enzymes, and the N-terminal peptides from each digestion are sequenced by MS/MS. Cleavage by less specific proteolytic enzymes (such as chymotrypsin) often gives a series of N-terminal peptides that share a common N-terminal amino acid. Combined with N-terminal results from digestion with enzymes that have highly specific cleavage sites (such as trypsin), these data allow a protein N-terminus or N-termini to be determined confidently.
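The reasoning behind combining digests can be sketched in a few lines of Python. The idea illustrated here is that a peptide whose first residue was not produced by the enzyme's own cleavage rule, and which is seen in more than one digest, points to the protein N-terminus. The reference sequence, the peptide lists, the simplified cleavage rules, and the function name `candidate_n_termini` are all assumptions made for this example, not the actual analysis pipeline.

```python
from collections import defaultdict

# Simplified cleavage rules (assumption): each enzyme cuts after these residues.
# Real enzymes have exceptions (for example, proline effects) that are ignored here.
CLEAVES_AFTER = {"trypsin": set("KR"), "chymotrypsin": set("FYWL")}


def candidate_n_termini(protein, digests):
    """Map each peptide start that the enzyme rule cannot explain to its supporting enzymes."""
    support = defaultdict(set)
    for enzyme, peptides in digests.items():
        for pep in peptides:
            start = protein.find(pep)  # first occurrence only; enough for a sketch
            if start < 0:
                continue  # peptide does not match the reference sequence
            # The start is "enzymatic" if the preceding residue is a cleavage site.
            enzymatic = start > 0 and protein[start - 1] in CLEAVES_AFTER[enzyme]
            if not enzymatic:
                support[start].add(enzyme)
    return dict(support)


# Hypothetical reference sequence and identified peptides.
protein = "MKTAYIAKQRQISFVKSHFSRQ"
digests = {
    "trypsin": ["AYIAK", "QISFVK", "SHFSR"],
    "chymotrypsin": ["AYIAKQRQISF", "AYIAKQRQISFVKSHF", "VKSHF"],
}

result = candidate_n_termini(protein, digests)
for start, enzymes in sorted(result.items()):
    print(f"residue {start + 1} ({protein[start]}): supported by {sorted(enzymes)}")
```

In this made-up data set, both digests contain peptides starting at the same alanine, and neither enzyme's rule explains a cut before it, so that residue is the only candidate N-terminus reported.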

2. Chemical labeling

We use an amine-reactive chemical agent to label the primary amine group(s) at the protein N-terminus. We have developed a software tool that distinguishes labeling on a Lys side chain from labeling at the protein N-terminus and specifically identifies the N-terminal peptide.
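A minimal sketch of how such a distinction could be made is shown below. It assumes that labeling is performed on the intact protein before digestion, so that only the original protein N-terminus carries the label on a peptide's N-terminal amine, while internal peptides can carry it only on Lys side chains. The `LabeledPeptide` class, the site-numbering convention, and the example peptides are hypothetical and do not describe the actual software tool.

```python
from dataclasses import dataclass

ALPHA_AMINE = 0  # hypothetical convention: site 0 is the peptide's N-terminal amine


@dataclass(frozen=True)
class LabeledPeptide:
    sequence: str
    label_sites: frozenset  # 0 = N-terminal amine; i >= 1 = side chain of residue i (1-based)


def classify(peptide):
    """Classify a peptide by where the amine-reactive label was found."""
    for site in peptide.label_sites:
        if site != ALPHA_AMINE and peptide.sequence[site - 1] != "K":
            raise ValueError(f"label at position {site} is not on a Lys residue")
    if ALPHA_AMINE in peptide.label_sites:
        return "protein N-terminal candidate"
    if peptide.label_sites:
        return "internal peptide (Lys labeling only)"
    return "unlabeled"


# Hypothetical identifications from a labeled and digested sample.
hits = [
    LabeledPeptide("AYIAK", frozenset({0, 5})),  # labeled on N-terminal amine and on Lys5
    LabeledPeptide("QISFVK", frozenset({6})),    # labeled on Lys6 only
    LabeledPeptide("SHFSR", frozenset()),        # no label found
]

for hit in hits:
    print(hit.sequence, "->", classify(hit))
```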

(ii) C-terminal determination

Currently there is no standard chemical method for sequencing a protein C-terminus, and LC-MS is often the preferred method for protein C-terminal determination.


  • Peptide Mapping with Multiple Digestions

Similar to N-terminus mapping, LC-MS based on multiple proteolytic digestions is used to determine the protein C-terminus. The overlapping C-terminal peptides resulting from different enzymatic digestions can give a conclusive result for the protein C-terminus. In addition to normal C-terminal mapping, the procedure can also be applied to identifying modified protein C-termini and heterogeneous C-termini. The method suitable for protein samples

  • C-terminal Chemical Labeling

We have also developed a method based on chemical labeling of the carboxyl group at a protein’s C-terminus, along with a software tool that specifically identifies peptides carrying the chemical tag at the peptide C-terminal carboxyl group.
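The C-terminal peptide mapping described above can be sketched in the same spirit: peptide ends that an enzyme's cleavage rule cannot explain, and that appear in more than one digest, indicate a protein C-terminus, and more than one such end suggests heterogeneity. The sequence, peptides, simplified cleavage rules, and the function name `c_terminal_variants` are illustrative assumptions only.

```python
# Simplified cleavage rules (assumption): each enzyme cuts after these residues.
CLEAVES_AFTER = {"trypsin": set("KR"), "chymotrypsin": set("FYWL")}


def c_terminal_variants(protein, digests):
    """Map 'residues missing from the expected C-terminus' to the supporting enzymes."""
    variants = {}
    for enzyme, peptides in digests.items():
        for pep in peptides:
            start = protein.rfind(pep)  # last occurrence only; enough for a sketch
            if start < 0:
                continue  # peptide does not match the reference sequence
            end = start + len(pep)  # exclusive end index in the reference
            # An internal end is "enzymatic" if the peptide's last residue is a cleavage site.
            enzymatic = end < len(protein) and pep[-1] in CLEAVES_AFTER[enzyme]
            if not enzymatic:
                variants.setdefault(len(protein) - end, set()).add(enzyme)
    return variants


# Hypothetical reference sequence and identified peptides.
protein = "AYIAKQRQISFVESLSPGK"
digests = {
    "trypsin": ["AYIAK", "QISFVESLSPGK", "QISFVESLSPG"],
    "chymotrypsin": ["IAKQRQISF", "VESLSPGK", "VESLSPG"],
}

result = c_terminal_variants(protein, digests)
for missing, enzymes in sorted(result.items()):
    print(f"{missing} residue(s) missing: supported by {sorted(enzymes)}")
```

With this made-up input, both digests support an intact C-terminus and a variant lacking the final residue. A truncated end that happens to follow a trypsin site would be ambiguous in the trypsin digest alone, which is one reason a second enzyme is useful.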