RPM, RPKM, FPKM, and TPM are the measures we most often use to quantify expression levels. So how exactly do they differ?
RPM/CPM
RPM/CPM: Reads/Counts of exon model per million mapped reads
Formula:
RPM = Total exon reads / Mapped reads (Millions)
One conclusion follows directly: RPM ignores gene length, so at the same expression level, the longer the gene, the more reads it collects.
So we calculate RPKM to remove the effect of gene length.
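The RPM formula above can be sketched in a few lines. This is a minimal illustration, not a reference implementation: the function name `rpm` and the read counts are made up for the example.

```python
def rpm(gene_reads, total_mapped_reads, scale=1_000_000):
    """Reads per million: gene_reads / (total_mapped_reads / scale)."""
    if total_mapped_reads <= 0:
        raise ValueError("total_mapped_reads must be positive")
    return gene_reads / (total_mapped_reads / scale)


# Hypothetical library with 20 million mapped reads.
total = 20_000_000

# Two hypothetical genes at the same expression level: the 2 kb gene
# collects twice the reads of the 1 kb gene, so its RPM is twice as large.
short_gene_rpm = rpm(500, total)    # 1 kb gene
long_gene_rpm = rpm(1000, total)    # 2 kb gene

print(short_gene_rpm, long_gene_rpm)
```

The doubled RPM for the longer gene is exactly the length bias that RPKM is meant to remove.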
RPKM
RPKM: Reads Per Kilobase of exon model per Million mapped reads
Applies to: single-end RNA-seq
Formula:
RPKM = Total exon reads / [Mapped reads (Millions) * Exon length (Kb)]
Example of Calculating RPKM

Gene B is twice as long as gene A, and that might explain why it always gets twice as many reads, regardless of replicate.
Sample 3 has far more reads than the other replicates, regardless of the gene.
RPKM Step 1: normalize for read depth

For this four-gene example, we scale the total read counts by 10 instead of 1,000,000.
Originally, 1,000,000 was picked just because it made the numbers look nice (i.e. they didn’t require too many decimal places).

RPM: scaled using the ‘per million’ factors.
RPKM Step 2: normalize for gene length

Reads are scaled for depth (M) and gene length (K).
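The two RPKM steps can be traced on a small table. The counts and gene lengths below are illustrative values chosen to match the description above (gene B twice as long as gene A, replicate 3 sequenced more deeply); they are assumptions, not numbers read from the original figure. The helper name `rpkm_table` is also hypothetical.

```python
# Illustrative data only (assumed, not taken from the original figure).
gene_length_kb = {"A": 2, "B": 4, "C": 1, "D": 10}
counts = {
    "rep1": {"A": 10, "B": 20, "C": 5, "D": 0},
    "rep2": {"A": 12, "B": 25, "C": 8, "D": 0},
    "rep3": {"A": 30, "B": 60, "C": 15, "D": 1},
}


def rpkm_table(counts, gene_length_kb, scale=1_000_000):
    """Return {sample: {gene: RPKM}} using the two-step recipe."""
    table = {}
    for sample, gene_counts in counts.items():
        # Step 1: sequencing depth of the sample, in units of `scale` reads.
        depth = sum(gene_counts.values()) / scale
        # Step 2: divide the depth-scaled count by gene length in kb.
        table[sample] = {
            gene: (n / depth) / gene_length_kb[gene]
            for gene, n in gene_counts.items()
        }
    return table


# As in the text, scale by 10 rather than 1,000,000 for this toy example.
toy_rpkm = rpkm_table(counts, gene_length_kb, scale=10)
for sample, values in toy_rpkm.items():
    print(sample, {g: round(v, 2) for g, v in values.items()})
```

With these made-up counts, genes A and B end up with the same value in replicate 1 even though B has twice the reads, because B is also twice as long.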
FPKM
RPKM and FPKM are two very closely related terms.
RPKM = Reads Per Kilobase Million
FPKM = Fragments Per Kilobase Million
RPKM is for single-end RNA-seq.
FPKM is for paired-end RNA-seq.
Differences
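For single-end data, RPKM and FPKM are essentially the same.
For paired-end data, if both reads of a pair are mapped, FPKM counts the pair as one fragment (RPKM would count it as 2); if only one read of the pair is mapped, that mapped read is counted as one fragment.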
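The counting rule for paired-end data can be made concrete with a small sketch. The input format (one pair of booleans per read pair, saying whether each mate mapped) and the function name are assumptions made for illustration; real pipelines derive this from alignment records.

```python
def count_reads_and_fragments(pairs):
    """pairs: iterable of (mate1_mapped, mate2_mapped) booleans for one gene.

    Returns (reads, fragments): every mapped mate is a read, while a pair
    with at least one mapped mate counts as a single fragment.
    """
    pairs = list(pairs)  # allow generators; we iterate twice
    reads = sum(int(m1) + int(m2) for m1, m2 in pairs)
    fragments = sum(1 for m1, m2 in pairs if m1 or m2)
    return reads, fragments


# Hypothetical read pairs: both mapped, only mate 1, only mate 2, neither.
example_pairs = [(True, True), (True, False), (False, True), (False, False)]
reads, fragments = count_reads_and_fragments(example_pairs)
print(reads, fragments)
```

Here the fully mapped pair contributes 2 to the read count but only 1 to the fragment count, which is the whole difference between the RPKM and FPKM numerators.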

TPM
TPM is like RPKM and FPKM, except the order of operations is switched.
Comparing the formulas for TPM and FPKM therefore shows that the denominator of FPKM does not account for gene length, so TPM fits our definition of relative expression better.
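For reference, the two formulas being compared can be written side by side. The notation is an assumption introduced here: X_i is the number of reads (or fragments) assigned to gene i, L_i is its exon length in kilobases, and N is the total number of mapped reads (or fragments) in the sample.

```latex
\[
\mathrm{FPKM}_i = \frac{X_i}{\dfrac{N}{10^{6}} \cdot L_i}
\qquad\qquad
\mathrm{TPM}_i = \frac{X_i / L_i}{\sum_j X_j / L_j} \times 10^{6}
\]
```

The sample-wide normalizer in FPKM is N, a raw count with no length term, while in TPM it is the sum of length-normalized counts over all genes.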
Example of Calculating TPM
TPM Step 1: normalize for gene length

RPK: scaled by gene length
TPM Step 2: normalize for sequencing depth

TPM: scaled by gene length and sequencing depth (M)
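The same two steps, in TPM order, can be sketched as follows. The counts and gene lengths are illustrative values (assumed, not read from the original figure), and `tpm_table` is a hypothetical helper name. As in the RPKM example, the scale is 10 rather than 1,000,000 to keep the numbers readable.

```python
# Illustrative data only (assumed, not taken from the original figure).
gene_length_kb = {"A": 2, "B": 4, "C": 1, "D": 10}
counts = {
    "rep1": {"A": 10, "B": 20, "C": 5, "D": 0},
    "rep2": {"A": 12, "B": 25, "C": 8, "D": 0},
    "rep3": {"A": 30, "B": 60, "C": 15, "D": 1},
}


def tpm_table(counts, gene_length_kb, scale=1_000_000):
    """Return {sample: {gene: TPM}}: length first, then depth."""
    table = {}
    for sample, gene_counts in counts.items():
        # Step 1: reads per kilobase (RPK) for every gene.
        rpk = {g: n / gene_length_kb[g] for g, n in gene_counts.items()}
        # Step 2: divide by the sample's total RPK, in units of `scale`.
        # (A sample with no reads at all would divide by zero here.)
        depth = sum(rpk.values()) / scale
        table[sample] = {g: v / depth for g, v in rpk.items()}
    return table


toy_tpm = tpm_table(counts, gene_length_kb, scale=10)
for sample, values in toy_tpm.items():
    total = sum(values.values())
    print(sample, {g: round(v, 2) for g, v in values.items()}, "sum =", round(total, 2))
```

Because the depth factor is computed from values that are already length-normalized, every sample's values add up to the chosen scale.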
RPKM vs TPM

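With TPM, every sample gets the same-sized pie: TPM values sum to the same total in each sample, whereas RPKM totals can differ from sample to sample.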
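One way to see the pie analogy is to rescale each sample's RPKM values by their own sum, which yields TPM. The sketch below assumes that relationship (TPM_i = RPKM_i / sum of RPKM × 10^6); the RPKM values are made-up numbers for two hypothetical samples, and `rpkm_to_tpm` is a hypothetical helper name.

```python
def rpkm_to_tpm(rpkm_values, scale=1_000_000):
    """Rescale one sample's RPKM values so that they sum to `scale`."""
    total = sum(rpkm_values)
    if total == 0:
        raise ValueError("sample has no expressed genes")
    return [v / total * scale for v in rpkm_values]


# Made-up RPKM values for the same four genes in two samples.
sample_a = [1.43, 1.43, 1.43, 0.0]
sample_b = [1.33, 1.39, 1.78, 0.0]

# The RPKM "pies" have different sizes ...
print(round(sum(sample_a), 2), round(sum(sample_b), 2))

# ... while the TPM pies are the same size in both samples.
tpm_a = rpkm_to_tpm(sample_a)
tpm_b = rpkm_to_tpm(sample_b)
print(round(sum(tpm_a)), round(sum(tpm_b)))
```

Since each TPM value is a slice of a pie of fixed size, the proportion of reads going to a gene can be compared across samples directly, which is not true of RPKM.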

