Notes on practical R operations from Bioinformatics Data Skills

################## 2019-01-18 14:34:07 ##########################

example("pheatmap") # run the examples from a function's help page
help.search("heatmap") # search the help system for functions matching a keyword
library(help="pheatmap") # show detailed information about a package
ls() # list the objects we’ve created in the global environment
length(x) # return the length of the vector x

Alt + - (RStudio on Windows) is a shortcut that inserts the assignment operator “<-”

Features

  • R does not have a type for a single value (known as a scalar) such as 3.1 or “AGCTACGACT.” Rather, these values are stored in a vector of length 1.
    (In other words, even a single string is a character vector of length 1.)
  • R’s vectors are the basis of one of R’s most important features: vectorization. Vectorization allows us to loop over vectors elementwise, without the need to write an explicit loop.
    (That is, an operation is applied to every element without an explicit loop.)
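
The two points above can be illustrated with a minimal sketch. The variable names (s, v, squared_vec, squared_loop) are made up for this example and do not come from the book:

```r
# A "scalar" is simply a vector of length 1
s <- 3.1
length(s)       # 1
is.vector(s)    # TRUE

# Vectorized arithmetic: the operation is applied to every element
v <- c(1, 2, 3, 4)
squared_vec <- v^2

# The equivalent explicit loop, for comparison only
squared_loop <- numeric(length(v))
for (i in seq_along(v)) {
  squared_loop[i] <- v[i]^2
}

identical(squared_vec, squared_loop)   # TRUE
```

The vectorized form is shorter and is usually the idiomatic choice in R.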

################ 2019-01-22 09:48:01 #######################

  • When we assign a value in our R session, we’re assigning it to an environment known
    as the global environment.
  • Calling the function search() returns where R looks when searching for the value of a variable, which includes the global environment (.GlobalEnv) and attached packages.
    (search() lists the search path: .GlobalEnv first, then the attached packages.)
  • If one vector is longer than the other, R will recycle the values in the
    shorter vector. This is intentional behavior, so R won’t warn you when the longer
    length is a multiple of the shorter one; it warns only when it is not, as in the
    second example below.
> x <- c(1,2,3)
> x + 1
[1] 2 3 4
> y <- c(1,2)
> x + y # the longer length is not a multiple of the shorter one
[1] 2 4 4
Warning message:
In x + y : longer object length is not a multiple of shorter object length
  • R will return a missing value (NA; more on this later) if you try to access an
    element in a position that’s greater than the number of elements.
> z <- c(3.4, 2.2, 0.4, -0.4, 1.2)
> z[c(2, 1, 10)]
[1] 2.2 3.4 NA

It’s also possible to exclude certain elements from vectors using negative indexes
(a minus sign skips the elements at those positions).
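
A small sketch of negative indexing follows. The vector z is an assumption here: its values are inferred from the order() and sort output shown in these notes, not copied from a definition in them.

```r
# z as inferred from the outputs in these notes (an assumption)
z <- c(3.4, 2.2, 0.4, -0.4, 1.2)

without_first <- z[-1]          # drop the first element
without_ends  <- z[-c(1, 5)]    # drop the first and the last element

without_first   # 2.2 0.4 -0.4 1.2
without_ends    # 2.2 0.4 -0.4
```

Negative and positive indexes cannot be mixed in a single subscript; R raises an error if you try.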

> order(z)
[1] 4 3 5 2 1
> z[order(z)]
[1] -0.4 0.4 1.2 2.2 3.4
> order(z, decreasing=TRUE)
[1] 1 2 5 3 4
> z[order(z, decreasing=TRUE)] # order() returns the indexes that sort the vector
[1] 3.4 2.2 1.2 0.4 -0.4
> sort(b, decreasing = TRUE) # sort() returns the sorted values themselves
  b  a1  a3  a2   c 
5.4 3.4 2.0 1.0 0.4

Again, often we use functions to generate indexing vectors for us. For example, one
way to resample a vector (with replacement) is to randomly sample its indexes using
the sample() function:
[1] http://www.itdecent.cn/p/38d0a44630f8
[2] https://bbs.pinggu.org/thread-3068145-1-1.html

> set.seed(0) # we set the random number seed so this example is reproducible
> i <- sample(length(z), replace=TRUE) # replace: whether to sample with replacement
> i
[1] 5 2 2 3 5
> z[i]
[1] 1.2 2.2 2.2 0.4 1.2

NA is R’s built-in value to represent missing data.
NULL represents not having a value.
-Inf, Inf: negative and positive infinity, just as they sound.
NaN stands for “not a number,” which can occur in computations that don’t
return numbers, e.g., 0/0 or Inf + -Inf.

> is.nan(0/0)
[1] TRUE
> x <- c()
> is.null(x)
[1] TRUE
> y <- c(1,2,3)
> is.na(y[4])
[1] TRUE

Because all elements in a vector must have the same data type, R will silently coerce elements so that they share one type.
(This coercion happens automatically when the vector is built.)
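
As a minimal sketch of this coercion (the variable names are made up for the example), mixing types in c() gives a vector of the most general type involved:

```r
# Mixing numeric, character and logical: everything becomes character
mixed_chr <- c(1, "a", TRUE)
class(mixed_chr)   # "character"
mixed_chr          # "1" "a" "TRUE"

# Mixing logical and numeric: TRUE becomes 1
mixed_num <- c(TRUE, 2.5)
class(mixed_num)   # "numeric"
mixed_num          # 1.0 2.5
```

No warning is issued in either case, so a stray string in a numeric column can quietly turn the whole column into text.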

  • When called on numeric values, summary() returns a numeric summary with the
    quartiles and the mean.
  • R’s data-reading functions (such as read.csv() and read.delim()) can read gzipped
    files directly, so there’s no need to uncompress gzipped files first.
  • The reshape2 package provides functions to reshape data: the function melt()
    turns wide data into long data, and dcast() turns long data into wide data
    (cast() is the name used in the older reshape package).
  • One nice feature of data.frame() is that if you provide vectors as named arguments, data.frame() will use these names as column names.
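
A short sketch of two of the points above, using base R only. The toy dataframe and its column names (sample, depth) are invented for illustration. It builds a dataframe from named arguments, writes it to a gzipped CSV in a temporary file, and reads it back without decompressing:

```r
# Named arguments become column names
df <- data.frame(sample = c("a", "b", "c"), depth = c(2.5, 3.1, 1.8))
colnames(df)   # "sample" "depth"

# Write a gzipped CSV to a temporary file
tmp <- tempfile(fileext = ".csv.gz")
con <- gzfile(tmp, "w")
write.csv(df, con, row.names = FALSE)
close(con)

# read.csv() detects the compression and reads the file directly
df_back <- read.csv(tmp)

# summary() on a numeric column reports quartiles and the mean
depth_summary <- summary(df_back$depth)
depth_summary
```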
################ 2019-01-23 09:29:13 #######################

Omitting the row index retrieves all rows, and omitting the column index retrieves all columns.
> y <- cbind(x1 = 3, x2 = c(4:1))
> y
     x1 x2
[1,]  3  4
[2,]  3  3
[3,]  3  2
[4,]  3  1
> y['x1']
[1] NA
> y[1,'x1']
x1 
 3 
> y[,'x1'] 
[1] 3 3 3 3
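
The same indexing rules can be tried on a dataframe. This is a sketch only: df mirrors the matrix y above, and the other variable names are made up for the example.

```r
# Matrix from the notes, plus a dataframe with the same contents
y  <- cbind(x1 = 3, x2 = c(4:1))
df <- data.frame(x1 = 3, x2 = 4:1)

row2   <- df[2, ]       # row index only: row 2 with all columns
col_x2 <- df[, "x2"]    # column index only: all rows of x2, as a vector

# On a matrix, a single name without a comma is treated as a vector index,
# and the matrix has no element names, hence the NA seen above
no_comma <- y["x1"]
```

Here row2 is a one-row dataframe, col_x2 is the plain vector 4 3 2 1, and no_comma is NA.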
  • It’s a good idea to avoid referring to specific dataframe rows in your
    analysis code.
  • From summary(), we see that the number of SNPs per window (total.SNPs) varies considerably across the windows on chromosome 20:
> summary(d$total.SNPs)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  0.000   3.000   7.000   8.906  12.000  93.000
  • Remember, columns of a dataframe are just vectors. If you only need the data from
    one column, just subset it as you would a vector:
  • Note that there’s no need to use a comma in the bracket because d$percent.GC is a vector, not a two-dimensional dataframe.
> d$percent.GC[d$Pi > 16]
[1] 39.1391 38.0380 36.8368 36.7367 43.0430 41.1411 [...]

Thus, d[d$Pi > 3, ] is identical to d[which(d$Pi > 3), ] (as long as d$Pi contains no NA values):

> d$Pi > 3
[1] FALSE TRUE FALSE TRUE TRUE TRUE [...]
> which(d$Pi > 3)
[1] 2 4 5 6 7 10 [...]
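
The relationship between the logical vector and which() can be checked on a toy dataframe. The data below are invented, and an NA is included on purpose to show the one case where the two forms differ:

```r
# Toy data with one missing value (hypothetical values)
d_toy <- data.frame(Pi = c(1.5, 4.2, NA, 7.0))

keep <- d_toy$Pi > 3    # FALSE TRUE NA TRUE
idx  <- which(keep)     # 2 4 (which() drops the NA)

by_logical <- d_toy[keep, , drop = FALSE]   # keeps an extra all-NA row
by_which   <- d_toy[idx, , drop = FALSE]    # only the rows that are TRUE

nrow(by_logical)   # 3
nrow(by_which)     # 2
```

When the column has no missing values the two results are the same, which is the situation described in the notes.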

subset() takes two arguments: the dataframe to operate on, and then the conditions a row must meet
to be included. With subset(), d[d$Pi > 16 & d$percent.GC > 80, ] can be expressed as:

> subset(d, Pi > 16 & percent.GC > 80)
start end total.SNPs total.Bases depth [...]
58550 63097001 63098000 5 947 2.39 [...]
  • Note that we (somewhat magically) don’t need to quote column names. This is
    because subset() follows special evaluation rules, and for this reason, subset() is
    best used only for interactive work.
> subset(d, Pi > 16 & percent.GC > 80,
+        c(start, end, Pi, percent.GC, depth))
         start      end     Pi percent.GC depth
58550 63097001 63098000 41.172    82.0821  2.39
58641 63188001 63189000 16.436    82.3824  3.21
58642 63189001 63190000 41.099    80.5806  1.89
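
A self-contained sketch of the same idea on invented data. d_toy and its values are hypothetical; only the column names follow the notes. It compares bracket subsetting with subset(), and uses the optional third argument to pick columns:

```r
# Toy stand-in for the dataframe d used in the notes
d_toy <- data.frame(
  start      = c(1, 1001, 2001),
  Pi         = c(20.1, 3.0, 41.2),
  percent.GC = c(82.1, 45.0, 79.9)
)

# Bracket form: column names must be written as d_toy$...
by_bracket <- d_toy[d_toy$Pi > 16 & d_toy$percent.GC > 80, ]

# subset() form: bare column names are looked up inside d_toy
by_subset <- subset(d_toy, Pi > 16 & percent.GC > 80)

all.equal(by_bracket, by_subset)   # TRUE

# Third argument selects the columns to return
cols <- subset(d_toy, Pi > 16, c(start, Pi))
```

Inside functions or scripts the bracket form is the safer choice, for the evaluation-rule reason given above.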

#####################ggplot2##################

  • ggplot2 works exclusively with dataframes, so you’ll need to get your data tidy and into a dataframe before visualizing it with ggplot2.
  • Each layer updates our plot by adding geometric objects such as the points in a scatterplot, or the lines in a line plot.
    geom = geometric (as in geometric object)
    aes = aesthetic
  • We specify the mapping of aesthetic attributes to columns in our dataframe using the function aes().
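
A minimal sketch of these pieces together, assuming the ggplot2 package is installed. The toy dataframe and its column names (position, diversity) are invented for illustration:

```r
library(ggplot2)

# ggplot2 expects a tidy dataframe (hypothetical toy data)
d_toy <- data.frame(
  position  = c(1, 2, 3, 4),
  diversity = c(0.5, 1.2, 0.8, 2.0)
)

# One layer: points, with x and y mapped to columns via aes()
p <- ggplot(d_toy) + geom_point(aes(x = position, y = diversity))

# Typing p (or calling print(p)) at the console draws the plot
```

Further layers, such as geom_line() or geom_smooth(), would be added with additional + terms.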