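When working with single-cell data you may run into the Loom format, which is typically required as input by certain tools.
1. Creating a Loom file
You need to supply a matrix (a numpy ndarray or scipy sparse matrix) and two dictionaries describing the row and column attributes (with attribute names as keys, and numpy ndarrays as values).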
import numpy as np
import loompy
filename = "test.loom"
matrix = np.arange(10000).reshape(100,100)
row_attrs = { "SomeRowAttr": np.arange(100) }
col_attrs = { "SomeColAttr": np.arange(100) }
loompy.create(filename, matrix, row_attrs, col_attrs)
Occasionally you may hit an error like this:
OSError: Unable to create file (file locking disabled on this file system (use HDF5_USE_FILE_LOCKING environment variable to override), errno = 38, error message = 'Function not implemented')
To work around it, temporarily change the environment setting by entering the following directly in the terminal:
export HDF5_USE_FILE_LOCKING='FALSE'
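The same setting can also be applied from inside Python instead of the shell. The snippet below is a minimal sketch, assuming the variable needs to be in place before loompy (and therefore the underlying HDF5 library) is imported; the commented-out import only marks where that import would go.

```python
import os

# Disable HDF5 file locking for this process.
# Assumption: set this before loompy / h5py is imported,
# so the HDF5 library sees the value from the start.
os.environ["HDF5_USE_FILE_LOCKING"] = "FALSE"

# import loompy  # import only after the variable has been set
```

Unlike export, this only affects the current Python process and any child processes it starts.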
loompy also supports 10x Genomics data:
loompy.create_from_cellranger(folder, output_filename)
2. Inspecting and manipulating a Loom file
ds = loompy.connect("test.loom")
Once all of the operations below are finished, remember to close the connection: ds.close()
2.1 Shape, indexing and slicing
This works much like handling an ordinary numpy array.
>>> ds.shape
(100, 100)
>>> ds[0:4,0:4]
array([[ 0, 1, 2, 3],
[100, 101, 102, 103],
[200, 201, 202, 203],
[300, 301, 302, 303]])
>>> ds[[0,2],0:4]
array([[ 0, 1, 2, 3],
[200, 201, 202, 203]])
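For comparison, the same indexing expressions can be run on the in-memory numpy matrix that was used to create test.loom. This is only a sketch of the analogy on a plain ndarray; it does not touch the Loom file.

```python
import numpy as np

# Same matrix that was written to test.loom in section 1
matrix = np.arange(10000).reshape(100, 100)

block = matrix[0:4, 0:4]     # first four rows and first four columns
rows = matrix[[0, 2], 0:4]   # rows 0 and 2, first four columns

print(matrix.shape)
print(block)
print(rows)
```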
2.2 Row and column attributes
>>> ds.ra.keys()
['SomeRowAttr']
>>> ds.ca.keys()
['SomeColAttr']
>>> ds.ra.SomeRowAttr
array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16,
17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, ......
>>> del ds.ra.SomeRowAttr # Delete the SomeRowAttr row attribute
You can pick out multiple attributes into a single numpy array, as long as they have the same type:
b = ds.ca["TSNE", "PCA", "UMAP"]
Slicing the matrix by attribute values:
>>> ds[(ds.ra['SomeRowAttr'] == 0) | (ds.ra['SomeRowAttr'] == 1), :]
In numpy, logical AND, OR and NOT are written with &, | and ~, for example (a == b) & (a > c) | ~(c <= b); note the parentheses around each comparison.
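The role of the parentheses can be seen on a small array. The sketch below uses made-up values (the array attr and the 6x6 matrix are purely illustrative, not taken from a real Loom file) and builds boolean masks in the same way as the ds.ra expression above.

```python
import numpy as np

# Hypothetical row attribute and matrix, for illustration only
attr = np.array([0, 1, 2, 3, 4, 5])
matrix = np.arange(36).reshape(6, 6)

mask_or = (attr == 0) | (attr == 1)   # OR: rows where attr is 0 or 1
mask_and = (attr > 1) & (attr < 4)    # AND: rows where 1 < attr < 4
mask_not = ~(attr <= 3)               # NOT: rows where attr is not <= 3

# A boolean mask on the first axis keeps only the matching rows
selected = matrix[mask_or, :]
print(selected)
```

Because & and | bind more tightly than == and >, an expression such as attr == 0 | attr == 1 is not evaluated as two separate comparisons, which is why each comparison is wrapped in its own parentheses.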
2.3 Layers
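In ggplot2 this word refers to the layers of a plot; here, layers can be thought of as parallel versions of the same matrix, much like in Seurat, where the raw values are stored as counts, the normalized values as data, and the scaled values as scale.data.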
>>> ds.layers["big"] = ds[:,:]+1
>>> ds.layers.keys()
['', 'big'] # '' here denotes the main layer
>>> ds.layers["big"][0:4,0:4]
array([[ 1, 2, 3, 4],
[101, 102, 103, 104],
[201, 202, 203, 204],
[301, 302, 303, 304]])
# or
>>> ds["big"][0:4,0:4]
array([[ 1, 2, 3, 4],
[101, 102, 103, 104],
[201, 202, 203, 204],
[301, 302, 303, 304]])
There are also some more advanced operations that are not listed here; see the original documentation linked below for details.
Reference: http://linnarssonlab.org/loompy/apiwalkthrough/index.html