# Creating and connecting

##Creating Loom files

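To create a loom file, you need to supply a matrix (a numpy ndarray or a scipy sparse matrix) and two dictionaries of row and column attributes, in which attribute names are the keys and numpy ndarrays are the values. If the matrix is N×M, every row attribute must have N elements and every column attribute must have M elements.

For example, create a loom file with a 100x100 matrix, one row attribute and one column attribute: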

import numpy as np
import loompy
filename = "test.loom"
matrix = np.arange(10000).reshape(100,100)
row_attrs = { "SomeRowAttr": np.arange(100) }
col_attrs = { "SomeColAttr": np.arange(100) }
loompy.create(filename, matrix, row_attrs, col_attrs)
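
Before calling loompy.create(), it can help to confirm that every attribute has the length the matrix requires. The following is a minimal sketch in plain numpy; `check_attr_lengths` is a hypothetical helper written for this note and is not part of loompy.

```python
import numpy as np

def check_attr_lengths(matrix, row_attrs, col_attrs):
    """Return a list of problems; an empty list means the shapes agree.

    Hypothetical helper for illustration, not part of loompy.
    """
    n_rows, n_cols = matrix.shape
    problems = []
    for name, values in row_attrs.items():
        if len(values) != n_rows:
            problems.append(f"row attribute {name!r}: expected {n_rows} elements, got {len(values)}")
    for name, values in col_attrs.items():
        if len(values) != n_cols:
            problems.append(f"column attribute {name!r}: expected {n_cols} elements, got {len(values)}")
    return problems

matrix = np.arange(200).reshape(10, 20)  # N=10 rows, M=20 columns
good = check_attr_lengths(matrix, {"SomeRowAttr": np.arange(10)}, {"SomeColAttr": np.arange(20)})
bad = check_attr_lengths(matrix, {"SomeRowAttr": np.arange(20)}, {"SomeColAttr": np.arange(20)})
print(good)  # empty list: nothing to fix
print(bad)   # one message: the row attribute has the wrong length
```

Returning a list of messages rather than raising on the first mismatch makes it easy to report all problems at once.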

loompy.create() accepts numpy dense matrices (numpy.ndarray) as well as scipy sparse matrices (scipy.sparse.coo_matrix, scipy.sparse.csc_matrix, or scipy.sparse.csr_matrix):

import numpy as np
import loompy
import scipy.sparse as sparse
filename = "test.loom"
matrix = sparse.coo_matrix((100, 100))
row_attrs = { "SomeRowAttr": np.arange(100) }
col_attrs = { "SomeColAttr": np.arange(100) }
loompy.create(filename, matrix, row_attrs, col_attrs)

Note: loompy.create() only creates the file. To work with the file, connect to it with loompy.connect().

loompy.new() creates an empty loom file and returns a connection to it, which you can then fill with data:

with loompy.new("outfile.loom") as dsout:  # "outfile.loom" is a placeholder name
    for sample in samples:
        with loompy.connect(sample) as dsin:
            logging.info(f"Appending {sample}.")
            dsout.add_columns(dsin.layers, col_attrs=dsin.col_attrs, row_attrs=dsin.row_attrs)

loompy.combine() creates a new loom file from existing loom files. The contents are concatenated along the column axis, so the files must all have the same number of rows. If the rows are not in the same order, pass the key argument to align them by a row attribute:

loompy.combine(files, output_filename, key="Accession")
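
To picture what aligning rows by a key means, here is a small sketch in plain numpy. It is not how loompy implements combine(); `align_and_concat` is a hypothetical helper, and it assumes both key arrays hold the same unique values.

```python
import numpy as np

def align_and_concat(matrix_a, keys_a, matrix_b, keys_b):
    """Reorder the rows of matrix_b to follow keys_a, then join along columns.

    Hypothetical helper for illustration; assumes both key arrays hold the
    same unique values, possibly in a different order.
    """
    position_in_b = {key: i for i, key in enumerate(keys_b)}
    order = np.array([position_in_b[key] for key in keys_a])
    return np.hstack([matrix_a, matrix_b[order, :]])

keys_a = np.array(["g1", "g2", "g3"])
keys_b = np.array(["g3", "g1", "g2"])
matrix_a = np.array([[1, 1], [2, 2], [3, 3]])   # row i belongs to keys_a[i]
matrix_b = np.array([[30], [10], [20]])         # row i belongs to keys_b[i]

combined = align_and_concat(matrix_a, keys_a, matrix_b, keys_b)
print(combined)
```

After alignment, each row of the combined matrix holds values for a single key, even though the two inputs listed their rows in different orders.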

loompy.create_from_cellranger() imports the output of a 10X Genomics cellranger run:

loompy.create_from_cellranger(folder, output_filename)

##Connecting to Loom files

loompy.connect() opens a connection to a loom file without reading the file into memory, so connecting is fast even for very large files.

Use a with statement to manage the connection:

with loompy.connect("filename.loom") as ds:
  # do something with ds

The with statement closes the file automatically. Without it, you must close the connection yourself:

ds = loompy.connect("filename.loom")
ds.close()

# Manipulate data

##Shape, indexing and slicing

LoomConnection.shape returns a tuple with the number of rows and columns:

>>> ds.shape
(100, 2345)

The data in the loom matrix can be accessed by indexing and slicing. The following index types are supported:

Indices: anything that can be converted to a Python long
Slices (e.g. : or 0:10)
Lists of the rows/columns you want (e.g. [0, 34, 576])
Mask arrays (e.g. a numpy array of bool indicating the rows/columns you want)

ds[0:10, 0:10]    # Return the 10x10 submatrix starting at row and column zero
ds[99, :]         # Return the 100th row
ds[:, 99]         # Return the 100th column
ds[[0,3,5], :]    # Return rows with index 0, 3 and 5
ds[:, bool_array] # Return columns where bool_array elements are True

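Tip: selecting a large number of rows or columns at once in this way is slow, which is a limitation of h5py. The problem does not arise if you load the whole file into memory first. Alternatively, use LoomConnection.scan(), which is fast when the randomly selected rows or columns make up more than about 1% of the data.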

##Sparse data

The data in every layer is stored in chunks, which helps with both storage and reading. Any layer can be assigned a new dense or sparse matrix.

You can load the main matrix or any layer and get a sparse matrix back:

ds.layers["exons"].sparse()  # Returns a scipy.sparse.coo_matrix
ds.layers["unspliced"].sparse(rows, cols)  # Returns only the indicated rows and columns (ndarrays of integers or bools)

Assign data to a layer:

ds.layers["exons"] = my_sparse_matrix

##Modifying layers

To modify the data in a layer, index into it and assign the new values:

ds[:, :] = newdata         # Assign a full matrix
ds[3, 500] = 31            # Set the element at (3, 500) to the value 31
ds[99, :] = rowdata        # Assign new values to row with index 99
ds[:, 99] = coldata        # Assign new values to column with index 99

##Global attributes

Global attributes are accessed through ds.attrs. Assigning to an attribute creates it, and del deletes it:

>>> ds.attrs.title
"The title of the dataset"

>>> ds.attrs.title = "New title"
>>> ds.attrs["title"]
"New title"

>>> del ds.attrs.title

The attributes can be iterated over like a dictionary:

>>> ds.attrs.keys()
["title", "description"]

>>> for key, value in ds.attrs.items():
...   print(f"{key} = {value}")
title = New title
description = Fancy dataset

Global attributes can be scalars or multidimensional arrays of any shape, and their elements can be integers, floats or strings.

##Row and column attributes

Row and column attributes are accessed through ds.ra and ds.ca respectively:

ds.ca.keys()       # Return list of column attribute names
ds.ra.Gene = ...   # Create or replace the Gene attribute
a = ds.ra.Gene     # Assign the array of gene names (assuming the attribute exists)
del ds.ra.Gene     # Delete the Gene row attribute

Attributes can also be accessed by indexing:

a = ds.ra["Gene"]     # Assign the array of gene names (assuming the attribute exists)
del ds.ra["Gene"]     # Delete the Gene row attribute

You can extract several attributes of the same type into a single numpy array:

a = ds.ra["Gene", "Attribute"]     # Returns a 2D array of shape (n_genes, 2)
b = ds.ca["PCA1", "PCA2"]          # Returns a 2D array of shape (n_cells, 2)

Passing several names also lets you look up an attribute that may be stored under alternative names:

a = ds.ra["Gene", "GeneName"]     # Return one or the other (if only one exists)
b = ds.ca["TSNE", "PCA", "UMAP"]  # Return the one that exists (if only one exists)

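Using attributes to fancy-index the main matrix gives a very compact and readable syntax for selecting subarrays:

>>> ds[ds.ra.Gene == "Actb", :]
array([[  2.,   9.,   9., ...,   0.,  14.,   0.]], dtype=float32)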

>>> ds[(ds.ra.Gene == "Actb") | (ds.ra.Gene == "Gapdh"), :]
array([[  2.,   9.,   9., ...,   0.,  14.,   0.],
       [  0.,   1.,   4., ...,   0.,  14.,   3.]], dtype=float32)

>>> ds[:, ds.ca.CellID == "AAACATACATTCTC-1"]
array([[ 0.],
       [ 0.],
       [ 0.],
       ...,
       [ 0.],
       [ 0.],
       [ 0.]], dtype=float32)

Conditions can be combined with the bitwise operators: | for 'or', & for 'and' and ~ for 'not'.

(a == b) & (a > c) | ~(c <= b)
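
The parentheses in such expressions matter, because in Python the bitwise operators bind more tightly than comparisons. The sketch below uses plain numpy arrays as stand-ins for row attributes; the `length` attribute and its values are made up for illustration.

```python
import numpy as np

# Stand-ins for two row attributes (values are made up for illustration).
gene = np.array(["Actb", "Gapdh", "Sox2", "Actb"])
length = np.array([1900, 1300, 2500, 800])

# Each comparison needs its own parentheses: & and | bind more tightly
# than == and >, so leaving them out changes the meaning or raises an error.
is_housekeeping = (gene == "Actb") | (gene == "Gapdh")
long_housekeeping = is_housekeeping & (length > 1000)
everything_else = ~long_housekeeping

matrix = np.arange(12).reshape(4, 3)   # 4 rows (genes), 3 columns
selected = matrix[long_housekeeping, :]
print(long_housekeeping)
print(selected)
```

Building the mask in named steps keeps longer selections readable, and the resulting boolean array can be reused for several layers.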

##Modifying attributes

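Unlike layers, attributes are always read and written in their entirety. Assigning to a slice therefore does not change the attribute stored on disk. To give an attribute new values, you must assign a complete list or ndarray to it: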

with loompy.connect("filename.loom") as ds:
  ds.ca.ClusterNames = values  # where values is a list or ndarray with one element per column
  # This does not change the attribute on disk:
  ds.ca.ClusterNames[10] = "banana"
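
One way to see why the element assignment has no lasting effect is a toy model in which every read returns an in-memory copy. The class below is a hypothetical stand-in written for this note; it is not loompy's implementation and only mimics the whole-array read and write behaviour described above.

```python
import numpy as np

class ToyAttrs:
    """Toy stand-in for an attribute store (NOT loompy's implementation)."""

    def __init__(self):
        self._disk = {}  # pretend this dict is the file on disk

    def __setitem__(self, name, values):
        self._disk[name] = np.array(values, dtype=object)  # whole-array write

    def __getitem__(self, name):
        return self._disk[name].copy()  # whole-array read into memory

attrs = ToyAttrs()
attrs["ClusterNames"] = ["apple", "apple", "apple"]

# Editing one element only touches a temporary in-memory copy.
attrs["ClusterNames"][1] = "banana"
unchanged = attrs["ClusterNames"]

# Read, modify, then write the complete array back.
names = attrs["ClusterNames"]
names[1] = "banana"
attrs["ClusterNames"] = names
updated = attrs["ClusterNames"]

print(list(unchanged))
print(list(updated))
```

The same read, modify, write-back pattern applies to the ds.ca and ds.ra attributes in the example above.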

##Adding columns

You can add columns to an existing loom file. It is not possible to add rows or to delete any part of the matrix.

ds.add_columns(submatrix, col_attrs)

To add columns to an empty file, you must also supply the row attributes:

ds.add_columns(submatrix, col_attrs, row_attrs={"Gene": genes})

You can also add the contents of another .loom file. Here other_file is appended to ds:

ds.add_loom(other_file, key="Gene")

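Note: other_file and ds must have the same number of rows, because the files are joined along the column axis. If the rows are in a different order, they can be aligned with the key argument, and missing values are filled automatically using fill_values. If the files contain global attributes with conflicting values, pass convert_attrs=True to the method to convert those attributes to column attributes automatically.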

##Layers

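Loom supports multiple layers. Each loom file has exactly one main matrix, but it can have one or more additional layers with the same number of rows and columns. Layers are accessed through the layers property of the LoomConnection object.

Layers support the same Python API as attributes: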

ds.layers.keys()            # Return list of layers
ds.layers["unspliced"]      # Return the layer named "unspliced"
ds.layers["spliced"] = ...  # Create or replace the "spliced" layer
a = ds.layers["spliced"][:, 10] # Assign the column with index 10 of layer "spliced" to the variable a
del ds.layers["spliced"]     # Delete the "spliced" layer

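The main matrix can also be accessed as the layer named "" (the empty string). It cannot be deleted, but the other layer methods apply to it as well.

For convenience, layers are also available directly on the connection object: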

ds["spliced"] = ...  # Create or replace the "spliced" layer
a = ds["spliced"][:, 10] # Assign the column with index 10 of layer "spliced" to the variable a
del ds["spliced"]     # Delete the "spliced" layer

Sometimes you need to create an empty layer (all zeros) to fill in later. An empty layer is created by assigning a type to the layer name. For example:

ds["empty_floats"] = "float32"
ds["empty_ints"] = "int64"

##Graphs

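Loom supports sparse graphs with either the rows or the columns as nodes. For example, a sparse graph of cells (stored in the columns) could represent a KNN graph of the cells. In that case the cells are the nodes (if the main matrix has M columns, the graph has M nodes), and they are connected by any number of edges. A graph can be regarded as directed or undirected, and its edges can carry floating-point weights. Loom even supports multigraphs, in which a pair of nodes can be joined by more than one edge. Graphs are stored as arrays of edges and the associated edge weights.

Row and column graphs are accessed through ds.row_graphs and ds.col_graphs. For example: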

ds.row_graphs.keys()      # Return list of row graphs
ds.col_graphs.KNN = ...   # Create or replace the column-oriented graph KNN
a = ds.col_graphs.KNN     # Assign the KNN column graph to variable a
del ds.col_graphs.KNN     # Delete the KNN graph
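
The edge-array representation described above can be sketched in plain numpy. The array names `a`, `b` and `w` and the example weights are chosen for illustration; the sketch only shows how edge arrays relate to an adjacency matrix and does not touch a loom file.

```python
import numpy as np

# A small undirected, weighted graph over M = 4 cells (columns).
# Edge i connects node a[i] to node b[i] with weight w[i].
a = np.array([0, 0, 1, 2])
b = np.array([1, 2, 2, 3])
w = np.array([0.9, 0.4, 0.7, 0.5], dtype="float32")

n_nodes = 4
adjacency = np.zeros((n_nodes, n_nodes), dtype="float32")
np.add.at(adjacency, (a, b), w)   # accumulates, so repeated edges would sum
np.add.at(adjacency, (b, a), w)   # mirror the edges to make it undirected

neighbours_of_2 = np.nonzero(adjacency[2])[0]
print(adjacency)
print(neighbours_of_2)
```

Storing only the edges keeps the graph small when each node has few neighbours, which is the usual case for a KNN graph.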

##Views

A loompy view reads a specific part of a loom file into memory by slicing, so that you can then work with it:

ds.view[:, 10:20]

##Operations

###Map

The map method applies functions across the rows or the columns of the data:

ds.map([np.mean, np.std], axis=1)

map() works together with scan() to iterate over the whole file.

The map method takes a list of functions and applies each of them across the whole dataset. Even a single function must be passed as a list:

(means,) = ds.map([np.mean], axis=1)
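
The idea of applying a list of functions batch by batch can be sketched in plain numpy. `map_columns` is a hypothetical helper, not loompy's code, and it assumes that mapping along the column axis produces one result per column.

```python
import numpy as np

def map_columns(matrix, funcs, batch_size=2):
    """Apply each function in funcs to every column, one batch at a time.

    Hypothetical helper for illustration; each function takes a 1-D array
    and returns a single number.
    """
    n_cols = matrix.shape[1]
    results = [np.zeros(n_cols) for _ in funcs]
    for start in range(0, n_cols, batch_size):
        batch = matrix[:, start:start + batch_size]   # only this batch is in use
        for result, func in zip(results, funcs):
            result[start:start + batch.shape[1]] = np.apply_along_axis(func, 0, batch)
    return tuple(results)

matrix = np.arange(15, dtype=float).reshape(3, 5)
means, stds = map_columns(matrix, [np.mean, np.std])
(only_means,) = map_columns(matrix, [np.mean])   # a single function still goes in a list
print(means)
print(stds)
```

Returning one array per function is what makes the tuple unpacking in the example above work, including the one-element case.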

###Permutation

Permute the order of the rows or columns:

ordering = np.random.permutation(np.arange(ds.shape[1]))
ds.permute(ordering, axis=1)

Permuting the rows or columns of a file does not require loading the whole file into RAM.
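
What a column permutation does to the data can be shown on an in-memory array. This is a plain numpy sketch, not a loompy call, and it assumes that ordering[i] names the old column that ends up at position i. The `cell_id` labels are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
matrix = np.arange(12).reshape(3, 4)            # 3 rows, 4 columns
cell_id = np.array(["c0", "c1", "c2", "c3"])    # one label per column

ordering = rng.permutation(matrix.shape[1])     # new position i takes old column ordering[i]
permuted_matrix = matrix[:, ordering]
permuted_cell_id = cell_id[ordering]            # labels must move with their columns

# Column "c2" still holds the same values after the permutation.
where_c2 = int(np.nonzero(permuted_cell_id == "c2")[0][0])
print(permuted_matrix[:, where_c2])
```

Applying the same ordering to the matrix and to every column attribute is what keeps each label attached to its own column.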

###Scan

For very large loom files, it is useful to scan the file in batches (along rows or columns) to avoid loading the whole file into memory. This can be done with the scan() method:

for (ix, selection, view) in ds.scan(axis=1):
  # do something with each view

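axis=0 scans along rows and axis=1 scans along columns.
ix is the starting index of the current batch.
selection is the list of columns (or rows) included in the batch.
view is a loom view object, so the usual loom methods can be used on it.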

import numpy as np
import loompy
filename = "test.loom"
matrix = np.arange(10000).reshape(100,100)
row_attrs = { "SomeRowAttr": np.arange(100) }
col_attrs = { "SomeColAttr": np.arange(100) }
loompy.create(filename, matrix, row_attrs, col_attrs)


ds = loompy.connect("test.loom")
for (ix, selection, view) in ds.scan(axis=1):
    print(ix, selection)
    
0 [ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47
 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71
 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95
 96 97 98 99]

You can also scan over a selected subset of columns or rows. For example:

cells = np.array([0, 1, 2, 3])  # Indices of the columns you want to see
for (ix, selection, view) in ds.scan(items=cells, axis=1):
  # do something with each view

Here cells holds the indices of the selected columns, for example array([0, 1, 2, 3]).
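
The behaviour of a batched scan over selected columns can be imitated on an in-memory array. `scan_columns` is a hypothetical generator written for this note; it only mirrors the (ix, selection, view) triple described above and is not loompy's implementation.

```python
import numpy as np

def scan_columns(matrix, items=None, batch_size=4):
    """Yield (ix, selection, view) for successive batches of columns.

    Hypothetical generator for illustration: ix is the first column of the
    batch, selection holds the absolute indices of the chosen columns in the
    batch, and view is the corresponding submatrix.
    """
    n_cols = matrix.shape[1]
    items = np.arange(n_cols) if items is None else np.sort(np.asarray(items))
    for ix in range(0, n_cols, batch_size):
        selection = items[(items >= ix) & (items < ix + batch_size)]
        if selection.size == 0:
            continue                      # nothing requested from this batch
        yield ix, selection, matrix[:, selection]

matrix = np.arange(30).reshape(3, 10)
cells = np.array([0, 1, 2, 3, 9])
batches = [(ix, sel.tolist(), view.shape) for ix, sel, view in scan_columns(matrix, items=cells)]
print(batches)
```

Batches that contain none of the requested columns are skipped, so the loop body only ever sees non-empty views.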

For the R version of loom, see: loomR introduction and user guide

#Source

API Walkthrough
Source code for loompy.loompy
