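Altair is a declarative statistical visualization library for Python, built on Vega-Lite.
With Altair you can spend more of your time understanding the data and what it means. The API is simple and friendly, and because it is built on the Vega-Lite visualization grammar, a small amount of code is enough to produce elegant, effective visualizations.
Installing Altair
The simplest way to install Altair is with pip: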
pip install altair
Using Altair
Input data
Internally, Altair stores its data as a pandas DataFrame, but you can pass data in any of three ways:
- As a pandas DataFrame:
import pandas as pd
data = pd.DataFrame({'x': ['A', 'B', 'C', 'D', 'E'], 'y': [5, 3, 6, 7, 2]})
- As a Data object:
import altair as alt
data = alt.Data(values=[{'x': 'A', 'y': 5}, {'x': 'B', 'y': 3}, {'x': 'C', 'y': 6}, {'x': 'D', 'y': 7}, {'x': 'E', 'y': 2}])
- As a URL pointing to a CSV or JSON file:
data = 'https://vega.github.io/vega-datasets/data/cars.json'
Whichever form you choose, the data is passed as the first argument to a Chart, LayeredChart, or FacetedChart object.
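Choosing a mark
Once the data is defined, the first step is to choose the mark, that is, the visual form in which the data will be drawn. Altair currently provides the following mark types: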
| Mark Name | Method | Description |
|---|---|---|
| area | mark_area() | A filled area plot. |
| bar | mark_bar() | A bar plot. |
| circle | mark_circle() | A scatter plot with filled circles. |
| line | mark_line() | A line plot. |
| point | mark_point() | A scatter plot with configurable point shapes. |
| rule | mark_rule() | A vertical or horizontal line spanning the axis. |
| square | mark_square() | A scatter plot with filled squares. |
| text | mark_text() | A scatter plot with points represented by text. |
| tick | mark_tick() | A vertical or horizontal tick mark. |
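Encodings
Encodings define the visual properties of the chart, such as the position of each mark and the properties of the axes:
Channels
Position channels define the position-related properties of the chart: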
| Channel | Altair Class | Description |
|---|---|---|
| column | Column | The column of a faceted plot |
| row | Row | The row of a faceted plot |
| x | X | The x-axis value |
| y | Y | The y-axis value |
Mark property channels:
| Channel | Altair Class | Description |
|---|---|---|
| color | Color | The color of the mark |
| opacity | Opacity | The opacity of the mark |
| shape | Shape | The shape of the mark |
| size | Size | The size of the mark |
Order channels:
| Channel | Altair Class | Description |
|---|---|---|
| order | Order | - |
| path | Path | - |
Field channels:
| Channel | Altair Class | Description |
|---|---|---|
| text | Text | The text to display at each mark |
| detail | Detail | Additional level of detail for a grouping, without mapping to any particular channel |
| label | Label | - |
Data types
The data type determines how a field is displayed in the chart:
| Data Type | Shorthand Code | Description |
|---|---|---|
| quantitative | Q | a continuous real-valued quantity |
| ordinal | O | a discrete ordered quantity |
| nominal | N | a discrete unordered category |
| temporal | T | a time or date value |
Binning and aggregation
Binning and aggregation define how the data is processed before it is displayed:
| Aggregate | Description |
|---|---|
| sum | Sum of values |
| mean | Arithmetic mean of values |
| average | Arithmetic mean of values |
| count | Total number of values |
| distinct | Number of distinct values |
| variance | Variance of values |
| variancep | Population variance of values |
| stdev | Standard Deviation of values |
| stdevp | Population standard deviation of values |
| median | Median of values |
| q1 | First quartile of values |
| q3 | Third quartile of values |
| modeskew | Mode skewness of values |
| min | Minimum value |
| max | Maximum value |
| argmin | Index of minimum value |
| argmax | Index of maximum value |
| values | ?? |
| valid | Count of values that are not null, undefined, or NaN |
| missing | Count of null or undefined values |
Shorthand
Some commonly used property values can be written in shorthand:
| Shorthand | Equivalent long-form |
|---|---|
| x='name' | X('name') |
| x='name:Q' | X('name', type='quantitative') |
| x='sum(name)' | X('name', aggregate='sum') |
| x='sum(name):Q' | X('name', aggregate='sum', type='quantitative') |
Displaying charts
Displaying in Jupyter
At its core, Altair simply converts a chart into the JSON format used by Vega-Lite. You can inspect that JSON as follows:
print(chart.to_json(indent=2))
Charts are usually displayed in Jupyter:
c = alt.Chart(...)
c
Alternatively, use IPython's display() function:
c = alt.Chart(...)
display(c)
Or:
c = alt.Chart(...)
c.display()
Displaying via a web server
To display a chart in the browser:
chart.serve()
Saving as an image
Altair cannot currently save a chart as an image on its own; doing so requires a Node.js package.
Saving as an HTML file
The code is as follows:
chart.savechart('plot.html')
Other representations
An Altair chart can be represented as HTML, a Python dict, a JSON object, or an Altair Chart object. The functions for converting between these representations are:
- Chart.to_dict() / Chart.from_dict()
- Chart.to_json() / Chart.from_json()
- Chart.to_html()
- Chart.to_python()
Related resources
- Altair source code: https://github.com/altair-viz/altair
- Altair documentation: https://altair-viz.github.io/
- Vega-Lite source code: https://github.com/vega/vega-lite
- Vega-Lite documentation: https://vega.github.io/vega-lite/