I. Installing Neo4j
-
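1. Install the JDK
OpenJDK works. Neo4j 4.0 and later require openjdk-11; version 3.5 requires openjdk-8.
If the default software sources do not provide OpenJDK, add the PPA below.
On older Ubuntu releases (such as 16.04), installing openjdk-11 can be troublesome, so install openjdk-8 instead.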
sudo add-apt-repository -y ppa:openjdk-r/ppa
sudo apt-get update
sudo apt-get install openjdk-8-jdk
2. Install Neo4j
wget -O - https://debian.neo4j.org/neotechnology.gpg.key | sudo apt-key add -
echo 'deb https://debian.neo4j.org/repo stable/' | sudo tee -a /etc/apt/sources.list.d/neo4j.list
sudo apt-get update
sudo apt-get install neo4j
sudo apt-get install cypher-shell
3. Start or stop the service
neo4j status
neo4j start
neo4j stop
Run cypher-shell to open the Neo4j interactive shell. The default username and password are both "neo4j".
Inside the shell, change the password with CALL dbms.changePassword('password');
4. Enable remote browser access
By default Neo4j only accepts connections from localhost. To allow remote access, edit /etc/neo4j/neo4j.conf and uncomment the following line:
#dbms.connectors.default_listen_address=0.0.0.0
II. Using py2neo
Nodes and relationships
In [1]: from py2neo import Graph, Node, Relationship
In [2]: a = Node("Person", name="Alice")
In [3]: b = Node("Person", name="Bob")
In [4]: ab = Relationship(a, "KNOWS", b)
In [5]: print(type(a))
<class 'py2neo.data.Node'>
In [6]: print(a)
(:Person {name: 'Alice'})
In [7]: print(type(ab))
<class 'py2neo.data.KNOWS'>
In [8]: print(ab)
(Alice)-[:KNOWS {}]->(Bob)
This creates two Nodes and a Relationship between them. Both Node and Relationship inherit from PropertyDict, so they can carry any number of properties and be used much like a dictionary.
Subgraph
A Subgraph is a collection of Nodes and Relationships. The simplest way to build one is with the set operators, as follows:
# create a subgraph
In [10]: s = a | b | ab
In [11]: print(type(s))
<class 'py2neo.data.Subgraph'>
In [12]: print(s)
Subgraph({Node('Person', name='Alice'), Node('Person', name='Bob')}, {KNOWS(Node('Person', name='Alice'), Node('Person', name='Bob'))})
# the nodes and relationships attributes give all the Nodes and Relationships
In [20]: type(s.nodes)
Out[20]: py2neo.collections.SetView
In [18]: list(s.nodes)
Out[18]: [Node('Person', name='Alice'), Node('Person', name='Bob')]
In [19]: list(s.relationships)
Out[19]: [KNOWS(Node('Person', name='Alice'), Node('Person', name='Bob'))]
# intersection of two subgraphs
In [21]: s2 = a | b
In [22]: s&s2
Out[22]: Subgraph({Node('Person', name='Alice'), Node('Person', name='Bob')}, {})
Walkable
A Walkable is a Subgraph with added traversal information. One can be built with the + operator, for example:
In [34]: a = Node("Person", name="Alice")
In [35]: b = Node("Person", name="Bob")
In [36]: c = Node("Person", name="Jack")
In [37]: d = Node("Dog", name="Pupy")
In [38]: ab = Relationship(a, "KNOWS", b)
In [39]: bc = Relationship(b, "LIKES", c)
In [40]: cd = Relationship(c, "HAS", d)
# create a Walkable object
In [41]: w = ab+bc+cd
In [42]: print(type(w))
<class 'py2neo.data.Path'>
In [43]: print(w)
(Alice)-[:KNOWS {}]->(Bob)-[:LIKES {}]->(Jack)-[:HAS {}]->(Pupy)
In [44]: from py2neo import walk
# use walk to traverse from the start node to the end node
In [45]: for item in walk(w):
    ...:     print(item)
(:Person {name: 'Alice'})
(Alice)-[:KNOWS {}]->(Bob)
(:Person {name: 'Bob'})
(Bob)-[:LIKES {}]->(Jack)
(:Person {name: 'Jack'})
(Jack)-[:HAS {}]->(Pupy)
(:Dog {name: 'Pupy'})
# use the start_node, end_node, nodes and relationships attributes to get the start Node, the end Node, and all Nodes and Relationships
In [47]: w.start_node
Out[47]: Node('Person', name='Alice')
In [48]: w.end_node
Out[48]: Node('Dog', name='Pupy')
In [49]: w.nodes
Out[49]:
(Node('Person', name='Alice'),
Node('Person', name='Bob'),
Node('Person', name='Jack'),
Node('Dog', name='Pupy'))
In [50]: w.relationships
Out[50]:
(KNOWS(Node('Person', name='Alice'), Node('Person', name='Bob')),
LIKES(Node('Person', name='Bob'), Node('Person', name='Jack')),
HAS(Node('Person', name='Jack'), Node('Dog', name='Pupy')))
Graph
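- Initialization
Graph is the most important API for interacting with Neo4j data and provides many methods for working with the database. A Graph is given the connection URI or connection settings when it is initialized; the parameters include bolt, secure, host, http_port, https_port, bolt_port, user and password. For details see: http://py2neo.org/v3/database.html#py2neo.database.Graph. An example: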
g = Graph(host='localhost', auth=('neo4j', 'passwd'))
- Creating data
You can create a whole subgraph at once, or a single node or relationship.
In [34]: a = Node("Person", name="Alice")
In [35]: b = Node("Person", name="Bob")
In [36]: c = Node("Person", name="Jack")
In [37]: d = Node("Dog", name="Pupy")
In [38]: ab = Relationship(a, "KNOWS", b)
In [39]: bc = Relationship(b, "LIKES", c)
In [40]: cd = Relationship(c, "HAS", d)
In [41]: ss = a|b|c|d|ab|bc|cd
In [42]: g.create(ss)
This creates the four nodes and three relationships in the database.
Now add one more relationship:
r = Relationship(a, 'KONWS', c)
g.create(r)
The graph now also contains this relationship from Alice to Jack.
- Finding nodes
Use NodeMatcher to find nodes.
In [40]: from py2neo import NodeMatcher, RelationshipMatcher
In [41]: nm = NodeMatcher(g)
In [43]: res = nm.match('Person')
In [44]: list(res)
Out[44]:
[Node('Person', name='Bob'),
Node('Person', name='Alice'),
Node('Person', name='Jack')]
# return the first match
In [58]: res = nm.match('Person').first()
In [59]: res
Out[59]: Node('Person', name='Bob')
In [49]: res = nm.match('Dog', name='Pupy')
In [50]: list(res)
Out[50]: [Node('Dog', name='Pupy')]
# query with a regular-expression match
In [56]: res = nm.match('Person').where('_.name=~"A.*"')
In [57]: list(res)
Out[57]: [Node('Person', name='Alice')]
first() returns the first matching node, or None if there is none
limit(amount) returns at most amount nodes
skip(amount) skips the first amount nodes
order_by(fields) sorts the results
where(conditions, **properties) filters the results
- Finding relationships
Relationships can be found with g.match or with RelationshipMatcher; the latter is more powerful.
In [40]: from py2neo import NodeMatcher, RelationshipMatcher
In [42]: rm = RelationshipMatcher(g)
In [96]: list(g.match())
Out[96]:
[LIKES(Node('Person', name='Bob'), Node('Person', name='Jack')),
KONWS(Node('Person', name='Alice'), Node('Person', name='Jack')),
KNOWS(Node('Person', name='Alice'), Node('Person', name='Bob')),
HAS(Node('Person', name='Jack'), Node('Dog', name='Pupy'))]
In [63]: res = g.match(r_type='LIKES')
In [64]: list(res)
Out[64]: [LIKES(Node('Person', name='Bob'), Node('Person', name='Jack'))]
# find a given relationship that starts from a given node, for example the complications of leukemia
In [293]: a = nm.match('Disease', name='leukemia').first()
In [294]: a
Out[294]: Node('Disease', name='leukemia')
In [295]: list(g.match(r_type='Complication', nodes=[a]))
Out[295]:
[Complication(Node('Disease', name='leukemia'), Node('Disease', name='leukemic central nervous system infection')),
Complication(Node('Disease', name='leukemia'), Node('Disease', name='leukemic cerebral hemorrhage')),
Complication(Node('Disease', name='leukemia'), Node('Disease', name='intestinal failure')),
Complication(Node('Disease', name='leukemia'), Node('Disease', name='Pneumocystis carinii infection'))]
In [66]: res2 = rm.match(r_type='LIKES')
In [67]: list(res2)
Out[67]: [LIKES(Node('Person', name='Bob'), Node('Person', name='Jack'))]
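- Batch insertion
When inserting in bulk, take care not to insert many copies of the same node. Even when the label and values are identical, each call to Node builds a distinct node, because the objects have different ids. For example:
In [258]: a1 = Node('Person', name='Xiao Ming')
In [259]: a2 = Node('Person', name='Xiao Ming')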
In [260]: a1==a2
Out[260]: False
In [261]: id(a1)
Out[261]: 139971127871536
In [262]: id(a2)
Out[262]: 139971551445936
So when inserting in bulk, especially from tabular data, avoid constructing a node with the same label and value more than once. Before building a node with Node, use NodeMatcher to check whether one with the same label and value already exists. Here is a concrete batch-insertion example:
from py2neo import Graph, Node, NodeMatcher, Relationship

g = Graph(host='localhost', auth=('neo4j', 'password'))
nm = NodeMatcher(g)
# data: the records to load, each holding an 'spo_list' of triples
for i in data:
    spos = i['spo_list']
    for spo in spos:
        p, sub, obj, sub_type, obj_type = spo.values()
        # look for existing nodes with the same label and value
        sub_existed = nm.match(sub_type, name=sub).first()
        obj_existed = nm.match(obj_type, name=obj).first()
        if sub_existed is not None and obj_existed is not None:
            # assumes at most one relationship between two nodes, so skip when both already exist
            continue
        elif sub_existed is not None:
            obj_node = Node(obj_type, name=obj)  # only sub exists, so build a new obj node
            rel = Relationship(sub_existed, p, obj_node)
        elif obj_existed is not None:
            sub_node = Node(sub_type, name=sub)  # only obj exists, so build a new sub node
            rel = Relationship(sub_node, p, obj_existed)
        else:
            sub_node = Node(sub_type, name=sub)
            obj_node = Node(obj_type, name=obj)
            rel = Relationship(sub_node, p, obj_node)
        g.create(rel)