Graph Databases

What is a Graph DB?

In a graph DB, entities are stored as nodes, and edges are the relationships that describe how those nodes are connected.

Represent Entities & Relations

Small Social Graph (e.g., Twitter): Ruth has published messages
- Ruth has published a string of messages. Her most recent message is marked CURRENT.
- The PREVIOUS relationships then create Ruth’s timeline.

Comments on Photos by users

Consider the Facebook social graph: User1 uploaded photo1, and User2 and User3 added comment2 and comment3 on photo1.

Nodes:
- User Node: Represents each user (e.g., User1, User2)
- Photo Node: Represents photo uploaded
- Comment Node: Represents each comment made on the photo


// Neo4j property graph model
User1: (:User {id: "user1", name: "User 1"})
User2: (:User {id: "user2", name: "User 2"})
Photo: (:Photo {id: "photo1", url: "http://example.com/photo.jpg"})
Comment1: (:Comment {id: "comment1", text: "Nice photo!"})
Comment2: (:Comment {id: "comment2", text: "Great shot!"})

Relationships:
- COMMENTED_ON: Relationship from a User node to the Photo node indicating
that the user commented on that photo.
- HAS_COMMENT: Relationship from the Photo node to each Comment node
indicating that the photo has these comments.
- WRITTEN_BY: Relationship from each Comment node to the User node who wrote it.


(:User {id: "user1"})-[:COMMENTED_ON]->(:Photo {id: "photo1"})
(:Photo {id: "photo1"})-[:HAS_COMMENT]->(:Comment {id: "comment1"})
(:Comment {id: "comment1"})-[:WRITTEN_BY]->(:User {id: "user1"})

Count Comments:
Reach the Photo node and count its outgoing HAS_COMMENT relationships.


// Get all comments on a specific photo
MATCH (:Photo {id: "photo1"})-[:HAS_COMMENT]->(c:Comment)
RETURN c.text
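
The counting step described above can be sketched outside Neo4j as well. This is a minimal in-memory stand-in, not Neo4j's API: the `relationships` list, the `outgoing` adjacency map, and the helpers `count_comments` and `comment_texts` are all hypothetical names, and the second HAS_COMMENT edge is assumed from the node list.

```python
from collections import defaultdict

# Hypothetical in-memory stand-in for the photo graph (not a Neo4j API).
# Each relationship is stored as (start_node_id, TYPE, end_node_id).
relationships = [
    ("user1", "COMMENTED_ON", "photo1"),
    ("photo1", "HAS_COMMENT", "comment1"),
    ("photo1", "HAS_COMMENT", "comment2"),  # assumed: second comment on photo1
    ("comment1", "WRITTEN_BY", "user1"),
]

comment_text = {"comment1": "Nice photo!", "comment2": "Great shot!"}

# Adjacency list keyed by (start node, relationship type), so reaching a
# node's relationships of one type is a single dictionary lookup.
outgoing = defaultdict(list)
for start, rel_type, end in relationships:
    outgoing[(start, rel_type)].append(end)


def count_comments(photo_id):
    """Count the HAS_COMMENT relationships leaving one Photo node."""
    return len(outgoing.get((photo_id, "HAS_COMMENT"), []))


def comment_texts(photo_id):
    """Return the text of every comment attached to one Photo node."""
    return [comment_text[c] for c in outgoing.get((photo_id, "HAS_COMMENT"), [])]


print(count_comments("photo1"))
print(comment_texts("photo1"))
```

Only the relationships attached to photo1 are touched; the rest of the graph is never read.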

All photos commented on by a specific user:


MATCH (u:User {id: "user1"})-[:COMMENTED_ON]->(p:Photo)
RETURN p.url

Three friends (A, B, C): A created a video, B commented on it, and C shared and liked it.

Nodes:
- User: Represents users like UserA, UserB, and UserC.
- Video: Represents each video uploaded.
- Comment: Represents comments on videos.
- Like: Represents likes on videos.

Relationships:
- FRIEND: Connects User nodes (e.g., UserA is FRIEND with UserB and UserC).
- CREATED: Connects User to Video (e.g., UserA CREATED Video1).
- COMMENTED: Connects User to Comment (e.g., UserB COMMENTED Comment1).
- LIKED: Connects User to Video (e.g., UserC LIKED Video1).
- SHARED: Connects User to Video (e.g., UserC SHARED Video1).
- ON: Connects Comment to Video (e.g., Comment1 ON Video1).


(UserA)-[:CREATED]->(Video1 {url: "https://objectstore.com/video1"})
(UserB)-[:FRIEND]->(UserA)
(UserC)-[:FRIEND]->(UserA)
(UserB)-[:COMMENTED]->(Comment1 {text: "Great video!"})-[:ON]->(Video1)
(UserC)-[:LIKED]->(Video1)
(UserC)-[:SHARED]->(Video1)
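
A rough sketch of how the patterns above could be walked, assuming the graph is held as a plain list of (start, TYPE, end) triples. The helper names `incoming` and `commenters` are made up for illustration; a real graph DB would follow stored relationship pointers rather than scan a list.

```python
# The relationships above as (start, TYPE, end) triples.
edges = [
    ("UserA", "CREATED", "Video1"),
    ("UserB", "FRIEND", "UserA"),
    ("UserC", "FRIEND", "UserA"),
    ("UserB", "COMMENTED", "Comment1"),
    ("Comment1", "ON", "Video1"),
    ("UserC", "LIKED", "Video1"),
    ("UserC", "SHARED", "Video1"),
]


def incoming(node, rel_type=None):
    """Return (start, TYPE) for each relationship pointing at `node`."""
    return [
        (start, t)
        for start, t, end in edges
        if end == node and (rel_type is None or t == rel_type)
    ]


def commenters(video):
    """Two hops backwards: Video <-[:ON]- Comment <-[:COMMENTED]- User."""
    comments = [start for start, _ in incoming(video, "ON")]
    return sorted(user for c in comments for user, _ in incoming(c, "COMMENTED"))


print(incoming("Video1"))    # everything that points at Video1
print(commenters("Video1"))  # users who commented on Video1
```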

How is a particular node found efficiently in a graph DB?

Method-1 (Store address of user nodes)

In a separate database, store the addresses of the user nodes directly. Whenever a user queries, start the traversal directly from that user's node.
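
A minimal sketch of Method-1, under stated assumptions: a Python dict (`node_address`) stands in for the separate database, object references stand in for node addresses, and the friendships between the three users are invented for illustration.

```python
class Node:
    """A graph node that holds direct references to its neighbours."""

    def __init__(self, node_id, name):
        self.node_id = node_id
        self.name = name
        self.friends = []  # direct references to other Node objects


def connect(a, b):
    """Create a two-way friendship between two nodes."""
    a.friends.append(b)
    b.friends.append(a)


# Assumed sample data: three users, with user2 in the middle.
user1 = Node("user1", "User 1")
user2 = Node("user2", "User 2")
user3 = Node("user3", "User 3")
connect(user1, user2)
connect(user2, user3)

# Stand-in for the separate database: user id -> "address" of the node.
node_address = {n.node_id: n for n in (user1, user2, user3)}


def friend_names(user_id):
    """Jump straight to the start node, then follow its references."""
    start = node_address[user_id]  # one lookup, no scan over all nodes
    return [friend.name for friend in start.friends]


print(friend_names("user2"))
```

The lookup cost does not depend on how many other nodes exist; only the start node's own relationships are followed afterwards.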

Method-2 (Indexing)

Advantages over SQL

Connected Data

Which DB to choose for a connected dataset?
- SQL: join-intensive query performance deteriorates as the dataset gets bigger.
- Graph DB: there is no need to traverse the whole graph, so performance remains relatively constant even as the dataset grows, because queries are localized to a portion of the graph. Execution time for each query is proportional only to the size of the part of the graph traversed, rather than the size of the overall graph.
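
The idea that query cost tracks the part of the graph traversed, not the whole graph, can be illustrated with a depth-limited traversal. This is a toy sketch: the adjacency dicts, the `traverse` helper, and the 10,000 unrelated "stranger" nodes are all made up for the illustration.

```python
from collections import deque


def traverse(graph, start, max_depth):
    """Breadth-first traversal up to max_depth hops.

    Returns (set of reached nodes, number of nodes actually visited).
    """
    seen = {start}
    queue = deque([(start, 0)])
    visited = 0
    while queue:
        node, depth = queue.popleft()
        visited += 1
        if depth == max_depth:
            continue  # do not expand beyond the depth limit
        for neighbour in graph.get(node, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, depth + 1))
    return seen - {start}, visited


# A small friendship graph.
small = {"bob": ["alice", "zach"], "alice": ["bob"], "zach": ["alice"]}

# The same graph plus 10,000 nodes that are not connected to bob at all.
big = dict(small)
for i in range(10_000):
    big[f"stranger{i}"] = [f"stranger{(i + 1) % 10_000}"]

reached_small, work_small = traverse(small, "bob", 2)
reached_big, work_big = traverse(big, "bob", 2)

print(work_small, work_big)  # same number of nodes visited in both graphs
```

The larger graph holds thousands more nodes, yet the traversal from bob visits exactly the same nodes in both cases.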

Friends of Friend Query / Friends of Friends of Friend Query

A query to get Bob's friends from a SQL database is already complex.
When we want friends of friends of Bob, it gets more complicated, because it will involve:
- Recursive JOINs
- Traversing the whole DB


Person Table:
+----+--------+
| ID | Person |
+----+--------+
|  1 | alice  |
|  2 | Bob    |
| 99 | zach   |
+----+--------+
PersonFriend Table:
+----------+----------+
| PersonID | FriendID |
+----------+----------+
|        1 |        2 |
|        2 |        1 |
|        2 |       99 |
|       99 |        1 |
+----------+----------+

We can add any number of relationships in a graph database.

Execution time comparison

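+-------+--------------------------+--------------------------+------------------+
| Depth | RDBMS execution time (s) | Neo4j execution time (s) | Records returned |
+-------+--------------------------+--------------------------+------------------+
|     2 |                    0.016 |                     0.01 |            ~2500 |
|     3 |                   30.267 |                    0.168 |         ~110,000 |
|     4 |                 1543.505 |                    1.359 |         ~600,000 |
|     5 |               Unfinished |                    2.132 |         ~800,000 |
+-------+--------------------------+--------------------------+------------------+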