01_Introduction

Learning materials

Neo4j

Neo4j is a graph database: a database designed to store and query data as a network of nodes (things) and relationships (connections between things), both of which can have properties (key–value data).

How it models data

Node: an entity (e.g., Person, Company, Movie)
Relationship: a typed, directed connection (e.g., (:Person)-[:FRIENDS_WITH]->(:Person))
Properties: data on nodes/relationships (e.g., name, age, since)

This is often called the property graph model.
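
As a minimal sketch of the property graph model, the following Cypher creates two nodes and one relationship between them. The names, ages, and the since value are made-up illustration data, not taken from any real dataset.

```cypher
// Illustrative data only: names, ages, and the year are assumptions.
// Two nodes with the label Person, each carrying properties.
CREATE (alice:Person {name: "Alice", age: 34})
CREATE (bob:Person {name: "Bob", age: 29})
// One typed, directed relationship with its own property.
CREATE (alice)-[:FRIENDS_WITH {since: 2019}]->(bob)
```

The variables alice and bob exist only inside this one query; they let the final CREATE clause connect the two nodes that were just created.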

How you query it

Neo4j’s main query language is Cypher. Its patterns look a bit like ASCII art: nodes are written in parentheses and relationships as arrows between them.

MATCH (p:Person {name: "Alice"})-[:FRIENDS_WITH]->(f:Person)
RETURN f.name

That reads: “Find Alice, follow her FRIENDS_WITH relationships to other Person nodes, return their names.”

Why use Neo4j

Neo4j is especially good when your main questions are about connections:

social networks (“friends of friends”)
recommendations (“people who bought X also bought Y”)
fraud detection (suspicious rings of accounts/devices)
knowledge graphs and entity linking
network/IT topology and dependency analysis

Graph databases shine here because relationships are stored as direct links between nodes, so a traversal follows those links instead of computing costly joins at query time.
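
The recommendation use case from the list above could be sketched as a single pattern. The :User and :Product labels, the BOUGHT relationship type, and the product name "X" are hypothetical and would need to match your own model.

```cypher
// Hypothetical model: (:User)-[:BOUGHT]->(:Product).
// Start at product X, walk back to its buyers, then forward to their other purchases.
MATCH (x:Product {name: "X"})<-[:BOUGHT]-(u:User)-[:BOUGHT]->(other:Product)
WHERE other <> x
// Rank the other products by how many distinct buyers they share with X.
RETURN other.name AS product, count(DISTINCT u) AS sharedBuyers
ORDER BY sharedBuyers DESC
LIMIT 5
```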

How it differs from relational databases

Relational: relationships are implied via foreign keys + joins
Neo4j: relationships are first-class stored data, so multi-hop queries can be simpler and often faster.

Differences

Neo4j vs Relational (Oracle/Postgres/MySQL)

Data model

Relational: tables + rows; relationships via foreign keys.
Neo4j: nodes + relationships (stored directly) + properties.

Query style

Relational: set-based SQL; relationships expressed with JOINs.
Neo4j: pattern matching (“find this shape in the graph”) in Cypher.

Strengths

Relational excels at: well-structured data, reporting/aggregations, lots of tabular operations, strong integrity constraints.
Neo4j excels at: “connected” questions that need multi-hop traversals, such as friends-of-friends, shortest paths, dependency chains, recommendations, and fraud rings.

Performance “shape”

Relational: queries that need many joins (especially variable-depth ones, such as 2–6 hops or an unknown depth) can get complex and expensive, typically requiring chained self-joins or recursive queries.
Neo4j: traversals over relationships are typically efficient, and variable-length path queries are natural.
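
A variable-depth traversal of the kind described above might look like the following sketch. It assumes the Person/FRIENDS_WITH model from the introduction; "Alice" is sample data.

```cypher
// Assumes the Person/FRIENDS_WITH model used earlier; "Alice" is sample data.
// *2..6 means: follow between 2 and 6 FRIENDS_WITH relationships.
MATCH (a:Person {name: "Alice"})-[:FRIENDS_WITH*2..6]->(b:Person)
// DISTINCT because the same person can be reached over several paths.
RETURN DISTINCT b.name
```

Changing the depth only means changing the bounds in the pattern; the rest of the query stays the same.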

Integrity & normalization

Relational: normalization is common; constraints (FK, CHECK, UNIQUE) are core.
Neo4j: you can add constraints (e.g., uniqueness), but graph modeling often emphasizes relationships over normalization.

Neo4j vs MongoDB (document DB)

Data model

MongoDB: documents (JSON-like) in collections; nested objects/arrays; relationships often handled by embedding or references.
Neo4j: explicit relationships as first-class records; good when links are central.

How you represent relationships

MongoDB embedding: fast when you usually fetch data “with its children” (e.g., an order with its line items).
MongoDB referencing: possible, but multi-step traversals become multiple queries or aggregation pipelines.
Neo4j: traversals are the core operation, so following edges across many hops is natural.

Querying

MongoDB: document queries + aggregation pipeline (great for document-shaped analytics).
Neo4j: Cypher pattern queries (great for network/graph patterns).

Performance “shape”

MongoDB: great for single-entity reads/writes and “retrieve a document and its embedded data” patterns.
Neo4j: great for deep or wide relationship navigation and graph algorithms (paths, centrality, communities).
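
As one example of a path query, Cypher's built-in shortestPath function can be used roughly as follows. "Alice" and "Dave" are hypothetical sample nodes, and the 6-hop cap is an arbitrary choice for this sketch.

```cypher
// "Alice" and "Dave" are hypothetical sample nodes.
MATCH (a:Person {name: "Alice"}), (b:Person {name: "Dave"})
// Undirected search, capped at 6 hops to keep the traversal bounded.
MATCH p = shortestPath((a)-[:FRIENDS_WITH*..6]-(b))
RETURN p, length(p) AS hops
```

Centrality and community detection are usually run through Neo4j's separate Graph Data Science library and are not shown here.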

Schema flexibility

MongoDB: very flexible schema per document.
Neo4j: also flexible (properties/labels can vary), but the shape of relationships matters more than document structure.

Quick rule of thumb

Choose a relational DB when your world is naturally tables, you need strong constraints, and most queries are joins/aggregations over well-defined entities.
Choose MongoDB when your data is naturally document-shaped, you want easy horizontal scaling, and you mostly fetch/update whole documents.
Choose Neo4j when your hardest/most valuable questions are about connections, paths, and network structure.

Tiny example: “friends of friends”

SQL: self-joins on a friendship table, with one additional join per hop.
MongoDB: usually multiple lookups/aggregation stages or app-side joins.
Neo4j (Cypher): MATCH (a)-[:FRIENDS_WITH*2]->(b) matches exactly two hops; a range such as *1..3 makes the depth variable.
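
A fuller friends-of-friends query could look like this sketch. It assumes the Person/FRIENDS_WITH model from the introduction, treats friendship as undirected, and uses "Alice" as sample data.

```cypher
// Assumes (:Person)-[:FRIENDS_WITH]-(:Person); "Alice" is sample data.
// Exactly two hops, ignoring relationship direction.
MATCH (a:Person {name: "Alice"})-[:FRIENDS_WITH*2]-(fof:Person)
// Exclude Alice herself and anyone who is already a direct friend.
WHERE fof <> a AND NOT (a)-[:FRIENDS_WITH]-(fof)
RETURN DISTINCT fof.name
```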

Components

https://graphacademy.neo4j.com/courses/modeling-fundamentals/1-getting-started/3-purpose-of-model/

Main components:

Nodes (entities)
- Blue nodes: people (e.g., Tom Hanks, Meg Ryan)
- Red nodes: movies (e.g., Apollo 13, Sleepless in Seattle)
Labels (node types / classes)
- :Person and :Movie (shown in the black “Person/Movie” tags)
Properties (attributes on nodes)
- For :Person: name, tmdbId, born
- For :Movie: title, tmdbId, released, imdbRating
Relationships (connections / edges)
- Type: :ACTED_IN connecting (:Person) -> (:Movie) (direction matters)
Relationship properties (attributes on relationships)
- role (e.g., “Jim Lovell”, “Annie Reed”) stored on the ACTED_IN relationship
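
The components above can be put together in one short statement. This is only a sketch: it uses the names from the diagram, omits the other properties (tmdbId, born, released, imdbRating), and uses MERGE so that running it twice does not create duplicates.

```cypher
// Node with label Person and a name property.
MERGE (tom:Person {name: "Tom Hanks"})
// Node with label Movie and a title property.
MERGE (apollo:Movie {title: "Apollo 13"})
// Directed relationship of type ACTED_IN from the person to the movie.
MERGE (tom)-[r:ACTED_IN]->(apollo)
// Relationship property stored on ACTED_IN itself.
SET r.role = "Jim Lovell"
RETURN tom, r, apollo
```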

Nodes

Nodes are the “things” (entities) in a Neo4j graph. Each node represents one real-world object or concept, such as a Person or a Movie in the diagram.

How to recognize a node

A node is identified by:

Labels (type/category)
Example: :Person, :Movie
Properties (key–value fields)
Example: name, born, title, released
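An internal Neo4j id (it exists, but you should not rely on it in app design: ids can be reused after deletion, and id() is deprecated in Neo4j 5 in favor of elementId(), which returns a string)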

How to identify (find) nodes in practice

1) By label + property (most common)

MATCH (p:Person {name: "Tom Hanks"})
RETURN p

2) By a unique identifier property (best practice)

Use something like tmdbId as a stable ID:

MATCH (m:Movie {tmdbId: 568})
RETURN m

Even better: enforce uniqueness with a constraint:

CREATE CONSTRAINT person_tmdbId_unique IF NOT EXISTS
FOR (p:Person) REQUIRE p.tmdbId IS UNIQUE;
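
With a uniqueness constraint in place, MERGE on the ID property behaves like a safe "create if missing" operation. The sketch below assumes a matching constraint on :Movie; the constraint name movie_tmdbId_unique is made up, and pairing tmdbId 568 with the title "Apollo 13" is an assumption for illustration. The two statements need to be run separately, since schema changes and data changes cannot share a transaction.

```cypher
// Hypothetical constraint name; mirrors the Person constraint for Movie.
CREATE CONSTRAINT movie_tmdbId_unique IF NOT EXISTS
FOR (m:Movie) REQUIRE m.tmdbId IS UNIQUE;

// Find the movie by its stable ID, or create it if it does not exist yet.
MERGE (m:Movie {tmdbId: 568})
// Title value is an assumption; it is only set when the node is newly created.
ON CREATE SET m.title = "Apollo 13"
RETURN m;
```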

3) By internal id (ok for debugging, not for long-term use)

MATCH (n) WHERE id(n) = 123
RETURN n

In the diagram

The circles are nodes:

Blue circles (Tom Hanks, Meg Ryan, …) are :Person nodes
Red circles (Apollo 13, …) are :Movie nodes

Relationships

Relationships are the connections (edges) between two nodes in Neo4j. They’re typed (have a name like ACTED_IN), directed (have a start and end node), and can also hold properties (like role).

In the diagram:

(Tom Hanks)-[:ACTED_IN {role:"Jim Lovell"}]->(Apollo 13)
(Meg Ryan)-[:ACTED_IN {role:"Annie Reed"}]->(Sleepless in Seattle)

What makes up a relationship

Type: the relationship name, e.g. :ACTED_IN
Direction: from start node → end node
Properties: key–value data on the relationship, e.g. role
Internal relationship id (exists, mainly for debugging)

How to identify (find) relationships

1) By pattern (most common)

Find all acting relationships for Tom Hanks:

MATCH (:Person {name:"Tom Hanks"})-[r:ACTED_IN]->(m:Movie)
RETURN r, m

2) By relationship properties

Find acting relationships where the role is “Jim Lovell”:

MATCH (p:Person)-[r:ACTED_IN {role:"Jim Lovell"}]->(m:Movie)
RETURN p, m

3) By start/end node + type

Find the relationship specifically between Tom Hanks and Apollo 13:

MATCH (:Person {name:"Tom Hanks"})-[r:ACTED_IN]->(:Movie {title:"Apollo 13"})
RETURN r

4) By internal relationship id (debugging)

MATCH ()-[r]->() WHERE id(r) = 42
RETURN r

Tips / best practice

Identify relationships in apps by (start node, type, end node) and/or relationship properties.
Unlike nodes, relationships typically don’t have a “natural primary key” unless you model one yourself.
You can add constraints/indexes for relationship properties in newer Neo4j versions, but most of the time you query them via the surrounding pattern.
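
An index on a relationship property might be declared as in the following sketch, assuming a Neo4j version that supports relationship property indexes. The index name acted_in_role is a made-up identifier.

```cypher
// Hypothetical index name; indexes the role property on ACTED_IN relationships.
CREATE INDEX acted_in_role IF NOT EXISTS
FOR ()-[r:ACTED_IN]-() ON (r.role);
```

Such an index can help queries that filter on the role property alone, without a specific start or end node to anchor the pattern.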
