Query 3: Sensor Health + Maintenance Correlation

Per-aircraft correlation of sensor health metrics (EGT, vibration) with maintenance event frequency. Aircraft with higher sensor readings may correlate with more frequent maintenance and this query reveals that relationship.

Federation Method: Neo4j Spark Connector → temp view → JOIN with Delta tables

Data Flow

Sensor Health + Maintenance Correlation Flow

Load Neo4j Data

Load maintenance events from Neo4j into a Spark temp view:

neo4j_maintenance = spark.read.format("org.neo4j.spark.DataSource") \
    .option("url", NEO4J_BOLT_URI) \
    .option("authentication.type", "basic") \
    .option("authentication.basic.username", NEO4J_USER) \
    .option("authentication.basic.password", NEO4J_PASSWORD) \
    .option("labels", "MaintenanceEvent") \
    .load()

neo4j_maintenance.createOrReplaceTempView("neo4j_maintenance")

Federated SQL Query

WITH aircraft_ref AS (
    SELECT `:ID(Aircraft)` AS aircraft_id, tail_number, model, manufacturer, operator
    FROM aircraft
),
sensor_health AS (
    SELECT
        sys.aircraft_id,
        ROUND(AVG(CASE WHEN sen.type = 'EGT' THEN r.value END), 1) AS avg_egt,
        ROUND(MAX(CASE WHEN sen.type = 'EGT' THEN r.value END), 1) AS max_egt,
        ROUND(AVG(CASE WHEN sen.type = 'Vibration' THEN r.value END), 4) AS avg_vibration,
        ROUND(MAX(CASE WHEN sen.type = 'Vibration' THEN r.value END), 4) AS max_vibration
    FROM sensor_readings r
    JOIN sensors sen ON r.sensor_id = sen.`:ID(Sensor)`
    JOIN systems sys ON sen.system_id = sys.`:ID(System)`
    GROUP BY sys.aircraft_id
),
maintenance_summary AS (
    SELECT
        aircraft_id,
        COUNT(*) AS total_events,
        SUM(CASE WHEN severity = 'CRITICAL' THEN 1 ELSE 0 END) AS critical,
        SUM(CASE WHEN severity = 'MAJOR' THEN 1 ELSE 0 END) AS major,
        SUM(CASE WHEN severity = 'MINOR' THEN 1 ELSE 0 END) AS minor
    FROM neo4j_maintenance
    GROUP BY aircraft_id
)
SELECT
    a.tail_number,
    a.model,
    a.operator,
    COALESCE(m.total_events, 0) AS maint_events,
    COALESCE(m.critical, 0) AS critical,
    COALESCE(m.major, 0) AS major,
    COALESCE(m.minor, 0) AS minor,
    s.avg_egt AS avg_egt_c,
    s.max_egt AS max_egt_c,
    s.avg_vibration AS avg_vib_ips,
    s.max_vibration AS max_vib_ips
FROM aircraft_ref a
LEFT JOIN maintenance_summary m ON a.aircraft_id = m.aircraft_id
LEFT JOIN sensor_health s ON a.aircraft_id = s.aircraft_id
ORDER BY m.total_events DESC NULLS LAST

What This Query Does

  • Neo4j (Spark Connector): Loads all MaintenanceEvent nodes with their properties, grouped by aircraft and severity

  • Delta (4-table join): Aggregates sensor readings per aircraft, computing avg/max for EGT and vibration

  • Federation: LEFT JOINs both summaries on aircraft_id so every aircraft appears even without maintenance events

Actual Output: Load Maintenance Events (Databricks Runtime 17.3 LTS)

The following output was captured from a live Databricks cluster using the Neo4j Spark Connector against Neo4j Aura:

Loaded 300 maintenance events from Neo4j

Sample maintenance events:
+-----------+---------------+--------+----------------------------+
|aircraft_id|fault          |severity|corrective_action           |
+-----------+---------------+--------+----------------------------+
|AC1002     |Contamination  |CRITICAL|Adjusted tolerance          |
|AC1017     |Sensor drift   |MINOR   |Replaced component          |
|AC1003     |Leak           |MAJOR   |Replaced component          |
|AC1009     |Fuel starvation|CRITICAL|Inspected and found no fault|
|AC1008     |Bearing wear   |MINOR   |Replaced component          |
+-----------+---------------+--------+----------------------------+
only showing top 5 rows

Actual Output: Federated Query (Databricks Runtime 17.3 LTS)

This federated result was produced live on Databricks, JOINing Neo4j maintenance events with Delta sensor readings:

Sensor Health + Maintenance Correlation
Delta: sensor_readings, sensors, systems, aircraft
Neo4j: MaintenanceEvent nodes (Spark Connector)
====================================================================================================
+-----------+--------+-----------+------------+--------+-----+-----+---------+---------+-----------+-----------+
|tail_number|model   |operator   |maint_events|critical|major|minor|avg_egt_c|max_egt_c|avg_vib_ips|max_vib_ips|
+-----------+--------+-----------+------------+--------+-----+-----+---------+---------+-----------+-----------+
|N76758N    |A320-200|SkyWays    |26          |7       |13   |6    |671.6    |698.2    |0.2659     |0.4911     |
|N53032E    |B737-800|ExampleAir |25          |8       |9    |8    |671.6    |698.4    |0.266      |0.4913     |
|N54980C    |A321neo |RegionalCo |20          |8       |6    |6    |671.6    |697.8    |0.266      |0.4898     |
|N46224T    |E190    |NorthernJet|18          |4       |9    |5    |671.6    |697.1    |0.2661     |0.4922     |
|N30268B    |A320-200|SkyWays    |18          |6       |8    |4    |671.6    |697.2    |0.2659     |0.4907     |
|N86057G    |A321neo |RegionalCo |18          |6       |4    |8    |671.6    |698.5    |0.2659     |0.4874     |
|N84110I    |B737-800|ExampleAir |17          |6       |4    |7    |671.6    |699.3    |0.2659     |0.4912     |
|N37272D    |E190    |NorthernJet|15          |6       |4    |5    |671.6    |698.3    |0.266      |0.4875     |
|N96107S    |A321neo |RegionalCo |15          |3       |3    |9    |671.6    |697.6    |0.2659     |0.4922     |
|N81338F    |A320-200|SkyWays    |14          |6       |3    |5    |671.6    |698.8    |0.2659     |0.4943     |
|N15332P    |E190    |NorthernJet|14          |2       |4    |8    |671.6    |697.3    |0.2658     |0.4901     |
|N13211H    |E190    |NorthernJet|13          |6       |3    |4    |671.6    |699.1    |0.266      |0.4897     |
|N26760M    |B737-800|ExampleAir |13          |6       |5    |2    |671.6    |697.0    |0.2659     |0.4884     |
|N44342Q    |B737-800|ExampleAir |13          |5       |2    |6    |671.6    |697.9    |0.2658     |0.4953     |
|N32276J    |A320-200|SkyWays    |12          |3       |4    |5    |671.6    |698.9    |0.266      |0.4929     |
|N89365K    |A321neo |RegionalCo |12          |5       |2    |5    |671.6    |696.9    |0.2659     |0.4898     |
|N95040A    |B737-800|ExampleAir |11          |5       |4    |2    |671.6    |697.8    |0.2659     |0.4886     |
|N84703L    |E190    |NorthernJet|10          |5       |1    |4    |671.6    |698.3    |0.2659     |0.4976     |
|N65164O    |A321neo |RegionalCo |9           |3       |2    |4    |671.7    |697.3    |0.2659     |0.4902     |
|N63098R    |A320-200|SkyWays    |7           |1       |4    |2    |671.6    |699.3    |0.2658     |0.4959     |
+-----------+--------+-----------+------------+--------+-----+-----+---------+---------+-----------+-----------+

Why Spark Connector Here

UC JDBC’s remote_query() wraps queries in subqueries for schema inference, which prevents GROUP BY. The Spark Connector loads row-level data that Spark SQL can freely aggregate and JOIN, which is essential for per-aircraft grouping.