Query 1: Verify Data Sources

Confirm both the Delta lakehouse tables and the Neo4j UC JDBC connection are accessible before running federated queries.

Federation Method: remote_query() (UC JDBC) + Delta SQL

Data Flow

Verify Data Sources Flow

Verify Delta Lakehouse Tables

Count rows in each Delta table to confirm they are populated:

for table in ["aircraft", "systems", "sensors", "sensor_readings"]:
    count = spark.sql(f"SELECT COUNT(*) AS cnt FROM {table}").collect()[0]["cnt"]
    print(f"  {table}: {count:,} rows")

Sample aircraft query:

SELECT `:ID(Aircraft)` AS aircraft_id, tail_number, model, manufacturer, operator
FROM aircraft LIMIT 5
Actual Output (from Databricks cluster)
============================================================
DELTA LAKEHOUSE TABLES
============================================================
  aircraft: 20 rows
  systems: 80 rows
  sensors: 160 rows
  sensor_readings: 345,600 rows

Sample aircraft data:
+-----------+-----------+--------+------------+-----------+
|aircraft_id|tail_number|model   |manufacturer|operator   |
+-----------+-----------+--------+------------+-----------+
|AC1001     |N95040A    |B737-800|Boeing      |ExampleAir |
|AC1002     |N30268B    |A320-200|Airbus      |SkyWays    |
|AC1003     |N54980C    |A321neo |Airbus      |RegionalCo |
|AC1004     |N37272D    |E190    |Embraer     |NorthernJet|
|AC1005     |N53032E    |B737-800|Boeing      |ExampleAir |
+-----------+-----------+--------+------------+-----------+

Verify Neo4j UC JDBC Connection

Use remote_query() to count nodes by label:

SELECT * FROM remote_query('neo4j_uc_connection',
    query => 'SELECT COUNT(*) AS cnt FROM MaintenanceEvent')

SELECT * FROM remote_query('neo4j_uc_connection',
    query => 'SELECT COUNT(*) AS cnt FROM Flight')

Test a graph traversal (Flight→Airport relationships):

SELECT * FROM remote_query('neo4j_uc_connection',
    query => 'SELECT COUNT(*) AS cnt
              FROM Flight f
              NATURAL JOIN DEPARTS_FROM r
              NATURAL JOIN Airport a')

What This Verifies

  • Delta tables are accessible in the current catalog/schema context

  • The Neo4j UC JDBC connection (remote_query()) can execute SQL queries against Neo4j

  • Neo4j’s SQL2Cypher translation handles NATURAL JOIN for relationship traversals

  • Both data sources are ready for federated queries in subsequent examples