Query 1: Verify Data Sources
Confirm both the Delta lakehouse tables and the Neo4j UC JDBC connection are accessible before running federated queries.
Federation Method: remote_query() (UC JDBC) + Delta SQL
Verify Delta Lakehouse Tables
Count rows in each Delta table to confirm they are populated:
for table in ["aircraft", "systems", "sensors", "sensor_readings"]:
count = spark.sql(f"SELECT COUNT(*) AS cnt FROM {table}").collect()[0]["cnt"]
print(f" {table}: {count:,} rows")
Sample aircraft query:
SELECT `:ID(Aircraft)` AS aircraft_id, tail_number, model, manufacturer, operator
FROM aircraft LIMIT 5
Actual Output (from Databricks cluster)
============================================================ DELTA LAKEHOUSE TABLES ============================================================ aircraft: 20 rows systems: 80 rows sensors: 160 rows sensor_readings: 345,600 rows Sample aircraft data: +-----------+-----------+--------+------------+-----------+ |aircraft_id|tail_number|model |manufacturer|operator | +-----------+-----------+--------+------------+-----------+ |AC1001 |N95040A |B737-800|Boeing |ExampleAir | |AC1002 |N30268B |A320-200|Airbus |SkyWays | |AC1003 |N54980C |A321neo |Airbus |RegionalCo | |AC1004 |N37272D |E190 |Embraer |NorthernJet| |AC1005 |N53032E |B737-800|Boeing |ExampleAir | +-----------+-----------+--------+------------+-----------+
Verify Neo4j UC JDBC Connection
Use remote_query() to count nodes by label:
SELECT * FROM remote_query('neo4j_uc_connection',
query => 'SELECT COUNT(*) AS cnt FROM MaintenanceEvent')
SELECT * FROM remote_query('neo4j_uc_connection',
query => 'SELECT COUNT(*) AS cnt FROM Flight')
Test a graph traversal (Flight→Airport relationships):
SELECT * FROM remote_query('neo4j_uc_connection',
query => 'SELECT COUNT(*) AS cnt
FROM Flight f
NATURAL JOIN DEPARTS_FROM r
NATURAL JOIN Airport a')
What This Verifies
-
Delta tables are accessible in the current catalog/schema context
-
The Neo4j UC JDBC connection (
remote_query()) can execute SQL queries against Neo4j -
Neo4j’s SQL2Cypher translation handles
NATURAL JOINfor relationship traversals -
Both data sources are ready for federated queries in subsequent examples