Geospatial Anomaly Detection: Part 1 — Massively Scalable Geospatial Anomaly Detection with Apache Kafka and Cassandra

Part 1: The Problem and Initial Ideas

1. Space: The Final Frontier

[Figure: Black hole in a galaxy far away]
[Figure: Project Blue Book archives]

2. The Geospatial Anomaly Detection Problem

[Figure: Space]
[Figure: Location and scale challenges]
[Figure: Treasure Island]
[Figure: Events spaced far apart]
[Figure: Near events]
[Figure: Flat earth theory]

3. Latitude and Longitude

CREATE TABLE latlong (
    country text,
    time timestamp,
    lat double,
    long double,
    PRIMARY KEY (country, time)
) WITH CLUSTERING ORDER BY (time DESC);
select * from latlong where country='nz' and lat = -39.1296 and long = 175.6358 limit 50;
"Cannot execute this query as it might involve data filtering and thus may have unpredictable performance. If you want to execute this query despite the performance unpredictability, use ALLOW FILTERING"
select * from latlong where country='nz' and lat = -39.1296 and long = 175.6358 limit 50 allow filtering;
[Figure: Volcano]
[Figure: Square and stationary earth]
[Figure: Earth longitude and latitude]
[Figure: Mercator map]
[Figure: Earth cylindrical projection]
[Figure: Haversine formula]
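
The Haversine formula named in the figure above gives the great-circle distance between two latitude/longitude points on a sphere. The following is a minimal Python sketch of it, not code from the original post. The function name `haversine_km` and the mean Earth radius of 6371 km are assumptions chosen for illustration.

```python
from math import asin, cos, radians, sin, sqrt

# Assumed constant: mean Earth radius in kilometres (the Earth is not a perfect sphere).
EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1 = radians(lat1)
    phi2 = radians(lat2)
    dphi = radians(lat2 - lat1)
    dlam = radians(lon2 - lon1)
    # Haversine of the central angle between the two points.
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


if __name__ == "__main__":
    # One degree of latitude along a meridian is roughly 111 km.
    print(round(haversine_km(0.0, 0.0, 1.0, 0.0), 1))
```

A distance function like this is accurate but awkward to use as a database predicate, which is one reason to fall back on simpler shapes such as a bounding box.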

4. Bounding Box

[Figure: Bounding box query]
select * from latlong where country='nz' and lat >= -39.58 and lat <= -38.67 and long >= 175.18 and long <= 176.08 limit 50 allow filtering;
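
The bounds in the query above are close to the earlier point (-39.1296, 175.6358) plus or minus 0.45 degrees on each axis. As a rough sketch of how such a box could be derived and checked on the client side, the Python below uses two hypothetical helpers, `bounding_box` and `in_box`. The 0.45 degree half-width is an assumption inferred from the query, and the sketch ignores the poles, the antimeridian, and the fact that a degree of longitude shrinks away from the equator.

```python
def bounding_box(lat, lon, half_side_deg):
    """Return (lat_min, lat_max, lon_min, lon_max) for a box centred on (lat, lon).

    The box is measured in degrees, so it is only roughly square on the ground.
    """
    return (
        lat - half_side_deg,
        lat + half_side_deg,
        lon - half_side_deg,
        lon + half_side_deg,
    )


def in_box(lat, lon, box):
    """True if the point lies inside the box (edges included)."""
    lat_min, lat_max, lon_min, lon_max = box
    return lat_min <= lat <= lat_max and lon_min <= lon <= lon_max


if __name__ == "__main__":
    # Assumed centre and half-width, chosen to resemble the query bounds.
    box = bounding_box(-39.1296, 175.6358, 0.45)
    print(box)
    print(in_box(-39.2, 175.7, box))   # a nearby point
    print(in_box(-41.3, 174.8, box))   # a point well outside the box
```

The four values returned by `bounding_box` correspond to the four inequality bounds in the CQL `where` clause.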

5. Indexing

5.1 Clustering Columns

CREATE TABLE latlong (
    country text,
    time timestamp,
    lat double,
    long double,
    PRIMARY KEY (country, lat, long, time)
) WITH CLUSTERING ORDER BY (lat DESC, long DESC, time DESC);
select * from latlong where country='nz' and lat = -39.58 and long >= 175.18 and long <= 176.08 limit 50;
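
The query above restricts `lat` by equality and only `long` by a range. One way to see why that shape suits clustering columns is that rows within a partition are kept sorted by (lat, long, time), so such a query selects one contiguous run of rows. The Python below is a toy model of that idea, not Cassandra's actual storage engine: the sample rows are made up, the helper name `slice_lat_eq_long_range` is hypothetical, and ascending order is used for simplicity even though the table declares DESC.

```python
from bisect import bisect_left, bisect_right

# Made-up rows for one partition, sorted by (lat, long, time) like clustering columns.
rows = sorted([
    (-39.58, 175.10, 3),
    (-39.58, 175.20, 1),
    (-39.58, 175.90, 2),
    (-39.58, 176.50, 4),
    (-39.00, 175.50, 5),
])


def slice_lat_eq_long_range(sorted_rows, lat, long_min, long_max):
    """Return rows with the given lat and long_min <= long <= long_max.

    Because the rows are sorted, the matches form one contiguous slice that
    two binary searches can locate without scanning the whole partition.
    """
    start = bisect_left(sorted_rows, (lat, long_min))
    end = bisect_right(sorted_rows, (lat, long_max, float("inf")))
    return sorted_rows[start:end]


if __name__ == "__main__":
    for row in slice_lat_eq_long_range(rows, -39.58, 175.18, 176.08):
        print(row)
```

A range on `lat` as well would span many different `lat` values, each with its own run of `long` values, so the matching rows would no longer be contiguous in this ordering.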

5.2 Secondary Indexes

create index i1 on latlong (lat);
create index i2 on latlong (long);

5.3 SASI

  • AND or OR combinations of queries.
  • Wildcard search in string values.
  • Range queries.
create custom index i3 on latlong (long) using 'org.apache.cassandra.index.sasi.SASIIndex';

create custom index i4 on latlong (lat) using 'org.apache.cassandra.index.sasi.SASIIndex';

Instaclustr

Managed platform for open source technologies including Apache Cassandra, Apache Kafka, Apache ZooKeeper, Redis, Elasticsearch and PostgreSQL.