Abstract
GeoTrellis is a Scala framework for fast, parallel processing of geospatial data, including raster data processing on Apache Spark. It currently supports Hadoop HDFS and Accumulo as Spark backends. Cassandra is another popular distributed data store. This project aims to improve the prototype implementation of the GeoTrellis Catalog for Cassandra so that raster layers can be processed via Spark RDDs, and to add vector RDD capabilities, with a focus on a performance-based indexing scheme.
Link to the 2015 GSoC proposal
Further info
- a few sentences about my preparations and considerations
- my Cassandra GeoTrellis contributions on GitHub
- some contributions to the GeoTrellis site documentation
- integrating a Cassandra benchmark