GeoTrellis is an open-source geographic data processing library designed to work with large geospatial raster data sets. It is written in Scala and released under the Apache 2.0 license.
Description
GeoTrellis' core competency is raster data processing: enabling distributed processing of large geospatial raster data sets using the techniques of map algebra. In addition to support for raster data operations, GeoTrellis includes some support for operations using vector and point cloud data.
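As an illustration of what map algebra means in practice, the following minimal sketch shows two classic kinds of operation on small in-memory rasters: a local operation, which combines rasters cell by cell, and a focal operation, which computes each output cell from its neighbourhood. It is written in plain Python and does not use the GeoTrellis API; the function names `local_add` and `focal_mean` and the sample data are assumptions made for this example only.

```python
# Illustrative map algebra on tiny rasters stored as lists of rows.
# This is NOT the GeoTrellis API; names and data are hypothetical.

def local_add(a, b):
    """Local operation: add two equally sized rasters cell by cell."""
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]

def focal_mean(raster, radius=1):
    """Focal operation: mean of the square neighbourhood around each cell.

    The neighbourhood is clipped at the raster edges, so border cells
    average over fewer values.
    """
    rows, cols = len(raster), len(raster[0])
    out = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            window = [
                raster[rr][cc]
                for rr in range(max(0, r - radius), min(rows, r + radius + 1))
                for cc in range(max(0, c - radius), min(cols, c + radius + 1))
            ]
            out_row.append(sum(window) / len(window))
        out.append(out_row)
    return out

elevation = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
offset = [[10, 10, 10], [10, 10, 10], [10, 10, 10]]

summed = local_add(elevation, offset)   # cell-by-cell sum
smoothed = focal_mean(elevation)        # 3x3 moving average
```

In a distributed setting the same operations are applied tile by tile; focal operations additionally need cells from neighbouring tiles along the tile borders.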
GeoTrellis leverages Apache Spark for distributed processing. Distributed processing relies on indexing large datasets with a multi-dimensional space-filling curve (SFC). An SFC translates a multi-dimensional index into a one-dimensional index while preserving geospatial locality, so that cells close together in space tend to be close together in the index. This allows large datasets to be read and written efficiently in parallel across multiple computers.
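The idea of a space-filling curve index can be sketched with the Z-order (Morton) curve, one common SFC. The text above does not say which curve is meant, so the choice of Z-order here is an assumption, and the functions `z_index` and `z_decode` are hypothetical helpers written for this example rather than part of GeoTrellis. The sketch interleaves the bits of a tile's column and row to produce a single integer key.

```python
# Minimal Z-order (Morton) index for 2D tile keys.
# Assumption: Z-order is used purely as an example of an SFC.

def z_index(col, row, bits=16):
    """Interleave the low `bits` bits of col and row into one integer."""
    index = 0
    for i in range(bits):
        index |= ((col >> i) & 1) << (2 * i)        # col bits go to even positions
        index |= ((row >> i) & 1) << (2 * i + 1)    # row bits go to odd positions
    return index

def z_decode(index, bits=16):
    """Inverse of z_index: recover (col, row) from the interleaved key."""
    col = row = 0
    for i in range(bits):
        col |= ((index >> (2 * i)) & 1) << i
        row |= ((index >> (2 * i + 1)) & 1) << i
    return col, row

# Sort a 4x4 grid of tile keys along the curve.
tiles = [(c, r) for r in range(4) for c in range(4)]
ordered = sorted(tiles, key=lambda t: z_index(*t))
```

Sorting by the one-dimensional key keeps each 2x2 block of tiles contiguous, so a range of keys corresponds to a compact region of space. That property is what lets a store partition tiles by key range and serve spatial queries with a small number of range reads.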
Python bindings have been developed for GeoTrellis as a sub-project called GeoPySpark that enables Python developers to access and use the GeoTrellis library.
Project History
GeoTrellis started as a research project at Azavea, a geospatial software company based in Philadelphia. A precursor software component, DecisionTree, was developed beginning in 2006 with support from a Small Business Innovation Research grant from the U.S. Department of Agriculture. In 2009, with financial support from the William Penn Foundation and Stroud Water Research Center, Azavea embarked on early development of GeoTrellis.
GeoTrellis was released as an open source project in 2011 with the goal of supporting fast processing of geospatial raster data at scale.
GeoTrellis initially supported distributed computation through Akka, a Scala framework for building concurrent and distributed applications. The need to support additional use cases and features, such as caching and sharding datasets across a storage cluster, led to a search for a new distribution framework. GeoTrellis moved to Apache Spark as its distribution engine in 2014 in order to leverage the management, scheduling, and other features of the Spark framework. One key use case that drove this phase of development was the need to efficiently process large spatiotemporal datasets like those used in many earth science applications, such as climate change research. The move to Apache Spark enabled efficient support for large climate forecast datasets published by the Intergovernmental Panel on Climate Change (IPCC).
GeoTrellis joined the Eclipse Foundation's LocationTech working group in 2013 and graduated from incubation with a 1.0 release in December 2016.
GeoTrellis has been used in a number of geospatial domains including: satellite image processing, forest growth simulation, agricultural yield predictions, planning, digital humanities, government infrastructure investment, and machine learning to support crime risk forecasting.
External links
- GeoTrellis homepage
- GeoTrellis Source
- Azavea homepage
- LocationTech homepage
- Raster Foundry homepage
- GeoPySpark Source
Source of the article : Wikipedia