Merge pull request #43 from water3d/main
Release 2023.11.13
nickrsan authored Nov 15, 2023
2 parents 8d2fb16 + da8cb2c commit 9190a30
Showing 13 changed files with 106 additions and 58 deletions.
11 changes: 7 additions & 4 deletions .github/workflows/github-release.yml
@@ -24,11 +24,14 @@ jobs:
version=$(python -c "from eedl import __version__; print(__version__)")
echo "VERSION=$version" >> $GITHUB_ENV
- name: Build WHL file
run: |
pip install setuptools wheel
python setup.py bdist_wheel
- name: Create a Release
uses: elgohr/Github-Release-Action@v5
- name: Create and attach whl to Release
env:
GH_TOKEN: ${{ github.token }}

with:
title: ${{ env.VERSION }}
run: |
gh release create ${{env.VERSION}} ./dist/*.whl --generate-notes
21 changes: 21 additions & 0 deletions LICENSE
@@ -0,0 +1,21 @@
MIT License

Copyright © 2023 Regents of the University of California

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
15 changes: 8 additions & 7 deletions README.md
@@ -1,4 +1,4 @@
![EEDL Logo](docs/source/_static/logo/logo_black.png)
<img src="https://raw.githubusercontent.com/water3d/eedl/release/docs/source/_static/logo/logo_black.png" alt="EEDL Logo">

# Earth Engine Downloader

@@ -14,27 +14,28 @@ Earth Engine's export quotas still apply, especially for EECUs. For academic acc
not tested them on a commercial account.

## Installation
The package is still in development and we have not yet published to PyPI (pip) or conda, but have built infrastructure
for both. Current installation is to download this repository then run `python setup.py install`
EEDL users should take care to install the dependency on GDAL *before* installing EEDL itself. See below for more information.
After installing GDAL, EEDL is available on PyPI via pip as `python -m pip install eedl`, and can also
be downloaded from the [GitHub releases page](https://github.com/water3d/eedl/releases/).

EEDL is tested on Python 3.8-3.11 on Windows and Linux with both standard CPython and Anaconda distributions. EEDL is pure
Python, but depends on GDAL, which has numerous compiled C++ dependencies.
Python, but depends on GDAL, which has numerous compiled C++ dependencies where installation varies by platform.

### Windows
Windows users may want to use Anaconda, or [see this writeup about installing GDAL and other spatial packages on Windows](https://github.com/nickrsan/spatial_resources/edit/main/installing_spatial_python_windows.md).
To install GDAL, Windows users may want to use Anaconda, or [see this writeup about installing GDAL and other spatial packages on Windows](https://github.com/nickrsan/spatial_resources/edit/main/installing_spatial_python_windows.md).

### Linux
Linux users should follow the [GDAL
installation guide](https://pypi.org/project/GDAL/) and 1) Ensure that the gdal-bin and gdal-dev packages are installed and 2) The gdal version they install
for Python matches the gdal version of the system packages (`ogrinfo --version`). We don't pin a version of GDAL to allow
for this workflow. Further details in the GDAL documentation
for this workflow.

## Documentation
Documentation is under development at https://eedl.readthedocs.io. API documentation is most complete, but noisy right
now. We are working on additional details to enable full use of the package.

## Licensing
Licensing is still in progress with the University of California, but we are aiming for a permissive license. More to come.
EEDL is licensed under the MIT license. See <a href="https://github.com/water3d/eedl/blob/main/LICENSE">GitHub's license text and summary</a> for more details of what you can do with it.

## Authors
EEDL has been built by Nick Santos and Adam Crawford as part of the [Secure Water Future](https://securewaterfuture.net) project. This work is supported
13 changes: 12 additions & 1 deletion docs/source/user_guide/export_locations.rst
@@ -60,4 +60,15 @@ to the bucket, so those do not become public.
in Drive, want to bill export storage to a Cloud project, want
the files available externally, or want the most reliable
export method, using Google Cloud exports may be a good option
for you.
for you.

Due to the way EEDL currently pulls images from Google Cloud Storage, there
is a limit of 1000 tiles per exported image, though EEDL can wait on more than
1000 tiles at a time across multiple image exports. Few single-image exports
will hit this limit: by default, EEDL allows 12800 pixels per side of a tile,
which can be raised or lowered if your image requires it (especially for multiband images).
Broad-scale, high-resolution images may hit this limit; for those, you will either need
to export to Google Drive or build an improved implementation
for pulling tiles from Google Cloud. The limit stems from Google Cloud Storage's
public bucket listing providing only 1000 items per page; EEDL doesn't detect or traverse
additional pages at this time.
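
As a rough illustration of when the 1000-tile limit might come into play, the sketch below estimates a tile count from image dimensions. It assumes tiles form a simple grid with at most 12800 pixels per side (the default stated above). `estimate_tile_count` and `exceeds_listing_limit` are hypothetical helpers, not part of EEDL, and Earth Engine's actual tiling may differ.

```python
import math

# Number of items in one public bucket listing, per the text above.
GCS_LISTING_LIMIT = 1000


def estimate_tile_count(width_px: int, height_px: int, tile_size: int = 12800) -> int:
    """Estimate the number of tiles for an export, assuming a simple grid (hypothetical helper)."""
    tiles_across = math.ceil(width_px / tile_size)
    tiles_down = math.ceil(height_px / tile_size)
    return tiles_across * tiles_down


def exceeds_listing_limit(width_px: int, height_px: int, tile_size: int = 12800) -> bool:
    """True when the estimated tile count is larger than a single listing can return."""
    return estimate_tile_count(width_px, height_px, tile_size) > GCS_LISTING_LIMIT
```

Under these assumptions, a 400000 x 400000 pixel image would split into 32 x 32 = 1024 tiles, just past the limit, while most smaller exports stay well under it.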
2 changes: 1 addition & 1 deletion eedl/__init__.py
@@ -1 +1 @@
__version__ = "0.2023.10.18"
__version__ = "0.2023.11.13"
8 changes: 4 additions & 4 deletions eedl/core.py
@@ -26,10 +26,10 @@ def _get_fiona_args(polygon_path: Union[str, Path]) -> Dict[str, Union[str, Path

def safe_fiona_open(features_path: Union[str, Path], **extra_kwargs) -> fiona.Collection:
"""
Handles opening things in fiona in a way that is safe, even for geodatabases where we need
to open the geodatabase itself and specify a layer. The caller is responsible for
ensuring the features are closed (e.g. a try/finally block with a call to features.close()
in the finally block should immediately follow calling this function.
Handles opening things in fiona in a way that is safe, even for geodatabases where we need
to open the geodatabase itself and specify a layer. The caller is responsible for
ensuring the features are closed (e.g. a try/finally block with a call to features.close()
in the finally block should immediately follow calling this function).
:param features_path: A Path object or string path to open with fiona
:param extra_kwargs: Keyword arguments to directly pass through to fiona. Helpful when trying to filter features, etc
:return:
29 changes: 19 additions & 10 deletions eedl/helpers.py
@@ -56,11 +56,20 @@ def __init__(self, **kwargs):

def _single_item_extract(self, image, task_registry, zonal_features, aoi_attr, ee_geom, image_date, aoi_download_folder):
"""
This looks a bit silly here, but we need to construct this here so that we have access
to this method's variables since we can't pass them in and it can't be a class function.
:param image:
:param state:
:return:
This looks a bit silly here, but we need to construct this here so that we have access
to this method's variables since we can't pass them in and it can't be a class function.
Args:
image:
task_registry:
zonal_features:
aoi_attr:
ee_geom:
image_date:
aoi_download_folder:
Returns:
None
"""

export_image = EEDLImage(
@@ -147,8 +156,8 @@ def extract(self):
zonal_features = zonal_features_filtered

image = aoi_collection.filter(ee.Filter.eq("system:time_start", image_info[1])).first() # Get the image from the collection again based on ID.
timsetamp_in_seconds = int(str(image_info[1])[:-3]) # We could divide by 1000, but then we'd coerce back from a float. This is precise.
date_string = datetime.datetime.fromtimestamp(timsetamp_in_seconds, tz=datetime.timezone.utc).strftime("%Y-%m-%d")
timestamp_in_seconds = int(str(image_info[1])[:-3]) # We could divide by 1000, but then we'd coerce back from a float. This is precise.
date_string = datetime.datetime.fromtimestamp(timestamp_in_seconds, tz=datetime.timezone.utc).strftime("%Y-%m-%d")

self._single_item_extract(image, task_registry, zonal_features, aoi_attr, ee_geom, date_string, aoi_download_folder)
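
The inline comment above notes that dividing by 1000 would round-trip through a float. Integer floor division (`//`) is another way to avoid that. The sketch below is illustrative only, with a hypothetical helper name `ms_to_date_string`, and assumes a non-negative integer millisecond timestamp such as Earth Engine's `system:time_start`.

```python
import datetime


def ms_to_date_string(timestamp_ms: int) -> str:
    """Convert a millisecond epoch timestamp to a UTC YYYY-MM-DD string (hypothetical helper)."""
    # Floor division keeps the value an int, so no float is created.
    timestamp_in_seconds = timestamp_ms // 1000
    as_utc = datetime.datetime.fromtimestamp(timestamp_in_seconds, tz=datetime.timezone.utc)
    return as_utc.strftime("%Y-%m-%d")
```

For timestamps of at least four digits, `timestamp_ms // 1000` gives the same integer as slicing the last three characters off the string form.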

@@ -188,9 +197,9 @@ def _get_and_filter_collection(self):

def mosaic_by_date(image_collection):
"""
Adapted to Python from code found via https://gis.stackexchange.com/a/343453/1955
:param image_collection: An image collection
:return: ee.ImageCollection
Adapted to Python from code found via https://gis.stackexchange.com/a/343453/1955
:param image_collection: An image collection
:return: ee.ImageCollection
"""
image_list = image_collection.toList(image_collection.size())

35 changes: 21 additions & 14 deletions eedl/image.py
@@ -81,7 +81,7 @@ def __init__(self) -> None:
self.log_file: Optional[io.TextIOWrapper] = None # the open log file handle
self.raise_errors: bool = True

def add(self, image: ee.image.Image) -> None:
def add(self, image: "EEDLImage") -> None:
"""
Adds an Earth Engine image to the list of Earth Engine images.
@@ -94,7 +94,7 @@ def add(self, image: ee.image.Image) -> None:
self.images.append(image)

@property
def incomplete_tasks(self) -> List[ee.image.Image]:
def incomplete_tasks(self) -> List["EEDLImage"]:
"""
List of Earth Engine images that have not been completed yet.
@@ -108,7 +108,7 @@ def incomplete_tasks(self) -> List[ee.image.Image]:
return [image for image in self.images if image.last_task_status['state'] in self.INCOMPLETE_STATUSES]

@property
def complete_tasks(self) -> List[ee.image.Image]:
def complete_tasks(self) -> List["EEDLImage"]:
"""
List of Earth Engine images.
@@ -118,7 +118,7 @@ def complete_tasks(self) -> List[ee.image.Image]:
return [image for image in self.images if image.last_task_status['state'] in self.COMPLETE_STATUSES + self.FAILED_STATUSES]

@property
def failed_tasks(self) -> List[ee.image.Image]:
def failed_tasks(self) -> List["EEDLImage"]:
"""
List of Earth Engine images that have either been cancelled or that have failed
@@ -128,7 +128,7 @@ def failed_tasks(self) -> List[ee.image.Image]:
return [image for image in self.images if image.last_task_status['state'] in self.FAILED_STATUSES]

@property
def downloadable_tasks(self) -> List[ee.image.Image]:
def downloadable_tasks(self) -> List["EEDLImage"]:
"""
List of Earth Engine images that have not been cancelled or have failed.
@@ -165,9 +165,12 @@ def setup_log(self, log_file_path: Union[str, Path], mode='a'):

def log_error(self, error_type: str, error_message: str):
"""
:param error_type: Options "ee", "local" to indicate whether it was an error on Earth Engine's side or on
the local processing side
:param error_message: The error message to print to the log file
Args:
error_type (str): Options "ee", "local" to indicate whether it was an error on Earth Engine's side or on the local processing side
error_message (str): The error message to print to the log file
Returns:
None
"""
message = f"{error_type} Error: {error_message}"
date_string = datetime.datetime.now().strftime("%Y-%m-%d %H:%M")
@@ -215,15 +218,15 @@ def wait_for_images(self,
self.download_ready_images(download_location)
except OSError:
if try_again_disk_full:
print("OSError reported. Disk may be full - will try again - clear space")
print("OSError reported. Invalid disk or the disk may be full - will try again - clear space")
pass
else:
raise

time.sleep(sleep_time)

if len(self.failed_tasks) > 0:
message = f"{len(self.failed_tasks)} images failed to export. Example error message from first" \
message = f"{len(self.failed_tasks)} image(s) failed to export. Example error message from first" \
f" failed image \"{self.failed_tasks[0].last_task_status['description']}\" was" \
f" \"{self.failed_tasks[0].last_task_status['error_message']}\"." \
f" Check https://code.earthengine.google.com/tasks in your web browser to see status and" \
@@ -370,7 +373,7 @@ def export(self,
clip (Optional[ee.geometry.Geometry]): Defines the region of interest for export - does not perform a strict clip, which is often slower.
Instead, it uses the Earth Engine export's "region" parameter to clip the results to the bounding box of
the clip geometry. To clip to the actual geometry, set strict_clip to True.
strict_clip (Optional[bool]: When set to True, performs a true clip on the result so that it's not just the bounding box but also the
strict_clip (Optional[bool]): When set to True, performs a true clip on the result so that it's not just the bounding box but also the
actual clipping geometry. Defaults to False.
drive_root_folder (Optional[Union[str, Path]]): The folder for exporting if "drive" is selected.
@@ -544,12 +547,11 @@ def zonal_stats(self,
report_threshold: int = 1000,
write_batch_size: int = 2000,
use_points: bool = False,
inject_constants: dict = dict(),
inject_constants: Optional[dict] = None,
nodata_value: int = -9999,
all_touched: bool = False
) -> None:
"""
Args:
polygons (Union[str, Path]):
keep_fields (tuple[str, ...]):
@@ -558,10 +560,15 @@ def zonal_stats(self,
Set to None to disable.
write_batch_size (int): How many zones should we store up before writing to the disk? Defaults to 2000.
use_points (bool):
inject_constants(Optional[dict]):
nodata_value (int):
all_touched (bool):
Returns:
None
"""
if inject_constants is None:
inject_constants = dict()

self.zonal_output_filepath = zonal.zonal_stats(
polygons,
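
The change to `inject_constants` above swaps a mutable default (`dict()`) for `None` plus an in-body check. The sketch below shows why that matters in general Python; the two functions are hypothetical stand-ins and do not reproduce `zonal_stats` itself.

```python
from typing import Optional


def shared_default(key: str, value: int, constants: dict = dict()) -> dict:
    """Mutable default: the same dict object is reused across calls."""
    constants[key] = value
    return constants


def fresh_default(key: str, value: int, constants: Optional[dict] = None) -> dict:
    """None default: a new dict is created on every call that omits the argument."""
    if constants is None:
        constants = dict()
    constants[key] = value
    return constants
```

A default value is evaluated once, when the function is defined, so anything written into the shared dict in one call is still there in the next. The `None` pattern avoids that carry-over.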
@@ -580,7 +587,7 @@ def zonal_stats(self,

def _check_task_status(self) -> Dict[str, Union[Dict[str, str], bool]]:
"""
Updates the status is it needs to be changed
Updates the status if it needs to be changed
Returns:
Dict[str, Union[Dict[str, str], bool]]: Returns a dictionary of the most up-to-date status and whether that status was changed
5 changes: 2 additions & 3 deletions eedl/mosaic_rasters.py
@@ -3,7 +3,6 @@
import tempfile
from pathlib import Path
from typing import Sequence, Union

from osgeo import gdal


@@ -21,7 +20,7 @@ def mosaic_folder(folder_path: Union[str, Path], output_path: Union[str, Path],
"""
tifs = [os.path.join(folder_path, filename) for filename in os.listdir(folder_path) if filename.endswith(".tif") and filename.startswith(prefix)]

if len(tifs) == 1: # If we only got one image back, don't both mosaicking, though this will also skip generating overviews.
if len(tifs) == 1: # If we only got one image back, don't bother mosaicking, though this will also skip generating overviews.
shutil.move(tifs[0], output_path) # Just move the output image to the "mosaic" name, then return.
return

@@ -44,7 +43,7 @@ def mosaic_rasters(raster_paths: Sequence[Union[str, Path]],
"""

# gdal.SetConfigOption("GTIFF_SRC_SOURCE", "GEOKEYS")
vrt_path = tempfile.mktemp(suffix=".vrt", prefix="mosaic_rasters_")
vrt_path = tempfile.mkstemp(suffix=".vrt", prefix="mosaic_rasters_")

vrt_options = gdal.BuildVRTOptions(resampleAlg='nearest', resolution="highest")
my_vrt = gdal.BuildVRT(vrt_path, raster_paths, options=vrt_options)
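
One thing to watch in the `mktemp` to `mkstemp` swap above: `tempfile.mkstemp` returns a `(file_descriptor, path)` tuple rather than a path string, so `vrt_path` as written would be a tuple when passed to `gdal.BuildVRT`. A minimal sketch of one way to get a plain path follows. `make_vrt_path` is a hypothetical helper, not code from this commit, and it assumes GDAL can then write over the empty file that `mkstemp` creates.

```python
import os
import tempfile


def make_vrt_path() -> str:
    """Create a temporary .vrt file securely and return its path as a string (hypothetical helper)."""
    file_descriptor, vrt_path = tempfile.mkstemp(suffix=".vrt", prefix="mosaic_rasters_")
    # Close the OS-level handle right away; a later writer would open the file by path.
    os.close(file_descriptor)
    return vrt_path
```

The caller remains responsible for deleting the file once the mosaic is built.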
6 changes: 2 additions & 4 deletions examples/basic.py
@@ -1,10 +1,8 @@
import ee
from ee import ImageCollection

# we should change the name of our Image class - it conflicts with the class image in the ee package, and people will
# likely be using both. Let's not cause confusion
from eedl.image import EEDLImage
import eedl
from eedl.image import EEDLImage


def test_simple() -> None:
@@ -14,7 +12,7 @@ def test_simple() -> None:
# Adam, make sure to set the drive root folder for your own testing - we'll need to fix this, and in the future,
# we can use a Google Cloud bucket for most testing this is clunky - we should make the instantiation of the image be able to take a kwarg that sets the value of image, I think.
image = EEDLImage()
image.export(s2_image, "valley_water_s2_test_image", export_type="Drive", drive_root_folder=r"G:\My Drive", clip=geometry)
image.export(s2_image, "valley_water_s2_test_image", export_type="Drive", clip=geometry, drive_root_folder=r"G:\My Drive")

# We need to make it check and report whether the export on the EE side was successful. This test "passed" because Earth Engine failed and there wasn't anything to download (oops)
# Adam, make sure to set the folder you want results to be downloaded to
6 changes: 3 additions & 3 deletions examples/get_gridmet_eto.py
@@ -4,10 +4,10 @@
ee.Initialize()


def get_days(month, year):
def get_days(input_month: str, input_year: int) -> int:
days = {
'01': 31,
'02': 28 if not year == 2020 else 29,
'02': 28 if not input_year == 2020 else 29,
'03': 31,
'04': 30,
'05': 31,
@@ -20,7 +20,7 @@ def get_days(month, year):
'12': 31
}

return days[month]
return days[input_month]


years = (2019, 2020, 2021)
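
The `get_days` lookup above hardcodes 2020 as the only leap year, which is enough for the three years listed but would give 28 days for February in other leap years. As a sketch of an alternative (not the example's actual code), the standard library's `calendar.monthrange` handles every year; the signature below mirrors the one in the diff.

```python
import calendar


def get_days(input_month: str, input_year: int) -> int:
    """Return the number of days in a month given as a zero-padded string like '02'."""
    # monthrange returns (weekday of the first day, number of days in the month).
    return calendar.monthrange(input_year, int(input_month))[1]
```

With this version, extending `years` to include another leap year such as 2024 needs no change to the lookup.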