

# Earth Observation Jobs
<a name="geospatial-eoj"></a>

Using an Earth Observation job (EOJ), you can acquire, transform, and visualize geospatial data to make predictions. You can choose an operation based on your use case from a wide range of operations and models. You get the flexibility of choosing your area of interest, selecting the data providers, and setting time-range based and cloud-cover-percentage-based filters. After SageMaker AI creates an EOJ for you, you can visualize the inputs and outputs of the job using the visualization functionality. An EOJ has various use cases that include comparing deforestation over time and diagnosing plant health. You can create an EOJ by using a SageMaker notebook with a SageMaker geospatial image. You can also access the SageMaker geospatial UI as a part of Amazon SageMaker Studio Classic UI to view the list of all your jobs. You can also use the UI to pause or stop an ongoing job. You can choose a job from the list of available EOJ to view the **Job summary**, the **Job details** as well as visualize the **Job output**.

**Topics**
+ [Create an Earth Observation Job Using a Amazon SageMaker Studio Classic Notebook with a SageMaker geospatial Image](geospatial-eoj-ntb.md)
+ [Types of Operations](geospatial-eoj-models.md)

# Create an Earth Observation Job Using a Amazon SageMaker Studio Classic Notebook with a SageMaker geospatial Image
<a name="geospatial-eoj-ntb"></a>

**To use a SageMaker Studio Classic notebook with a SageMaker geospatial image:**

1. From the **Launcher**, choose **Change environment** under **Notebooks and compute resources**.

1. Next, the **Change environment** dialog opens.

1. Select the **Image** dropdown and choose **Geospatial 1.0**. The **Instance type** should be **ml.geospatial.interactive**. Do not change the default values for other settings.

1. Choose **Select**.

1. Choose **Create notebook**.

You can initiate an EOJ using a Amazon SageMaker Studio Classic notebook with a SageMaker geospatial image using the code provided below.

```
import boto3
import sagemaker
import sagemaker_geospatial_map

session = boto3.Session()
execution_role = sagemaker.get_execution_role()
sg_client = session.client(service_name="sagemaker-geospatial")
```

The following is an example showing how to create an EOJ in the in the US West (Oregon) Region.

```
#Query and Access Data
search_rdc_args = {
    "Arn": "arn:aws:sagemaker-geospatial:us-west-2:378778860802:raster-data-collection/public/nmqj48dcu3g7ayw8",  # sentinel-2 L2A COG
    "RasterDataCollectionQuery": {
        "AreaOfInterest": {
            "AreaOfInterestGeometry": {
                "PolygonGeometry": {
                    "Coordinates": [
                        [
                            [-114.529, 36.142],
                            [-114.373, 36.142],
                            [-114.373, 36.411],
                            [-114.529, 36.411],
                            [-114.529, 36.142],
                        ]
                    ]
                }
            }
        },
        "TimeRangeFilter": {
            "StartTime": "2021-01-01T00:00:00Z",
            "EndTime": "2022-07-10T23:59:59Z",
        },
        "PropertyFilters": {
            "Properties": [{"Property": {"EoCloudCover": {"LowerBound": 0, "UpperBound": 1}}}],
            "LogicalOperator": "AND",
        },
        "BandFilter": ["visual"],
    },
}

tci_urls = []
data_manifests = []
while search_rdc_args.get("NextToken", True):
    search_result = sg_client.search_raster_data_collection(**search_rdc_args)
    if search_result.get("NextToken"):
        data_manifests.append(search_result)
    for item in search_result["Items"]:
        tci_url = item["Assets"]["visual"]["Href"]
        print(tci_url)
        tci_urls.append(tci_url)

    search_rdc_args["NextToken"] = search_result.get("NextToken")
        
# Perform land cover segmentation on images returned from the sentinel dataset.
eoj_input_config = {
    "RasterDataCollectionQuery": {
        "RasterDataCollectionArn": "arn:aws:sagemaker-geospatial:us-west-2:378778860802:raster-data-collection/public/nmqj48dcu3g7ayw8",
        "AreaOfInterest": {
            "AreaOfInterestGeometry": {
                "PolygonGeometry": {
                    "Coordinates": [
                        [
                            [-114.529, 36.142],
                            [-114.373, 36.142],
                            [-114.373, 36.411],
                            [-114.529, 36.411],
                            [-114.529, 36.142],
                        ]
                    ]
                }
            }
        },
        "TimeRangeFilter": {
            "StartTime": "2021-01-01T00:00:00Z",
            "EndTime": "2022-07-10T23:59:59Z",
        },
        "PropertyFilters": {
            "Properties": [{"Property": {"EoCloudCover": {"LowerBound": 0, "UpperBound": 1}}}],
            "LogicalOperator": "AND",
        },
    }
}
eoj_config = {"LandCoverSegmentationConfig": {}}

response = sg_client.start_earth_observation_job(
    Name="lake-mead-landcover",
    InputConfig=eoj_input_config,
    JobConfig=eoj_config,
    ExecutionRoleArn=execution_role,
)
```

After your EOJ is created, the `Arn` is returned to you. You use the `Arn` to identify a job and perform further operations. To get the status of a job, you can run `sg_client.get_earth_observation_job(Arn = response['Arn'])`.

The following example shows how to query the status of an EOJ until it is completed.

```
eoj_arn = response["Arn"]
job_details = sg_client.get_earth_observation_job(Arn=eoj_arn)
{k: v for k, v in job_details.items() if k in ["Arn", "Status", "DurationInSeconds"]}
# List all jobs in the account
sg_client.list_earth_observation_jobs()["EarthObservationJobSummaries"]
```

After the EOJ is completed, you can visualize the EOJ outputs directly in the notebook. The following example shows you how an interactive map can be rendered.

```
map = sagemaker_geospatial_map.create_map({
'is_raster': True
})
map.set_sagemaker_geospatial_client(sg_client)
# render the map
map.render()
```

The following example shows how the map can be centered on an area of interest and the input and output of the EOJ can be rendered as separate layers within the map.

```
# visualize the area of interest
config = {"label": "Lake Mead AOI"}
aoi_layer = map.visualize_eoj_aoi(Arn=eoj_arn, config=config)

# Visualize input.
time_range_filter = {
    "start_date": "2022-07-01T00:00:00Z",
    "end_date": "2022-07-10T23:59:59Z",
}
config = {"label": "Input"}

input_layer = map.visualize_eoj_input(
    Arn=eoj_arn, config=config, time_range_filter=time_range_filter
)
# Visualize output, EOJ needs to be in completed status.
time_range_filter = {
    "start_date": "2022-07-01T00:00:00Z",
    "end_date": "2022-07-10T23:59:59Z",
}
config = {"preset": "singleBand", "band_name": "mask"}
output_layer = map.visualize_eoj_output(
    Arn=eoj_arn, config=config, time_range_filter=time_range_filter
)
```

You can use the `export_earth_observation_job` function to export the EOJ results to your Amazon S3 bucket. The export function makes it convenient to share results across teams. SageMaker AI also simplifies dataset management. We can simply share the EOJ results using the job ARN, instead of crawling thousands of files in the S3 bucket. Each EOJ becomes an asset in the data catalog, as results can be grouped by the job ARN. The following example shows how you can export the results of an EOJ.

```
sagemaker_session = sagemaker.Session()
s3_bucket_name = sagemaker_session.default_bucket()  # Replace with your own bucket if needed
s3_bucket = session.resource("s3").Bucket(s3_bucket_name)
prefix = "eoj_lakemead"  # Replace with the S3 prefix desired
export_bucket_and_key = f"s3://{s3_bucket_name}/{prefix}/"

eoj_output_config = {"S3Data": {"S3Uri": export_bucket_and_key}}
export_response = sg_client.export_earth_observation_job(
    Arn=eoj_arn,
    ExecutionRoleArn=execution_role,
    OutputConfig=eoj_output_config,
    ExportSourceImages=False,
)
```

You can monitor the status of your export job using the following snippet.

```
# Monitor the export job status
export_job_details = sg_client.get_earth_observation_job(Arn=export_response["Arn"])
{k: v for k, v in export_job_details.items() if k in ["Arn", "Status", "DurationInSeconds"]}
```

You are not charged the storage fees after you delete the EOJ.

For an example that showcases how to run an EOJ, see this [blog post](https://aws.amazon.com/blogs/machine-learning/monitoring-lake-mead-drought-using-the-new-amazon-sagemaker-geospatial-capabilities/).

For more example notebooks on SageMaker geospatial capabilities, see this [GitHub repository](https://github.com/aws/amazon-sagemaker-examples/tree/main/sagemaker-geospatial).

# Types of Operations
<a name="geospatial-eoj-models"></a>

When you create an EOJ, you select an operation based on your use case. Amazon SageMaker geospatial capabilities provide a combination of purpose-built operations and pre-trained models. You can use these operations to understand the impact of environmental changes and human activities over time or identify cloud and cloud-free pixels.

**Cloud Masking**

Identify clouds in satellite images is an essential pre-processing step in producing high-quality geospatial data. Ignoring cloud pixels can lead to errors in analysis, and over-detection of cloud pixels can decrease the number of valid observations. Cloud masking has the ability to identify cloudy and cloud-free pixels in satellite images. An accurate cloud mask helps get satellite images for processing and improves data generation. The following is the class map for cloud masking.

```
{
0: "No_cloud",
1: "cloud"
}
```

**Cloud Removal**

Cloud removal for Sentinel-2 data uses an ML-based semantic segmentation model to identify clouds in the image. Cloudy pixels can be replaced by with pixels from other timestamps. USGS Landsat data contains landsat metadata that is used for cloud removal.

**Temporal Statistics**

Temporal statistics calculate statistics for geospatial data through time. The temporal statistics currently supported include mean, median, and standard deviation. You can calculate these statistics by using `GROUPBY` and set it to either `all` or `yearly`. You can also mention the `TargetBands`.

**Zonal Statistics**

Zonal statistics performs statistical operations over a specified area on the image. 

**Resampling**

Resampling is used to upscale and downscale the resolution of a geospatial image. The `value` attribute in resampling represents the length of a side of the pixel.

**Geomosaic**

Geomosaic allows you to stitch smaller images into a large image.

**Band Stacking**

Band stacking takes more than one image band as input and stacks them into a single GeoTIFF. The `OutputResolution` attribute determines the resolution of the output image. Based on the resolutions of the input images, you can set it to `lowest`, `highest` or `average`.

**Band Math**

Band Math, also known as Spectral Index, is a process of transforming the observations from multiple spectral bands to a single band, indicating the relative abundance of features of interests. For instance, Normalized Difference Vegetation Index (NDVI) and Enhanced Vegetation Index (EVI) are helpful for observing the presence of green vegetation features.

**Land Cover Segmentation**

Land Cover segmentation is a semantic segmentation model that has the capability to identify the physical material, such as vegetation, water, and bare ground, at the earth surface. Having an accurate way to map the land cover patterns helps you understand the impact of environmental change and human activities over time. Land Cover segmentation is often used for region planning, disaster response, ecological management, and environmental impact assessment. The following is the class map for Land Cover segmentation.

```
{
0: "No_data",
1: "Saturated_or_defective",
2: "Dark_area_pixels",
3: "Cloud_shadows",
4: "Vegetation",
5: "Not_vegetated",
6: "Water",
7: "Unclassified",
8: "Cloud_medium_probability",
9: "Cloud_high_probability",
10: "Thin_cirrus",
11: "Snow_ice"
}
```

## Availability of EOJ Operations
<a name="geospatial-eoj-models-avail"></a>

The availability of operations depends on whether you are using the SageMaker geospatial UI or the Amazon SageMaker Studio Classic notebooks with a SageMaker geospatial image. Currently, notebooks support all functionalities. To summarize, the following geospatial operations are supported by SageMaker AI:


| Operations |  Description  |  Availability  | 
| --- | --- | --- | 
| Cloud Masking | Identify cloud and cloud-free pixels to get improved and accurate satellite imagery. | UI, Notebook | 
| Cloud Removal | Remove pixels containing parts of a cloud from satellite imagery. | Notebook | 
| Temporal Statistics | Calculate statistics through time for a given GeoTIFF. | Notebook | 
| Zonal Statistics | Calculate statistics on user-defined regions. | Notebook | 
| Resampling | Scale images to different resolutions. | Notebook | 
| Geomosaic | Combine multiple images for greater fidelity. | Notebook | 
| Band Stacking | Combine multiple spectral bands to create a single image. | Notebook | 
| Band Math / Spectral Index | Obtain a combination of spectral bands that indicate the abundance of features of interest. | UI, Notebook | 
| Land Cover Segmentation | Identify land cover types such as vegetation and water in satellite imagery. | UI, Notebook | 