Migrating ArcGIS Data Workflows to the Cloud
Solution Architecture Choices and Tradeoffs: Migrating ArcGIS Data Workflows from On-Premises to Microservices and ArcGIS Online SaaS
Migrating geospatial data workflows from traditional on-premises ArcGIS Server environments to cloud-native architectures involves several architectural decisions, each with its own tradeoffs. This post explores the options for moving a feature service backed by ArcGIS Server and a local SDE (Spatial Database Engine) database on Microsoft SQL Server to the cloud. It focuses on two approaches: a lift-and-shift migration to AWS, and a serverless, SaaS-integrated workflow built on ArcGIS Online.
The On-Premises Baseline
Traditionally, many organizations have managed their geospatial data using:
- ArcGIS Server: Hosting feature services for web and desktop GIS clients.
- SDE (ArcSDE) with Microsoft SQL Server: Storing spatial data in a local enterprise geodatabase.
- ArcGIS Pro: Used for data processing, scripting, and ETL tasks, often running on dedicated servers.
While this setup offers full control and integration with enterprise systems, it comes with significant operational overhead, including hardware maintenance, software updates, and scaling challenges.
Option 1: Lift-and-Shift to AWS
A straightforward migration path is to replicate the on-premises architecture in the cloud:
Architecture Overview
- ArcGIS Server: Deployed on EC2 instances in AWS.
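- Database: Migrated to Amazon Aurora PostgreSQL (with PostGIS for spatial support). Because the source database is Microsoft SQL Server, this step changes the database engine rather than moving like for like, so SQL Server-specific views, stored procedures, and scripts need review.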
- Data Processing: ArcGIS Pro installed on a Windows EC2 instance for scheduled or ad-hoc processing.
Pros
- Minimal Change: Existing workflows, scripts, and integrations can often be reused with little modification.
- Familiarity: Staff can leverage existing skills and tools.
- Control: Full control over server configuration, security, and update cycles.
Cons
- Ongoing Maintenance: Servers and databases still require patching, monitoring, and scaling.
- Cost: Always-on resources (EC2, Aurora) can be expensive, especially for variable workloads.
- Limited Cloud-Native Benefits: The architecture does not fully leverage the elasticity and managed services of the cloud.
Option 2: Serverless and SaaS-Integrated Workflow
A more modern approach is to re-architect the workflow using cloud-native and SaaS services:
Architecture Overview
- Data Upload: Input data is uploaded to AWS S3.
- Processing: AWS Lambda functions process the data (e.g., validation, transformation, conversion to GeoPackage).
- Data Storage: Processed data is appended to an ArcGIS Online Feature Layer, which is SaaS-hosted and can ingest data from GeoPackages.
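The entry point of this pipeline can be sketched as a Lambda handler that reads the S3 event notification and decides which uploads to process. This is a minimal sketch, not a definitive implementation: the `SUPPORTED_FORMATS` mapping and the function names are hypothetical, and the actual conversion to GeoPackage is left out. The `Records[].s3.bucket.name` and `Records[].s3.object.key` fields follow the standard S3 event notification layout, in which object keys arrive URL-encoded.

```python
from pathlib import PurePosixPath
from urllib.parse import unquote_plus

# Hypothetical mapping from file extension to a processing route.
SUPPORTED_FORMATS = {".nc": "netcdf", ".zip": "shapefile", ".csv": "csv"}


def parse_s3_records(event):
    """Return (bucket, key, format) tuples for each supported object in an S3 event."""
    uploads = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys are URL-encoded in S3 event notifications.
        key = unquote_plus(record["s3"]["object"]["key"])
        suffix = PurePosixPath(key).suffix.lower()
        if suffix in SUPPORTED_FORMATS:
            uploads.append((bucket, key, SUPPORTED_FORMATS[suffix]))
    return uploads


def handler(event, context):
    """Lambda entry point: report which uploads would be processed.

    The download, validation, and GeoPackage conversion steps are omitted here.
    """
    uploads = parse_s3_records(event)
    return {
        "accepted": [
            {"bucket": bucket, "key": key, "format": fmt}
            for bucket, key, fmt in uploads
        ]
    }
```

Keeping the event parsing in its own function means it can be tested locally with a hand-written event dictionary, without deploying to Lambda.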
Pros
- No Server Maintenance: No need to manage EC2 instances or database servers.
- Scalability: Lambda functions scale automatically with demand.
- Cost Efficiency: Pay only for what you use (S3, Lambda, ArcGIS Online storage).
- Integration: ArcGIS Online provides robust sharing, visualization, and analysis tools out-of-the-box.
Cons
- Workflow Redesign: Requires rethinking data processing and integration logic for stateless, event-driven execution.
- Limits and Quotas: ArcGIS Online has limits on feature layer size, API calls, and data ingestion rates. Because of the ingestion-rate limit, we are considering upgrading to an M2 feature data store in ArcGIS Online to get dedicated I/O. Storing large data volumes in ArcGIS Online may also be cost-prohibitive.
- Data Model Changes: Some advanced geodatabase features (e.g., versioning, complex relationships) may not be supported in ArcGIS Online.
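One common way to work within ingestion-rate limits is to send features in small batches and back off when a request is rejected. The sketch below illustrates that pattern in plain Python. The batch size, retry count, and delays are assumptions to tune against your own quota, and `send` is a hypothetical callable standing in for whatever actually pushes a batch to the feature layer.

```python
import time
from itertools import islice


def batched(items, size):
    """Yield successive lists of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def send_with_backoff(batch, send, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call send(batch), retrying with exponential backoff on failure.

    `send` is a hypothetical callable; the last failure is re-raised
    once max_attempts is exhausted.
    """
    for attempt in range(max_attempts):
        try:
            return send(batch)
        except Exception:
            if attempt == max_attempts - 1:
                raise
            # Wait 1x, 2x, 4x ... the base delay before the next attempt.
            sleep(base_delay * 2 ** attempt)
```

A caller would loop with `for batch in batched(features, 500): send_with_backoff(batch, send)`, where 500 is only a placeholder batch size.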
Key Tradeoffs
| Aspect | Lift-and-Shift (AWS) | Serverless + ArcGIS Online SaaS |
|---|---|---|
| Maintenance | High (servers, DB) | Low (managed services) |
| Cost | Fixed; idle capacity is still billed | Usage-based, potentially lower |
| Scalability | Manual (resize servers/DB) | Automatic (Lambda, S3, SaaS) |
| Flexibility | Full control | Limited to SaaS/API capabilities |
| Modernization | Minimal | High (cloud-native, event-driven) |
| Migration Effort | Lower (reuse existing code) | Higher (re-architect workflows) |
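The cost row can be made concrete with a rough break-even calculation. The sketch below is a simplification under stated assumptions: every rate is a parameter you supply from your own pricing, no real AWS or ArcGIS Online prices are built in, and the function names are hypothetical. The default of 730 hours is the average number of hours in a month (8,760 / 12).

```python
def always_on_monthly_cost(hourly_rate, hours_in_month=730):
    """Monthly cost of a resource billed for every hour, busy or idle."""
    return hourly_rate * hours_in_month


def usage_based_monthly_cost(runs, cost_per_run, fixed_subscription=0.0):
    """Monthly cost when you pay per processing run plus any flat fee."""
    return runs * cost_per_run + fixed_subscription


def break_even_runs(hourly_rate, cost_per_run, fixed_subscription=0.0, hours_in_month=730):
    """Runs per month above which the always-on option becomes cheaper."""
    if cost_per_run <= 0:
        raise ValueError("cost_per_run must be positive")
    margin = always_on_monthly_cost(hourly_rate, hours_in_month) - fixed_subscription
    # If the flat fee alone exceeds the always-on cost, break-even is zero runs.
    return max(margin, 0.0) / cost_per_run
```

If your expected number of runs per month sits well below the break-even figure, the usage-based model is likely cheaper. Above it, always-on capacity starts to pay for itself.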
Recommended Approach
For organizations seeking to minimize operational overhead and maximize the benefits of cloud and SaaS, the serverless and ArcGIS Online-based workflow may be the best long-term solution. By leveraging S3 for intermediate storage, AWS Lambda for processing, and ArcGIS Online for data hosting and sharing, you can build a scalable, cost-effective, and modern geospatial data pipeline.
Example Workflow:
- User uploads data (e.g., NetCDF, shapefile, CSV) to an S3 bucket.
- S3 event triggers a Lambda function to process the data, convert it to a GeoPackage, and perform any necessary validation or transformation.
- Lambda appends the processed data to an ArcGIS Online Feature Layer via the ArcGIS REST API.
- Data is immediately available for visualization, analysis, and sharing in ArcGIS Online.
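The append step in this workflow could look roughly like the sketch below, which only builds the HTTP request and does not send it. It assumes the GeoPackage has already been added to ArcGIS Online as an item, and that you hold a valid token. The parameter names (`appendItemId`, `appendUploadFormat`, `sourceTableName`, `upsert`, `rollbackOnFailure`) reflect my understanding of the feature layer Append operation and should be checked against the current ArcGIS REST API reference. The function name and its arguments are hypothetical.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_append_request(layer_url, token, item_id, source_table):
    """Build a POST request for a feature layer's append endpoint.

    Parameter names are assumptions based on the ArcGIS REST API Append
    operation; verify them against the official reference before use.
    """
    params = {
        "f": "json",
        "token": token,
        "appendItemId": item_id,             # item ID of the uploaded GeoPackage
        "appendUploadFormat": "geoPackage",
        "sourceTableName": source_table,     # table inside the GeoPackage
        "upsert": "false",                   # insert only, no updates
        "rollbackOnFailure": "true",
    }
    return Request(
        layer_url.rstrip("/") + "/append",
        data=urlencode(params).encode("utf-8"),
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would submit the request. Building it separately keeps the function easy to test, and keeps the token out of the URL.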
Conclusion
Migrating from on-premises ArcGIS Server and SDE to the cloud is an opportunity to modernize your geospatial workflows. While a lift-and-shift approach offers a quick path to the cloud, embracing serverless and SaaS solutions like ArcGIS Online can deliver greater agility, lower costs, and reduced maintenance in the long run. The right choice depends on your organization’s requirements, existing investments, and appetite for change.
Tags: #ArcGIS #CloudMigration #Serverless #AWS #ArcGISOnline #Geospatial #Architecture #SaaS