Moving and preparing data in a multicloud environment can mean repeatedly running and managing multiple scripts. AWS DataSync is a secure service that automates and accelerates the movement of data between AWS, on-premises file systems, and other cloud storage services, so you do not need to write and run scripts to manage repeated transfers. With DataSync, you can access data across 12 storage locations spanning other clouds, on-premises systems, and the edge, and move it to and from AWS to support your workflows and processing.
Make data preparation easier with AWS Glue, a serverless data integration service. With AWS Glue, you can discover and connect to over 80 diverse data sources, including analytics services and other cloud databases such as Google BigQuery. You can also manage your data in a centralized data catalog, and visually create, run, and monitor ETL (extract, transform, and load) pipelines to load data into your data lakes. Connectors let you move data bidirectionally between Amazon S3 and either Azure Blob Storage or Azure Data Lake Storage. You can also use new database connectors for AWS Glue for Apache Spark, including Teradata, SAP HANA, Azure SQL, Azure Cosmos DB, Vertica, and MongoDB.
Want to safely collaborate with your partners without copying or sharing source data, even when datasets are stored outside AWS? With AWS Clean Rooms, you can use privacy-enhancing controls to gain insights from your partners’ datasets across multiple data sources and clouds, such as Amazon S3, Amazon Athena, and Snowflake, with zero ETL and without needing to copy, share, or move your underlying data.