1
Building and operationalizing data processing systems
hard
A company is migrating its on-premises Hadoop cluster to Google Cloud. The existing cluster runs HDFS, Hive, and Spark jobs. The migration must minimize changes to existing job code and configuration. The data volume is 50 TB and growing. The team expects to run both batch and interactive SQL queries. Which architecture should they use?