Cloud - MultiCloud Application Migration

Introduction

Docker - In a Nutshell

Introduction

Apache Airflow - Introduction

Introduction The most important task in data engineering is building a data workflow. A workflow typically gathers data from different sources, transforms the raw data, and then loads the result into the target destination.

Apache Spark - SparkSQL Performance Tuning

Introduction

Apache Spark - SparkSQL Running in Clusters

Introduction Our previous experiments all ran in local mode. But the point of using Spark is to add more worker nodes and run in cluster mode to scale computation.