Data Science Articles


Deploying Airflow DAGs using Google Cloud Composer

"Airflow is a platform to programmatically author, schedule and monitor workflows. Use Airflow to author workflows as Directed Acyclic Graphs (DAGs) of tasks. The Airflow scheduler executes your tasks on an array of workers while following the specified dependencies. Rich command line utilities make performing complex surgeries on DAGs a …


SMOTE (Synthetic Minority Oversampling Technique)

Several machine learning classification techniques tend to perform poorly on datasets where the target class (the minority class) represents a small fraction of the overall data. However, sometimes it is the minority class that we are interested in. Examples include medical applications in which we try to predict the occurrence …


Minkowski distance and its effects on KNN Classification

Minkowski distance is a generalized version of the distance calculations we are accustomed to. It can be defined as: $D(x, y) = \left(\sum_{i=1}^{n} |x_i - y_i|^p\right)^{1/p}$. Euclidean & Manhattan distance: Manhattan distances are the sum of absolute differences between the Cartesian coordinates of the points in question. Manhattan distances can be thought of as the sum of the …
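
As a rough illustration of the distances named in this excerpt, here is a minimal Python sketch. It assumes the standard Minkowski form, the p-th root of the sum of |x_i - y_i|^p, where p = 1 gives the Manhattan distance and p = 2 gives the Euclidean distance. The function name `minkowski_distance` and the sample points are hypothetical and are not taken from the article.

```python
def minkowski_distance(a, b, p):
    """Minkowski distance of order p between two equal-length points.

    Hypothetical helper for illustration; assumes p >= 1.
    """
    if len(a) != len(b):
        raise ValueError("points must have the same number of coordinates")
    if p < 1:
        raise ValueError("p must be >= 1")
    # Sum the p-th powers of the absolute coordinate differences,
    # then take the p-th root of that sum.
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


# Made-up sample points, chosen so the results are easy to check by hand.
point_a = (0.0, 0.0)
point_b = (3.0, 4.0)

manhattan = minkowski_distance(point_a, point_b, p=1)  # |3| + |4|
euclidean = minkowski_distance(point_a, point_b, p=2)  # sqrt(3^2 + 4^2)

print("Manhattan:", manhattan)
print("Euclidean:", euclidean)
```

Changing `p` changes which points count as "nearest", which is why the choice of `p` can affect a KNN classifier's predictions.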