How to do regression using Dask.
Generalized linear models are a broad family of commonly used models. These implementations scale out to massive datasets, either on a single machine or on a distributed cluster. They are powered by a range of optimized algorithms and support a range of regularizers.
These estimators follow the scikit-learn API, so they can drop into existing routines such as grid search and pipelines. At the same time, they are implemented with new, scalable algorithms, so they can consume distributed Dask arrays and DataFrames instead of only single-machine NumPy arrays and Pandas DataFrames.
Step 1- Importing Libraries.
#!pip install dask_glm
#!pip install dask_ml
from dask_glm.datasets import make_regression
from dask_ml.linear_model import LinearRegression
Separating the dataset into X and y using the predefined make_regression() function from Dask.
We will initialize the Linear Regression model, fit the dataset to it, and calculate the score in the conventional way.
# Create a chunked regression dataset as dask arrays
X, y = make_regression()
# Initialize and fit the linear model
lr = LinearRegression()
lr.fit(X, y)
# Predict on the same data; z is a lazy dask array
z = lr.predict(X)
# R^2 score of the fit
lr.score(X, y)
Visualizing the chunks of the predicted array.