How to Install Scikit-Learn?

Before we start: This Python tutorial is a part of our series of Python Package tutorials.

Scikit-learn is an open source machine learning library for Python. You have a number of options when it comes to installing scikit-learn, depending on your needs:

  • If you don’t have Python installed, you can install scikit-learn as part of a Python distribution, such as ActivePython.
  • If you already have Python and prefer to install pre-built binaries, you can install scikit-learn by simply running the following command:

    pip install scikit-learn
  • Pre-built binaries may contain malicious code, especially if you mistakenly install a typo-squatted version. Instead, consider installing Python libraries from source code. The simplest way to build scikit-learn from source is to use the ActiveState Platform to automatically build and package it for Windows, Mac or Linux.

Scikit-Learn Step by Step Installation

For most users, the best approach is to install the binary version of scikit-learn using an official release from pypi.org, the Python Package Index. You can do so with the following steps:

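1. Scikit-learn requires Python 3.6+ at the time of writing; newer releases require more recent Python versions, so check the scikit-learn installation documentation for the current minimum. To check which version of Python you have installed, run the following command: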

python3 --version

The output should be similar to:

Python 3.8.2
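The same version check can also be made from inside Python. The following is a minimal sketch, assuming the 3.6+ minimum stated above; the helper name `python_is_supported` is hypothetical and chosen only for illustration:

```python
import sys

# Minimum version taken from the requirement stated above (Python 3.6+).
# Adjust this if the scikit-learn release you install needs a newer Python.
MINIMUM_PYTHON = (3, 6)


def python_is_supported(version_info=sys.version_info, minimum=MINIMUM_PYTHON):
    """Return True if the (major, minor) version is at least `minimum`."""
    return tuple(version_info[:2]) >= tuple(minimum)


current = ".".join(str(part) for part in sys.version_info[:3])
if python_is_supported():
    print(f"Python {current} meets the minimum requirement")
else:
    print(f"Python {current} is too old for this requirement")
```

Comparing tuples rather than strings avoids mistakes such as treating "3.10" as older than "3.6".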

2. If you have a supported Python version, you can run the following command to download and install a pre-built binary of scikit-learn:

pip install scikit-learn

The following dependencies will be automatically installed along with scikit-learn (the minimum versions shown apply to the release that was current at the time of writing):

  • NumPy 1.13.3+
  • SciPy 0.19.1+
  • Joblib 0.11+
  • threadpoolctl 2.0.0+
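To see whether the dependencies listed above are present and recent enough, you could run a small check like the one below. This is only a sketch: it assumes Python 3.8+ (for the standard-library `importlib.metadata` module), it copies the minimum versions from the list above, and the helper names `version_tuple`, `meets_minimum`, and `check_dependencies` are hypothetical. The version comparison is deliberately simple and only looks at the leading numeric part of a version string.

```python
import re
from importlib import metadata  # standard library in Python 3.8+

# Minimum versions copied from the list above.
MINIMUMS = {
    "numpy": "1.13.3",
    "scipy": "0.19.1",
    "joblib": "0.11",
    "threadpoolctl": "2.0.0",
}


def version_tuple(version):
    """Convert the leading dotted-number part of a version to a tuple of ints."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if not match:
        return ()
    return tuple(int(part) for part in match.group(0).split("."))


def meets_minimum(installed, minimum):
    """Return True if `installed` is at least `minimum` (numeric parts only)."""
    a, b = version_tuple(installed), version_tuple(minimum)
    length = max(len(a), len(b))
    # Pad with zeros so that "0.11" and "0.11.0" compare as equal.
    a += (0,) * (length - len(a))
    b += (0,) * (length - len(b))
    return a >= b


def check_dependencies(minimums=MINIMUMS):
    """Map each package to True/False (meets minimum) or None (not installed)."""
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = None
        else:
            report[name] = meets_minimum(installed, minimum)
    return report


for name, status in check_dependencies().items():
    label = {True: "OK", False: "too old", None: "not installed"}[status]
    print(f"{name}: {label}")
```

In practice pip resolves these requirements for you during installation, so a check like this is mainly useful when diagnosing an existing environment.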

Alternatively, if scikit-learn is already installed, you can upgrade it to the latest release by running the following command. By default, pip also upgrades any installed dependencies that are too old for the new release:

pip install -U scikit-learn

You can verify your scikit-learn installation with the following command:

python -m pip show scikit-learn

The output lists the package metadata, including the name, installed version, summary, install location, and required dependencies.
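You can also confirm that the package is importable from the interpreter you intend to use, which catches the case where pip installed it for a different Python. The sketch below is one way to do that; the function name `sklearn_version` is hypothetical:

```python
import importlib


def sklearn_version():
    """Return the scikit-learn version string, or None if it cannot be imported."""
    try:
        module = importlib.import_module("sklearn")
    except ImportError:
        return None
    return getattr(module, "__version__", None)


version = sklearn_version()
if version is None:
    print("scikit-learn is not importable from this interpreter")
else:
    print(f"scikit-learn {version} is installed")
```

If this reports that the package is not importable even though the pip command succeeded, pip and python most likely point at different Python installations.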

If you want to create plots and charts based on the data you use in scikit-learn, you may also want to consider installing matplotlib. For information about matplotlib and how to install it, refer to ‘What is Matplotlib in Python?’

How to Import Scikit-Learn in Python

Once scikit-learn is installed, you can start working with it. Note that the package is installed under the name scikit-learn but imported under the name sklearn. A scikit-learn script begins by importing the library:

import sklearn

It’s not necessary to import the entire scikit-learn library. Instead, import just the modules or classes you need for your project. For example, to import the linear_model module, which contains the linear regression model, enter:

from sklearn import linear_model

Or import the LinearRegression class directly:

from sklearn.linear_model import LinearRegression
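As a quick illustration of what the import gives you, the following sketch fits a LinearRegression model to a tiny made-up dataset. The data values are invented for this example (they follow y = 2x + 1 exactly) and are not from any real dataset:

```python
from sklearn.linear_model import LinearRegression

# Toy data invented for illustration: each target equals 2 * x + 1.
X = [[0.0], [1.0], [2.0], [3.0]]  # one feature per sample
y = [1.0, 3.0, 5.0, 7.0]

model = LinearRegression()
model.fit(X, y)

slope = model.coef_[0]          # fitted coefficient for the single feature
intercept = model.intercept_    # fitted intercept
prediction = model.predict([[4.0]])[0]

print(f"slope={slope:.2f}, intercept={intercept:.2f}, prediction={prediction:.2f}")
```

Because the toy data is perfectly linear, the fitted slope and intercept should come out very close to 2 and 1, and the prediction for x = 4 very close to 9.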

Get a version of Python, pre-compiled with Scikit-Learn and other popular ML Packages

ActivePython is the trusted Python distribution for Windows, Linux and Mac, pre-bundled with top Python packages for machine learning – free for development use.

Some Popular ML Packages You Get Pre-compiled – With ActivePython

Machine Learning:

  • TensorFlow (deep learning with neural networks)
  • scikit-learn (machine learning algorithms)
  • keras (high-level neural networks API)

Data Science:

  • pandas (data analysis)
  • NumPy (multidimensional arrays)
  • SciPy (algorithms to use with numpy)
  • HDF5 (store & manipulate data)
  • matplotlib (data visualization)

Get ActivePython for Machine Learning for Windows, macOS or Linux here.

Why use ActivePython instead of open source Python?

While the open source distribution of Python may be satisfactory for an individual, it doesn’t always meet the support, security, or platform requirements of large organizations.

This is why organizations choose ActivePython for their data science, big data processing and statistical analysis needs.

Pre-bundled with the most important packages Data Scientists need, ActivePython is pre-compiled so you and your team don’t have to waste time configuring the open source distribution. You can focus on what’s important–spending more time building algorithms and predictive models against your big data sources, and less time on system configuration.

ActivePython is 100% compatible with the open source Python distribution, and provides the security and commercial support that your organization requires.

With ActivePython you can explore and manipulate data, run statistical analysis, and deliver visualizations to share insights with your business users and executives sooner–no matter where your data lives.

Download ActivePython Community Edition to get started or contact us to learn more about using ActivePython in your organization.

Related Reads:

How to Clean Machine Learning Datasets Using Pandas

Python Cheatsheet for Machine Learning: Clever Tips and Tricks

Suhani S