This article shows you how to set up your machine for this data science course. It focuses on installing Python and some of the libraries commonly used for data science.
How to Install Python on Your System
1. Download the Python installer for your operating system from the links below.
Python 3.8.6 Installer for Windows
Python 3.8.6 Installer for Linux
Python 3.8.6 Installer for Mac
We know that Python 3.9 has already been released, but libraries like TensorFlow have not yet announced their test results and support for Python 3.9. If you already have another version of Python installed, please note that this course requires a version between 3.5 and 3.8. You can check which version is installed by typing python --version (or python3 --version on macOS and Linux) in your terminal/command prompt.
2. Double-click the downloaded Python installer file. You will see the welcome page. Click on Continue. (The screenshots here are for Mac users. For Windows/Linux users, the screen design varies, but the content remains almost the same. If you are a Windows user and are still facing difficulties, we recommend you follow this tutorial here.)
3. You will see the Read Me page. Click on Continue.
4. You will get a licence page. Click on Continue. A dialog box will pop up asking you to accept the terms. Click on Agree.
5. Click on Install to proceed with the installation. (If you want to change the default installation location you may select Change Install Location.)
Once the installation is done, verify it by typing the following command in your terminal/command prompt (on macOS and Linux, use python3 --version if python points to an older version):
python --version
If the output is Python 3.8.6, you are done with the installation part.
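The same check can also be done from inside Python. The snippet below is a minimal sketch, assuming the 3.5 to 3.8 range stated earlier in this tutorial; the names MIN_VERSION, MAX_VERSION and is_supported are made up for illustration and are not part of any library.

```python
import sys

# Range stated in this tutorial: Python 3.5 through 3.8 (inclusive).
MIN_VERSION = (3, 5)
MAX_VERSION = (3, 8)


def is_supported(version_info=None):
    """Return True if the (major, minor) version lies within the course range."""
    if version_info is None:
        version_info = sys.version_info
    major_minor = tuple(version_info[:2])
    return MIN_VERSION <= major_minor <= MAX_VERSION


if __name__ == "__main__":
    current = "{}.{}.{}".format(*sys.version_info[:3])
    status = "supported" if is_supported() else "outside the 3.5 to 3.8 range"
    print("Python {} is {}".format(current, status))
```

Save it to a file and run it with the interpreter you intend to use for the course; it reports the version of whichever Python executes it.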
Setting Up a Virtual Environment
This step is not mandatory, but it comes in handy when you have multiple applications using different versions of Python or its libraries. Say you have a Django application which uses Python 3.5 and a machine learning project which uses Python 3.8. In such cases you use virtual environments: one for the Django application with Python 3.5 installed, and another with Python 3.8 for the machine learning project. The packages and libraries used by the two projects then remain isolated from each other, so there won’t be any compatibility issues.
Please note that this matters most in production. If you need to bundle a project or migrate it from one server to another, virtual environments prove very useful. Nowadays we use Docker containers to make these things simpler than ever, which we will learn about later. Now, let us see how to set up virtual environments on our system.
- To create virtual environments, we make use of a Python library called ‘virtualenv’. We need to install that package first. To install any package, we have a tool called ‘pip’, which comes preinstalled with Python. Since pip is already on your system, things become easier. Just open up the command prompt and type ‘pip install packagename‘ to install any Python package. To uninstall one, use the command ‘pip uninstall packagename‘. So to install virtualenv, the required command is:
pip install virtualenv
- Now we can create any number of virtual environments using virtualenv. For example, to create a virtual environment named ‘venv’, first navigate to the folder where you want to create the virtual environment and use the following command:
virtualenv -p python3 venv
Here we use ‘python3’ in the command because we need to create the virtualenv using Python 3. If you have multiple versions of Python 3 installed, you may name a specific one, for example ‘virtualenv -p python3.5 venv‘. Replace python3.5 with your version of Python.
- Once you run the above command, it will create a folder named venv inside the folder you are in, which means the virtual environment ‘venv’ has been created. Now you need to activate the virtual environment. To do that, use the following command:
source venv/bin/activate
On Windows, the equivalent command is venv\Scripts\activate. Once your venv is activated, you will see (venv) at the start of your prompt; the name enclosed in brackets means you are inside the virtual environment. Next, we need to install the Python libraries used for data science work.
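If you are ever unsure whether a script is running inside a virtual environment, Python itself can tell you. The sketch below also creates a throwaway environment with the standard library's venv module, which is a different tool from the virtualenv package used above and is swapped in here only so the example needs no extra install. The helper names in_virtual_env and create_env are illustrative assumptions.

```python
import os
import sys
import tempfile
import venv


def in_virtual_env():
    """Return True when the running interpreter belongs to a virtual environment."""
    # Older virtualenv releases set sys.real_prefix; the stdlib venv module and
    # newer virtualenv releases make sys.prefix differ from sys.base_prefix.
    if hasattr(sys, "real_prefix"):
        return True
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


def create_env(target_dir):
    """Create a bare virtual environment (without pip) at target_dir."""
    venv.EnvBuilder(with_pip=False).create(target_dir)
    # Every environment made by the venv module contains this config file.
    return os.path.join(target_dir, "pyvenv.cfg")


if __name__ == "__main__":
    print("Inside a virtual environment:", in_virtual_env())
    with tempfile.TemporaryDirectory() as tmp:
        cfg = create_env(os.path.join(tmp, "venv"))
        print("Created:", os.path.exists(cfg))
```

The environment here is built without pip to keep the example fast; for real work you would normally keep pip available, as the virtualenv command above does by default.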
Installing Python Libraries for Data Science
These are some libraries which we will use throughout our course.
You may install these packages using the pip tool. For example:
pip install jupyter
pip install pandas
pip install numpy
We are not installing every package now. We will install the remaining packages only when they are required. With that, we are done with the setup part.
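To confirm that the packages are visible to the active interpreter, a quick check like the following can help. It is a minimal sketch; check_packages is a hypothetical helper name, and it only tests whether a top-level module can be found, not whether the package works.

```python
import importlib.util


def check_packages(names):
    """Map each top-level module name to True if it can be found from here."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    results = check_packages(["jupyter", "pandas", "numpy"])
    for name, found in results.items():
        print("{:<10} {}".format(name, "installed" if found else "missing"))
```

Run it with the virtual environment activated; a package installed in a different environment will show up as missing.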
You may deactivate the virtual environment whenever it’s not required, using the command:
deactivate
The real part starts from the next post onwards. If you face any difficulties, please let us know in the comments.
Bonus Pip Commands
The command below lists all the libraries installed inside the virtual environment:
pip freeze
To store this list in a file, say requirements.txt, use the command below:
pip freeze > requirements.txt
When you bundle a project, it is recommended that you include the ‘requirements.txt‘ file along with it, so that anyone new can easily set up the environment the project needs to run. To install all the libraries listed in ‘requirements.txt‘ at once, use the command below:
pip install -r requirements.txt
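The file written by pip freeze is plain text with one name==version pin per line. The following sketch reads such text into a dictionary to show the format; parse_requirements and SAMPLE are hypothetical names, the versions in SAMPLE are illustrative only, and the parser deliberately ignores anything that is not an exact pin.

```python
def parse_requirements(text):
    """Return {package: version} for every 'name==version' line in text."""
    pinned = {}
    for raw_line in text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or "==" not in line:
            continue  # skip blanks and anything that is not an exact pin
        name, version = line.split("==", 1)
        pinned[name.strip().lower()] = version.strip()
    return pinned


# Illustrative pins only; your own requirements.txt will differ.
SAMPLE = """\
# packages for the course
numpy==1.19.2
pandas==1.1.3
"""

if __name__ == "__main__":
    for name, version in sorted(parse_requirements(SAMPLE).items()):
        print("{} -> {}".format(name, version))
```

pip itself accepts a richer syntax in requirements files (version ranges, URLs, options), so treat this only as a way to inspect simple files produced by pip freeze.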