Data Science Academy — NumPy with Python(Arrays)
As I mentioned in the introduction to the course, we will go through many Python data analysis libraries, and the first one is NumPy. Take your seat and enjoy.
NumPy is a linear algebra library for Python. It is popular because almost all of the libraries in the PyData ecosystem rely on NumPy as one of their main building blocks. Of course, we can and will use other libraries, but why not start with the most widespread one?
Prerequisites:
Installation is straightforward. In the previous lecture we installed Anaconda, which normally already includes NumPy. If NumPy is missing from your environment, let’s change that. Open your command prompt or terminal, depending on your operating system, and run one of these two commands (the first if you use Anaconda, the second otherwise):
- conda install numpy
- pip install numpy
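To confirm that the installation worked, a quick check like the one below can be run from Python. This is only a minimal sketch: the variable name installed_version is made up for illustration, and the version printed depends on your environment.

```python
# Import NumPy under its conventional alias.
import numpy as np

# np.__version__ holds the installed version as a string.
installed_version = np.__version__
print("NumPy version:", installed_version)
```

If the import fails with an error, NumPy is not installed in the environment you are running.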
NumPy Arrays
We will use NumPy arrays throughout the whole course, so please try to understand them and play around with them whenever you have a chance. I plan to create a section with exercises and solutions, but play around with them a little bit in the meantime. Do you promise? Thanks.
NumPy arrays come in two flavors:
- vectors: strictly 1-dimensional arrays
- matrices: 2-dimensional arrays (but be aware that a matrix can still have only one row or one column)
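The difference between a vector and a matrix with a single row or column can be sketched as follows. The variable names are made up for illustration; ndim and shape are standard attributes of NumPy arrays.

```python
import numpy as np

# A vector: strictly 1-dimensional.
vector = np.array([1, 2, 3])

# Matrices: 2-dimensional, even with a single row or a single column.
row_matrix = np.array([[1, 2, 3]])
column_matrix = np.array([[1], [2], [3]])

print(vector.ndim, vector.shape)                # 1 (3,)
print(row_matrix.ndim, row_matrix.shape)        # 2 (1, 3)
print(column_matrix.ndim, column_matrix.shape)  # 2 (3, 1)
```

Note the double square brackets: they are what makes an array 2-dimensional.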
Great, the moment many of you have been waiting for is here. Let’s open Jupyter Notebook and write some code.
Hit Launch on the Jupyter option.
Click New and choose the Python 3 option.
Let’s create your first list:
- my_first_list = [1,2,3]
That is nice, but how do we see the value of the variable my_first_list?
- my_first_list
Press CTRL + ENTER to run the cell.
Great, it looks good. But why did I talk about NumPy when we are not using it? You are right, so let’s use it right now.
- import numpy as np
I imported the NumPy library under the alias np. This way, all of NumPy's functions are available through np. Let’s check some useful ones.
- var_array = np.array(my_first_list)
I just assigned the resulting array to the variable var_array, so I can use it anytime I want. So far we have only used vectors and no matrices. Matrices are very similar, but the syntax is slightly different. Let’s see.
- var_metric = [[1,2,3],[4,5,6],[7,8,9]]
We just created a list of lists, but let’s use the NumPy library through np and convert var_metric to an array.
- np.array(var_metric)
We created a matrix with 3 rows and 3 columns.
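The steps so far can be put together in one cell to check what NumPy actually built. This is a small sketch: matrix_array is a name introduced here for illustration, while the other names come from the text above.

```python
import numpy as np

my_first_list = [1, 2, 3]
var_metric = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

var_array = np.array(my_first_list)   # 1-dimensional array (vector)
matrix_array = np.array(var_metric)   # 2-dimensional array (matrix)

print(type(var_array))      # <class 'numpy.ndarray'>
print(var_array.shape)      # (3,)
print(matrix_array.shape)   # (3, 3)
```

The shape attribute reports rows first and columns second.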
I have a question for you. How would you create an array with 100 numbers? You could write out all the numbers by hand, but there is a better way. We will use the arange function.
- np.arange(0,100)
You will get the numbers from 0 to 99. The start value is included but the stop value is not, just like Python's built-in range.
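A few variations on arange make the boundary behavior easier to see. The variable names below are made up for illustration. The optional third argument is the step between values.

```python
import numpy as np

first_hundred = np.arange(0, 100)    # 0, 1, ..., 99 (100 is excluded)
one_to_hundred = np.arange(1, 101)   # 1, 2, ..., 100
even_numbers = np.arange(0, 11, 2)   # third argument is the step

print(first_hundred[0], first_hundred[-1], len(first_hundred))  # 0 99 100
print(one_to_hundred[0], one_to_hundred[-1])                     # 1 100
print(even_numbers.tolist())                                     # [0, 2, 4, 6, 8, 10]
```

So if you really want the numbers from 1 to 100, the stop value has to be 101.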
A very useful shortcut is SHIFT + TAB, which shows the documentation for the function under the cursor.
I would like to show you one more function that can be confusing: linspace. It looks like arange, but instead of a step size it takes the number of samples. It returns num evenly spaced samples, calculated over the interval [start, stop], with both endpoints included by default.
- np.linspace(0,5,10)
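To make the comparison concrete, here is a minimal side-by-side sketch of linspace and arange. The variable names are chosen only for illustration.

```python
import numpy as np

# linspace: the third argument is the NUMBER of samples; the stop value is included.
samples = np.linspace(0, 5, 10)

# arange: the third argument is the STEP; the stop value is excluded.
steps = np.arange(0, 5, 1)

print(len(samples), samples[0], samples[-1])   # 10 0.0 5.0
print(steps.tolist())                          # [0, 1, 2, 3, 4]
```
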
Do you see the difference? Let’s look at another useful part of NumPy, the random module.
- np.random.rand(5)
This creates an array of 5 elements and populates it with random samples from a uniform distribution over [0, 1).
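The rand function also accepts several dimensions, which gives a matrix of random samples. The sketch below uses illustrative variable names, and the actual values will differ on every run.

```python
import numpy as np

random_vector = np.random.rand(5)      # 5 samples in a 1-dimensional array
random_matrix = np.random.rand(3, 4)   # 3 rows and 4 columns of samples

print(random_vector.shape)   # (5,)
print(random_matrix.shape)   # (3, 4)

# Every sample is at least 0 and strictly below 1.
print(random_vector.min() >= 0.0, random_vector.max() < 1.0)   # True True
```
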
Another interesting function is randint. It returns random integers from low (inclusive) to high (exclusive).
- np.random.randint(1,100000)
This call returned the value 4478 for me, but you will almost certainly get a different one, because the result is random.
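If you need more than one integer, or want the same result on every run, something like the following sketch can help. The variable names and the seed value 42 are arbitrary choices for illustration. Here np.random.seed fixes the starting point of the random number generator.

```python
import numpy as np

single_value = np.random.randint(1, 100000)           # one integer, 1 <= value < 100000
several_values = np.random.randint(1, 100, size=10)   # array of 10 integers, 1 <= value < 100

# Optional: setting the same seed before each draw reproduces the same value.
np.random.seed(42)
first_draw = np.random.randint(1, 100000)
np.random.seed(42)
second_draw = np.random.randint(1, 100000)

print(several_values.shape)        # (10,)
print(first_draw == second_draw)   # True
```

Without a seed, each run draws fresh values.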
I think I have covered the most basic functions, but do not expect this series of articles to stay this easy. We are at the beginning and need to focus on the basics. If you like the course, please clap and follow, and I look forward to seeing you in the next articles.