NumPy (pronounced NUM-py or sometimes NUM-pee) is a library for the Python programming language that adds support for large, multi-dimensional arrays and matrices, along with a large collection of high-level mathematical functions to operate on these arrays.
How to use Kaggle
Go to www.kaggle.com
Sign Up/Login to your account.
Click on the notebook from the menu on the left side.
Click on the New Notebook button to create a notebook.
NumPy
Importing NumPy and the sys library:
import numpy as np
import sys
NumPy over regular Python lists:
NumPy arrays are more compact than Python lists. For example, a nested Python list holding one million floats would take at least 20 MB or so, while a NumPy 3D array with single-precision floats in the cells would fit in 4 MB. Here is a small example.
import numpy as np
import sys
py_array = [1,2,3,4,5,6]
numpy_array = np.array([1,2,3,4,5,6])
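# Approximate list size: bytes per Python int object times the number of elements
sizeof_py_arr = sys.getsizeof(1) * len(py_array)
# Array data size: bytes per element times the number of elements
sizeof_numpy_arr = numpy_array.itemsize * numpy_array.size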
print(sizeof_py_arr)
print(sizeof_numpy_arr)
Output: 168 48
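The estimate above counts only the integer objects. A slightly fuller sketch, assuming CPython (where sys.getsizeof reports the size of a single object), also counts the list's own storage and checks the 4 MB figure for a single-precision 3D array. The 100 x 100 x 100 shape is an assumption chosen to give one million cells.

```python
import sys
import numpy as np

values = list(range(1000))
arr = np.array(values, dtype=np.int64)

# List cost: the list object (its pointer table) plus every int object it references
list_bytes = sys.getsizeof(values) + sum(sys.getsizeof(v) for v in values)
# Array cost: nbytes is itemsize * size, the raw data buffer only
array_bytes = arr.nbytes

print(list_bytes > array_bytes)  # True on CPython
print(array_bytes)               # 8000

# One million single-precision floats occupy 4,000,000 bytes (about 4 MB)
cube = np.zeros((100, 100, 100), dtype=np.float32)
print(cube.nbytes)               # 4000000
```

Exact list sizes vary between Python versions, so only the array figures are shown as fixed numbers.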
Reshape the NumPy array
import numpy as np
n_md_arry = np.array([[1,2,3,4,5],[6,7,8,9,10]])
np_modmd_arr = n_md_arry.reshape(5,2)
print(np_modmd_arr)
n_md_arry2 = np.array([1,2,3,4])
np_modmd_arr2 = n_md_arry2.reshape(2,2)
print(np_modmd_arr2)
Output:
[[ 1  2]
 [ 3  4]
 [ 5  6]
 [ 7  8]
 [ 9 10]]
[[1 2]
 [3 4]]
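Reshaping only works when the new shape holds exactly the same number of elements. The following sketch, using a made-up ten-element array, shows NumPy inferring one dimension with -1 and rejecting an incompatible shape.

```python
import numpy as np

arr = np.arange(1, 11)          # ten elements: 1..10

# -1 asks NumPy to infer that dimension from the total size
inferred = arr.reshape(2, -1)
print(inferred.shape)           # (2, 5)

# A shape whose product is not 10 is rejected
try:
    arr.reshape(3, 4)
    reshape_failed = False
except ValueError:
    reshape_failed = True
print(reshape_failed)           # True
```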
NumPy arange
NumPy arange() is one of the array creation routines based on numerical ranges. It creates an instance of ndarray with evenly spaced values and returns a reference to it.
numpy.arange([start, ]stop, [step, ]dtype=None) -> numpy.ndarray
Example:
import numpy as np
np_arr = np.arange(0,20).reshape(5,4)
print(np_arr)
f_arr = np_arr.ravel()
print(f_arr)
Output:
[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]
 [12 13 14 15]
 [16 17 18 19]]
[ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19]
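The example above passes only start and stop. As a small sketch of the remaining parameters in the signature, the following calls use a step, a negative step, and an explicit dtype; the values are chosen only for illustration.

```python
import numpy as np

evens = np.arange(2, 10, 2)                 # start, stop (excluded), step
print(evens)                                # [2 4 6 8]

countdown = np.arange(5, 0, -1)             # a negative step counts down
print(countdown)                            # [5 4 3 2 1]

as_float = np.arange(3, dtype=np.float32)   # dtype sets the element type
print(as_float.dtype)                       # float32
```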
Addition of two arrays
import numpy as np
in_arr1 = np.array([[1, 2, 3], [4, 5, 6]])
in_arr2 = np.array([[7, 8, 9], [10, 11, 12]])
# Element-wise sum of the two arrays
out_arr = np.add(in_arr1, in_arr2)
print(out_arr)
Output: [[ 8 10 12] [14 16 18]]
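The same addition can also be written with the + operator. The sketch below checks that both forms agree and, as an extra illustration not covered above, adds a 1-D array to each row through broadcasting.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])
b = np.array([[7, 8, 9], [10, 11, 12]])

# The + operator gives the same element-wise result as np.add
same = np.array_equal(a + b, np.add(a, b))
print(same)                     # True

# Broadcasting: the 1-D row is added to each row of the 2-D array
row = np.array([10, 20, 30])
shifted = a + row
print(shifted.tolist())         # [[11, 22, 33], [14, 25, 36]]
```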
Element-wise multiplication of two arrays
import numpy as np
in_arr1 = np.array([[1, 2, 3], [4, 5, 6]])
in_arr2 = np.array([[7, 8, 9], [10, 11, 12]])
# Element-wise product: both arrays must have compatible shapes
out_arr = np.multiply(in_arr1, in_arr2)
print(out_arr)
Output: [[ 7 16 27] [40 55 72]]
NumPy matrix multiplication
import numpy as np
in_arr1 = np.array([[1, 2, 3], [4, 5, 6]])
in_arr2 = np.array([[7, 8],[9, 10], [11, 12]])
# Matrix product of a (2, 3) array and a (3, 2) array gives a (2, 2) array
out_arr = in_arr1.dot(in_arr2)
print(out_arr)
Output: [[ 58 64] [139 154]]
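Matrix multiplication differs from element-wise multiplication in its shape rule: the number of columns on the left must equal the number of rows on the right. A minimal sketch using the @ operator, which for 2-D arrays gives the same result as dot():

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])          # shape (2, 3)
b = np.array([[7, 8], [9, 10], [11, 12]])     # shape (3, 2)

product = a @ b                                # same result as a.dot(b) for 2-D arrays
print(product.tolist())                        # [[58, 64], [139, 154]]

# Inner dimensions must match: (2, 3) @ (2, 3) is not defined
try:
    a @ a
    matmul_failed = False
except ValueError:
    matmul_failed = True
print(matmul_failed)                           # True
```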
Finding element positions in the array
import numpy as np
np_arr = np.array([1,2,0,4,5])
# Indices of the elements greater than 2
find = np.where(np_arr > 2)
print(find)
Output: (array([3, 4]),)
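The result is a tuple because np.where returns one index array per dimension. The following sketch, reusing the same array, unpacks the positions, selects the matching values with a boolean mask, and shows the three-argument form of np.where.

```python
import numpy as np

np_arr = np.array([1, 2, 0, 4, 5])

# np.where(condition) returns a tuple with one index array per dimension
positions = np.where(np_arr > 2)[0]
print(positions.tolist())             # [3, 4]

# A boolean mask selects the matching values rather than their positions
matches = np_arr[np_arr > 2]
print(matches.tolist())               # [4, 5]

# With three arguments, np.where picks from x where the condition holds, else from y
replaced = np.where(np_arr > 2, np_arr, -1)
print(replaced.tolist())              # [-1, -1, -1, 4, 5]
```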
Finding non-zero element positions
import numpy as np
np_arr = np.array([1,2,0,4,5])
# Indices of the elements that are not zero
find = np.nonzero(np_arr)
print(find)
Output: (array([0, 1, 3, 4]),)
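For arrays with more than one dimension, np.nonzero returns one index array per axis. A short sketch on a made-up 2 x 2 array, with np.argwhere and np.count_nonzero as related calls:

```python
import numpy as np

grid = np.array([[0, 1], [2, 0]])

# For a 2-D array the tuple holds row indices and column indices separately
rows, cols = np.nonzero(grid)
print(rows.tolist(), cols.tolist())     # [0, 1] [1, 0]

# np.argwhere groups the same positions as (row, column) pairs
pairs = np.argwhere(grid)
print(pairs.tolist())                   # [[0, 1], [1, 0]]

# np.count_nonzero counts them without returning positions
print(np.count_nonzero(grid))           # 2
```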
Pandas
Importing NumPy and Pandas:
import numpy as np
import pandas as pd
How to import a .csv file from Kaggle
Go to www.kaggle.com
Go to your Kaggle notebook.
Click on the >| button in the top right corner.
Click on the + Add data button.
Search for a dataset in the search bar.
Click on the Add button to add the dataset to your notebook.
Click on the .csv file to find the path of the CSV file.
![](https://static.wixstatic.com/media/a27d24_dbea2fbf0a2b4077b3f40fc4c6fe782b~mv2.png/v1/fill/w_99,h_43,al_c,q_85,usm_0.66_1.00_0.01,blur_2,enc_auto/a27d24_dbea2fbf0a2b4077b3f40fc4c6fe782b~mv2.png)
![](https://static.wixstatic.com/media/a27d24_5c18f256690545b19f805d8cfd299345~mv2.png/v1/fill/w_49,h_12,al_c,q_85,usm_0.66_1.00_0.01,blur_2,enc_auto/a27d24_5c18f256690545b19f805d8cfd299345~mv2.png)
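The path can also be found from code instead of the side panel. The sketch below walks an input directory and lists every file in it. The helper name list_input_files is hypothetical, and the default root /kaggle/input is an assumption about where a Kaggle notebook mounts added datasets; outside Kaggle the function simply returns an empty list.

```python
import os

def list_input_files(root="/kaggle/input"):
    """Return the paths of all files found under root (empty if root is missing)."""
    paths = []
    for dirname, _, filenames in os.walk(root):
        for filename in filenames:
            paths.append(os.path.join(dirname, filename))
    return sorted(paths)

# Print every file path so the CSV path can be copied into read_csv
for path in list_input_files():
    print(path)
```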
Import the dataset using the read_csv function
import numpy as np
import pandas as pd
df = pd.read_csv('../input/pima-indians-diabetes-database/diabetes.csv')
df
Output:
![](https://static.wixstatic.com/media/a27d24_bbb35820697948ecab4ca0ca43dad542~mv2.png/v1/fill/w_48,h_21,al_c,q_85,usm_0.66_1.00_0.01,blur_2,enc_auto/a27d24_bbb35820697948ecab4ca0ca43dad542~mv2.png)
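To try read_csv without the Kaggle file, the sketch below reads a small in-memory CSV instead. The column names and values are made up for illustration and are not taken from the diabetes dataset.

```python
import io
import pandas as pd

# Hypothetical stand-in for a CSV file; these columns and values are made up
csv_text = "age,score\n31,0.5\n45,0.8\n28,0.3\n"

# read_csv accepts a file path or any file-like object
df = pd.read_csv(io.StringIO(csv_text))

print(df.shape)             # (3, 2)
print(list(df.columns))     # ['age', 'score']
print(df["age"].max())      # 45
```

The first line of the text is used as the header row, which is the default behaviour of read_csv.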
Fetching the first 10 rows
df.head(10)
Output:
![](https://static.wixstatic.com/media/a27d24_2d8ecf3646f74b6c9cbe5180af32f7c7~mv2.png/v1/fill/w_47,h_18,al_c,q_85,usm_0.66_1.00_0.01,blur_2,enc_auto/a27d24_2d8ecf3646f74b6c9cbe5180af32f7c7~mv2.png)
Fetching the last 10 rows
df.tail(10)
Output:
![](https://static.wixstatic.com/media/a27d24_cd09b38e92a94466a05dcf2a9594cca2~mv2.png/v1/fill/w_48,h_18,al_c,q_85,usm_0.66_1.00_0.01,blur_2,enc_auto/a27d24_cd09b38e92a94466a05dcf2a9594cca2~mv2.png)
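Both head and tail take the number of rows as an optional argument. A small sketch on a made-up 20-row frame shows the default of 5 rows and the effect of a negative argument.

```python
import numpy as np
import pandas as pd

# A made-up 20-row frame, only for illustration
df = pd.DataFrame({"value": np.arange(20)})

print(len(df.head()))                   # 5 (default n)
print(df.head(3)["value"].tolist())     # [0, 1, 2]
print(df.tail(3)["value"].tolist())     # [17, 18, 19]

# A negative n excludes rows instead: all rows except the last 18
print(df.head(-18)["value"].tolist())   # [0, 1]
```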
Conclusion
This is an overview of Python's NumPy and Pandas libraries. In this article, we learned the basics of both libraries with the help of a real-world data set. We also explored how to perform various operations with NumPy and Pandas, which are commonly used in many data science applications.