NumPy (pronounced NUM-py or sometimes NUM-pee) is a library for the Python programming language that adds support for large, multi-dimensional arrays and matrices, along with a large collection of high-level mathematical functions to operate on these arrays.
How to use Kaggle
Go to www.kaggle.com
Sign up or log in to your account.
Click on the notebook from the menu on the left side.
Click on the New Notebook button to create a notebook.
NumPy
Importing NumPy and system library:
import numpy as np
import sys
NumPy over regular Python lists:
NumPy's arrays are more compact than Python lists. For roughly one million values, a nested list of lists in Python would take at least 20 MB or so, while a NumPy 3D array with single-precision floats in the cells would fit in 4 MB. Here is a small example.
import numpy as np
import sys
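py_array = [1, 2, 3, 4, 5, 6]
numpy_array = np.array([1, 2, 3, 4, 5, 6])
# Approximate list size: bytes per small int object times the number of elements
sizeof_py_arr = sys.getsizeof(1) * len(py_array)
# Array data size: bytes per element times the number of elements
sizeof_numpy_arr = numpy_array.itemsize * numpy_array.size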
print(sizeof_py_arr)
print(sizeof_numpy_arr)
Output: 168 48
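The estimate above counts only the integer objects and leaves out the list container itself. As a rough sketch, assuming a standard CPython interpreter, the list's own overhead can be added with sys.getsizeof, and the array's data size can be read from its nbytes attribute. Exact byte counts depend on the platform and Python version, so no expected output is shown; the variable names are illustrative.

```python
import sys
import numpy as np

py_list = [1, 2, 3, 4, 5, 6]
np_array = np.array(py_list)

# Size of the list container (its pointers) plus the int objects it refers to.
list_bytes = sys.getsizeof(py_list) + sum(sys.getsizeof(x) for x in py_list)

# nbytes equals itemsize * size: the bytes used by the array's data buffer only.
array_data_bytes = np_array.nbytes

print(list_bytes, array_data_bytes)
```

The gap grows with the number of elements, because every list entry carries a pointer plus a full Python object, while the array stores raw fixed-width values.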
Reshape the NumPy array
import numpy as np
n_md_arry = np.array([[1,2,3,4,5],[6,7,8,9,10]])
np_modmd_arr = n_md_arry.reshape(5,2)
print(np_modmd_arr)
n_md_arry2 = np.array([1,2,3,4])
np_modmd_arr2 = n_md_arry2.reshape(2,2)
print(np_modmd_arr2)
Output:
[[ 1  2]
 [ 3  4]
 [ 5  6]
 [ 7  8]
 [ 9 10]]
[[1 2]
 [3 4]]
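Reshaping only works when the new shape holds exactly the same number of elements as the original array. The following sketch (variable names are illustrative) shows two related behaviours: passing -1 so that NumPy infers one dimension, and the ValueError raised when the sizes do not match.

```python
import numpy as np

arr = np.arange(1, 11)          # 10 elements: 1..10

# -1 asks NumPy to infer that dimension from the array's size.
two_cols = arr.reshape(-1, 2)
print(two_cols.shape)           # (5, 2)

# A shape whose product differs from the element count raises ValueError.
try:
    arr.reshape(3, 4)           # 3 * 4 = 12, but the array has 10 elements
    reshape_failed = False
except ValueError:
    reshape_failed = True
print(reshape_failed)           # True
```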
NumPy arange
NumPy arange() is one of the array creation routines based on numerical ranges. It creates an instance of ndarray with evenly spaced values and returns the reference to it.
numpy.arange([start, ]stop, [step, ]dtype=None) -> numpy.ndarray
Example (ravel() flattens the reshaped array back into one dimension):
import numpy as np
np_arr = np.arange(0,20).reshape(5,4)
print(np_arr)
f_arr = np_arr.ravel()
print(f_arr)
Output:
[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]
 [12 13 14 15]
 [16 17 18 19]]
[ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19]
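The example above uses only the start and stop arguments. As a small sketch of the remaining parameters in the signature, the following shows a custom step, a fractional step, and an explicit dtype. Note that the stop value itself is never included.

```python
import numpy as np

evens = np.arange(0, 10, 2)                 # start=0, stop=10 (excluded), step=2
halves = np.arange(0, 2, 0.5)               # a float step gives a float array
as_float = np.arange(3, dtype=np.float64)   # only stop given, dtype forced to float64

print(evens)     # [0 2 4 6 8]
print(halves)    # [0.  0.5 1.  1.5]
print(as_float)  # [0. 1. 2.]
```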
Addition of two arrays
import numpy as np
in_arr1 = np.array([[1, 2, 3], [4, 5, 6]])
in_arr2 = np.array([[7, 8, 9], [10, 11, 12]])
out_arr = np.add(in_arr1, in_arr2)
print(out_arr)
Output:
[[ 8 10 12]
 [14 16 18]]
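For arrays, the + operator performs the same element-wise addition as np.add. The sketch below (variable names are illustrative) checks that equivalence and also shows a scalar being broadcast across every element.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])
b = np.array([[7, 8, 9], [10, 11, 12]])

same_as_add = a + b          # operator form of np.add(a, b)
plus_scalar = a + 10         # the scalar is broadcast to every element

print(np.array_equal(same_as_add, np.add(a, b)))  # True
print(plus_scalar.tolist())  # [[11, 12, 13], [14, 15, 16]]
```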
Element-wise multiplication of two arrays
import numpy as np
in_arr1 = np.array([[1, 2, 3], [4, 5, 6]])
in_arr2 = np.array([[7, 8, 9], [10, 11, 12]])
out_arr = np.multiply(in_arr1, in_arr2)
print(out_arr)
Output:
[[ 7 16 27]
 [40 55 72]]
NumPy matrix multiplication
import numpy as np
in_arr1 = np.array([[1, 2, 3], [4, 5, 6]])
in_arr2 = np.array([[7, 8],[9, 10], [11, 12]])
out_arr = in_arr1.dot(in_arr2)
print(out_arr)
Output:
[[ 58  64]
 [139 154]]
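Matrix multiplication follows different shape rules from element-wise multiplication: the number of columns in the first array must equal the number of rows in the second. The sketch below uses the @ operator, which gives the same result as dot() for 2-D arrays, and then shows that np.multiply rejects these two shapes.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])        # shape (2, 3)
b = np.array([[7, 8], [9, 10], [11, 12]])   # shape (3, 2)

product = a @ b              # matrix product, same as a.dot(b) for 2-D arrays
print(product.shape)         # (2, 2)
print(product.tolist())      # [[58, 64], [139, 154]]

# Element-wise multiplication needs compatible shapes; (2, 3) and (3, 2) are not.
try:
    np.multiply(a, b)
    elementwise_ok = True
except ValueError:
    elementwise_ok = False
print(elementwise_ok)        # False
```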
Finding the positions of elements that match a condition
import numpy as np
np_arr = np.array([1,2,0,4,5])
find = np.where(np_arr > 2)
print(find)
Output: (array([3, 4]),)
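np.where with a single condition returns a tuple holding one index array per dimension, which is why the output above is wrapped in parentheses. The following sketch shows how to unpack the indices, how to get the matching values with a boolean mask, and the three-argument form of np.where, which chooses between two values per element. The -1 fill value is an arbitrary choice for illustration.

```python
import numpy as np

np_arr = np.array([1, 2, 0, 4, 5])

positions = np.where(np_arr > 2)[0]      # first (and only) index array
values = np_arr[np_arr > 2]              # boolean mask selects the values
# Three-argument form: keep the value where the condition holds, else use -1.
replaced = np.where(np_arr > 2, np_arr, -1)

print(positions.tolist())  # [3, 4]
print(values.tolist())     # [4, 5]
print(replaced.tolist())   # [-1, -1, -1, 4, 5]
```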
Finding the positions of non-zero elements
import numpy as np
np_arr = np.array([1,2,0,4,5])
find = np.nonzero(np_arr)
print(find)
Output: (array([0, 1, 3, 4]),)
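For a multi-dimensional array, np.nonzero returns one index array per dimension. A small sketch with a made-up 2-D array shows how the row and column indices pair up and how they can be used to pull out the non-zero values.

```python
import numpy as np

grid = np.array([[0, 3, 0],
                 [5, 0, 7]])

rows, cols = np.nonzero(grid)            # one index array per dimension
print(rows.tolist())                     # [0, 1, 1]
print(cols.tolist())                     # [1, 0, 2]
print(grid[rows, cols].tolist())         # [3, 5, 7]
print(int(np.count_nonzero(grid)))       # 3
```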
Pandas
Importing NumPy and pandas:
import numpy as np
import pandas as pd
How to import a .csv file from Kaggle
Go to www.kaggle.com
Go to your Kaggle notebook.
Click on >| button on the top right corner.
Click on + Add data button
Search for a dataset in the search bar.
Click on the add button to add the dataset to your notebook.
Click on .csv file and you will find the path of CSV file.
Import the dataset using the read_csv function
import numpy as np
import pandas as pd
df = pd.read_csv('../input/pima-indians-diabetes-database/diabetes.csv')
df
Output: (the full DataFrame, rendered as a table in the notebook)
Fetching top 10 rows
df.head(10)
Output: (the first 10 rows, rendered as a table in the notebook)
Fetching last 10 rows
df.tail(10)
Output: (the last 10 rows, rendered as a table in the notebook)
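To try read_csv, head, and tail without attaching a Kaggle dataset, a small in-memory CSV can stand in for the file. This is only a sketch: the CSV text and its column names (id, score) are made up for illustration and have nothing to do with the diabetes dataset.

```python
import io
import pandas as pd

# Made-up CSV text standing in for a real file; the column names are hypothetical.
csv_text = "id,score\n" + "\n".join(f"{i},{i * 10}" for i in range(1, 13))

sample_df = pd.read_csv(io.StringIO(csv_text))   # read_csv accepts file-like objects

print(sample_df.shape)                   # (12, 2)
print(sample_df.head(3)["id"].tolist())  # [1, 2, 3]
print(sample_df.tail(3)["id"].tolist())  # [10, 11, 12]
print(len(sample_df.head()))             # 5, the default number of rows
```

Calling head() or tail() without an argument returns five rows, so the 10 in the calls above is what extends the preview to ten.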
Conclusion
This article gave an overview of Python's NumPy and pandas libraries. We worked through both with the help of a real data set on Kaggle and explored how to perform various operations with NumPy and pandas, two libraries that are commonly used in data science applications.