Getting started with NumPy and pandas using Kaggle

NumPy (pronounced NUM-py, or sometimes NUM-pee) is a library for the Python programming language that adds support for large, multi-dimensional arrays and matrices, along with a large collection of high-level mathematical functions to operate on these arrays.


How to use Kaggle

  1. Go to www.kaggle.com

  2. Sign up or log in to your account.

  3. Click on the notebook option in the menu on the left side.

  4. Click on the New Notebook button to create a notebook.

NumPy

Importing NumPy and the sys library:

import numpy as np
import sys

NumPy over regular Python lists:

NumPy's arrays are more compact than Python lists. For example, a nested Python list holding one million floats would take at least 20 MB or so, while a NumPy 3D array of the same size with single-precision floats in the cells would fit in 4 MB. Here is a small example.

import numpy as np
import sys

py_array = [1,2,3,4,5,6]
numpy_array = np.array([1,2,3,4,5,6])

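# Approximate list size: bytes per int object times the number of elements
# (the list's own pointer storage is not counted)
sizeof_py_arr = sys.getsizeof(1) * len(py_array)
# Array size: bytes per element times the number of elements
sizeof_numpy_arr = numpy_array.itemsize * numpy_array.size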

print(sizeof_py_arr)
print(sizeof_numpy_arr)

Output:
168
48
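
The comparison above counts only the integer objects. As a rough sketch of a fuller measurement, the snippet below also counts the list's own storage and uses the array's nbytes attribute. The element count of 1,000 and the fixed int64 dtype are assumptions chosen for illustration, and the exact list size depends on the Python version.

```python
import sys
import numpy as np

# 1,000 integers stored two ways (the size is an arbitrary choice for illustration).
values = list(range(1000))
array = np.array(values, dtype=np.int64)  # dtype fixed so the array size is predictable

# A list stores pointers to separate int objects, so count both parts.
list_bytes = sys.getsizeof(values) + sum(sys.getsizeof(v) for v in values)

# nbytes is the size of the array's data buffer: itemsize * size.
array_bytes = array.nbytes

print(list_bytes, array_bytes)
```

The array needs 8 bytes per element here, while the list pays for a pointer plus a full Python int object per element.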


Reshape the NumPy array

import numpy as np

n_md_arry = np.array([[1,2,3,4,5],[6,7,8,9,10]])
np_modmd_arr = n_md_arry.reshape(5,2)
print(np_modmd_arr)

n_md_arry2 = np.array([1,2,3,4])
np_modmd_arr2 = n_md_arry2.reshape(2,2)

print(np_modmd_arr2)

Output:
[[ 1  2]
 [ 3  4]
 [ 5  6]
 [ 7  8]
 [ 9 10]]
[[1 2]
 [3 4]]
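
A reshape only works when the new shape holds exactly the same number of elements. The sketch below, using an illustrative ten-element array, shows two related behaviours: passing -1 lets NumPy infer one dimension, and a mismatched shape raises a ValueError.

```python
import numpy as np

arr = np.arange(1, 11)          # ten elements: 1..10

# -1 asks NumPy to infer that dimension from the total size.
two_rows = arr.reshape(2, -1)   # shape becomes (2, 5)
print(two_rows.shape)

# A shape whose product differs from arr.size is rejected.
try:
    arr.reshape(3, 4)           # 3 * 4 = 12, but arr has 10 elements
    reshape_failed = False
except ValueError:
    reshape_failed = True
print(reshape_failed)
```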

NumPy arange

NumPy's arange() is one of the array creation routines based on numerical ranges. It creates an ndarray of evenly spaced values and returns a reference to it.


numpy.arange([start, ]stop, [step, ]dtype=None) -> numpy.ndarray


Example:

import numpy as np

np_arr = np.arange(0,20).reshape(5,4)
print(np_arr)

f_arr = np_arr.ravel()
print(f_arr)
 

Output:
[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]
 [12 13 14 15]
 [16 17 18 19]]
[ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19]
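
The optional start, step and dtype arguments of arange() can be sketched as follows. The specific values are chosen only for illustration.

```python
import numpy as np

only_stop = np.arange(5)                    # start defaults to 0, step to 1
with_step = np.arange(2, 10, 3)             # 2, 5, 8 (stop itself is excluded)
as_float = np.arange(3, dtype=np.float64)   # same range, stored as floats

print(only_stop.tolist())   # [0, 1, 2, 3, 4]
print(with_step.tolist())   # [2, 5, 8]
print(as_float.dtype)       # float64
```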

Addition of two arrays

import numpy as np

in_arr1 = np.array([[1, 2, 3], [4, 5, 6]])
in_arr2 = np.array([[7, 8, 9], [10, 11, 12]])

out_arr = np.add(in_arr1, in_arr2)
print(out_arr)

Output:
[[ 8 10 12]
 [14 16 18]]
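
As a small sketch of two related forms: the + operator gives the same element-wise result as np.add, and adding a scalar applies it to every element (broadcasting). The short variable names are illustrative.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])
b = np.array([[7, 8, 9], [10, 11, 12]])

same_as_add = a + b      # element-wise, like np.add(a, b)
plus_scalar = a + 10     # the scalar is broadcast to every element

print(np.array_equal(same_as_add, np.add(a, b)))  # True
print(plus_scalar.tolist())                        # [[11, 12, 13], [14, 15, 16]]
```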


Element-wise multiplication of two arrays

import numpy as np

in_arr1 = np.array([[1, 2, 3], [4, 5, 6]])
in_arr2 = np.array([[7, 8, 9], [10, 11, 12]])

out_arr = np.multiply(in_arr1, in_arr2)
print(out_arr)

Output:
[[ 7 16 27]
 [40 55 72]]


Matrix multiplication in NumPy

import numpy as np

in_arr1 = np.array([[1, 2, 3], [4, 5, 6]])
in_arr2 = np.array([[7, 8], [9, 10], [11, 12]])

out_arr = in_arr1.dot(in_arr2)
print(out_arr)

Output:
[[ 58  64]
 [139 154]]
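
For 2D arrays, the @ operator computes the same matrix product as dot(). The sketch below repeats the arrays from the example with their shapes annotated: the inner dimensions (3 and 3) must match, and the result takes the outer ones (2 and 2).

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])         # shape (2, 3)
b = np.array([[7, 8], [9, 10], [11, 12]])    # shape (3, 2)

product = a @ b                               # matrix product, shape (2, 2)

print(product.shape)                          # (2, 2)
print(product.tolist())                       # [[58, 64], [139, 154]]
print(np.array_equal(product, a.dot(b)))      # True
```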


Finding element positions in an array

import numpy as np

np_arr = np.array([1,2,0,4,5])
find = np.where(np_arr > 2)

print(find)

Output: (array([3, 4]),)
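
The tuple returned by np.where can be used directly as an index, and np.where also has a three-argument form that chooses between two values per element. A brief sketch on the same array:

```python
import numpy as np

arr = np.array([1, 2, 0, 4, 5])

indices = np.where(arr > 2)            # tuple holding one index array
selected = arr[indices]                # values at those positions
replaced = np.where(arr > 2, arr, 0)   # keep values > 2, use 0 elsewhere

print(selected.tolist())   # [4, 5]
print(replaced.tolist())   # [0, 0, 0, 4, 5]
```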

Finding the positions of non-zero elements

import numpy as np

np_arr = np.array([1,2,0,4,5])
find = np.nonzero(np_arr)

print(find)

Output: (array([0, 1, 3, 4]),)
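
The result is a tuple because np.nonzero returns one index array per dimension. A small sketch with an illustrative 2D array shows this, along with np.argwhere, which reports the same positions as (row, column) pairs.

```python
import numpy as np

grid = np.array([[0, 1], [2, 0]])

rows, cols = np.nonzero(grid)     # one index array per dimension
pairs = np.argwhere(grid)         # the same positions as (row, col) pairs

print(rows.tolist(), cols.tolist())  # [0, 1] [1, 0]
print(pairs.tolist())                # [[0, 1], [1, 0]]
```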

 

pandas

Importing NumPy and pandas:

import numpy as np
import pandas as pd

How to import a .csv file from Kaggle

  1. Go to www.kaggle.com

  2. Go to your Kaggle notebook.

  3. Click on the >| button in the top right corner.

  4. Click on the + Add data button.

  5. Search for a dataset in the search bar.

  6. Click on the Add button to add the dataset to your notebook.

  7. Click on the .csv file to see its path.
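
The file paths can also be listed from code. The sketch below assumes that added datasets are mounted under /kaggle/input inside a Kaggle notebook; list_input_files is a hypothetical helper name, not part of any library.

```python
import os

def list_input_files(root="/kaggle/input"):
    """Return the sorted paths of all files found under root."""
    paths = []
    for dirname, _, filenames in os.walk(root):
        for filename in filenames:
            paths.append(os.path.join(dirname, filename))
    return sorted(paths)

# Prints nothing if the directory does not exist (for example, outside Kaggle).
for path in list_input_files():
    print(path)
```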


Import the dataset using the read_csv function

import numpy as np
import pandas as pd

df = pd.read_csv('../input/pima-indians-diabetes-database/diabetes.csv')
df

Output: the notebook renders the DataFrame as a table.
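
To try read_csv without a Kaggle dataset, here is a minimal sketch that reads CSV text from memory. The column names and values are made up for illustration and are not taken from the diabetes dataset.

```python
import io
import pandas as pd

# Hypothetical sample data standing in for a real CSV file.
csv_text = """name,age,score
Ann,31,88.5
Bob,45,92.0
Cy,27,79.5
"""

# read_csv accepts a path or any file-like object.
sample_df = pd.read_csv(io.StringIO(csv_text))

print(sample_df.shape)           # (3, 3)
print(list(sample_df.columns))   # ['name', 'age', 'score']
print(sample_df.dtypes)          # column types inferred from the data
```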












Fetching the top 10 rows

df.head(10)

Output: a table of the first 10 rows.










Fetching the last 10 rows

df.tail(10)

Output: a table of the last 10 rows.
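
The behaviour of head() and tail() can be checked on a small frame built in code. This is a sketch with a hypothetical 20-row DataFrame; both methods default to 5 rows when no count is given.

```python
import pandas as pd

# Hypothetical 20-row frame, built in code so the sketch runs anywhere.
frame = pd.DataFrame({"n": range(20)})

first_default = frame.head()    # n defaults to 5
last_three = frame.tail(3)

print(first_default["n"].tolist())  # [0, 1, 2, 3, 4]
print(last_three["n"].tolist())     # [17, 18, 19]
print(len(frame.head(50)))          # 20: asking for more rows than exist returns them all
```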










Conclusion

This is an overview of Python's NumPy and pandas libraries. In this article, we learned to use NumPy and pandas with the help of a real data set. We also explored how to perform various operations with NumPy and pandas, which are commonly used in many data science applications.
