Getting started with NumPy and Pandas using Kaggle

NumPy (pronounced NUM-py, or sometimes NUM-pee) is a library for the Python programming language, adding support for large, multi-dimensional arrays and matrices, along with a large collection of high-level mathematical functions to operate on these arrays.


How to use Kaggle

  1. Go to www.kaggle.com

  2. Sign up or log in to your account.

  3. Click on the notebook from the menu on the left side.

  4. Click on the New Notebook button to create a notebook.

NumPy

Importing NumPy and the sys library:

import numpy as np
import sys

NumPy over regular Python lists:

NumPy's arrays are more compact than Python lists. For example, a Python list of lists holding about a million numbers would take at least 20 MB or so, while a NumPy 3D array of the same size with single-precision floats in the cells (4 bytes per value) would fit in 4 MB. Here is a small example.

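import numpy as np
import sys

py_array = [1,2,3,4,5,6]
numpy_array = np.array([1,2,3,4,5,6])

# Bytes per Python int object times the element count
# (the list object's own overhead is not included)
sizeof_py_arr = sys.getsizeof(1) * len(py_array)
# Bytes per array element times the element count
sizeof_numpy_arr = numpy_array.itemsize * numpy_array.size

print(sizeof_py_arr)
print(sizeof_numpy_arr)

Output (on a 64-bit Linux environment such as a Kaggle notebook):

168
48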
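
The comparison above counts only the integer objects themselves. As a rough sketch, the list's own container overhead can be included with sys.getsizeof on the list, and the array's data size can be read from its nbytes attribute. The variable names below are illustrative, and the dtype is fixed to int64 as an assumption so that the array size does not depend on the platform default. The list figure varies between Python versions, so no exact value is given for it.

```python
import sys
import numpy as np

values = [1, 2, 3, 4, 5, 6]
py_list = list(values)
# Fix the dtype explicitly so the element size does not depend on the platform default
np_arr = np.array(values, dtype=np.int64)

# Container overhead of the list (its array of pointers) plus each int object it refers to
list_bytes = sys.getsizeof(py_list) + sum(sys.getsizeof(v) for v in py_list)

# nbytes equals itemsize * size: the size of the raw data buffer only
array_bytes = np_arr.nbytes

print("list :", list_bytes, "bytes")
print("array:", array_bytes, "bytes")  # 6 elements * 8 bytes = 48
```

Both figures are estimates: CPython shares small integer objects between lists, and the ndarray object carries a small fixed header on top of its data buffer.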


Reshape the NumPy array

import numpy as np

n_md_arry = np.array([[1,2,3,4,5],[6,7,8,9,10]])
np_modmd_arr = n_md_arry.reshape(5,2)
print(np_modmd_arr)

n_md_arry2 = np.array([1,2,3,4])
np_modmd_arr2 = n_md_arry2.reshape(2,2)

print(np_modmd_arr2)

Output:

[[ 1  2]
 [ 3  4]
 [ 5  6]
 [ 7  8]
 [ 9 10]]
[[1 2]
 [3 4]]
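
Two details of reshape are worth a quick sketch: one dimension can be given as -1 so that NumPy infers it, and a target shape must hold exactly the same number of elements as the original array. The array below is illustrative and not taken from the examples above.

```python
import numpy as np

arr = np.arange(1, 11)          # 10 elements: 1..10

# -1 asks NumPy to infer that dimension from the total size (10 / 5 = 2)
inferred = arr.reshape(5, -1)
print(inferred.shape)           # (5, 2)

# reshape returns a new array object; the original keeps its shape
print(arr.shape)                # (10,)

# A shape whose product differs from the element count is rejected
try:
    arr.reshape(3, 3)
    reshape_rejected = False
except ValueError as exc:
    reshape_rejected = True
    print("reshape failed:", exc)
```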

NumPy arange

NumPy's arange() is one of the array creation routines based on numerical ranges. It creates an instance of ndarray with evenly spaced values and returns a reference to it.


numpy.arange([start, ]stop, [step, ], dtype=None) -> numpy.ndarray


Example:

import numpy as np

# 20 values (0 to 19) arranged as 5 rows of 4 columns
np_arr = np.arange(0,20).reshape(5,4)
print(np_arr)

# ravel() flattens the 2D array back into a 1D array
f_arr = np_arr.ravel()
print(f_arr)

Output:

[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]
 [12 13 14 15]
 [16 17 18 19]]
[ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19]
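
The example above only passes start and stop. A short sketch of the remaining arguments in the signature (step and dtype) might look like this; the variable names are illustrative.

```python
import numpy as np

# Only stop: start defaults to 0 and step to 1; stop itself is excluded
first_five = np.arange(5)
print(first_five.tolist())       # [0, 1, 2, 3, 4]

# start, stop, step
by_fives = np.arange(0, 20, 5)
print(by_fives.tolist())         # [0, 5, 10, 15]

# A negative step counts down
countdown = np.arange(10, 0, -2)
print(countdown.tolist())        # [10, 8, 6, 4, 2]

# dtype controls the element type of the result
floats = np.arange(0, 3, dtype=np.float64)
print(floats.dtype)              # float64
```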

Addition of two arrays

import numpy as np

in_arr1 = np.array([[1, 2, 3], [4, 5, 6]])
in_arr2 = np.array([[7, 8, 9], [10, 11, 12]])

out_arr = np.add(in_arr1, in_arr2)
print(out_arr)

Output:

[[ 8 10 12]
 [14 16 18]]
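
As a small sketch of two related points: the + operator on arrays gives the same element-wise result as np.add, and a 1-D array can be added to each row of a 2-D array through broadcasting. The names a, b and row are illustrative.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])
b = np.array([[7, 8, 9], [10, 11, 12]])

# The + operator on arrays performs the same element-wise addition as np.add
same = np.array_equal(a + b, np.add(a, b))
print(same)                      # True

# Broadcasting: a 1-D row of length 3 is added to every row of the 2x3 array
row = np.array([10, 20, 30])
shifted = a + row
print(shifted.tolist())          # [[11, 22, 33], [14, 25, 36]]
```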


Multiplication of two arrays

import numpy as np

in_arr1 = np.array([[1, 2, 3], [4, 5, 6]])
in_arr2 = np.array([[7, 8, 9], [10, 11, 12]])

out_arr = np.multiply(in_arr1, in_arr2)
print(out_arr)

Output:

[[ 7 16 27]
 [40 55 72]]
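
np.multiply works element by element, so the two arrays need matching (or broadcastable) shapes. The following sketch, with illustrative names, shows the equivalent * operator, scaling by a scalar, and what happens when the shapes do not line up.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])        # shape (2, 3)
b = np.array([[7, 8, 9], [10, 11, 12]])     # shape (2, 3)

# The * operator is element-wise, like np.multiply
product = a * b
print(np.array_equal(product, np.multiply(a, b)))   # True

# A scalar is applied to every element
doubled = a * 2
print(doubled.tolist())                     # [[2, 4, 6], [8, 10, 12]]

# Shapes (2, 3) and (3, 2) cannot be multiplied element-wise
c = np.array([[7, 8], [9, 10], [11, 12]])   # shape (3, 2)
try:
    a * c
    mismatch_rejected = False
except ValueError:
    mismatch_rejected = True
print(mismatch_rejected)                    # True
```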


NumPy matrix multiplication

import numpy as np

in_arr1 = np.array([[1, 2, 3], [4, 5, 6]])
in_arr2 = np.array([[7, 8], [9, 10], [11, 12]])

out_arr = in_arr1.dot(in_arr2)
print(out_arr)

Output:

[[ 58  64]
 [139 154]]
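
For 2-D arrays like these, the @ operator and np.matmul can be used in place of .dot. The sketch below, with illustrative names, checks that all three agree and shows how the result shape follows from the operand shapes.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])         # shape (2, 3)
b = np.array([[7, 8], [9, 10], [11, 12]])    # shape (3, 2)

# For 2-D arrays, the @ operator and np.matmul give the same result as .dot
via_dot = a.dot(b)
via_operator = a @ b
via_matmul = np.matmul(a, b)
print(np.array_equal(via_dot, via_operator))   # True
print(np.array_equal(via_dot, via_matmul))     # True

# Inner dimensions must agree: (2, 3) @ (3, 2) gives shape (2, 2)
print(via_operator.shape)                      # (2, 2)

# Reversing the operands is also valid here, but yields a (3, 3) result
reversed_shape = (b @ a).shape
print(reversed_shape)                          # (3, 3)
```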


Finding an element position in the array

import numpy as np

np_arr = np.array([1,2,0,4,5])
# With only a condition, np.where returns the indices where the condition is True
find = np.where(np_arr > 2)

print(find)

Output: (array([3, 4]),)
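
The tuple returned by np.where can be used directly as an index to fetch the matching values. np.where also has a three-argument form that chooses between two values element by element. A minimal sketch, with illustrative variable names:

```python
import numpy as np

np_arr = np.array([1, 2, 0, 4, 5])

# With only a condition, np.where returns a tuple of index arrays (one per dimension)
positions = np.where(np_arr > 2)

# The tuple can be used directly as an index to fetch the matching values
matches = np_arr[positions]
print(matches.tolist())           # [4, 5]

# With three arguments, np.where picks element-wise: x where the condition holds, else y
clipped = np.where(np_arr > 2, np_arr, 0)
print(clipped.tolist())           # [0, 0, 0, 4, 5]
```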

Finding the positions of non-zero elements

import numpy as np

np_arr = np.array([1,2,0,4,5])
# np.nonzero returns the indices of the elements that are not zero
find = np.nonzero(np_arr)

print(find)

Output: (array([0, 1, 3, 4]),)
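
A few related calls are sketched below: counting the non-zero entries, fetching their values, and locating them in a 2-D array. The array named grid is a made-up example.

```python
import numpy as np

np_arr = np.array([1, 2, 0, 4, 5])

# Number of non-zero entries
nonzero_count = np.count_nonzero(np_arr)
print(nonzero_count)                          # 4

# Values at the non-zero positions
nonzero_values = np_arr[np.nonzero(np_arr)]
print(nonzero_values.tolist())                # [1, 2, 4, 5]

# For a 2-D array, np.nonzero returns one index array per axis (rows, then columns)
grid = np.array([[0, 7], [8, 0]])
rows, cols = np.nonzero(grid)
print(rows.tolist(), cols.tolist())           # [0, 1] [1, 0]

# np.argwhere returns the same positions as (row, column) pairs
pairs = np.argwhere(grid)
print(pairs.tolist())                         # [[0, 1], [1, 0]]
```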

Pandas

Importing NumPy and Pandas:

import numpy as np
import pandas as pd

How to import a .csv file from Kaggle

  1. Go to www.kaggle.com

  2. Go to your Kaggle notebook.

  3. Click on the >| button in the top right corner.

  4. Click on the + Add data button.

  5. Search for a dataset in the search bar.

  6. Click on the Add button to add the dataset to your notebook.

  7. Click on the .csv file to find its path.
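
Once a dataset has been added with the steps above, the file paths can also be listed from code. The sketch below walks a directory tree and collects the .csv files it finds. The helper name list_csv_files is hypothetical, and the '../input' root is an assumption based on the relative path this article uses when reading its dataset.

```python
import os

def list_csv_files(root):
    """Return the paths of all .csv files found under root, sorted."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.lower().endswith(".csv"):
                found.append(os.path.join(dirpath, name))
    return sorted(found)

# Assumed location of attached datasets, relative to the notebook's working directory;
# if it does not exist, os.walk yields nothing and the loop prints nothing
for path in list_csv_files("../input"):
    print(path)
```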


Import a dataset using the read_csv function

import numpy as np
import pandas as pd

df = pd.read_csv('../input/pima-indians-diabetes-database/diabetes.csv')
df
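
After loading a file it usually helps to check its size and column types. The sketch below does this on a tiny in-memory CSV, so that it runs without the Kaggle dataset. The column names and values are made up for illustration and are not taken from diabetes.csv.

```python
import io
import pandas as pd

# Hypothetical stand-in data; these column names and values are invented for illustration
csv_text = """name,age,score
Ann,34,88.5
Bob,29,92.0
Cy,41,79.5
"""

# read_csv accepts a file path or any file-like object
df_demo = pd.read_csv(io.StringIO(csv_text))

print(df_demo.shape)             # (3, 3): rows, columns
print(list(df_demo.columns))     # ['name', 'age', 'score']
print(df_demo.dtypes)            # inferred type of each column
```

The same three checks can be applied to the df loaded from the Kaggle path.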

Fetching top 10 rows

df.head(10)

Fetching last 10 rows

df.tail(10)
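
The behaviour of head and tail can be checked on a small frame built in code. This is a sketch on a hypothetical 25-row DataFrame, not on the diabetes dataset.

```python
import pandas as pd

# Hypothetical frame with 25 rows, standing in for a loaded dataset
df_demo = pd.DataFrame({"value": range(25)})

# With no argument, head() and tail() return 5 rows
print(len(df_demo.head()))                    # 5
print(len(df_demo.tail()))                    # 5

# head(10) keeps the first 10 rows, tail(10) the last 10
first_ten = df_demo.head(10)["value"].tolist()
last_ten = df_demo.tail(10)["value"].tolist()
print(first_ten)                              # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
print(last_ten)                               # [15, 16, 17, 18, 19, 20, 21, 22, 23, 24]

# Asking for more rows than exist simply returns the whole frame
print(len(df_demo.head(100)))                 # 25
```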

Conclusion

This is an overview of Python's NumPy and Pandas libraries. In this article, we learned the basics of NumPy and Pandas with the help of a real data set, and explored how to perform various operations with these libraries, which are commonly used in data science applications.

© Numpy Ninja.