Series Data Structure
Note:
1. One-dimensional array of indexed data
2. It consists of two parts
a) An array of actual data
b) An associated array of index for the data array
3. pandas must be imported before a Series object can be created
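As a minimal sketch of the two parts described above, the data array and the index array of a Series can be inspected separately. The labels and values used here are illustrative only.

```python
import pandas

# A small Series with explicit labels (values chosen only for illustration).
srObj = pandas.Series([12, 13, 14], index=['x', 'y', 'z'])

data_part = srObj.values.tolist()   # the array of actual data
index_part = srObj.index.tolist()   # the associated array of index labels

print(data_part)    # [12, 13, 14]
print(index_part)   # ['x', 'y', 'z']
```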
Creation of Series Data Structure
Syntax:
pandas.Series( data=None, index=None, dtype=None )
Parameters:
data : array-like, Iterable, dict, or scalar value
Contains data stored in Series.
index : array-like or Index (1d) but not a scalar
Values must be hashable and have the same length as `data`.
Non-unique index values are allowed. Will default to
RangeIndex (0, 1, 2, ..., n) if not provided. If both a dict and index
sequence are used, the index will override the keys found in the
dict.
dtype : str, numpy.dtype, or ExtensionDtype, optional
Data type for the output Series. If not specified, this will be
inferred from `data`.
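The parameter descriptions above can be tried out with a short sketch. The variable names and sample values are assumptions made for illustration.

```python
import pandas

# No index given: labels default to a RangeIndex starting at 0.
default_idx = pandas.Series([10, 20, 30])

# Non-unique index labels are allowed.
repeated = pandas.Series([1, 2, 3], index=['a', 'a', 'b'])

# dict plus an explicit index: the index decides which keys are kept
# and in what order; a label missing from the dict ('z') becomes NaN.
from_dict = pandas.Series({'a': 2, 'b': 5, 'c': 45}, index=['c', 'a', 'z'])

# dtype forces the element type instead of inferring it from the data.
as_float = pandas.Series([1, 2, 3], dtype='float64')

print(default_idx.index.tolist())   # [0, 1, 2]
print(len(repeated['a']))           # 2
print(as_float.dtype)               # float64
```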
Syntax
|
Example
|
Remarks
|
Empty series object
<identifier>=pandas.Series()
|
srObj=pandas.Series()
|
Creates an empty Series object; values can be added later
|
Creating a series object from Sequence
<identifier>=pandas.Series(data,index)
|
srObj=pandas.Series(range(10))
|
Creates a Series from range(10); the index starts at 0
|
srObj=pandas.Series([12,13,14,15,16],index=[1,2,3,4,5])
|
Creates a Series from a list; the index starts at 1 (the index must have the same length as the data)
| |
srObj=pandas.Series((1,2,3,4,5))
|
Creates a Series from a tuple
| |
Creating a series object from ndarray
<numPyArray>=numpy.array(<sequence>)
<identifier>=pandas.Series(<numPyArray>)
|
import numpy,pandas
npArray=numpy.array([4,5,6,7,8])
srObj=pandas.Series(npArray)
|
Any one-dimensional ndarray can be passed to the Series constructor
|
Creating a series object from a Scalar
<identifier>=pandas.Series(<scalar>)
|
srObj=pandas.Series(10,index=[1])
|
A scalar is a single value such as a number; it is repeated for every label in the index
|
Creating a series object from dictionary
|
srObj=pandas.Series({'a':2, 'b':5, 'c':45, 'd':32})
|
Dictionary values are used as the data and keys are used as the index
|
Adding Data,index and Data type to series object
|
srObject=pandas.Series(data=[1,2,3,4],index=['a','b','c','d'],dtype=numpy.float64)
| |
Creating a series object from arithmetic operation
|
Ls=numpy.array([2,3,4]) # an ndarray multiplies element-wise; a plain list would be repeated instead
srObject=pandas.Series(data=(Ls*2))
|
Creates a Series object by multiplying each element of the array by 2
|
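One detail of the creation patterns above is easy to get wrong: the `*` operator repeats a plain Python list rather than scaling its elements. The sketch below contrasts the two cases and also shows a scalar being repeated across an index. The variable names are illustrative.

```python
import numpy
import pandas

Ls = [2, 3, 4]

# Multiplying a plain list repeats it; it does not scale the elements.
repeated = pandas.Series(Ls * 2)

# Converting to an ndarray first gives element-wise multiplication.
doubled = pandas.Series(numpy.array(Ls) * 2)

# A scalar is repeated once per index label.
filled = pandas.Series(10, index=['a', 'b', 'c'])

print(repeated.tolist())   # [2, 3, 4, 2, 3, 4]
print(doubled.tolist())    # [4, 6, 8]
print(filled.tolist())     # [10, 10, 10]
```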
Series Data Structure Attributes
Note: Using the series data structure attributes we can access the various details about the Series Object
Attribute
|
Use
|
SeriesObject.index
|
Returns the index (labels) of the Series object; for a default index this is a RangeIndex showing start, stop and step
|
SeriesObject.values
|
Return Series as ndarray or ndarray-like depending on the dtype
|
SeriesObject.dtype
|
Returns the data type of the underlying data in the Series object
|
SeriesObject.shape
|
Return a tuple of the shape of the underlying data.
|
SeriesObject.nbytes
|
Returns the number of bytes in the underlying data, i.e. the values (the index is not counted)
|
SeriesObject.size
|
Returns the number of elements in the Series object
|
SeriesObject.dtype.itemsize
|
Returns the size in bytes of each element of the underlying data (SeriesObject.itemsize was removed in pandas 1.0)
|
SeriesObject.hasnans
|
Returns True if the Series object contains any NaN (Not a Number) value, otherwise False
|
SeriesObject.empty
|
Returns True if the Series object is empty, otherwise False
|
Example:
srObj=pandas.Series([2,3,4,5,6])
print(srObj.size)
print(srObj.empty)
Output:
5
False
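A slightly fuller sketch of the attributes listed above, using a float Series with one missing value so that `hasnans` is True. The sample values are assumptions chosen for illustration.

```python
import numpy
import pandas

srObj = pandas.Series([2.0, numpy.nan, 4.0, 5.0])

print(srObj.index)            # RangeIndex(start=0, stop=4, step=1)
print(srObj.dtype)            # float64
print(srObj.shape)            # (4,)
print(srObj.size)             # 4
print(srObj.dtype.itemsize)   # 8
print(srObj.nbytes)           # 32
print(srObj.hasnans)          # True
print(srObj.empty)            # False
```

Here `nbytes` equals `size` multiplied by the size of one element (4 x 8 bytes).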
Operation on Series Data Structure
Syntax
|
Example
|
Accessing an element in a series Object
| |
SeriesObject[<valid index>]
|
dc={'a':45, 'b':56, 'c':78}
srObj=pandas.Series(dc)
print(srObj['b'])
Output:
56
|
Accessing a Slice of elements from a series Object
| |
SeriesObject[start_index: stop_index: step_value]
|
dc={'a':45, 'b':56, 'c':78, 'd':57, 'e':62}
srObj=pandas.Series(dc)
print(srObj['a':'d']) # a label-based slice includes the stop label
Output:
a 45
b 56
c 78
d 57
dtype: int64
|
Modification of elements in a series object
| |
seriesObject[index]=new_data_value
seriesObject[start: stop]=sequence_of_new_values
|
dc={'a':45, 'b':56, 'c':78}
srObj=pandas.Series(dc)
srObj['b']=32
print(srObj)
Output:
a 45
b 32
c 78
dtype: int64
|
Vectorized operation on a series object (the operation applies to each element of the object individually)
| |
<seriesObject> operator <scalar>
|
srObj + 2 # adds 2 to each element and returns a new series object
srObj * 5 # multiplies each element by 5
srObj > 3 # compares each element with 3 and returns True/False for each
srObj = srObj + 3 # adds 3 to each element and assigns the result back to srObj
|
Arithmetic operation between series object ( performs operation on the matching index)
| |
<seriesObject1> operator <seriesObject2>
Note:
1. The index of the resultant object is the union of the indexes of the two series objects
2. An index label that is present in only one of the series objects results in NaN
|
srObj1 + srObj2
|
Reindexing a series object
| |
Identifier=seriesObject.reindex(<sequence>)
|
srObj=pandas.Series([10,12,13,14],index=['a','b','c','d'])
Obj1=srObj.reindex(['d','c','b','a','e']) # rearranges by label; 'e' is not in srObj, so its value is NaN
|
Removing an element from a series object
| |
seriesObject.drop('valid_index')
|
dc={'a':45, 'b':56, 'c':78}
srObj=pandas.Series(dc)
srObj=srObj.drop('b') # drop returns a new series object, so the result is assigned back
print(srObj)
Output:
a 45
c 78
dtype: int64
|
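Three behaviours from the operations above are worth seeing side by side: label slices include the stop label, arithmetic between two series objects aligns on labels, and drop does not modify the original object. The sketch below uses made-up values and illustrative variable names.

```python
import pandas

srObj = pandas.Series({'a': 45, 'b': 56, 'c': 78, 'd': 57, 'e': 62})

# Label-based slices include the stop label; positional slices via iloc do not.
by_label = srObj['a':'c']
by_position = srObj.iloc[0:2]

# Arithmetic aligns on labels; a label present in only one operand gives NaN.
left = pandas.Series([1, 2, 3], index=['a', 'b', 'c'])
right = pandas.Series([10, 20, 30], index=['b', 'c', 'd'])
total = left + right

# drop returns a new Series and leaves the original unchanged.
without_b = srObj.drop('b')

print(by_label.index.tolist())      # ['a', 'b', 'c']
print(by_position.index.tolist())   # ['a', 'b']
print(total.index.tolist())         # ['a', 'b', 'c', 'd']
print('b' in srObj.index)           # True
print('b' in without_b.index)       # False
```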