Python simplified

02 Jan 2016

Reading time ~18 minutes

Installation

1.Install python 3 from the offical downloads section

2.Install latest version of Java and Eclipse IDE

3.Now install pydev plugin inside eclipse under help, install software(make sure to add the http://www.pydev.com/updates website)

4.Install the dependencies

5.Go to preferences, under pydev, under interpreter python add the python interpreter by including the executable or the alias in usr/local/bin

6.Create a perspective for pydev and remove java perspective if needed

Creating a main script:

Type python in mac(comes with python 2 by default). Now it opens the interpreter. Now type import idlelib.idle. It will open the IDLE in mac.

#!/usr/bin/python3 - called as the shebang line, mentions the path to the interpreter

Whitespace is significant in python(4 spaces are traditional in python)

No semicolons

Python shell

Gives us instant feedback

Use ‘exit()’ to exit out of the shell

‘help’ gives us inf on almost anything. Eg: help(print) will give us adequate inf about the print func in python. To leave the help dialog press the Q key.

dir(str) - tells us all the associated attributes and methods

help(str.rjust) - gives us info abt the rjust method of str

General syntax

Comments in python:

print("Hello") # this is a single line comment

""" print("Hello") 
this is a multi line comment """

Everything in python 3 is an object

a,b = 0,1 - multiple statements

(1,2,3,4,5) is a tuple
[1,2,3,4,5] is a list

True,False are boolean values in python(they are objs of bool class)

a,b,c=10,5,4

None is a data type in python(should be seen as absence of something)

When working with == 0 , 0.0 , False are considered False

When working with ‘is’ 0 , 0.0 are not considered False

Conditionals

if a>b:
  print(a)
elif b>a:
  print(b)
else
  print("This is really cool")

s = "lessthan" if a<b else "not less than" # this is a cond express

Python doesnt have switch statement, but dictionaries can be used to achieve the same principle(selecting from multiple choices) in python

Functions

def is the keyword used to indicate functions

def func_name(args):
  print("Here goes the body of the function")

OOP

Everything in python 3 is an object, variables, functions even code

class Egg:
  
  def __init__(self,kind="fried"):
    self.kind = kind
    """this is the constructor, many py developers call it double underscores dunder"""

  def whatKind(self):
    return self.kind

fried = Egg() #instantaing obj so calls the constructor

Every object has an: 1. ID 2. Class 3. Contents

x=42
id(x)
type(x)

All variables in python are first class objects

Mutable and immutable objects

Objects in python may be mutable or immutable

Most fundamental types in python are immutable

Immutable: 1. Strings 2. Numbers 3. Tuples

Mutable: 1. Lists 2. Dictionaries 3. Other objs depending on their implementation

Numbers

We have Integers and Floats in python

// - integer division

round(float,precision)

int(42.9)

float(42)

The above two are actually the respective constructors of Integer and Float class

Using strings(powerful in python)

Can use single/double quotes for their creation

Can use escapes sequences like \n inside a string

Inorder to specify raw strings, ie not to interpret escape sequences like, path = C:\name. Here \n is not newline, it is a path. So lets use raw string, where the escape sequences are not interpreted by python s = r"This is a raw \n string" # this is often used in reg expr

len(str) - this is used to return the length of the string

Inserting variables in a string:

Python 3 way:

c= "hello" s = "This is {n} string!".format(n=c)

s = "This is {0:.2f} string!".format(n)
""" this {0:.2f} means we are doing a formating before printing. The 0(to the left of :) refers to 
the index of the parameter of the format func, ie 0 means the first parameter in the format 
functions comes inside {} and .2f means we are making sure it is float and has 2 numbers after 
the decimal point """

Python 2 way:

s = "This is %s string!" % n

’’’ or “”” allows us to define preformatted string in python

String can be considered as an immutable sequence, so we can do:

for(i in str): print(i)

Aggregating values with lists and tuples

(1,2,3,4,5) - tuple is immutable in py [1,2,3,4,5] - list is mutable in py

Dictionaries

Dictionaries are mutable objects

Its called hash or associative arrays in other languages {} - used for the definition of dictionaries

d = { 'one':1, 'two':2, 'three':3 }
for k in d: print(k,d[k])
for k in sorted(d.keys()): print(k,d[k])

We have the concept of keyword arguments in python

Keyword arguments are used to a great advantage in defining dict objects

Keys in dict have to be of immutable type and should be unique(ie strings,numbers or tuples not lists)

Values in the dict need not be unique and can be mutable/immutable data type

d = dict( one = 1, two = 2, three = 3}

"one" in d - this returns True

d.keys() #returns a list of keys of the dict

d.values() #returns a list of values of the dict

d.items() #returns a list of tuples(key-value pairs) of the dict

d['seven'] = 7

d.get("seven","othr") #this checks for the key seven,else returns othr

del d['four'] #this deletes this element from the dictionary

d.pop('four') #this deletes it from the dictionary and returns it

Dictionaries to implement switch:

choices = dict(one='first',two="two",three="three")
v = 'one'
print(choices[v])

Appending dictionaries

x= dict(four=4,five=5)
d= dict(one=1,two=2,three=3,**x)

Iterating dictionaries

for k,v in d.items(): print(k,v)
for k in d: print(k,d[v])

Type and id of a variable

id(a) #this returns the unique id of the object

== - checks for the equality of the value store

is - checks for the object id, eg: x is y

Loops

While loop:

a,b=0,1 
while(a<50):
  print(a,end='')
  a,b = b,a+b

For loop:

In python all container objects are iterable

for i in [1,2,3,4]:
	print(i)

Enumerating iterators:

Enumerate is used to get the corresponding index of iterators in python which can be particularly useful

enumerate(iterator) - it is a func ehich takes an iterator as an arg

s = "This is a string"
for index,value in enumerate(s):
  if value==='s': print("index {} is an s".format(index))

Alternate technique to deal with indexes:

s = "This is a string"
for index in range(len(s)):
  if s[index]==='s': print("index {} is an s".format(index))

Control flow keywords:

continue and break are avail in python(works just as same as in other c style languages)

else is another keyword - it is executed when the condition of the loop becomes completely false or if its never true in its first place

for(i in str):
  print(i)
else
  print("over")

Operators

+,-,*,/,//,%

+=,-=,*=,/=,//=

<,>,<=,>=,==,!=, is, is not

and,or(boolean operators)

For immutables the ids will be the same if the values are same

Increment and decrement operator is schieved like x+=1 and x-=1

Bitwise operators:

&,| - bitwise and and or operator

^ - xor operation

x « 4 - x shifted to the left by 4 bits

x » 4 - x shifted to the right by 4 bits

~ - bitwise complement operator

Slice operator:

Ranges in python are non-inclusive, ie range(0,10) means 0 to 9

Ranges can accept 3 parameters: 1. Start 2. Stop(non inclusive) 3. Step(can be positive or negative, negative means right to left)

for i in range(len(str)-1,-1,-1): #len(str)-1 is the last index
#the second parameter is non-inclusive so -1 ensure str(0) is printed
	print(i,str(i)) #prints the index and value of each char in reverse

list[27:42] #means begins at index 27 and ends at index 42 non-inclusive

list[27:42:3] - means begins at index 27 and ends at index 42 non-inclusive and gives the every third element

So the slice operator has 3 args(start,stop and step) - second and third are optional. If the step is negative then it grabs elements from right to left by step size(instead of left to right). Eg: list[::-1] means it will reverse the entire list.

for i in list[27:42:3]: print(i)

List is mutable and we can actually assign to the slice

list[27:42:3] = (99,99,99,99,99,99) - so now all these elements in the list are replaced by 99’s

hello = []

hello[:] = range(100) #list named hello contains numbers 0 to 99

Operator precedence

*,/ have higher precendence over +,-

Can use () to override operator precedence

Regular expressions

Very powerful to match patterns in text

Actually a simple lang in itself, a reg exp can be simple or complex

Implemented in python using the ‘re’ model(this model is in-built in python and comes by default with it)

Searching with regular expressions in python

Search for the line with the pattern and print the line

import re

def main():
  fh = open('raven.txt')
  for line in fh:
    if re.search('(Len|Neverm)ore',line):
      print(line,end='')
if __name__ == "__main__": main()

Search for pattern and print the pattern

import re

def main():
  fh = open('raven.txt')
  for line in fh:
    match = re.search('(Len|Neverm)ore',line)
    if match:
    print(match.group())
if __name__ == "__main__": main()

Search and replace using regular expr

Search and replace in 1 single step

import re

def main():
  fh = open('raven.txt')
  for line in fh:
    print(re.sub('(Len|Neverm)ore','####',line), end='')
if __name__ == "__main__": main()

Search and replace in 2 steps

def main():
  fh = open('raven.txt')
  for line in fh:
    match = re.search('(Len|Neverm)ore',line)
    if match:
      print(line.replace(match.group(),'###'),end='')
if __name__ == "__main__": main()

Pre-compile regex

Pythons reg exp module has a way that we can precompile a reg exp when we are going to be using it over and over again. This is efficient. Also this gives us some additional functionality.

def main():
  fh = open('raven.txt')
  pattern = re.compile('(Len|Neverm)ore')
  for line in fh:
    match = re.search(pattern,line)
    if match:
      print(line.replace(match.group(),'###'),end='')
if __name__ == "__main__": main()

Additional Functionality

def main():
  fh = open('raven.txt')
  pattern = re.compile('(Len|Neverm)ore',re.IGNORECASE)
  for line in fh:
    if re.search(pattern,line):
      print(pattern.sub('###',line),end='')
if __name__ == "__main__": main()

Exceptions

Key method of handling errors in python

“try” something and catch an exception with “except”

We can catch multiple exceptions by have multiple except clauses with their expection names

The code in the else block is executed only if the try block raises no exception(the else block is optional)

We can also add the finally clause, which is optional and gets executed no matter what. Usually resource cleaning/deallocation happens here like closing streams/file handles,etc.

We can raise our own exceptions with “raise”

If we have more than one line in the try suite then the execution stops at the line where the error happens(the lines below are not run) and goes to the except suite to handle it.

try:
  fh=open('filename')
except IOError as e:
    print("Could not open the file:", e)
else:
    for l in fh:print(l.strip())

Raising our own custom exceptions:

def main():
  try:
    for line in readline('xdoc.doc'): print(line.strip())
  except IOError as e:
    print('cannot read file:',e)
  except ValueError as e:
    print('bad filename',e)

def readfile(filename):
  if filename.endswith('.txt'):
    fh = open(filename)
    return fh.readlines()
  else:
  raise ValueError('File must be a .txt file')

if __name__ == "__main__": main()

Functions

To have a function without a body we use the pass keyword which is essentially a NOOP(No Operation)

def empty():
  pass

We can use default arguments in python to make the arguments optional in a functional call

None is a singleton obj and identity is a good way to test for it. (x is None)

List of arguments(arbitary list)

*args - the asterisk means list of optional arguments

The list of arguments come as a normal tuple in python

Named parameters

Mostly it is commonly called kwargs meaning keyword args. kwargs is actually a dictionary

def main():
  testfunc(one=1,two=2,three=3)

def testfunc(**kwargs):
  print('This is a test func',this,that, kwargs['one'],kwargs['two'],kwargs['three'])
  # for k in kwargs: print(k,kwargs[k])

  if __name__ == "__main__": main()

We can combine normal positional arguments, arbitary tuple arguments and keyword arguments in python. The only restriction is that they should be in this order: normal positional args followed by arbitary tuple args followed by keyword args.

def main():
  testfunc(1,2,3,4,5,6,one=1,two=2,three=3)

def testfunc(this,that,other,*args,**kwargs):
  print('This is a test func',kwargs['one'],kwargs['two'],kwargs['three'])

  if __name__ == "__main__": main()

Return values

Can return almost any obj from a function using the “return” keyword

Create a seq from a generator func

Generator function is a func that returns an iterator obj

‘yield’ returns the value and the next time the function is called it continues the execution from the next line(thats what makes it different from ‘return’)

def main():
  for i in inclusive_range(0,25,1):
    print(i,end='')

def inclusive_range(start,stop,step):
  i = start
  while(i<=stop):
    yield i
    i = i+step

  if __name__ == "__main__": main()

Classes

Convention is to use the camel-casing for classes

We can import classes from a file, say from file_name import class_name

When an obj calls a method that self variable gets passed as a reference to the object(not the class), it is as though we are passing the obj itself as an argument when we are instantiating the object of the class.

def __init__(self) - this method is the constructor

Using obj data:

Use accessor methods to read/write attributes of a class. It avoids side effects

Commonly in python we store the obj data in dictionary objects, which gives us a lot of flexibility.

class Duck:
  def __init__(self, **kwargs):
      self.properties = kwargs

  def quack(self):
      print('Quaaack!')

  def walk(self):
      print('Walks like a duck.')

  def get_properties(self):
      return self.properties

  def get_property(self, key):
      return self.properties.get(key, None)

def main():
    donald = Duck(color = 'blue')
    print(donald.get_property('color'))

if __name__ == "__main__": main()

Method1:

class Duck:
  def __init__(self, name="duckkk", legs=2, sound="quack"):
    self.name = name
    self.legs = legs
    self.sound= sound

Method2:

 class Duck:
    def __init__(self, **kwargs):
      self.name = kwargs.get("name","duckkk")
      self.name = kwargs.get("legs",2) #default value is 2
      self.name = kwargs.get("sound","quack")

Method3:

 class Duck:
    def __init__(self, **kwargs):
      for key, value in kwargs.items():
        setattr(self,key,value)

Python also provides with some utitlity functions to handle classes.

getattr(class_name, attribute_name) #returns attrib value

setattr(class_name,attribute_name,value) #sets attrib value

hasattr(class_name,attribute_name) #returns bool value

delattr(class_name,attribute_name) #deletes attribute from the class

isinstance(obj_name,class_name) #returns a bool value

Inheritance:

class Dog(Animal): - This means the class Dog now inherits from the Animal class

Methods in the sub class can be overriden(no need for the override keyword)

super().walk() - calling this in the walk method of the subclass invokes the walk method in the superclass

In python mutliple inheritance is possible, ie a child class can inherit from multiple parents, ie class ChildClass(Parent1,Parent2…..,ParentN)

issubclass(child_class,parent_class) - function to check if a class is child of another class(returns boolean value)

The Duck class above does not inherit from any class, so class Duck: will do, but this wont work in python 2. In python 2 every class has to inherit from some parent class. So we need to change it to class Duck(object): to make it work in both python 2 and 3.

Polymorphism

A strong advantage of loosely typed/ what they call duck typing is that polymorphism is natural

Generator objects

An obj that can be used in the context of an iterable, eg: range obj is used often in the context of an iterable(for loop)

def __iter__(self) - this method makes it an iterable object

String methods

Strings are immutable in python. So all functions which potentially change the string are actually not hacking inits place but returning new strings

"This is so cool {}".format(n)

s= "sailesh"
s.upper() #makes it uppercase
s.lower()
s.swapcase() #swaps/toogles the case of every individual char in the str
s.find('is') #find a word in the string
s.replace('this','that')
s.strip() #removes whitespace,newline from the beg and the end of the #string
s.rstrip() #removes whitespace from the end of the string not from beg
s.rstrip("\n") #removes whatever specified from the end of the string #not from beg
s.isalnum() #return true if all characters in the string are #alphanumeric 
s.isalpha() #returns true if all characters in the string are alphabetic
s.isdecimal() #returns true if all characters in the string are decimal

Splitting and joining strings

s = "This is a string of words"
s.split() # It folds on the whitespaces and splits based on it
s.split(" ") #It will do the splitting but wont do the folding
# split returns a list 
new_string = ','.join(list) #list of words in list in joined by ,

Other string methods

There are over 40 built in string methods in python3

Format strings

We can use the + operator to concatenate strings like “This is “+ var. Alternatively we can use “This is {}”.format(var).

'this is a placeholder {}'.format(a,b)
'this is {1}, that is {0}, there is {1}'.format(a,b)
'this is {bob} and that is {fred}'.format(bob=a,fred=b)
 d = dict(bob=a,fed=b)
'this is {bob}, that is {fred}'.format(**d)

Tuples

Tuples are created with the comma operator. They are 0 index based.

t=1,2,3,4,5 #t[0] holds the value 1, t[4] and t[-1] holds end of list

4 in t #this gives true, similarly we have 'not in'

t.count(5) #count the no of 5's in the tuple

t.index(5) #gives the index of 5

len(t) - gives the length of the list

t=1,2,3,4,5 or t=(1,2,3,4,5) - can have paranthesis, it is the comma operator that actually matters

Tuple with one element is created like t=(1,)

t = tuple(range(25))

t = tuple(range(25))

Lists

Lists in python can hold multiple data types

We can make lists from strings, ie l = list(‘hello’), l is [‘h’,’e’,’l’,’l’,’o’]

Lists are created with [], l = [] or l = list(){l is an empty list}

len(x) - gives the length of the list

max(x),min(x) #gives the max and min element of the list

x=list(range(25))

x[10] = 42 #we cant do this with tuples are they are immutable

4 in x #this gives true, similarly we have 'not in'

x.count(5) #count the no of 5's in the list x

x.index(5) #gives the index of 5 in list x

x.append(25) #adds the value 25 at the end of the list

list1 = list2 = [1,2,3,4]
list1.append([5,6])
# list1 is now [1,2,3,4,[5,6]]
list2.extend([5,6])
# list2 is now [1,2,3,4,5,6]
#thats the difference between append and extend

x.remove(12) #removes the first occurence of 12 from the list

del x[12] #this deletes the value at index 12

x.insert(index,value) #inserts the value at the index(not replaces)

x.pop() #removes and returns the value at the very end of the list

x.pop(0) #removes and returns the value at the very beg of the list

Sets

Unordered collection of unique elements. Can be seen as list with no duplicate elements.

Creation(3 ways):

list = [1,2,3,4,5,2,3] s= set(list) #s is now a set without any duplicates from list
s = set([11,1,2,13,15,11])
s = {1,2,3,4,5,1,2,3} #s is a set without duplicates

len(s) #returns the no of elements in the set

11 in s, 10 not in s(returns boolean values)

Sets in python are mutable. We can add or remove elements

s.add(23) - If we try to add an element to the set which alraedy exists then essentially nothing happens(no change)

s.remove(11)

Operations:

Intersection: set1.intersection(set3) #elements common to both the sets are returned back in a set(note the return type is a set)

Difference: set1.difference(set2) #Elements in set1 but not in set2(result returned back as a set)

Union: set.union(set2) #Union operation of sets(result returned back as a set)

set1.clear() #clears the entire set

File IO

The second arg in the open() func is the mode, by default if not specified it is ‘r’ mode,ie read mode.

Different modes:

‘w’ for write mode
‘a’ for append mode
‘+r’ for read and write mode
Similarly we have ‘+w’, ‘+a’

The open func returns a file handle which is iterable and gives us back one time at a time from the file. There is a method called readlines() which is used on the file handle obj which does the exact same thing.

file_handle.read() #returns the entire content of the file as a string

file_handle.read(5) #reads only 5 bytes

file_handle.seek(0) #sets the cursor position to the beg of the file

file_handle.tell() #tells, ie returns the current cursor position

file_handle.readline() #returns the file content line by line each time it is called

file_handle.readlines() #returns the file content line by line in a list which can be iterated

file_handle.write("data_to_write as a str") - writes data to the file, but make sure you open in the write(w) mode. Note in the w mode if previous data was there in the file then it deletes it, to presrve the data open it in the append(a) mode

file_handle.writelines() #takes a list/tuple as an argument and writes the items in the liat without any spaces or newlines between items

file_handle.close() #this closes the file resource and makes sure changes are saved

Reading and writing text files

def main():
    infile = open('lines.txt','r')
    outfile = open('new.txt','w')
    for line in infile:
        print(line, file=outfile,end = '')

if __name__ == "__main__": main()

Handle big chunks

Instead of reading line by line we can read big chunks of data in buffer to copy contents from one file to the other

def main():
    buffersize  = 50000 #means 50,000 bytes  
    infile = open('lines.txt','r')
    outfile = open('new.txt','w')
    buffer = infile.read(buffersize) #note buffer is not iterable
    
    while(len(buffer)):
    	outfile.write(buffer) 
    	print(".",end='')
    	buffer = infile.read(buffersize)
   
if __name__ == "__main__": main()

Reading and writing binary files

Images store information in pixels and hence lets open the jpg file in binary mode to read inf from it and lets open another file in binary write mode. ‘rb’ is binary read mode and ‘wb’ is binary write mode. The process to read a binary file is very similar, we create a buffer and read chunks of data using it.

def main():
    buffersize  = 50000 #means 50,000 bytes  
    infile = open('olives.jpg','rb')
    outfile = open('new.jpg','wb')
    buffer = infile.read(buffersize) #note buffer is not iterable
    
    while(len(buffer)):
    	outfile.write(buffer) 
    	print(".",end='')
    	buffer = infile.read(buffersize)
   
if __name__ == "__main__": main()