Installation
1. Install Python 3 from the official downloads section
2. Install the latest version of Java and the Eclipse IDE
3. Now install the PyDev plugin inside Eclipse under Help, Install New Software (make sure to add the http://www.pydev.com/updates website)
4. Install the dependencies
5. Go to Preferences, under PyDev, under the Python interpreter settings, and add the Python interpreter by pointing at the executable or the alias in /usr/local/bin
6. Create a perspective for PyDev and remove the Java perspective if needed
Creating a main script:
Type python in the terminal on a Mac (which comes with Python 2 by default). This opens the interpreter. Now type import idlelib.idle and it will open IDLE.
#!/usr/bin/python3 - called the shebang line; it gives the path to the interpreter
Whitespace is significant in Python (4 spaces per indentation level is the convention)
No semicolons are needed at the end of statements
Python shell
Gives us instant feedback
Use ‘exit()’ to exit out of the shell
‘help’ gives us information on almost anything. E.g. help(print) describes the print function in Python. To leave the help dialog press the Q key.
dir(str) - lists all the attributes and methods of str
help(str.rjust) - gives us information about the rjust method of str
General syntax
Comments in python:
print("Hello") # this is a single-line comment
""" print("Hello")
this is a multi-line string; when it is not assigned to anything it is commonly used as a multi-line comment """
Everything in python 3 is an object
a,b = 0,1
- multiple assignment in a single statement
(1,2,3,4,5) is a tuple
[1,2,3,4,5] is a list
True and False are the boolean values in Python (they are objects of the bool class)
a,b,c=10,5,4
None is a special value in Python (the only instance of NoneType); it should be seen as the absence of something
When comparing with ==, both 0 and 0.0 compare equal to False (0 == False is True)
When comparing with ‘is’, 0 and 0.0 are not the same object as False (0 is False is False)
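The difference between the two comparisons can be sketched as follows. This is only an illustration; the variable names are made up for the example.

```python
zero = 0
float_zero = 0.0

# == compares values: both zeros compare equal to False.
equal_by_value = (zero == False, float_zero == False)

# 'is' compares object identity: neither zero is the False object itself.
same_object = (zero is False, float_zero is False)

# None is a singleton, so identity is the usual way to test for it.
nothing = None
none_check = nothing is None

print(equal_by_value, same_object, none_check)
```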
Conditionals
if a>b:
    print(a)
elif b>a:
    print(b)
else:
    print("This is really cool")
s = "less than" if a<b else "not less than" # this is a conditional expression
Python traditionally has no switch statement (a match statement was added in Python 3.10), but dictionaries can be used to achieve the same thing (selecting from multiple choices)
Functions
def is the keyword used to define functions
def func_name(args):
    print("Here goes the body of the function")
OOP
Everything in Python 3 is an object: variables, functions, even code
class Egg:
    def __init__(self,kind="fried"):
        """this is the constructor; many Python developers read the double underscores as "dunder" """
        self.kind = kind
    def whatKind(self):
        return self.kind
fried = Egg() #instantiating the object calls the constructor
Every object has an: 1. ID 2. Class 3. Contents
x=42
id(x)
type(x)
All variables in python are first class objects
Mutable and immutable objects
Objects in python may be mutable or immutable
Most fundamental types in python are immutable
Immutable: 1. Strings 2. Numbers 3. Tuples
Mutable: 1. Lists 2. Dictionaries 3. Other objs depending on their implementation
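A minimal sketch of what mutable and immutable mean in practice. The names are hypothetical and only serve the illustration.

```python
# Immutable: "changing" a string actually builds a new object.
text = "abc"
original_text = text
text += "d"
text_is_same_object = text is original_text        # False

# Mutable: a list is changed in place, so it stays the same object.
numbers = [1, 2, 3]
original_numbers = numbers
numbers.append(4)
numbers_is_same_object = numbers is original_numbers  # True

# Immutable: assigning to a tuple element raises TypeError.
point = (1, 2)
try:
    point[0] = 99
    tuple_assignment_failed = False
except TypeError:
    tuple_assignment_failed = True

print(text_is_same_object, numbers_is_same_object, tuple_assignment_failed)
```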
Numbers
We have integers (int) and floats (float) in Python
// - integer (floor) division
round(number, ndigits)
int(42.9) # gives 42, the fractional part is dropped
float(42) # gives 42.0
The above two are actually the constructors of the int and float classes
Using strings(powerful in python)
Can use single or double quotes to create them
Can use escape sequences like \n inside a string
A raw string tells Python not to interpret escape sequences. For example, in path = "C:\name" the \n would be read as a newline even though it is part of a path. In a raw string the escape sequences are left alone: s = r"This is a raw \n string" # this is often used in regular expressions
len(s) - returns the length of the string s
Inserting variables in a string:
Python 3 way:
c = "hello"
n = 3.14159
s = "This is {n} string!".format(n=c)
s = "This is {0:.2f} string!".format(n)
""" {0:.2f} applies formatting before printing. The 0 (to the left of the :) is the
index of the argument passed to format, i.e. 0 means the first argument, and .2f
formats it as a float with 2 digits after the decimal point """
Python 2 way:
s = "This is %s string!" % n
''' or """ lets us define a multi-line (preformatted) string in Python
A string can be treated as an immutable sequence, so we can do:
for ch in s: print(ch)
Aggregating values with lists and tuples
(1,2,3,4,5) - a tuple is immutable in Python
[1,2,3,4,5] - a list is mutable in Python
Dictionaries
Dictionaries are mutable objects
It is called a hash or an associative array in other languages
{} - used to define dictionaries
d = { 'one':1, 'two':2, 'three':3 }
for k in d: print(k,d[k])
for k in sorted(d.keys()): print(k,d[k])
We have the concept of keyword arguments in python
Keyword arguments are used to a great advantage in defining dict objects
Keys in a dict must be of an immutable (hashable) type and must be unique (e.g. strings, numbers or tuples, not lists)
Values in the dict need not be unique and can be of any mutable or immutable type
d = dict( one = 1, two = 2, three = 3)
"one" in d
- this returns True
d.keys() #returns a view of the keys of the dict
d.values() #returns a view of the values of the dict
d.items() #returns a view of (key, value) tuples of the dict
d['seven'] = 7
d.get("seven","othr") #returns the value for the key seven if present, else returns othr
del d['seven'] #this deletes this element from the dictionary (KeyError if the key is missing)
d.pop('three') #this deletes it from the dictionary and returns its value
Dictionaries to implement switch:
choices = dict(one='first',two='second',three='third')
v = 'one'
print(choices[v])
Merging dictionaries
x= dict(four=4,five=5)
d= dict(one=1,two=2,three=3,**x)
Iterating dictionaries
for k,v in d.items(): print(k,v)
for k in d: print(k,d[k])
Type and id of a variable
id(a) #this returns the unique id of the object
== - checks whether the stored values are equal
is - checks whether both names refer to the same object (same id), e.g. x is y
Loops
While loop:
a,b=0,1
while a<50:
    print(a,end=' ')
    a,b = b,a+b
For loop:
In Python all container objects are iterable
for i in [1,2,3,4]:
    print(i)
Enumerating iterables:
enumerate gives us the index of each item as we iterate, which can be particularly useful
enumerate(iterable) - a function which takes an iterable as an argument
s = "This is a string"
for index,value in enumerate(s):
    if value=='s': print("index {} is an s".format(index))
Alternate technique to deal with indexes:
s = "This is a string"
for index in range(len(s)):
    if s[index]=='s': print("index {} is an s".format(index))
Control flow keywords:
continue and break are available in Python (they work the same as in C-style languages)
else is another loop keyword - its block runs when the loop finishes normally, i.e. the while condition becomes false (even if it was never true) or the for loop runs out of items; it is skipped if the loop exits through break
for ch in s:
    print(ch)
else:
    print("over")
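The interaction between break and the loop else can be sketched like this. The function name find_first_negative is made up for the example.

```python
def find_first_negative(values):
    """Return the first negative number in values, or None if there is none."""
    for value in values:
        if value < 0:
            result = value
            break          # leaving through break skips the else block
    else:
        # Runs only when the loop was never broken out of,
        # including when values is empty.
        result = None
    return result

print(find_first_negative([3, -1, -5]))
print(find_first_negative([1, 2, 3]))
```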
Operators
+,-,*,/,//,%
+=,-=,*=,/=,//=
<,>,<=,>=,==,!=, is, is not
and,or(boolean operators)
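For immutables the ids may be the same if the values are the same (CPython caches small integers and some strings), but this is an implementation detail and should not be relied on
There are no ++ and -- operators; increment and decrement are achieved with x+=1
and x-=1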
Bitwise operators:
&,| - bitwise and and or operators
^ - xor operation
x << 4 - x shifted to the left by 4 bits
x >> 4 - x shifted to the right by 4 bits
~ - bitwise complement operator
Slice operator:
Ranges in Python are non-inclusive, i.e. range(0,10) means 0 to 9
Ranges can accept 3 parameters: 1. Start 2. Stop (non-inclusive) 3. Step (can be positive or negative; negative means right to left)
for i in range(len(s)-1,-1,-1): #len(s)-1 is the last index
    #the second parameter is non-inclusive so -1 ensures s[0] is printed
    print(i,s[i]) #prints the index and value of each char in reverse
list[27:42] #begins at index 27 and ends at index 42 (non-inclusive)
list[27:42:3] - begins at index 27, ends at index 42 (non-inclusive) and gives every third element
So the slice operator has 3 args (start, stop and step), all of which are optional. If the step is negative it grabs elements from right to left by step size (instead of left to right). E.g. list[::-1] returns a reversed copy of the entire list.
for i in list[27:42:3]: print(i)
A list is mutable and we can actually assign to a slice
list[27:42:3] = (99,99,99,99,99) - the 5 selected elements (indexes 27, 30, 33, 36 and 39) are replaced by 99s; with a step, the number of new values must match the number of selected elements
hello = []
hello[:] = range(100) #list named hello contains numbers 0 to 99
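A small sketch of the slice operator on a short list, so the results are easy to check by hand. The list name numbers is just an example.

```python
numbers = list(range(10))            # [0, 1, 2, ..., 9]

middle = numbers[2:7]                # indexes 2..6, stop is non-inclusive
every_third = numbers[1:9:3]         # indexes 1, 4, 7
reversed_copy = numbers[::-1]        # negative step walks right to left

# Assigning to an extended slice: 3 indexes selected (0, 2, 4),
# so exactly 3 new values are required.
numbers[0:6:2] = [99, 99, 99]

print(middle, every_third, reversed_copy, numbers)
```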
Operator precedence
*,/ have higher precedence than +,-
Can use () to override operator precedence
Regular expressions
Very powerful for matching patterns in text
Actually a small language in itself; a regular expression can be simple or complex
Implemented in Python by the ‘re’ module (this module is built in and comes with Python by default)
Searching with regular expressions in python
Search for the line with the pattern and print the line
import re
def main():
    fh = open('raven.txt')
    for line in fh:
        if re.search('(Len|Neverm)ore',line):
            print(line,end='')
if __name__ == "__main__": main()
Search for pattern and print the pattern
import re
def main():
    fh = open('raven.txt')
    for line in fh:
        match = re.search('(Len|Neverm)ore',line)
        if match:
            print(match.group())
if __name__ == "__main__": main()
Search and replace using regular expr
Search and replace in 1 single step
import re
def main():
    fh = open('raven.txt')
    for line in fh:
        print(re.sub('(Len|Neverm)ore','####',line), end='')
if __name__ == "__main__": main()
Search and replace in 2 steps
import re
def main():
    fh = open('raven.txt')
    for line in fh:
        match = re.search('(Len|Neverm)ore',line)
        if match:
            print(line.replace(match.group(),'###'),end='')
if __name__ == "__main__": main()
Pre-compile regex
Python's re module lets us precompile a regular expression when we are going to use it over and over again. This is more efficient and also gives us some additional functionality.
import re
def main():
    fh = open('raven.txt')
    pattern = re.compile('(Len|Neverm)ore')
    for line in fh:
        match = pattern.search(line) #the compiled pattern has its own search method
        if match:
            print(line.replace(match.group(),'###'),end='')
if __name__ == "__main__": main()
Additional Functionality
import re
def main():
    fh = open('raven.txt')
    pattern = re.compile('(Len|Neverm)ore',re.IGNORECASE) #flags are given when compiling
    for line in fh:
        if pattern.search(line):
            print(pattern.sub('###',line),end='')
if __name__ == "__main__": main()
Exceptions
The key way of handling errors in Python
“try” something and catch an exception with “except”
We can catch multiple exceptions by having multiple except clauses, each naming its exception
The code in the else block is executed only if the try block raises no exception(the else block is optional)
We can also add the finally clause, which is optional and gets executed no matter what. Usually resource cleaning/deallocation happens here like closing streams/file handles,etc.
We can raise our own exceptions with “raise”
If we have more than one line in the try suite then the execution stops at the line where the error happens(the lines below are not run) and goes to the except suite to handle it.
try:
    fh=open('filename')
except IOError as e:
    print("Could not open the file:", e)
else:
    for l in fh: print(l.strip())
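The order in which the except, else and finally clauses run can be sketched with a small function. parse_int and the log list are hypothetical names used only to record which clauses ran.

```python
def parse_int(text, log):
    """Convert text to int, recording which clauses ran in log."""
    try:
        value = int(text)
    except ValueError:
        log.append("except")    # runs only when int() fails
        value = None
    else:
        log.append("else")      # runs only when the try block raised nothing
    finally:
        log.append("finally")   # runs no matter what
    return value

good_log, bad_log = [], []
print(parse_int("42", good_log), good_log)
print(parse_int("forty-two", bad_log), bad_log)
```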
Raising our own custom exceptions:
def main():
    try:
        for line in readfile('xdoc.doc'): print(line.strip())
    except IOError as e:
        print('cannot read file:',e)
    except ValueError as e:
        print('bad filename',e)
def readfile(filename):
    if filename.endswith('.txt'):
        fh = open(filename)
        return fh.readlines()
    else:
        raise ValueError('File must be a .txt file')
if __name__ == "__main__": main()
Functions
To have a function with an empty body we use the pass keyword, which is essentially a no-op (no operation)
def empty():
    pass
We can use default arguments in Python to make arguments optional in a function call
None is a singleton object and identity is a good way to test for it (x is None)
Arbitrary argument lists
*args - the asterisk collects any extra positional arguments
These arguments arrive as a normal tuple
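A short sketch of default arguments and *args. The function names greet and collect are invented for the illustration.

```python
def greet(name, greeting=None):
    # None as the default, tested by identity, marks "argument not given".
    if greeting is None:
        greeting = "Hello"
    return "{} {}".format(greeting, name)

def collect(label, *args):
    # args is always a tuple; it is empty when no extra arguments are passed.
    return label, args

print(greet("Ann"))
print(greet("Ann", "Hi"))
print(collect("x"))
print(collect("x", 1, 2))
```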
Named parameters
These are commonly called kwargs, meaning keyword args. kwargs is actually a dictionary
def main():
    testfunc(one=1,two=2,three=3)
def testfunc(**kwargs):
    print('This is a test func',kwargs['one'],kwargs['two'],kwargs['three'])
    # for k in kwargs: print(k,kwargs[k])
if __name__ == "__main__": main()
We can combine normal positional arguments, an arbitrary tuple of arguments and keyword arguments. The only restriction is the order: normal positional args, followed by the arbitrary tuple args, followed by the keyword args.
def main():
    testfunc(1,2,3,4,5,6,one=1,two=2,three=3)
def testfunc(this,that,other,*args,**kwargs):
    print('This is a test func',this,that,other,args,kwargs['one'],kwargs['two'],kwargs['three'])
if __name__ == "__main__": main()
Return values
Can return almost any object from a function using the “return” keyword
Creating a sequence from a generator function
A generator function is a function that returns an iterator object
‘yield’ hands back a value and suspends the function; the next time a value is requested, execution resumes from the line after the yield (that is what makes it different from ‘return’)
def main():
    for i in inclusive_range(0,25,1):
        print(i,end=' ')
def inclusive_range(start,stop,step):
    i = start
    while i<=stop:
        yield i
        i = i+step
if __name__ == "__main__": main()
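How a generator suspends and resumes can be seen by pulling values one at a time with next(). This is a sketch; countdown is a hypothetical generator, not part of the notes above.

```python
def countdown(start):
    """Yield start, start-1, ..., 1."""
    current = start
    while current > 0:
        yield current      # execution pauses here until the next value is requested
        current -= 1

gen = countdown(3)
first = next(gen)          # runs the body up to the first yield
second = next(gen)         # resumes after the yield and loops once more
remaining = list(gen)      # drains whatever is left

print(first, second, remaining)
```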
Classes
Convention is to use CapWords (upper camel case) for class names
We can import classes from a file, e.g. from file_name import class_name
When a method is called on an object, self is passed automatically as a reference to that object (not the class); it is as though we pass the object itself as the first argument of the method.
def __init__(self)
- this method is the constructor
Using object data:
Use accessor methods to read/write the attributes of an object. It avoids side effects
A common approach is to store the object data in a dictionary, which gives us a lot of flexibility.
class Duck:
    def __init__(self, **kwargs):
        self.properties = kwargs
    def quack(self):
        print('Quaaack!')
    def walk(self):
        print('Walks like a duck.')
    def get_properties(self):
        return self.properties
    def get_property(self, key):
        return self.properties.get(key, None)
def main():
    donald = Duck(color = 'blue')
    print(donald.get_property('color'))
if __name__ == "__main__": main()
Method1:
class Duck:
    def __init__(self, name="duckkk", legs=2, sound="quack"):
        self.name = name
        self.legs = legs
        self.sound= sound
Method2:
class Duck:
    def __init__(self, **kwargs):
        self.name = kwargs.get("name","duckkk")
        self.legs = kwargs.get("legs",2) #default value is 2
        self.sound = kwargs.get("sound","quack")
Method3:
class Duck:
    def __init__(self, **kwargs):
        for key, value in kwargs.items():
            setattr(self,key,value)
Python also provides some utility functions for working with objects and their attributes.
getattr(obj, attribute_name) #returns the attribute value
setattr(obj, attribute_name, value) #sets the attribute value
hasattr(obj, attribute_name) #returns a bool value
delattr(obj, attribute_name) #deletes the attribute from the object
isinstance(obj, class_name) #returns a bool value
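The utility functions above could be exercised roughly like this. It reuses the Duck class shape from Method3; the attribute names color and legs are made up.

```python
class Duck:
    def __init__(self, **kwargs):
        # Turn every keyword argument into an attribute of the object.
        for key, value in kwargs.items():
            setattr(self, key, value)

donald = Duck(color="blue")

has_color = hasattr(donald, "color")
color = getattr(donald, "color")
default_legs = getattr(donald, "legs", 2)   # third argument is a fallback value
setattr(donald, "legs", 4)
delattr(donald, "color")
has_color_after_delete = hasattr(donald, "color")
is_duck = isinstance(donald, Duck)

print(has_color, color, default_legs, donald.legs, has_color_after_delete, is_duck)
```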
Inheritance:
class Dog(Animal):
- this means the class Dog now inherits from the Animal class
Methods in the subclass can be overridden (no override keyword is needed)
super().walk()
- calling this in the walk method of the subclass invokes the walk method of the superclass
In Python multiple inheritance is possible, i.e. a child class can inherit from multiple parents: class ChildClass(Parent1, Parent2, ..., ParentN)
issubclass(child_class,parent_class)
- function to check if a class is a child of another class (returns a boolean value)
The Duck class above does not inherit from any class, so class Duck: will do in Python 3. In Python 2 this creates an old-style class, which lacks features such as super(). Writing class Duck(object): gives a new-style class in Python 2 and behaves the same in Python 3.
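A minimal sketch of overriding and super(). The bodies of Animal and Dog here are invented for the example.

```python
class Animal:
    def walk(self):
        return "Animal walks"

class Dog(Animal):
    def walk(self):
        # Call the parent implementation, then extend its result.
        base = super().walk()
        return base + ", Dog trots"

rex = Dog()
print(rex.walk())
print(issubclass(Dog, Animal), isinstance(rex, Animal))
```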
Polymorphism
A strong advantage of dynamic typing, or what is called duck typing, is that polymorphism comes naturally
Generator objects
An object that can be used in the context of an iterable, e.g. a range object is often used as the iterable in a for loop
def __iter__(self)
- this method makes it an iterable object
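One way to make a class iterable is to write __iter__ as a generator method. The sketch below assumes a hypothetical InclusiveRange class; it is an illustration, not the only way to do it.

```python
class InclusiveRange:
    """Iterable over start, start+step, ... up to and including stop."""
    def __init__(self, start, stop, step=1):
        self.start = start
        self.stop = stop
        self.step = step

    def __iter__(self):
        # Because this method contains yield, each call returns a fresh
        # generator, so the object can be looped over more than once.
        current = self.start
        while current <= self.stop:
            yield current
            current += self.step

fives = InclusiveRange(0, 10, 5)
first_pass = list(fives)
second_pass = [value for value in fives]
print(first_pass, second_pass)
```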
String methods
Strings are immutable in Python. So all methods which appear to change the string do not modify it in place but return new strings
"This is so cool {}".format(n)
s= "sailesh"
s.upper() #makes it uppercase
s.lower() #makes it lowercase
s.swapcase() #swaps/toggles the case of every individual char in the string
s.find('is') #returns the index of the first occurrence of the substring, or -1 if it is not found
s.replace('this','that')
s.strip() #removes whitespace, including newlines, from the beginning and the end of the string
s.rstrip() #removes whitespace from the end of the string, not from the beginning
s.rstrip("\n") #removes the specified characters from the end of the string, not from the beginning
s.isalnum() #returns True if all characters in the string are alphanumeric
s.isalpha() #returns True if all characters in the string are alphabetic
s.isdecimal() #returns True if all characters in the string are decimal digits
Splitting and joining strings
s = "This is a string of words"
s.split() # splits on runs of whitespace, treating consecutive whitespace as one separator
s.split(" ") # splits on every single space, so consecutive spaces produce empty strings
# split returns a list
new_string = ','.join(s.split()) # the words in the list are joined by ,
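The difference between the two forms of split shows up when the text contains repeated spaces. A small sketch with a made-up sentence:

```python
sentence = "two  spaces here"          # note the double space

folded = sentence.split()              # runs of whitespace count as one separator
unfolded = sentence.split(" ")         # every single space is a separator
joined = ",".join(folded)

print(folded)
print(unfolded)
print(joined)
```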
Other string methods
There are over 40 built in string methods in python3
Format strings
We can use the + operator to concatenate strings like "This is " + var. Alternatively we can use "This is {}".format(var).
'this is {} and that is {}'.format(a,b)
'this is {1}, that is {0}, there is {1}'.format(a,b)
'this is {bob} and that is {fred}'.format(bob=a,fred=b)
d = dict(bob=a,fred=b)
'this is {bob}, that is {fred}'.format(**d)
Tuples
Tuples are created with the comma operator. They are 0-indexed.
t=1,2,3,4,5 #t[0] holds the value 1; t[4] and t[-1] both hold the last element
4 in t #this gives True; similarly we have 'not in'
t.count(5) #counts the number of 5's in the tuple
t.index(5) #gives the index of the first 5
len(t) - gives the length of the tuple
t=1,2,3,4,5 or t=(1,2,3,4,5) - can have parentheses; it is the comma that actually matters
A tuple with one element is created like t=(1,)
t = tuple(range(25))
Lists
Lists in python can hold multiple data types
We can make lists from strings, i.e. l = list('hello') gives ['h','e','l','l','o']
Lists are created with [], l = [] or l = list() (l is an empty list)
len(x) - gives the length of the list
max(x),min(x) #gives the max and min element of the list
x=list(range(25))
x[10] = 42 #we can't do this with tuples as they are immutable
4 in x #this gives True; similarly we have 'not in'
x.count(5) #counts the number of 5's in the list x
x.index(5) #gives the index of the first 5 in list x
x.append(25) #adds the value 25 at the end of the list
list1 = [1,2,3,4]
list2 = [1,2,3,4] #two separate lists; list1 = list2 = [1,2,3,4] would make both names refer to the same list
list1.append([5,6])
# list1 is now [1,2,3,4,[5,6]]
list2.extend([5,6])
# list2 is now [1,2,3,4,5,6]
#that's the difference between append and extend
x.remove(12) #removes the first occurrence of 12 from the list
del x[12] #this deletes the value at index 12
x.insert(index,value) #inserts the value at the index (does not replace)
x.pop() #removes and returns the value at the very end of the list
x.pop(0) #removes and returns the value at the very beginning of the list
Sets
Unordered collection of unique elements. Can be seen as a list with no duplicate elements.
Creation (3 ways):
- list = [1,2,3,4,5,2,3]; s = set(list) #s is now a set without any duplicates from list
- s = set([11,1,2,13,15,11])
- s = {1,2,3,4,5,1,2,3} #s is a set without duplicates
len(s) #returns the no of elements in the set
11 in s, 10 not in s (return boolean values)
Sets in Python are mutable. We can add or remove elements
s.add(23) - if we try to add an element which already exists in the set then essentially nothing happens (no change)
s.remove(11) - raises KeyError if the element is not in the set
Operations:
Intersection: set1.intersection(set2) #elements common to both the sets are returned in a set (note the return type is a set)
Difference: set1.difference(set2) #elements in set1 but not in set2 (result returned as a set)
Union: set1.union(set2) #elements in either set (result returned as a set)
set1.clear() #clears the entire set
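The set operations above can be sketched with two small sets. The names evens and small are just examples.

```python
evens = {0, 2, 4, 6, 8}
small = {0, 1, 2, 3, 4}

common = evens.intersection(small)      # in both sets
only_evens = evens.difference(small)    # in evens but not in small
combined = evens.union(small)           # in either set

# Duplicates are dropped on creation, and re-adding an element changes nothing.
deduplicated = set([1, 2, 2, 3, 3, 3])
small.add(4)
size_after_duplicate_add = len(small)

print(common, only_evens, combined, deduplicated, size_after_duplicate_add)
```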
File IO
The second argument of the open() function is the mode; if it is not specified it defaults to ‘r’, i.e. read mode.
Different modes:
- ‘w’ for write mode
- ‘a’ for append mode
- ‘r+’ for read and write mode
- Similarly we have ‘w+’ and ‘a+’
The open function returns a file handle which is iterable and gives us back one line at a time from the file. There is also a method called readlines() on the file handle which returns all the lines at once as a list.
file_handle.read() #returns the entire content of the file as a string
file_handle.read(5) #reads only 5 characters (5 bytes in binary mode)
file_handle.seek(0) #sets the cursor position to the beginning of the file
file_handle.tell() #returns the current cursor position
file_handle.readline() #returns the next line of the file each time it is called
file_handle.readlines() #returns all the lines of the file as a list which can be iterated
file_handle.write("data_to_write as a str")
- writes data to the file, but make sure you open it in write (w) mode. Note that w mode deletes any data already in the file; to preserve the data open it in append (a) mode
file_handle.writelines(lines) #takes a list/tuple of strings as an argument and writes the items without adding any spaces or newlines between them
file_handle.close() #closes the file resource and makes sure changes are saved
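The file methods above could be tried out as in the sketch below. It writes to a temporary directory so nothing on disk is left behind; the file name notes.txt is made up, and the with statement (not covered in these notes) is used so that each file is closed automatically.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "notes.txt")   # hypothetical file name

    with open(path, "w") as handle:            # 'w' truncates or creates the file
        handle.write("first\n")
        handle.writelines(["second\n", "third\n"])  # no newlines are added for us

    with open(path, "a") as handle:            # 'a' keeps the existing content
        handle.write("fourth\n")

    with open(path) as handle:                 # default mode is 'r'
        first_five = handle.read(5)            # first 5 characters
        handle.seek(0)                         # back to the beginning
        lines = handle.readlines()             # all lines as a list

print(first_five)
print(lines)
```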
Reading and writing text files
def main():
    infile = open('lines.txt','r')
    outfile = open('new.txt','w')
    for line in infile:
        print(line, file=outfile,end = '')
    infile.close()
    outfile.close() #closing makes sure the written data is saved
if __name__ == "__main__": main()
Handle big chunks
Instead of reading line by line we can read big chunks of data into a buffer to copy the contents of one file to another
def main():
    buffersize = 50000 #read up to 50,000 characters at a time
    infile = open('lines.txt','r')
    outfile = open('new.txt','w')
    buffer = infile.read(buffersize) #read returns an empty string at end of file
    while len(buffer):
        outfile.write(buffer)
        print(".",end='')
        buffer = infile.read(buffersize)
    infile.close()
    outfile.close()
if __name__ == "__main__": main()
Reading and writing binary files
Images store their information as binary data rather than text, so we open the jpg file in binary mode to read from it and open another file in binary write mode. ‘rb’ is binary read mode and ‘wb’ is binary write mode. The process of reading a binary file is very similar: we create a buffer and read chunks of data into it.
def main():
    buffersize = 50000 #means 50,000 bytes
    infile = open('olives.jpg','rb')
    outfile = open('new.jpg','wb')
    buffer = infile.read(buffersize) #read returns an empty bytes object at end of file
    while len(buffer):
        outfile.write(buffer)
        print(".",end='')
        buffer = infile.read(buffersize)
    infile.close()
    outfile.close()
if __name__ == "__main__": main()