Install Jupyter
It is a web app that allows us to write python scripts.It was originally called IPython notebook or IP notebook, now they have changed their name to Jupyter(as they have started to support multiple languages other than just py). These notebooks have the extension .ipynb
Steps for installation:
- Go to their official downloads page.
- Now run pip3 install jupyter
- Now run jupyter notebook in terminal and the notebook opens in your web browser
Usage
We can create new cells, delete cells, copy cells, move cells
Shift+Enter is the shortcut to run the script in the cell
Shift+Tab write after a func will give us docstring/help info about that func
obj. - now if we press tab it shows us the methods that can be called on the obj
We can change the cell type to code/markdown
We can restart a kernel if say a while/for loop goes in infinite mode and we would like to interrupt and restart it
Basic workflow
def solution(num1,num2)
return num1+num2
from nose.tools import assert_equal
class SolutionTest(object):
def test(self,sol):
assert_equal(sol(2,2),4)
assert_equal(sol(4,4),8)
print("ALL TEST CASES PASSED")
# Run tests
t = SolutionTest()
t.test(solution)
Algorithm analysis
Algorithm is a procedure to solve a problem
def sum1(n):
final_sum = 0
for x in range(n+1):
final_sum +=x
return final_sum
def sum2(n):
return (n*(n+1)/2)
How can we compare the algorithms? 1. Time 2. Space
We can use the %timeit magic func in jupyter to compare the time taken by functions
Timing the code
Sometimes its important to know how long our code is taking to run, or at least know if a particular line of code is slowing down our entire project. Python has a built-in timing module to do this.
This module provides a simple way to time small bits of Python code. It has both a Command-Line Interface as well as a callable one. It avoids a number of common traps for measuring execution times. Lets learn about timeit!
import timeit
"""
Lets use time it to time various methods of creating the string '0-1-2-3-.....-99'
We'll pass two arguments the actual line we want to test encapsulated as a string and the number of times we wish to run it. Here we'll choose 10,000 runs to get some high enough numbers to comapre various methods
"""
# For loop
timeit.timeit('"-".join(str(n) for n in range(100))', number=10000)
#returns 0.2894139289855957
# List comprehension
timeit.timeit('"-".join([str(n) for n in range(100)])', number=10000)
#returns 0.25658297538757324
# Map()
timeit.timeit('"-".join(map(str, range(100)))', number=10000)
#returns 0.16328215599060059
"""Great! We see a significant time difference by using map()! This is good to know and we shuold keep this in mind.
"""
Now lets see iPython’s magic function %timeit iPython’s %timeit will perform the code in the same line a certain number of times (loops) and will give you the fastest performance time (best of 3).
Lets repeat the above examinations using iPython magic!
%timeit "-".join(str(n) for n in range(100))
# prints this: 10000 loops, best of 3: 23.8 µs per loop
%timeit "-".join([str(n) for n in range(100)])
#prints this: 10000 loops, best of 3: 21.3 µs per loop
%timeit "-".join(map(str, range(100)))
#prints this: 100000 loops, best of 3: 13.5 µs per loop
Great! We arrive at the same conclusion. Its also important to note that iPython will limit the amount of real time it will spend on its timeit procedure. For instance if running 100000 loops took 10 minutes, iPython would automatically reduce the number of loops to something more reasonable like 100 or 1000.
Big-O notation
Describes how quickly runtime will grow relative to the input
O(n) means the runtime grows linearly with the input size
Big-O Examples
# O(1) constant:
def func_constant(values):
print values[0]
# O(n) linear:
def func_list(list):
for i in list:
print(i)
# This is O(n)
def func_list(list):
for i in list:
print(i)
for i in list:
print(i)
#This is O(2n) and not O(n^2), but when n -> infinity
#O(n) = O(2n), so we can drop the constant 2 and both the algorithms have almost the same performance. This is called as dropping the insignificant constants or terms.
# O(n^2) quadratic:
def func_quad(lst):
for item1 in lst:
for item2 in lst:
print(item1,item2)
Best case, Worst case, Avg case
def matcher(lst,match):
for item1 in lst:
if item1 == match:
return True
return False
lst = [1,2,3,4,5,6,7,8,9,10]
matcher(lst,1) #Best case O(1), match happens first time
matcher(lst,11) #Worst case O(n), match happens after n iterations
Array sequences
Dynamic Arrays:
The array in python is dynamic in nature. We dont have to specify the size taken by the array in advance. We can keep growing it as ans when needed.
They are similar to hermit crabs which hunt for bigger shells/homes when they grow in size and are in need of it.
Anagram problem
If 2 strings can be written with the same letters then they are anagrams. Eg: “Clint Eastwood”, “old west action” Go given 2 strings check if they are anagrams?(Ignore case)
If each letters frequency are the same in both the strings then thay are anagrams of each other, ie if 2 string are equal to each other after sorting then they are anagrams.
def anagram2(s1,s2):
#remove whitespace and turn to lower case
s1 = s1.replace(" ","").lower()
s1 = s1.replace(" ","").lower()
return sorted(s1) == sorted(s2)
#Lets solve this without sorting func manually
def anagram(s1,s2):
#remove whitespace and turn to lower case
s1 = s1.replace(" ","").lower()
s1 = s1.replace(" ","").lower()
#if the no of letters are not same they we can immediately say they #are not anagrams
if len(s1) != len(s2):
return False
#Now lets use a dictionary to count the frequency of each letter
count = {}
for letter in str:
if letter in count:
count[letter] += 1
else:
count[letter] = 1
# So lets loop through each letter in str2 and sub 1 for each letter
# Finally if all the values in the dict are zero then anagrams
for letter in s2:
if letter in count:
count[letter] -=1
else
count[letter] = 1
#set this to 1 so that the next for loop retuens False
for k in count:
if count[k]!=0:
return False
return True
Array Sum problem
Given an integer array, output all the unique pairs that sum up to a specific value k. So the input: pair_sum([1,3,2,2],4), would return 2 pairs:(1,3) (2,2)
NOTE: FOR TESTING PURPOSES, CHANGE YOUR FUNCTION SO IT OUTPUTS THE NUMBER OF PAIRS
Solution: The O(N) algorithm uses the set data structure. We perform a linear pass from the beginning and for each element we check whether k-element is in the set of seen numbers. If it is, then we found a pair of sum k and add it to the output. If not, this element doesn’t belong to a pair yet, and we add it to the set of seen elements. The algorithm is really simple once we figure out using a set. The complexity is O(N) because we do a single linear scan of the array, and for each element we just check whether the corresponding number to form a pair is in the set or add the current element to the set. Insert and find operations of a set are both average O(1), so the algorithm is O(N) in total.
def pair_sum(arr,k):
if len(arr)<2:
return
# Sets for tracking
seen = set()
output = set()
# For every number in array
for num in arr:
# Set target difference
target = k-num
# Add it to set if target hasn't been seen
if target not in seen:
seen.add(num)
else:
# Add a tuple with the corresponding pair
output.add( (min(num,target), max(num,target)) )
# FOR TESTING
return len(output)
# Nice one-liner for printing output
#return '\n'.join(map(str,list(output)))
"""
RUN THIS CELL TO TEST YOUR SOLUTION
"""
from nose.tools import assert_equal
class TestPair(object):
def test(self,sol):
assert_equal(sol([1,9,2,8,3,7,4,6,5,5,13,14,11,13,-1],10),6)
assert_equal(sol([1,2,3,1],3),1)
assert_equal(sol([1,3,2,2],4),2)
print 'ALL TEST CASES PASSED'
#Run tests
t = TestPair()
t.test(pair_sum)