Data structures and applicability

Lists

Sets – Doesn’t maintain order and doesn’t allow repeated elements.

Stacks – Depth-first search, behave similar to recursion and can be a replacement for it.

Queues – Breadth-first search

Heaps – (Priority queue)

  1. Supports
    1. Insertion – O(logn)
    2. Extract min/max – O(logn)
    3. Heapify – O(n)
    4. Delete – O(logn)
  2. Applications:
    1. Scheduled event managing.
    2. Median maintenance – using two heaps, one max heap, and one min heap.
    3. Dijkstra Algorithm – O(m log n)

Search trees –

Trees :

  1. All the nodes should be connected.
  2. should not have cycles.

 

  1. Balanced binary search trees:
    1. Red-black trees.
    2. AVL trees.
    3. Splay trees.

Hash tables –

  1. When you don’t need to remember ordering, minimum or maximum.
  2. To check whether an element exists – O(1)
  3. Insertions and deletions – O(1)
  4. Applications:
    1. 2-sum problem.
  5. Hash function:
  6. Resolving collisions:
    1. Chaining: Linked lists inside each bucket.
    2. open addressing (probe sequence) – one object per bucket.
  7. Choose num of buckets as a prime number.
  8. Load factor: Num of objects/ Num of buckets.
  9. Every hash function has a pathological data set.
  10. Randomization from a family of hash functions at runtime.

Hashing – Use a hash function which deduces the index in which to store the value based on the digits of the value itself (usually last few digits). Store the value in the provided index. when you want to lookup, it is of O(1), since lookup from an indexed array is of constant time.

Bloom filters –

  1. More space efficient than hash tables.
  2. No deletions.
  3. Small false positive probability – might say x has been inserted even though it hasn’t been.
  4. Operations:
    1. Fast inserts and lookups.
  5. Applications:
    1. Spell-checker.
    2. List of forbidden passwords.
    3. Network routers – Limited memory, need to be super fast.

 

 

Advertisements

HEAD FIRST DESIGN PATTERNS

 

  1. The code should be closed to change and open to extension, should be easily maintainable for future change requests.
  2. Prefer composition (change behavior in runtime, encapsulate family of algorithms) over inheritance (your behavior is stuck). (why?)
  3. All design patterns have pros and cons. The decision of whether I should use a design pattern in a situation should be taken after analyzing whether the product you are building can afford to have cons offered by the pattern.

Situation (when):

Design pattern solution (what): Class diagram

Pros and Cons (why):

Decorator pattern (follows open-closed principle):

Situation:

When there are a lot of combinations to deal with in different categories.

Normal solution (stupid way): Enumerate all the combinations possible and write a method in subclass specific to each combination.

Maintenance problems with this solution:

  1. If the price of milk goes up, they have to change the code in all the places.
  2. If they add a new item in one category, the whole thing explodes even more combinatorically.

One more solution:

Take Boolean instance variable for the presence of each specific condiment and deduce cost of the all condiments based on that in the parent class and you can add the cost of that specific beverage alone in the subclass and call super.cost().

Some problems with this solution:

  1. We have to alter existing code if there is a change in the price of any condiment.
  2. New condiments in future means, new instance variables, new setters and getters and we also need to modify the cost method in the superclass to account for this new condiment as well.
  3. Some of the beverage and condiment combinations may not be appropriate and we have no way of restricting them.
  4. What if the customer wants a double mocha?

Time for One more solution:

When inheriting behavior by sub-classing, the behavior is set statically at compile time.

Design pattern solution:

Inheriting behavior at runtime through composition and delegation. More flexibility.

If u want a Dark roast with condiments of mocha and whip, Take a dark roast object, decorate it with a mocha object and decorate it with the whip and call the cost function and rely on delegation.

Decorator objects are kind of wrappers.

The concrete decorator has an instance variable for the thing it decorates.

We can implement new decorators any time to add new behavior instead of changing existing code which is thoroughly tested.

Decorators are usually created by using Factory and Builder patterns.

Alternative to sub-classing for extending functionality.

Enables a lot of combinations with a minimal number of classes possible.

Demerits:

  1. Introduces complexity to the code and reduces readability.
  2. If you have some logic that is specific to a component type like the discount in this case, then this would not be applicable.
  3. You end up adding a lot of small classes.

Adapter pattern:

 

Facade Pattern:

Situation:

  1. To simplify the interface of a group of classes.

Iterator Pattern:

This along with composite pattern helps you deal with a collection of Objects better.

Situation:

  1. If you want to iterate across collectibles of varied types like ArrayList, Hashmap, etc, provided the component type remains same (check).
  2. To provide a way to access elements of an aggregate object sequentially without worrying about its underlying implementation.

Composite Pattern:

Situation:

  1. If you want to treat tree-like structures, leaf nodes, and composites uniformly.
  2. If you want your sub-categories to behave in the same way as your categories.
  3. The difference from iterator pattern is, here the component type itself varies.

Design pattern:

Looks familiar to Depth-first search.

Class Diagram:

How to use it:

Pros and cons:

  1. The abstract Menu component class acts as an interface for both leaf node and composite node, thus breaking single responsibility principle.

Before closing out discussion on the collection of objects, touch upon Bounded generics.

Force not to add improper types into your collectibles in the first place. Archiver case.

Below three patterns change the behavior at runtime using composition.

The State Pattern:

Situation:

Encapsulate state-based behavior and delegate behavior to the current state.

Pros and cons:

  1. Lot of classes, one for each state.

The Strategy Pattern:

Situation:

Encapsulate interchangeable behaviors and use delegation to decide which behavior to use.

Configures context classes with a behavior.

Template Method Pattern:

Situation:

Subclasses decide how to implement steps in an algorithm.

Factory pattern:

  1. Tying your class to a concrete implementation (new) is very fragile and less flexible.

Situation: you have different closely related concrete classes and you choose to instantiate one of these based some conditional logic.

Lame solution:

Put if, else blocks and use the new operator to instantiate an appropriate object in each block.

Maintenance problems:

If there is a new object (new duck type), that has to be added to this conditional logic and you have to figure out and add it in all places wherever it is applicable. This is error-prone.

Other solution:

Code to an interface.

Here, we have to figure out which implementation to instantiate for what type. This requires a piece of conditional logic code. This has to change if new implementations are added or any of the existing implementations are to be removed. So, this is not closed for modification.

Factory method lets subclasses decide which class to instantiate.

Observer Pattern:

Situation:

  1. When a group of objects need to be notified of some state changes.

Miscellaneous:

  1. Code to an interface, not to an implementation.
  2. Use bounded generics instead of Object collectibles for better validation and type casting issues.
  3. Usually, an anti-pattern to modify objects coming in as arguments. Use final modifiers.

You may or may not use any of these patterns in your daily life, but understanding these and going through pros and cons of them will significantly impact your thought process and help you make better decisions.

 

 

Dependency injection

dependency injection1.PNG

What is dependency injection?

Instead of initiating the object we are dependent on, we take help of a dependency framework which will push the ready made object to the dependent class during runtime.

In the above picture we are not initializing hotDrink with new hotDrink(), instead we are getting it as a constructor parameter.

Why is it useful?

This is useful to decouple two different code packages and remove direct dependency.

If you have an interface which can have multiple implementations, with the dependency injection framework, you can choose to run which implementation to run at runtime. Suppose you want to isolate and test a particular package, you can provide mock implementations of all other interfaces it is dependent on.

If you have to do the same without DI injection and interfaces, then you need to change code in lot of places, calling mock object at all the places you were calling original object to test.

It is useful when you have dependencies depending on other dependencies and so on.

Implementing with Google Guice:

Guice diGuice di2

You should usually initialize the injector where you bootstrap the program.

bind or guice configure() helps you define which implementation of interface to use.

@implementedby annotation can be used instead of configure() and bind.

@implementedby can be used as a default one.

If both guice module(bind) and @implementedby compete for different implementations of same interface, guice module wins.


 

Wanna have conditional logic to pick implementations?

Providers


Dependency injection without interface:


Assisted dependency injection: ##

FactoryModuleBuilder

when you want to select between implementation by giving type as input and also you want to pass input inside.

Design and Analysis of algorithms

The most important principle to be an algorithm designer is not to be content.

Asymptotic analysis:

O

Omega

Theta

The master method for analyzing the running time of divide and conquer algorithms:

  1. Black box solution for running time of you input few parameters related to the recurrence.
  2. Assumptions: All recurrences are of equal size.
  3. Master method

n- original problem size

a- number of recurrences called in each instance, rate at which subproblems proliferate. (Evil)

b- The factor by which the input size is divided for calling recurrences.

d- Polynomial exponent of the remaining work needed to merge solutions for the final solution.

b^d – Rate of work shrinkage per subproblem. (good)

Case 1; Same work at each level.

Case 2: More work at the root level.

Case 3: More work at leaf level.

Algorithm Design paradigms:

  1. Divide and conquer:
    1. write the base case.
    2. Using recursion to solve subproblems of a problem.
    3. Combine subproblems
    4. Store the results of subproblems in a hashmap and use them to trim other repeating recursive paths. (Dynamic programming)
  2. Randomization: Like in quick sort.
    1. Decomposition principle:
  3. Greedy algorithms.
  4. Dynamic programming.

Problems:

Graphs:

n-num of vertices, m-num of edges

Graph partitioning: Cuts of a graph, Minimum cut

Representation:

  1. Adjacency matrix. O(n^2)
  2. Adjacency lists. O(n+m)

Strongly connected components: Regions where you can go from any node A to any node B in the directed graph.

Kosaraju’s 2-pass strongly connected components algorithm:

One of the smartest and beautiful algorithms.

The structure of internet:

Structure of internet.PNG

Further reading: Networks, crowds and markets.

 

Glossary:

arcs – Directed edges, ordered pair.

Robust software development process

At scratch:

Document the requirements.

Divide and conquer: Break the implementation of the project into the small chunks of independent interacting software modules.

Design the proper architecture for these interacting modules.

Document the design.


Development:

  1. Build a module.
  2. write any simulators needed to give it input and see the output (optional).
  3. Test the module completely.

Repeat above three steps until all the modules are completed.

Put modules together in a proper interaction flow.

Write test cases and automate testing.


Customize:

Read new requirements and add it to the requirements document to track or identify any reported scenarios as bugs or unexpected(requires new implementation).

Implement the requirement and test for implementation.

Do regression testing and update the regression test suite with the test cases of new implementation and modify the already existing test cases if necessary as per the requirement.


The process is complete.

Let’s talk about main elements of debugging for developer

  1. search (ctrl+f).
  2. breakpoints.
  3. call stack.
  4. call hierarchy.
  5. watch.
  6. data breakpoints and conditional breakpoints.
  7. logs.

Learn system design:

https://www.educative.io/collection/5668639101419520/5649050225344512

Python

Sorting a list of tuples by multiple conditions:

sorted_by_length = sorted(list_,
                         key=lambda x: (x[0], len(x[1]), float(x[1])))

To get the lower bound integer in python:

int(1.6) =1

int(1.4)=1


function:

def function_name(i):


Modules:

write a python file with filename.py extension and add definitions and statements in them.

‘import filename’ in another file.

access definitions inside that file using filename.nameofdef


tup=()

concatenating to a tuple: tup=tup+(a[i],)

list1=[] – ordered sequence.

list1.append(1)

Can’t mutate a string. assignment error.


Aliasing problem in python:

if you have an object say a list and you attach it to the name list1

if you say list2=list1

list2=list1, then list2 points to same object as that of list one.

If you add an element to list2, then it will be reflected in list1 as well which you might not have intended.

So how to take a copy of list1 without having any impact on list1 if I do changes to the copy?

list2=list1[:]              cloned

sort the original list-> list1.sort()

take a sorted list of original list into another object

list2=sorted(list1)

http://stackoverflow.com/q/2739552/5819704 – [[1,2]]*3, all three lists inside outer list refer to same object. This doesn’t sync with your intuition at the beginning.

tuple and string doesn’t support item assignment i.e. s[1]=5, only list supports.


Functions as objects:

sending functions as arguments to other functions.

Higher order programming: Map

Dictionaries: A list of key, value pairs

my_dict={‘Ana’:’B’,”Vivek’:’Ex’}

‘Vivek’ in my_dict  -> True

my_dict[‘Vivek’]=’Ex’

my_dict.keys{}

my_dict.values{}

Memorization: Storing the already computed values in dictionary and looking up for them in case we need them again. This will save from computing something that is already computed.


Debugging:

  1. Testing:
    1. Unit testing-
      1. validate each piece of the program.
      2. testing each function separately.
    2. Regression testing-
      1. After each fix, you need to retest to confirm already tested modules or not broken by the latest fix.
    3. Integration testing-
      1. Does overall program work?
    4. Black box testing-
      1. without looking at the code.
      2. done by other than the implementer to avoid some implementer biases.
    5. Glass box testing:
      1. Path complete.
      2. Design of test cases based on code.
  2. Defensive programming:
    1. specifications for functions.
    2. Modularize programs.
    3. check conditions on inputs/outputs.
  3. Eliminate the source of bugs:

try:

except:

else:

finally:

raise ()

assert             to check whether the assumptions are met or not.


classes and objects:

Data attributes associated with the class, but not with instance or objects.

Generator: yield 1 in method. runs upto the yield and stops. again call the funct.__next__() runs upto next yield and stops.


Lisp and python programming languages allow you to implement reflection.


Using Pylab:

import pylab as plt

plt.figure(‘figurename’) – to plot in new figure

plt.clf()

plt.ylim(0,1000)

plt.title(‘Linear’)

plt.xlabel(‘sample points’)

plt.ylabel(‘linear function’)

plt.subplot(211)

plt.plot(list of x values, list of y values,’b-‘,label=’linear’)

plt.yscale(‘log’)

model=pylab.polyfit(observedX,observedY, n)

Finds coefficients of a polynomial of degree n, that provides a best least squares fit for the observed data. It returns n+1 values.

estYvalues=pylab.polyval(model,xVals)

 


Anonymous function – lambda:

f1=lambda x,y: x+y

f1(2,3)=5


Credits: Jake Vanderplas

An abstract class in python can’t be instantiated.


Scikit:


Pandas:


using both 2.7 and 3.5 of python in diferent environments:

It won’t act weird if you will do it in tensor flow environment. Please follow these steps:
Install python 3.5

you can do the following to use python 3.5 on the new conda environnement “tensorflow”:

conda create –name tensorflow python=3.5
activate tensorflow
conda install jupyter
conda install scipy
pip install tensorflow
or
pip install tensorflow-gpu

You can uninstall python 2.7 also, but if you will follow above steps you can keep python 2.7 and python 3.5 also. They wont act weird. So in this case for anaconda by default it would be python 2.7. But for tensorflow environment it will work on python 3.5


view models of keras

import os
os.environ[“PATH”] += os.pathsep + ‘C:/Anaconda2/envs/tensorflow/Library/bin/graphviz’

from keras.utils.vis_utils import plot_model
plot_model(model, to_file=’model.png’)

Computer science

Determines whether a string is legal – syntax

Determines whether a string has meaning – static semantics.

Assigns a meaning to a legal sentence – semantics

Guess and checks:

Bisection search:

Newton Raphson: next_guess=guess-p(guess)/derivative(p(guess))


Converting decimal to binary:

take a decimal number -> 0.375

How can we represent this in binary?

Multiply this decimal number by a power of 2 which can convert it into a whole number. 2^3=8 in this case which gives 8*0.375=3

then convert this whole number to binary (11)

Divide by 2^3 (shift right) to get 0.011(binary)=0*2^-1+1*2^-2+1*2^-3

If we can’t find p such that x*2^p doesn’t give a whole number, then we can only get approximate binary representation.


Recursion:

Can a problem be represented as a smaller version of the same problem?

For example, finding the factorial of n can be n times smaller version of factorial of (n-1).

multiplication of a times b can be a + multiplication of a times (b-1).

Towers of Hanoi is the paragon problem used popularly to apply recursion.

gcd(a,b) is same as gcd of (b,a%b) until b becomes 0, when b becomes 0 then the answer is a.

fibonacci

An example of self-referencing.

calling a function from inside the same function with different arguments probably.

 


Complexity:

Evaluating efficiency of programs:

  1. Measure with a timer.
  2. Count the operations.
  3. The abstract notion of order of growth.

complexity orders.PNG

Big O Notation: Upper bound on the asymptotic growth, often called the order of growth.

  1. Upper bound on the asymptotic growth often called the order of growth.
  2. Big oh or O(): to describe the worst case.
    1. You can say the complexity of 5*n+2 as the order of n, O(n).
    2. ignore additive and multiplicative constants.
    3. focus on dominant terms.
  3. Order of dominancy –
    1. c^n(exponential)>n^c(polynomila)>nlog(n)(loglinear)>n(linear)>log(n)>1
    2. efficiency in opposite order 🙂
  4. Law of addition:O(n)+O(n+n)=O(n+n^2)=O(n^2)
  5. Law of multiplication:O(n)*O(n)=O(n^2)

logarithmic complexity:

  1. bisection search.
  2. binary search of a list.
  3. look out for division operations as well (heuristic).

Linear complexity:

  1. Factorial – both iterative and recursive.
  2. n dependent loop.

Log-linear complexity:

  1. Merge sort.

polynomial complexity:

  1. Quadratic – nested loops.

Exponential complexity:

  1. mostly in recursive cases-Towers of Hanoi, generating all subsets(power set).

 

Bogo/Monkey/stupid sort:

Shuffling and checking iteratively until all the elements are sorted.

Bubble Sort:

Check two consecutive elements and swap them based on order. Do this all along the list for multiple times.

O(n^2)

in place sort, efficient in terms of space.

Selection Sort:

Find the smallest element and put it at the beginning. find the next smallest and put it at the second place and go on.

O(n^2) – Quadratic, more efficient than bubble sort.

The advantage in this is you can stop after getting a certain number of sorted elements.

Merge sort:

Divide and Conquer –

Time complexity – O(n*log(n)) – log-linear

Space Complexity – O(n)

Quicksort:

Pick a pivot and move all the values larger than it after it and all values less than it before it. Usually, pick the last element as your pivot.

Time Complexity –

worst case Time complexity – O(n^2)

Average case Time complexity – O(n*log(n))

in place sort, efficient in terms of space.