Friday 1 July 2016

Binary Search In Java And It's Complexity

Have you ever used a word dictionary of any language?? You must be thinking that what word dictionary has to do with binary search, right? Well, it has. Binary search is exactly similar to the searching for a word in a dictionary!!

By the way, binary search has one precondition- data structure needs to be present in sorted order. Like a dictionary has words arranged in sorted order. We can sort data structure using various sorting algorithms.

Steps to search for a word in a dictionary?
  1. Go to the middle of the dictionary.
  2. If you find the word then stop here else go to step 3.
  3. See if the word shall come after middle of the dictionary or before.
  4. If it comes after the middle of the dictionary then take middle till end of the dictionary as a new dictionary & repeat steps 1-4.
  5. If it comes before the middle of the dictionary then take middle till start of the dictionary as a new dictionary & repeat steps 1-4.
  6. If you are at the extreme end or extreme start of the dictionary then it means the word you are looking for is not in the dictionary.
Ok, it's time to code above logic in Java.
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
/**
 * BINARY SEARCH JAVA PROGRAM
 */
package javagreedy.blogspot.com;

import java.util.Arrays;

/**
 * @author javagreedy.blogspot.com
 * 
 */
public class BinarySearchApp {

 public static void main(String[] args) {

  String[] dictionary = new String[]{"Dog", "Apple", "Ant", "Ball",
    "Elephant", "Cat", "Frog", "Fish"}; // unsorted array of words

  Arrays.sort(dictionary);// we have sorted it using existing sort api
        // provided by Java

  System.out.print("Dictionary contains: ");
  for (String word : dictionary) {
   System.out.print(word + " "); // display the sorted dictionary
  }
  System.out.println();// line break

  binarySearch(dictionary, "Ant"); // binary search for Ant in dictionary

  binarySearch(dictionary, "Tiger"); // binary search for Tiger in
           // dictionary

 }

 private static void binarySearch(String[] dictionary, String wordToFind) {

  int start = 0; // dictionary start
  int end = dictionary.length - 1;// dictionary end
  int middle = (start + end) / 2;// middle of the dictionary

  // base condition
  while (start <= end) {
   // check if the word is at middle
   if (dictionary[middle].equals(wordToFind)) {
    System.out.println(wordToFind + " is found at location "
      + middle);
    break;// exit
   } else if (dictionary[middle].compareTo(wordToFind) < 0) {// check
                  // if
                  // wordTofind
                  // is
                  // after
                  // middle
    start = middle + 1;// reset start
   } else {
    end = middle - 1;// if wordTofind is before middle then reset
         // end
   }
   middle = (start + end) / 2;// find middle of new dictionary
  }
  if (start > end) // unexpected condition
   System.out
     .println(wordToFind + " is not present in the dictionary");
 }

}

Output:
1
2
3
Dictionary contains: Ant Apple Ball Cat Dog Elephant Fish Frog 
Ant is found at location 0
Tiger is not present in the dictionary

Well, Java already provides a method for binarySearch in class java.util.Arrays. You can use it as below-

int index = Arrays.binarySearch(dictionary,"Ant");

What is the time complexity of binary search?
Worst  Case:  Worst case will be when the element we are searching for is not present in the data structure. In the second execution of above program where we searched for "Tiger". Each time we divided dictionary into half. We went on doing this until there were no further division possible. If you see this looks similar to the logarithmic time complexity. Suppose we have 8 words then we will divide dictionary in the following ways- 
1. middle = 8/2 = 4
2. middle = 4/2 = 2
3. middle = 2/2 = 1 No further division is possible.

Here, we divided dictionary 3 times. Mathematically: log(8) = 3 where log base is 2 as we are dividing dictionary each time by 2. So, if we have n elements in data structure then worst case complexity for binary search would be log(n).

Best Case: At the very first execution, we directly searched for the word in middle of the dictionary. And if that is the word we are looking for then this is the best case. Irrespective of the number of words in a dictionary we will always have to check to divide the dictionary only once if it's the best case. So, best case complexity becomes O(1).


Saturday 25 June 2016

Linear Search In Java And It's Complexity

As name says , linear search is searching for an element in the data structure in a linear or sequential fashion. Suppose I have a bag containing 20 balls which are having any random numbers from 1 to 100 printed on them. Then, how I can find the ball with number, say  55? If we use linear search then we will take out one ball at a time and check a number on it till we find the one with number 55.

Let’s implement linear search in java-

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
/**
 * LINEAR SEARCH JAVA PROGRAM
 */
package javagreedy.blogspot.com;

/**
 * @author javagreedy.blogspot.com
 *
 */
public class LinearSearchApp {

       public static void main(String[] args) {

              int[] bag = new int[] { 10, 33, 23, 75, 98, 22, 44, 37, 55, 67, 13, 64,
                           80, 36, 74, 68, 12, 92, 15, 20 }; // bag containing 20 numbers

              linearSearch(bag, 55); // search 55 in bag

              linearSearch(bag, 29); // search 29 in bag

       }

       private static void linearSearch(int[] bag, int numToFind) {
              // scan complete bag
              for (int i = 0; i < bag.length; i++) {
                     // check element one by one
                     if (bag[i] == numToFind) {
                           System.out.println("Bingo!! " + numToFind
                                         + " is found at location " + i);
                           return;
                     }
              }

              System.out.println("Oops! Bag doesn't contain " + numToFind
                           + " at all!");

       }

}

Output:
Bingo!! 55 is found at location 8
Oops! Bag doesn't contain 29 at all!

What is the time complexity of linearsearch?
Worst  Case:  See the second execution of above program where we searched for number, 29 till last element in array and we didn’t find it. This is the worst case when number is not present in the array.  For 20 array elements we checked (line number 27) 20 times. So, if we have ‘n’ number of elements in bag then we will have to check for ‘n’ times. So, worst case complexity will be O(n).

Best Case: What if we found the number on the first check itself? Well, you guessed it right; it’s the best case for linear search. Irrespective of the number of elements in an array we will always have to check for a number only once if it's the best case. So, best case complexity becomes O(1).

Saturday 4 June 2016

Finding Time Complexity (Asymptotic Analysis)

There are some basic code statements that are used to build complete program. It is necessary to know their time complexities in order to calculate the time complexity of whole code/algorithm.

We will look at them one by one-


1. Constant Time [O(1)]

1
2
3
4
5
6
void sum() {
int a= 2;
int b=3;
int c = a + b;
System.out.println(c);
}

In above code we are not taking any input. So this code is independent of input size. Lets say the above code takes time C to execute. This time will always be constant irrespective of the input size. In asymptotic analysis we say that the time complexity is O(1).

2. Simple Loops
1
2
3
4
5
6
int k= 0;//time c1 
// loop will execute n times 
for(int i = 1 ; i<=n ; i++)
 {
 k= k+ i; //time C2 
}

Total time = Time taken by initialization statement  + Time taken by statements inside loop x Number of time loop will execute
Total time = C1 + C2 x n
Total time = C1 + C2n i.e. O(n)

3. Nested Loops
1
2
3
4
5
6
7
8
9
10
11
12
13
14
int k= 0, m = n; //time C1

// outer loop will execute n times
  for(int i = 1 ; i<=n ; i++) {

   k= k+ i; //time C2

// inner loop will also execute n times
for(int j = n ; j>=1 ; j--) {

   m= m- j; //time C3
    
}
}

In nested loops we analyze time from inside out. 

Time taken by inner loop = C3 x n

Total time  = Time taken by initialization statement + Time taken by statements inside outer loop x Number of time outer loop will execute
Total time = C1 + (C2 + C3n) x n
Total time = C1 + C2n + C3n2   time  i.e. O(n2

4. Consecutive Statements
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
int x = 0, k=0; //time C1

// loop will execute n times
  for(int y = 1 ; y<=n ; y++) {

   x= x+ y; //time C2
     
}

// outer loop will execute n times
  for(int i = 1 ; i<=n ; i++) {

   k= k+ i; //time C3

// inner loop will also execute n times
for(int j = n ; j>=1 ; j--) {

   m= m- j; //time C4
     
}
}

Total time =  C1 + C2n + (C3 + C4n) x n
Total time =  C1 + C2n + C3n + C4n
i.e. O(n2

5. If else statements
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
boolean bool; //time C1

if(bool) {
  return true; //time C2
  
}
else {

// loop will execute n times
  for(int y = 1 ; y<=n ; y++) {

    y++; //time C3
     
}

}

Here we check for worst case scenario
(remember general rules? ) i.e. what will be the larger time complexity?


Total time will be either C1 + C2 in case 'if' condition is true. Or it will be C3n in case else statement executes. So, maximum time complexity in worst case would be O(n).

6. Logarithmic
1
2
3
4
5
 for(int i = 1 ; i<=n ; i++) {

   i= i*2; //time C

}

Here value of i is becoming twice in every execution and hence reducing the number of times the loop will execute. Let's see how it goes-
Initially i = 1 i.e. 20
Execution 1 -> i = 2 i.e. 21
Execution 2 -> i = 4 i.e. 22
Execution 3 -> i = 8 i.e. 23
Execution 4 -> i = 16 i.e. 2and so on...

Lets say 2= n. If we take log on both side to the base 2, we get
log2k = logn
k.log2 = logn
k = logn (since log2 to the base 2 is 1)

k is nothing but the total time taken by loop and hence time complexity is O(logn).

Wednesday 1 June 2016

Asymptotic Notations- Big Omega Notation [Ω] & Big Theta Notation [Ѳ]

In the previous post we have seen the 1. Big O Notation [O]. Next is-


In opposite to the big O notation this notation represents tightest lower bound of the given function.
Big Omega
Analytically we say as: f(n) belongs to Ω(g(n)) as there exists c > 0 (e.g., c = 1) and n0 (e.g. n0 = 5) such that f(n) ≥ c.g(n) whenever n ≥ n0.

Example- lets find tightest lower bound of f(n) = 5n2+2n+1.

By looking to f(n) we can say that if we have a function g(n) = n2  (As, n2  is always less than 5n2) then it will be the tightest lower bound of f(n). In other words: f(n) ≥ c.g(n) for c=1 & n0=1. In Big Omega, we say f(n) = Ω(n2) & it is read as "f of n is big omega of n2".

Another thing to note is if f(n) = Ω(n2) then f(n) = Ω(n) or f(n) = Ω(logn) because n & logn are lesser than n2

But since we use Big Omega notation to represent tightest lower bound of the function, we will say f(n) = Ω(n2) instead of Ω(n) or Ω(logn).

Big Omega notation is usually used to specify time/space complexity of the given algorithm in best case scenario. Best case scenario is the situation when algorithm takes the least time to run or the least memory during it's execution.

3. Big Theta Notation [Ѳ]

This notation is used to find average time/space complexity of an algorithm.
Big Theta
Average complexity will always be between tightest upper bound and tightest lower bound.

Analytically we say as: f(n) belongs to Ѳ(g(n)) as there exists c > 0, d > 0 (e.g., c = 1, d=3) and n0 (e.g. n0 = 5) such that c.g(n) ≥ f(n) d.g(n) whenever n ≥ n0.

Example- suppose, we have a function f(n) = 7n3+2.
Then f(n) = O(n3) for c = 8, n0=2. Also, f(n) = Ω(n3) for c = 6, n0=2. Now as lower & upper bound of f(n) is n3 then we can say average of time complexity for f(n) is n3. Hence, f(n) = Ѳ(n3& it is read as "f of n is big theta of n3".

General rules while analyzing time complexity:
1. We analyze the time complexity for very large input size.
2. We consider worst case scenario i.e. Big O notation.
3. For any function, f(n) then we drop lower order terms & constant multipliers.
e.g. f(n) = 10n4+4n3+n2+2n+5 then f(n) = O(n4)
Here, we have dropped lower order terms which are 4n3,n2,2n & 5. We have considered 10nand finally we omitted 10, which is a constant multiplier & we are left with n4.

Thursday 26 May 2016

Asymptotic Notations- Big O Notation [O]

We always hear below common terms in the world of computer science: 

Data Structure- It defines how the data is arranged in computer’s memory.  For example- arrays, stacks , linked lists are few of the popular data structures. 

Algorithm- It is a set of instructions used to manipulate data structures. For example- sorting algorithm, searching algorithm etc.

Time Complexity- It is a measurement of rate of growth of time taken by any algorithm with respect to the increase in input size.

Space Complexity- Similar to the time complexity, it is a measurement of rate of growth of memory space taken by any algorithm with respect to the increase in input size.

Asymptotic Notation- It is a standard used for determining time/space complexities of an algorithm.

There are three basic asymptotic notations-
  1. Big O Notation [O]
  2. Big Omega Notation [Ω]
  3. Big Theta Notation [θ]
1. Big O Notation [O]

This notation represents tight upper bound of the given function.
Big O
Analytically we say as: f(n) belongs to O(g(n)) as there exists c > 0 (e.g., c = 1) and n0 (e.g. n0 = 5) such that f(n) ≤ cg(n) whenever n ≥ n0.

Lets see it through an example-
We have another function f(n) = 5n+4. Lets find the tightest upper bound of this function. By tightest upper bound I mean we have to find an another function (say g(n))  that will be always higher when we multiply g(n) by some constant (say c) and add some value of n (say n0) to it.

By looking to f(n) we can say that if we have a function g(n) = 6n then it will be the tightest upper bound of f(n). In other words: f(n) ≤ cg(n) for c=6 & n0=4. To represent it in Big O we say f(n) = O(n) & it is read as "f of n is big oh of g of n".

Now you might say what if you have taken g(n) = 6n2 or g(n) = 6n?
Well, you are right! And in this case complexity will be O(n2) and O(n3) respectively because f(n) will always be less than or equal to g(n) for any value of n.

But since we use Big O notation to represent tightest upper bound of the function, we will say f(n) = O(n) instead of O(n2) or O(n3).

Big O notation is usually used to specify time/space complexity of the given algorithm in worst case scenario. Worst case scenario is the situation when algorithm takes maximum time to run or maximum memory during it's execution.

In the next post we will see Big Omega & Theta Notations.