Thursday, June 25, 2015

Implementing a tree using two python dictionaries

A tree is a data structure where one to many relationships can be represented using nodes. These nodes can have values and may be represented using objects in an Object Oriented manner. Yet the idea of object usage may not be very appealing where the operations are time critical and the implementation subjects to memory constraints.

Though trees are used in heavy computational tasks such as neural networks (in a modified manner). Another area where related nodes/ objects are used is graphs. These can be used to simulate relationships that we see in social networks and so on.

This post mainly focuses on building a tree structure to support large number of raw data at a very high access and modifying speed.
This also supports path tracing where the path between two nodes could be derived as a list of nodes in between.

There are two dictionaries involved and their tasks are as below

treeDic = {}

This is used to store the parent nodes and their child nodes. Each entry will carry the value allocated for the tree node (parent) as well. This has been done for the completeness. Yet this could be performed in many other ways.

Data here is stored as follows.
treeDic = {1 : [ value, [ 2, 3 ] ] }
  • 1 - This is the parent node value
  • value - This is an integer to store the value of the node/ the data item or even a separate object
  • [ 2 , 3 ] - The array containing the child nodes
parentDic = {}

This is used to keep the parent nodes of each of the nodes. It is important that we note that there will be no parent node for the tree root and this has to be checked before we proceed with the upward tree traversal

The preconditions that are required are as follows
  •  All the parent-child nodes should be connected in some form
  • A child node can have only one parent
  • There could be no or more child nodes (This is not limited to a binary tree, this is k-ary)
 A small note on the getPath(node1, node2) function

This uses two temporary arrays where each array will save the path from the node to the tree root. This is done for both the nodes and then the node list is concatenated followed by getting the set(). This will provide the set of nodes that are there in between the given two nodes.

This could further be optimized which I have not mentioned but here is how. We could iterate through the nodes simultaneously and once we see a common node we may stop and then concatenate the arrays of nodes accordingly. Here the simultaneous looping will not negatively affect the time complexity since the deepest node will anyway have to iterate that number of times to meet the common node.

Before viewing the code, here are some advantages of this implementation.
  • Fast access to every node O(1)
  • Low memory consumption
  • Can have millions of nodes
  • Search is O(1) if we know the node ID
  • Tree is capable of traversing child-parent and parent-child either ways

Saturday, June 20, 2015

Looping extremely large iteration numbers Python

This sum is the Euler 2 but the hacker ranks requires this to be done in less than 10 seconds at the same time with an extraordinarily large value of N = 4 * 10^16 which is not supported in python with xrange() function.

Thus our savior is itertools which is an inbuilt library that we have in handy.

The algorithm does not calculate Fibonacci numbers explicitly but calculate them directly within the loop in function fibo(n)

This is the code for the simulation

Maximum sub array of given length in linear time

 In this post I'm going to elaborate on a technique I used to calculate maximum sub array in linear time (fixed length maximum => max product).
  •  The below code iterate on the array as a normal loop iteration.
  •  In each iteration the loop variable product is taken and if it exceeds the maximum it is induced for a separate variable.
  • At the same time the number that was in the loop variable is queued into a separate array
  • As this continues the numbers that are completed will reach the maximum allowed sub array length. Now we should make sure next steps wont damage the fixed sub array length.
  • Now dequeue the queue and divide the maximum with the dequeued value and multiply with the next value in the iteration.
  • Continue this until the entire array is covered.

For the task two variables are required to save the temporary maximum value and the resultant value. Also the algorithm runs at O(n) time complexity.
The brute-force approach will require using loop within a loop and iterate over and over again requiring a huge computational power. Thus I have used this method in Euler 8 sum at Hackerrank :)

Special case where zeros contains has been handled in the main loop where the input is processed. Splitting with zeros and having strings lesser in length than required gives no useful results => more optimization to the algorithm ;)

Here is the code :)

Thursday, June 11, 2015

Recursion part 1

Recursion is a programming concept where the same function is called within the same function directly or indirectly. Despite the rigorous meaning this is extensively used in mathematical and other algorithms.

For an example recursion is used to search within folders where each sub-folder is searched recursively by calling the search method again and again from sub-folders.

In this article an interesting scenario where a child buying toffees and returning toffee covers to get toffees again is discussed. Before moving on this is a simple example where recursion could be used to calculate factorial. Well this is not the most efficient way of calculating the factorial since recursive depth is limited by the programmers stack space. We'll be using python language this time.

Now here comes the real problem.

"Child can buy n toffees with Rs. n. He can get a toffee by returning 2 toffee covers. How many toffees he can try with 100 rupees?"

This is clearly a recursive call since the child will return toffee covers to buy toffees for the second time. So we should consider up coming chances for the kid to buy toffees.

Say he has 5 rupees, he take 5 toffees and has 5 covers. Then he returns 4 covers and take 2 toffees. Now he has 3 covers and he can buy another toffee and he has 1 cover in hand as well. So he can buy another toffee too. In total the fellow has 9 toffees eaten. So let's code this in python. The method is calculate().

Friday, June 5, 2015

Greatest Common Divisor

GCD calculation in ordinary maths is used to find whether two numbers are relatively prime. This means that two numbers have no factors in common. In other words telling that the two numbers ratio is a simplified fraction.

This is done by;
1. Get the smaller ratio between numbers
2. If the ratio remainder is zero the lower value is the answer
3. Else continue the above two steps with the smaller number and (greater number % smaller number) this is modulo division where you get the remainder.

Calculating Array Length

Array length is not a big deal in modern high level languages. Still this is more or less impossible with inbuilt libraries in C language. Yet C is very important to code time critical algorithms where you need quick computations with maximum hardware utilization

In C '\0' means the NULL terminator. So what we are planning to do is go through the entire array until the null is met.

This could similarly performed by having stored the length in a struct and using that as an ArrayList available in Java or a list in C#. But this is a quick fix for simple algorithmic simulations. :)