Write My Paper Button

Create a binary tree of threads where each thread will compute the hash value Hash value The hash value of a string (very long) is a fixed length numeric value. Hash values are typically expressed in hex digits and can be long. In this project we will restrict the hash value type

Assignment Task

Multi-threaded hash tree

In this project, you will write a multi-threaded program to compute a hash value of a given file. The size of the file can be very large. So, there is a need for the program to be multi-threaded to achieve speedup. While one thread reads a part (block) of the file, other threads can compute hash values of already read blocks.

Binary tree threads

Create a binary tree of threads where each thread will compute the hash value

Hash value

The hash value of a string (very long) is a fixed length numeric value. Hash values are typically expressed in hex digits and can be long. In this project we will restrict the hash value type to a 32-bit unsigned integer. A hashing algorithm will read the input string (file) and produce a hash value. There are many hashing algorithms in the literature. For this project we will use Jenkins one at a time hash described in

One of the uses of hash value is to check the integrity of the data (file, database, etc.). One can think of a hash value of a file as a fingerprint of the file. If the file is modified, then the hash value also changes. By comparing the new and old hash values one can easily detect that the file was modified.

Computation of a hash value of a large file (several giga bytes in size) may take a while. Hence a need for a multi-threaded program.

Part 1- Multi-threaded hashing

A file can be divided into multiple blocks. The basic unit of accessing a file on a i/o device is a block. Assume there are n blocks, m threads, and n is divisible by m. Each thread can compute the hash value of n/m consecutive blocks. Several of these hash values are then combined to form a string whose hash value can be computed as before. These steps can be repeated till one final hash value remain.

A complete binary tree of threads should be created. Each leaf thread will compute the hash value of n/m consecutive blocks assigned to it and return the hash value to its parent thread (through pthread_exit() call). An interior thread will compute the hash value of the blocks assigned to it, and then wait for its children to return hash values. These hash values should be combined to form a string: . (Note that the + sign here means concatenation and not addition. Also, if the interior thread has no right child, only the hash value of the interior thread and the hash value from the left child are combined.) The interior thread then computes hash value of the concatenated string as before and returns the value to its parent, The value returned by the root thread is the hash value of the file.

Part 2- Experimental study

The main goal of this study is to find speedup achieved when the number of threads is increased for various input files. For each input file, find the time taken to compute the hash value for 1, 4, 16, 32, 64, 128, and 256 threads. To run 256 threads, increase ulimit -u to 256.

Plot a graph with time on y-axis, and threads on x-axis for each input file. Plot another graph with speedup on y-axis and #threads on x-axis for each input file.

speedup = (time taken for single thread)/(time taken for n threads)

Write a short report on what you observe. It is expected that the time taken to compute hash value decrease when the number of threads increase. Does it always happen? What is the speed-up achieved when the number of threads increased? Is the speedup proportional to the number of threads increased? Also, briefly describe the experimental set-up at the beginning of your report. (Command ‘lscpu’ provides relevant information for this)

Write a high-quality report. You should showcase the report in your job interview.

WhatsApp Widget
GET YOUR PAPER DONE