Session 10 : Operating System Programming Concepts : Lab 10 - Posix Threads & AWK
Lab 10: POSIX Threads & AWK
Q? What is a thread?
/> It is a sequence of control within a process.
/> Creating a new thread is different from creating a new process.
/> When we create a new thread in a process, the new thread of execution gets its own stack (and hence its own local variables) but shares with the other threads of the process: global variables, file descriptors, signal handlers, and the current directory.
Q? How to use POSIX threads
/> First, a thread is created:
int pthread_create(pthread_t *thread, const pthread_attr_t *attr, void *(*start_routine)(void *), void *arg);
void *(*start_routine)(void *) simply says that we must pass the address of a function that takes a pointer to void as its parameter and returns a pointer to void.
/> Second, we join the created thread, which waits for it to terminate:
int pthread_join(pthread_t th, void **thread_return);
/> Lastly, the thread exits:
void pthread_exit(void *retval);
This function terminates the calling thread and returns a pointer to an object, which a joining thread receives through thread_return.
Q? What is awk
/> It is a utility that interprets a special-purpose programming language that makes simple data-reformatting jobs easy.
Ex: make changes in text files when certain patterns appear
extract data from parts of certain lines while discarding the rest
/> It searches the input for lines that contain certain patterns.
/> An awk program consists of rules. Each rule pairs one pattern to search for with one action to perform when that pattern is found:
pattern1 { action1 }
pattern2 { action2 }
Note:
* awk keeps processing input lines in this way, testing each line against the patterns, until the end of the input file is reached.
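The rule structure above can be tried out with a small sketch. The sample text and the patterns error and warn are made up for illustration; they are not part of the lab material.

```shell
# Hypothetical sample input, made up for this illustration.
sample='error disk full
ok all fine
warn low memory
error warn both'

# Two rules: every input line is tested against each pattern in order,
# so a line matching both patterns triggers both actions.
result=$(printf '%s\n' "$sample" | awk '
/error/ { print "E: " $0 }
/warn/  { print "W: " $0 }
')
echo "$result"
```

The line "ok all fine" matches neither pattern and produces no output, while the last line is printed twice, once by each rule.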
Q? How to run awk programs
/> If the program is short, we run it as follows:
awk 'program' input-file1 input-file2 …
Note: the program consists of rules in the format described above.
/> If the program is long, it is usually more convenient to put it in a file and run it with a command like this:
awk -f program-file.awk input-file1 input-file2 …
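As a rough sketch of the two ways of running a program, the snippet below runs the same one-rule program inline and from a file. The directory and the file names a.in, b.in and first.awk are hypothetical and are created only for this demonstration.

```shell
# Work in a throwaway directory; all file names here are hypothetical.
workdir=$(mktemp -d)
printf 'alpha 1\nbeta 2\n' > "$workdir/a.in"
printf 'gamma 3\n' > "$workdir/b.in"

# Short program given directly on the command line.
inline=$(awk '{ print $1 }' "$workdir/a.in" "$workdir/b.in")

# The same program stored in a file and run with -f.
printf '{ print $1 }\n' > "$workdir/first.awk"
fromfile=$(awk -f "$workdir/first.awk" "$workdir/a.in" "$workdir/b.in")

echo "$inline"
echo "$fromfile"
rm -r "$workdir"
```

Both invocations should print the same three words, since only the place where the program text lives has changed.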
Q? How do we read input files
a/> All input can be read from the standard input (the keyboard or a pipe from another command),
b/> or we can read from files whose names are specified on the awk command line.
/> When reading from files, awk reads them in order, reading all the data from one before going on to the next. The unit of reading is called a record: awk processes one record at a time.
/> In rare cases we use the getline command, which can do explicit input from any number of files.
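A minimal sketch of explicit input with getline, assuming two hypothetical files main.in and side.in created just for the demonstration. The awk variable side is also an illustrative name; it is passed in with -v.

```shell
workdir=$(mktemp -d)
printf 'one\ntwo\n' > "$workdir/main.in"
printf 'extra line\n' > "$workdir/side.in"

# Records come from main.in automatically; while the first record is
# being processed, getline explicitly pulls one line from side.in.
out=$(awk -v side="$workdir/side.in" '
NR == 1 { if ((getline line < side) > 0) print "side: " line }
        { print "main: " $0 }
' "$workdir/main.in")
echo "$out"
rm -r "$workdir"
```

The parentheses around the getline expression matter: getline returns 1 when a line was read, 0 at end of file and -1 on error, and the comparison should apply to that return value.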
Q? How is input split into records
/> The awk language divides its input into records and fields. Records are separated by a character called the record separator (RS).
/> By default the RS is a newline character (\n), i.e. one record is one single line.
/> We can also set the built-in variable RS to use a different character to separate our records.
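The effect of RS can be seen in a short sketch. The input strings are invented for illustration, and the semicolon is just one possible choice of separator.

```shell
# Default RS: each line is one record.
by_line=$(printf 'a;b\nc;d\n' | awk '{ print NR ": " $0 }')

# RS set to ";": records now end at each semicolon instead.
by_semicolon=$(printf 'a;b;c' | awk 'BEGIN { RS = ";" } { print NR ": " $0 }')

echo "$by_line"
echo "$by_semicolon"
```

NR is the built-in count of records read so far, so the first command numbers two lines while the second numbers three semicolon-separated pieces.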
Q? What are fields
/> Records are automatically parsed into chunks called fields.
/> By default fields are separated by whitespace (ex: like words in a line).
/> $1 refers to the first field, $2 to the second, and so on.
/> $NF refers to the last field (NF itself holds the number of fields in the current record).
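The difference between NF and $NF is easy to check on a made-up line of four words (the sentence below is only sample data):

```shell
line='the quick brown fox'

first=$(printf '%s\n' "$line" | awk '{ print $1 }')    # first field
last=$(printf '%s\n' "$line" | awk '{ print $NF }')    # last field
count=$(printf '%s\n' "$line" | awk '{ print NF }')    # number of fields

echo "$first $last $count"    # prints "the fox 4"
```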
Q? How are fields separated
/> This is controlled by the field separator FS (a built-in variable).
/> FS is either a single character or a regular expression.
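Both kinds of separator can be sketched as follows. The input lines are invented; the first imitates the colon-separated layout of /etc/passwd, and the regular expression in the second is just one example.

```shell
# Single-character separator, set with the -F option
# (equivalent to assigning FS before any input is read).
user=$(printf 'root:x:0:0\n' | awk -F: '{ print $1 }')

# Regular-expression separator: a comma followed by optional spaces.
second=$(printf 'a,  b,c\n' | awk 'BEGIN { FS = ",[ ]*" } { print $2 }')

echo "$user"      # prints "root"
echo "$second"    # prints "b"
```

Setting FS inside a BEGIN block makes sure the new separator is already in effect when the first record is split.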
Q? How do we print output
/> We use the print statement.
/> We can specify the strings or numbers to be printed in a list separated by commas:
print item1, item2, …
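A small sketch of what the commas do, using a made-up two-word input: items separated by commas are printed with a space between them (the default output separator), while items written side by side are concatenated.

```shell
with_comma=$(printf 'x y\n' | awk '{ print $2, $1 }')    # -> "y x"
joined=$(printf 'x y\n' | awk '{ print $2 $1 }')         # -> "yx"

echo "$with_comma"
echo "$joined"
```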
Example:
awk '{ if (NF < 4) print "line content is " $0 }' file.in
Note:
* The conditional syntax is the same as in the C language.
awk '/user/' /etc/passwd → prints the lines that contain the string user
awk 'length($0) > 80' file.in → prints every line longer than 80 chars
awk 'NF > 0' → prints every line that has at least one field
awk '{print $1}' file.in → prints the first field of every line of the file
Example script (simple_script.awk):
/string1/ { if ($3 > 0) print $1 }     # rule 1 comment
/string2/ { if ($4 > 10) print $NF }   # rule 2
Execution:
awk -f simple_script.awk inputFile
Learning goals: in this laboratory activity you will practice writing multithreaded C applications using the pthread library. You will learn how to write simple AWK scripts and you will also improve your Bash scripting skills.
Exercise 1
Write a concurrent program able to sort data files using threads as follows.
The input files include:
• on the first line, the total number of integer values;
• on the following lines, one number per line.
For example, a file could be:
5
3
45
76
9
11
The program receives N command-line parameters, taken in pairs. In each pair:
• the first parameter identifies an input file;
• the second one identifies the corresponding output file.
Then the program creates N/2 threads.
Each thread:
• reads the corresponding input file;
• sorts the corresponding integer vector in ascending order;
• stores the result in the corresponding output file.
For example:
./thread_sort.exe file1.in file1.out file2.in file2.out file3.in file3.out
It will create 3 threads:
• Thread 1 will sort file1.in and store the result in file1.out
• Thread 2 will sort file2.in and store the result in file2.out
• Thread 3 will sort file3.in and store the result in file3.out
Hint:
Each thread calls 3 functions:
1. ReadFileIn
2. Sort
3. WriteFileOut
The program has to implement the following precedence graph (example with 4 parameters → 2 threads):
M1, M2: main begin and main end, respectively
R1, R2: reading input files
O1, O2: sorting vectors
W1, W2: writing output files
Exercise 2
Write a Bash script which reads a file name as the first command line parameter and a word as the second command line parameter.
The file includes a list of directories. For each directory, the script searches all the text files (ending in “.txt”) and for each file generates statistics in two files:
1. The first file, with the same file name but ending in “.stat”, contains the file statistics → number of lines, number of chars, number of words, and the length of the longest line;
2. The second, with the same file name but ending in “.graph”, contains a histogram (made of “+” and “-” symbols) representing the occurrences of the word (second parameter) in the text file. For each line of the text file, the graph file contains the line number followed by a “+” symbol for each occurrence of the word, or a “-” symbol if the word does not appear in the corresponding line.
At the end both the “.stat” file and the “.graph” file are stored in a new directory with the same name as the original directory but ending in “_stats”.
Hint: use the basename command to remove the extension of a file name (man basename)
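As a quick illustration of the hint (the path /some/dir/notes.txt is hypothetical and does not need to exist), basename strips both the leading directories and, when a suffix is given as second argument, the extension:

```shell
# basename works on the string only; the file does not have to exist.
name=$(basename /some/dir/notes.txt .txt)

echo "$name"          # prints "notes"
echo "$name.stat"     # prints "notes.stat"
```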
Optional: The script also creates a compressed archive for each stats directory and moves it to a directory named “backup”.
Exercise 3
Using only AWK, perform the following tasks:
1. Print the name of the initialization process, the first process executed, with PID 1;
2. Print the name and PID of the processes whose status is R or R+;
Hint: use ps -el to list the processes and pipe the output to the awk command
Summary
At the end of this laboratory activity you should have understood how to use threads to write multithreaded applications. You should also have improved your understanding of writing Bash and AWK scripts.