CSE3303

CSE3303 Sample Final

Q1: A cycle in the Resource Allocation graph indicates a deadlock provided
(a) there is more than one resource of each type
(b) there is one resource of each type
(c) there are more resources than processes
(d) there are more processes than resources
(e) none of the above

Answer: (b) there is one resource of each type

Q2: Which one of the below is NOT a method to detect a cycle in an undirected graph (ie. edges do not have arrows)?
(a) Depth-first search where we mark nodes visited
(b) Breadth-first search where we mark nodes visited
(c) Hansel-Gratel method
(d) Recursive search
(e) Identification of strongly connected components

Answer: (d) Recursive search

Q3: Which of the folloing makes sure that a directed graph is a tree?
(a) The graph can be levelized
(b) Every node has exactly one incoming edge
(c) There is no cycle and every node has at most one incoming edge
(d) There is a root node
(e) None of the above

Answer: (c) There is no cycle and every node has at most one incoming edge

Q4: If there is a deadlock, the following actions can be taken. List them in the order of their impact on the system's performance. Go from least impact to most impact. (1) Reboot the system (2) Toggle the power switch (3) Terminate processes randomly until the deadlock is broken (4) Detect processes involved in the deadlock and terminate them until the deadlock is broken (5) Detect processes involved in the deadlock and rool them back to their previous checkpoints until the deadlock is broken.

Answer: 5, 4, 3, 2, 1.

Q5: What does PCB stand for?

Answer: Process Control Block.

Q6: What is a RAMDISK? What is it useful for?

Answer: We can use part of our RAM as a disk drive, and such a drive is called RAMDISK. The advantage of a RAMDISK is file access speed. The downside is the drive volatile. However, a special program can be installed to save the contents on a true disk drive at power-down and reload at power-up. See Wikipedia for more info.

Q8: What is phishing?

Answer: Phishing is leading you to a decoy website. For ex., they send you an email which appears to be from your bank, and asks you to logon to yopur account and update some info. The site that the email leads you looks like your bank's website but actually it is just an imitation. You enter your username/password thinking that is your bank's website. They capture your username/password, logon to your bank account, and funnel your money out. Phishing can take various forms. An early form I encountered at my grad school in the US was when a student walked away from a computer w/o logging out. He left a program running. The program looked like login screen. People tried to login thinking that nobody was working on that machine. The student's program gave a wrong user or password message and in the process captured people's passwords.

Q9: What is interleaving?

Answer: Interl;heaving comes up in different contexts. Interleaving usually means using multiple copies of the same resource and spawning newer requests to extra copies of the resource while that resource is busy. This method increases throughput of storage devices such as RAM and HD.

Q10: What is authentication?

Answer: Authentication is making sure that a person (or agent) is who he claims he is. Even the most powerful encryption method is insecure unless we have a secure way of authenticating the agent(s) that carry the encrypted data. There is no magic mathematical solution to authentication. Authentication requires one or more trusted bodies. Without any trusted bodies, even the most sophisticated authentication method is meaningless. So to start with we should have at least one trusted body in existence.

Q11: What is RSA?

Answer: RSA stands for Rivest-Shamir-Adleman and is a popular public (asymmetric) key encrytion algorithm -- usually used for authentication rather than streaming data.

Q12: What is SAMBA?

Answer: SAMBA is a file serevr software that runs on UNIX machines. It lets UNIX file servers appear as Windows file servers to Windows machines and vice versa. Vice versa means the reverse situation. In our case, that means SAMBA also makes Windows file servers appear as UNIX file servers to UNIX machines. SAMBA is something you should look into. UNIX has a really efficient file system and hence its file servers serve data really fast. Take the same PC and turn it into a UNIX file server, and it will serve faster than the case when the same exact machine is a Windows file servers. Also, UNIX file servers need fewer reboots. UNIX/Linux still suffers in ease of use when it comes to ordinary users. However, it excels in the server department. Its file and more importantly web and database servers are faster (mostly due to a faster file system) and more stable.

Q13: What is automounting of drives?

Answer: Usually used in the context of network drives. Mounting is the process of mapping a drive to a directory path in our file system and making them accesible. In Windows, it is mapping them to a drive letter. Mounting requires a lengthy communication with the file server. Automounting is automatically mounting some drives at boot of logon time. Compare automounting to dynamic mounting.

Q14: What is dynamic mounting of drives?

Answer: In an enterprise network (ie. WAN), there are many network drives that we may possibly want to access. Mounting all of them at boot time makes the boot process too long especially if some of those networks do not respond and time out after long time out periods. Automounting may make a computer seem like it is not responding (ie. as if it locked up) when it is booting up. It is possible that we may not access many of those drives for a very long time if ever. Dynamic mounting mounts a drive the first time it is referred to (ie. access is requested). For ex., when you do cd Z:, then Z: drive mounts.

Q15: What is the distinguishing property of hard real-time systems? Give an example of a hard real-time system.

Answer: Hard real-time systems are systems where, when the timimg requirements are not met, either the system does not funtion ro it catastrophically fails. An ABS (Automatic Braking System in cars) system, most of aviation (ie. aeroplane) electronics, most weapon systems are hard real-time systems.

Q16: What is different between hard and soft real-time systems? Give an example of a soft real-time system.

Answer: A hard real-time system has to obey timing requiremets. However, in a soft real-time system it is desirable that timing requirements are met. If they are nit met, user experience gets worse but the result is not catastroph. Simply put, if a soft real-time system does not meet desired timing, nobody dies. A multimedia system such as an MP3 player or DivX player is a soft real-time system. If timing is not met, the audioor video is not smooth and gets interrupted at times. Take a look at Wikipedia for more info.

Q17: Give an example of a system which would lose a lot of performance if it had to only sequentially access files (ie. could not random-access files).

Answer: A database server.

Q18: What is the file operation used for random-access?

Answer: fseek() or also called just seek().

Q19: What is internal fragmentation in disks?

Answer: Internal fragmentation is space left unused at the end of every last block in files.

Q20: What is external fragmentation in disks?

Answer: External fragmentation is having blocks of a file scattered around the disk. If we could have all files allocated contiguously, there would be no external fragmentation.

Q21: What does defragmentation software do?

Answer: Defragmentation software moves the blocks around and thus makes files mapping to the disk blocks as contiguos as possible. This boosts disk access time by minimizing head movement.

Q22: What is the difference between a sector and block (also called cluster)?

Answer: A block is one or more sectors. The disk electronics serves disk data in chunks. ANd that chunk is a sector. However, the operating system may want to use bigger chunks called block or cluster. A block has to be an integer number of sectors. A sector is usually 512B. A block hence can be any multiple of 512B (= 0.5KB).

Q23: Let's say you are the system administrator of a Linux Apache web server. One of the links (which is to a plain html file) does not work. You are sure the file is there and other html files in the same directory are displayed without any problems. What might be wrong?

Answer: Possibly the permissions (ie. access privileges) are not right. Do "chmod 755 filename".

Q24: I have a HD of 40GB with a file system on it that has 4KB blocks. It is half full. The disk space lost due to internal fragmentation is 1%. How many files do have?

Answer: So 20GB is full. 1% lost means (20GBx1%=) 200MB is lost. Assuming an equal distribution of file sizes, the last block must be half full on the average. (Since we are possibly talking about thousands of files, there must be a file of every size. That is why the last block must be half full on the average.) So each file wastes 2KB. 200MB/2KB=100K. That hundred thousand files.

Q25: I have a HD of 40GB with four 10GB partitions. I have FAT32 file systems on my partitions with 4KB block size.. What is the size of each partition's table? What percentage of the disk space is lost to partition tables?

Answer: FAT32 uses 32b pointers. That is 4B for each pointer. A 4B pointer for one block (=4KB). Hence the overhead is one thousandth (1/1000=4B/4KB). 10GB partition requires 10MB table. Percentage-wise, it is 0.1%.

Q26: Let's say I have a disk with 10 blocks (0 through 9). And I have two files on the disk: fA and fB. fA takes up blocks 4 3 2 in that order. fB occupies 9 1 8 6 7. FAT file system is used. Give me the file allocation table contents. Use index -1 to mark the last block in a file.

Answer: 0 8 -1 2 3 5 7 -1 6 1.

   fA  fB
0:
1:     8
2: -1
3: 2
4: 3
5:
6:     7
7:     -1
8:     6
9:     1

Let me walk you through fA. The first block is block number 4. So go to 4. Write the index of the next block there: 3. Go to block 3. Write the next block index there: 2. Go to block 2. Since it is the last block, write EOF (End Of File) there, hence -1. The remaining blocks: 0 and 5 do not matter what they contain. They will not be pointed from the directopry table, anyway. I just made them point to themselves. If they were part of real file, they would never point to themselves.

Q27: Let's say I have a FAT disk with 10 blocks (0 through 9). Each block is 4KB. The file allocation table contains 0, 1, 3, 4, 6, 5, 7, -1, 8, 9. Is there a file on the file system.

Answer: Since there is a -1 (EOF: End Of File), there is a file on the system of size one block at the mminimum. The file may not be linked from a directory table but that does not mean it is not there. The file occupies the following blocks in that particular order: 2 3 4 6 7. (Work backwards from the block with -1.)

0: 0
1: 1
2: 3
3: 4
4: 6
5: 5
6: 7
7: -1
8: 8
9: 9

The last block is 7 because it contains -1. The block before last is the one that has 7 in it. That is block 6. So now we have ... 6 7. Next prepend the block that contains (ie. points to 6). That is 4. So now we have ... 4 6 7. It goes like that.

Q28: Consider the UNIX inode figure in the "File-System Implementation" chapter. The header is 8 words and each word is 4B. Note that intermediate index nodes contain nothing but pointers. Each pointer is 4B. What is the largest possible file size. Each block is 512B.

Answer: A rough calculation can be made as follows. The bulk of the capacity comes from the triple indirect pointer. Triple indirect pointer has a 1,000-fold storage capacity compared to the double indirect pointer. There are 512B/4B=128=2⁷ pointers in a pointer block. Triple indirect pointer leads us 2⁷x2⁷x2⁷ pointers in the leaf level (3rd level). That is 2M (2 million) pointers and hence 2²¹=2M blocks. In UNIX 1 block is 512B=2⁹B. Hence, the largest file size is roughly 1GB. (I believe this is not true anymore in UNIX. They possibly updated the inode structure in the last 10-15 years.)

The exact number is as follows. Add the storage capacity of the double indirect pointer to the above number. That is 8MB. Single indirect has 64KB. Direct blocks are 128-8-3=117. 117x512B=~60KB. So a closer number is 512GB plus 8MB plus 60KB.

Q29: We have 100 processes. Pi arrives at t=2i. P0's CPU burst is 200ms. Other processes all have a burst of 20ms. The OS is doing a RR (round robin) scheduling with a time quantum of tq. What is the average completion time (a) when tq=10 (b) when tq=100.

Answer: (a) P1 will finish first. Between t=10-20 its first 10ms will be executed. After the OS goes thru all 100 processes and does a 10ms of P0 again, it will execute and finish the burst of P1 (t=1010-1020). After that, every 10ms, a new process will finish. At t=2000, P1 thru P99 will have finished. At that point we executed 20 out of 200ms of P0. P0 will finish at t=2180. Completion time is finish time minus arrival time (ie. start time). Average completion time is sum of completion times divided by the number of processes (=100 in our case). Sum of completion times can be computed by summing finish times and subtracting sum of arrival times. Sum of finish times is equal to (Sum for i=1 to 99 of (1010 +10i)) +2180. That is equal to (1010x99 +10x(99x100)/2) +2180 = 151,670. Sum of arrival times is Sum for i=1 to 100 of (2i) = 2x(99x100)/2 = 99x100 = 9900. Hence sum of completion times = 151,670 -9,900 = 141,770. Divide that by 100 for average and you get approx. 1,517ms. This is really a large number. Most of the processes have a burst time of 20ms and they take almost 75 times that to complete.

(b) When tq=100, we do a 100 of P0 first, then processes finish their burst one by one before they get contexted out. Sum of completion times is (Sum for i=1 to 99 of (100 +20i)) +2180 = 111,080. Sum of arrival times was computed above: 9,900. Sum of completion times is 101,180. Average that over 100 processes and you get 1,012ms. That is a savings of 505ms per process. This also shows that tq should be picked such that an ordinary process' CPU burst should not be interrupted due to tq (ie. timer interrupt).

Q30: Which one is false in regards to swap space management in UNIX?
(a) UNIX uses a separate partition for swap space
(b) UNIX uses a different file system for swap space
(c) UNIX tries to allocate contiguous space for swapped processes since performance is the highest priority when it comes to swap space.
(d) UNIX's swap space infrastructure offers higher performance (ie. speed) than Windows.
(e) None of the above is false.

Answer: (e) None of the above is false.

Q31: Which of the following is false in regards to free disk space management methods?
(a) Bit Vector method allows contiguous allocation.
(b) Bit Vector method suffers in allocation time.
(c) Linked List method good allocation and update time.
(d) Linked List method suffers in contiguous allocation department.
(e) Counting Linked List method excels in update time.

Answer: (e) Counting Linked List method excels in update time.

Q32: Go over the Linux questions in MQ and MT and any other Linux questions posted on the webpage.

Q33: Consider the following snapshot of a system:

       Alloc    Req       Max     Avail
      A B C D  A B C D  A B C D  A B C D
      -------  -------  -------  -------
  P0  2 0 0 2  4 5 4 3  4 5 4 3  1 1 2 4
  P1  1 0 2 0  2 3 2 1  2 3 2 3
  P2  1 1 0 0  3 4 5 6  5 4 5 6
  P3  1 3 2 0  3 5 6 2  3 7 6 2

(a) Show that the system is in a deadlock. Explain why. (b) What is the min number extra resources we need to add to Avail to make the system free of deadlock. (c) Add that many resources and then show that the system is now in a non-deadlock state by giving a sequence.

Answer: Ignore the Max matrix. Solve this question just like the Banker's questions in MQ and MT (deadlock avoidance questions) but use Request in place of Max.

Q34: How do you a open a file if you are going to use it as a communication medium between two programs, ie. a substitute for IPC?

Answer: Files are normally opened for read or write or read/write. If you want to use a file as a communication medium (ie. buffer/pipe) between two processes, then you would normally think that one process opens the file for write and the other for read. However, that does not really work. The problem is as follows. A regular file_open_for_read function is designed to work with static files. Hence, after it reaches the EOF, it does not check to see if another process appended more data to the end of the file. It considers its job of reading the file done once it reaches EOF. In Perl programming language, a regular file_open_for_read function is as follows: open(FILEHANDLE, '< filepath'). However, for the pipe between the two processes to be established through the file, we need to open it as follows: open(FILEPATH, 'cat filepath |'). There must be the equivalent of this in every programming language. It must be achievable with several lines of code if not achieved by a single function call.

Q35: In UNIX let's say we have a file which has its x permission set but it is a plain text file. Can the file be executed? If yes, how?

Answer: Yes, it can be executed although it is a text file. If the executable file is a text file, it is interpreted as a script file. If the first line of the file starts with #!, the file is input to the program whose absolute path is specified after #! characters. If there is #! at the beginning, the file is assumed to be a Bourne shell (sh) script.

Q36: Most filesystems decide whether a block is an index block (a set of pointers to blocks) or data statically. At Juniper Networks, we used a mechanism, which lets a block be either a data block or index block based on a header bit in the block itself. Give this solve the following problems. Assume a block size of 4KB. A pointer is 4B. Assume we can fit 4KB of data in a data block and 1K pointers in an index block. Let's say we have 1,000 1KB files, 500 5KB files, and 1 100MB file. (a) How much disk space will these files use in our flexible file system? (b) What is the largest file size we can have in this flexible file system? (c) Instead of this flexible file system, if we had a 3 level indexing mechanism (3 levels in all cases), how much disk space would be used? (d) What is the largest file size in this inflexible file system?

Answer: (a) The 1,000 1KB files can be stored in one 4KB (data) block each. A 5KB file needs 2 data blocks, and we need an index block to point to them. Hence, a 5KB file occupies 3 blocks (ie. 12KB). For the 100MB file, we need 100MB/4KB (=25K) data blocks. One index block contains 1K pointers. I would use 25 pointers in the root index block. Then, I would fully utilize all the 1K pointers in all of the 25 2nd level index blocks. Hence, we need 1+25+25K blocks. The total size on disk is 1x1,000 + 3x500 + (25K+26)x1 = 28,126 blocks = 112,504 KB.

(b) The largest file size is unbounded because we can any levels of indirections (ie. pointers). However since a pointer is 4B=32b, we can have at most 2³²=4G blocks. That is 4Gx4KB=16TB. Hence, the whole file system can 16TB at most, so is a single file. Max file size is 16TB.

(c) A 1KB file would occupy 4 blocks. A 5KB file would occupy 5 blocks. A 100MB file 25K+27 blocks. Total is 4x1,000 + 5x500 + (25K+27)x1 = 32,127 blocks = 128,508 KB. 16KB more then the above file system.

(d) (2¹⁰)=1G pointers and hence 4TB.

The file system proposed here with a flexible number levels is especially beneficial in the presence of many small files. And in fact, on many OSes and machines, many small files lie around.

Q37: Let's say our disk's head is on track 50 and the pending requests are for tracks 29, 92, 45, 60, 49, 52. Use SSTF to schedule the disk accesses.

Answer: SSTF stands for Shortest Seek Time First. This is a greedy (ie. acgozlu = kestirme) heuristic. It goes to the nearest track at every point. It goes like this: 50 -> 49 -> 52 -> 45 -> 60 -> 29 -> 92.

The total head movement is 120 tracks. If we went like this: 50 -> 49 -> 45 -> 29 -> 52 -> 60 -> 92, the total distance travelled would have been 84. The problem here is very similar to TSP (Travelling Salesman Problem), which is an N-P complete problem. These problems do not have a magic and nice solution. One apporach that gives a good solution in some cases may give pretty bad results in other cases. The only way to make sure you find the most optimal solution is to try nearly all possibilities. The reason SSTF makes too much zigzag in our case is because I designed a chronic example which is pretty bad for SSTF. Under normal circumstances, SSTF minimizes head movement pretty well.