Introduction to Algorithms, by Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein.
Figure (insertion sort): shaded arrows show array values moved one position to the right in line 6, and black arrows indicate where the key is moved to in line 8.
We use loop invariants to help us understand why an algorithm is correct. We must show three things about a loop invariant:
Initialization: It is true prior to the first iteration of the loop.
Maintenance: If it is true before an iteration of the loop, it remains true before the next iteration.
Termination: When the loop terminates, the invariant gives us a useful property that helps show that the algorithm is correct.
When the first two properties hold, the loop invariant is true prior to every iteration of the loop. Note the similarity to mathematical induction, where to prove that a property holds, you prove a base case and an inductive step.
Here, showing that the invariant holds before the first iteration is like the base case, and showing that the invariant holds from iteration to iteration is like the inductive step. The third property is perhaps the most important one, since we are using the loop invariant to show correctness.
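It also differs from the usual use of mathematical induction, in which the inductive step is used infinitely; here, we stop the "induction" when the loop terminates. Let us see how these properties hold for insertion sort. The invariant is that, at the start of each iteration of the outer for loop, the subarray A[1..j - 1] consists of the elements originally in A[1..j - 1], but in sorted order.
Initialization: We start by showing that the loop invariant holds before the first loop iteration, when j = 2. The subarray A[1..j - 1] therefore consists of just the single element A[1]. Moreover, this subarray is sorted (trivially, of course), which shows that the loop invariant holds prior to the first iteration of the loop.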
Maintenance: Next, we tackle the second property: showing that each iteration maintains the loop invariant. Informally, the body of the outer for loop works by moving A[j - 1], A[j - 2], A[j - 3], and so on by one position to the right until the proper position for A[j] is found, at which point the value of A[j] is inserted (line 8). A more formal treatment of the second property would require us to state and show a loop invariant for the "inner" while loop. At this point, however, we prefer not to get bogged down in such formalism, and so we rely on our informal analysis to show that the second property holds for the outer loop.
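Termination: Finally, we examine what happens when the loop terminates. For insertion sort, the outer for loop ends when j exceeds n, i.e., when j = n + 1. Substituting n + 1 for j in the wording of the loop invariant, we have that the subarray A[1..n] consists of the elements originally in A[1..n], but in sorted order. But the subarray A[1..n] is the entire array! Hence, the entire array is sorted, which means that the algorithm is correct.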
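The correctness argument above can be mirrored in running code. The following is a minimal sketch in Python, not the book's pseudocode: it assumes 0-based lists, so the outer loop starts at j = 1 rather than j = 2, and the `assert` statements are added here only to check the invariant at run time.

```python
def insertion_sort(a):
    """Sort the list a in place (0-based translation of insertion sort)."""
    original = list(a)  # kept only so the invariant can be checked
    for j in range(1, len(a)):
        # Loop invariant: a[0:j] holds the elements originally in a[0:j],
        # but in sorted order.
        assert a[:j] == sorted(original[:j])
        key = a[j]
        i = j - 1
        # Move elements greater than key one position to the right.
        while i >= 0 and a[i] > key:
            a[i + 1] = a[i]
            i -= 1
        a[i + 1] = key  # insert key at its proper position
    # Termination: the invariant now covers the entire list.
    assert a == sorted(original)
    return a
```

Calling `insertion_sort([5, 2, 4, 6, 1, 3])` returns the list in nondecreasing order, and every invariant check passes along the way.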
We shall use this method of loop invariants to show correctness later in this chapter and in other chapters as well.
Pseudocode conventions
We use the following conventions in our pseudocode. Indentation indicates block structure. For example, the body of the for loop that begins on line 1 runs through line 8, and the body of the while loop that begins on line 5 does not include line 8.
Our indentation style applies to if-then-else statements as well.
Using indentation instead of conventional indicators of block structure, such as begin and end statements, greatly reduces clutter while preserving, or even enhancing, clarity. The looping constructs while, for, and repeat and the conditional constructs if, then, and else have interpretations similar to those in Pascal.
The loop counter retains its value after exiting the loop. Thus, immediately after a for loop, the loop counter's value is the value that first exceeded the for loop bound. We used this property in our correctness argument for insertion sort. Variables such as i, j, and key are local to the given procedure. We shall not use global variables without explicit indication. Array elements are accessed by specifying the array name followed by the index in square brackets. For example, A[i] indicates the ith element of the array A. The notation ".." is used to indicate a range of values within an array.
Thus, A[1..j] indicates the subarray of A consisting of the j elements A[1], A[2], ..., A[j]. Compound data are typically organized into objects, which are composed of attributes or fields.
A particular field is accessed using the field name followed by the name of its object in square brackets. For example, we treat an array as an object with the attribute length indicating how many elements it contains. To specify the number of elements in an array A, we write length[A]. Although we use square brackets for both array indexing and object attributes, it will usually be clear from the context which interpretation is intended.
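For readers implementing the pseudocode in a 0-based language, the array notation can be mapped as in the sketch below. The helper names `element` and `subarray` are hypothetical and introduced only for illustration; Python's built-in `len` plays the role of length[A].

```python
A = [31, 41, 59, 26, 41, 58]

def element(a, i):
    """Return the book-style element A[i], where i is 1-based."""
    return a[i - 1]

def subarray(a, lo, hi):
    """Return the book-style subarray A[lo..hi], 1-based and inclusive."""
    return a[lo - 1:hi]

n = len(A)  # corresponds to length[A]
```

With these helpers, `subarray(A, 1, 3)` yields the first three elements, matching A[1..3] in the pseudocode.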
A variable representing an array or object is treated as a pointer to the data representing the array or object. Sometimes, a pointer will refer to no object at all.
In this case, we give it the special value NIL. Parameters are passed to a procedure by value: the called procedure receives its own copy of the parameters, and if it assigns a value to a parameter, the change is not seen by the calling procedure. When objects are passed, the pointer to the data representing the object is copied, but the object's fields are not.
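Python happens to follow the same parameter-passing convention, so it can illustrate the distinction. In this sketch, the class `Node` and the functions `reassign`, `mutate`, and `demo` are hypothetical names chosen for the example.

```python
class Node:
    """A tiny object with a single field, key."""
    def __init__(self, key):
        self.key = key

def reassign(x):
    # Assigning to the parameter changes only the procedure's own copy
    # of the pointer; the caller's variable is unaffected.
    x = Node(99)

def mutate(x):
    # Assigning to a field changes the shared object, so the caller
    # sees the new value.
    x.key = 99

def demo():
    a = Node(1)
    reassign(a)
    b = Node(1)
    mutate(b)
    return a.key, b.key
```

Here `demo()` returns `(1, 99)`: the reassignment is invisible to the caller, while the field update is visible.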
The boolean operators "and" and "or" are short circuiting. That is, when we evaluate the expression "x and y" we first evaluate x. If x evaluates to FALSE, then the entire expression cannot evaluate to TRUE, and so we do not evaluate y. If, on the other hand, x evaluates to TRUE, we must evaluate y to determine the value of the entire expression.
Exercises
Write pseudocode for linear search, which scans through the sequence, looking for v. Using a loop invariant, prove that your algorithm is correct. Make sure that your loop invariant fulfills the three necessary properties.
State the problem formally and write pseudocode for adding the two integers.
Analyzing an algorithm means predicting the resources that it requires. Occasionally, resources such as memory, communication bandwidth, or computer hardware are of primary concern, but most often it is computational time that we want to measure.
Generally, by analyzing several candidate algorithms for a problem, a most efficient one can be easily identified. Such analysis may indicate more than one viable candidate, but several inferior algorithms are usually discarded in the process. Before we can analyze an algorithm, we must have a model of the implementation technology that will be used, including a model for the resources of that technology and their costs.
For most of this book, we shall assume a generic one-processor, random-access machine (RAM) model of computation as our implementation technology and understand that our algorithms will be implemented as computer programs.
In the RAM model, instructions are executed one after another, with no concurrent operations. In later chapters, however, we shall have occasion to investigate models for digital hardware. Strictly speaking, one should precisely define the instructions of the RAM model and their costs.
To do so, however, would be tedious and would yield little insight into algorithm design and analysis. Yet we must be careful not to abuse the RAM model. For example, what if a RAM had an instruction that sorts?
Then we could sort in just one instruction. Such a RAM would be unrealistic, since real computers do not have such instructions. Our guide, therefore, is how real computers are designed. The RAM model contains instructions commonly found in real computers: arithmetic (add, subtract, multiply, divide, remainder, floor, ceiling), data movement (load, store, copy), and control (conditional and unconditional branch, subroutine call and return).
Each such instruction takes a constant amount of time. The data types in the RAM model are integer and floating point. Although we typically do not concern ourselves with precision in this book, in some applications precision is crucial.
We also assume a limit on the size of each word of data. If the word size could grow arbitrarily, we could store huge amounts of data in one word and operate on it all in constant time, which is clearly an unrealistic scenario.
Real computers contain instructions not listed above, and such instructions represent a gray area in the RAM model. For example, is exponentiation a constant-time instruction? In the general case, no; it takes several instructions to compute x^y when x and y are real numbers. In restricted situations, however, exponentiation is a constant-time operation. Many computers have a "shift left" instruction, which in constant time shifts the bits of an integer by k positions to the left.
In most computers, shifting the bits of an integer by one position to the left is equivalent to multiplication by 2.
Shifting the bits by k positions to the left is equivalent to multiplication by 2^k. Therefore, such computers can compute 2^k in one constant-time instruction by shifting the integer 1 by k positions to the left, as long as k is no more than the number of bits in a computer word. We will endeavor to avoid such gray areas in the RAM model, but we will treat computation of 2^k as a constant-time operation when k is a small enough positive integer.
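The shift-left trick can be tried directly. The sketch below assumes a hypothetical 64-bit word (the `word_bits` parameter is an illustration, not part of the RAM model's definition); Python integers have arbitrary precision, so the bound is enforced by hand to mimic a fixed word size.

```python
def power_of_two(k, word_bits=64):
    """Compute 2**k by shifting the integer 1 left by k positions."""
    # Assumption: a result must fit in a word of word_bits bits.
    if not 0 <= k < word_bits:
        raise ValueError("k must fit within the assumed word size")
    return 1 << k

def times_power_of_two(x, k):
    """Multiply x by 2**k using a single left shift."""
    return x << k
```

For example, `power_of_two(10)` gives 1024, and `times_power_of_two(3, 4)` equals 3 multiplied by 16.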
In the RAM model, we do not attempt to model the memory hierarchy that is common in contemporary computers. That is, we do not model caches or virtual memory (which is most often implemented with demand paging). Several computational models attempt to account for memory-hierarchy effects, which are sometimes significant in real programs on real machines.
A handful of problems in this book examine memory-hierarchy effects, but for the most part, the analyses in this book will not consider them. Models that include the memory hierarchy are quite a bit more complex than the RAM model, so that they can be difficult to work with. Moreover, RAM-model analyses are usually excellent predictors of performance on actual machines. Analyzing even a simple algorithm in the RAM model can be a challenge.
The mathematical tools required may include combinatorics, probability theory, algebraic dexterity, and the ability to identify the most significant terms in a formula. Because the behavior of an algorithm may be different for each possible input, we need a means for summarizing that behavior in simple, easily understood formulas.
Even though we typically select only one machine model to analyze a given algorithm, we still face many choices in deciding how to express our analysis.
We would like a way that is simple to write and manipulate, shows the important characteristics of an algorithm's resource requirements, and suppresses tedious details. In general, the time taken by an algorithm grows with the size of the input, so it is traditional to describe the running time of a program as a function of the size of its input. To do so, we need to define the terms "running time" and "size of input" more carefully. The best notion for input size depends on the problem being studied.
For many problems, such as sorting or computing discrete Fourier transforms, the most natural measure is the number of items in the input; for example, the array size n for sorting.
For many other problems, such as multiplying two integers, the best measure of input size is the total number of bits needed to represent the input in ordinary binary notation. Sometimes, it is more appropriate to describe the size of the input with two numbers rather than one.
For instance, if the input to an algorithm is a graph, the input size can be described by the numbers of vertices and edges in the graph. We shall indicate which input size measure is being used with each problem we study. The running time of an algorithm on a particular input is the number of primitive operations or "steps" executed. It is convenient to define the notion of step so that it is as machine-independent as possible.
For the moment, let us adopt the following view. A constant amount of time is required to execute each line of our pseudocode.
One line may take a different amount of time than another line, but we shall assume that each execution of the ith line takes time c_i, where c_i is a constant. This viewpoint is in keeping with the RAM model, and it also reflects how the pseudocode would be implemented on most actual computers.
This simpler notation will also make it easy to determine whether one algorithm is more efficient than another. When a for or while loop exits in the usual way (i.e., due to the test in the loop header), the test is executed one time more than the loop body. We assume that comments are not executable statements, and so they take no time. If the array is in reverse sorted order, that is, in decreasing order, the worst case results.
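To see how strongly the input order affects the work done, one can count how many times the inner loop of insertion sort moves an element. The function `count_shifts` below is a hypothetical instrumented version written for this illustration; it uses 0-based indexing and works on a copy of its input.

```python
def count_shifts(values):
    """Run insertion sort on a copy of values; return the number of shifts."""
    a = list(values)
    shifts = 0
    for j in range(1, len(a)):
        key = a[j]
        i = j - 1
        while i >= 0 and a[i] > key:
            a[i + 1] = a[i]  # one element moved one position to the right
            i -= 1
            shifts += 1
        a[i + 1] = key
    return shifts
```

On an already sorted input of n elements the count is 0, while on a reverse-sorted input every element is moved past all of its predecessors, giving n(n - 1)/2 shifts (15 for n = 6).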
Typically, as in insertion sort, the running time of an algorithm is fixed for a given input, although in later chapters we shall see some interesting "randomized" algorithms whose behavior can vary even for a fixed input.
Worst-case and average-case analysis
In our analysis of insertion sort, we looked at both the best case, in which the input array was already sorted, and the worst case, in which the input array was reverse sorted.
For the remainder of this book, though, we shall usually concentrate on finding only the worst-case running time, that is, the longest running time for any input of size n. We give three reasons for this orientation. The worst-case running time of an algorithm is an upper bound on the running time for any input. Knowing it gives us a guarantee that the algorithm will never take any longer. We need not make some educated guess about the running time and hope that it never gets much worse.
For some algorithms, the worst case occurs fairly often. For example, in searching a database for a particular piece of information, the searching algorithm's worst case will often occur when the information is not present in the database. In some searching applications, searches for absent information may be frequent.
The "average case" is often roughly as bad as the worst case.
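The second reason above, that a search for absent information triggers the worst case, can be observed with a simple linear search. This is a sketch for illustration only; the function name `linear_search` and its returned comparison count are assumptions made for the example.

```python
def linear_search(seq, v):
    """Scan seq for v; return (index, comparisons), with index None if absent."""
    comparisons = 0
    for i, x in enumerate(seq):
        comparisons += 1
        if x == v:
            return i, comparisons
    # v is absent: every element was compared, which is the worst case.
    return None, comparisons
```

Searching a six-element list for a value that is not present costs six comparisons, the same as finding a value stored in the last position.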
A few portions of the book rely on some knowledge of elementary calculus. We have heard, loud and clear, the call to supply solutions to problems and exercises. Feel free to check your solutions against ours.
We ask, however, that you do not send your solutions to us.
To the professional
The wide range of topics in this book makes it an excellent handbook on algorithms. Because each chapter is relatively self-contained, you can focus in on the topics that most interest you. Most of the algorithms we discuss have great practical utility.
We therefore address implementation concerns and other engineering issues. We often provide practical alternatives to the few algorithms that are primarily of theoretical interest.
We have designed the pseudocode to present each algorithm clearly and succinctly. We attempt to present each algorithm simply and directly without allowing the idiosyncrasies of a particular programming language to obscure its essence. We understand that if you are using this book outside of a course, then you might be unable to check your solutions to problems and exercises against solutions provided by an instructor.
Please do not send your solutions to us.
To our colleagues
We have supplied an extensive bibliography and pointers to the current literature. Each chapter ends with a set of chapter notes that give historical details and references. Though it may be hard to believe for a book of this size, space constraints prevented us from including many interesting algorithms.
Changes for the third edition
What has changed between the second and third editions of this book?
As we said about the second-edition changes, depending on how you look at it, the book changed either not much or quite a bit. A quick look at the table of contents shows that most of the second-edition chapters and sections appear in the third edition.
We removed two chapters and one section, but we have added three new chapters and two new sections apart from these new chapters. Rather than organizing chapters by only problem domains or according only to techniques, this book has elements of both. It contains technique-based chapters on divide-and-conquer, dynamic programming, greedy algorithms, amortized analysis, NP-Completeness, and approximation algorithms.
But it also has entire parts on sorting, on data structures for dynamic sets, and on algorithms for graph problems. One key idea in the sorting networks chapter, the 0-1 principle, appears in this edition within a problem as the 0-1 sorting lemma for compare-exchange algorithms. The treatment of Fibonacci heaps no longer relies on binomial heaps as a precursor. Dynamic programming now leads off with a more interesting problem, rod cutting, than the assembly-line scheduling problem from the second edition.
Furthermore, we emphasize memoization a bit more than we did in the second edition, and we introduce the notion of the subproblem graph as a way to understand the running time of a dynamic-programming algorithm. In our opening example of greedy algorithms, the activity-selection problem, we get to the greedy algorithm more directly than we did in the second edition. With our new way to delete nodes, if other components of a program maintain pointers to nodes in the tree, they will not mistakenly end up with stale pointers to nodes that have been deleted.
Most of these errors were posted on our Web site of second-edition errata, but a few were not. We also now use dot-notation to indicate object attributes. Our pseudocode remains procedural, rather than object-oriented. In other words, rather than running methods on objects, we simply call procedures, passing objects as parameters. We also updated many bibliography entries and added several new ones. The Web site links to a list of known errors, solutions to selected exercises and problems, and of course a list explaining the corny professor jokes, as well as other content that we might add.
The Web site also tells you how to report errors or make suggestions. We used the Times font with mathematics typeset using the MathTime Pro 2 fonts. We drew the illustrations for the third edition using MacDraw Pro, with some of the mathematical expressions in illustrations laid in with the psfrag package for LaTeX 2ε. Unfortunately, MacDraw Pro is legacy software, having not been marketed for over a decade now.
Happily, we still have a couple of Macintoshes that can run the Classic environment under OS X. Hence the decision to revert to MacDraw Pro running on older Macintoshes. We thank our respective universities and colleagues for providing such supportive and stimulating environments.
Time and again, we were amazed at the errors that eluded us, but that Julie Sussman caught. She also helped us improve our presentation in several places. She is nothing short of phenomenal. Thank you, thank you, thank you, Julie! Priya Natarajan also found some errors that we were able to correct before this book went to press.
Any errors that remain and undoubtedly, some do are the responsibility of the authors and probably were inserted after Julie read the material. The chapter on multithreading was based on notes originally written jointly with Harald Prokop. We rejoice that the number of such contributors has grown so great that we must regret that it has become impractical to list them all.
The patience and encouragement of our families made this project possible. We affectionately dedicate this book to them.
Part I is intended to be a gentle introduction to how we specify algorithms, some of the design strategies we will use throughout this book, and many of the fundamental ideas used in algorithm analysis. Later parts of this book will build upon this base. Chapter 1 provides an overview of algorithms and their place in modern computing systems.
It also makes a case that we should consider algorithms as a technology, alongside technologies such as fast hardware, graphical user interfaces, object-oriented systems, and networks.
The algorithms are written in a pseudocode which, although not directly translatable to any conventional programming language, conveys the structure of the algorithm clearly enough that you should be able to implement it in the language of your choice. We determine these running times in Chapter 2, and we develop a useful notation to express them. The rest of Chapter 3 is primarily a presentation of mathematical notation, more to ensure that your use of notation matches that in this book than to teach you new mathematical concepts.
Chapter 4 contains methods for solving recurrences, which are useful for describing the running times of recursive algorithms.
Although much of Chapter 4 is devoted to proving the correctness of the master method, you may skip this proof yet still employ the master method. Chapter 5 introduces probabilistic analysis and randomized algorithms.
We typically use probabilistic analysis to determine the running time of an algorithm in cases in which, due to the presence of an inherent probability distribution, the running time may differ on different inputs of the same size. In some cases, we assume that the inputs conform to a known probability distribution, so that we are averaging the running time over all possible inputs.
In other cases, the probability distribution comes not from the inputs but from random choices made during the course of the algorithm. An algorithm whose behavior is determined not only by its input but by the values produced by a random-number generator is a randomized algorithm. We can use randomized algorithms to enforce a probability distribution on the inputs—thereby ensuring that no particular input always causes poor performance—or even to bound the error rate of algorithms that are allowed to produce incorrect results on a limited basis.
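One common way to enforce a probability distribution on the inputs is to permute them at random before processing. The sketch below is an assumption-laden illustration rather than an algorithm from the text: `randomized_sort` is a hypothetical name, it shuffles a copy of the input with Python's `random.Random.shuffle`, and it then applies an ordinary insertion sort.

```python
import random

def randomized_sort(items, rng=None):
    """Shuffle a copy of items uniformly at random, then insertion-sort it."""
    if rng is None:
        rng = random.Random()  # unseeded generator: behavior varies per run
    a = list(items)
    rng.shuffle(a)  # the order now depends on random choices, not the caller
    for j in range(1, len(a)):
        key = a[j]
        i = j - 1
        while i >= 0 and a[i] > key:
            a[i + 1] = a[i]
            i -= 1
        a[i + 1] = key
    return a
```

The output is the same sorted list for any random choices; only the amount of work varies from run to run, so no particular input order always causes poor performance. Passing a seeded generator such as `random.Random(0)` makes a run reproducible.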
On the other hand, you probably have not already seen most of the material in Part I. Why is the study of algorithms worthwhile?
What is the role of algorithms relative to other technologies used in computers? In this chapter, we will answer these questions. An algorithm is thus a sequence of computational steps that transform the input into the output. For example, we might need to sort a sequence of numbers into nondecreasing order.
This problem arises frequently in practice and provides fertile ground for introducing many standard design techniques and analysis tools.
For example, given the input sequence ⟨31, 41, 59, 26, 41, 58⟩, a sorting algorithm returns as output the sequence ⟨26, 31, 41, 41, 58, 59⟩. Such an input sequence is called an instance of the sorting problem. In general, an instance of a problem consists of the input (satisfying whatever constraints are imposed in the problem statement) needed to compute a solution to the problem. As a result, we have a large number of good sorting algorithms at our disposal. Which algorithm is best for a given application depends on, among other factors, the number of items to be sorted, the extent to which the items are already somewhat sorted, possible restrictions on the item values, the architecture of the computer, and the kind of storage devices to be used: main memory, disks, or even tapes.
An algorithm is said to be correct if, for every input instance, it halts with the correct output. We say that a correct algorithm solves the given computational problem. An incorrect algorithm might not halt at all on some input instances, or it might halt with an incorrect answer.
Contrary to what you might expect, incorrect algorithms can sometimes be useful, if we can control their error rate. Ordinarily, however, we shall be concerned only with correct algorithms.
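For the sorting problem, "the correct output" can be stated as a checkable condition: the output must be a reordering of the input and must be in nondecreasing order. The following sketch expresses that check; the function name `is_correct_sort_output` is hypothetical.

```python
from collections import Counter

def is_correct_sort_output(inp, out):
    """Return True if out is a nondecreasing reordering of inp."""
    # Same elements with the same multiplicities (a permutation of the input).
    same_elements = Counter(inp) == Counter(out)
    # Each element is no larger than its successor.
    nondecreasing = all(out[i] <= out[i + 1] for i in range(len(out) - 1))
    return same_elements and nondecreasing
```

Applied to the instance 31, 41, 59, 26, 41, 58 and the output 26, 31, 41, 41, 58, 59, the check succeeds; it fails if an element is dropped or if the output is left unsorted.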
What kinds of problems are solved by algorithms? Sorting is by no means the only computational problem for which algorithms have been developed. You probably suspected as much when you saw the size of this book.
In this book, we shall typically describe algorithms as programs written in a pseudocode that is similar in many respects to C, Pascal, or Java.