Friday, October 07, 2016

Technical Interviews

[Kenny Yu, Harvard University]
[This is CS50.] [CS50.TV]
Hi everyone, I'm Kenny. I am currently a junior studying computer science.
I'm a former CS TF, and I wish I had this when I was an underclassman,
and that's why I'm giving this seminar.
So I hope you enjoy it.
This seminar is about technical interviews,
and all my resources can be found at this link,
this link right here, a couple of resources.
So I made a list of problems, actually, quite a few problems.
Also a general resources page where we can find tips
on how to prepare for an interview,
tips on what you should do during an actual interview,
as well as how to approach problems and resources for future reference.
It's all online.
And just to preface this seminar, a disclaimer,
like this should not--your interview preparation
should not be limited to this list.
This is only meant to be a guide,
and you should definitely take everything I say with a grain of salt,
but also use everything I used to help you in your interview preparation.
I'm going to speed through the next few slides
so we can get to the actual case studies.
The structure of an interview for a software engineering postion,
typically it is 30 to 45 minutes,
multiple rounds, depending on the company.
Often you'll be coding on a white board.
So a white board like this, but often on a smaller scale.
If you're having a phone interview, you'll probably be using
either collabedit or a Google Doc so they can see you live coding
while you're being interviewed over the phone.
An interview itself is typically 2 or 3 problems
testing your computer science knowledge.
And it will almost definitely involve coding.
The types of questions that you'll see are usually data structures and algorithms.
And in doing these kinds of problems,
they will ask you, like, what is the time and space complexity, big O?
Often they also ask higher-level questions,
so, designing a system,
how would you lay out your code?
What interfaces, what classes, what modules do you have in your system,
and how do these interact together?
So data structures and algorithms as well as designing systems.
Some general tips before we dive in to our case studies.
I think the most important rule is always be thinking out loud.
The interview is supposed to be your chance to show off your thinking process.
The point of the interview is for the interviewer to gauge
how you think and how you go through a problem.
The worst thing you can do is be silent throughout the whole interview.
That's just no good.
When you are given a question, you also want to make sure you understand the question.
So repeat the question back in your own words
and attempt to work thorough a few simple test cases
to make sure you understand the question.
Working through a few test cases will also give you an intuition on how to solve this problem.
You might even discover a few patterns to help you solve the problem.
Their big tip is to not get frustrated.
Don't get frustrated.
Interviews are challenging, but the worst thing you can do,
in addition to being silent, is to be visibly frustrated.
You do not want to give that impression to an interviewer.
One thing that you--so, many people, when they go into an interview,
they attempt to try to find the best solution first,
when really, there's usually a glaringly obvious solution.
It might be slow, it might be inefficient, but you should just state it,
just so you have a starting point from which to work better.
Also, pointing out the solution is slow, in terms of
big O time complexity or space complexity,
will demonstrate to the interviewer that you understand
these issues when writing code.
So don't be afraid to come up with the simplest algorithm first
and then work better from there.
Any questions so far? Okay.
So let's dive into our first problem.
"Given an array of n integers, write a function that shuffles the array
in place such that all permutations of the n integers are equally likely."
And assume you have available a random integer generator
that generates an integer in a range from 0 to i, half range.
Does everyone understand this question?
I give you an array of n integers, and I want you to shuffle it.
In my directory, I wrote a few programs to demonstrate what I mean.
I'm going to shuffle an array of 20 elements,
from -10 to +9,
and I want you to output a list like this.
So this is my sorted input array, and I want you to shuffle it.
We'll do it again.
Does everyone understand the question? Okay.
So it's up to you.
What are some ideas? Can you do it as n^2, n log n, n?
Open to suggestions.
Okay. So one idea, suggested by Emmy,
is first compute a random number, random integer, in a range from 0 to 20.
So assume our array has a length of 20.
In our diagram of 20 elements,
this is our input array.
And now, her suggestion is to create a new array,
so this will be the output array.
And based on the i returned by rand--
so if i was, let's say, 17,
copy the 17th element into the first position.
Now we need to delete--we need to shift all the elements here
over so that we have a gap at the end and no holes in the middle.
And now we repeat the process.
Now we pick a new random integer between 0 and 19.
We have a new i here, and we copy this element into this position.
Then we shift items over and we repeat the process until we have our full new array.
What is the run time of this algorithm?
Well, let's consider the impact of this.
We are shifting every element.
When we remove this i, we are shifting all the elements after it to the left.
And that is an O(n) cost
because what if we remove the first element?
So for each removal, we remove--
each removal incurs an O(n) operation,
and since we have n removals, this leads to an O(n^2) shuffle.
Okay. So good start. Good start.
Another suggestion is to use something known as the Knuth shuffle,
or the Fisher-Yates shuffle.
And it's actually a linear time shuffle.
And the idea is very similar.
Again, we have our input array,
but instead of using two arrays for our input/output,
we use the first portion of the array to keep track of our shuffled portion,
and we keep track, and then we leave the rest of our array for the unshuffled portion.
So here's what I mean. We start off with--we choose an i,
an array from 0 to 20.
Our current pointer is pointing to the first index.
We choose some i here
and now we swap.
So if this was 5 and this was 4,
the resulting array will have 5 here and 4 here.
And now we note a marker here.
Everything to the left is shuffled,
and everything to the right is unshuffled.
And now we can repeat the process.
We choose a random index between 1 and 20 now.
So suppose our new i is here.
Now we swap this i with our current new position here.
So we do a swapping back and forth like this.
Let me bring up the code to make it more concrete.
We start with our choice of i--
we start with i equal to 0, we pick a random location j
in the unshuffled portion of the array, i to n-1.
So if I'm here, choose a random index between here and the rest of the array,
and we swap.
This is all the code necessary to shuffle your array.
Any questions?
Well, one needed question is, why is this correct?
Why is every permutation equally likely?
And I won't go through the proof of this,
but many problems in computer science can be proven through induction.
How many of you are familiar with induction?
Okay. Cool.
So you can prove the correctness of this algorithm by simple induction,
where your induction hypothesis would be, assume that
my shuffle returns every permutation equally likely
up to the first i elements.
Now, consider i + 1.
And by the way we choose our index j to swap,
this leads to--and then you work out the details,
at least a full proof of why this algorithm returns
every permutation with equally likely probability.
All right, next problem.
So "given an array of integers, postive, zero, negative,
write a function that calculates the maximum sum
of any continueous subarray of the input array."
An example here is, in the case where all numbers are positive,
then currently the best choice is to take the whole array.
1, 2, 3, 4, equals 10.
When you have some negatives in there,
in this case we just want the first two
because choosing -1 and/or -3 will bring our sum down.
Sometimes we might have to start in the middle of the array.
Sometimes we want to choose nothing at all; it's best to not take anything.
And sometimes it's better to take the fall,
because the thing after it is super big. So any ideas?
(student, unintelligible) >>Yeah.
Suppose I don't take -1.
Then either I choose 1,000 and 20,000,
or I just choose the 3 billion.
Well, the best choice is to take all the numbers.
This -1, despite being negative,
the whole sum is better than were I not to take -1.
So one of the tips I mentioned earlier was to state the clearly obvious
and the brute force solution first.
What is the brute force solution in this problem? Yeah?
[Jane] Well, I think the brute force solution
would be to add up all the possible combinations (unintelligible).
[Yu] Okay. So Jane's idea is to take every possible--
I'm paraphrasing--is to take every possible continuous subarray,
compute its sum, and then take the maximum of all the possible continuous subarrays.
What uniquely identifies a subarray in my input array?
Like, what two things do I need? Yeah?
(student, unintelligible) >>Right.
A lower bound on the index and an upper bound index
uniquely determines a continuous subarray.
[Female student] Are we estimating it's an array of unique numbers?
[Yu] No. So her question is, are we assuming our array--
is our array all unique numbers, and the answer is no.
If we use our brute force solution, then the start/end indices
uniquely determines our continuous subarray.
So if we iterate for all possible start entries,
and for all end entries > or = to start, and < n,
you compute the sum, and then we take the maximum sum we've seen so far.
Is this clear?
What is the big O of this solution?
Not quite n^2.
Note that we iterate from 0 to n,
so that's one for loop.
We iterate again from almost the beginning to the end, another for loop.
And now, within that, we have to sum up every entry,
so that's another for loop. So we have three nested for loops, n^3.
Okay. This goes as a starting point.
Our solution is no worse than n^3.
Does everyone understand the brute force solution?
Okay. A better solution is using an idea called dynamic programming.
If you take CS124, which is Algorithms and Data Structures,
you will become very familiar with this technique.
And the idea is, try to build up solutions to smaller problems first.
What I mean by this is, we currently have to worry about two things: start and end.
And that's annoying. What if we could get rid of one of those parameters?
One idea is to--we're given our original problem,
find the maximum sum of any subarray in a range [O,n-1].
And now we have a new subproblem, where we know, in our current index i,
we know we must conclude there. Our subarray must end at the current index.
So the remaining problem is, where should we start our subarray?
Does this make sense? Okay.
So I've coded this up, and let's look at what this means.
In the codirectory, there's a program called subarray,
and it takes number of items,
and it returns the maximal subarray sum in my shuffled list.
So in this case, our maximal subarray is 3.
And that's taken by just using 2 and 1.
Let's run it again. It's also 3.
But this time, note how we got the 3.
We took the--we just take the 3 itself
because it's surrounded by negatives on both sides,
which will bring a sum < 3.
Let's run on 10 items.
This time it's 7, we take the leading 3 and 4.
This time it's 8, and we obtain that by taking 1, 4 and 3.
So to give you an intuition on how we can solve this transformed problem,
let's take a look at this subarray.
We're given this input array, and we know the answer is 8.
We take the 1, 4, and 3.
But let's look at how we actually got that answer.
Let's look at the maximal subarray that ended at each of these indices.
What's the maximal subarray that has to end at the first position?
[Student] Zero. >>Zero. Just don't take the -5.
Here it's going to be 0 as well. Yeah?
(student, unintelligible)
[Yu] Oh, sorry, it is a -3.
So this is a 2, this is a -3.
Okay. So -4, what's the maximal subarray to end that position
where -4 is at? Zero.
One? 1, 5, 8.
Now, I must end at the location where -2 is at.
So 6, 5, 7, and the last one is 4.
Knowing that these are my entries
for the transformed problem where I must end at each of these indices,
then my final answer is just, take a sweep across,
and take the maximum number.
So in this case it's 8.
This implies that the maximal subarray ends at this index,
and started somewhere before it.
Does everyone understand this transformed subarray?
Okay. Well, let's figure out the recurrence for this.
Let's consider just the first few entries.
So here it was 0, 0, 0, 1, 5, 8.
And then there was a -2 here, and that brought it down to 6.
So if I call the entry at position i subproblem(i),
how can I use the answer to a previous subproblem
to answer this subproblem?
If I look at, let's say, this entry.
How can I compute the answer 6 by looking at
a combination of this array and the answers to previous subproblems in this array? Yes?
[Female student] You take the array of sums
in the position right before it, so the 8,
and then you add the current subproblem.
[Yu] So her suggestion is to look at these two numbers,
this number and this number.
So this 8 refers to the answer for the subproblem (i - 1).
And let's call my input array A.
In order to find a maximal subarray that ends at position i,
I have two choices: I can either continue the subarray
that ended at the previous index, or start a new array.
If I were to continue the subarray that started at the previous index,
then the maximum sum I can achieve is the answer to the previous subproblem
plus the current array entry.
But, I also have the choice of starting a new subarray,
in which case the sum is 0.
So the answer is max of 0, subproblem i - 1, plus the current array entry.
Does this recurrence make sense?
Our recurrence, as we just discovered, is subproblem i
is equal to the maximum of the previous subproblem plus my current array entry,
which means continue the previous subarray,
or 0, start a new subarray at my current index.
And once we have built up this table of solutions, then our final answer,
just do a linear sweep across the subproblem array
and take the maximum number.
This is an exact implementation of what I just said.
So we create a new subproblem array, subproblems.
The first entry is either 0 or the first entry, the maximum of those two.
And for the rest of the subproblems
we use the exact recurrence we just discovered.
Now we compute the maximum of our subproblems array, and that's our final answer.
So how much space are we using in this algorithm?
If you've only taken CS50, then you might not have discussed space very much.
Well, one thing to note is that I called malloc here with size n.
What does that suggest to you?
This algorithm uses linear space.
Can we do better?
Is there anything that you notice that is unnecessary to compute the final answer?
I guess a better question is, what information
do we not need to carry all the way through to the end?
Now, if we look at these two lines,
we only care about the previous subproblem,
and we only care about the maximum we've ever seen so far.
To compute our final answer, we don't need the entire array.
We only need the last number, last two numbers.
Last number for the subproblem array, and last number for the maximum.
So, in fact, we can fuse these loops together
and go from linear space to constant space.
Current sum so far, here, replaces the role of subproblem, our subproblem array.
So current sum, so far, is the answer to the previous subproblem.
And that sum, so far, takes the place of our max.
We compute the maximum as we go along.
And so we go from linear space to constant space,
and we also have a linear solution to our subarray problem.
These kinds of questions you will get during an interview.
What is the time complexity; what is the space complexity?
Can you do better? Are there things that are unnecessary to keep around?
I did this to highlight analyses that you should take on your own
as you're working through these problems.
Always be asking yourself, "Can I do better?"
In fact, can we do better than this?
Sort of a trick question. You can't, because you need to
at least read the input to the problem.
So the fact that you need to at least read the input to the problem
means that you can't do better than linear time,
and you can't do better than constant space.
So this is, in fact, the best solution to this problem.
Questions? Okay.
Stock market problem:
"Given an array of n integers, positive, zero, or negative,
that represent the price of a stock over n days,
write a function to compute the maximum profit you can make
given that you buy and sell exactly 1 stock within these n days."
Essentially, we want to buy low, sell high.
And we want to figure out the best profit we can make.
Going back to my tip, what is the immediately clear, simplest answer, but it's slow?
Yes? (student, unintelligible) >>Yes.
>>So you would just go though and look at the stock prices
at each point in time, (unintelligible).
[Yu] Okay, so her solution--her suggestion of computing
the lowest and computing the highest doesn't necessarily work
because the highest might occur before the lowest.
So what is the brute force solution to this problem?
What are the two things that I need to uniquely determine the profit I make? Right.
The brute force solution is--oh, so, George's suggestion is we only need two days
to uniquely determine the profit of those two days.
So we compute every pair, like buy/sell,
compute the profit, which could be negative or positive or zero.
Compute the maximum profit that we make after iterating over all pairs of days.
That will be our final answer.
And that solution will be O(n^2), because there is n choose two pairs--
of days that you can choose among end days.
Okay, so I'm not going to go over the brute force solution here.
I'm going to tell you that there's an n log n solution.
What algorithm do you currently know that is n log n?
It's not a trick question.
Merge sort. Merge sort is n log n,
and in fact, one way of solving this problem is to use
a merge sort kind of idea called, in general, divide and conquer.
And the idea is as follows.
You want to compute the best buy/sell pair in the left half.
Find the best profit you can make, just with the first n over two days.
Then you want to oompute the best buy/sell pair on the right half,
so the last n over two days.
And now the question is, how do we merge these solutions back together?
Yes? (student, unintelligible)
>>Okay. So let me draw a picture.
Yes? (George, unintelligible)
>>Exactly. George's solution is exactly right.
So his suggestion is, first compute the best buy/sell pair,
and that occurs in the left half, so let's call that left, left.
Best buy/sell pair that occurs in the right half.
But if we only compared these two numbers, we're missing the case
where we buy here and sell somewhere in the right half.
We buy in the left half, sell in the right half.
And the best way to compute the best buy/sell pair that spans both halves
is to compute the minimum here and compute the maximum here
and take their difference.
So the two cases where the buy/sell pair occurs only here,
only here, or on both halves is defined by these three numbers.
So our algorithm to merge our solutions back together,
we want to compute the best buy/sell pair
where we buy on the left half and sell on the right half.
And the best way to do that is to compute the lowest price in the first half,
the highest price in the right half, and take their difference.
The resulting three profits, these three numbers, you take the maximum of the three,
and that's the best profit you can make over these first and end days.
Here the important lines are in red.
This is a recursive call to compute the answer in the left half.
This is a recursive call to compute the answer in the right half.
These two for loops compute the min and the max on the left and right half, respectively.
Now I compute the profit that spans both halves,
and the final answer is the maximum of these three.
So, sure, we have an algorithm, but the bigger question is,
what is the time complexity of this?
And the reason why I mentioned merge sort is that this form of divide the answer
into two and then merging our solutions back together
is exactly the form of merge sort.
So let me go through the duration.
If we defined a function t(n) to be the number of steps
for n days,
our two recursive calls
are each going to cost t(n/2),
and there's two of these calls.
Now I need to compute the minimum of the left half,
which I can do in n/2 time, plus the maximum of the right half.
So this is just n.
And then plus some constant work.
And this recurrence equation
is exactly the recurrence equation for merge sort.
And we all know that merge sort is n log n time.
Therefore, our algorithm is also n log n time.
Does this iteration make sense?
Just a brief recap of this:
T(n) is the number of steps to compute the maximum profit
over the course of n days.
The way we split up our recursive calls
is by calling our solution on the first n/2 days,
so that's one call,
and then we call again on the second half.
So that's two calls.
And then we find a minimum on the left half, which we can do in linear time,
find the maximum of the right half, which we can do in linear time.
So n/2 + n/2 is just n.
Then we have some constant work, which is like doing arithmetic.
This recurrence equation is exactly the recurrence equation for merge sort.
Hence, our shuffle algorithm is also n log n.
So how much space are we using?
Let's go back to the code.
A better question is, how many stack frames do we ever have at any given moment?
Since we're using recursion,
the number of stack frames determines our space usage.
Let's consider n = 8.
We call shuffle on 8,
which will call shuffle on the first four entries,
which will call a shuffle on the first two entries.
So our stack is--this is our stack.
And then we call shuffle again on 1,
and that's what our base case is, so we return immediately.
Do we ever have more than this many stack frames?
No. Because each time we do an invocation,
a recursive invocation to shuffle,
we divide our size in half.
So the maximum number of stack frames we ever have at any given moment
is on the order of log n stack frames.
Each stack frame has constant space,
and therefore the total amount of space,
the maximum amount of space we ever use is O(log n) space
where n is the number of days.
Now, always ask yourself, "Can we do better?"
And in particular, can we reduce this to a problem we've already solved?
A hint: we only discussed two other problems before this, and it's not going to be shuffle.
We can convert this stock market problem into the maximal subarray problem.
How can we do this?
One of you? Emmy?
(Emmy, unintelligible)
[Yu] Exactly.
So the maximal subarray problem,
we're looking for a sum over a continuous subarray.
And Emmy's suggestion for the stocks problem,
consider the changes, or the deltas.
And a picture of this is--this is the price of a stock,
but if we took the difference between each consecutive day--
so we see that the maximum price, maximum profit we could make
is if we buy here and sell here.
But let's look at the continuous--let's look at the subarray problem.
So here, we can make--going from here to here,
we have a positive change, and then going from here to here we have a negative change.
But then, going from here to here we have a huge positive change.
And these are the changes that we want to sum up to get our final profit.
Then we have more negative changes here.
The key to reducing our stock problem into our maximal subarray problem
is to consider the deltas between days.
So we create a new array called deltas,
initialize the first entry to be 0,
and then for each delta(i), let that be the difference
of my input array(i), and array(i - 1).
Then we call our routine procedure for a maximal subarray
passing in a delta's array.
And because maximal subarray is linear time,
and this reduction, this process of creating this delta array,
is also linear time,
then the final solution for stocks is O(n) work plus O(n) work, is still O(n) work.
So we have a linear time solution to our problem.
Does everyone understand this transformation?
In general, a good idea that you should always have
is try to reduce a new problem that you're seeing.
If it looks familiar to an old problem,
try reducing it to an old problem.
And if you can use all the tools that you've used on the old problem
to solve the new problem.
So to wrap up, technical interviews are challenging.
These problems are probably some of the more difficult problems
that you might see in an interview,
so if you don't understand all the problems that I just covered, it's okay.
These are some of the more challenging problems.
Practice, practice, practice.
I gave a lot of problems in the handout, so definitely check those out.
And good luck on your interviews. All my resources are posted at this link,
and one of my senior friends has offered to do mock technical interviews,
so if you're interested, email Will Yao at that email address.
If you have some questions, you can ask me.
Do you guys have specific questions related to technical interviews
or any problems we've seen so far?
Okay. Well, good luck on your interviews.