This series of blog is to share some thoughts of hard questions in leetcode. Generally, problems tagged as "hard" in leetcode varies in difficulty. Some of them need heavy coding and complex corner cases, some of them need greedy solution which is way too intuitive, I don't like them so I might skip those. It's fun to dive deep into questions that can be solved by some specific way because people always wants to find a "once for all" solution. And that is the purpose of my recording the solution of this questions.
I will try to illustrate the ideas nice and clean, all the descriptions will follow a certain format: What is the question? (generalized one or original one) Using which skills to solve which part of that question? Other things to mention (may be trivial or skills that can not be categorized and or rarely)
My target to enhance my skill to reduce a question to a specific model, anyway, it's interesting.
This is to find the median in two sorted arrays, using divide and conquer to reduce the scope to search. Should use a recursive function.
This is to match given string with a regular expression. We use backtrack to get the result.
This is to merge k sorted linkedlists in to one linkedlist. We use custom comparator and a heap to get the smallest result in the current heads of the given list.
This is to reverse nodes k by k in a linkedlist. Reduce this question to "reverse a linkedlist".
To find the start index of a sequence which consists of all the words in a given list (may not be distinct) exactly once. So first build a trie with that list, then move back to the next position after match start when (discuss in different situations)
To find the Longest Valid Parentheses subsequence in a string. Stack and dp both work. If use dp, dp[i] is the longest "single parenthese group"(some thing like "(())" instead of "(())()"). After get the dp array, it's easy to get the longest continuous substring, just do a one time loop again. (Or you can do it inside the loop, this is why it's a dp solution) If use stack, just match the current ')' with previous '(' and add 2 which is the current pair into previous result, a easy trick in python is to store both '(' and number of matched pairs in a stack, that is more intuitive, otherwise store -1 into that stack is ok for other languages. The result is the running maximum of the current result.
To solve a sudoku. Nothing special, backtrack and validate the final result.
To find the first missing positive in a unsorted array in O(n). Idea is based on discussion in different situations, if the smallest missing positive is larger than 1, then our task is simple, just find the minimum positive number and minus one. Otherwise, the array must be somthing like "1 2 7 10 20..." when it is sorted. So we do a O(n) index sort for that array. This is to ensure elements with index within in the range of [1,len(A)] will be insert to the right position at least once.
To simulate a situation where rain falls into a 2D space. Stack is used here to add water from the current position and previous position. And this is also a monotonic queue which maintains a decreasing sequence.
To match the target string using a simplified regex expression including '*' and '?'. Using dp to represent
No need to explain, classic backtrack.
To get the k-th element of the permutation sequence of n. The number of elements with the same starting number is the same for each permutation and sub permutation. So just choose the first number and second number...based on how many k left. Any recursive function will work.
To validate whether a string is valid as a number, scientific notation allowed. So this is a tipically problem with many corner cases, here I list some of them
"+."
"4e+"
"e"
"."
"+.1"
".e1"
"01"
Give some words and a maxwidth. Output a list of lines showing how these words will be output in the screen. Following the rules given by the question, it's complex but not hard actually.
Give two strings. Find the minimum number of modifications (add,delete,replace) of one string to change it into another. Using dp to solve it. dp[i][j] means the minimum step of operation to change str1[1..i] to str2[1..j]
Part of my python code is
for i in range(1,n+1):
for j in range(1,m+1):
c1,c2 = word1[i-1],word2[j-1]
if c1 == c2:
dp[i][j] = dp[i-1][j-1]
else:
dp[i][j] = min(dp[i-1][j],dp[i][j-1],dp[i-1][j-1]) + 1
Give a target string and base string, find a substring (of base string) to ensure it covers the target string when re-arranged. Using sliding window, in each step, there should a variable indicating whether the condition is meet or not. I used a set to store all the necessary characters in the target and another map to change that set. This post use a map (shared with previous count map) and a number to do this.
start = 0
for end in 1...n
update with new end
if [start..end] is valid:
while start < end and [start + 1..end] is still valid:
update //start is removed
start ++
[start...end] is the smallest window ending in end
Or we can have another structure using while
start,end = 0,0
update
while end <n:
if [start...end] is valid:
[start...end] may be the smallest window ending in end
update // start is removed
start++
if start > end:
update // start + 1 is added
end += 1
else:
end += 1
update // end + 1 is added
Choosing which template depends on **Whether calculating the validity of the next step is costly or not". For example, if we know [start..end] is valid, and we want to know if [start+1..end] is valid or not, if this operation is complex, then we choose the second one. Otherwise, the first one is cleaner and easier.
To get the maximum rectangle size with in a histogram with postive integer coordinates. Using stack to solve it. When using stack to solve these kind of specific question, there are some certain things that should consider:
To get the maximum rectange full of 1 inside a matrix of 1 and 0. For each row, this is the same to Largest Rectangle in Histogram, we only need to find the ending of each "histogram" in each row.
To get the first k numbers in a binary tree closest to the given target. Actually if k = n, the time complexity is O(n) since the result array has n elements. But we can do it in O(klogn) by finding the closest number in a loop. The idea is to find the maximum number smaller(or equals) to k and the minimum number larger(or equals) to k.
And we tag the nodes by binary index to identify whether this node is used or not. There is some discussion in different situations.
Find number within [low,high] ensuring the number is the same when rotated 180 degrees. This can be solved faster through complex logic, but for questions relating to symmetric reasoning. It's better to disscuss in different situations for whether it's odd or even. Here we just generate all possible answers and check whether it's within low and high or not. dp is a possible way to do that, dfs also works.
Paint house with a cost matrix (paint house i with color j costs cost[i][j]), with no two adjacent houses share the same color. dp[i][j] = minimum cost from 0-i with house i of color j, result is min(dp[-1])
Given a list of words sorted by a certain order (like a<b<c<d... or z<f<k<g...), get that order. We can get a certain order by two adjacent words, this order can be represented by an directed edge, finally we can get a graph. if it is a DAG, then we have the result by its topo order.
Get k values closest the target in a Binary Search Tree. A simple solution is traverse + two pointers, which is O(n). But for the follow up, the way to do it in O(klogn) can be: For each round, we select the maximum value smaller than target and minimum value greater than target. Compare their abs value and set a visited flag to the node with smaller abs value. The next round, if we encounter a node we visited in "find the maximum value smaller than target", we go to the left node, because we it's right child must have been visited before. It's the same for the right side.
Given a string of digits only, add '+-*' in any position between digits to make the expression value to equal to a target. Backtrack and try each possible position.
Core code, idea is to add previous number as default, and subtract from the sum when * is added behind
backtrack(j + 1, path + "*" + str(num), resultSoFar - prevNum + prevNum * num, prevNum * num)
Given a stream of numbers, find the median in O(logn) time. An idea is to maintain two heaps, one is a min heap and the other is a max heap. And maintain the size of two heaps has difference less than two. Or we can use segment tree and binary search to find kth element of the data stream. This is an overkill for such problems(But for this problem, it's slower since it's O(logn * logn)).
Given a grid with some points, return the minimum Manhattan Distance from any node to all these nodes. The chosen node and these nodes can overlap. The answer is pretty simple, the coordinate x,y of the chosen point is just the median of all the xs and ys. There is an intuitive idea about that: For any two points, the chosen point with minimum distance must in the shaded area. Even if we project that two points in the edges, the result will still be the same. So for all the points, we project them to edges and the result is obviously the median of all xs and ys.
Given a string s that contains parentheses and letters, remove the minimum number of invalid parentheses to make the input string valid.
My solution is quite lengthy and now doable in an interview, here is a version which I believe is simple and easy to implement
- Mention when removing ')', we must ensure we are not removing the last ')' of an expression like '((())', even if rightsToRemove > 0.