Leetcode Hard Page 1

Purpose

This series of blog is to share some thoughts of hard questions in leetcode. Generally, problems tagged as "hard" in leetcode varies in difficulty. Some of them need heavy coding and complex corner cases, some of them need greedy solution which is way too intuitive, I don't like them so I might skip those. It's fun to dive deep into questions that can be solved by some specific way because people always wants to find a "once for all" solution. And that is the purpose of my recording the solution of this questions.

I will try to illustrate the ideas nice and clean, all the descriptions will follow a certain format: What is the question? (generalized one or original one) Using which skills to solve which part of that question? Other things to mention (may be trivial or skills that can not be categorized and or rarely)

My target to enhance my skill to reduce a question to a specific model, anyway, it's interesting.

Median of Two Sorted Arrays

This is to find the median in two sorted arrays, using divide and conquer to reduce the scope to search. Should use a recursive function.

make sure we are always comparing number to number and index to index, for example normally we use 0-based index, if we want to know if the element with index of k can or can not be found within the first m elements, we should compare k+1 and m because both k+1 and m is the number, or k and m - 1. That's more reasonable to do it logically and can avoid silly off-by-one bugs.
make sure this recursive function will come to an end, especially if we use "mid index" which is "startIndex + endIndex + 1 >> 1" or "startIndex + endIndex >> 1".

Regular Expression Matching

This is to match given string with a regular expression. We use backtrack to get the result.

Merge k Sorted Lists

This is to merge k sorted linkedlists in to one linkedlist. We use custom comparator and a heap to get the smallest result in the current heads of the given list.

for python, we can define a class with "__lt__(self,other)" to do that trick (custom comparator)

Reverse Nodes in k-Group

This is to reverse nodes k by k in a linkedlist. Reduce this question to "reverse a linkedlist".

Substring with Concatenation of All Words

To find the start index of a sequence which consists of all the words in a given list (may not be distinct) exactly once. So first build a trie with that list, then move back to the next position after match start when (discuss in different situations)

match complete
cur char not in trie
end of a trie node, but the number of current word is out of limit

Longest Valid Parentheses

To find the Longest Valid Parentheses subsequence in a string. Stack and dp both work. If use dp, dp[i] is the longest "single parenthese group"(some thing like "(())" instead of "(())()"). After get the dp array, it's easy to get the longest continuous substring, just do a one time loop again. (Or you can do it inside the loop, this is why it's a dp solution) If use stack, just match the current ')' with previous '(' and add 2 which is the current pair into previous result, a easy trick in python is to store both '(' and number of matched pairs in a stack, that is more intuitive, otherwise store -1 into that stack is ok for other languages. The result is the running maximum of the current result.

Sudoku Solver

To solve a sudoku. Nothing special, backtrack and validate the final result.

First Missing Positive

To find the first missing positive in a unsorted array in O(n). Idea is based on discussion in different situations, if the smallest missing positive is larger than 1, then our task is simple, just find the minimum positive number and minus one. Otherwise, the array must be somthing like "1 2 7 10 20..." when it is sorted. So we do a O(n) index sort for that array. This is to ensure elements with index within in the range of [1,len(A)] will be insert to the right position at least once.

Trapping Rain Water

To simulate a situation where rain falls into a 2D space. Stack is used here to add water from the current position and previous position. And this is also a monotonic queue which maintains a decreasing sequence.

only if there are two walls before the current wall, the water can be kept, this is the key for simulation

Wildcard Matching

To match the target string using a simplified regex expression including '*' and '?'. Using dp to represent

dp[i][j] (1-based index) whether pattern[1..j] matches str[1..i], if pattern[j] is '*', we only need to know whether pattern[1..j-1] matches any str[1..k] (k in [1..j])

N-Queens,N-Queens II

No need to explain, classic backtrack.

Permutation Sequence

To get the k-th element of the permutation sequence of n. The number of elements with the same starting number is the same for each permutation and sub permutation. So just choose the first number and second number...based on how many k left. Any recursive function will work.

better to use division instead of a while loop when we need to subtract a constant value from a total multiple times.

Valid Number

To validate whether a string is valid as a number, scientific notation allowed. So this is a tipically problem with many corner cases, here I list some of them

"+."
"4e+"
"e"
"."
"+.1"
".e1"
"01"

Text Justification

Give some words and a maxwidth. Output a list of lines showing how these words will be output in the screen. Following the rules given by the question, it's complex but not hard actually.

Edit distance

Give two strings. Find the minimum number of modifications (add,delete,replace) of one string to change it into another. Using dp to solve it. dp[i][j] means the minimum step of operation to change str1[1..i] to str2[1..j]

Part of my python code is

for i in range(1,n+1):
    for j in range(1,m+1):
        c1,c2 = word1[i-1],word2[j-1]
        if c1 == c2:
            dp[i][j] = dp[i-1][j-1]
        else:
            dp[i][j] = min(dp[i-1][j],dp[i][j-1],dp[i-1][j-1]) + 1

it seems like many dp problem for strings need 1-based index and some initialization step to make the meaning correct

Minimum Window Substring

Give a target string and base string, find a substring (of base string) to ensure it covers the target string when re-arranged. Using sliding window, in each step, there should a variable indicating whether the condition is meet or not. I used a set to store all the necessary characters in the target and another map to change that set. This post use a map (shared with previous count map) and a number to do this.

Before determining the current interval is valid or not, for the end index, there are two "sliding window" boilerplates for us to use:

start = 0
for end in 1...n
  update with new end
  if [start..end] is valid:
    while start < end and [start + 1..end] is still valid:
      update   //start is removed
      start ++ 
    
    [start...end] is the smallest window ending in end

Or we can have another structure using while

start,end = 0,0
update
while end <n:
  if [start...end] is valid:
    [start...end] may be the smallest window ending in end

    update // start is removed
    start++
    
    if start > end:
      update // start + 1 is added
      end += 1
  else:
    end += 1
    update // end + 1 is added

Choosing which template depends on **Whether calculating the validity of the next step is costly or not". For example, if we know [start..end] is valid, and we want to know if [start+1..end] is valid or not, if this operation is complex, then we choose the second one. Otherwise, the first one is cleaner and easier.

notice in the second template, start + 1 or end + 1 may not exist, so we need some additional if condition

Largest Rectangle in Histogram

To get the maximum rectangle size with in a histogram with postive integer coordinates. Using stack to solve it. When using stack to solve these kind of specific question, there are some certain things that should consider:

Which info we have at every step and what should we store in the stack?
What is the property of the stack in each loop? Are we going to maintain a monotonic queue? increasing? decreasing?
Which info can be compressed or throwed? For example if we pop elements from a stack, we are compressing the info of these elements to the last element of that stack or just throw it away. In this question, we maintain a monotonic increasing queue, and for each new element smaller than previous one, we pop out the previous one and calculate if we extend the current element(the smaller one) to this element, what is the size of the rectangle. This is the maximum one because what we have is a monotonic queue and it can not be extended further left. So for each element, we know how long it can be extended to the left and we got the result.

Maximum Rectange

To get the maximum rectange full of 1 inside a matrix of 1 and 0. For each row, this is the same to Largest Rectangle in Histogram, we only need to find the ending of each "histogram" in each row.

Closest Binary Search Tree Value II (follow up)

To get the first k numbers in a binary tree closest to the given target. Actually if k = n, the time complexity is O(n) since the result array has n elements. But we can do it in O(klogn) by finding the closest number in a loop. The idea is to find the maximum number smaller(or equals) to k and the minimum number larger(or equals) to k.

And we tag the nodes by binary index to identify whether this node is used or not. There is some discussion in different situations.

Strobogrammatic Number III

Find number within [low,high] ensuring the number is the same when rotated 180 degrees. This can be solved faster through complex logic, but for questions relating to symmetric reasoning. It's better to disscuss in different situations for whether it's odd or even. Here we just generate all possible answers and check whether it's within low and high or not. dp is a possible way to do that, dfs also works.

Mention the leading zero is not a big issue if we need to convert the whole number, just add '0' into the interval result and ignore it at last.

Paint House II

Paint house with a cost matrix (paint house i with color j costs cost[i][j]), with no two adjacent houses share the same color. dp[i][j] = minimum cost from 0-i with house i of color j, result is min(dp[-1])

Alien Dictionary

Given a list of words sorted by a certain order (like a<b<c<d... or z<f<k<g...), get that order. We can get a certain order by two adjacent words, this order can be represented by an directed edge, finally we can get a graph. if it is a DAG, then we have the result by its topo order.

Mention two adjacent words could be illegal such as 'ab' followed by 'a'

Closest Binary Search Tree Value II

Get k values closest the target in a Binary Search Tree. A simple solution is traverse + two pointers, which is O(n). But for the follow up, the way to do it in O(klogn) can be: For each round, we select the maximum value smaller than target and minimum value greater than target. Compare their abs value and set a visited flag to the node with smaller abs value. The next round, if we encounter a node we visited in "find the maximum value smaller than target", we go to the left node, because we it's right child must have been visited before. It's the same for the right side.

Expression Add Operators

Given a string of digits only, add '+-*' in any position between digits to make the expression value to equal to a target. Backtrack and try each possible position.

Mention if we only generating different expressions and evaluate them, there will be TLE. We should at least get all the numbers we are using and put them in a stack and backtrack that stack.
Or we can calculate during backtrack which is complex but fast
https://leetcode.com/problems/expression-add-operators/discuss/572099/C%2B%2BJavaPython-Backtracking-and-Evaluate-on-the-fly-Clean-and-Concise

Core code, idea is to add previous number as default, and subtract from the sum when * is added behind

backtrack(j + 1, path + "*" + str(num), resultSoFar - prevNum + prevNum * num, prevNum * num)

Find Median from Data Stream

Given a stream of numbers, find the median in O(logn) time. An idea is to maintain two heaps, one is a min heap and the other is a max heap. And maintain the size of two heaps has difference less than two. Or we can use segment tree and binary search to find kth element of the data stream. This is an overkill for such problems(But for this problem, it's slower since it's O(logn * logn)).

Best Meeting Point

Given a grid with some points, return the minimum Manhattan Distance from any node to all these nodes. The chosen node and these nodes can overlap. The answer is pretty simple, the coordinate x,y of the chosen point is just the median of all the xs and ys. There is an intuitive idea about that: For any two points, the chosen point with minimum distance must in the shaded area. Even if we project that two points in the edges, the result will still be the same. So for all the points, we project them to edges and the result is obviously the median of all xs and ys.

Remove invalid parentheses

Given a string s that contains parentheses and letters, remove the minimum number of invalid parentheses to make the input string valid.

My solution is quite lengthy and now doable in an interview, here is a version which I believe is simple and easy to implement

https://leetcode.com/problems/remove-invalid-parentheses/discuss/459218/Python-solution-using-Smart-Backtracking-with-a-Simple-Trick-Beats-82-in-Time-and-100-in-Space
Mention when removing ')', we must ensure we are not removing the last ')' of an expression like '((())', even if rightsToRemove > 0.