- Citadel onsite confirmed, next Thursday. Choice of Chicago or New York. Jack said both offices were great. Chicago is a shorter flight, so we’ll go there. Fulltime can change after, if need be. Fly out midday wed.
- Very quick (2 question) interviewer feedback form for Amazon.
- California would be the 5th largest country economy, after Us China Japan Germany.
- Lots more system design videos. I really do love this stuff.
- Alphabet hit 1 trillion market cap today.
- Probabilistic solutions for hash collisions.
- When counting or checking existence, the natural solution is a hash table.
- The solutions here: use multiple hash functions, then compare. This makes you robust to collisions.
- Count min sketch algorithm.
- Used for counting how many times an item appears in an array.
- Iterate over the array, use multiple different hash functions on each item, then calc the min of all the integer counts produced by them.
- Bloom filter.
- Used to check if a record exists already (think “username taken”…).
- As each record is created, run it through multiple hash functions and put entries in a bit array.
- As a new one comes it, run it through all hash functions. If there’s already an entry for all, then you’re 90% sure it’s already taken.
- If one output doesn’t exist, then you’re 100% sure it is not already used, because you would have put an entry for it when the record was created.
- These typically use efficient hash functions and memory-efficient data structures.
- Allows you to do something like pass a bloom filter clientside so that a user can get immediate feedback in the browser if a name is taken. Just have to pass the K hash functions and a bit array of the possible outputs.
- Rate limiter design.
- Just think like github’s “remaining-requests” to the api. Keep a db which has a mapping of user: tokens. When they’re all used, reject requests.
- You’ll usually need to store timestamps too, so you know how the usage has been spread over your particular window (last minute/hour/whatever). Just like a queue: [ts1, ts2, ts3…]. You can count them as desired. Sliding window.
- Disney finally got back to me, they’ll schedule a phone interview for ASAP. Reminder the comp package is much less than the others: ~190k base, no equity, contract 18mo, half benefits, some disney perks like movie screenings. I’ll see how the phone goes, then balance with my current burden of the others before committing to an onsite interview. They’re glendale, which is a bit more convenient than the remotes.
- Nosql databases can be indexed for speed as well. It’s usually a compound index of multiple keys mushed together. Remember that an index is just something you can query quickly because it’s sorted.
- Sam Harris is so good: https://www.youtube.com/watch?v=LRm_H158qc0. Be grateful for things that haven’t happened. Easy way to stay in the moment, impervious to negative emotion.
- Downloaded Waking Up. Will read after jim simons’ bio.
- To create the floodfill feature (the paint bucket in mspaint), just use a graph traversal (either BFS or DFS) where the neighbors are all 4 nodes in the cardinal directions. If the value is the same, update to new color.
- Practice problems.
- Focused mostly on DS&A today. Tried a new website for diversity, geeksforgeeks.
- Good summary of everything: https://www.geeksforgeeks.org/must-do-coding-questions-for-companies-like-amazon-microsoft-adobe/.
- In general, geeksforgeeks has pretty good tutorials/descriptions. The text is better, the IDE/leaderboard/etc is worse.
- Poison bottles and rats to test. Each iteration, have the rat drink half the bottles. Reduces it to binary search basically, logn.
- Binary trees. Find max width, height, sum of columns, etc. Just do -1 for each left traversal, +1 for each right traversal. Keep in a global hash table or something. Then you can run whatever stats you want on it.
- Ended up switching back to leetcode. Geeksforgeeks has a bad interface. It requires you to supply your own input/output reader/write, rather than making it class or function based with prefilled templates.
- https://leetcode.com/problems/continuous-subarray-sum/. This one wasn’t easy. Find modulo of cumsum, check to see if two are the same, which means the cumsum between them is divisible by the target.
- https://leetcode.com/problems/baseball-game. Easy.
- https://leetcode.com/problems/top-k-frequent-words. Easy. Remember you can heap for nlogk instead of nlogn.
- https://leetcode.com/problems/spiral-matrix. Walk 2d grid in spiral. Just define the 4 directions, loop through them. Take whole row, take first or last item in all rows, and reverse as necessary.
- https://leetcode.com/problems/median-of-two-sorted-arrays/. Didn’t do it, just reread. This is a great article: https://medium.com/@hazemu/finding-the-median-of-2-sorted-arrays-in-logarithmic-time-1d3f2ecbeb46. The problem is way out of scope for an interview.
- https://leetcode.com/problems/longest-palindromic-substring. Did the “expand around all centers” way.
- https://leetcode.com/problems/trapping-rain-water/. Trapped rain water, iterate once from left and once from right. Calculate max and deltas, then compare and take min. That’s the trapped water.
- https://leetcode.com/problems/rotate-image/. Rotate image. If you’re allowed to use extra space, just fill in the new matrix. Write a couple down to see the pattern. New value is just (j, n-i-1). To do the rotation in place, go onion layer by onion layer and do 1, 2, 3, 4 = 1, 2, 3, 4.
- https://leetcode.com/problems/jump-game. Did DFS. There’s a much cleaner solution, basically start at the end and check if you can reach this position, then iteration back through the whole range.
- https://leetcode.com/problems/merge-intervals. Good example of making assumptions and testing them out. Did it vocally. Would have clarified with interviewer if live, rather than breaking/fixing with testcases.
- Applying locks, mutexes, semaphores to a concurrent program in order to avoid race conditions: synchronization.
- Priority inversion: when a high priority tasks is waiting for a low priority task to finish. Priority inheritance: when the low priority tasks becomes high priority bc the next job in the queue is high priority, to make sure nothing takes it away.
- In python2, xrange was more memory efficient than range. It wouldn’t create the full list, it would lazily generate them as you need. In python3, range is xrange.
- Python’s heapq.heapify will take a list and make a heap in linear time (in place too!). There’s also nlargest, nsmallest.
- pop(0) for first element, pop() for last.
- collections.Counter(arr).mostcommon(k) is very useful. It doesn’t have a nested sort, after count it’s arbitrary. To enforce alphabetical or something on the second level, just write it yourself.