• Citadel onsite confirmed, next Thursday. Choice of Chicago or New York. Jack said both offices were great. Chicago is a shorter flight, so we’ll go there. Fulltime can change after, if need be. Fly out midday wed.
    • Very quick (2 question) interviewer feedback form for Amazon.
    • California would be the 5th largest country economy, after Us China Japan Germany.
    • Lots more system design videos. I really do love this stuff.
    • Alphabet hit 1 trillion market cap today.
    • Probabilistic solutions for hash collisions.
      • When counting or checking existence, the natural solution is a hash table.
      • The solutions here: use multiple hash functions, then compare. This makes you robust to collisions.
      • Count min sketch algorithm.
        • Used for counting how many times an item appears in an array.
        • Iterate over the array, use multiple different hash functions on each item, then calc the min of all the integer counts produced by them.
      • Bloom filter.
        • Used to check if a record exists already (think “username taken”…).
        • As each record is created, run it through multiple hash functions and put entries in a bit array.
        • As a new one comes it, run it through all hash functions. If there’s already an entry for all, then you’re 90% sure it’s already taken.
        • If one output doesn’t exist, then you’re 100% sure it is not already used, because you would have put an entry for it when the record was created.
      • These typically use efficient hash functions and memory-efficient data structures.
      • Allows you to do something like pass a bloom filter clientside so that a user can get immediate feedback in the browser if a name is taken. Just have to pass the K hash functions and a bit array of the possible outputs.
    • Rate limiter design.
      • Just think like github’s “remaining-requests” to the api. Keep a db which has a mapping of user: tokens. When they’re all used, reject requests.
      • You’ll usually need to store timestamps too, so you know how the usage has been spread over your particular window (last minute/hour/whatever). Just like a queue: [ts1, ts2, ts3…]. You can count them as desired. Sliding window.
    • Disney finally got back to me, they’ll schedule a phone interview for ASAP. Reminder the comp package is much less than the others: ~190k base, no equity, contract 18mo, half benefits, some disney perks like movie screenings. I’ll see how the phone goes, then balance with my current burden of the others before committing to an onsite interview. They’re glendale, which is a bit more convenient than the remotes.
    • Nosql databases can be indexed for speed as well. It’s usually a compound index of multiple keys mushed together. Remember that an index is just something you can query quickly because it’s sorted.
    • Sam Harris is so good: https://www.youtube.com/watch?v=LRm_H158qc0. Be grateful for things that haven’t happened. Easy way to stay in the moment, impervious to negative emotion.
      • Downloaded Waking Up. Will read after jim simons’ bio.
    • To create the floodfill feature (the paint bucket in mspaint), just use a graph traversal (either BFS or DFS) where the neighbors are all 4 nodes in the cardinal directions. If the value is the same, update to new color.
    • Practice problems.
    • Applying locks, mutexes, semaphores to a concurrent program in order to avoid race conditions: synchronization.
    • Priority inversion: when a high priority tasks is waiting for a low priority task to finish. Priority inheritance: when the low priority tasks becomes high priority bc the next job in the queue is high priority, to make sure nothing takes it away.
    • In python2, xrange was more memory efficient than range. It wouldn’t create the full list, it would lazily generate them as you need. In python3, range is xrange.
    • Python’s heapq.heapify will take a list and make a heap in linear time (in place too!). There’s also nlargest, nsmallest.
    • pop(0) for first element, pop() for last.
    • collections.Counter(arr).mostcommon(k) is very useful. It doesn’t have a nested sort, after count it’s arbitrary. To enforce alphabetical or something on the second level, just write it yourself.