January 2020 – Page 4 – Mostly Software

Jan 10, 2020

Petty released the sbsc winnings, $700 for 1st place, $50 for 10th. End of season party this saturday.
Path-planning is an np-complete problem. The traveling salesman is an np-complete problem. Dijkstra’s algorithm alone is solvable in polynomial time.

Traveling salesman is just visit every city in the country once in the shortest path. In other terms: visit every node in a graph once, where the edges can have different weights (like distances between cities).

! to run shell commands from ipython.
Spoke with the google recruiter. Gave her my feedback from the other day, and combined with the interviewer’s fb, decided we’ll do a 2nd phone interview. This is good – I wanted another datapoint. We’ll do it next tuesday.
With py requests, can pass it in directly with ?key=value or via params={key: value}
You should typically use the rel links for next and prev when dealing with a paginated API, rather than just looping over an integer range and building the URL yourself. Sometimes the url will change.
Dentist and haircut.
Interesting report on the (primarily) medical state of psychedelics, and the effect on market investments we’ll see in years to come: https://www.reddit.com/r/investing/comments/emkhwt/2020_psychedelic_industry_insights_report/.
Went through a bit of highscalability. It really is a great blog.
Jitter can be a good thing. Sometimes it’s best to intentionally add a bit of entropy in. Helps avoid thundering herds.
Spent most of today on the Netflix take-home assignment. Private github repo.
breakpoint() is only available in py3.7 and beyond, it’s still import pdb; pdb.set_trace() for all vers before then.

Jan 9, 2020

Lots of system design practice today.
NoSQL vs SQL.

NoSQL is useful when you always need all information about that row. SQL better for select x, NoSQL better for select *.
Don’t need NULLs. Just skip keys in the nosql json. In this way, the schema is basically flexible per entry.
Good for analytics.
NoSQL is more expensive for updates, and doesn’t guarantee ACID.
NoSQL is more expensive for reads. While you can grab a whole blob easily, getting only one col across the whole db is slow.
JSON nesting is nowhere near as structured as the the relationships (FK, many-many, etc).
Joins are weird. Not easy.
Overall: if your data is always read/written in blocks, not individual cols, nosql is good.
Key-value is just key-value. Document nosql can have structure and nesting within (JSON usually).
Examples:

Document nosql dbs: mongo, couch.
Key-value nosql dbs: dynamo, redis.
Wide col nosql dbs: cassandra.
Graph nosql dbs: neo4j.

10m call with Jack to set up the onsite with citadel, a few notes:

Will’s team is interested, but there are a few others as well (equities…). Interview might be at one site, but fulltime might be at the other.
You don’t need to have expertise in finance, but you need to show interest. The projects of last year are great examples.

ACID is a db transaction property for transactions. Atomicity, Consistency, Isolation, Durability. Financial institutions absolutely require this. Something like a metrics server wouldn’t.

Availability is another important metric, but isn’t part of ACID.

Can shard by ID, or by location, or whatever is the most convenient/divisive/common entry point.
JOINs get expensive if they’re cross-shard.
Master-slave architecture. Master gets all the writes, slaves update to match master every X sec/min/hours/days, slaves get all the reads.

If master goes down, one of the remaining slaves becomes master.
Still dumb that its formal name is master/slave in 2020.

Remember generators just return before they’re finished executing. Could be 3 lines straight with yield 1, yield 2, yield 3, but it’s usually a loop within the function. Just returns partial progress. Then you can iterate over the returns as they come, or call next(my_gen_func) to walk through manually.
Confirmed and planned the logistics for the amazon onsite on monday.

Need to print the NDA before then.

Consistency is very important. Do you want mutexes on everything read/write such that the data returned is always exact? This is much slower and much bigger. How much data can afford to be out-of-date? How much time must pass before data is considered obsolete?
If I scale a microservice horizontally, the machines need to talk to each other. RPC. If I scale vertically, it’s all local still. IPC. RPC is slower, and more complicated to manage. But it’s usually cheaper, and it’s boundless, and it’s more resilient to failure.
Remember: event-driven/pubsub/messagequeue/bus designs are worse with consistency, but better with availability. You’re distributing events and aggregating results. If you need higher fidelity responses, go with more of a request/response (timeout) design. Slower, but more integrity in consistent data.
Event-driven is easier to test, play, rollback, because it’s structured and sequenced. Kinda like react/redux.
In-mem solutions use snapshots to get persistence, but they’re obviously at a certain frequency and so you open risk for data lass.
Postgres has nosql features as well, it’s just used less commonly. You can store docs just like mongo.
Amazon 1hr onsite coaching call, general to all onsites.

Basic fundamentals. The one thing I usually have to remember is: Ask questions to clarify the problem. This especially applies to design questions, but can also apply to coding questions.

Amazon 30m final call, specific to me.

Schedule: (each 1hr)

2 technical: Problem solving, DS&A.
1 nontechnical with hiring manager. Customers, communication, dealing with adversity.
1 lunch. Culture. Ask questions. Relax.
1 technical: Logical and maintainable code.
1 technical: System design. Scalability. Operational performance.

Each will have 2 leadership principle questions (behavioral) as well.
General:

Ask clarifying questions. Think out loud.
Start with an easy solution. Brute force, whatever. Then expand. Take bite-sized bites.
The interviewer is your coworker! If you start to panic, look at them like a collaborator.

Always consider priority. A bank withdrawal is much more time-sensitive than an email notification. This could affect the ordering in the message queue, the rate limiter thresholds on various notes, the load balancing, etc.
When realtime accurate data is cumbersome, you can trend or interpolate data, based on history, to give an estimate. This is true for something like the viewcount on popular youtube channels. Would take too much to show exactly, but is easy to show ballpark.
Thrashing – when a cache is inserting/deleting too quickly.
Going through this process of interview prep has been interesting. I feel like I’ve consumed what must equate to at least a handful of semesters in fundamental CS courses. This accessibility of information makes me wonder about the future of education. It’s been maybe 6 weeks and $0, whereas the university equivalent would have been a year and $30k. Will our grandchildren get in-person degrees? I’m not so sure this model is viable in the future.
Ordered more dry-erase markers for whiteboard practice at home. Arrive tomorrow.
Netflix splits every piece of content into small atoms, stored with different permutations of codecs and resolutions. That’s why viewing is seamless when your internet quality changes, or you select a new one; it will simply then fetch the right chunk instead of the whole movie again. The amount of chunks it preloads into the future is based on heuristics, by how much people typically jump during this movie. If a lot -> preload only a little. If a little -> preload a lot.
Did a practice system design from scratch for instagram/twitter.
It looks like neither grubhub nor yelp do group carts anymore? I remember loving this feature. You’d just send the link to anyone, they’d look through then menu and add, then be responsible for the finances individually when the order was sent.
You could store images in a db as blobs (binary large objects), but it’s almost always still best to put them on a file system elsewhere (CDN, s3, etc) and keep only a URL to the file in your db.
A client-server relationship is always initiated by the client. Request -> response. The protocol for this is http.
A peer-peer relationship can be initiated by either side. You can use a general socket (TCP) or a protocol specific to your use case. For chat (whatsapp, fb messenger, etc) the protocol is XMPP – extensible messaging and presence protocol.
It’s ok to have gut instincts. It’s even ok if they’re wrong. State them, and then state that you’re going to analyze them now. Do the verification out loud. A raw, transparent thought process is worth a lot.
Designing google search.

First, just think about keeping a big dictionary. As people add sites to the internet, it adds words and phrases to the dictionary. You have a rough mapping of what is out there.
Then think of it like a gigantic hash table, or an indexed db. Look for the word (or words) in the search, and return the direct results.
That might not be all though – look for misspellings, common associations, and other relatives. This is where association algorithms come in. They’re usually based on prior results. There are many other keys to consider: location, how recent, etc. There are also blacklists for known spam, risks, more.
Then, you have a list of possible options. How do you rank them? What do you show at the top of the list? This is another ranking algorithm. Again, usually based on prior results.
How do you manage the size of the dictionary? MapReduce. Basically shard the hash table by characters (since those are they keys that the user is typing), farm out to the all the appropriate nodes to their more-efficient subsearches, then combine the data back.

~1 in every 7 strings typed into the google search engine has never been typed before!
Write-ahead logging is when the action is logged before it is performed. Think of it in the context of a database write operation – this handles the atomicity and durability principles (of ACID).
Third day of jeop goat.
Replication types. Consider a db mirror, or a master-slave.

Snapshot replications. Vompletely copy the full data over, like a backup or a pgdump.
Transactional replication. Replay the transaction on each slave.
There are more, but don’t worry a ton about them.

Jan 8, 2020

Think about if a system needs to be read-heavy or write-heavy. Something like twitter is much heavier on read operations.
An index is placed on a specific column in a table so that you can perform lookups on that column more quickly. It’s not the primary key, although you can think of the PK as an index if you know the ID, because that’s the search val. More generally, it allows an indexed column to be sorted efficiently on the backend, so that lookup is binary search (logn) instead of n. The downside: this extra data takes space.

Think of it like a dictionary (an actual physical book with word). Indexing it would mean adding pages for each letter in the appropriate place, with little tabs that jut out. You can find your section much faster, but they take a little space.

PUT will update the data in that spot or create it if it isn’t there. It’s idempotent. POST just passes data to the service. The service can do whatever it wants with it. It’s not idempotent. It’s more generic.
Citadel interview got a callback for onsites. Call tomorrow for scheduling.
YAML is a superset of JSON. Can do more, but is bigger. Looks better, too.
Testing pyramid: unit -> integrated -> e2e.
Netflix phone interview.

Great convo. Took a one-pager of notes. Just phone, no coding. 1hr.
Discussed the INTERNET SHIELD, market viability in the future of streaming, attrition, culture.
Her team directly aligns with my automation tools / devops / autotest / sx-setuptools experience.
Next steps: take-home assignment, build an app. I said I’d submit by Saturday.

DHCP for dynamic IP allocation.
Perf is a main linux profiler.
“Computers perform tasks. Machines should be solving problems.”
Practice problems:

https://leetcode.com/problems/search-insert-position. Easy.

Remember runbooks for ops/support. All stacks/apps should have a list of common tasks or inquiries, each with an explicit set of steps to respond with.
Slight tweaks to my resume. I still like my abridged resume more, although nobody else does. It’s just so concise: 20 bullet points to summarize my life so far.
TSLA is absolutely skyrocketing. Hit ~500 today. Its market cap is now 85b, which the highest for any american car company in history. I bought 2 shares on margin just to have fun with it. The emotional roller coaster is worth it, regardless of final profit/loss (since I jumped in ON a surge woohoooooo).
Remember margin trading on RH is interest-free up to 1k, then 5% after that.
Watched a lot of this guy’s channel on system design: https://www.youtube.com/channel/UCRPMAqdtSgd0Ipeef7iFsKw. I like em.
Athletes say a lot of dumb stuff on social media. It seems easiest to critique them for poor decisionmaking/tact, but in reality it’s probably closer to: giving a huge stage to people on a career path where stage presence is not a required skill.
Jeop GOAT day twoooo.
Google phone interview.

Was 45m. The guy was cold, didn’t really reply to anything or change tone. Accent was tough over the phone.
Didn’t introduce himself or the team, jumped straight into a coding question on the google doc.
We did a 2 subsequence comparison question. I solved it, but still had a bad taste in my mouth.
We didn’t discuss previous experience, scaling services, managing projects, writing apps.
Big turnoff for Google, when my interactions with all other folks from Amazon/Netflix/BMW/Citadel/GitLab/Disney have been very positive.

Jan 7, 2020

There’s an online shazaam service called acrcloud. https://www.acrcloud.com/identify-songs-music-recognition-online#record-div.
That old ACB call order which expired would not have broken even if it were filled. Coo.
Dis is still about the same plateau after the d+ surge, when I sold. Coo.
Was digging this article until they starting talking about “electromagnetic radiation” affecting your sleep, suggesting that you ground yourself before bed. Then on to light therapy and all sorts of good stuff: https://medium.com/swlh/sleep-like-a-pro-whats-the-deal-with-deep-sleep-and-how-to-get-more-46dad5da233d.
Interesting discussion in a drug from KRTX against alzheimers: https://www.reddit.com/r/investing/comments/elclxc/im_a_physician_and_im_long_krtx_dd_inside/. This sector is crazy, I like it. Being in the industry seems to provide a ton of inside info, more than relative to being inside others even.
Ricky Gervais’ opening monologue for the golden globes was great.
Practice problems:

https://leetcode.com/problems/longest-valid-parentheses/. Back to try a few leetcode. Much shorter problem statements, love it.

Convertible arbitrage: taking a long position in a convertible security and a short position in the underlying stock. It’s a hedge tactic.
Coding interview with citadel. 1hr. Phone and coderpad. Prepped. Took a lot of notes during. 2 behavioral questions, 2 coding questions. Overall impression: good in both directions.

Said “see ya later” to hang up without even thinking twice about it, like I always do. Had some regret later; might have been interpreted as overconfidence or something. Need to pay attention to details, even ones this small.

Created a doc with all the resources I’ve used over the past month for interviews.
Generalized definition of a greedy algorithm/problem: start small and build the problem up, taking the (new) local optimum each time.
Went back over everything in gdrive/Notes/Career/*. Some fantastic refresher content in there. Mostly technical software, some workplace environment, some interview logistics. Being diligent about documentation my whole life has paid off.
Quicksort/mergesort/heapsort nlogn. Heapsort constant mem, others linear. Bubble slower time, const mem. Radix/bucket can be faster than the main 3 if n is small.
Anything hashable can be used as keys. All the immutable python built-ins qualify, including tuples (whereas lists don’t).
QR = quantitative researcher.
CDS = credit default swap. Kinda like an insurance policy on credit card debt, can get it backed by another investor for a price.
Supercontest.

Changed the home page (the root route /) to the leaderboard instead of your weekly picks.

Scheduled haircut for friday, before onsites.
First day of jeopardy GOAT tournament.
Explored http://highscalability.com/.
Looked up a BUNCH of resources on system design. I feel pretty good about behavioral and algorithm/datastructure questions now – system design is the last piece. Took a bunch of notes, did a few examples. Will focus on more in-depth ones over the next few days for on-sites, but I feel ready enough for the brevity of phones.
Went over 27 common “tell us about a time…” questions and wrote responses to each. I then tried a ~60s spontaneous response to each. There are definite overlaps. A good core of 10 responses can apply/stretch to all 30.

I did generics, and then I did concretes for Amazon’s leadership principles. They’re obviously the same set/class of responses, just need to have them in the toolbox ready to connect.

Pubsub research. Not great in consistency. Latency means timing-related transactions are not great. Subscribers might receive messages at different times. Better for event-driven architectures like games.
JSON serialization/deserialization gets less efficient as the size gets larger and larger. This is where you might pick another data transport for your API, like protobuf. You can have an endpoint return a protobuf object instead of a json object. Very similar.

Typically, use json when talking to a browser. Use protobuf when talking to a service.

Jan 6, 2020

Practice problems.

https://www.hackerrank.com/challenges/max-array-sum/problem. DP. Iterate through the array and notice patterns. What info do you need to hold on to? What are the possible answers? Keep track of the max by index and just walk through.
https://www.hackerrank.com/challenges/abbr/problem. Easy to get most the cases, but hard to get all. Remember for DP you can sometimes build a 2D matrix of 1s and 0s and traverse it (like longest common substring/subsequence).
https://www.hackerrank.com/challenges/candies/problem. An interesting one. Iterate through an array, once forward, once backward, and count the rising edges. Match them by index and take the max.
https://www.hackerrank.com/challenges/decibinary-numbers/problem. Didn’t do it. Wrapped my head around it, but didn’t implement.
https://www.hackerrank.com/challenges/min-max-riddle/problem. A cool problem. What mapping do you need to solve this? Can find the INVERSE of that mapping?. Swapping keys/values in a dict is easy.
https://www.hackerrank.com/challenges/castle-on-the-grid/problem. Used a graph. Build an adjacency matrix of the 4 cardinal directions from each node and remove the ones that are off the grid or on an X. Then it’s simply a shortest-path problem. Use BFS (queue).
https://www.hackerrank.com/challenges/poisonous-plants/problem. Got the easy solution.
https://www.hackerrank.com/challenges/ctci-bfs-shortest-reach/problem. When using BFS for shortest path, remember to slice the list for each neighbor, so you’re not overwriting the root path. And just logically think through everything. Created visited set() and queue deque(). while queue. Get path. Get node. If node == end, exit. If node in visited, continue. Else, lookup the next nodes in the adjacency matrix, add to path, then append to queue.
https://www.hackerrank.com/challenges/find-the-nearest-clone/problem. Getting good at the graph problems. It was a less-experienced topic for me before this.
https://www.hackerrank.com/challenges/ctci-connected-cell-in-a-grid/problem. Could do DFS or recursion or iteration.
https://www.hackerrank.com/challenges/matrix/problem. Fun problem. There’s actually a bug in a unittest in this one.

Done with the hackerrank kit. I’ve probably done 100 practice problems not in total across the platforms.

Confirmed the phone interviews this week.
Dijkstra’s algorithm is used to find the shortest path between two nodes in a graph with weighted edges. If you assigned no weights to edges (such that they’re all the same, 1), it reduces to plain old BFS!
BFS is exponential (based on how many neighbors each node has) in time and memory. It’s great for small problems, but solving something with a state space like a rubik’s cube is huge.
Sets and lists are unhashable in python. For most problems, you can just convert the (i, j) coordinates to a string or something similar.
collections.deque is faster. This whole package is great for performance. defaultdict is awesome. Counter is also really useful for counting stuff (like frequency of char in str, or others).
When using BFS for shortest path, put PATHS (lists of nodes) instead of nodes into the queue.
Append the new nodes to the right of a queue or a stack. But in a queue, pop the node off the left to search next. In a stack, pop the node off the right.
Run, pull, sauna, meditate.
Checked the snail mail. Probably 100 items. Everything was complete garbage except for 1 dmv registration. The signal to noise ratio is laughable (and wasteful of paper / people resources).
Renewed the BMW registration. $150 total. The DMV charges a 2.1% fee for online processing. Fuck them. It should be a discount for online logistics, which have significantly reduced overhead.

Even if it’s Elavon, you need to front that.

For number systems, the exponents always increase from 0 to n-1. This is known.

For binary, the base is 2, so the digits in the number (the constants before the base-exponents) can only be those 2:

[0, 1]

For decimal, the base is 10, so the digits in the number (the constants before the base-exponents) can only be those 10:

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

I had thought about the base/exponent before, but never the limited range of digit values.

For previous-project questions, use STAR: Situation, Task, Action, Result.
My Amazon interview is actually at the Santa Monica location, which is more convenient! The hiring managers can pass offers between sites after, if need be.
Took a ton of “give an example of …” notes. Moved gdrive docs around, cleaned.
Database partitioning. Horizontal partitioning is when you keep the table the same and split groups of rows across different machines. Each is usually called a shard. Vertical partitioning is when you split columns off into a new table. Both improve performance, making it distributed (but more complex).
Collections.deque is pronounced “deck” – it stands for double ended queue.
Of course threads in a process have their own stacks but share heap. Separate processes get their own stack and heap.
Disjoint sets. Another useful data structure. Usually implemented as trees. Two operations: find() and union().

Jan 5, 2020

Promise. Async object that will return a value in the future.

If you call an async function, the promise is returned immediately (synchronously) but in a pending state. Later it will become fulfilled or rejected and do something else.
Can attach a callback to handle the return.
.then() returns a promise.

Functional vs OOP.

Functional uses (wait for it) functions – pure, no shared state, no global mutation. It’s based around actions, and is better if you expect your project to scale by adding actions.
Object oriented is usually based classes that interact. They share. It’s based around things, and is better if you expect your project to scale by adding things.

Function composition. Could be decorators (chained functions) or simply meshing the results of multiple returns today. Straightforward.
Class inheritance vs prototype inheritance.

The class is a template, not an object. Gives you a template for your subclass.
The prototype is an actual object. You create clones or slightly different children of it.

General composition vs inheritance.

Does B want ALL features of A? Inherit. Only want some? Compose.

I generally agree with this: https://medium.com/inc./amazon-uses-a-secret-process-for-launching-new-ideas-and-it-will-transform-the-way-you-work-aec5c9121ae.

Work backward from the customer, not forward from the product.
Start with the problem and find the solution, don’t create something and then check what it helps.

Couldn’t find a decent ebook download, so I ordered a hardcover for the first time in years. The man who solved the market, jim simons, rentech. Excited to read, but will probably wait until after interviews.
NFC wild card games. Vikings and Seahawks advance.

Divisional games: Ravens/Titans, Chiefs/Texans, Niners/Vikings, Packers/Seahawks.
The overtime rules are so dumb. The percentage of a goalie blocking a pk in a shootout is about 30%. That’s about the same ballpark as the probability of getting a touchdown on a drive. Imagine if the soccer game ended when the goalie blocked the first kick.

Made 12 new powders.
Caught up with Spencer. Slater’s on the pier. Miss him! Bomb nashville chicken sandwich.

Talked interviews, personal projects, SpaceX, software, life, investing.
Rewriting the autotest ui and other frontends in python with transcrypt.
Buying stock in a company is not directly financially enabling them, the stock pool is mostly set. It’s just changing hands. Buying a vote in investors meetings. It does increase volume and other things that make the company seem more valuable.
Certification got convoluted. Half is encryption, the other half is identification. They should be separate.

Practice problems.

I’m gold for problem solving. https://www.hackerrank.com/scoring.

https://www.hackerrank.com/challenges/friend-circle-queries/problem. Wrote a solution in 10min that passed half the tests. You can optimize further with disjoint sets, but I feel comfortable with my non-fancy solution.
https://www.hackerrank.com/challenges/maximum-xor/problem. Solved the brute force way, O(mn). To get it in linear time instead of quadratic, you have to use something called a Trie, which is basically a binary tree with 0s and 1s. https://en.wikipedia.org/wiki/Trie.
https://www.hackerrank.com/challenges/insert-a-node-at-a-specific-position-in-a-linked-list/problem. Standard linked list insert into position. I always have to think for a minute about the while loop. Manage your own index, keep track of prev/next, and make a few things conditional on index == 0. You can simplify it if you create an entirely new list, but the in-place version is better for space situations.
https://www.hackerrank.com/challenges/insert-a-node-into-a-sorted-doubly-linked-list/problem. I wrote the long-winded solution, which worked. HOWEVER. You can do these ll problems with recursion. It’s way more concise, it’s just a bit harder to conceptualize imo.
https://www.hackerrank.com/challenges/reverse-a-doubly-linked-list/problem. Same thing. Try to thing about the recursive solution. If it’s too complicated, fall back to iterative.
https://www.hackerrank.com/challenges/find-the-merge-point-of-two-joined-linked-lists/problem. Looped through both. Quadratic solution, which passed all tests, but there is an O(n) algorithm on the discussion page. It’s just less intuitive.
https://www.hackerrank.com/challenges/ctci-linked-list-cycle/problem. Check if there’s a cycle in a linked list. Easy. The key here is knowing that you can store objects as keys in a hash table. Then simply loop through the ll and bookkeep them. If you ever already have one, there’s a cycle. The hash lookup is constant time.
https://www.hackerrank.com/challenges/binary-search-tree-lowest-common-ancestor/problem. Tree problems make me think for a bit longer. Remember it’s like a linked list, but you each node has two pointers instead of one. This yields two calls in recursion, or two expansions in iteration. This particular one could be solved with math, because you’re given a binary search tree that’s already balanced, but you could walk it all the same to find the answer.
https://www.hackerrank.com/challenges/ctci-is-binary-search-tree/problem. Walk through a BST. I solved it both iteratively and recursively. An additional note: a binary search tree is valid if the inorder traversal is sorted properly.
https://www.hackerrank.com/challenges/tree-huffman-decoding/problem. Very easy question. Solved it quickly, but the wording was overly complicated.
https://www.hackerrank.com/challenges/balanced-forest/problem. Didn’t try. This problem is too long.
https://www.hackerrank.com/challenges/torque-and-development/problem. Build an adjacency dict of {node: {neighboring nodes}}. Then traverse it and group them.
https://www.hackerrank.com/challenges/find-the-nearest-clone/problem. For BFS, literally just pop from the from of the queue instead of the end of the stack. The graph problems have been longwinded for me.
https://www.hackerrank.com/challenges/ctci-fibonacci-numbers/problem. Recursive fibonacci. Easy as pie. Remember it’s 2^n (exponential) without dp, and down to O(n) with memoization.
https://www.hackerrank.com/challenges/ctci-recursive-staircase/problem. The staircase problem with 1/2/3 hops. This is the same as fibonacci with 3 numbers in the sequences, and the 3 exit conditions in order are 1/2/4. Remember these get much faster with a cache. Check if n is in the cache, and only recurse if it’s not.
https://www.hackerrank.com/challenges/recursive-digit-sum/problem. Easy. Recurse.
https://www.hackerrank.com/challenges/crossword-puzzle/problem. Thought about it for about 20min, but then didn’t take the time to solve. A very straightforward problem, but tedious. This is like a simpler version of the 5squared puzzle I’ve solved (programmatically).
https://www.hackerrank.com/challenges/balanced-brackets/problem. Easy. Just run a stack for open/close. Few small corner cases.
https://www.hackerrank.com/challenges/ctci-queue-using-two-stacks/problem. Lol what a silly question. Just implement the methods of a queue using a python list, enqueue/dequeue/peek.
https://www.hackerrank.com/challenges/largest-rectangle/problem. Same as the leetcode one. Use two pointers coming in from both sides.

11 left. Will do tomorrow.

XOR of two integers is just the XOR of the bit representations of the integers. In python ^ is the operator for XOR. For example, 2 ^ 1 = 3.
In python it’s set.add vs list.append.
In python sets are implemented with hash tables, so lookup is basically the same as a dict (extremely fast).

Jan 4, 2020

Practice problems.

https://www.hackerrank.com/challenges/ctci-bubble-sort/problem. Just implement a bubble sort. One loop through n-1, check if bigger than next, swap if so. Then simply wrap the whole thing in a blind loop to run n times, which guarantees it’ll be sorted when finished.
https://www.hackerrank.com/challenges/mark-and-toys/problem. Basic sort.
https://www.hackerrank.com/challenges/ctci-comparator-sorting/problem. Write a comparator method of a class. Pass two objects, do whatever logic, then return -1 if obj1 goes before obj2 or 1 if opposite. Then you can pass the Class.comparator function as a key to a sort call, or whatever.
https://www.hackerrank.com/challenges/fraudulent-activity-notifications/problem. Tougher, but I really enjoyed this one. Success rate was low, in the 30s. Sliding window for median, remember you can access it by index (middle if n odd, (middle + middle-1) if n even). To remove/insert in the sliding window, use the bisect module for speed.
https://www.hackerrank.com/challenges/ctci-merge-sort/problem. Brainstormed this one for about 30m and then chose not to implement. The question is basically “write you own merge sort”. O(nlogn). This allows you to count the inversions. No thanks.
https://www.hackerrank.com/challenges/ctci-making-anagrams/problem. Simple. My solution was cool.
https://www.hackerrank.com/challenges/alternating-characters/problem. The easy ones are easy.
https://www.hackerrank.com/challenges/sherlock-and-valid-string/problem. Little more complicated, but in syntax. Nothing algorithmically special here.
https://www.hackerrank.com/challenges/special-palindrome-again/problem. Tired of palindromes. I don’t like these problems.
https://www.hackerrank.com/challenges/common-child/problem. Didn’t care to do this one either. Don’t like longest common subsequence problems. Remember the table of 1s and 0s, slowly sum as you move down and right.
https://www.hackerrank.com/challenges/minimum-absolute-difference-in-an-array/problem. Nice change of pace. GREEDY algorithms. If you find yourself facing a quadratic solution, just sort! See if that helps.
https://www.hackerrank.com/challenges/luck-balance/problem. Simple.
https://www.hackerrank.com/challenges/greedy-florist/problem. Solved, but dumb question. It’s worded very ambiguously, and I had to modify the footer because the main function was completely missing a required argument (first time I’ve seen this on hackerrank).
https://www.hackerrank.com/challenges/angry-children/problem. Easy. Fast. Don’t forget outside of this section; consider a sort!
https://www.hackerrank.com/challenges/reverse-shuffle-merge/problem. Not gonna touch this one. Not worth my time, since it’s limited. Trying to understand the problem description alone took 10 minutes.
https://www.hackerrank.com/challenges/ctci-ice-cream-parlor/problem. Find a pair within 1d array, so iterate through it once, build a complement map.
https://www.hackerrank.com/challenges/swap-nodes-algo/problem. Didn’t do. This question takes 30 minutes to read.
https://www.hackerrank.com/challenges/triple-sum/problem. One you have to think about, doesn’t fit a standard pattern. Solve an easy case and you’ll see you’re best off approaching along a specific dimension (q).
https://www.hackerrank.com/challenges/minimum-time-required/problem. An interesting one as well. Basically implement your own binary search. Find best and worst case for lower and upper bounds, bisect them, calc the result, then shift one bound depending upon what the produced value is. Put this all in a loop `while lower_bound < upper_bound`.
https://www.hackerrank.com/challenges/maximum-subarray-sum/problem. <30% solve rate. Took a stab at it. Started with ~30m of whiteboarding. Reached the conclusion that the module of the sum is the modulo of (the sum of the modulos of the individuals). This is the key piece.
https://www.hackerrank.com/challenges/making-candies/problem. Not really a search problem, just walk through the simulation. My solution was the right complexity (passed 35/49 tests), just needed smaller optimizations here and there.
https://www.hackerrank.com/challenges/flipping-bits/problem. “Flipping bits” is just subtracting from the max size. So if you flip bits on a 32bit unsigned integer x, the result is 2^32-1-x.
https://www.hackerrank.com/challenges/ctci-big-o/problem. Check if a number is prime. To do so, iterate only over odd numbers (because even divides by 2), and only up to square root of x. Every pair of factors, if it exists, has one number below the sqrt and one above, so you only need to check up to the integer at floor(sqrt(x)).

Only 30 problems left. The average difficulty is a little on the easier side, too. The hardest is like 40% pass rate. I’ll try to do most tomorrow so that I can focus on the open-ended questions monday. Behavioral, design, etc.

To switch two items in an array in python in one operation, just do:

list[index1], list[index2] = list[index2], list[index1]

Instead of doing quotient = math.floor(num/den) and rem = num % den, get both with quotient, rem = divmod(num, den).

Oh shit. You can also get the quotient directly with // in py3. 11 // 2 results in 5. Don’t need math.floor everywhere.

If n is given for a problem to be 10^9, you probably need a linear solution. If 10^5, try to find an nlogn algorithm. If 10^4, quadratic is ok.
Red-black trees are colored to preserve balance.
Push, meditate.
Ordered more preferred stock.
AFC playoffs. Texans and Titans moving on.
Supercontest. Disabled the emailer during the offseason.

The machine was out of disk space, so I sshed in and ran docker system prune -af to free up 17.89GB. Remember to create the network and restart the nginx-proxy containers as well after this operation.