• Monday

    • Peter Thiel’s Stanford talk.
      • Competition is for losers.
      • Be the big fish in a small market, then grow the market (rather than enter as a small fish in a big market).
    • 3 on restaurants/drugstores. Temp benefits with doordash+lyft. 200 after 500 in first 3mo.
    • Read a bit about fund fees, different bonds, windfalls, tax, more while making fried chicken (remember cornstarch + baking powder for extra crunch).
    • Finished aws reinvent.
      • Deep dive on Amazon EKS.
        • Vanilla k8s, other than the occasional security patch. They manage nodes, os, kubelet, cri, ami (plus the full standard k8s control plane). Upgrading cluster k8s version is turnkey (then test).
        • AWS Key Management Service, AWS certificate manager, AWS Secrets Manager. No need for vault or others.
        • IAM roles for service accounts (IRSA): map an IAM role to a k8s service account, then run pods as that service account (not 1-1 mapped SA <-> NS). RBAC and cluster separation for different tenants, of course.
        • No ssh, access nodes through AWS Systems Manager.
        • They scale the cluster level, even up to 10s of thousands of nodes. Also intelligent scaling of stuff like maxRequests. They have a custom next-gen autoscaler called Karpenter (although HPA/VPA/CA are avail). https://github.com/aws/karpenter.
        • Supports OpenTelemetry, Fluent Bit, Prometheus, and many others for observability. Flux for gitops (although Argo is better).
      • Kubernetes at AWS: Strategy, road map, and vision.
        • Combining with AWS Outposts, remember you can run EKS on-prem.
        • Supports all the usuals. Helm, spinnaker, istio.
        • Bottlerocket is the aws os built specifically for containers.
      • Up-level your container image security with the latest from Amazon ECR.
        • Managed container registry. Integrates with all the expected AWS services, eks ecs ec2 fargate. Currently ~8b images downloaded per week.
        • Scalable, storage-wise. Runs security too, scanning all images (Amazon Inspector). This runs continuously, not just on push. Not just OS, but all layers of images. And can sign/verify images.
      • Delivering code and architectures through AWS Proton and Git.
        • Proton handles IaC (terraform), pipelines (jenkins), and observability (prometheus). Offers templates for devs to start, apply, verify against, and deploy.
        • Some template verification offerings as well.
      • Amazon builder’s library: Operational excellence at Amazon.
        • Sending “Ops Win” emails, good milestones with large audiences to encourage ops culture.
        • Retros and postmortems, regularly.
        • Go over prepared content together, but also have free time to look at graphs as a team. Open your datadog dashboards and ask for insights.
      • Best practices for securing your software delivery lifecycle.
        • Security testing (in your CI pipelines) must include static and dynamic security analysis.
        • AWS CodeArtifact, Amazon CodeGuru, AWS CodeBuild, AWS Systems Manager Parameter Store, AWS CodeDeploy, and Amazon ECR (aforementioned container scans).
        • Integrate with CodeReview interface, and use ML to automatically suggest changes.
      • How to reuse patterns when developing infrastructure as code.
        • AWS Cloud Development Kit and AWS CloudFormation. Competitor to Terraform.
        • Create reusable/sharable components/modules.
      • Automating cross-account CI/CD pipelines.
        • Not much.
      • Using feature flags to avoid downtime during migrations (LaunchDarkly).
        • New features, debug logging, heavy loadpaths, subjective canaries, switching databases, more.
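A minimal sketch of the "switching databases" case above, assuming a percentage-based flag. This is plain Python, not the LaunchDarkly SDK; `in_rollout`, `read_user`, the flag name `use-new-db`, and the two dict "databases" are all hypothetical stand-ins.

```python
import hashlib


def in_rollout(flag: str, user_id: str, percent: int) -> bool:
    """Deterministically place a user in a bucket 0-99 and compare to the rollout percent."""
    digest = hashlib.sha256(f"{flag}:{user_id}".encode()).hexdigest()
    bucket = int(digest, 16) % 100
    return bucket < percent


def read_user(user_id: str, percent: int, old_db: dict, new_db: dict) -> str:
    """Route reads to the new database only for the flagged share of users."""
    db = new_db if in_rollout("use-new-db", user_id, percent) else old_db
    return db[user_id]


# Hypothetical stand-ins for the old and new datastores.
old_db = {"u1": "from-old", "u2": "from-old"}
new_db = {"u1": "from-new", "u2": "from-new"}
```

Raising `percent` from 0 to 100 moves traffic gradually, and setting it back to 0 is the rollback. Because the bucket is derived from a hash, the same user stays on the same side between requests.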
      • Slack is the digital HQ for AWS developers and DevOps teams (Slack).
        • Incident management, organized involvement, anyone can spectate. Much agreed.
        • Standups, PR activity, workflows, emoji responses, everything else. All known, but all good.
      • On AWS, details matter: Why full-stack observability wins (Splunk).
        • Extract with OpenTelemetry. Primarily traces, metrics, logs. SDKs for languages, collectors, ui, infra.
        • Processors/filters to exclude secrets.
        • Investigate path: Errors/latency at the mesh -> zoom into node/app/service -> traces and find bottleneck -> check logs for that.
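The "processors/filters to exclude secrets" idea above can be sketched as a scrub step over span or log attributes before export. This is an illustration in plain Python, not the OpenTelemetry Collector's processor configuration; the key pattern and the `redact_attributes` name are assumptions.

```python
import re

# Assumed list of sensitive-looking key fragments; tune per service.
SENSITIVE_KEY = re.compile(
    r"(password|secret|token|api[_-]?key|authorization)", re.IGNORECASE
)


def redact_attributes(attributes: dict) -> dict:
    """Return a copy of the attributes with sensitive-looking values masked."""
    return {
        key: "[REDACTED]" if SENSITIVE_KEY.search(key) else value
        for key, value in attributes.items()
    }
```

Usage: `redact_attributes({"http.method": "GET", "db.password": "hunter2"})` keeps the method and masks the password value.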
      • Intentional and empathetic observability (Datadog).
        • Don’t start with a ticket titled “Create dashboard”. Start with a metric of concern and focus on it, deliberately for outcomes.
          • Kinda agree. Anomaly detection and generics are much better at insights now than they used to be. Specifics aren’t always better, and are often biased.
        • Put dashboards in pager description, etc. All the expected. Need a consistent reaction, especially with new hires.

  • Sunday

    • Annual santa run on the Ducati.
    • James Webb finally went up. Also watched Don’t Look Up, nice combo.
    • FIDE Rapid world championship started today. Blitz starts Wednesday.
    • Added all sbsc payers to the league_user_association table for premium.
    • More AWS Reinvent:
      • Deep dive on Amazon Aurora.
        • MySQL+PostgreSQL. Relational.
        • Data is always stored across 3 availability zones. 1 writer. Up to 15 read replicas per cluster (not per AZ).
        • Locks at the storage layer, not the relational layer (higher). Very fast recovery, it’s doing it from disk all the time (nominally).
        • There’s “DevOps Guru” for RDS as well, anomaly detection. Locks, performance issues in the DB, more. Suggests resolution steps.
        • Babelfish is the compatibility layer that lets aurora postgres understand SQL Server's wire protocol and T-SQL, so SQL Server apps can migrate with fewer code changes.
      • Amazon DynamoDB: Driving innovation at any scale.
        • NoSQL, millisecond perf, scalable.
        • Key to scaling is very aggressive sharding. Even to a single row sometimes. Standard request router in front.
        • Case study with Mercado Libre (Argentinian online marketplace). They originally self-managed cassandra (kv). Migrated to dynamo.
      • What’s new in Amazon RDS for Oracle.
        • Obv managed Oracle DBs in the cloud. Performant and scalable like all others, common theme.
        • Case study with Goldman Sachs and their transaction banking platform on oracle.
      • Real-world use cases with graph databases.
        • Focus on Neptune. Any data with relationships, connections. Examples: Fraud detection, identity resolution, knowledge organization.
        • Case study: LexisNexis. Law data, court data, vendor data. Users upload their briefs, and LexisNexis suggests others by connecting data. Extract principal components to establish edges. Then give an API that shows important stats (court outcomes, value of citations, etc). Get new users, grow graph.
      • Amazon EBS under the hood: a tech deep dive.
        • S3 for object, EBS for block, EFS and FSx for file. All attach to EC2, offer backups, scale, more.
        • Most run on gp3, AWS’ general purpose SSD. io2 Block Express is the top-performance EBS tier: up to 256,000 IOPS, <1ms latency.
      • Deep dive on Amazon EFS.
        • “Serverless” (a word used liberally at this conference) and scalable.
        • Can run jenkins, wordpress, airflow, mongo, many.
        • Integrates with ec2, fargate, lambda, eks, ecs, many other aws tools.
        • Automatically moves files into the infrequent-access storage class if they’re not accessed often. Intelligent tiering (customizable).
      • Remember fargate is just serverless compute for containers (under ECS or EKS). Instead of managing your own nodes, you just manage your containers and fargate will provision based on their cpu/mem requirements.
      • Glacier offers instant retrieval.
      • Simplify your file-based workloads with Amazon FSx.
        • Again, pay-for-what-you-use. Nice.
        • Supports windows file server, lustre, netapp ontap, openzfs.
        • Same as EFS. Can run hpc, basic apps, home dirs, data analytics, pipelines, whatever.
      • Deep dive on Amazon S3 security and access management.
        • Data lakes, log data, content/assets, configs.
        • TLDR: (1) Block public access (2) encryption (3) bucket policies (4) bucket owners.
        • Can have amazon manage keys or provide your own.
        • IAM policies are very configurable.
      • Building a data lake on Amazon S3.
        • Driven by data growth. A single DB is fine. Query, understand, perform. With 1000s of random DBs, need to rethink how to get insights.
        • Ok, well ETL everything into a known structure. Sure. But extra work. And you have to maintain the pipelines. Therefore -> data lake.
        • Add entropy to prefix for bursty workloads for perf.
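A small sketch of "add entropy to prefix", assuming the goal is to spread bursty writes of sequential keys across many key prefixes. The `spread_key` helper, the 4-character prefix length, and the example key layout are illustrative assumptions, not something from the talk.

```python
import hashlib


def spread_key(key: str, prefix_len: int = 4) -> str:
    """Prepend a short hash of the key so sequential names land under different prefixes."""
    digest = hashlib.md5(key.encode()).hexdigest()
    return f"{digest[:prefix_len]}/{key}"


# Sequential log names that would otherwise all share one "logs/2021-12-26/" prefix.
keys = [f"logs/2021-12-26/part-{i:05d}.json" for i in range(1000)]
spread = [spread_key(k) for k in keys]
```

The prefix is deterministic, so a reader can recompute it from the original key. The tradeoff is that listing objects by date now means scanning across prefixes.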
      • AWS storage solutions for containers and serverless applications.
        • Discussed EFS, S3, Event Notifications, Storage Lens, X-Ray, more.

  • Friday

    • Did a few days of Advent of Code. Missed so much from vacation that I won’t fully catch up, unfortunately.
    • 2yr hard inquiries for home clearing soon.
    • Async, await, gather. Coroutines, not threads. App determines handoffs, not OS scheduler.
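A minimal sketch of the async/await/gather note above, using only the standard library. The `fetch` coroutine and its delays are made up; `asyncio.sleep` stands in for real I/O. The point is that each `await` is an explicit handoff chosen by the app, all on one thread.

```python
import asyncio


async def fetch(name: str, delay: float, log: list) -> str:
    """Pretend I/O call: record when it starts and finishes."""
    log.append(f"{name} start")
    await asyncio.sleep(delay)  # explicit handoff: other coroutines run while this one waits
    log.append(f"{name} done")
    return name.upper()


async def main() -> tuple:
    log: list = []
    # gather schedules both coroutines concurrently and returns results in argument order
    results = await asyncio.gather(
        fetch("a", 0.05, log),
        fetch("b", 0.01, log),
    )
    return results, log


results, log = asyncio.run(main())
```

Both coroutines start before either finishes, and `b` completes first because it waits less, yet `results` keeps the order in which the coroutines were passed to `gather`.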
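    • Remember HTTP/1.0: new TCP connection for every fetch, which gets resource intensive (1.1 added keep-alive, but still one response at a time per connection). HTTP/2: multiplexes everything over 1 TCP connection, but on packet loss every stream blocks (TCP head-of-line blocking). HTTP/3 uses QUIC (multiplexed streams over UDP) instead of TCP, so a lost packet only stalls its own stream.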
    • More aws reinvent agenda viewing:
      • Architecting your serverless applications for hyperscale.
        • Great overview. Say, 1k requests/s bursting to 25k.
        • CloudFront (AWS CDN) sitting in front of S3. API gateway for the entry, routing, rate limiting, etc. Then lambda, dynamodb+aurora on the backend. SNS+SES for queueing notifications.
        • CloudWatch (monitoring) provides suggestions for optimizing coldstart time (to provision a new env) for lambda. Lambda also provides power tuning, calibrating memory and such to optimize function timing and therefore concurrency.
        • RDS proxy in front of DBs to manage connections, secrets, more.
        • Elasticache in front of aurora, DAX in front of dynamo.
        • X-Ray distributed tracing.
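For the "Elasticache in front of aurora" note above, one common way to use such a cache is the cache-aside pattern. The notes don't say which pattern the talk showed, so treat this as an assumption. The `CacheAside` class is hypothetical, with a dict standing in for the cache and a callable standing in for the database query.

```python
class CacheAside:
    """Check the cache first; on a miss, load from the database and remember the value."""

    def __init__(self, load_from_db):
        self._load = load_from_db   # callable standing in for the real DB query
        self._cache = {}            # dict standing in for the cache cluster
        self.db_reads = 0           # counter, only here to make the behaviour visible

    def get(self, key):
        if key in self._cache:
            return self._cache[key]
        self.db_reads += 1
        value = self._load(key)
        self._cache[key] = value
        return value

    def invalidate(self, key):
        """Call after a write so the next read goes back to the database."""
        self._cache.pop(key, None)
```

Repeated reads of a hot key hit the database once. The cost is staleness: a write to the database is not visible until the key is invalidated or expires.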
      • Amazon Managed Blockchain: When to use blockchain.
        • Use where multiple entities/actions/events need consensus (with equal rights). Not for high performance (although solana 50k transactions per second now).
        • Public vs private blockchains, of course. Public like eth. Private like an internal for tracking eg shipments.
      • Advanced Amazon VPC design and new capabilities.
        • Full overview. Transit gateway, direct connect, more. Didn’t pay 100% attention to this one.
      • ML with Metaflow and Kubernetes: Prototype to production on Amazon EKS.
        • Metaflow: workflow infra for data science (compute, orch, monitoring, deploy, etc). Python lib that offers decorators to define the flow.
      • Accelerate front-end web and mobile development with AWS Amplify.
        • UI for model/schema definition. Easy integration with dynamo. Test interface. Deployment helpers.
      • Create from anywhere: The Netflix Workstations story.
        • Control plane in Java, Agent in Go, deployment with Spinnaker, Salt (remember: Python-based config management, like Ansible) for config.
      • Building connected auto solutions with AWS IoT.
        • This was a case study for IoT in automobiles, not a technical overview.
      • Optimize compute for cost and capacity.
        • Processors: Intel, AMD, and AWS’ own graviton. Can be optimized for burst, compute, mem, general, gpu. Mixed model, scaling. The usual subjective calibration.
      • Optimize your Amazon EC2 usage with AMD-based instances.
        • 3rd gen EPYC CPUs, codenamed Milan.
      • AWS Outposts: Bringing the AWS experience on premises.
        • Integrations (eg terraform) still work. Better latency. Shorter network. Private.
        • The common rack is 42U, about the size of a fridge. Runs Nitro and the control plane. The hypervisor, the vpc network, ebs, everything. Cabled out of the box. All hardware. There are also smaller options.
      • Full-stack observability in your application-first/hybrid world (Cisco).
        • Breadth, just an overview of various observability components. Few examples.

  • Wednesday

    • Subscribed to money scoop and emerging tech.
    • Upped a few growth and value ETF positions.
    • Building a board: https://ryancaldbeck.co/2021/12/17/building-a-board-what-i-wish-i-had-known/.
    • Finally got to watch my agenda from AWS reinvent:
      • MLOps at Amazon: How to productionize ML workloads at scale.
        • Integrate ML into DevOps and DataEng pipelines. Training, CI/CD, performance, analytics, everything. On the final deliverables, on algorithms, and meta on the pipelines themselves.
        • Used for suggestions (of course), search matching, extended “customers also bought”, ETA prediction, optimal packaging/box size/shape, drone delivery pathing.
        • Sagemaker.
      • Implementing MLOps practices with Amazon SageMaker, featuring Vanguard.
        • Remember athena is the query service to run sql against s3.
      • How Amazon.com transforms customer experiences through AI/ML.
        • Again, lots of forecasting to map supply and demand.
        • A bit of computer vision for placing/picking in warehouses.
      • Data lakes: Easily build, secure, and share data with AWS Lake Formation.
        • Governed tables support: ACID transactions, storage optimization, and version history.
        • Supports col, row, and cell based permissions. Also custom tags (TBAC).
        • Data mesh between producers and consumers, internals and customers.
      • How the NFL Raiders became analytics champions in Las Vegas (sponsored by Matillion).
        • NOT for sports stats unfortunately…was for ticket sales and business analytics. Bit of clickbait.
        • Built on snowflake+tableau.
      • Building next-gen applications with event-driven architectures.
        • Direct synchronous, REST with sender/receiver. Better: stick a queue in between. A full bus can sit on top of the queue for routing and more. This is Amazon EventBridge.
        • Each event is just a json. You can write rules against anything. Events represent a system change. They’re immutable and have a timestamp.
        • Taco Bell’s example with AWS Step Functions. Event-driven workflows.
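To make "each event is just a json, you can write rules against anything" concrete, here is a simplified matcher. It only imitates the basic shape of an event pattern (each leaf is a list of allowed values, nested dicts match nested fields) and is not the real EventBridge matching engine. The sample event values and the rule are invented.

```python
def matches(pattern: dict, event: dict) -> bool:
    """Return True if every field named in the pattern has an allowed value in the event."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        actual = event[key]
        if isinstance(expected, dict):
            # Nested pattern: recurse into the nested part of the event.
            if not isinstance(actual, dict) or not matches(expected, actual):
                return False
        elif actual not in expected:
            # Leaf: 'expected' is a list of allowed values.
            return False
    return True


# Invented sample event: an immutable record of a system change, with a timestamp.
event = {
    "source": "orders.service",
    "detail-type": "OrderPlaced",
    "time": "2021-12-22T10:00:00Z",
    "detail": {"status": "paid", "total": 42},
}

# Invented rule: route paid or refunded orders from the orders service.
rule = {"source": ["orders.service"], "detail": {"status": ["paid", "refunded"]}}
```

A bus would evaluate many such rules per event and forward matches to each rule's targets. Fields not named in the rule are ignored.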
      • Drawing the New York City skyline with Amazon Aurora Serverless v2.
        • Aurora is compatible with both postgres and mysql, but faster and cloud-native.
        • Remember (serverless != functions). Aurora is a serverless DB. Helpful for dev/test envs. Very useful for systems with intermittent or unpredictable load.
        • Because serverless, can scale very cheaply. v1 would double when capacity was reached. v2 now increases incrementally, with optimization.
        • Scaling is subsecond!
        • The nyc skyline part is just a timeseries of cpu load, querying aurora heavily at specific times to create a full step function that looks like a skyline.
      • Build a high-throughput asset-tracking architecture.
        • Inventory focused. Physical assets.
      • Certification. (shorts)
        • What is an AWS Certification?
        • Why is AWS Certification important?
        • What AWS Certifications are available?
          • Foundational (Cloud Practitioner), Associate (Architect/Operations/Developer), Professional, Specialty.
        • AWS Developers’ career stories.
      • Evolutionary AWS Lambda functions with hexagonal architecture.
        • Hexagonal = ports and adapters. Ports = interfaces/APIs/protocols. Adapters = connections to others. Basically plug and play components.
        • The 6 is not important. Just layers with inputs and outputs.
    [Figure: hexagonal architecture. Inner hexagon = application core, outer hexagon = adapters, border between the two = ports. Source: https://en.wikipedia.org/wiki/Hexagonal_architecture_(software)]
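A minimal ports-and-adapters sketch for the Lambda notes above. Every name here (`OrderRepository`, `InMemoryOrderRepository`, `OrderService`, `make_handler`) is hypothetical. The port is an interface, the adapter is a swappable implementation, and the Lambda entry point is just another adapter wrapped around the core.

```python
from typing import Dict, List, Protocol


class OrderRepository(Protocol):
    """Port: what the core needs from storage, with no mention of any database."""

    def save(self, order_id: str, total: float) -> None: ...


class InMemoryOrderRepository:
    """Adapter: one implementation of the port (a DynamoDB adapter could replace it)."""

    def __init__(self) -> None:
        self.rows: Dict[str, float] = {}

    def save(self, order_id: str, total: float) -> None:
        self.rows[order_id] = total


class OrderService:
    """Application core: business logic that only talks to ports."""

    def __init__(self, repo: OrderRepository) -> None:
        self._repo = repo

    def place_order(self, order_id: str, prices: List[float]) -> float:
        total = round(sum(prices), 2)
        self._repo.save(order_id, total)
        return total


def make_handler(service: OrderService):
    """Inbound adapter: translate a Lambda-style (event, context) call into a core call."""

    def handler(event: dict, context) -> dict:
        total = service.place_order(event["order_id"], event["prices"])
        return {"statusCode": 200, "body": {"total": total}}

    return handler
```

Swapping the storage adapter, or testing the core without AWS at all, only requires changing what is passed to `OrderService`.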

  • Tuesday

    • Finally home from Miami. 2 weeks, full list of everything in drive and shared album in photos.
    • Looked into both an LLC and a trust to manage various sections of my assets.
    • Helium miner. Install device that extends the network, earn HNT. Proof of Coverage. Radio frequency. More for IoT.
    • Intra month credit payments to reduce util for my next round of opening new credit.
    • Flight home, stood in the back with a flight attendant (Jane! AA1664 2021-12-20) who had a herniated disc years ago. She was out for 1.5 years, horrible same as mine but thankful it happened for proper longterm body care. Meant a lot to hear this empathy.
    • TENS helping a lot.
    • Deleted all my Blockchain for Devs notes but will revisit and actually perform exercises next round instead of just airplane-reading.

  • Friday

    • World Chess Championship Game 6 today was fantastic. Nearly 8 hours. Multiple flips. Queen-bishop vs rook-rook-knight endgame. 136 moves, breaking the length record in a wcc.
    • https://aws.amazon.com/blogs/aws/top-announcements-of-aws-reinvent-2021/
    • Video Speed Controller chrome extension for all the on-demand AWS re:Invent content. Huge.
    • JupyterHub 2.0 out, including RBAC.
    • Image builds.
      • Remember to enable buildkit.
      • Classics: multistage and alpine (or busybox).
        • Alpine's repos have ~10k pkgs (about 5-10x fewer than the main distros).
        • Remember alpine uses musl instead of the standard gnu c library so your build stage has to accommodate this when compiling the binary (usually a tag, something like “FROM golang:alpine”). Or just build on alpine, adding build-base.
      • There’s also “scratch” but you’ll miss a lot from alpine. No shell.
      • -static for independence (but size). Always check with ldd.
        • For go, CGO_ENABLED=0 to make static.
      • Java: run on a JRE, not a JDK, for size. openjdk:11-jre, openjdk:14-alpine. jlink to simplify build -> run env.
      • Remember precompiled libs for python (eg numpy wheel) are built for a specific platform and libc. The standard manylinux wheels target glibc, so on musl-based alpine pip needs a musllinux wheel or has to compile from source. Pretty much all python apps/services are fine, but data science on python alpine is not as seamless.
      • Every filesystem-changing instruction (RUN, COPY, ADD) is a layer. Every instruction is designed to be as hermetic as possible. Leverage this to share common layers ACROSS different images, of course. This cache is shared. Every layer is compressed by docker.
      • For scratch or other ground-up, make sure to copy in /etc/ssl so you trust the cert signatures that servers respond with. You may need timezone info as well, /usr/share/zoneinfo. Or /etc/passwd or /etc/group.
    • Redis stands for REmote DIctionary Server. Never knew that.
      • Modules for search, graph, timeseries, documents. Core is kv, of course.
    • Good overview of Delos at FB (more design, not technicals): https://maheshba.bitbucket.io/blog/2021/10/19/42Things.html.
      • Reinforcement of my daily practice, from another perspective: “Write papers. Writing for an audience that has zero context on what you are doing will force you to examine and clarify your assumptions.”