2021-12-27 – Mostly Software

Peter Thiel’s Stanford talk.
- https://www.youtube.com/watch?v=3Fx5Q8xGU8k
- Competition is for losers.
- Be the big fish in a small market, then grow the market (rather than enter as a small fish in a big market).
3 on restaurants/drugstores. Temp benefits with doordash+lyft. 200 after 500 in first 3mo.
Read a bit about fund fees, different bonds, windfalls, taxes, and more while making fried chicken (remember cornstarch + baking powder for extra crunch).
Finished AWS re:Invent.
- Deep dive on Amazon EKS.
  - Vanilla k8s, other than the occasional security patch. They manage nodes, OS, kubelet, CRI, AMI (plus the full standard k8s control plane). Upgrading the cluster's k8s version is turnkey (then test).
  - AWS Key Management Service, AWS Certificate Manager, AWS Secrets Manager. No need for Vault or others.
  - IAM roles map to service accounts, and pods then run as that service account (SA <-> NS is not mapped 1-1). RBAC and cluster separation for different tenants, of course.
  - No SSH; access nodes through AWS Systems Manager.
  - They scale at the cluster level, even up to tens of thousands of nodes. Also intelligent scaling of stuff like maxRequests. They have a custom next-gen autoscaler called Karpenter (although HPA/VPA/CA are available). https://github.com/aws/karpenter.
  - Supports OpenTelemetry, Fluent Bit, Prometheus, and many others for observability. Flux for GitOps (although Argo is better).
- Kubernetes at AWS: Strategy, road map, and vision.
  - Combining with AWS Outposts, remember you can run EKS on-prem.
  - Supports all the usuals. Helm, Spinnaker, Istio.
  - Bottlerocket is the AWS OS built specifically for containers.
- Up-level your container image security with the latest from Amazon ECR.
  - Managed container registry. Integrates with all the expected AWS services: EKS, ECS, EC2, Fargate. Currently ~8B images downloaded per week.
  - Scalable, storage-wise. Runs security too, scanning all images (Amazon Inspector). This runs continuously, not just on push. Not just OS, but all layers of images. And can sign/verify images.
- Delivering code and architectures through AWS Proton and Git.
  - Proton handles IaC (Terraform), pipelines (Jenkins), and observability (Prometheus). Offers templates for devs to start, apply, verify against, and deploy.
  - Some template verification offerings as well.
- Amazon Builders’ Library: Operational excellence at Amazon.
  - Sending “Ops Win” emails, good milestones with large audiences to encourage ops culture.
  - Retros and postmortems, regularly.
  - Go over prepared content together, but also have free time to look at graphs as a team. Open your Datadog dashboards and ask for insights.
- Best practices for securing your software delivery lifecycle.
  - Security testing (in your CI pipelines) must include static and dynamic security analysis.
  - AWS CodeArtifact, AWS CodeGuru, AWS CodeBuild, AWS Parameter Store, AWS CodeDeploy, and AWS ECR (aforementioned container scans).
  - Integrate with the code review interface, and use ML to automatically suggest changes.
- How to reuse patterns when developing infrastructure as code.
  - AWS Cloud Development Kit and AWS CloudFormation. Competitor to Terraform.
  - Create reusable/sharable components/modules.
- Automating cross-account CI/CD pipelines.
  - Not much.
- Using feature flags to avoid downtime during migrations (LaunchDarkly).
  - New features, debug logging, heavy load paths, subjective canaries, switching databases, more.
- Slack is the digital HQ for AWS developers and DevOps teams (Slack).
  - Incident management, organized involvement, anyone can spectate. Much agreed.
  - Standups, PR activity, workflows, emoji responses, everything else. All known, but all good.
- On AWS, details matter: Why full-stack observability wins (Splunk).
  - Extract with OpenTelemetry. Primarily traces, metrics, logs. SDKs for languages, collectors, UI, infra.
  - Processors/filters to exclude secrets.
  - Investigate path: Errors/latency at the mesh -> zoom into node/app/service -> traces and find bottleneck -> check logs for that.
- Intentional and empathetic observability (Datadog).
  - Don’t start with a ticket titled “Create dashboard”. Start with a metric of concern and focus on it, deliberately for outcomes.
    - Kinda agree. Anomaly detection and generics are much better at insights now than they used to be. Specific ones aren’t always better, and are often biased.
  - Put dashboards in pager description, etc. All the expected. Need a consistent reaction, especially with new hires.
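
Note to self on the feature-flag migration idea from the LaunchDarkly talk above (switching databases without downtime): a minimal sketch of how a percentage rollout might gate reads between an old and a new database. This is not LaunchDarkly's SDK; the `FLAGS` dict, the flag name `read-from-new-db`, and the helpers `bucket`, `is_enabled`, and `read_profile` are all hypothetical, and plain dicts stand in for the two databases.

```python
import hashlib

# Hypothetical in-memory flag store (assumption); a real setup would ask a flag service.
# The value is the percentage of users routed to the new database.
FLAGS = {"read-from-new-db": 25}


def bucket(flag: str, user_id: str) -> int:
    """Map (flag, user) to a stable bucket in 0..99 so a user always gets the same answer."""
    digest = hashlib.sha256(f"{flag}:{user_id}".encode()).hexdigest()
    return int(digest, 16) % 100


def is_enabled(flag: str, user_id: str) -> bool:
    """True when the user's bucket falls under the flag's rollout percentage."""
    return bucket(flag, user_id) < FLAGS.get(flag, 0)


def read_profile(user_id: str, old_db: dict, new_db: dict) -> str:
    """Read from the new database for flagged users; otherwise fall back to the old one."""
    if is_enabled("read-from-new-db", user_id) and user_id in new_db:
        return new_db[user_id]
    return old_db[user_id]
```

Dialing the percentage from 0 to 100 moves traffic over gradually, and setting it back to 0 is the rollback. The fallback to the old database when a record is missing from the new one is a design choice for the sketch, assuming a backfill may still be in progress.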

Monday