2021-12-26 – Mostly Software

Annual Santa run on the Ducati.
James Webb finally went up. Also watched Don’t Look Up, nice combo.
FIDE Rapid world championship started today. Blitz starts Wednesday.
Added all sbsc payers to the league_user_association table for premium.
More AWS re:Invent:
- Deep dive on Amazon Aurora.
  - MySQL+PostgreSQL. Relational.
  - Data is always stored across 3 availability zones. 1 writer. Up to 15 read replicas per cluster (not per AZ).
  - Locks at the storage layer, not the relational layer (higher). Very fast recovery, it’s doing it from disk all the time (nominally).
  - There’s “DevOps Guru” for RDS as well, anomaly detection. Locks, performance issues in the DB, more. Suggests resolution steps.
  - Babelfish lets Aurora PostgreSQL understand T-SQL and the SQL Server wire protocol, so SQL Server apps can move to Aurora with few code changes.
- Amazon DynamoDB: Driving innovation at any scale.
  - NoSQL, millisecond perf, scalable.
  - Key to scaling is very aggressive sharding. Even to a single row sometimes. Standard request router in front.
  - Case study with Mercado Libre (Argentinian online marketplace). They originally self-managed Cassandra (kv). Migrated to DynamoDB.
- What’s new in Amazon RDS for Oracle.
  - Obv managed Oracle DBs in the cloud. Performant and scalable like all others, common theme.
  - Case study with Goldman Sachs and their transaction banking platform on Oracle.
- Real-world use cases with graph databases.
  - Focus on Neptune. Any data with relationships, connections. Examples: Fraud detection, identity resolution, knowledge organization.
  - Case study: LexisNexis. Law data, court data, vendor data. Users upload their briefs, and LexisNexis suggests others by connecting data. Extract principal components to establish edges. Then give an API that shows important stats (court outcomes, value of citations, etc). Get new users, grow graph.
- Amazon EBS under the hood: a tech deep dive.
  - S3 for object, EBS for block, EFS and FSx for file. All attach to EC2, offer backups, scale, more.
  - Most workloads run on gp3, AWS’ general purpose SSD. io2 Block Express is the high-end EBS volume type: up to 256,000 IOPS, <1ms latency.
- Deep dive on Amazon EFS.
  - “Serverless” (a word used liberally at this conference) and scalable.
  - Can run jenkins, wordpress, airflow, mongo, many.
  - Integrates with ec2, fargate, lambda, eks, ecs, many other aws tools.
  - Automatically moves files into the infrequent-access storage class if they’re not accessed often. Intelligent tiering (customizable).
- Remember Fargate is just serverless compute for containers. Instead of managing your own nodes under EKS (or ECS), you just manage your containers and Fargate provisions capacity based on their cpu/mem requirements.
- S3 Glacier now has an Instant Retrieval storage class.
- Simplify your file-based workloads with Amazon FSx.
  - Again, pay-for-what-you-use. Nice.
  - Supports windows file server, lustre, netapp ontap, openzfs.
  - Same as EFS. Can run hpc, basic apps, home dirs, data analytics, pipelines, whatever.
- Deep dive on Amazon S3 security and access management.
  - Data lakes, log data, content/assets, configs.
  - TLDR: (1) Block public access (2) encryption (3) bucket policies (4) bucket owners.
  - Can have amazon manage keys or provide your own.
  - IAM policies are very configurable.
- Building a data lake on Amazon S3.
  - Driven by data growth. A single DB is fine. Query, understand, perform. With 1000s of random DBs, need to rethink how to get insights.
  - Ok, well ETL everything into a known structure. Sure. But extra work. And you have to maintain the pipelines. Therefore -> data lake.
  - For bursty workloads, add entropy to the key prefix for perf (S3 request rates scale per prefix).
- AWS storage solutions for containers and serverless applications.
  - Discussed EFS, S3, Event Notifications, Storage Lens, X-Ray, more.
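
Quick sketch of the prefix-entropy tip from the data lake talk, as I understand it. Date-based keys all land under one prefix, so a burst hammers a single partition; prepending a short hash of the key spreads them out. Everything named here (`spread_key`, `hash_chars`, the `logs/2021/12/26/...` layout) is made up for illustration, not from the talk.

```python
import hashlib
from collections import Counter


def spread_key(key: str, hash_chars: int = 2) -> str:
    """Prepend a short deterministic hash so keys fan out over many prefixes.

    hash_chars hex characters give 16 ** hash_chars possible prefixes.
    The hash is derived from the key itself, so the full key can always
    be recomputed from the original name when reading the object back.
    """
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return f"{digest[:hash_chars]}/{key}"


# Hypothetical bursty workload: 1000 objects written under one date prefix.
keys = [f"logs/2021/12/26/event-{i:05d}.json" for i in range(1000)]

# Count how many distinct leading prefixes the keys hit before and after.
before = Counter(k.split("/", 1)[0] for k in keys)
after = Counter(spread_key(k).split("/", 1)[0] for k in keys)

print("distinct prefixes before:", len(before))
print("distinct prefixes after:", len(after))
```

Tradeoff: listing by date gets harder, since one day's objects are now scattered over every hash prefix. Probably only worth it when request rates actually get throttled.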

Sunday