- Played with Linear, a new Jira-ish platform for issue tracking. It’s pretty sleek. https://linear.app/mahlstedt
- CloudWatch.
- Created a custom monitoring dashboard in CloudWatch for fun, then deleted it.
- Basic monitoring is free and samples in 5-minute periods. To increase to 1-minute periods, you can enable Detailed Monitoring, but then you incur charges.
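- If you ever want to toggle that from code instead of the console, boto3 exposes it directly (a minimal sketch; the instance ID is a placeholder):
    import boto3

    ec2 = boto3.client('ec2', region_name='us-west-1')
    # Switch to 1-minute Detailed Monitoring (the paid tier).
    ec2.monitor_instances(InstanceIds=['i-0123456789abcdef0'])
    # Drop back to free 5-minute basic monitoring.
    ec2.unmonitor_instances(InstanceIds=['i-0123456789abcdef0'])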
- EC2 doesn’t natively monitor memory. You have to manually install and configure the CloudWatch agent on the instance: https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/Install-CloudWatch-Agent.html
- wget https://s3.amazonaws.com/amazoncloudwatch-agent/ubuntu/amd64/latest/amazon-cloudwatch-agent.deb (just for x86-64 ubuntu).
- sudo dpkg -i -E ./amazon-cloudwatch-agent.deb
- Created the config /opt/aws/amazon-cloudwatch-agent/bin/config.json and had it collect mem_used_percent every 5 minutes.
- sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -a fetch-config -m ec2 -s -c file:/opt/aws/amazon-cloudwatch-agent/bin/config.json
- Created an IAM role with the managed CloudWatchAgentServerPolicy attached and associated it with the EC2 instance.
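- The same role setup could be scripted with boto3 if you didn't want to click through the console (a rough sketch of the steps above; the role/profile name and instance ID are placeholders):
    import json
    import boto3

    iam = boto3.client('iam')
    ec2 = boto3.client('ec2', region_name='us-west-1')

    # Trust policy so EC2 can assume the role.
    trust = {'Version': '2012-10-17',
             'Statement': [{'Effect': 'Allow',
                            'Principal': {'Service': 'ec2.amazonaws.com'},
                            'Action': 'sts:AssumeRole'}]}
    iam.create_role(RoleName='CloudWatchAgentRole',
                    AssumeRolePolicyDocument=json.dumps(trust))
    # Attach the AWS-managed policy the agent needs.
    iam.attach_role_policy(
        RoleName='CloudWatchAgentRole',
        PolicyArn='arn:aws:iam::aws:policy/CloudWatchAgentServerPolicy')
    # EC2 attaches roles via an instance profile (the console hides this step).
    iam.create_instance_profile(InstanceProfileName='CloudWatchAgentRole')
    iam.add_role_to_instance_profile(InstanceProfileName='CloudWatchAgentRole',
                                     RoleName='CloudWatchAgentRole')
    ec2.associate_iam_instance_profile(
        IamInstanceProfile={'Name': 'CloudWatchAgentRole'},
        InstanceId='i-0123456789abcdef0')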
- Wait a few minutes after starting the CloudWatch agent, then check CloudWatch in the AWS console and you can visualize memory.
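- You can also sanity-check that the agent's metrics are flowing without the console (a sketch; CWAgent is the agent's default namespace, but the dimension names depend on your agent config, so check list_metrics first):
    import datetime
    import boto3

    cw = boto3.client('cloudwatch', region_name='us-west-1')
    # The agent publishes its custom metrics under the CWAgent namespace by default.
    print(cw.list_metrics(Namespace='CWAgent')['Metrics'])

    # Pull the last hour of mem_used_percent. The dimensions must match whatever
    # list_metrics printed for your setup (placeholder instance ID here).
    now = datetime.datetime.utcnow()
    stats = cw.get_metric_statistics(
        Namespace='CWAgent',
        MetricName='mem_used_percent',
        Dimensions=[{'Name': 'InstanceId', 'Value': 'i-0123456789abcdef0'}],
        StartTime=now - datetime.timedelta(hours=1),
        EndTime=now,
        Period=300,
        Statistics=['Average'])
    print(stats['Datapoints'])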
- The docs don’t list Ubuntu 22 as a supported platform (yet), but it still worked: https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/Install-CloudWatch-Agent.html
- I believe this will incur a slight cost because metrics published by the CloudWatch agent count as custom metrics. We’ll see.
- Then added memory alongside the main EC2 metrics and created a final dashboard: https://us-west-1.console.aws.amazon.com/cloudwatch/home?region=us-west-1#dashboards:name=EC2withMem
- Favorited some AWS services.
- Supercontest.
- Updated ansible.cfg to remote_user=ubuntu (it uses the current user otherwise, and bmahlstedt does not have access to my new ec2 instances).
- Moved the privkey spec to the hosts file instead of the makefile CLI calls.
- Pointed the privkey to the new pem instead of the old DO one.
- Confirmed that remote backup/restore works.
- IAM.
- Aliased the root account to "mahlstedt" (so I don’t have to log in with the 12-digit account ID).
- Created user group Admin and user supercontest. Wanted a separate user so root access keys aren’t used for daily tasks; the root account should just be used for billing, IAM, that sort of thing. I’ll use supercontest@mahlstedt for most work.
- Cleaned all root access keys.
- Enabled MFA for root and supercontest users.
- Remember “supercontest” is just an IAM user, used to manage AWS permissions. So I will use it to handle EC2 and everything within the AWS mgmt console. I will log in as supercontest. But then on the EC2 instance, the app still runs as the default user “ubuntu” – no change there.
- In VSCode, hold Shift+Alt while selecting to get a column (box) selection, then you can delete it like vim’s visual block.
- Memory profiling.
- Using filprofiler (pip install filprofiler).
- Then “fil-profile run <x>.py”
- filprofiler is more for jobs (ones that start and stop), not services (persistent).
- I’m more concerned with usage over time -> identifying memory leaks.
- Switched to packages that profile running apps. Looked at the builtin tracemalloc for a bit then opted for Pympler as a higher-level suite.
- Has a memory leak detection module (muppy) as well as a class tracker.
- Focusing on muppy – you basically take a summary at the beginning (or end) of a function, then invoke the function a few times. It diffs which objects are new since the last invocation: if nothing is diffed, nothing leaked; if some object shows up in the diff, it stuck around after the call and is leaking. You can whittle down from there to find it in the code (sketch below).
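- A minimal sketch of that workflow with pympler's SummaryTracker (the function under test is a hypothetical stand-in, not the real supercontest route); this is what produces tables like the ones below:
    from pympler import tracker

    def render_matchups_page():
        # Hypothetical stand-in for the route/function under suspicion.
        return [object() for _ in range(100)]

    tr = tracker.SummaryTracker()
    tr.diff()  # discard the first diff, which is dominated by the tracker's own setup

    for _ in range(5):
        render_matchups_page()
        # Prints a 'types | # objects | total size' table of objects created since
        # the previous call. Deltas that never settle back to ~zero are the leaks.
        tr.print_diff()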
- The class tracker lets you check which objects your program creates (including children/parents – it follows the whole object tree) and the size of each.
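- A sketch of the class tracker on the supercontest models (the model import path comes from the tables below; the route call is a hypothetical stand-in):
    from pympler.classtracker import ClassTracker
    from supercontest.models.models import Line, Score

    ct = ClassTracker()
    ct.track_class(Line)
    ct.track_class(Score)

    ct.create_snapshot('before')
    render_matchups_page()  # hypothetical stand-in for the route under test
    ct.create_snapshot('after')
    ct.stats.print_summary()  # per-class instance counts and sizes at each snapshot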
- On the matchups page refresh (just within function, no url preprocessor considerations)
types | # objects | total size
===================================== | =========== | ============
dict | 97 | 44.10 KB
str | 72 | 11.58 KB
set | 31 | 9.02 KB
tuple | 60 | 4.22 KB
weakref | 30 | 2.58 KB
method | 30 | 2.11 KB
sqlalchemy.orm.state.InstanceState | 30 | 1.88 KB
sqlalchemy.util._collections.result | 14 | 1.20 KB
int | 40 | 1.09 KB
supercontest.models.models.Score | 14 | 896 B
supercontest.models.models.Line | 14 | 896 B
flask_sqlalchemy._DebugQueryTuple | 8 | 832 B
datetime.datetime | 16 | 768 B
float | 30 | 720 B
supercontest.models.models.Week | 1 | 64 B
- On the all-picks page refresh (just within function, no url preprocessor considerations)
types | # objects | total size
========================================= | =========== | ============
dict | 295 | 101.28 KB
str | 240 | 14.50 KB
set | 35 | 7.93 KB
list | 33 | 4.99 KB
tuple | 69 | 4.87 KB
weakref | 34 | 2.92 KB
method | 34 | 2.39 KB
sqlalchemy.orm.state.InstanceState | 34 | 2.12 KB
sqlalchemy.util._collections.KeyedTuple | 16 | 1.50 KB
int | 45 | 1.23 KB
supercontest.models.models.Line | 16 | 1024 B
supercontest.models.models.Score | 16 | 1024 B
datetime.datetime | 18 | 864 B
float | 18 | 432 B
flask_sqlalchemy._DebugQueryTuple | 1 | 104 B
- On the leaderboard page refresh (just within function, no url preprocessor considerations)
types | # objects | total size
=================================== | =========== | ============
dict | 242 | 121.92 KB
tuple | 43 | 3.89 KB
str | 3 | 3.88 KB
set | 0 | 2.00 KB
list | 17 | 1.96 KB
list_iterator | 18 | 1.12 KB
flask_sqlalchemy._DebugQueryTuple | 3 | 312 B
float | 12 | 288 B
zip | 1 | 72 B
map | 1 | 64 B
int | -2 | -56 B
- This is not great. It means that each call to (primarily) the all-picks and leaderboard pages leaks a TON of objects, totaling about 150 KB per call.
- Very, very common places for memory leaks: every global var of the Flask app. Not objects created inside a route, but anything in the global space (init, top-level views, etc), as illustrated below.
- For supercontest, all these objects: db, mail, cache, scheduler, csrf, app, blueprints.
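- A contrived illustration of the pattern (not actual supercontest code): any module-level container that a route touches outlives every request, so anything appended to it accumulates for the life of the worker.
    from flask import Flask

    app = Flask(__name__)
    _query_log = []  # module-level: lives as long as the worker process

    def expensive_query():
        # Hypothetical stand-in for a real DB query.
        return list(range(10000))

    @app.route('/matchups')
    def matchups():
        results = expensive_query()
        _query_log.append(results)  # never trimmed -> grows a little on every request
        return str(len(results))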
- Pretty good overall video: https://www.youtube.com/watch?v=s9kAghWpzoE
- Remember autorestarts of workers/etc (especially for webservers) – a very common bandaid for memory leaks (e.g. gunicorn’s --max-requests). They can mask problems, but they’re also a quick fix.
- Went deeper and added /memory-print and /memory-snapshot endpoints to the main blueprint so that I could debug, implemented with tracemalloc. A sketch is below, followed by the results.
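- A rough sketch of what those endpoints look like (the endpoint paths match the bullet above; the blueprint wiring here is a guess, not the actual supercontest code):
    import tracemalloc
    from flask import Blueprint, Flask

    tracemalloc.start()
    _snapshot = None

    main = Blueprint('main', __name__)

    @main.route('/memory-snapshot')
    def memory_snapshot():
        # Record a baseline to diff against later.
        global _snapshot
        _snapshot = tracemalloc.take_snapshot()
        return 'snapshot taken'

    @main.route('/memory-print')
    def memory_print():
        # Diff current allocations against the baseline, grouped by source line.
        # str() of each entry looks like the 'size=... (+...), count=... (+...)'
        # lines in the results below.
        if _snapshot is None:
            return 'hit /memory-snapshot first'
        stats = tracemalloc.take_snapshot().compare_to(_snapshot, 'lineno')
        return '<br>'.join(str(stat) for stat in stats[:5])

    app = Flask(__name__)
    app.register_blueprint(main)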
----
Testing route /season2022/league0/week6/matchups over 50 calls
Memory before: 180432896
Memory after: 180432896
Top 5 increases in mem by line:
/usr/local/lib/python3.7/cProfile.py:87: size=70.1 KiB (-87.2 KiB), count=816 (-1015), average=88 B
/usr/local/lib/python3.7/threading.py:904: size=60.6 KiB (+59.1 KiB), count=1723 (+1680), average=36 B
/usr/local/lib/python3.7/pstats.py:248: size=54.9 KiB (-52.2 KiB), count=439 (-418), average=128 B
/usr/local/lib/python3.7/cProfile.py:131: size=145 KiB (+49.7 KiB), count=1857 (+636), average=80 B
/usr/local/lib/python3.7/site-packages/jinja2/nodes.py:220: size=10.1 KiB (-46.7 KiB), count=129 (-464), average=80 B
----
Testing route /season2020/league0/week3/picks over 50 calls
Memory before: 180432896
Memory after: 180432896
Top 5 increases in mem by line:
/usr/local/lib/python3.7/cProfile.py:87: size=69.9 KiB (-87.5 KiB), count=813 (-1018), average=88 B
/usr/local/lib/python3.7/cProfile.py:67: size=0 B (-69.4 KiB), count=0 (-740)
/usr/local/lib/python3.7/threading.py:904: size=69.2 KiB (+67.6 KiB), count=1967 (+1924), average=36 B
/usr/local/lib/python3.7/pstats.py:248: size=54.9 KiB (-52.2 KiB), count=439 (-418), average=128 B
/usr/local/lib/python3.7/site-packages/jinja2/nodes.py:220: size=10.1 KiB (-46.7 KiB), count=129 (-464), average=80 B
----
Testing route /season2021/league0/leaderboard over 50 calls
Memory before: 180432896
Memory after: 180432896
Top 5 increases in mem by line:
/usr/local/lib/python3.7/cProfile.py:87: size=70.1 KiB (-87.2 KiB), count=816 (-1015), average=88 B
/usr/local/lib/python3.7/threading.py:904: size=77.6 KiB (+76.0 KiB), count=2205 (+2162), average=36 B
/usr/local/lib/python3.7/pstats.py:248: size=56.6 KiB (-50.5 KiB), count=453 (-404), average=128 B
/usr/local/lib/python3.7/cProfile.py:131: size=145 KiB (+49.9 KiB), count=1860 (+639), average=80 B
/usr/local/lib/python3.7/site-packages/jinja2/nodes.py:220: size=10.1 KiB (-46.7 KiB), count=129 (-464), average=80 B