Storage Platforms for the Modern Era

One of my teams at work is the online storage team (love it!), so I’m focused on improving the availability of these systems and their behavior during incidents or degradation, ie all the MTs: MTTR, MTBF and MTTD.

I’ve been spending a lot of time considering reliability improvements and turning those into a series of architecture and storage design principles to follow.

For these systems we’ve learned that our greatest leverage is in MTBF, which is predicated on operating these systems in the messy real world of cloud computing with the expectation of hardware failures (gp2, for example, is specified at 99.8 to 99.9% durability per volume per year).
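
To make that figure concrete, here is a back-of-the-envelope sketch. It reads the 99.8 to 99.9% number as a 0.1 to 0.2% chance that any given volume fails within a year and assumes volumes fail independently; both are simplifying assumptions, and the fleet size of 500 is hypothetical.

```ruby
# Back-of-the-envelope: how often should a fleet expect a volume failure?
# Assumption: each volume fails independently with the given annual probability.

# Expected number of volume failures per year across the fleet.
def expected_failures_per_year(volumes, annual_failure_rate)
  volumes * annual_failure_rate
end

# Probability that at least one volume in the fleet fails within a year.
def probability_of_any_failure(volumes, annual_failure_rate)
  1.0 - (1.0 - annual_failure_rate)**volumes
end

fleet_size = 500 # hypothetical number of EBS volumes
[0.001, 0.002].each do |afr|
  expected = expected_failures_per_year(fleet_size, afr)
  any = probability_of_any_failure(fleet_size, afr)
  puts format('AFR %.1f%%: ~%.1f failures/yr, %.0f%% chance of at least one',
              afr * 100, expected, any * 100)
end
```

Even at the optimistic end, a fleet of a few hundred volumes should plan for volume loss as a routine event rather than a surprise.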

What’s a recipe for improving current system behaviors?

Partition critical and non-critical workloads
Use read replicas as heavily as consistency requirements allow
Choose the right cost/reliability threshold for your workloads (gp3 vs io2)
Remove cross-shard queries from your critical path
Run storage systems with sufficient headroom that you don’t have to performance tune in a panic
Ensure 1-2 years of architectural runway on systems in case you need to shift to a new storage platform (ie eat your veggies first)
Horizontally shard at the application layer by promoting hot or large tables out to a dedicated cluster (ie for Aurora MySQL)
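
As a rough illustration of the last item, application-layer sharding by table promotion can be as small as a lookup from table name to cluster. This is a minimal sketch, not how any particular system does it; the class name, cluster names and table names are all hypothetical.

```ruby
# Sketch of application-layer horizontal sharding: hot or large tables are
# "promoted" to dedicated clusters; everything else stays on the default one.
# Cluster and table names here are hypothetical.
class TableRouter
  def initialize(default_cluster:, promoted: {})
    @default_cluster = default_cluster
    @promoted = promoted.transform_keys(&:to_s)
  end

  # Returns the name of the cluster that owns the given table.
  def cluster_for(table)
    @promoted.fetch(table.to_s, @default_cluster)
  end

  # Tables on different clusters cannot be joined in one query, so callers can
  # use this check to keep cross-cluster queries out of the critical path.
  def same_cluster?(*tables)
    tables.map { |t| cluster_for(t) }.uniq.size <= 1
  end
end

router = TableRouter.new(
  default_cluster: 'main-aurora',
  promoted: { events: 'events-aurora', ledger_entries: 'ledger-aurora' }
)
router.cluster_for(:users)            # "main-aurora"
router.cluster_for(:events)           # "events-aurora"
router.same_cluster?(:users, :events) # false
```

The same check doubles as a guard for the earlier item about removing cross-shard queries from the critical path.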

With enough business success and data intensive features, you’ll hit an inflection point where you must either adopt a new online storage technology OR invest heavily in a sharding layer on top of your existing storage (Vitess on MySQL). Based on evidence of adoption at other companies, adopting a complex sharding solution is more expensive and less featureful than adopting a full storage platform that includes those features.

Ideal online storage system characteristics

Legend: ✅=yes, ☑️=partial, ⭕=no

Feature / Behavior	Aurora MySQL 5.x	MongoDB 3.x	TiDB 6.x	ScyllaDB 2022	FoundationDB 7.0
No downtime or errors due to node failure, maintenance or upgrades	⭕	☑️ ¹	☑️ ²	✅ ³	✅ ⁴
Workload priority management	⭕	⭕	☑️	✅ ⁵	☑️
Placement rules	⭕	✅ ⁶	✅ ⁷	⭕	⭕
Table partitioning	⭕	☑️	✅	⭕	✅
Full hardware utilization of writer/readers	⭕	⭕	✅	✅	✅
Ability to transparently and safely rebalance data	⭕	☑️ ⁸	✅	✅	✅
Linear horizontal scaling	⭕	✅ ⁹	✅	✅	✅
Change Data Capture	✅	✅	✅	✅	✅
Prioritize consistency or availability per workload	⭕	⭕	⭕	✅	☑️
Good enough support for > 1 workload (k/v, sql, document store)	⭕	⭕	✅	☑️	✅
Low operational burden	✅	✅	✅	✅	✅
Supports/allows hardware tiering in db/table	⭕	☑️	✅	⭕	⭕
Safe non-downtime schema migration	⭕	☑️	✅	✅	✅
OLAP workloads	⭕	⭕	✅ ¹⁰	☑️	⭕
SQL-like syntax for portability and adoption	✅	⭕	✅	☑️	⭕
Licensing	✅	☑️ ¹¹	✅	☑️ ¹²	✅
Source available	⭕	✅	✅	✅	✅

Legend

(Chart filled in using my significant experience with MongoDB (<= 3.x) and Aurora MySQL (5.x), but my knowledge of TiDB, ScyllaDB and FoundationDB comes from architecture, documentation, code, and articles)

Predictions for the next 10 years of online storage

Tech companies will require higher availability so we’ll see a shift towards multi-writer systems with robust horizontal scalability that are inexpensive to operate on modest commodity hardware.
Storage systems will converge on a foundational layer of consistent distributed K/V storage with different abstraction layers on top (TiDB/Tidis/TiKV, TigrisDB, FoundationDB) to simplify operations with a robust variety of features.

¹ Downtime during failover but good architecture for maintenance and upgrades
² Downtime during failover but good architecture for maintenance and upgrades
³ Best
⁴ Best
⁵ In Enterprise version
⁶ In Zones
⁷ Placement Rules
⁸ Architecture in 3.x has risky limitations and performance issues on high throughput collections
⁹ See footnote 8
¹⁰ HTAP functionality through Raft learner nodes on TiFlash with columnar storage
¹¹ SSPL terms are source available but not OSI compliant, and carry theoretical legal risks in usage on par with AGPL, except less widely understood
¹² AGPL

Preparations for Exiting Non-Zero

micro grief death

What happens to your digital assets, legal assets, crypto, photos, emails, laptops, servers, and bereaved when you suffer from a fatal accident and exit non-zero?

If you’re the one left behind, what will spare you extra layers of trauma during the process?

(Based on a recent tragedy I’m close to… RIP mah friend </3)

First off, keep your passwords in a password manager and preserve the emergency recovery kit with one or more parties. If paranoid, split the content among N parties where it requires N-2 in order to recreate the full unlock info. If you have an attorney or a safety deposit box, consider keeping a copy there. Like physical security and trusting your local hardware, if you can’t trust your lawyer or deposit box you have bigger problems that need solving.
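
One standard way to get the "any N-2 of N parties" property is Shamir's Secret Sharing. The post doesn't name a scheme, so treat this as an assumption: a minimal sketch for an integer secret, with hypothetical function names, meant to show the idea rather than to protect real credentials.

```ruby
require 'securerandom'

# A prime larger than any secret we split (2**127 - 1 is a Mersenne prime).
PRIME = 2**127 - 1

# Split an integer secret into n shares, any k of which recover it.
def split_secret(secret, n:, k:)
  raise ArgumentError, 'secret must be in 0...PRIME' unless (0...PRIME).cover?(secret)

  # Random polynomial of degree k-1 whose constant term is the secret.
  coefficients = [secret] + Array.new(k - 1) { SecureRandom.random_number(PRIME) }
  (1..n).map do |x|
    y = coefficients.each_with_index.map { |c, i| c * x.pow(i, PRIME) }.sum % PRIME
    [x, y]
  end
end

# Recover the secret from k (or more) shares via Lagrange interpolation at x = 0.
def recover_secret(shares)
  total = shares.sum do |xi, yi|
    num = 1
    den = 1
    shares.each do |xj, _|
      next if xj == xi

      num = num * -xj % PRIME
      den = den * (xi - xj) % PRIME
    end
    # Modular inverse of den via Fermat's little theorem.
    yi * num * den.pow(PRIME - 2, PRIME)
  end
  total % PRIME
end

secret = 1_234_567_890
shares = split_secret(secret, n: 5, k: 3) # N = 5 parties, N - 2 = 3 needed
puts recover_secret(shares.sample(3)) == secret
```

Fewer than k shares reveal nothing about the secret, which is the property that makes handing one share to each party reasonable.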

Keep a running list of your accounts, stocks, investments, company equity, stock options, RSUs, beneficiaries, etc. Either share the link to this spreadsheet ahead of time with loved ones or keep record of it in your password manager. Perhaps use a subtle tag like exit-127 in your password manager on all logins that should be investigated in the case of your death.

Use OTP instead of SMS and store it somewhere that could be recovered if you’re hit by a bus.

Store your laptop login in your password manager, with clear naming, so your bereaved can get in to figure out your accounts.

Make sure you and sig-other are both listed on accounts and cross referenced as beneficiaries, esp if unmarried.

Set up a Medical Power of Attorney and a Will or Trust… talk to an estate lawyer.

Share instructions about your hardware wallet. Make sure the ones you’ll leave behind know how to unlock it and have a trusted friend that can assist for redundancy.

Expect beneficiaries to need to engage with your most recent place of employment to wrap up loose ends, like returning company property (laptop, notebooks).

As the bereaved, pace yourself. Tackle the immediate needs and deal with grief; you can cancel Netflix and Spotify later. Speaking of which, grab credit card statements, bank statements, and the Mint or YNAB login, and figure out weeks later what needs to be canceled. Get help from others for all the things you can outsource. It’s going to be additional hardship working through the paperwork and logistics while thinking about your beloved.

So long for now.

G’luck and don’t exit non-zero.

Hiring That First Backend Engineer

micro transcription stream-of-consiousness startup engineering excellence principles

[Facts adjusted to protect the innocent]

A friend approached me to talk about what to look for when hiring their startup’s first backend engineer. The company is ~8 people, with a frontend/3d CTO and 2 strong engineers on the frontend. There is currently no backend tech or staffing, so this is hiring from the ground up.

First off, it’s hard to hire well for a skillset you lack, so get advice from people who have that expertise.

For personality, they need to be scrappy and able to do more with less. This is entirely at odds with the expectations that exist for staff in larger corporations. You want more “indie hacker” style than deep expertise on LSM Trees.

They need to be personable and able to work well with various business units, since it’s all hands on deck to survive the first couple years of a startup. They might need to be front line support for Enterprise customers and will require the patience and tact to retain that customer despite less stable technical systems.

They need to understand the build-versus-buy debate and strongly prefer to buy when staffing is short. They will do things with Zapier, Airtable, etc that you find appalling… and they’ll use that saved time to invest in mission critical systems getting the product features or reliability they need.

They know how to build reliable systems, but save their efforts for the systems that need it. Sometimes a scholastic grade of C+ is the right benchmark for Tier 2 systems and they know how to make the tradeoff.

They’re biased towards boring technologies and realize how few innovation tokens a young startup can truly spend. At this stage of my own development, this means they’re using tech like Golang, gRPC, Aurora MySQL, and hosted solutions. They realize every hour needs to be delivering business value and reinventing wheels with self-hosting or new flashy tech is a failure mode.

They see technology as a means to an end of delivering product features and “wow” for customers.

They’ll need to be the DevOps and Developer Efficiency team, on top of their main backend architect and coder role. They need the skillset and willingness to design IAM permissioning, wrangle sensible and secure defaults into AWS/GCP, and ensure developers have a minimal amount of tooling and support to be effective.

They’re the ones who will set up your continuous integration, continuous deployment, rollbacks, operational manuals, paging system, alerting, metrics, commit hooks, linting, unit testing, end to end testing, staging environments, etc.

They’re designing a storage layer that will likely survive for 5+ years… so while they’re not planning on ROFL scale, they have a general idea of how the current system would evolve into ROFL scale. (TLDR Please use Aurora MySQL… if that can’t handle your ROFL scale… consider DynamoDB, or MongoDB, which can rofl-scale with coaxing and mild care).

They’ll set up your data warehousing and long term archival (blob storage). They don’t need to be an expert here but should have seen it done before. If they can use a hosted solution for warehousing, that’s best. Otherwise, they know enough to choose among various hosted/self-hosted solutions (Snowflake, Spanner, Redshift, ClickHouse).

They’ll work with your analysts and eventually will set them up with a dedicated Business Intelligence tool. When I sourced this in a prior company we settled on Periscope and it treated us well. They’ll make sure you run it off a read replica so you don’t endanger production.

They’ll do the rest of your backend hiring, so they should have skill and experience in hiring, firing, and leadership. They don’t have to be a people manager but do need to be willing to act in whatever way is best for the organization.

They need to be scrappy: ie able and willing to tackle any problem. They should also be willing and able to work on the frontend if that’s the most critical task of the moment. If they can’t solve a problem, they know to find someone who can for advice or admit they’re stuck and brainstorm solutions.

They need to be product minded and think of how their expertise can unlock that product vision. They’ve either considered founding their own startup, worked in a small one or done entrepreneurial work before.

They need to understand the value of a dollar in your startup. It’s different than in larger corporations with greater runway. If someone isn’t productive, it could result in the full startup failing and putting everyone out of work… better to coach that person and fix it or send them packing.

They need to be presentable among a wide audience: sales staff, investors, techcrunch, other engineers.

Hyper Growth: A Hyperbolic Primer

micro

I’ve been thinking about scaling in my current role as the Manager of Managers for our Platform Group.

No, this isn’t about a ~~ridiculously~~ impressively large MongoDB cluster, or scaling MongoDB config servers with insufficient network bandwidth, or finding the right abstractions for a tangled web of an area, or using WiredTiger’s zlib compression to reduce storage needs, and no, it isn’t about platform services with high QPS.

I’ve been thinking about scaling people and their units of operation.

We’ve been growing ~30-50% per year in my part of the Engineering organization and I see the inherent challenges of staying far enough ahead of that growth curve so that it’s a sustainable climb.

At this rate of growth:

What you start planning now needs to be the solution that will serve from now+6 months to now+18 months. Anticipate the execution lag. Watch the more distant horizon.
Hire leaders before you’ve filled your IC ranks, otherwise you’ll be ~6 mo or more behind and straining existing leaders.
Treat org design as you would distributed systems. Anticipate the following challenges and be ready with redundancy and fallback positions:

Attrition
Promotion
Re-organization
Parental leave
Bus factor
Burnout

Believe in specialization of labor and use that as a way to horizontally scale leadership and to help IC engineers focus on their area of expertise. Hire PMs and EAs to scale out EMs, their bosses and the ICs.
Invest in the recruiting org to enable hyper growth hiring. Recruiters need to be hired in advance of the true necessity or results will be 6 months delayed.
Hire additional EMs when your teams exceed 5 engineers so that the group can avoid scaling bottlenecks. By the time a new EM starts, you’ll have 8 engineers in that group and be blocked from scaling further without starting to stretch leadership thin.
Remove as much friction from hiring as possible and get better at managing performance.
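
The planning horizons above are easier to reason about with the compounding written out. A small sketch, assuming smooth compound growth and a hypothetical starting size of 20 engineers; the span of 5 engineers per EM comes from the guideline above.

```ruby
# Project headcount under compound annual growth.
def projected_headcount(current, annual_growth, months)
  (current * (1 + annual_growth)**(months / 12.0)).round
end

# Engineering managers needed if each EM supports at most `span` engineers.
def managers_needed(engineers, span: 5)
  (engineers / span.to_f).ceil
end

current = 20 # hypothetical starting size
[0.30, 0.50].each do |growth|
  [6, 18].each do |months|
    engineers = projected_headcount(current, growth, months)
    puts "#{(growth * 100).round}%/yr, now+#{months}mo: " \
         "#{engineers} engineers, #{managers_needed(engineers)} EMs"
  end
end
```

Running the numbers for now+18 months rather than today is the point: the leadership and recruiting hires have to be sized for the later figure.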

2021 Year in Review

micro 2021 yearly-review

Introduction

2021, the second year of pandemic.

I joined a Fintech company at the recommendation of a friend who works there and during the interview process learned that I was VERY excited about leading a team dealing with their scaling problems. In hindsight, I should remember that all large and challenging tasks come with a commensurate amount of stress :P.

I started with them in the spring, leading a team in their Platform division and ended the year by taking over leadership of the Platform division. I’ll spend 2022 guiding the managers in our division on projects that are the backbone of most other development in our engineering organization. By end of year, our division will be ~40 engineers. That’ll keep me busy with hiring, mentoring managers, and guiding engineers on key projects.

2021 was a good year personally in many ways but a hard one with isolation. I’ve been working remotely for a decade, in a combination of fully remote companies and hybrid companies. I’ve loved my remote work experience, except once the pandemic hit and cut my social life off outside work. This is the first role I’ve taken that’s leading a colocated group. Honestly, it was one of the few hesitations I had when taking this role. But because my group was colocated and happily so, I started leaving my house 3x/wk to commute to our office. It’s been wonderful! I’m happy to be out of my house, making social connections, and having new stimuli.

The commute itself is long and much earlier than I would normally think a good idea. But I’ve gotten accustomed to an earlier bedtime and a 5:45am alarm clock… and having an hour or more in the office before the day starts for my individual responsibilities adds to my happiness, accomplishments and satisfaction.

All in all, the year closed out well after going through rocky spots, and I’m grateful that it was gentle to myself, friends and loved ones. I even learned that someone I had mentored not only was able to transition into working in tech from a non-traditional background, but they recently started a role as a Lead Software Engineer. We met because they interviewed at my company in 2017 or 2018 for a junior role. They weren’t yet far enough along to hire, but I was impressed at their story and grit. So I turned them down for the role and offered to mentor them as they pursued leading software development. I’m proud of them and their accomplishments and it was a wonderful surprise to hear about their next role during the Christmas holiday.

Accomplishments

Wrote 2 plugins for the Joplin note system (auto tagger and auto alarm)
Prototyped many MongoDB related systems/configurations/approaches to solve foundational issues
- Terraform for Mongo configurations (unpublished)
- Mongo cluster automation via terraform, auto-scale groups, consul link
- A chunk splitter for MongoDB cluster on older version that lacks correct chunk size behavior when using many mongos (unpublished but open source potential)
- MongoDB and I are currently on “I know how to search source code for our version of it rapidly and have bookmarks for it in sourcegraph” terms.
Solved challenging longstanding large db cluster scaling issues and improved cluster resilience and team operational knowledge. Proud of my team and of myself here, it’s a worthy and difficult feat.
Joined our hiring committee at work and automated part of our process with a python script + google sheets + google docs as templates.
Dabbled in learning/writing Rust
Started using Earthly to unify my CI and local experience. It’s the best Docker experience I’ve had. Debugging and building up a command list is wonderful.
Automated deployments of my blog using Github Actions + Earthly.
Automated release builds of my Joplin plugins
Setup shortcuts for blogging to reduce friction. Credit to @brandur for inspiration
Started using “iA Writer” app on Mac for drafting blog posts in distraction free markdown. I use a ruby script to prefill the frontmatter, add file to git and then open file for writing in “iA Writer”.
I tried three flavors of note taking systems at home and work, starting with neuron, followed by a vscode plugin driven neuron like experience with my own custom hashtagging, and finishing the year with Joplin.
- Joplin strikes a nice balance of extensible and batteries included. As noted earlier, I’ve created two plugins of my own to add small quality of life improvements.
- I appreciate Joplin’s proper due date tracking and that I can use plugins or my own code to determine best method for displaying those.
- I appreciate that I can pop open my own editor, if I want more in-depth note taking. I have this set to vimr as my gui option.
- I appreciate the export functionality to dump my notebook into markdown files with front-matter.
- Joplin’s a simple version of what I want out of Evernote but with open source extensibility and a solid plugin architecture.
Learned to make a solid milk foam for lattes and cappuccinos on our work espresso machine, while using the manual setting for the steam wand. I’ve been hawking my wares at work and rustling up takers of my latte efforts so I can more quickly grind barista xp.
Lost ~8% body weight through portion control and eating nutrient rich but not calorie rich foods. It’s stayed steady at that level with negligible amounts of exercise and I plan to push it further in 2022.
I became a more skilled motorcyclist and mechanic.

Posts

I had an inconsistent year of blogging in 2021 that included 12 posts.

My posts were focused around work topics and unsurprisingly I posted more when I had less demanding responsibilities (March while on holiday before starting new role, Nov around holidays and December around holidays).

I was able to post more easily this year because of automating my post-drafting mechanism and my deployment of the site. Those little reductions in friction were worthwhile optimizations and I posted more than I otherwise would have, especially my shorter micro posts.
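
For the curious, the post-drafting automation mentioned above could look roughly like the following. This is a sketch under assumptions, not the actual script: the front matter fields, the `_posts` directory and the function names are hypothetical.

```ruby
require 'date'
require 'fileutils'

# Build a filename slug from a post title.
def slugify(title)
  title.downcase.gsub(/[^a-z0-9]+/, '-').gsub(/\A-|-\z/, '')
end

# Render YAML front matter; the field names are hypothetical.
# Titles containing double quotes would need escaping.
def front_matter(title, tags: [], date: Date.today)
  <<~YAML
    ---
    title: "#{title}"
    date: #{date.iso8601}
    tags: [#{tags.join(', ')}]
    ---

  YAML
end

# Create the draft file (if it doesn't exist yet) and return its path.
def create_draft(title, dir: '_posts', tags: [], date: Date.today)
  FileUtils.mkdir_p(dir)
  path = File.join(dir, "#{date.iso8601}-#{slugify(title)}.md")
  File.write(path, front_matter(title, tags: tags, date: date)) unless File.exist?(path)
  path
end

# Stage the draft in git and open it in the editor (macOS `open -a`).
def stage_and_open(path, app: 'iA Writer')
  system('git', 'add', path) && system('open', '-a', app, path)
end

# Usage:
#   path = create_draft('The Power of Saying No', tags: %w[micro])
#   stage_and_open(path)
```

Bound to a shell alias, a script like this turns starting a post into a single command, which is most of the friction reduction.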

Plans for 2022

Do everything I can to delegate and empower others at work
Utilize better specialization of labor to foster improved morale, velocity and satisfaction in work.
Keep time in my work and personal schedule for heads down focused efforts
Publish at least one substantial open source project, either at work or personally. Top of mind options:
- the mongo chunk splitter
- mongo deployment orchestration
- mongo configuration via terraform style declarations
- something entirely non-mongo, like a mongo wire-protocol compatible distributed database architecture built on sqlite
- something genuinely unrelated to mongo and also not related to automating google docs as templates.
Blog more times than there are months in the year
Learn to ride a dirtbike
Find ways to integrate incidental exercise into my life. Walking 1-1s at work are a good example of this but I want more vigorous pursuits that I enjoy outdoors.
Be proactive in taking vacation
Don’t let perfection be the enemy of done

Writing Two Joplin Plugins

micro joplin

I wrote two Joplin plugins today.

Auto create tags based on any hashtag in the title or body (repo)
Auto alarm creation based on natural language in title repo. With this I can type notes like, “Write blog post due at 4pm tomorrow” and have the alarm auto-set :).
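
The real plugins are written against Joplin's plugin API, so the following is only the tag-extraction rule, sketched in Ruby. The pattern for what counts as a hashtag is an assumption, not the plugin's actual regex.

```ruby
# Extract unique hashtags from a note's title and body.
# Assumed rule: '#' followed by a letter, then letters, digits, '_' or '-'.
# The lookbehind skips things like "##heading" and "issue#12".
HASHTAG = /(?<![\w#])#([A-Za-z][\w-]*)/

def extract_tags(title, body)
  "#{title}\n#{body}".scan(HASHTAG).flatten.map(&:downcase).uniq
end

extract_tags('Plan #blog post', "Ship the #Joplin plugin\n# Heading\n#blog")
# => ["blog", "joplin"]
```

Requiring a letter right after the `#` keeps markdown headings (`# Heading`) from being picked up as tags.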

The Joplin plugin interface was clean and simple once I browsed through the docs and a few example plugins.

PS - Thanks to @forcewake on Github for the plugin that accounts for 50% of the code in my auto-tag plugin repo

Mongo Deployment Experiment Using ASG, Consul and Terraform

mongodb terraform packer edited

Introduction

I’m exploring Terraform to deploy MongoDB sharded clusters. Aka, binge coding Terraform, Packer and Consul was a delightfully obsessive way to spend the recent holidays. I’m learning a lot of Terraform, Consul and Packer along the way :). Terraform’s impressive and I’m enjoying architecting a robust system from AWS’s building blocks.

Design

The principles this system follows are self-healing and immutability.

Components:

Packer - builds server images of the core functionality (mongod shardsvr, mongod configsvr, mongos) on top of a base image of Ubuntu LTS with consul agents pre-configured.

Terraform - deploys, updates and deletes AWS infrastructure:
- SSH keys
- Security groups
- VPCs and Subnets
- Auto-scale groups + Launch configurations
- EBS Volumes (boot disks and db data storage)
Consul Servers - these are the 3-5 servers which form the stable ring of consul server elements.
Consul - each mongo server has consul set up with auto-join functionality (aka retry_join: ["provider=aws..."]) based on AWS tagging.
- Used for dynamic DNS discovery in conjunction with systemd-resolved as a DNS proxy.
- Will use consul-template to update config files on servers post launch. (I’m hoping this is an elegant solution for ASG booted instances that need final configuration after launch and a way to avoid having to roll the cluster for new configurations.)
Auto-scale groups (ASG)
- Each mongo instance is an auto-scale group of 1.
- ASG monitors and replaces instances that become unhealthy
Auto re-attach of EBS data volume
- In the event that a mongo instance becomes unhealthy, ASG replaces the node but it will initially lack the db data bearing EBS volume.
- That EBS volume is prohibitive to recreate on large volumes when considering the restoration time + needing to dd the full drive to achieve normal performance.
- Instead of creating a new volume, a cronjob runs every minute on each configsvr and shardsvr and tries to re-attach the EBS db data volume paired to that instance, using metadata from the EC2 instance tags and the EBS volume tags.
- The cronjob looks up the required metadata and executes aws-volume-attach.
- If the volume is currently attached, aws-volume-attach is a no-op.
EBS Volumes
- DB Data volumes are separately deployed and persist after separation from their instance.
- These will be in the terabyte size range.
- To replace a drive (corruption/performance issues)
  - Provision an additional drive from snapshot using terraform
  - Update the metadata of that shard’s replicaset member to point to the new drive’s name
  - terraform apply
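
The re-attach step above is the piece most worth getting idempotent, so here is a rough sketch of its logic. It is not the actual aws-volume-attach script: `ec2` is anything that responds to `describe_volumes` and `attach_volume` the way `Aws::EC2::Client` from the aws-sdk-ec2 gem does, and the tag key `mongo-member`, the device name and the stand-in client are hypothetical.

```ruby
# Idempotent re-attach of a data volume, shaped like the cron job described above.
# The tag key "mongo-member" and the default device name are hypothetical.
def reattach_data_volume(ec2, instance_id:, member_name:, device: '/dev/xvdf')
  volumes = ec2.describe_volumes(
    filters: [{ name: 'tag:mongo-member', values: [member_name] }]
  ).volumes
  volume = volumes.first
  return :volume_not_found if volume.nil?

  # Already attached to this instance: nothing to do.
  return :already_attached if volume.attachments.any? { |a| a.instance_id == instance_id }

  # Still attached elsewhere (eg the old instance is terminating): retry next minute.
  return :in_use_elsewhere unless volume.state == 'available'

  ec2.attach_volume(device: device, instance_id: instance_id, volume_id: volume.volume_id)
  :attached
end

# Minimal stand-in client so the sketch runs without AWS credentials.
Attachment = Struct.new(:instance_id)
Volume = Struct.new(:volume_id, :state, :attachments)
Response = Struct.new(:volumes)

class FakeEc2
  def initialize(volume)
    @volume = volume
  end

  def describe_volumes(filters:)
    Response.new([@volume])
  end

  def attach_volume(device:, instance_id:, volume_id:)
    @volume.state = 'in-use'
    @volume.attachments << Attachment.new(instance_id)
  end
end

ec2 = FakeEc2.new(Volume.new('vol-123', 'available', []))
reattach_data_volume(ec2, instance_id: 'i-new', member_name: 'shard0-a') # :attached
reattach_data_volume(ec2, instance_id: 'i-new', member_name: 'shard0-a') # :already_attached
```

Because every outcome other than `:attached` changes nothing, running this once a minute from cron is safe.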

My next steps are to automate the replicaset bonding and then shard joining. The open source tooling for this portion isn’t what I want, with the closest being mongo ansible. It’s an established tool but I want something more declarative and a simpler model of what it will do when executed. As a result, the answer might be a custom terraform provider to manage the internal configuration state of MongoDB. Philosophically the CRUD resource management and plan/deploy phase of Terraform matches what will give me confidence using this on production clusters.

I’ll open source the work if it gets to a mature spot. Right now the terraforming successfully spins up all the mongod nodes, networking, VPCs, security groups, ec2 instances, ebs volumes and they auto-join their consul cluster.

Credit for the concept of this approach belongs to multiple different blog posts, but the original idea of ASG + EBS re-attaching came from reading about how Expedia operates their sharded clusters. Thanks!

Neovim+Vscode fix nargs bug

micro bug neovim vscode

VSCode+Neovim wasn’t working for me today and I took the time to debug and find the patched issue.

When starting VSCode with Neovim plugin, my VSCode displayed an error code:

line   34:
E1208: -complete used without -nargs
line   10:
E1208: -complete used without -nargs

Which led me to first finding a bug report in vim-ripgrep, and I assumed the problem was in vim-ack due to similarity of purpose.

I removed that plugin from my plugged directory and restarted VSCode, but no luck.

Next I discovered a vscode-neovim bug and manually patched the two files per diff:

# Note prefix directory will depend on your choice of nvim
# package manager. I'm using plug.
~/.vim/plugged/vim/vscode-file-commands.vim:34
~/.vim/plugged/vim/vscode-tab-commands.vim:10

VSCode worked on the next start, and once a new version of vscode-neovim is released, it will auto-update 😊.

Awesome Bundler Feature: Inline Gemfiles

micro ruby

Ruby’s Bundler introduced a feature that I’m loving for prototypes and quick scripting:

Inline Gemfiles

Introduced in v1.10, these avoid needing a separate declaration of a Gemfile and you write it directly into the script you’re creating.

It fixes the following ergonomic issues:

Writing a separate Gemfile when your program is a single Ruby script is overkill
Packaging your Gemfile to accompany aforementioned script is a hassle
For single file scripts you can now rely on Gems!

How To Use It

require 'bundler/inline'

# Declare Gemfile in ruby script
gemfile do
  source 'https://rubygems.org'
  gem 'mongo'
  gem 'pry'
end

# bundler/inline has already required the gems declared above
client = Mongo::Client.new(ENV.fetch('MONGO_URL'))

PS

I’m not writing much Ruby these days because I work leading engineers, but when I hack on things in my spare time I’m prototyping and want to iterate on an idea before converting it to Golang. I did this recently when iterating on how to speed up controlled MongoDB failovers and when spiking out a terraform-like system for MongoDB configuration management.

Bundler-inline is great for these situations because they’re a single Ruby script but require installing Gems :).

The Power of Saying No

micro

As an engineering leader, saying no is a superpower.

Counter: Saying yes and imagining what’s possible is a superpower.

Honorable mention: Avoiding becoming a gatekeeper or single point of failure is a superpower

TLDR: Build distributed and fault tolerant human systems. Reject what needs to be rejected and say yes to ambitious things you can’t yet conceive.

xargs.io

All the IO and Multiplexing

01 Nov 2022

16 Feb 2022

12 Feb 2022

08 Feb 2022

02 Jan 2022

28 Dec 2021

28 Nov 2021

26 Nov 2021

24 Nov 2021

13 Apr 2021