Dan Mayer
08 January 2021

Learning with Game Days



Many different companies and posts talk about why and how to run Game Days. I won’t rehash all of that in this post. Instead, I will give a basic intro, link to some sources, and then dive into some more specific Game Days I have recently been involved with.

A game day simulates a failure or event to test systems, processes, and team responses. The purpose is to actually perform the actions the team would perform if an exceptional event had happened.

Below are a few recent Game Day examples, with some details of what we did, what we learned, how we broke something in a “safe” manner, and a runbook to run your own. In all cases, one important factor of running the Game Day is having a way to stop if the simulated incident starts to cause a real incident. When planning a simulated emergency make sure you have a planned way to escape out of the test if something unexpected is occurring.

Safe and Confident Deployments

For various reasons some of our systems were relatively slow to deploy. This means

...



Dan Mayer
02 December 2020

Performance of JSON Parsers at Scale



A recent post, benchmarking JSON Parsers (OJ, SimdJson, FastJsonParser), compared the parsers based on local microbenchmarks. In the end, I recommended going with OJ for almost all general use cases, noting that FastJsonParser might be worth it for specific use cases. I want to do a quick follow-up, sharing what happens when microbenchmarks meet real-world

...



Dan Mayer
15 November 2020

benchmarking JSON Parsers (OJ, SimdJson, FastJsonParser)

photo credit Tumisu lt: @pixabay

UPDATE: Added FastJsonParser

After some feedback on reddit (thx @f9ae8221b) pointing out a JSON gem I wasn’t aware of, I updated the benchmarks to also support FastJsonParser and to cover symbolize_keys. A co-worker pointed out that symbolize_keys is important for my company’s use cases, and it can cause significant performance issues if you have to do it independently of JSON parsing.
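To make the symbolize_keys point concrete, here is a minimal sketch using only Ruby's StdLib JSON, whose equivalent option is symbolize_names. The payload and the deep_symbolize helper are made up for this illustration, and the timings will vary by machine, so treat it as a way to try the comparison yourself rather than as a benchmark result.

```ruby
require "json"
require "benchmark"

# Hypothetical payload, built only for this illustration.
payload = JSON.generate(
  { "users" => Array.new(1_000) { |i| { "id" => i, "name" => "user#{i}", "tags" => %w[a b c] } } }
)

# Hypothetical helper: a second pass that converts keys to symbols after parsing.
def deep_symbolize(value)
  case value
  when Hash  then value.each_with_object({}) { |(k, v), out| out[k.to_sym] = deep_symbolize(v) }
  when Array then value.map { |item| deep_symbolize(item) }
  else value
  end
end

# Both approaches produce the same structure.
one_pass = JSON.parse(payload, symbolize_names: true)
two_pass = deep_symbolize(JSON.parse(payload))

Benchmark.bm(12) do |x|
  # Keys are symbolized while parsing.
  x.report("one pass:")   { 50.times { JSON.parse(payload, symbolize_names: true) } }
  # Keys are symbolized in a separate walk over the parsed result.
  x.report("two passes:") { 50.times { deep_symbolize(JSON.parse(payload)) } }
end
```

The second variant walks the whole structure again and allocates new hashes, which is the extra work the parser option avoids.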

Performance Benchmarks between OJ, SimdJSON, FastJsonParser, and StdLib

I was recently looking at the performance of some endpoints that process large amounts of JSON, and I wondered if we could do even better than we do in terms of performance for that processing. Across our company we have recently switched most of our apps from the Ruby StdLib JSON to OJ, but I had read about SimdJSON and was curious if we should look further into it as well. In this article I will tell you a bit about each of the Ruby JSON options and why you might want to consider them.

OJ

OJ is a Ruby library for both parsing and generating JSON with a ton of options. I would basically say if you don’t want to think too much but care about JSON performance just set up the OJ gem, and it should be

...



Dan Mayer
22 October 2020

Ruby: Understanding create_or_find_by vs find_or_create_by

photo credit geralt: @pixabay

Performance Benchmarks & Considerations between create_or_find_by & find_or_create_by

I was recently optimizing an endpoint and got to think through some interesting differences between two Active Record methods that help you either find an existing record or create a new one. At first glance, it seems either is fine with some notable differences around their race conditions.

find_or_create_by

The find_or_create_by method has been around longer and is more familiar to many Rubyists. The race condition is called out in the linked docs, excerpt below.

Please note this method is not atomic, it runs first a SELECT, and if there are no results an INSERT is attempted. If there are other threads or processes there is a race condition between both calls and it could be the case that you end up with two similar records.

This led to Rails 6 adding the newer methods…

create_or_find_by

The new create_or_find_by methods have a rarer race condition (on deleted ids), but can prevent a more common insert race condition on duplicates. It is well described in the post Rails 6 adds create_or_find_by, along with some downsides. For example, without a unique DB constraint (ex: add_index :posts, :title, unique: true) it will create duplicates. These issues are also called out in the docs
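The difference in ordering can be sketched without a database. This is a toy model, not Active Record's implementation: TinyTable and DuplicateKey are hypothetical names, the hash stands in for a table, and the raise in insert stands in for a unique index rejecting a duplicate.

```ruby
# Hypothetical error standing in for a unique-constraint violation.
class DuplicateKey < StandardError; end

# Hypothetical in-memory "table" with a unique constraint on title.
class TinyTable
  def initialize
    @rows = {}
    @lock = Mutex.new
  end

  # Roughly: SELECT ... WHERE title = ? LIMIT 1
  def find_by(title)
    @lock.synchronize { @rows[title] }
  end

  # Roughly: INSERT against a unique index; raises when the title exists.
  def insert(title)
    @lock.synchronize do
      raise DuplicateKey, title if @rows.key?(title)
      @rows[title] = { id: @rows.size + 1, title: title }
    end
  end

  # SELECT first, INSERT only when nothing was found.
  # Another writer can slip in between the two calls.
  def find_or_create_by(title)
    find_by(title) || insert(title)
  end

  # INSERT first, fall back to SELECT when the constraint rejects the row.
  def create_or_find_by(title)
    insert(title)
  rescue DuplicateKey
    find_by(title)
  end

  def count
    @lock.synchronize { @rows.size }
  end
end

table = TinyTable.new
created = table.find_or_create_by("hello")
found   = table.create_or_find_by("hello")
puts created.equal?(found)
puts table.count
```

In this model the insert-first version leans entirely on the constraint to detect the duplicate, which is why the unique index matters so much for create_or_find_by.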

...



Dan Mayer
15 July 2020

Ruby: Patching StdLib in Gems

photo credit patches: AnnaER @pixabay

Why Patch Ruby StdLib Code in Gems

Well, the Ruby community does this a lot. It can unlock powerful enhancements, features, observability, and more…

Here are some examples of patching Ruby’s StdLib (standard library). Let’s just look at a few that patch a single piece of Ruby, Net::HTTP. Many libraries want to tap into what is happening around the network.

Sometimes, as opposed to patching upstream Ruby code, one can just put adapters/wrappers around it. While related, that is a much different approach, and you can see how Faraday handles adapting Net::HTTP as an example of it. It is safer, but requires upstream apps to change their code to use the library’s APIs as opposed to modifying existing behavior.

Gems Patch Ruby StdLib, So What?

The problem comes up when multiple gems try to patch the same method. From the examples above, there are multiple ways to attempt to modify the original code, and they don’t always play nicely together.

  • alias, alias_method, and the like
  • prepend, class/module extension ways of extending a method and using super
  • replacing constants, I don’t know the common term for what WebMock does to patch Net::HTTP

If you have multiple gems patching the same upstream Ruby StdLib (or Rails) class or function, you can run into issues. This is a known Ruby ‘Bug’, and it has a known solution: detect how the method is already patched and apply your patch in the same way.
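As a rough illustration of the first two styles in the list, here is a small sketch. AliasTarget, PrependTarget, and RequestTiming are hypothetical names; the classes stand in for something like Net::HTTP and are kept separate so each style can be seen on its own.

```ruby
# Hypothetical class standing in for a StdLib class such as Net::HTTP.
class AliasTarget
  def request(path)
    "GET #{path}"
  end
end

# Style 1: alias_method. Keep a handle on the original method, then redefine it.
class AliasTarget
  alias_method :request_without_logging, :request

  def request(path)
    "[logged] " + request_without_logging(path)
  end
end

# A second hypothetical class, patched the other way.
class PrependTarget
  def request(path)
    "GET #{path}"
  end
end

# Style 2: Module#prepend. The module sits in front of the class in the
# ancestor chain, so super reaches the original method.
module RequestTiming
  def request(path)
    "[timed] " + super
  end
end
PrependTarget.prepend(RequestTiming)

puts AliasTarget.new.request("/a")   # prints "[logged] GET /a"
puts PrependTarget.new.request("/b") # prints "[timed] GET /b"
```

Each style works fine in isolation; the trouble described here only shows up when different gems pick different styles for the same method.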

Example: Errors: stack level too deep

The reason I am writing this up is that I had a bug in Coverband for months (thx bug reporters, I appreciate it) that made no sense to me… I couldn’t reproduce it, I didn’t have any great stack traces, I had no idea what area of code the issue was even in… I couldn’t even investigate the issue. At the time, all I really knew about the bug was the exception: a stack level too deep error.

After months of taking a look once in a while but not understanding the problem… I got a new bug report from @hanslauwers, which added some details. Specifically, the gems AirBrake and Coverband were both patching Resque… but in different ways…

A few days prior to the above report, I saw while working on another project this excellent description of a problem that had been solved in the MiniProfiler project, the readme documents how to resolve Net::HTTP stack level too deep errors… So the new bug report made my spidey sense tingle, and I was finally able to fix it.

How to handle applications differences

I ended up following the same pattern as MiniProfiler, which described the problem and the fix excellently in its readme.

If you start seeing SystemStackError: stack level too deep errors from Net::HTTP after installing Mini Profiler, this means there is another patch for Net::HTTP#request that conflicts with Mini Profiler’s patch in your application. To fix this, change rack-mini-profiler gem line in your Gemfile to the following:

… examples …

This conflict happens when a ruby method is patched twice, once using module prepend, and once using method aliasing. See this ruby issue for details. The fix is to apply all patches the same way. Mini Profiler by default will apply its patch using method aliasing, but you can change that to module prepend by adding require: ['prepend_net_http_patch'] to the gem line as shown above.

The readme explains the issue, has code examples for how apps integrating the gem can resolve it, and links to the original Ruby “Bug”, which explains the issue in detail and discusses approaches to solve the problem.
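To show what “apply all patches the same way” looks like, here is a minimal sketch where two independent patches both use prepend on the same method. Client, ProfilerPatch, and CoveragePatch are hypothetical names for illustration, not code from MiniProfiler or Coverband.

```ruby
# Hypothetical class standing in for the shared upstream code.
class Client
  def request(path)
    "GET #{path}"
  end
end

# First hypothetical gem patch: wraps the result and defers to super.
module ProfilerPatch
  def request(path)
    "profiled(#{super})"
  end
end

# Second hypothetical gem patch, written in the same style.
module CoveragePatch
  def request(path)
    "covered(#{super})"
  end
end

Client.prepend(ProfilerPatch)
Client.prepend(CoveragePatch)

# The most recently prepended module runs first, and each super call
# moves one step down the ancestor chain until it reaches Client#request.
puts Client.ancestors.first(3).inspect      # prints "[CoveragePatch, ProfilerPatch, Client]"
puts Client.new.request("/status")          # prints "covered(profiled(GET /status))"
```

Because both patches use prepend, they simply stack in the ancestor chain and each one reaches the next through super.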

Coverband’s Patching Solution

This is the PR that was merged once I understood the problem, and it shows the approach I took to resolve it. Again, it is heavily patterned off the MiniProfiler solution.

In the end, it is a pretty simple fix, but it took time and various folks participating in the bug report to understand. If you see an open GitHub issue that still seems relevant, add some comments and details. You never know if you will be the trigger that helps folks understand and resolve the issue.

I know patching always gets a bad rap in Ruby, and it can be hard to fully understand and debug, but it is also extremely powerful. It is good to understand the gotchas that can occur, and how to work around those issues, especially if you are shipping shared code that can patch other shared code like Ruby’s StdLib.

...



Dan Mayer
14 January 2020

Rails Flaky Spec Solutions

photo credit flaky wall: pixabay

Introducing Rails Flaky Spec Examples

I have written about how our teams are dealing with flaky Ruby tests in legacy projects. In this post, I want to show how we can teach about common testing pitfalls and how to avoid them.

In this post, I want to introduce a new project Rails Flaky Spec Examples. I created the example flaky Rails spec suite for an internal code clinic at Stitch Fix. We ran it as a 1-hour workshop to teach about common flaky test issues and how to resolve them. I am hoping that over time, I can continue to grow the example base and talk about some of the more complex flaky tests and how we could perhaps more systematically avoid them. As I work with this project over time, I hope to make it into a good learning tool for the community.

Flaky vs Stable Suite

Running the flaky and then stable suite

Why A Rails Suite? One Problem with Flaky Test Posts

While

...



Dan Mayer
20 November 2019

What I learned about Software Development from building a Climbing Wall

Theo liked to imitate as he was learning to walk.

TL;DR: This post is just for fun!

I didn’t really learn about programming. This is just a catchy title, and I wanted to share a big project I have continued to work on that has nothing to do with programming. I thought I could make a kind of funny post by stringing together a bunch of programming ‘wisdom’ that could really be associated with nearly anything.

The Wall's Current State

climbing wall’s current state

Prototype, Iterate, Test, and Iterate More

Climbing Wall: If you aren’t sure how much you will use the wall, or how much effort you want to put into it you can start small. Build and expand over time.

Software: In software, most of the time these days folks take an agile / lean approach and try to deliver working MVPs along the life of the project to deliver continuous value and learning.

Prototype

I started really small, initially with a training board. I installed this above the stairs entering the basement.

climbing board

A training board lets you strengthen your fingers, and do weird pull-ups

Iterate

I then added a few holds directly into the wall

...


