A small refactoring project I took on turned into a crazy journey. Filled with frustration, production bugs, and quick fixes…
started one of those, "it will be an easy little refactoring"… Now I am lost in a sea of yak hair on a ship crewed by yaks… What happened?— Dan Mayer (@danmayer) February 10, 2015
As you can see from the time between these two tweets it also, was burning away on my back burner annoying me for 8 days. I worked on other things for most of that time while it went through some PR review and QA, but it was still a task to check in on each day.
Hooray! Started as an estimated 2 hr refactoring -> 9 hours of coding, 1 1/2 week of QA / PR review -> 138 prod errors -> fix -> it's live!— Dan Mayer (@danmayer) February 18, 2015
There has got to be a better way
TLDR; I did a refactoring, which I believe should have been done. After I completed the release, I realized how I could have done it safer and with less stress all around. Read the breakdown and two solutions to see a better way to handle situations like this.
First let me start by saying I am very aware of responsible refactoring, which is a great thing to keep in mind. I knew when I first started this refactoring it would be a little dangerous, but decided to take it on for a few reasons.
- we had extracted the functionally of the app into a gem
- the gem had made it into usage in ALL other consumer facing applications
- the legacy app code was basically a duplicate
- the legacy app code had fallen out date
- the legacy app code was missing some features
- the refactoring has been suggested since the initial creation of the gem and beyond being out of date the code methods and results were beginning to diverge.
OK, this made me think it was worth the risk and there were risks.
- the legacy app is poorly tested
- the legacy app was notoriously hard to manually test
- the code wasn’t designed into the app well in the first place, it was all over the place, my final change illustrates that well
git diff => Showing 43 changed files with 130 additions and 492 deletions.
- while the original code was out of date and didn’t always match the results of the newer gem, we might have started to “code around” the issues, you know “it’s not a bug, it’s a feature.”
As @tcopeland said, “ha whew. Replacing lots of global-ish methods is tough”
Quick breakdown of the problem
Since every situation is a bit different. I want to try to give a bit better idea of the problem I faced.
#app/controllers/application_controller.rb include HelpfulModule # Hey cool, it is mixed in all our controllers, lets really make this global # we can just include a bunch of helper methods, # and sprinkle usage of this all over the view layer as well! helper_method :helpful_check? helper_method :give_me_some_type_name ...# a surprisingly large set of these helper_method :is_some_special_thing? #app/controllers/api/various_base_controller(s).rb include HelpfulModule #app/controllers/modules/helpful_module.rb module HelpfulModule protected def helpful_check? !!(rand > 5) end def give_me_some_type_name case when (helpful_check?) then true when is_some_special_thing? then false else true end end ... #many more methods def is_some_special_thing? return true end end #many controller and module call sites app/controllers/modules/auth_support.rb app/controllers/home_controller.rb ... #many view layer call sites app/views/cool/index.hml.erb app/views/something/_partial.hml.erb ...
The way I did it
I didn’t realize the code had spread so far and wide through out the app. I could thought I’d simply create the object one place in the request and then switch all the calls to it. Basically the change was something like below.
- I removed the module:
git diff => - include HelpfulModule.new
I removed all the various helper methods
git diff =>
- helper_method :helpful_check? - helper_method :give_me_some_type_name ...# a surprisingly large set of these - helper_method :is_some_special_thing?
I added an accessor
def helpful_thing @helpful_thing ||= Gem::HelpfulModule.new(request, "stuff") end
- I removed the module:
- I removed the module
I then updated all those many many call sites
git diff =>
#app/controllers/home_controller.rb - is_some_special_thing? + helpful_thing.is_some_special_thing?
After thinking, I had made all the changes. I ran the test suite. I had a bunch of errors and failures. I missed various calls. Some places didn’t have access to the initialized object. I fixed both good and bad tests (tests which basically had hard coded expectations), and got everything passing. I looked at my diff and realized I had been all over the code, changing far more files than expected and having to fix more unexpected failures than imagined. I knew this had become riskier than initially imagined. Also the time it took me was far longer to get to reach a completed state than I thought it would. I knew that the tests in the legacy app weren’t covering the change well enough and went though manual testing myself along with related fixes. I then passed it off to the mobile team to help QA since it would effect some mobile APIs. We finally deployed it, and boom exceptions. A quick rollback and fix, deploy… All seems good a few hours later some error reports come in, with another minor issue which didn’t raise any exceptions… Another fix released. Finally, the long 8 day journey of the minor refactoring is over.
The way I should have made the change
I should have broken this into two steps. It would have made the initial estimate far more accurate at a few hours. It would have significantly reduced the risk. It also would have reduced the scope of where I needed to focus testing, both manual and automated to catch any unexpected changes.
- Replace global methods implementation in module wrapper to simply make calls to the new gem. Initially leaving all callers as they are
- After that has been successfully deployed, slowly move callers to direct calls, where it makes sense. Otherwise leave the level of abstraction as a single container for all interactions with the gem.
That change looks something like this
#app/controllers/modules/helpful_module.rb module HelpfulModule protected def helpful_check? helpful_thing.helpful_check? end def give_me_some_type_name helpful_thing.give_me_some_type_name end ... #many more methods def is_some_special_thing? helpful_thing.is_some_special_thing? end private def helpful_thing @helpful_thing ||= Gem::HelpfulModule.new(request, "stuff") end end
Which is obviously a much simpler and less invasive change. It is easier to reason about and test. It is something I could have completed much faster and released with more confidence.
Yes this is actually pretty common refactoring advice. It has been around nearly as long as the concept of refactoring. Make this type of refactoring in two or more steps opposed to doing it all at once. For me the point was that it is easy to forget the challenges of working with a large and legacy production system. The complexity demands additional attention both in terms of adding features and making “smaller” refactorings. A big part of any change to a large complex production system at the heart of a companies systems, should be a roll out plan. Always think of the safest way to try to move forward. Sometimes that is feature flags, A/B testing, limiting to employee users, and sometimes it is breaking up a refactoring into smaller easier and safer steps, like I should have.
blog comments powered by Disqus