December Software Links 31 December 2017

Some of my favorite links from December. If you have any good links I missed, pass them my way.

Continuing my trend of having something from @davetron5000 each month; this month I have two ;)

Software Development

Ruby

DevOps

Tech Management

Random

Links image by wsyperek


On Tech Challenges 18 December 2017

Coding Challenge

The company I am with, Offgrid-Electric, has evolved our hiring process over time to try to ensure we can find a diverse and strong distributed team. There are many interesting processes related to hiring a distributed team, but I will quickly focus on the coding challenge, technical challenge, or whatever folks are calling it these days.

Job Challenge image by FotografieLink

Offer Challenge Options

This post was kicked off by a thread where folks asked for more details on the other challenge options we provide and why, which I will try to break down below.

What is the point?

First off, why have a coding exercise at all, whether in person, take-home, or otherwise? The goal is to allow a candidate to demonstrate skills beyond being great to talk with during interviews: that they can back up the conversation with problem-solving skills that match the team’s expectations. There are far too many posts and stories joking about how few developers can actually code, which I will skip over, but often a team wants a way to help ensure that a candidate has the skills that will make them successful if they join. At OGE we try to allow the widest possible options for proving out that a candidate is technically qualified for the position.

Tech Challenges Options

We offer a few standard options and have added to and evolved our challenges over time.

Take Home Challenge: We have someone on our team develop a challenge for any job description we are hiring for; the challenge is prepared before we even post the job opening. Whoever develops the challenge also completes it themselves, to ensure they fully understand it and have a good estimate of the time required (generally 2-4 hrs). It falls somewhere into the category of building a small application that does X, Y, and Z given this sample data set. Given folks likely have a current job, we give plenty of time to work on the challenge and include instructions for how to submit it and what we expect to see.

Pair Programming: We have had some folks tell us they enjoy pair programming and find it a good way to understand how our current team works and thinks. We offer to pair on the take-home challenge at a scheduled time, to work on some of the code we are currently working on, or to pair on some code the candidate has recently been working on. Some folks are very comfortable with this kind of exercise, while others get very nervous pairing with someone they are just getting to know; some have paired a lot in the past, while for others it is a fairly new experience. Unless your team practices 100% pair programming, I believe you should offer some alternatives to pairing. I like having a take-home challenge that can also be offered as a pairing exercise.

Open Source Work: This is a great option for folks who already have a lot of publicly available code they are proud of that showcases their skills. There is no reason to make someone spend extra hours of their time when you could have them walk you through code they are passionate about and have already been working on. I have enjoyed learning about some candidates’ projects even when it wasn’t a great fit, and the enthusiasm for something they have recently worked on in their own time shows through. It is important to have other options beyond just open source; while this is a great alternative, it can’t be the primary or expected way to check off the technical challenge. You can find more detailed reasoning about this in various diversity hiring posts, but not everyone has the same privilege to spend extra time outside of work on open source, some jobs don’t allow it, and it isn’t everyone’s cup of tea.

Technical Questioning: While I prefer to do something with code like the options listed above, sometimes that isn’t the best fit for whatever reason. In these cases, we have done some technical whiteboarding-style interviews over video chat. In my interview at OGE, I walked through how to split a monolith application into smaller component applications and discussed the kind of DevOps you need to support such transitions. This format has also worked well for QA interviews. I try to avoid trick questions and brain teasers; as an example of good technical questions, data modeling exercises tend to work pretty well over a whiteboard.

Choose your own Adventure

Taking this even further, given that our only goal with this part of the interview process is to come away with some confidence that the person is technically capable in a set of skills… I often ask the candidate, after presenting our standard challenge options listed above, if they have any other way they would like to demonstrate their technical skills. Below are some things candidates have suggested or done before.

Technical Writing / Educational Materials: We have had some folks apply who have created online course materials: video series that teach programming concepts, blog posts detailing how to use some technology, or videos of conference talks covering technically relevant material.

School Projects / Papers: If you are talking to someone recently coming out of university, they might have some great projects they can share with you. If it is a group project, just get into specifics about which pieces they directly contributed to. We recently had a candidate submit their Ph.D. thesis as a way to confirm their technical skill.

Code Reading / Walkthrough: A few candidates didn’t have open source code, but as consultants had learned to quickly look at projects and distill how they worked. In this case, they were also curious to see how our systems worked. After some discussion, we walked through sections of our company’s source code, with the candidate reading the code, explaining what it did, and making recommendations to improve that section of code. While this is a bit more challenging on the interviewer’s side, it can give candidates a great picture of the kind of code and the quality of the project they will be getting involved in. If the person can quickly jump in, absorb parts of the project’s domain, and explain it back to you, it can give you the confidence you need to move forward.

Comfortable Candidates Can Share Their Skills

In the end, it comes down to making sure you set a candidate up to show you their best work. You don’t want to set up a situation that leaves a great candidate so nervous they don’t show what they are capable of. Being a globally distributed team, I don’t expect all of our team to work the same way, so I wouldn’t expect all candidates to succeed in the same style of challenges. You can always find a developer who loves to pair or hates it, or who gets very nervous working out a problem on a whiteboard or could lecture in front of one for hours on end. All of that is fine; let them shine.


November Software Links 02 December 2017

Some of my favorite links from November. If you have any good links I missed, pass them my way.

Software Development


intentionally and proactively providing an extra-normal level of support and manual help to users in order to learn about the barriers people are hitting

service and support driven development

Data

Ruby

DevOps

Tech Management

Random

Links image by wsyperek


Active Record Database Documentation 12 November 2017

Documentation Folders CC image from Pixabay

Active Record Database Documentation

This post will cover how to use Active Record’s support for schema comments to document your tables, columns, and indexes. The feature makes it easy to keep your database documentation up to date, as you can add descriptions at the same time as you add or update tables and columns.

Documentation Embedded with Change Process

Why would we want to use Rails to build our database documentation?

I believe documentation close to code and embedded in the code change process has a better chance of staying up to date and relevant.

I also think we want to add human context to our data using the same tools we use to build our database. We want to do this in source code that is trackable, for the same reason we run database migrations in Rails as opposed to just having a DBA make schema changes outside of our application code change process.

When we embed our database documentation in our standard code change process, we get many advantages:

  • See DB comments change over time because they are part of Git
  • Search in code editor tools (and GitHub)
  • Documentation can be reviewed as part of PRs by a data team, analysts, or other folks who might be the target documentation audience

Having the documentation embedded directly in the database unlocks additional value:

  • The documentation is embedded in most DB explorer tools (SQL workbench, Postico, etc).
  • Single source of truth documentation: it is easy to generate and push to documentation repositories (markdown, HTML, Confluence) from Rails, CI, or any other tool in your workflow (see examples below).

Database Documentation in Postico

Postico OSX Postgres client showing comments as you explore DB structure

Code Samples

Below are some code samples to help you get started with a workflow around database documentation.

Migration

A migration adding comments to a previously existing table. You can add descriptions to call out deprecated fields, gotchas, planned refactorings, or add historical context that may be helpful to the next developer trying to understand what the field means.

class AddContactComments < ActiveRecord::Migration[5.1]
  def change
    msg = 'Contacts table holds individual details about our contacts, it is associated with leads and customers'
    change_table_comment(:contacts, msg)

    change_column_comment(:contacts, :first_name, 'the contacts first name')
    change_column_comment(:contacts, :last_name, 'the contacts last name')
    change_column_comment(:contacts, :house_latitude, 'the house_latitude the contact lives at')
    change_column_comment(:contacts, :house_longitude, 'the house_longitude the contact lives at')
    change_column_comment(:contacts, :house_location_accuracy, 'the accuracy range we captured the GPS with')
    change_column_comment(:contacts, :deleted_at, "the date the contact was 'hidden' from our DB")
    
    # call out gotchas
    # tasks have assignee_id while contacts still use agent_id in the DB, this is a recommended refactoring
    msg = 'the agent_id field is for who the contact is currently assigned to, various places in the code and API it is referenced by assignee_id'
    change_column_comment(:leads, :agent_id, msg)
  end
end
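
You can also add comments at the moment a table is created, so new tables ship with documentation from day one. A minimal sketch, using a made-up table purely for illustration:

class CreateHouseholds < ActiveRecord::Migration[5.1]
  def change
    # table-level comment plus per-column comments, all in one migration
    create_table :households, comment: 'Households we install systems at, a contact can belong to many households' do |t|
      t.string :nickname, comment: 'short name the field team uses for the household'
      t.decimal :latitude, precision: 10, scale: 6, comment: 'GPS latitude captured at registration'
      t.decimal :longitude, precision: 10, scale: 6, comment: 'GPS longitude captured at registration'
      t.timestamps
    end
  end
end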

Ruby DB Docs Access

To generate documentation that could be pushed to a wiki, HTML, Confluence, or elsewhere, you can iterate through a table’s columns and fetch the comments.

> ActiveRecord::Base.connection.table_comment('leads')
=> "Leads table holds individual details about leads, related associations, event timestamps, and joins to contact"	
> Contact.columns_hash['literacy'].comment
=> reading level: {"-1"=>"none", "0"=>"no_read", "1"=>"limited_read", "2"=>"read_fluent", "none"=>"none", "no_read"=>"no_read", "limited_read"=>"limited_read", "read_fluent"=>"read_fluent"}

# iterate through all the columns on a table and output them to your documentation file or API
Contact.columns.each do |c|
  puts c.comment 
end
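
To take the “generate and push to documentation repositories” idea above a bit further, here is a minimal sketch that walks every table and writes the comments out to a markdown file (the file name and format are just an example):

# dump every table and column comment into a simple markdown document
conn = ActiveRecord::Base.connection

File.open('db_docs.md', 'w') do |f|
  conn.tables.sort.each do |table|
    f.puts "## #{table}"
    f.puts conn.table_comment(table) if conn.table_comment(table)
    f.puts
    conn.columns(table).each do |column|
      f.puts "* `#{column.name}` (#{column.sql_type}): #{column.comment}"
    end
    f.puts
  end
end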

SQL DB Docs Access

Obviously you don’t need Rails to get at this information; you can pull it out with raw SQL as well, as covered in this post: Querying table, view, column and function descriptions

-- get all the table comments in your DB
SELECT c.relname As tname, CASE WHEN c.relkind = 'v' THEN 'view' ELSE 'table' END As type, 
    pg_get_userbyid(c.relowner) AS towner, t.spcname AS tspace, 
    n.nspname AS sname,  d.description
   FROM pg_class As c
   LEFT JOIN pg_namespace n ON n.oid = c.relnamespace
   LEFT JOIN pg_tablespace t ON t.oid = c.reltablespace
   LEFT JOIN pg_description As d ON (d.objoid = c.oid AND d.objsubid = 0)
   WHERE c.relkind IN('r', 'v') AND d.description > ''
   ORDER BY n.nspname, c.relname ;

-- get all the column comments for the contacts table
SELECT a.attname As column_name,  d.description
   FROM pg_class As c
    INNER JOIN pg_attribute As a ON c.oid = a.attrelid
   LEFT JOIN pg_namespace n ON n.oid = c.relnamespace
   LEFT JOIN pg_tablespace t ON t.oid = c.reltablespace
   LEFT JOIN pg_description As d ON (d.objoid = c.oid AND d.objsubid = a.attnum)
   WHERE  c.relkind IN('r', 'v') AND  n.nspname = 'public' AND c.relname = 'contacts'
   ORDER BY n.nspname, c.relname, a.attname ;

Databases Across Environments 26 October 2017

DB Syncs CC image from Pixabay

Syncing Databases Across Environments

It seems that as a business grows, at some point it becomes hard to create useful QA, development, and staging data that covers all the cases occurring on production. Eventually, there is a need to replicate certain cases or debug various issues, which is much easier when you can pull the production data to another system to experiment and debug safely. Nearly everywhere I have worked, we eventually needed to clone large portions of the production DB to staging… quickly followed by wanting to pull staging data down to local development. While various privacy and security concerns sometimes must be taken into account, in general we are just talking about replicating a database or tables from one place to another. I will cover some approaches to moving DBs below.

DB Rotation on AWS

If you are in a cloud environment, use the tools the provider offers when you can and avoid building a custom solution. We use AWS, and we run our Postgres DB via the RDS service. We have backup and retention rules and the ability to restore snapshots under different names, and we leverage those snapshots to move DBs from one environment to another. If you are using Google Cloud or Heroku, there are similar options. I am not going to lay out our full code (a rough sketch using the AWS CLI follows the list below), and you can also do it easily via the AWS GUI. Our basic setup when in the cloud is detailed below.

  • Production: has a nightly snapshot taken
  • Staging:
    • about once every other month we restore the staging DB from a production snapshot
    • after the snapshot is restored some scripts run (clean up, sanitize, and truncate large un-needed tables) so it will run better on a smaller DB instance
    • Be careful about test data that managers and your QA team expect; a restore can destroy a lot of their work
    • We handle that by having our QA regression suite detect that the QA team’s data is missing and recreate the example QA data for the team
    • staging has nightly snapshots taken
  • Dev Stacks: deployed development environments
    • these are created dynamically by developers from git branches
    • they restore a new DB from the latest staging snapshot
    • then run migrations and whatnot on the branch to get it to the correct state
    • we don’t take any backups of these stacks, which are typically destroyed after a few days of use
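
To make the snapshot flow a bit more concrete, here is a rough sketch of what the staging restore step could look like as a rake task shelling out to the AWS CLI (the instance and snapshot identifiers are made up, and a real task would also need to handle removing or renaming any previously restored instance):

desc 'restore staging DB from the latest production snapshot (rough sketch)'
task restore_staging_from_prod_snapshot: :environment do
  # find the most recent snapshot of the (hypothetical) production-db instance
  latest = `aws rds describe-db-snapshots --db-instance-identifier production-db --query 'reverse(sort_by(DBSnapshots,&SnapshotCreateTime))[0].DBSnapshotIdentifier' --output text`.strip

  # restore it to a fresh instance; afterwards run the sanitize/truncate
  # scripts and point staging at the new instance
  puts `aws rds restore-db-instance-from-db-snapshot --db-instance-identifier staging-db-restored --db-snapshot-identifier #{latest}`
end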

Full DB to local Development

While the AWS toolchain is great, you can’t restore an AWS snapshot to your local Postgres DB. For that we created a simple script that dumps Postgres and uploads it to S3 in a compressed format. We have options to exclude some extra-large tables that we generally don’t care about for development; you can always pull those later using the single-table method covered further down.

Full Postgres to S3 Dump Rake Task

  def db_to_dump
    ActiveRecord::Base.connection_config[:database]
  end

  def db_host
    ActiveRecord::Base.connection_config[:host]
  end

  def db_port
    ActiveRecord::Base.connection_config[:port]
  end

  def db_user
    ActiveRecord::Base.connection_config[:username]
  end

  def db_pass
    ActiveRecord::Base.connection_config[:password]
  end
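
  # NOTE: the helpers below (pg_pass_file, write_pg_pass, s3_and_bucket) are
  # referenced by the tasks in this post but were not shown in the original
  # excerpt; these definitions are illustrative reconstructions.
  def pg_pass_file
    # pg_dump/pg_restore read passwords from ~/.pgpass by default
    File.expand_path('~/.pgpass')
  end

  def write_pg_pass
    # write the .pgpass file so the password is never passed on the command line
    File.open(pg_pass_file, 'w') do |f|
      f.write("#{db_host}:#{db_port}:#{db_to_dump}:#{db_user}:#{db_pass}")
    end
    `chmod 600 #{pg_pass_file}`
  end

  def s3_and_bucket
    # returns the S3 connection and the bucket the DB dumps are stored in
    s3 = AWS::S3.new(access_key_id: ENV['S3_ACCESS_KEY_ID'],
                     secret_access_key: ENV['S3_SECRET_ACCESS_KEY'])
    [s3, s3.buckets['dev-db-bucket']]
  end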

  ###
  # Run this on the target environment you wish to dump (staging dev stack)
  # bundle exec rake data:dump_full_db
  #
  # The DB files should get uploaded to S3 ready 
  # to be pulled into another environment
  ###
  desc 'dump full DB to S3'
  task dump_full_db: :environment do
    tables_to_exclude = %w[really_big_tables papertrail_versions]
    exclude_lines = tables_to_exclude.map { |table| "--exclude-table-data=#{table}" }.join(' ')
    full_db_file = '/tmp/full_dev_dump_data.sql'

    begin
      s3 = AWS::S3.new(access_key_id: ENV['S3_ACCESS_KEY_ID'],
                     secret_access_key: ENV['S3_SECRET_ACCESS_KEY'])
      bucket = s3.buckets['dev-db-bucket']
    
      # HACK: write a .pgpass file so we never pass the password on the command line
      File.open(pg_pass_file, 'w') { |f|
        f.write("#{db_host}:#{db_port}:#{db_to_dump}:#{db_user}:#{db_pass}")
      }
      `chmod 600 #{pg_pass_file}`
      
      `pg_dump #{exclude_lines} -O -v -x -F c -f #{full_db_file} -h #{db_host} -p #{db_port} -U #{db_user} #{db_to_dump}`

      [full_db_file].each do |file|
        if File.exists?(file)
          puts "uploading #{file}"
          path = Pathname.new(file)
          obj = bucket.objects[path.basename.to_s]
          obj.write(path)
        end
      end
    ensure
      `rm #{pg_pass_file} &> /dev/null`
    end
  end
  
  ###
  # Run this on the environment where you wish to restore
  # the most recent dump.
  #
  # bundle exec rake reload_full_from_s3
  ###
  desc 'reload dev DB from S3'
  task reload_full_from_s3: :environment do
    unless ENV['SKIP_DOWNLOAD']
      s3, bucket = s3_and_bucket
      full_db_file = '/tmp/full_dev_dump_data.sql'

      [full_db_file].each do |file|
        puts "downloading #{file}"
        path = Pathname.new(file)
        obj = bucket.objects[path.basename.to_s]

        File.open(path.to_s, 'wb') do |s3_file|
          obj.read do |chunk|
            s3_file.write(chunk)
          end
        end
      end
    end

    db_to_dump = ActiveRecord::Base.connection_config[:database]
    ActiveRecord::Base.remove_connection

    puts `psql -c "drop database #{db_to_dump};"`
    if $? != 0
      puts 'drop DB failed (you likely need to close console or app)'
      exit 1
    end
    puts `psql -c "create database #{db_to_dump};"`
    puts `pg_restore  --verbose --dbname #{db_to_dump} -F c #{full_db_file}`
  end

Single table to Staging or Development

OK, that makes it easy for a new dev to get a copy of the latest full DB, but what if you just want to pull the most recent payments, customers, or products… A much faster way is to load just a table or two.

Postgres Single table S3 Dump & Load Rake Task

  ###
  # Run on target environment to dump a table
  # TABLE_NAME=phone_block_lists bundle exec rake db:dump_db_table
  ###
  desc 'dump single DB table to S3'
  task dump_db_table: :environment do
    table = ENV['TABLE_NAME'] || 'users'
    table_file = "/tmp/#{table}.sql"

    begin
      s3, bucket = s3_and_bucket
      write_pg_pass
      `pg_dump --no-owner --no-acl -v -F c -h #{db_host} -p #{db_port} -U #{db_user} --table public.#{table} #{db_to_dump} > #{table_file}`

      [table_file].each do |file|
        if File.exists?(file)
          puts "uploading #{file}"
          path = Pathname.new(file)
          obj = bucket.objects[path.basename.to_s]
          obj.write(path)
        end
      end
    ensure
      `rm #{pg_pass_file} &> /dev/null`
    end
  end
  
  ###
  # Run on environment where you want to load the table
  # TABLE_NAME=phone_block_lists bundle exec rake db:reload_table_from_s3
  ###
  desc 'reload dev DB table from S3'
  task reload_table_from_s3: :environment do
    table = ENV['TABLE_NAME'] || 'administrative_areas'
    table_file = "/tmp/#{table}.sql"
    db_to_dump = ActiveRecord::Base.connection_config[:database]

    unless ENV['SKIP_DOWNLOAD']
      s3, bucket = s3_and_bucket

      [table_file].each do |file|
        puts "downloading #{file}"
        path = Pathname.new(file)
        obj = bucket.objects[path.basename.to_s]

        File.open(path.to_s, 'wb') do |s3_file|
          # this is slow; perhaps move to an AWS CLI download
          obj.read do |chunk|
            s3_file.write(chunk)
          end
        end
      end
    end

    puts `psql --dbname #{db_to_dump} -c "truncate table #{table}"`
    puts `pg_restore  --verbose --data-only --dbname #{db_to_dump}  -F c #{table_file}`
  end

Database Helper Scripts

These are small examples extracted from our database helper rake task; we have a number of other little helpers as well. It can be useful to build up small, simple tools to help your team move data between environments, and it can really simplify the process of bringing new developers onto the team. I hope the examples above are helpful to some folks.
