Putting Your Best Code Forward

I like sharing things I’m working on. When I’ve searched for a solution, and not found one, I hope that sharing what I did find will help some other poor schmuck like me in the future. In fact, more than once I’ve done a search for a problem, and found a solution – in one of my own blog posts from years back.

I’ve been working on a Rails Security talk, and needed a way to figure out just how many lines of code are present in a default Rails application. Meaning – not just the lines of code for controllers, views, etc, but also the number of lines of code for all of the default gems. So I put together the following code:

GFILE = "Gemfile.lock"
 
list = File.read(GFILE)
 
gems = []
 
list.each_line do |line|
 line = line.strip
 break if line.include?("PLATFORMS")
 next if line.include?("GEM")
 next if line.include?("remote:")
 next if line.include?("specs:")
 next if line.empty?
 gems.push(line.split(' ').first)
end
 
total = 0
gems.uniq!
gems.each do |gem|
 puts "Processing #{gem}"
 contents = `gem contents #{gem}`.split
 local = 0
 contents.each do |file|
  output = `wc -l #{file}`
  amount = output.strip.split(' ').first.to_i
  local += amount
 end
 puts " LOC: #{local}"
 total += local
end
 
puts "Total Lines: #{total}"

It did what I needed, and I could have just thrown it away. But I figured someone else might need something like this, so I posted in publically. The problem? Let’s be frank – that’s some awful code. `GFILE`? `local`? When we post code, we should put our best code forward – examples of how to do things right. I didn’t post it as a draft – I posted it as a solution that worked. My good friend [J.B. Rainsberger][2] pointed out in a comment that I could do better. And he’s right! So let’s clean this up:

require 'set'

GEM_FILE_TO_PROCESS = "Gemfile.lock"

The first thing was to change that horrible name to something understandable. I also included the set library, because I really wanted a `set`, not an `array` that I needed to call `uniq!` on. Next, I extracted the line checks to separate methods, based on what I interpreted them to do when I came across them:

def gem_list_finished?(line)
 line.include?("PLATFORMS")
end
 
def non_gem_line?(line)
 line.include?("GEM") ||
 line.include?("remote:") ||
 line.include?("specs:") ||
 line.empty?
end

I then extracted out the line counting to their own methods:

def gem_name_without_version(line)
 line.split(' ').first
end
 
def get_files_in_gem(gem)
 `gem contents #{gem}`.split
end
 
def line_count_for_file(file)
 output = `wc -l #{file}`
 output.strip.split(' ').first.to_i
end

This has the handy side effect of removing the system calls (code called in backticks) away from the main logic. I also pulled out the `puts` statements into a log function:

def log(message)
 puts message
end

Now comes the meat of it. We still read in the file, but also initialize a set to store the gems, and put our other critical variable all in one place:

gemfile_list = File.read(GEM_FILE_TO_PROCESS)
gems_to_process = Set.new
total_line_count = 0

Next we process the Gemfile. Note that it’s clearer now that we have conditions we stop processing, and conditions where we skip processing:

gemfile_list.each_line do |gem_line|
 gem_line = gem_line.strip
 break if gem_list_finished?(gem_line)
 next if non_gem_line?(gem_line)
 gems_to_process.add(gem_name_without_version(gem_line))
end

Finally, we walk the gems we found and get their contents. I could probably use inject here instead of initializing a variable, but I prefer it to be a little clearer:

gems_to_process.each do |gem|
 contents = get_files_in_gem(gem)
 gem_line_count = 0
 contents.each do |file|
  gem_line_count += line_count_for_file(file)
 end
 total_line_count += gem_line_count
end

All in all, it’s a much better representation of something I would want to share. Could it get better? Absolutely. I could create a model around the Gemfile, asking the lines questions instead of checking them explicitly. And I probably will – if I ever need to turn this into something more explicit. The full code is below, and at [the gist][1]:

require 'set'
 
GEM_FILE_TO_PROCESS = "Gemfile.lock"
 
def gem_list_finished?(line)
 line.include?("PLATFORMS")
end
 
def non_gem_line?(line)
 line.include?("GEM") ||
 line.include?("remote:") ||
 line.include?("specs:") ||
 line.empty?
end
 
def gem_name_without_version(line)
 line.split(' ').first
end
 
def get_files_in_gem(gem)
 #gem contents returns the files
 #as a line break delimited string
 `gem contents #{gem}`.split
end
 
def line_count_for_file(file)
 output = `wc -l #{file}`
 #line count is the first column from
 #the returned value
 output.strip.split(' ').first.to_i
end
 
def log(message)
 puts message
end
 
gemfile_list = File.read(GEM_FILE_TO_PROCESS)
gems_to_process = Set.new
total_line_count = 0
 
gemfile_list.each_line do |gem_line|
 gem_line = gem_line.strip
 break if gem_list_finished?(gem_line)
 next if non_gem_line?(gem_line)
 gems_to_process.add(gem_name_without_version(gem_line))
end
 
log "TOTAL GTP: #{gems_to_process.count}"
 
gems_to_process.each do |gem|
 log "Processing #{gem}"
 contents = get_files_in_gem(gem)
 gem_line_count = 0
 contents.each do |file|
   gem_line_count += line_count_for_file(file)
 end
 log " LOC: #{gem_line_count}"
 total_line_count += gem_line_count
end
 
log "Total Lines: #{total_line_count}"

So what are your thoughts? Could it be improved even more?

Comments on this entry are closed.

  • Ronald Jeffries March 9, 2014, 2:24 pm

    Nice, and well handled.

    Now I wonder what would happen if you coded it by top down by intention … :)

  • J. B. Rainsberger March 9, 2014, 3:38 pm

    I did the following: https://gist.github.com/jbrains/9451941

    I did it in 28 commits, so if you want the play-by-play, then you look at the code revision-by-revision.

    I didn’t bother with the smell of depending on puts and “, because I didn’t feel like it. :)

    I just wish extracting the relevant section of Gemfile.lock were cleaner. Neither sed nor awk nor head nor tail nor cat seemed to do it any better than the procedural Ruby that you wrote initially. I don’t know if one can tell sed to do -n ‘/regex1/,/regex2/p’ while excluding the lines that match those regexes. That’s what I want.

    Thanks.