Ruby


Update: I’m ditching Sake for Thor. These tasks have been ported to Thor and are available on GitHub.

I’ve been using Git as my SCM of choice for Subversion projects for the last several months and have found that, while I don’t want to use Subversion anymore, there are some things it makes easier than Git. For example, let’s say you’re working on something and you want to pull in the changes from other people on your team. With svn it’s simply:

$ svn up

With git things are different, since it only merges changesets and not locally changed files. This was a pain before git-stash came along, since I’d have to back out a change, update, and then reapply it. Even with git-stash things are a bit more painful. Here’s the equivalent to the above for a git-svn project:

$ git stash
$ git svn rebase
$ git stash apply

Oh, and that’s only if you’re on the master branch. If you’re on another one (and you should be), then here’s what it looks like if you want to keep master up to date too:

$ git stash
$ git checkout master
$ git svn rebase
$ git checkout mybranch
$ git rebase master
$ git stash apply

Whew! Note that this mostly applies to git-svn projects. For regular git projects a git-pull will do nicely.

I got sick of this, and I noticed that the Rubinius project uses a Rakefile to handle a fair number of the git commands, including updating and pushing. Here’s a Sake script that gives you two tasks: git:update and git:push which automatically check whether the project is a git-svn project and do the right thing. Install it like so:

$ sake -i http://pastie.caboo.se/147964.txt

And now we’re back to a one-liner:

$ sake git:update
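The decision such a task has to make can be sketched in plain Ruby. This is not the actual Sake script: the helper names `git_svn_project?` and `update_commands` are made up for illustration, and detecting git-svn by the presence of a `.git/svn` directory is an assumption.

```ruby
# Hypothetical sketch, not the real git:update task.

# Assumption: a git-svn checkout keeps its metadata under .git/svn.
def git_svn_project?(root = Dir.pwd)
  File.directory?(File.join(root, ".git", "svn"))
end

# Returns the shell commands needed to update, given the current branch
# and whether the project is tracked through git-svn.
def update_commands(branch, git_svn)
  return ["git pull"] unless git_svn

  if branch == "master"
    ["git stash", "git svn rebase", "git stash apply"]
  else
    ["git stash",
     "git checkout master",
     "git svn rebase",
     "git checkout #{branch}",
     "git rebase master",
     "git stash apply"]
  end
end
```

A task built on this would run each command in turn and stop at the first failure, for example with `system(cmd) or abort`.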

Update: I just added git:open and git:close which you should think of as opening and closing issues. They just create and delete branches and can be used like this:

$ sake git:open
* Name your branch: ofx
* Switching to master
Switched to branch "master"
Switched to a new branch "ofx"
$ sake git:close
* Switching to master
* Deleting branch ofx

And don’t worry, git:close is safe and won’t destroy your work if you haven’t merged it yet:

$ sake git:close
* Switching to master
* Deleting branch ofx
* Branch ofx isn't a strict subset of master, quitting

Update: I gave this its own repo on github, so go forth, and git.

I started reading Giles Bowkett’s blog a little while ago. He has a lot of rants, and it’s fun. I would have commented on his blog but it seems to be restricted. In a recent article he criticized someone for their custom dynamic finder, attacking them on a number of fronts: SQL performance, architecture design, and readability. But he forgot to check his code in irb, claiming that:

#merge() is a Hash method. It adds new keys to an existing hash based on another existing hash. Unfortunately, it does so destructively. If you have two values defined for the :conditions key in two distinct hashes and you merge them, only one :conditions value will survive the harrowing ordeal. #merge() is Darwinian - it’s survival of the fittest in there.

Amusing, but wrong:

    >> h = {:conditions => "true"}
    => {:conditions=>"true"}
    >> g = h.merge(:conditions => "false", 'order' => 'email desc')
    => {:conditions=>"false", "order"=>"email desc"}
    >> h
    => {:conditions=>"true"}
    >> g
    => {:conditions=>"false", "order"=>"email desc"}
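The same check works as a script, with the destructive counterpart alongside for contrast. The variable names are only for illustration.

```ruby
h = { :conditions => "true" }

# merge returns a new hash; the receiver is left alone.
g = h.merge(:conditions => "false", "order" => "email desc")
h_after_merge = h.dup   # snapshot: still only {:conditions => "true"}

# merge! (alias: update) writes the new values into the receiver itself.
h.merge!(:conditions => "false")

puts h_after_merge[:conditions]   # true
puts g[:conditions]               # false
puts h[:conditions]               # false
```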

You were thinking of update and merge!. Also, he quotes Assaf as saying:

`*opts` is a bug, lacking an understanding of how Ruby handles arguments. `*opts` refers to all other arguments you pass, the vararg equivalent. But when you call `find_by_email("…", :limit=>5, 'order'=>"name")` the last two “arguments” (limit and order) are actually keys in a hash that’s passed as a single argument. So anything you pass to find_by_email, the original, that falls inside *opts, would be passed to find after the last argument it expects (the hash of options), which should, if implemented correctly, cause find to fail with an error.

That’s correct up to the last sentence. Let’s look at the code in question:

    class User < ActiveRecord::Base
      def self.find_by_email(email, *opts)
        with_scope(:find => { :conditions => ['lower(email) = lower(?)', email] }) do
          find(:first, *opts)
        end
      end
    end

This will work. There’s nothing wrong with using the splat operator here other than the fact that it’s going to make this method slower than it ought to be. Here’s a better implementation:

    class User < ActiveRecord::Base
      def self.find_by_email(email, options={})
        with_scope(:find => { :conditions => ['lower(email) = lower(?)', email] }) do
          find(:first, options)
        end
      end
    end
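To see why the splat version works, here is a sketch with no ActiveRecord involved. `sketch_find` is a hypothetical stand-in that, like find, takes a mode plus one optional options hash; the two wrapper names are made up as well.

```ruby
# Hypothetical stand-in for find: a mode plus a single optional options hash.
def sketch_find(mode, options = {})
  [mode, options]
end

# Splat version: trailing key/value pairs arrive as ONE hash inside opts.
def find_with_splat(*opts)
  sketch_find(:first, *opts)
end

# Explicit version: the options hash is named and defaulted.
def find_with_options(options = {})
  sketch_find(:first, options)
end

same = find_with_splat(:limit => 5, "order" => "name") ==
       find_with_options(:limit => 5, "order" => "name")
puts same   # true
```

The splat only goes wrong when a caller passes more than one extra positional argument, in which case the stand-in raises ArgumentError because it receives more arguments than it accepts.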

C’mon Giles, you’re better than this.

Figuring out what’s wrong in Ruby can be a pain. That dynamic typing that you find so nice while writing code can sometimes work against you when reading it — and troubleshooting it. What most of us end up resorting to is basically one step up from how most JS debugging happens: puts.

While this is annoying, it doesn’t have to be this way. Ruby comes with a debugger, but it is slow on non-trivial applications. The best alternative is a gem called Ruby Debug, which is fast and has a bunch of goodies. I come from the GUI-debugging world of VS.NET and IDEA, so getting into Ruby Debug was a little bit of a challenge for me at first. If you come from the gdb world you should feel right at home.

Watch the Ruby Debug Basics screencast

From the screencast: fib.rb

    def fib(n)
      @fib_cache ||= [1, 1]
      @fib_cache[n] ||= fib(n-1) + fib(n-2)
    end

    puts fib(20)

Next screencast will cover debugging a Rails application.

Ruby is a great language. Really. How many other languages let you extend other classes? Modify their behavior? Overload operators? With great power comes great opportunity, and in this post I’ll cover one of my recent extensions to Ruby that was recently accepted into Rails’ core.

Date and Time

Managing dates and times is not exactly fun in any language. There are time zones, leap years, two meridians, twenty-four hours, 365 days (sometimes), and a mess of inconsistently long months. Those of us who use Ruby frequently are familiar with how to get objects corresponding to the current time and the current day:

    >> Time.now
    => Thu Jan 25 20:12:50 -0800 2007
    >> Date.today
    => #<Date: 4908251/2,0,2299161T>

Sexy methods…

Rails has these really cool extensions to Numeric that allow computing lengths of time in a more human-readable way. For example, if I wanted to figure out how long 12 hours is, I could just call 12.hours and get back the number of seconds in 12 hours, which could then be conveniently added to an instance of Time, like so:

    >> t = Time.now
    => Thu Jan 25 19:31:17 -0800 2007
    >> t + 12.hours
    => Fri Jan 26 07:31:17 -0800 2007

Sweet! What’s even cooler though is that, in addition to extending Numeric with seconds, minutes, hours, days, weeks, fortnights, months, and years, we also get from_now, ago, since, and until:

    >> 1.day.from_now
    => Fri Jan 26 19:34:20 -0800 2007
    >> 2.weeks.ago
    => Thu Jan 11 19:34:35 -0800 2007

but kinda dumb.

Nice! These are so easy to read. These were great for simple things, but since they returned numbers, they couldn’t take into account the current date and how many days were in a month, leap year, etc. Notice that this causes problems when the month doesn’t have 30 days in it:

    >> Time.now
    => Thu Jan 25 21:01:31 -0800 2007
    >> 1.month.from_now
    => Sat Feb 24 21:01:34 -0800 2007

So it’s good for approximations, but not much else. Well what now?
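The arithmetic behind that result can be reproduced in plain Ruby, without Rails. This sketch treats a month as a flat 30 days of seconds, which is what the output above implies, and compares it with the standard library’s calendar-aware Date#>>. The constant and variable names are only for illustration.

```ruby
require "date"

SECONDS_PER_DAY = 24 * 60 * 60

start = Time.utc(2007, 1, 25, 21, 1, 31)

# A month as a plain number: a flat 30 days of seconds.
flat = start + 30 * SECONDS_PER_DAY

# Calendar-aware alternative: Date#>> moves forward by whole months
# and keeps the day of the month where it can.
calendar = Date.new(2007, 1, 25) >> 1

puts flat.strftime("%a %b %d")       # Sat Feb 24
puts calendar.strftime("%a %b %d")   # Sun Feb 25
```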

A smart method

To compensate for the lack of accuracy in the above sexy methods, Time#advance was added to make it easy to do smart addition to Time instances. Here’s some of the above using advance:

    >> t = Time.now
    => Thu Jan 25 21:11:32 -0800 2007
    >> t.advance(:hours => 12)
    => Thu Jan 25 21:11:32 -0800 2007
    >> t.advance(:months => 1)
    => Sun Feb 25 21:11:32 -0800 2007

Excellent! Not quite as sexy, but definitely smart. A method suitable for Serious Business.

Smart n’ Sexy

I welcomed Time#advance, like everyone else, but I didn’t want to give up the sexy methods. Fortunately, Ruby is great for letting objects masquerade as other objects (the whole duck typing thing), and the de facto way to do this in Rails is by using Builder::BlankSlate, whose instances undefine almost all of their methods, allowing you to easily proxy all or some methods to something else. The two best examples of its use in Rails are AssociationProxy, which is used by Active Record, and JavaScriptGenerator, which is used by RJS.

To solve the problem, all we need to do is make the duration methods on Numeric return a proxy for the number they used to return, one which acts accordingly around Time and Date objects. This new class is ActiveSupport::Duration. It simply accumulates lengths of time for when it is used around a Time or Date, and a number for use around things that expect it to act like a number (i.e. for backward compatibility). Here are the new, smart and sexy, methods on Numeric:

    >> t = Time.now
    => Thu Jan 25 21:21:44 -0800 2007
    >> t + 1.month
    => Sun Feb 25 21:21:44 -0800 2007
    >> t + 1.week
    => Thu Feb 01 21:21:44 -0800 2007
    >> 1.year.from_now
    => Fri Jan 25 21:22:00 -0800 2008
    >> 4.years.from_now
    => Tue Jan 25 21:22:12 -0800 2011
    >> 3.weeks
    => 21 days

How’s that last one for overriding inspect? To try it out all you have to do is freeze edge in a Rails project near you!

$ rake rails:freeze:edge
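The idea of accumulating lengths of time instead of collapsing them into seconds can be sketched in plain Ruby. `SketchDuration` is a hypothetical class written for this illustration, not the ActiveSupport::Duration source. It only handles Date, and its 30-day month and 365-day year in `to_i` are assumptions made to mimic the old numeric behavior.

```ruby
require "date"

# Hypothetical stand-in for the idea behind a duration proxy.
class SketchDuration
  APPROX_DAYS = { :years => 365, :months => 30, :weeks => 7, :days => 1 }

  attr_reader :parts

  def initialize(parts)
    @parts = parts
  end

  # Adding durations keeps the units separate instead of flattening them.
  def +(other)
    SketchDuration.new(parts.merge(other.parts) { |_unit, a, b| a + b })
  end

  # Apply each part to a Date using calendar arithmetic.
  def since(date)
    parts.inject(date) do |result, (unit, count)|
      case unit
      when :years  then result >> (12 * count)
      when :months then result >> count
      when :weeks  then result + 7 * count
      when :days   then result + count
      else raise ArgumentError, "unknown unit #{unit.inspect}"
      end
    end
  end

  # Approximate seconds, for callers that still expect a plain number.
  def to_i
    parts.sum { |unit, count| APPROX_DAYS.fetch(unit) * count * 86_400 }
  end
end

one_month = SketchDuration.new(:months => 1)
puts one_month.since(Date.new(2007, 1, 25))   # 2007-02-25
puts one_month.to_i                           # 2592000
```

Keeping the parts as a hash is what lets `since` decide how long a month is only once it knows the starting date.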

Update: I should add that this recent change also corrected handling around Date objects. I implied that above, but I should be explicit. Before Duration, adding Date.today and 1.day didn’t yield the expected results at all:

    >> (Date.today + 1.day).to_time
    ArgumentError: time out of range

Now it behaves as expected:

    >> (Date.today + 1.day).to_time
    => Sat Jan 27 00:00:00 -0800 2007

For anyone as annoyed as I am about 1.month.from_now being inaccurate, lean on your favorite Rails team member to get this patch accepted.

When coding there are many guidelines you might opt to follow, such as Convention Over Configuration, the Principle of Least Surprise, and others, all intended to prevent you from falling into certain pitfalls. One such pitfall is one that I myself often fall victim to, and involves spreading the knowledge of the internal workings of one component to several others. That is, if A, B, and C all know how D works, then there’s really little point in grouping functionality inside D in the first place. Avoiding this pitfall is what the Law of Demeter tells us to do. To quote the Wikipedia article:

When applied to object-oriented programs, the Law of Demeter can be more precisely called the “Law of Demeter for Functions/Methods” (LoD-F). In this case, an object A can request a service (call a method) of an object instance B, but object A cannot “reach through” object B to access yet another object to request its services. Doing so would mean that object A implicitly requires greater knowledge of object B’s internal structure. Instead, B’s class should be modified if necessary so that object A can simply make the request directly of object B, and then let object B propagate the request to any relevant subcomponents. If the law is followed, only object B knows its internal structure.

Unfortunately this is done all the time in Rails, partly because they make it so darn easy to access associations and their associations:

    shipment.user.profile.mailing_address

Looks innocent enough. But what happens when we sprinkle this around our codebase in a variety of forms, and then without warning the requirements change and all of a sudden the mailing address is attached to the user rather than the profile? We have to go through all the code that might look for the mailing address and update it. You’d better pray you have near 100% test coverage.

The unpredictable and despotic need for change that creeps into every project is the reason you should care about the Law of Demeter, also called the Principle of Least Knowledge. But what do we do about it? The Wikipedia excerpt above makes it pretty clear that we should define mailing_address on Shipment and User:

    class Shipment < ActiveRecord::Base
      # …
      def mailing_address
        user.mailing_address
      end
      # …
    end

    class User < ActiveRecord::Base
      # …
      def mailing_address
        profile.mailing_address
      end
      # …
    end

Okay, fine. Not the prettiest, but it does help us with refactoring. You may have noticed a problem with both the old code and the new version: what if one of the associations is nil? We’ll get a big fat NoMethodError of course! Let’s fix that:

    class Shipment < ActiveRecord::Base
      # …
      def mailing_address
        user ? user.mailing_address : nil
      end
      # …
    end

    class User < ActiveRecord::Base
      # …
      def mailing_address
        profile ? profile.mailing_address : nil
      end
      # …
    end

Getting kinda ugly, but it works better now. Both refactoring and nil problems are taken care of. Now that we’ve got that, we can rip it out. A while back Rails got a method called delegate that’ll let us do just this type of thing. On its own delegate only provides the refactoring safety; in Rails versions that support the option, passing :allow_nil => true adds the nil safety by returning nil when the target is missing. Using this method we can change our code to this:

    class Shipment < ActiveRecord::Base
      # …
      delegate :mailing_address, :to => :user, :allow_nil => true
      # …
    end

    class User < ActiveRecord::Base
      # …
      delegate :mailing_address, :to => :profile, :allow_nil => true
      # …
    end

Isn’t that cool? Now it’s nice and semantic, safe, and refactorable!
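For the curious, a delegate-style macro is not much code. The following is a sketch in plain Ruby and not Rails’ implementation: `SketchDelegation`, `sketch_delegate`, and its `allow_nil` keyword are names invented here, and the plain classes stand in for the Active Record models.

```ruby
# Hypothetical delegation macro; Rails' own delegate is more thorough.
module SketchDelegation
  def sketch_delegate(*names, to:, allow_nil: false)
    names.each do |name|
      define_method(name) do |*args, &block|
        target = send(to)
        if target
          target.public_send(name, *args, &block)
        elsif !allow_nil
          raise NoMethodError, "#{name} delegated to #{to}, but #{to} is nil"
        end
        # With allow_nil and a missing target, the method returns nil.
      end
    end
  end
end

Profile = Struct.new(:mailing_address)

class User
  extend SketchDelegation
  attr_reader :profile
  sketch_delegate :mailing_address, to: :profile, allow_nil: true

  def initialize(profile = nil)
    @profile = profile
  end
end

class Shipment
  extend SketchDelegation
  attr_reader :user
  sketch_delegate :mailing_address, to: :user, allow_nil: true

  def initialize(user = nil)
    @user = user
  end
end

shipment = Shipment.new(User.new(Profile.new("164 South Park St.")))
puts shipment.mailing_address   # 164 South Park St.
```

Callers only ever ask the shipment, so moving the address from Profile to User later means changing one line in User.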

We’re using FK constraints at Attendio. As you all know, this means we must be careful about the order we load fixtures in our tests when calling fixtures. But what if you can’t control the order of the fixture load as when using the FixtureSets or FixtureScenarios plugins? Or maybe you simply don’t want to have to worry about the order?

One solution is to disable FK constraints before loading the fixtures, then enable them afterward. This is liable to be a vendor-specific solution, however, and it just feels bad. Another solution is to determine the correct order and ensure that the fixtures you want to load are loaded according to that order (which would require a modification to the plugins, but at least they could then work without the user needing to specify an order). Let’s say you have a few models:

    class User < ActiveRecord::Base
      has_many :memberships
      has_many :groups, :through => :memberships
      has_many :taggings, :as => :taggable
    end

    class Group < ActiveRecord::Base
      has_many :memberships
      has_many :users, :through => :memberships
      has_many :taggings, :as => :taggable
    end

    class Membership < ActiveRecord::Base
      belongs_to :user
      belongs_to :group
    end

    class Tag < ActiveRecord::Base
      has_many :taggings
    end

    class Tagging < ActiveRecord::Base
      belongs_to :tag
      belongs_to :taggable, :polymorphic => true
    end

One correct order for loading the fixtures is users, groups, memberships, tags, taggings. It isn’t the only one: users and groups can swap places, and tags and taggings can be slotted in anywhere as long as tags comes first, which gives twenty valid orders in all. All we really have to do is look at the non-polymorphic belongs_to associations, and we’re set. Here’s some code that does just that.

Update: the code is cleaner now, and probably faster since it doesn’t build up an entire dependency tree.

    def load_order
      # Load every model so its associations can be inspected.
      models = Dir[RAILS_ROOT + '/app/models/**/*.rb']
      models.each { |file| require(file) }

      klasses = models.map { |file| File.basename(file, '.rb').camelize.constantize }
      klasses = klasses.select { |klass| klass.ancestors.include?(ActiveRecord::Base) }

      # Map each model to the models it points at via non-polymorphic belongs_to.
      deps = klasses.inject({}) do |h, klass|
        h[klass] = klass.reflect_on_all_associations.select { |a| a.macro == :belongs_to && !a.options[:polymorphic] }
        h[klass] = h[klass].map { |a| a.class_name.constantize }
        h
      end

      # Repeatedly peel off the models whose dependencies are all loaded.
      order = []
      until deps.empty?
        free = deps.keys.select { |k| deps[k].empty? }
        # Without this guard a circular or self-referential belongs_to loops forever.
        raise "circular belongs_to dependency among #{deps.keys.inspect}" if free.empty?
        deps = deps.reject { |k, _| free.include?(k) }
        deps.each { |k, v| deps[k] = v - free }
        order += free
      end
      order.map(&:table_name)
    end

The code is rough and not yet integrated in a useful fashion, but it gives you an idea and will let you integrate it into your fixture solution as I have. It won’t work with HABTM relationships since there’s no model for the join table. Other than that it seems decent, if a bit on the ugly side. If anyone wants to clean it up I’d be obliged. Oh and by the way it’s under the Creative Commons Share-Alike license.
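The ordering loop can be tried without Rails by feeding it table names directly. This is a sketch under that assumption: the `dependencies` hash is written out by hand from the example models, and `load_order_for` and `valid_order?` are names made up for the illustration.

```ruby
# Table => tables it points at through non-polymorphic belongs_to.
dependencies = {
  "users"       => [],
  "groups"      => [],
  "memberships" => ["users", "groups"],
  "tags"        => [],
  "taggings"    => ["tags"]
}

# Repeatedly emit every table whose dependencies have all been emitted.
def load_order_for(deps)
  remaining = deps.dup
  order = []
  until remaining.empty?
    free = remaining.keys.select { |table| (remaining[table] - order).empty? }
    raise ArgumentError, "circular dependency among #{remaining.keys.inspect}" if free.empty?
    order.concat(free)
    free.each { |table| remaining.delete(table) }
  end
  order
end

# True when every table appears after all the tables it depends on.
def valid_order?(order, deps)
  order.each_with_index.all? do |table, index|
    deps[table].all? { |parent| order.index(parent) < index }
  end
end

order = load_order_for(dependencies)
puts order.join(", ")   # users, groups, tags, memberships, taggings
```

The result differs from the order given in the text, which is fine: any order that passes `valid_order?` will satisfy the foreign keys.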

“Putting Rails to Work” - November Ruby Meetup

Begins: Tue, 14 Nov 2006 at 6:00 PM
Ends: Tue, 14 Nov 2006 at 7:00 PM
Entry fee: Free
Location: Odeo
164 South Park St.
San Francisco, CA 94107
Although Bosco will be away, Josh Susser is organizing this baby and he has something up his sleeve. It’s going to be real good. Stay tuned…
Tags: ruby rails

You may have noticed code that looks like this:

    array.each(&:notify!)

What, you may ask yourself, does this do? Firstly, each, map/collect, select, reject, inject, etc. are methods that classes like Array either define themselves or get by including Enumerable. They are the basis for dealing with Arrays in Ruby, and each is the one on which all the others are built: Enumerable implements its methods in terms of the including class’s each (Array overrides some of them for speed, but they behave the same way).

All these methods expect to pass each element of the Array in turn to a block (anonymous function), which normally looks something like this:

    array.each {|item| puts item.inspect}

However, there are times when you want to use the same block many times, in many different places. In such cases, you don’t really want your block to be repeated over and over again, so you encapsulate it in a Proc instance:

    my_block = Proc.new {|item| puts item.inspect}
    array1.each(&my_block)
    array2.each(&my_block)

Notice that you use the argument-passing syntax of parentheses rather than block syntax when calling each, and that both calls prefix the Proc with an ampersand. This is the syntax Ruby uses to indicate that this argument (which must be the last) should be treated as the block for this method call.

In the above examples, my_block is a Proc instance, which is exactly what Ruby expects. So far, so good. However, what if you don’t pass it a Proc?

    array.each(&"some string")

Aside from human confusion over exactly what is intended here, what will Ruby do? It will see that the object you’re saying to use as the block is not a Proc instance, and will try to convert it to one by calling the object’s to_proc method, if it exists.
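Any object with a to_proc method can stand in for the block, which is easy to check in plain Ruby. `Doubler` is a made-up class for this sketch. The second half shows what happens with an object, such as a String, that has no to_proc.

```ruby
# Hypothetical class whose only job is to be convertible to a block.
class Doubler
  def to_proc
    Proc.new { |n| n * 2 }
  end
end

# Ruby calls Doubler#to_proc and uses the result as the block.
doubled = [1, 2, 3].map(&Doubler.new)
puts doubled.inspect   # [2, 4, 6]

# String has no to_proc, so Ruby refuses to use it as a block.
error = begin
  [1, 2, 3].each(&"some string")
  nil
rescue TypeError => e
  e
end
puts error.class   # TypeError
```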

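In the core language Proc instances have this method (they return self), as do Method objects. Rails adds one to Symbol (Ruby itself has shipped Symbol#to_proc since 1.8.7, so current versions no longer need the extension):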

    # in active_support/core_ext/symbol.rb

    class Symbol
      # Turns the symbol into a simple proc, which is especially useful for enumerations. Examples:
      #
      #   # The same as people.collect { |p| p.name }
      #   people.collect(&:name)
      #
      #   # The same as people.select { |p| p.manager? }.collect { |p| p.salary }
      #   people.select(&:manager?).collect(&:salary)
      def to_proc
        Proc.new { |obj, *args| obj.send(self, *args) }
      end
    end

This to_proc method returns a Proc instance whose contained block takes a target object and any number of trailing arguments. So here’s what we get:

    # these two are equivalent
    :name.to_proc
    Proc.new { |obj, *args| obj.name(*args) }

So we could do this:

    block = :name.to_proc
    array.map(&block)

However, as I said earlier, Ruby will call to_proc for us, so we can shorten it to this:

    array.map(&:name)

Remember that each, map, select, reject, etc. all only pass one argument: the item at the current point in the iteration. This means that in the block above args will be an empty array, so the block can usually be thought of as this:

    Proc.new { |obj| obj.send(self) }

And remember that self here is the symbol itself.

Grams and Grandpa Black gave me a book for my birthday called “The Automatic Millionaire” by David Bach. I read the introduction, and so far I’ve been amused at the style of his writing, which is a bit like a traveling medicine proprietor from the early 1900s, and by the obviousness of what he’s saying.

He claims that budgets don’t work. Discipline doesn’t work. Paying bills first and saving some of what’s left doesn’t work. Buying on credit doesn’t work (except in real estate). I agree with him, but the funny thing is that this is not really new to me - my other grandparents (the Donovans) have been telling me and my sister this for years, particularly that last point.

The most important thing about all of this, and the reason why discipline doesn’t matter, is that your finances should be automatic. This is again something the Donovans told me many times. I have a number of automatic financial transactions set up, but not everything is automatic, and some of the exceptions are intentional. I’m not sure why, but I thought that having to manually pay my credit card bill every month would give me more control, or at least notice, of my financial situation. It hasn’t. I always pay in full, so that doesn’t vary. It hasn’t made me more aware of my spending, as I’d hoped it would. All in all, it’s made me more worried about it.

The funny thing is that I could have easily applied the lessons I’ve learned from Agile Programming to my finances, the biggest one being that of automation. In software, you have certain assertions that, given such and such parameters, must be true. There are ways of codifying these assertions into what is usually called a Unit Test. After a while you end up with a suite of tests that, if comprehensive, will tell you how well your software is abiding by its contract. The problem with unit tests is that they atrophy very easily. It’s so easy to forget or intentionally skip the tests when developing. That’s why they need to be automated. When they are automated, you can’t ignore your software’s contract violations because they are in your face. The point is this: automate, or it will never happen.

The same applies to finances, though to a lesser extreme. I do pay my credit card bill even though it’s not automatic because there are serious and immediate consequences to my not doing it.

I’d taken a few steps on my own before I got this book that I think have helped my finances and my sanity:

No, I don’t want a receipt

I tried, unsuccessfully, for a long time to record every financial transaction I was ever involved in. This was detrimental because I spent more time on the overhead of bookkeeping than the money involved was worth. I constantly worried about stacks of unfiled receipts, staring at me on the kitchen table. I aspired to the same level of financial mastery as the banker in Atlas Shrugged, who balanced his huge banking empire down to the penny.

I can’t do that, nor do I really want to anymore. Do I ever really want to find out how much I’ve spent on movie rentals in the last month? Or on eating out this last week with employees from j2? No. Not really. The need has never come up, and I don’t think knowing would benefit my financial situation much.

Quick, what’s the balance?

I wrote a script a while back to get the balance of my Wells Fargo accounts. It worked, but wasn’t that great. I’ve since improved it and added one for my credit card, and both of these show up on my desktop, updated automatically every three hours. This helps me track my finances much better than tediously recording receipts ever could.
