In profiling some code recently, we found a spot where we were using OpenStruct to handle parsed JSON coming back from a web service. OpenStruct is nice because it allows you to address the returned object like a normal Ruby object using dot notation (i.e., `my_class.name = "foo"`). We read the following warning in the docs:
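As a quick illustration of why this is appealing, `JSON.parse` can hand back OpenStructs directly (the payload below is made up for the example):

```ruby
require 'json'
require 'ostruct'

json   = '{"name":"foo","id":42}'
result = JSON.parse(json, object_class: OpenStruct)

result.name  # => "foo"
result.id    # => 42
```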
This should be a consideration if there is a concern about the performance of the objects that are created, as there is much more overhead in the setting of these properties compared to using a Hash or a Struct.
However, we didn't realize just how bad performance would be. Check out this benchmark comparing Hash, OpenStruct, Struct, and Class:
OpenStruct uses Ruby's method_missing and define_method to simulate "normal" objects, so if you are using these methods frequently in code, check to see if they are really necessary.
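The original benchmark code isn't reproduced here, but a minimal version might look like this (the iteration count and attribute names are assumptions):

```ruby
require 'benchmark'
require 'ostruct'

N = 100_000
PersonStruct = Struct.new(:name)

class PersonClass
  attr_accessor :name
end

Benchmark.bm(11) do |x|
  x.report('hash:')       { N.times { h = { name: 'John' }; h[:name] } }
  x.report('openstruct:') { N.times { o = OpenStruct.new(name: 'John'); o.name } }
  x.report('struct:')     { N.times { s = PersonStruct.new('John'); s.name } }
  x.report('class:')      { N.times { c = PersonClass.new; c.name = 'John'; c.name } }
end
```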
Seeing that classes outperformed hashes by a significant margin, I decided to investigate using Ruby 2.0's new keyword arguments when I needed optional parameters on methods. My assumption was that calling a method and passing in the arguments as a hash would incur some performance penalty as each method invocation would result in a new hash being instantiated.
I was surprised to see just how much faster Ruby's native keyword arguments were than passing an options hash:
Hash vs. keyword arguments:

```
# Using Ruby 2.2.1p85 on a MacBook Pro (Retina Early 2015) w/ 3.1 GHz i7
               user     system      total        real
hash:      7.980000   0.510000   8.490000 (  8.631545)
keywords:  2.940000   0.060000   3.000000 (  3.038953)
```
Hash vs. OpenStruct vs. Struct vs. Class:

```
                  user     system      total        real
hash:         5.490000   0.540000   6.030000 (  6.396844)
openstruct: 127.070000   1.610000 128.680000 (131.303417)
struct:       3.630000   0.040000   3.670000 (  3.691814)
class:        2.350000   0.010000   2.360000 (  2.415393)
```
In honor of Docker Global Hack Day, here's a recipe for running your Ruby tests in parallel using Docker and Rake on your local development machine.
This was inspired by Nick Gauthier's great post detailing how to parallelize Rails tests using a shell script and Docker. My use case was slightly different (Sinatra, not Rails), and I wanted my dependencies running in multiple Docker containers instead of a single one. I also wanted to avoid the cost of running `bundle install` each time the tests run. This means that the Ruby tests will run outside of Docker, pointing to the running Docker containers for their infrastructure dependencies.
To do this, I'm using Vagrant on OS X with the Docker Vagrant distribution. This meant customizing the Vagrant VM to include the version of Ruby I needed, installing Bundler there, and mapping a directory to my host file system so I could run tests against the same code I'm working on.
Clearly, this won't isolate the Ruby environment in its own Docker container. However, we don't typically experience problems that stem from differences in Ruby or Gems, so this tradeoff was one I was willing to make.
Using this technique you can run multiple sets of Docker containers, each running a discrete portion of your infrastructure (in my case it's Redis and Solr). I'm going to assume you have your own Docker images configured in a working Docker installation (I used Docker 0.6 and 0.7 for my tests).
We'll be creating a custom Rake task to start the docker containers and run the tests, here's a sample:
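The original sample didn't survive in this archive, so the sketch below is a reconstruction of the idea; the image name, ports, test file glob, group size, and sleep duration are all assumptions:

```ruby
# Rakefile (sketch)
require 'rake'
include Rake::DSL # only needed when loading this sketch outside of rake itself

# Split the test files into groups, one group per fork
GROUPS = Dir.glob('test/**/*_test.rb').each_slice(4).to_a

task :parallel_test do
  pids = GROUPS.each_with_index.map do |files, i|
    fork do
      port = 6379 + i
      # Each fork gets its own Redis container on its own host port
      system("docker run -d -p #{port}:6379 --name test_redis_#{i} redis")
      sleep 5 # let the process inside the container come up before connecting
      ENV['REDIS_PORT'] = port.to_s
      exec("bundle exec ruby -Itest #{files.join(' ')}")
    end
  end

  statuses = pids.map { |pid| Process.wait2(pid).last }
  GROUPS.each_index { |i| system("docker rm -f test_redis_#{i}") }
  abort 'One or more test forks failed' unless statuses.all?(&:success?)
end
```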
The rake task can be configured to run as many forks as necessary. Remember, your entire test run will only be as short as your longest individual test, so you may want to break test files down to improve overall efficiency. The Rake task invokes Docker using a system call, and I found it was important to let the fork sleep for a bit to allow the process within the container to start.

mosh is a very nice way to connect to remote servers via the command line. It works in conjunction with ssh for authentication, but after that it's a totally new client/server protocol.
mosh helps people who move around a lot stay connected and also has some advantages over ssh in terms of responsiveness, which is nice on slower connections (like when you're connecting via your cell on a train).
Getting mosh working on Dreamhost isn't totally straightforward, but it is pretty simple if you are comfortable working on the command line. I'll give a step-by-step guide on installing and running mosh 1.2.2 with Protocol Buffers 2.4.1 below (there is definitely room for improvement, as you'll see, and suggestions are welcome).
Despite trying both .bash_profile and .profile, I couldn't get the export from the last line to stick. Same thing for the path settings. I'm not sure what kind of shell mosh invokes when it tries to run itself on the server, nor do I know how to get the environment variables to work there. Suggestions welcome!
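One workaround I haven't fully verified: mosh's `--server` flag lets you spell out the remote command and its environment explicitly, sidestepping whatever shell mosh invokes on the server. The `~/local` paths below are assumptions about where you built mosh and Protocol Buffers on your Dreamhost account:

```shell
# Sketch: wrap the connection in a function so the flags live in one place.
# The ~/local paths and the hostname are assumptions.
mosh_dreamhost() {
  mosh --server="LD_LIBRARY_PATH=\$HOME/local/lib \$HOME/local/bin/mosh-server" \
    "$1"
}

# Usage: mosh_dreamhost user@yourdomain.dreamhost.com
```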
There are several guides out there for running Drush on Dreamhost. Most require that you set an alias and run the drush.php file directly. However, you can do something just as easy and make it so that Drush will run via the provided shell script, which seems safer to me.
You'll want to do the following steps from the command line (skip to Step 3 if you already have Drush downloaded and just need to make it work properly):
Other Dreamhost Guides:
Ever accidentally committed a password in a file that you actually need? One of the advantages of using Git is that you can fix your horrible blunder with relative ease.
This example uses some code from Kevin van Zonneveld's helpful post and some knowledge from GitHub's guide to removing sensitive data. Kevin's example uses a new Git repo, but I used a similar process to fix an existing one.
I'm also using GitHub as my Git host, so there may be changes if you need to do this elsewhere.
WARNING: This can break your stuff. Backup and be careful.
The BioPortal Web UI is primarily built using Ruby on Rails, which is an MVC framework built for the web. The views are plain HTML with the option to embed Ruby using ERB tags. We'll walk through a quick tutorial that will enable you to add your own sections of content on top of the existing Web UI. For our example, we'll add a section that will list tools that can be used to work with the ontologies in BioPortal. The Tools section will be a simple list that includes a tool's name, a web address, and a description. This list will be stored in a database and entries can be added, edited, and deleted.
For more guides on working with Rails, see http://guides.rubyonrails.org/v2.3.10.
This tutorial assumes that you have version .4 of the NCBO Virtual Appliance running in your virtualization environment.
From the Virtual Appliance command line, generate the scaffold, run the migration, and restart:

```
cd /srv/ncbo/rails/BioPortal/current
script/generate scaffold Tool name:string website:string description:string
rake db:migrate RAILS_ENV=production
ncborestart
```
Next, tell the new controller to use the shared BioPortal layout by adding

`layout "ontology"`

directly under the `class ToolsController < ApplicationController` line. If you refresh the page, you'll see it now shows the usual BioPortal layout. You can also read more about Rails layouts.
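The resulting controller looks something like this (the stub class at the top exists only so the sketch loads outside Rails; in the app, `ApplicationController` and the `layout` macro are provided by ActionController):

```ruby
# Stub so this sketch runs standalone; provided by Rails in the real app
class ApplicationController
  def self.layout(name)
    @layout = name
  end
end

class ToolsController < ApplicationController
  # Use the shared BioPortal layout instead of the scaffold default
  layout "ontology"
end
```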
Run

`ncborestart`

again for these changes to take effect. To avoid this, you can run Passenger in the development environment, which will reload all controllers and templates on each request.

Update 11/28/11: Clarified instructions and added a screenshot of the Tools section
Barry Smith, a philosopher turned ontologist, has made a two-day course available online that introduces the concept of ontologies and delves into their history, practical application, and connection to computer science. For those wishing to get a firm grasp on what ontologies are and how they're used, I highly recommend taking the time to give these lectures an in-depth listen.
I find Barry to be a captivating lecturer capable of distilling some very abstract concepts down to understandable, actionable pieces of knowledge. I saw him speak to a group of developers as a part of the National Center for Biomedical Ontology's recent meeting, where he urged all developers to at least give his course some attention. From other people in the ontology community I understand that he has a very particular viewpoint on the ontology world. But I found it useful and hope others will too. Any pointers to other educational pieces will be welcomed in the comments.
The videos, available for streaming below, are part of a two-day course that Barry teaches. Barry indicates that they are free to use in any capacity, so I took the liberty of uploading them to Viddler (the only free streaming video site I could find that would allow long videos with no special account).
http://www.viddler.com/explore/palexander/videos/1/
http://www.viddler.com/explore/palexander/videos/2/
http://www.viddler.com/explore/palexander/videos/7/
http://www.viddler.com/explore/palexander/videos/8/
http://www.viddler.com/explore/palexander/videos/6/
http://www.viddler.com/explore/palexander/videos/5/
http://www.viddler.com/explore/palexander/videos/3/
Hopefully this will be of use for those who are deploying BioPortal on their own servers:
The BioPortal team is happy to announce some new changes that will ease the life of those who are deploying the BioPortal UI application in stand-alone instances. This should make it easier for you to upgrade BioPortal as we release new versions and lays down a framework for us to add new functionality without breaking your installations in the future. Previously, things like the autocomplete, found in the Jump To and Form Complete widgets, were hard-coded to use bioportal.bioontology.org as their AJAX back-end. That's no longer the case.
See more: BioPortal UI Now Supports Easier Deployment -- and Internationalization
I like using one email account for all of my email needs and one way I try to cut down on possible spam or unwanted emails is to use subaddressing when I sign up for a new service. It's certainly not foolproof but it does provide an easy way to filter mails when that "Unsubscribe Me" link suspiciously fails to work.
Subaddressing is simple to use; that's the main reason I like it. Gmail supports it, as it should, since it's included in the email RFC. To use subaddressing, simply add a plus sign and a tag after the "local" part of your email address (the section before @domain.com). So, if you were signing up for Great Minds, you could use this email: myemail+greatminds@domain.com.
The problem comes in when people try to validate your email before allowing you to sign up. Many, many web services will not allow plus signs when you sign up, with varying degrees of success when it comes to handling a user who inputs the plus sign. Most services just reject the address as invalid and force you to try again. Some (Safeway) just strip the plus sign out without ever informing you, leaving you with an invalid email address in your account settings and, since you use your email to login, an invalid username as well.
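For what it's worth, accepting subaddresses costs nothing. Here's a sketch of a permissive check in Ruby (the pattern is illustrative, not a full RFC 5322 validator):

```ruby
# Permissive email check that allows "+" in the local part
VALID_EMAIL = /\A[\w.+-]+@[\w-]+(\.[\w-]+)+\z/

"myemail+greatminds@domain.com".match?(VALID_EMAIL)  # => true
"myemail@domain.com".match?(VALID_EMAIL)             # => true
"not-an-email".match?(VALID_EMAIL)                   # => false
```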
Today I was surprised by a new response: a "helpful" employee went in and changed my email for me, thinking it was a typo.
You didn't receive an email with your decline notice and instructions to fix your [company-name] purchase because there was a typo in the email address you entered, specifically [my-email]@gmail.com (but with "[company-name]" written out in the address). I added the correct email address to your account...
I recently needed to handle some URL changes in a Rails application, basically modifying an existing route to point somewhere else. I wanted to make sure that the old route still existed so that users who might rely on it would not get left behind, not to mention search indexes that might have picked up the old location. I used the method outlined by Andrew Bruce, then tweaked it to make sure that params that were being passed by the user remained intact.
Basically you can follow his technique and use the code below in the controller.
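Since the original snippet isn't shown here, this is a sketch of the param-forwarding part in plain Ruby; the helper name and route name are mine, and in the controller you'd hand the result to `redirect_to`:

```ruby
# Strip Rails' routing bookkeeping so only user-supplied params are forwarded
def forwardable_params(params)
  params.reject { |k, _| %w[controller action].include?(k.to_s) }
end

# In the controller, roughly:
#   redirect_to new_resource_path(forwardable_params(params)),
#               status: :moved_permanently
incoming = { "controller" => "old", "action" => "show",
             "q" => "term", "page" => "2" }
forwardable_params(incoming)  # => { "q" => "term", "page" => "2" }
```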
Pretty simple, and very effective when you need to move things around but still provide access to the deprecated URLs.
The following code will allow you to pull a search term and its context out of a larger string. It can be modified to use offsets as the start/finish points for the term rather than looking for the term itself. I'll end up doing this myself as the actual application this will be used in is semantic-based and the term a user searches for could have synonyms or relationships to other words which will also produce results. In this case it would be dangerous to highlight the original search term.
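The code itself didn't survive in this archive; below is a sketch of the idea (the method and parameter names are mine):

```ruby
# Return the matched term with up to `context` characters on either side,
# or nil if the term isn't found. Matching is case-insensitive.
def term_with_context(text, term, context = 40)
  idx = text.downcase.index(term.downcase)
  return nil unless idx

  start  = [idx - context, 0].max
  finish = [idx + term.length + context, text.length].min
  text[start...finish]
end

text = "The quick brown fox jumps over the lazy dog"
term_with_context(text, "FOX", 6)  # => "brown fox jumps"
```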
I recently needed to use Thickbox functionality in a place where I wanted the Thickbox to appear without user interaction. This was mainly to keep the appearance of the dialog in line with what we use elsewhere. To do this, it's possible to create a url that will never be called and use Thickbox's built-in functionality for handling inline content.
The variable should store a url just as you would normally create for use with Thickbox links. This can include various options, including height, width, modal state, and the id for the div that holds the inline content.
The first step was to create the HTML portion that would be displayed in the modal Thickbox. For some reason Thickbox wouldn't display the text unless it was contained within paragraph elements.
Next we can use jQuery to create an event trigger for after the document is ready. I found that the Thickbox invocation didn't work, at least not reliably, without waiting for the document ready event, which makes sense if the DOM hasn't been built out yet.
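Roughly, the pieces fit together like this (the element ID and dimensions are assumptions; `tb_show` is Thickbox's own function):

```javascript
// The URL is never actually fetched; "TB_inline" tells Thickbox to pull
// its content from the in-page element named by inlineId
var tbUrl = '#TB_inline?height=300&width=400&modal=true&inlineId=hidden_dialog';

// Guarded so this sketch is loadable outside a browser; on the real page,
// jQuery and Thickbox are already present
if (typeof jQuery !== 'undefined' && typeof tb_show === 'function') {
  jQuery(document).ready(function () {
    tb_show('Please wait', tbUrl, null);
  });
}
```

The hidden div itself (e.g. `<div id="hidden_dialog" style="display: none;"><p>...</p></div>`) needs its text wrapped in paragraph elements, as noted above.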
Pretty simple, and very effective for my needs.