Regular Expressions in Ruby
-
PHP implements both POSIX and Perl-compatible regular expressions. The Perl-compatible regexp functions (which include all the preg_* functions) are the preferred library for most developers, since they offer many features not available in POSIX and are binary safe.
Ruby’s regular expressions use a largely Perl-compatible syntax, so if you’re familiar with the preg_* functions in PHP, you’re already well on your way to learning regular expressions in Ruby. Regular expressions are a complex topic, so we won’t be covering regular expression basics, but will instead focus on translating existing knowledge of Perl-compatible PHP functions to Ruby.
Regular Expressions in Ruby
We use regular expression patterns in PHP by passing a string argument to various functions. Ruby treats regular expressions differently. Instead of being written inside a string, patterns are objects, just like everything else in Ruby.
PHP
    $myRegexp = '/[a-z0-9]+\s/mi';
    print gettype($myRegexp); // => string
Ruby
    my_regexp = /[a-z0-9]+\s/mi
    p my_regexp.class # => Regexp
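Since a pattern is an ordinary object, it responds to methods and can be stored or passed around like any other value. The following is a minimal sketch of that idea; the `checks` hash and the `'secret1'` input are hypothetical names made up for illustration, and `match?` assumes Ruby 2.4 or later.

```ruby
my_regexp = /[a-z0-9]+\s/mi

# A Regexp responds to methods like any other object.
pattern_source = my_regexp.source     # the pattern text, without delimiters or flags
ignores_case   = my_regexp.casefold?  # true because of the "i" flag

# Patterns can be stored in collections and handed to other code.
# (checks and the sample input are illustrative only.)
checks  = { digits: /\d/, uppercase: /[A-Z]/ }
missing = checks.reject { |_name, re| re.match?('secret1') }.keys

p pattern_source # => "[a-z0-9]+\\s"
p ignores_case   # => true
p missing        # => [:uppercase]
```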
We can create regular expressions in Ruby using two different literal syntaxes. The most common is to enclose the pattern in forward slashes, but we can also use the alternate %r{} syntax. We usually use %r{} when the pattern contains a lot of forward slashes (such as a file path). Regular expressions can also be explicitly instantiated using the Regexp class.
Ruby
    /[a-z0-9]+\s/mi
    %r{/path/to/gif\.gif}mi
    # Single quotes keep \s intact; in a double-quoted string, "\s" is a literal space
    Regexp.new('[a-z0-9]+\s', Regexp::IGNORECASE | Regexp::MULTILINE)
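When translating patterns from PHP, two details are easy to trip over: string escaping in Regexp.new, and the fact that the same flag letters do not always mean the same thing in both languages. The sketch below illustrates both; the sample strings are made up for the example.

```ruby
# The literal and constructor forms describe the same pattern and options.
literal     = /[a-z0-9]+\s/i
constructed = Regexp.new('[a-z0-9]+\s', Regexp::IGNORECASE)
same = literal.source == constructed.source &&
       literal.options == constructed.options

# In a double-quoted string "\s" is a space character, so double the backslash.
double_quoted = Regexp.new("[a-z0-9]+\\s")

# Ruby's "m" flag makes "." match newlines, like PHP's "s" modifier.
without_m = "a\nb" =~ /a.b/   # nil
with_m    = "a\nb" =~ /a.b/m  # 0

# ^ and $ match at every line boundary in Ruby without any flag
# (PHP needs its "m" modifier for that). \A and \z anchor the whole string.
line_anchored   = "x\nfoo" =~ /^foo$/    # 2
string_anchored = "x\nfoo" =~ /\Afoo\z/  # nil

p same                 # => true
p double_quoted.source # => "[a-z0-9]+\\s"
```

A pattern copied from PHP with the m modifier therefore usually needs no flag at all in Ruby, while one using the s modifier needs Ruby's m.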
Comparing Functions/Methods
Now that we’ve seen some basic syntax for regular expression objects in Ruby, let’s take a look at PHP’s PCRE functions, and their closest equivalents in Ruby.
PHP                Ruby
preg_match         String#match
preg_match_all     String#scan
preg_replace       String#gsub
preg_split         String#split
preg_grep          Array#grep
preg_quote         Regexp.escape
preg_match vs. String#match
We match a pattern in Ruby strings using the match method. Ruby’s match method works differently than preg_match in how it returns matches. We usually want to know two different things when we match data: whether the pattern matched, and which specific sections of the string were matched.
PHP returns an integer to tell us if the data matched (either 0 or 1) and populates a matches array by reference. Ruby returns a MatchData object when the pattern matches, and nil when it doesn’t. We can inspect the MatchData object to find the actual string matches.
In this example, we try to match the different components of a list of email addresses. Both preg_match and String#match only match the first occurrence of the pattern.
PHP
    $string = 'joe@example.com; walter@example.org';
    $result = preg_match('/([a-z0-9_.-]+)@([a-z0-9-]+)\.([a-z.]+)/i', $string, $matches);
    var_export($result);  // => 1
    var_export($matches); // => array('joe@example.com', 'joe', 'example', 'com')
Ruby
    string = 'joe@example.com; walter@example.org'
    matches = string.match(/([a-z0-9_.-]+)@([a-z0-9-]+)\.([a-z.]+)/i)
    p !matches.nil? # => true
    p matches       # => #<MatchData "joe@example.com" 1:"joe" 2:"example" 3:"com">
    p matches[1]    # => "joe"
    p matches.to_a  # => ["joe@example.com", "joe", "example", "com"]
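Because match returns nil on failure, a common idiom is to assign and test in one step, then pull the groups out of the MatchData. This is a rough sketch of that idiom, reusing the pattern from the example; the variable names are only illustrative, and `match?` assumes Ruby 2.4 or later.

```ruby
string  = 'joe@example.com; walter@example.org'
pattern = /([a-z0-9_.-]+)@([a-z0-9-]+)\.([a-z.]+)/i

# match returns nil on failure, so it can be used directly as a condition.
if (m = string.match(pattern))
  user, domain, tld = m.captures  # only the groups, without the full match
  rest = m.post_match             # text after the matched portion
end

p [user, domain, tld] # => ["joe", "example", "com"]
p rest                # => "; walter@example.org"

# When only a yes/no answer is needed, match? avoids building a MatchData.
p 'no address here'.match?(pattern) # => false
```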
preg_match_all vs. String#scan
PHP returns an integer with the number of matches for preg_match_all and populates a matches array by reference. Ruby performs multiple matches for a string using the scan method. This method returns a nested array of matches, or an empty array when no matches are found. Be aware that the nesting of values in this array differs from how preg_match_all orders matches: by default, preg_match_all groups values by capture group, while scan returns one array of captures per match and omits the full match when the pattern contains groups.
In this example, we match components of the email address, and both preg_match_all and String#scan give us an array of the matches that are found.
PHP
    $string = 'joe@example.com; walter@example.org';
    $result = preg_match_all('/([a-z0-9_.-]+)@([a-z0-9-]+)\.[a-z.]+/i', $string, $matches);
    var_export($result); // => 2
    var_export($matches);
    // => array(array('joe@example.com', 'walter@example.org'),
    //          array('joe', 'walter'),
    //          array('example', 'example'))
Ruby
    string = 'joe@example.com; walter@example.org'
    result = string.scan(/([a-z0-9_.-]+)@([a-z0-9-]+)\.[a-z.]+/i)
    p result.size # => 2
    p result      # => [["joe", "example"], ["walter", "example"]]
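If existing PHP code depends on the preg_match_all ordering, the scan result can be rearranged. The sketch below shows one possible way to do that with Array#transpose, and how to collect the full matches by leaving the capture groups out of the pattern.

```ruby
string = 'joe@example.com; walter@example.org'

# One array of captures per match (scan's native shape).
per_match = string.scan(/([a-z0-9_.-]+)@([a-z0-9-]+)\.[a-z.]+/i)

# Grouped by capture group instead, closer to preg_match_all's default order.
per_group = per_match.transpose

# Without capture groups, scan returns the full matches as a flat array.
full = string.scan(/[a-z0-9_.-]+@[a-z0-9-]+\.[a-z.]+/i)

p per_match # => [["joe", "example"], ["walter", "example"]]
p per_group # => [["joe", "walter"], ["example", "example"]]
p full      # => ["joe@example.com", "walter@example.org"]
```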
preg_replace vs. String#gsub
We perform pattern-based substitution in Ruby using gsub, which is equivalent to PHP’s preg_replace function. A notable difference is that the gsub method is also used for plain string substitution in Ruby. We do this by simply providing a string instead of a regular expression pattern. This would be like using the str_replace function in PHP.
In this example, we want to replace the domain in all emails with foo.
PHP
    $string = 'joe@example.com; walter@example.org';
    $result = preg_replace('/@([a-z0-9-]+)/', '@foo', $string);
    var_export($result); // => 'joe@foo.com; walter@foo.org'
Ruby
    string = 'joe@example.com; walter@example.org'
    result = string.gsub(/@([a-z0-9-]+)/, '@foo')
    p result # => "joe@foo.com; walter@foo.org"
We can use backreferences in our gsub replacements just as we would with preg_replace by using \1, \2, etc. in our replacement string.
In this example, we prefix the existing domain with a mail. subdomain. Remember to escape the backslashes used for the backreference.
PHP
    // Replace domain with mail.domain
    $string = 'joe@example.com; walter@example.org';
    $result = preg_replace('/@([a-z0-9-]+)/', '@mail.\\1', $string);
    var_export($result); // => 'joe@mail.example.com; walter@mail.example.org'
Ruby
    string = 'joe@example.com; walter@example.org'
    result = string.gsub(/@([a-z0-9-]+)/, '@mail.\\1')
    p result # => "joe@mail.example.com; walter@mail.example.org"
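A few related forms are worth knowing when porting preg_replace calls. The following is a sketch, not an exhaustive list: gsub also accepts a block, which is roughly what preg_replace_callback does in PHP; sub replaces only the first occurrence; and named groups can be referenced with \k<name>. The group name `domain` is chosen here purely for illustration.

```ruby
string = 'joe@example.com; walter@example.org'

# Block form: the block's return value replaces each match.
# $1 holds the first capture group of the current match.
upcased = string.gsub(/@([a-z0-9-]+)/) { "@#{$1.upcase}" }

# sub works like gsub but stops after the first match.
first_only = string.sub(/@([a-z0-9-]+)/, '@mail.\1')

# Named groups are referenced with \k<name> in the replacement string.
named = string.gsub(/@(?<domain>[a-z0-9-]+)/, '@mail.\k<domain>')

p upcased    # => "joe@EXAMPLE.com; walter@EXAMPLE.org"
p first_only # => "joe@mail.example.com; walter@example.org"
p named      # => "joe@mail.example.com; walter@mail.example.org"
```

Single-quoted replacement strings are the safer default here, since a double-quoted "\1" is interpreted as a control character rather than a backreference.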
preg_split vs. String#split
We split strings by a pattern in Ruby using the split method. This is pretty much the same as the preg_split function in PHP. As with gsub, we can also use this same method to split using a string instead of a regular expression. This means that split also performs the equivalent of the explode function in PHP.
In this example, we create an array from the list of emails by splitting the string on a semicolon followed by an optional space.
PHP
    $string = 'joe@example.com; walter@example.org';
    $result = preg_split('/;\s?/', $string);
    var_export($result); // => array('joe@example.com', 'walter@example.org')
Ruby
    string = 'joe@example.com; walter@example.org'
    result = string.split(/;\s?/)
    p result # => ["joe@example.com", "walter@example.org"]
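The optional arguments of preg_split have rough counterparts in Ruby as well. This sketch shows splitting on a plain string (like explode), limiting the number of pieces, and keeping delimiters by capturing them (similar in spirit to PREG_SPLIT_DELIM_CAPTURE). The third address and the 'a, b;c' input are hypothetical values added so the effect is visible.

```ruby
# ann@example.net is a made-up third entry for illustration.
string = 'joe@example.com; walter@example.org; ann@example.net'

# Splitting on a plain string, like PHP's explode.
by_string = string.split('; ')

# A second argument limits the number of pieces returned.
limited = string.split(/;\s?/, 2)

# Capture groups in the pattern are included in the result.
with_delimiters = 'a, b;c'.split(/([,;])\s?/)

p by_string.size  # => 3
p limited         # => ["joe@example.com", "walter@example.org; ann@example.net"]
p with_delimiters # => ["a", ",", "b", ";", "c"]
```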
preg_grep vs. Array#grep
The preg_grep function in PHP is a useful function to find entries in an array that match a given pattern. Ruby does this same operation with the grep method.
In this example, we’ll build a new array that only consists of email addresses that end in .com.
PHP
    $myArray = array('joe@example.com', 'walter@example.org');
    $result = preg_grep('/\.com$/', $myArray);
    var_export($result); // => array('joe@example.com')
Ruby
    my_array = ['joe@example.com', 'walter@example.org']
    result = my_array.grep(/\.com$/)
    p result # => ["joe@example.com"]
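Ruby's grep goes a little further than preg_grep. As a sketch: it accepts a block to transform each matching entry, grep_v (Ruby 2.3 or later) returns the entries that do not match, and because grep compares with ===, it also works with non-regexp patterns such as classes. Note that, unlike preg_grep, the result is a fresh array and the original indexes are not preserved. The mixed array at the end is made up for illustration.

```ruby
addresses = ['joe@example.com', 'walter@example.org']

# With a block, each matching entry is transformed before being collected.
com_users = addresses.grep(/\.com$/) { |address| address.split('@').first }

# grep_v inverts the selection.
not_com = addresses.grep_v(/\.com$/)

# grep uses ===, so a class works as a pattern too.
integers = [1, 'two', 3.0, 4].grep(Integer)

p com_users # => ["joe"]
p not_com   # => ["walter@example.org"]
p integers  # => [1, 4]
```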
preg_quote vs. Regexp.escape
When we use a string as a regular expression, we want to escape the characters that could be interpreted as regexp special characters. PHP does this using preg_quote, and Ruby has an equivalent Regexp.escape method.
In this example, we’ll escape any regular expression special character in the given string.
PHP
    $string = '[my_file.gif]';
    $result = preg_quote($string);
    var_export($result); // => '\\[my_file\\.gif\\]'
Ruby
    string = '[my_file.gif]'
    result = Regexp.escape(string)
    p result # => "\\[my_file\\.gif\\]"
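The escaped string is typically interpolated into a larger pattern. The sketch below contrasts what happens with and without escaping; the \A and \z anchors and the test inputs are choices made for this illustration, and `match?` assumes Ruby 2.4 or later.

```ruby
filename = '[my_file.gif]'

# Unescaped, the brackets form a character class, so a single letter matches.
unescaped = Regexp.new(filename)

# Escaped and interpolated, the pattern matches only the literal text.
exact = /\A#{Regexp.escape(filename)}\z/

p unescaped.match?('m')         # => true
p exact.match?('[my_file.gif]') # => true
p exact.match?('m')             # => false
p exact.match?('[my_fileXgif]') # => false
```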
Regular Expressions in Rails
Rails uses regular expressions in various places to specify patterns. When we are matching a route in Rails, we can use them to assign a requirement that a route component must match:
Ruby
    ActionController::Routing::Routes.draw do |map|
      map.connect 'teams/:team_id/players/:action/:id', :team_id => /\d+/
    end
We can also use regular expressions in our models when we validate the format of data. We pass a regexp to the :with option of validates_format_of:
Ruby
    class Image < ActiveRecord::Base
      # \z anchors the extension to the end of the URL
      validates_format_of :url, :with => /\.(gif|jpg)\z/i,
                          :message => "must be a GIF or JPG"
    end
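A format pattern is satisfied if it matches anywhere in the value, so whether it is anchored matters. The plain-Ruby sketch below, run outside of Rails, compares an unanchored extension check with one anchored by \z; the URLs are hypothetical, and the assumption is that the validation simply tests the value against the pattern.

```ruby
unanchored = /\.(gif|jpg)/i
anchored   = /\.(gif|jpg)\z/i

# Hypothetical URLs used only to show the difference.
suspicious = 'http://example.com/logo.gif.exe'
legitimate = 'http://example.com/LOGO.JPG'

p unanchored.match?(suspicious) # => true
p anchored.match?(suspicious)   # => false
p anchored.match?(legitimate)   # => true
```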
When we are testing controller code, the assert_select method will accept a regular expression to match response data according to the given pattern.
Ruby
    class HomepageControllerTest < ActionController::TestCase
      def test_greeting
        get :index
        # The hyphen goes last in the character class so it is read literally
        assert_select 'div.greeting', /Welcome [a-z0-9_-]+/
      end
    end


6 comments
comment by Php Developer 18 Jan 08
Excellent rails regexp tutorial. You can find more about regexp here
http://www.regular-expressions.info/
comment by Markus 23 Jan 08
You said “by enclosing the pattern in backslashes”, but I think you meant the forward slash in this case?
comment by Derek 23 Jan 08
Markus: Thanks, and fixed
comment by junaid 18 Mar 08
Nice article. What is the alternative of preg_replace_callback in ruby?
comment by junaid 18 Mar 08
We can achieve preg_replace_callback functionality in ruby this way
    def my_test_method(matches)
      return "_pk"
    end
    my_string = 'joe@example@example.com; walter@example.org'
    str = my_string.gsub(/@([a-z0-9-]+)/){ |match| my_test_method(match) }
    puts str
Thanks for nice article. It's really very helpful to me.
Regards
Junaid malik.
comment by Ryan 31 Mar 08
It is worth adding that you will need to escape a different set of characters on ruby than in php. For instance the curly braces ‘{’, ‘}’, and ‘#’ need to be escaped in ruby because they have special meaning in strings.