David Boureau

Originally published at bootrails.com

How to encode a URL String in Ruby

Content originally published here: https://bootrails.com/blog/how-to-encode-url-string-in-ruby/

Short answer: encoding a Ruby String

The 2024 edition is:

URI::Parser.new.escape(my_url_string)

There's already a full Stack Overflow question about this, but since Ruby 3 (which removed URI.escape and URI.encode), the only version that seems to work without heavy tweaks is the one described above.

Full example:

require 'uri'

my_string = "hello world"
my_url_string = "https://somedomain.com?word=#{my_string}"
URI::Parser.new.escape(my_url_string)

# => "https://somedomain.com?word=hello%20world"


Corner case: unescaped characters

When you try to pass a question mark, however, this doesn't work:

require 'uri'

my_string = "hello?world"
my_url_string = "https://somedomain.com?word=#{my_string}"
URI::Parser.new.escape(my_url_string)

# => "https://somedomain.com?word=hello?world"

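This requires a little hack, .gsub('?', '%3F'), applied after escaping. If you apply it before, the % of %3F gets escaped to %25 and you end up with hello%253Fworld.

require 'uri'

my_string = "hello?world"
# Escape the value first, then replace the question marks that escape leaves alone
escaped_string = URI::Parser.new.escape(my_string).gsub('?', '%3F')
my_url_string = "https://somedomain.com?word=#{escaped_string}"

# => "https://somedomain.com?word=hello%3Fworld"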

Why the little hack? I didn't take the time to dive deep into the escape method, but I suppose that because the question mark is already part of a regular URL (it's the delimiter for the query string), this character is not encoded.
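As an alternative to the gsub hack, one option is to encode only the query value before building the URL. The sketch below assumes the standard library's URI.encode_www_form_component, which also encodes reserved characters such as ? (and writes spaces as + rather than %20). The helper name build_search_url is hypothetical and not part of the original article.

```ruby
require 'uri'

# Hypothetical helper (not from the original article): encodes only the
# query value, then interpolates it into the URL used in the examples above.
def build_search_url(word)
  encoded_word = URI.encode_www_form_component(word)
  "https://somedomain.com?word=#{encoded_word}"
end

puts build_search_url("hello?world")
# => https://somedomain.com?word=hello%3Fworld

puts build_search_url("hello world")
# => https://somedomain.com?word=hello+world
```

Because only the value is encoded, the ? that separates the host from the query string stays intact.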

So let's push this assumption further by trying to encode :// (I'll leave this part as an exercise for you :)

...

OK, they are not encoded, which leads me to the last section.
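A quick way to check this kind of behaviour is sketched below. It assumes URI::RFC2396_Parser, which to my knowledge is the class URI::Parser pointed to up to Ruby 3.3; the helper name unchanged_by_escape? is hypothetical and not part of the original article.

```ruby
require 'uri'

# Hypothetical helper (not from the original article): true when escape
# returns the string untouched, i.e. nothing in it was percent-encoded.
def unchanged_by_escape?(text)
  URI::RFC2396_Parser.new.escape(text) == text
end

# Characters that are part of regular URL syntax are left alone.
puts unchanged_by_escape?("://")          # => true
puts unchanged_by_escape?("?")            # => true
puts unchanged_by_escape?("&=")           # => true

# A space is not part of URL syntax, so it does get encoded.
puts unchanged_by_escape?("hello world")  # => false
```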

Robustness of URL String encoding in Ruby

I tried some other methods, but this is the one that works right now. I deliberately mentioned the year at the top of the article, because I'm not fully sure this answer will still be valid in, say, 2 or 3 years, but as of now it's the most robust way I've found.

Apart from characters that actually belong to a regular URL, I didn't find any other tricky stuff.

Depending on the use case, escaping those characters might be a good or bad idea.
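To illustrate the trade-off, here is a minimal sketch that encodes every reserved character, once on a single value and once on a complete URL. It assumes the standard library's URI.encode_www_form_component as the "encode everything" option; the helper name encode_everything is hypothetical and not part of the original article.

```ruby
require 'uri'

# Hypothetical helper (not from the original article): percent-encodes
# reserved characters such as : / ? & = along with everything else unsafe.
def encode_everything(text)
  URI.encode_www_form_component(text)
end

# Good idea: a value embedded in a query string keeps its literal meaning.
puts encode_everything("a&b=c")
# => a%26b%3Dc

# Bad idea: a complete URL loses its delimiters and is no longer usable as is.
puts encode_everything("https://somedomain.com?word=hello")
# => https%3A%2F%2Fsomedomain.com%3Fword%3Dhello
```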

Conclusion

Nothing special in this conclusion, I only hope this article saved you 5 minutes today :)

Best,

David.
