Tail Recursion

Adventures in Erlang

boot args

Have you ever wanted your code to behave differently in production and in development? An easy way to handle this is to add a meaningful boot argument like so:

$ erl -production

Now you can use init to check for the existence of the production argument. In this example, I’ll use this to select my redirect target domain:

Domain =
    case init:get_argument(production) of
        {ok, _} -> "www.example.org";
        error -> "localhost:8000"
    end.

You can also pass in values with your arguments like so:

$ erl -dictionary_size 500

And you can read the value back out with:

> init:get_argument(dictionary_size).
{ok,[["500"]]}
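
Note that the value comes back as a string nested inside two lists, so it usually needs unwrapping and converting before use. A minimal sketch of one way to do that follows; the module name boot_args and the helper int_arg/2 are made up for illustration, and falling back to a default on a missing or non-numeric value is just one possible policy:

```erlang
-module(boot_args).
-export([int_arg/2]).

%% Return the integer value of the boot flag Flag, or Default when the
%% flag is absent, has no value, or its value is not an integer.
%% (Hypothetical helper, not part of OTP.)
int_arg(Flag, Default) ->
    case init:get_argument(Flag) of
        {ok, Occurrences} ->
            %% One inner list per occurrence of the flag on the command
            %% line; use the first value of the last occurrence.
            case lists:last(Occurrences) of
                [Str | _] ->
                    try list_to_integer(Str)
                    catch error:badarg -> Default
                    end;
                [] ->
                    Default
            end;
        error ->
            Default
    end.
```

With the node started as erl -dictionary_size 500, boot_args:int_arg(dictionary_size, 100) returns 500; without the flag it returns 100.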

spell correction

Over the past couple of years I’ve done some NLP work in erlang including LSA, text segmentation and automatic spell correction. If you’ve ever thought about creating a spell checker, I’d recommend reading this article. Much can be improved in the implementation presented – chiefly, the brute force search of all conceivable variations of the input text. My own application required automatic correction on the order of a few milliseconds, so that approach was clearly not going to work.

I then found this article, which explains a terrific solution. In summary, if you keep your dictionary in a trie you can narrow your search to only variations which exist in the dictionary. This greatly improves the speed of the algorithm. I implemented the same idea in erlang, and ever since I’ve been doing ~1ms corrections with a dictionary of tens of thousands of words.
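
To make the idea concrete, here is a small sketch of a trie search bounded by edit distance. It is not the code from my library or from the article: the module name trie_sketch, the nested-map representation and the word_end marker are all assumptions chosen to keep the example short. The search carries one row of the Levenshtein table down each branch and abandons a branch as soon as every cell in the row exceeds the allowed distance.

```erlang
-module(trie_sketch).
-export([new/0, insert/2, search/3]).

%% A node is a map from character to child node. The atom key word_end
%% marks that a dictionary word terminates at this node.
new() -> #{}.

insert([], Node) ->
    Node#{word_end => true};
insert([C | Rest], Node) ->
    Child = maps:get(C, Node, #{}),
    Node#{C => insert(Rest, Child)}.

%% Return [{DictWord, Distance}] for every stored word within MaxDist
%% edits (insert, delete, substitute) of Word. Order is unspecified.
search(Word, MaxDist, Trie) ->
    FirstRow = lists:seq(0, length(Word)),
    walk(Trie, [], FirstRow, Word, MaxDist).

walk(Node, Prefix, Row, Word, MaxDist) ->
    Dist = lists:last(Row),
    Here = case maps:is_key(word_end, Node) andalso Dist =< MaxDist of
               true -> [{lists:reverse(Prefix), Dist}];
               false -> []
           end,
    %% Prune: cells never shrink below the row minimum on deeper rows.
    case lists:min(Row) =< MaxDist of
        false ->
            Here;
        true ->
            Here ++ lists:append(
                      [walk(Child, [C | Prefix], next_row(C, Word, Row),
                            Word, MaxDist)
                       || {C, Child} <- maps:to_list(Node), C =/= word_end])
    end.

%% Compute the Levenshtein row for one more trie character C.
next_row(C, Word, [Diag0 | _] = Prev) ->
    First = Diag0 + 1,
    fill(C, Word, Prev, First, [First]).

fill(_C, [], _Prev, _Left, Acc) ->
    lists:reverse(Acc);
fill(C, [W | Ws], [Diag, Up | PrevRest], Left, Acc) ->
    Cost = if C =:= W -> 0; true -> 1 end,
    Cell = lists:min([Left + 1, Up + 1, Diag + Cost]),
    fill(C, Ws, [Up | PrevRest], Cell, [Cell | Acc]).
```

For a dictionary of "cat", "car", "cart" and "dog", searching for "cat" with a maximum distance of 1 finds "cat" at distance 0 plus "car" and "cart" at distance 1, and the "dog" branch is dropped after two characters.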

While most of my work is application-specific, the trie structure and search algorithm are generic so I decided to open source them here. Like all of my projects, please feel free to send suggestions and pull requests.

utf-8

I recently had a problem where I was rendering an ErlyDTL template and symbols like © and ® were not displaying correctly. After a little trial and error I realised that the character encoding in my HTML template was set to UTF-8, but erlang’s default encoding is ISO-latin-1 (ISO8859-1). Fortunately the unicode module provides an easy fix for this:

{ok, Html} = my_template_dtl:render(Data),
unicode:characters_to_binary(Html).

And voilà, we have a UTF-8 binary.
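
The difference is easy to see on a single character. The sketch below (the module name utf8_sketch is only for illustration) encodes the code point for © both ways: list_to_binary/1 writes it as one Latin-1 byte, while unicode:characters_to_binary/1 writes the two-byte UTF-8 sequence a browser expects from a UTF-8 page.

```erlang
-module(utf8_sketch).
-export([demo/0]).

%% Encode the copyright sign as Latin-1 and as UTF-8 for comparison.
demo() ->
    Copyright = [16#A9],                             % code point of ©
    Latin1 = list_to_binary(Copyright),              % one byte: <<169>>
    Utf8 = unicode:characters_to_binary(Copyright),  % two bytes: <<194,169>>
    {Latin1, Utf8}.
```

Calling utf8_sketch:demo() returns {<<169>>,<<194,169>>}, although the shell may print the binaries as text depending on its settings.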

block expressions

I’ve long been annoyed that list comprehensions can only accept a single expression. What if I want to print and perform some computation on each value? What I didn’t know was that expressions can be grouped together as blocks:

> [begin erlang:display(N), N*10 end || N <- lists:seq(1,3)].
1
2
3
[10,20,30]

For more details see this.

where is my priv_dir?

Let’s say you’re writing an application that needs to load a file from its priv directory, so you need to know its path. You just call code:priv_dir/1 with the name of your application, right? Maybe. The catch is that this function only succeeds if your application is in the code path. As far as I’m able to tell that leaves us with three choices:

1) Make sure your application is installed to somewhere under a directory in $ERL_LIBS. This makes sense for a shared library, but may not make sense for your web server. Furthermore, when working on the code it’s annoying to have to add your development directory to $ERL_LIBS.
2) Just write code like file:read_file("./priv/dictionary"). Unfortunately that will only work if you run it from the application directory. Fail.
3) Use the following code:

{ModPath, _} = filename:find_src(?MODULE),
AppPath = filename:dirname(filename:dirname(ModPath)),
filename:join(AppPath, "priv").

So far that seems to be the winning solution for me. If you have a better solution or you can think of a reason why this doesn’t work please let me know.
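
One variation worth considering is to try code:priv_dir/1 first and only fall back to guessing from the module location when it fails. The sketch below does that with code:which/1, which returns the path of the loaded .beam file. The module name priv_sketch and the function priv_dir/2 are hypothetical, and the fallback assumes the conventional layout where the beam file lives in an ebin directory next to priv. As far as I know, filename:find_src/1 is marked as deprecated in newer OTP releases, which is another reason to look at alternatives.

```erlang
-module(priv_sketch).
-export([priv_dir/2]).

%% Locate the priv directory of App. Module must be a module that
%% belongs to App; it is only used when App is not in the code path.
priv_dir(App, Module) ->
    case code:priv_dir(App) of
        {error, bad_name} ->
            case code:which(Module) of
                BeamPath when is_list(BeamPath) ->
                    %% Assumes <app>/ebin/<module>.beam, so go up two levels.
                    AppPath = filename:dirname(filename:dirname(BeamPath)),
                    filename:join(AppPath, "priv");
                _NotAFile ->
                    %% non_existing, preloaded or cover_compiled
                    {error, bad_name}
            end;
        Dir ->
            Dir
    end.
```

From inside the application this could be called as priv_sketch:priv_dir(my_app, ?MODULE), where my_app stands in for your application name.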

edate

Date manipulation in erlang isn’t very pleasant. The calendar module provides a lot of the raw tools but none of the convenience of something like the Date class in ruby, or the time calculation extensions in rails. I borrowed ideas from both and created edate. Check it out and let me know if I’m missing anything that would be helpful. Examples:

> Date = edate:string_to_date("7/15/2010").
{2010,7,15}
> edate:end_of_month(Date).
{2010,7,31}
> edate:shift(Date, 52, weeks).
{2011,7,14}
> edate:day_of_week(Date).
"thursday"
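
For comparison, this is roughly what two of those operations look like when written directly against the calendar module. It is only a sketch of the raw approach, not how edate is implemented, and the module name date_sketch and its function names are made up for the example.

```erlang
-module(date_sketch).
-export([end_of_month/1, shift_weeks/2]).

%% Last day of the month that Date falls in.
end_of_month({Year, Month, _Day}) ->
    {Year, Month, calendar:last_day_of_the_month(Year, Month)}.

%% Move Date forward (or backward, for negative N) by N weeks by going
%% through the Gregorian day count.
shift_weeks(Date, N) ->
    Days = calendar:date_to_gregorian_days(Date) + N * 7,
    calendar:gregorian_days_to_date(Days).
```

Here date_sketch:end_of_month({2010,7,15}) gives {2010,7,31} and date_sketch:shift_weeks({2010,7,15}, 52) gives {2011,7,14}, matching the edate results above.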

matching with append

I was recently reading this example, and I was totally surprised by the syntax in the is_authorized example. Apparently you can use the ++ operator to pattern match. For example:

> "pajamas:" ++ Color = "pajamas:blue".

Color now has the value "blue". Be aware that this trick has its limitations – as far as I can tell it only works with a single variable and a single constant in the order given above.
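
The same pattern works in function heads, which is where it tends to be most useful. In the sketch below (the module name prefix_sketch and the function color/1 are invented for the example) the literal prefix has to be on the left because the compiler treats "pajamas:" ++ Color as shorthand for a list pattern whose tail is Color.

```erlang
-module(prefix_sketch).
-export([color/1]).

%% Strip a known literal prefix by matching on it in the function head.
color("pajamas:" ++ Color) ->
    {ok, Color};
color(_Other) ->
    error.
```

So prefix_sketch:color("pajamas:blue") returns {ok,"blue"}, and any string without the prefix returns error.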

It pays to read other people’s code – you never know what interesting tidbits you may find.