The writings of Peter Stuifzand

Archive for April 2009

This week I have been working on a search engine for my webshop platform. While a big part of this work is the layout of the results page, I also have to work on the backend.

The backend that I built uses MySQL fulltext indexing. The default MySQL configuration was not optimal for my goals: some queries didn't return any results. This can be fixed by changing a few config variables in my.cnf.

The two variables that I changed are

  1. ft_stopword_file
  2. ft_min_word_len

The first variable tells MySQL which stopword file it should use while creating an index. MySQL removes the words listed in that file from the input. Stopwords are common words that are rarely useful in queries but increase the size of the index. However, the list can contain words that are useful for your problem domain. I set the variable to '' and re-created the index. Now all words can be found, except short words.

The second variable is used to filter out short words. The default minimum word length is 4. This is too big for my website, so I have set it to 2.
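As a rough sketch, the two settings described above could look like this in my.cnf. The [mysqld] section and the comments are my assumptions; the values are the ones mentioned in the text.

```ini
# my.cnf (sketch; section placement is an assumption)
[mysqld]
# An empty value disables stopword filtering
ft_stopword_file = ''
# Index words of 2 characters and longer (the default is 4)
ft_min_word_len = 2
```

The server has to be restarted before the new values are used, and existing fulltext indexes have to be rebuilt. For MyISAM tables, the MySQL manual suggests REPAIR TABLE tbl_name QUICK for the rebuild, where tbl_name is a placeholder for your own table.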

With these two changes the search returns the results I expect. Both changes also increase the size of the index, because more words are indexed.

I created two simple mod_perl handlers that help me measure the response time of my web application. See them on GitHub: apache2-logging-handler.

The code is so simple that you can easily recreate the handlers yourself. Note the use of the pnotes function.

$r->pnotes('start_time');

With pnotes you can share information between handlers. I had to search for a bit to find it, but it works great.
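To show how sharing a value through pnotes might look, here is a minimal sketch of two mod_perl 2 handlers. This is not the code from the repository: the package names My::Timing::Start and My::Timing::Log are hypothetical, and the choice of handler phases and the log format are assumptions.

```perl
# Sketch only: package names are hypothetical, not taken from the repository.
package My::Timing::Start;
use strict;
use warnings;
use Apache2::RequestRec ();
use Apache2::RequestUtil ();               # provides $r->pnotes
use Apache2::Const -compile => qw(OK);
use Time::HiRes ();

sub handler {
    my $r = shift;
    # Store the start time as [seconds, microseconds] for later handlers
    $r->pnotes(start_time => [Time::HiRes::gettimeofday()]);
    return Apache2::Const::OK;
}

package My::Timing::Log;
use strict;
use warnings;
use Apache2::RequestRec ();
use Apache2::RequestUtil ();
use Apache2::Log ();                       # provides $r->log
use Apache2::Const -compile => qw(OK);
use Time::HiRes ();

sub handler {
    my $r = shift;
    # Read back the value that the first handler stored
    my $start = $r->pnotes('start_time');
    if ($start) {
        # With one argument, tv_interval measures from $start until now
        my $elapsed = Time::HiRes::tv_interval($start);
        $r->log->info(sprintf '%s took %.3f s', $r->uri, $elapsed);
    }
    return Apache2::Const::OK;
}

1;
```

One way to wire these up, assuming both packages are loaded by the server, is to register the first with PerlPostReadRequestHandler and the second with PerlLogHandler in the Apache configuration, so the start time is recorded early and read back after the response has been sent.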
