mnoGoSearch performance issues

MySQL performance

MySQL users may declare mnoGoSearch tables with the DELAY_KEY_WRITE=1 option. This makes index updates faster, because key changes are not written to disk until the table is closed. In other words, DELAY_KEY_WRITE removes on-disk index updates from normal operation entirely.

With this option, indexes are maintained only in memory and are written to disk as a last resort: by the FLUSH TABLES command or at mysqld shutdown. That final write can take minutes, and an impatient user may kill -9 the MySQL server in the meantime and corrupt the index files. Another downside is that, if something killed mysqld in the middle of its work, you should run myisamchk on these tables before starting mysqld again to make sure they are intact.

For this reason, we did not include this table option in the default table structure. However, because the key information can always be regenerated from the data, you should not lose anything by using DELAY_KEY_WRITE. Use this option at your own risk.
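
As an illustration of the option discussed above, here is a minimal sketch that builds the ALTER TABLE statements needed to switch it on or off for existing tables. The helper name delay_key_write_statements and the table names dict and url are assumptions made for this example; substitute the table names from your own mnoGoSearch schema. The script only prints SQL and does not connect to any server.

```python
def delay_key_write_statements(tables, enable=True):
    """Build one ALTER TABLE statement per table name.

    `tables` is a list of table names (hypothetical here); `enable`
    selects DELAY_KEY_WRITE=1 (True) or DELAY_KEY_WRITE=0 (False).
    """
    value = 1 if enable else 0
    return [f"ALTER TABLE `{name}` DELAY_KEY_WRITE={value};" for name in tables]


if __name__ == "__main__":
    # "dict" and "url" are assumed table names, used only for illustration.
    for statement in delay_key_write_statements(["dict", "url"]):
        print(statement)
    # After indexing, FLUSH TABLES forces the delayed keys onto disk.
    print("FLUSH TABLES;")
```

The printed statements can be fed to the mysql client. Keep in mind the risk described above: until FLUSH TABLES or a clean shutdown completes, the on-disk indexes are stale.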

Post-indexing optimization

This article was supplied by Randy Winch.

I have some performance numbers that some of you might find interesting. I'm using RH 6.2 with the 2.2.14-6.1.1 kernel update (allows files larger than 2 gig) and mysql 2.23.18-alpha. I have just indexed most of our site using mnoGoSearch 3.0.18:

          mnoGoSearch statistics

    Status    Expired      Total
       200     821178    2052579 OK
       301        797      29891 Moved Permanently
       302          3          3 Moved Temporarily
       304          0          7 Not Modified
       400          0         99 Bad Request
       403          0          7 Forbidden
       404      30690     100115 Not found
       500          0          1 Internal Server Error
       503          0          1 Service Unavailable
     Total     852668    2182703

I optimize the data by dumping it into a file using SELECT * INTO OUTFILE, sorting it into word (CRC) order with the system sort routine, and then reloading it into the database using the procedure described in the MySQL online manual.

The performance is wonderful. My favorite test is searching for "John Smith". The optimized database version takes about 13 seconds. The raw version takes about 73 seconds.

Search results: john : 620241 smith : 177096
Displaying documents 1-20 of total 128656 found

You can also get a script from our site. It was written by Joe Frost and implements Randy's idea.
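
The sorting step of the dump, sort and reload procedure described above can be sketched as follows. This is not Joe Frost's script, only an illustration under stated assumptions: the dump uses the default SELECT ... INTO OUTFILE layout (one row per line, tab-separated fields), the word CRC is an integer in the second column, and no field contains an embedded tab. The function names and the column position are hypothetical.

```python
def sort_dump_by_word(lines, word_column=1):
    """Return dump lines ordered by the integer in the word (CRC) column.

    `word_column` is the zero-based position of the CRC field; the
    default of 1 is an assumption, so check it against your table layout.
    """
    def word_key(line):
        fields = line.rstrip("\r\n").split("\t")
        return int(fields[word_column])

    # sorted() is stable, so rows for the same word keep their dump order.
    return sorted(lines, key=word_key)


def sort_dump_file(src_path, dst_path, word_column=1):
    """Read a dump file, sort it by word CRC, and write the result."""
    with open(src_path, encoding="utf-8", newline="") as src:
        rows = src.readlines()
    with open(dst_path, "w", encoding="utf-8", newline="") as dst:
        dst.writelines(sort_dump_by_word(rows, word_column))


if __name__ == "__main__":
    # Made-up sample rows: url id, word CRC, weight.
    sample = ["10\t300\t1\n", "11\t-5\t1\n", "12\t42\t2\n"]
    for row in sort_dump_by_word(sample):
        print(row, end="")
```

This sketch sorts in memory, which suits small dumps only. For a database of the size reported above, the system sort routine is the better choice because it can sort files larger than available memory. The sorted file is then reloaded with LOAD DATA INFILE, the counterpart of SELECT ... INTO OUTFILE.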