Blog Search Update
Posted January 11th, 2006 by Dave
([\w]*)[\w =\.\,\-\;]((?:[\w]+ =?){0,50)!
If you want to scare or confuse someone, just paste them some regular expressions, which incidentally remind me a lot of Geek Code. Of course, I’m way too busy working on Best of the Web! to worry about crafting my own. The regex above is actually part of the code that generates the abstract for a search result in the forthcoming Blog Search.
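For the curious, here is roughly how a regex can pull an abstract out of a page. This is only a minimal sketch in Python, not the actual Blog Search code: the function name make_abstract, its parameters, and the pattern itself are all made up for illustration, and the real expression is only partly shown above. The idea is to grab a limited number of words on either side of the search term.

```python
import re


def make_abstract(text, query, max_words=50):
    """Return an excerpt of `text` around the first occurrence of `query`.

    Hypothetical helper for illustration only; not the Blog Search code.
    """
    # Group 1: up to `max_words` words before the hit.
    # Group 2: the query itself, escaped so it is matched literally.
    # Group 3: up to `max_words` words after the hit.
    pattern = re.compile(
        r"((?:\w+\W+){0,%d})(%s)((?:\W+\w+){0,%d})"
        % (max_words, re.escape(query), max_words),
        re.IGNORECASE,
    )
    match = pattern.search(text)
    if match is None:
        return None  # the query does not appear in the text
    before, hit, after = match.groups()
    return before + hit + after


text = "The quick brown fox jumps over the lazy dog"
print(make_abstract(text, "fox", max_words=2))
```

A real search engine would also want to highlight the matched term and cope with several query words, which this sketch leaves out.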
I don’t know about you, but I never gave much thought to what happens when I use my search engine of choice; I just type in my query and get my results. I definitely take for granted the fact that I can click a button and be shown web sites containing the text I searched for. I have a new appreciation for the whole process, though, now that I’ve built my own search engine. It’s not quite billions of pages indexed, but the same kind of thinking applies in many respects, in terms of the architecture and setup of the whole crawling/indexing/searching/parsing process.
I just wanted to give you all a little taste of some of the blog search innards, and maybe you’ll have a new appreciation for what it takes: a lot of hard work and geeky-looking code! It will be a work in progress for a while, but it is stable enough to launch. I would say time to launch can now be measured in days and hours, not weeks. So be sure to check the Blog Directory often over the coming days. It’s-a-comin’!




