Tuning LAMP Systems
I have been on a website performance and scalability kick lately. I thought I would share some articles I have enjoyed on the topic.
IBM developerWorks has a nice little two-part series on Tuning LAMP Systems (pt 2).
Then again, IBM developerWorks often has some awesome articles. Scan it regularly if you can!
Enjoy!
When Stupid People Use Computers (Humor from InfoWorld)
Let’s face it: some people should just NOT be allowed near computers.

Photo by °Florian
InfoWorld, a magazine that is seldom considered a bastion of humor, has a series of hilarious articles with real-life stories about seriously stupid things that IT people and hackers did with computers.
They are a great way to kill some time and leave you feeling smugly confident that you are, at least, not that stupid.
Stupid user tricks: Eleven IT horror stories
Thai Open Source Initiative Uses… .NET?
From The Nation:
Soon, a factory will lead the push for development of open-source code software for Thai industries, locally.
The Association of Thai Software Industry (ATSI), the Industrial Promotion Department, the Software Industry Promotion Agency (Sipa), Microsoft (Thailand), Rangsit University and 20 local software companies have joined hands to set up the country’s first software factory.
ATSI president Somkiat Ungaree said the software factory is expected to open in the next two months. It will be housed at Rangsit University. The factory will receive Bt3 million in funding from the Industrial Promotion Department and Bt2 million from Sipa.
The factory’s first project will be developing a prototype of small-size manufacturing resource planning (MRP) software used in small and medium manufacturing plants.
Excellent! Home-grown, open source software so small Thai manufacturing plants won’t have to shell out big-bucks and be locked into proprietary software. GREAT IDEA!
But wait…
He said 100 programmers from the 20 local software companies, would be trained in Microsoft’s .Net platform at the factory. They will then develop open-source code software two days a week at the factory.
Huh? They are developing it in .NET! OK–so the application will be open source and free… but you are still locked into an expensive, closed-source, insecure PLATFORM.
I don’t know who dreams this stuff up…
Yahoo Design Patterns - Awesome!
There are things you do all the time: solutions that you have internalized, so that you can often look at a problem and think “oh, that’s a _____ problem and if you do _____ it’s easy to solve.” That’s a design pattern: a generalized problem that has a set of generalized solutions.
Design patterns don’t mean cookie cutter solutions! Design patterns are just generalized solutions to common problems. They are usually a good starting point when tackling complex problems: deconstruct the complex problems into component pieces, look at common solutions, then build something unique and amazing by combining them in new ways and adding your own unique flourishes as appropriate.
When I saw the Yahoo Design Patterns page a light went on. Kind of like a flash bulb. I had been working on a conceptual site design for a client and, looking over the catalog of design patterns, I suddenly saw how I could combine a few pieces to make a pretty darn slick feature for the website.
I am going to make the YDP site a regular stop when I am pondering complex site architecture or UI issues. It’s great to have a simple catalog of basic design patterns to pore over.
It’s easy to get caught up in the complexities of a problem and forget to take a step back and think about it from a simpler, more granular level.
It’s much easier to deconstruct complex problems when you have a bare-bones catalog of simple design patterns staring you in the face. The little nuggets trigger all sorts of “aha” flashbulbs that get the creative juices flowing again.
Awesome.
Encyclopedia Britannica Goes Wiki
Well, Encyclopedia Britannica has gone wiki.
Not a bad move, but will they be able to build a strong contributor base? On one hand, there is some cachet to having your contribution approved by EB; on the other hand, it’s perceived as closed, old-school, stuffy and exclusive–which might put potential contributors off.
I do like the “open for contributions but expert moderated” model. I think it’s the best model for online travel content sites and it’s the kind of model I see for travelguide.com’s future.
Barcamp Chiang Mai Is Underway!
Barcamp Chiang Mai is underway. We have 104 people who are attending.

Group Shot. That’s a lot of Barcampers!
Preetam Rai just gave a very interesting presentation on Interesting Web 2.0 Companies in Asia. Draper Fisher Jurvetson just set up an office in Vietnam. Preetam suggested that if companies in Thailand want to create a great international site, they should focus on their country’s innate strengths, and he pointed to tourism as an excellent existing industry to build great websites around. He also suggested that a site tightly focused ONLY on travel in Thailand, offering good content and community input, is badly needed and would probably be quite successful.
2:00pm I led a discussion on “What’s the Best CMS: Joomla, Drupal, WordPress or Typo3?” We basically came to the conclusion that there is no one “best” CMS, just many good tools that are better suited for different audiences.
3:00pm Getting ready to do a presentation “Twitter Rules, sudo Sugree.” @Sugree will be co-presenting with me and @molecularck via Twitter!
4:00pm Coffee break, distributed T-shirts. The crowd starts to dwindle.
4:30pm My final presentation was “Online Tourism in Thailand: Issues and Opportunities” with Dr. Ken Cosh, the head of the Payap IT Faculty.
6:00pm Supposed to finish, but we realized we scheduled one too many sessions. Great! Enjoying a great discussion on social networks in Thailand.
6:30pm Off to Dayli for an after party!
English Font, Thai Look: AW Siam
What’s not to love about this font?

AW Siam is a free font for Mac and PC that gives you English characters with REAL Thai flair.
It even uses a few actual Thai characters. (For example, “a” is “lor ling”, “T” is “sara o” and “n” is “tor ta-haan.”)
The funny thing is, when I showed it to some of my Thai friends that read English, they had a hard time reading it–the “Thainess” of the characters threw them!
Is Twitter a Better Search Engine than Google?
I have had a flurry of thoughts since I posted my blog “My New Distributed Brain.”
The result of that epiphany is this: that Twitter has the potential to be a better search engine than Google.

“But,” you say, “Twitter is a microblog? How can it beat Google at search?”
Are you on Twitter?
Try this out: the next time you have a question, post it to Twitter instead of doing a search on Google.
Did you get an answer? Was it The Right Answer?
This doesn’t work every time–at least not yet. But it works often enough that I use this approach to answer a lot of questions on a daily basis.
It works well enough that I notice a lot of other people that I follow are using it to ask questions–and get answers. (Twitter founder Biz Stone, Jason Calacanis and Chris Pirillo come to mind.)

It works often enough that Google and the other search engines would be well advised to take notice.
Querying Twitter does not always work right now but Twitter is growing fast.
With its open and flexible APIs, people are finding more and more new and innovative ways to use the Twitter platform.
Twitter is a great platform to tap the collective intelligence and channel it into enhancing–even transcending the search engine as we know it. It’s not AI, it’s all “I” (real human Intelligence).
Google, Yahoo and Microsoft–watch out!
Why Twitter?
Could another messaging / microblogging platform beat Twitter at this game?

Possibly–but Twitter has several advantages at this point:
- Large user base. Twitter has 1 million users and it’s growing 800% annually.
- It seems to have a user base that skews towards openness and community. Good, good.
- Following is open by default and does not require consent from both parties: one person can choose to follow another in a non-symmetrical relationship. This makes it easy for people to build a list of people they want to hear from, and easy for people to build a following.
- The “Track” feature allows you to track words of interest to you: this is critical.
- SMS integration. Can send and receive tweets via SMS. Perfect for mobile search.
Would people really have time to answer all these questions? Don’t worry, as Clay Shirky points out: We Have the Time! (Part 2)
Scenario: A Major Search Engine Acquires Twitter
Once this meme catches on, I see a very high potential for this scenario to unfold:
- One of the major search engines moves rapidly to acquire Twitter.
- The search engine uses the Twitter API to post some queries as tweets.
- People start to answer the search engine tweets; they do it for many reasons: ego, community, interest in the topic, self promotion–the reasons are many.
- The search engine uses Ajax to put Twitter responses on the results page in real time, augmenting their algorithmic search results. (Thanks for pointing this out, Arthur!)
- The search engine becomes the #1 search engine AND the biggest social network on the planet, dwarfing the Google of today.
There is a whole lot more the search engine could do to optimize the process; this is an idea in its infancy. Options to increase performance include: caching the answers to previous similar tweets, using the tweets as another source of signals for standard search results, and building in a reputation system so that tweeters are ranked by their accrued trust and accuracy ratings (this would help to prevent spam from cropping up in tweet results). And more. A lot more.
I have done a search (on Google) and I have not found a similar system proposed. Hmmm.
I did however, get an answer on a similar system when I tweeted about this idea. (See! See what I mean!)
@tewson pointed out that ChaCha is a search engine that can take queries from users via the web, voice and SMS, and a real person compiles an SMS response, but this is nowhere near as powerful as querying the masses.
Sergey, Larry, if you guys are reading this, follow me on Twitter: @jfxberns. We should talk.
2008.06.18 Update:
@celerachan pointed out this blog on SheGeeks by Alana Taylor that, basically, reaches the same conclusion: She Geeks In Tech - Stop Using Search Engines, Start Twittering
My New Distributed Brain
I have become smarter recently. It is due to my new Distributed Brain.
I have recently begun to distribute my thinking across the globe with Twitter. When a question pops into my head, I first check the local cache (my memory), then I query the global memory (Tweet the thought or question on Twitter) to see if an answer or comment bounces back–and often it does. (And if that fails, there is the fall-back of Googling for an answer.)
What’s so special about Twitter for getting answers? It’s the number of people that can have access to the data. Currently, there are over one million active tweeters–and it’s growing fast. Anybody who tweets can follow you on Twitter (unless, of course, you make your tweets private–but the default is public, which is nice).
Of course, not everybody does follow me; after all, I have never been that popular. However, I do have a fair number of like-minded individuals that follow me. But the big difference with Twitter is that people can track words. So if I tweet about Drupal, or Ubuntu or Thailand, anybody tracking those words can capture my tweet.
This Is Your Brain on Twitter
Here is an example:
While writing this blog, I was wondering if there were any stats on the number of Twitter users. I did not know the answer, but I figured somebody using Twitter would. So I tweeted the question.

I got the Twitter usage stats I wanted for supporting facts in this article. It took a few seconds but it was EXACTLY the info I needed. My distributed brain is quite smart. Smarter than my local brain by itself.
I see this happening all the time on Twitter. People asking about restaurants in Bangalore, how to fix a code problem, where to buy size 13 shoes in Bangkok–queries that Google would choke on.
Why Is This Different?
Internet users have been using forums, search engines, chat programs and other apps for querying and sharing information for a long time. So, why is this different?
It has a potentially wider reach than regular chat/IRC programs.
If you know where to ask the question on IRC or a specialized chatroom, you can probably get the answer just as easily–but finding that place? That might not be so easy. On Twitter, everybody who tracks the terms you use in your tweet can see it–a potentially larger audience.
It’s faster than a forum / mailing list.
With forums, like chat rooms, you need to know where to find the people that know the answer and then you have to wait–usually a few hours or even days–to get your answer. With Twitter, the answer often comes back in seconds.
It’s almost always more accurate than a search engine.
When you Google for an answer, an algorithm determines the response. Google does not (yet) have the technology to understand the question–but it has an algorithm that does a very good–but not perfect–job of figuring out what is relevant to your query.
If you use a human-powered search engine like Mahalo, you don’t get an answer to your question–you get a pre-built set of results based on the fact that your question is like another question their human editors have asked and compiled a result for. Again, it’s not THE answer to YOUR question–but AN answer to a question that is like your question.
Any Venture Capitalists Listening?
Here, potentially, lie the seeds of the Holy Grail of search engine technology: a real-time, human-powered search engine. If you can build a user base willing to answer real-time questions (possibly leveraging an existing social network), find a way to at least reasonably filter / channel questions to the people with the answers, and add a reputation system to prevent spam answers–then you could have a spectacular product.
But why would people answer other people’s questions for free?
Why do they do it now? It’s a great way to connect with other people with similar interests.
And it does not have to be free. Integrate something that leverages PPC ads and there is the potential for revenue sharing–on the order of billions of dollars of revenue per year. Oh yeah!
The new paradigm is the query becomes the social network.
I must ponder this more…. I feel a larger blog coming on.
Until then, follow me on Twitter: @jfxberns.
Finding and Removing Duplicate Files from Your Digital Photo Library
I needed to sort through a library of 150,000 digital photos that was taking up 150GB of disk space. I knew most were duplicates. (Poor file management practices!)
I needed a tool that would help me find the duplicates so I could delete them.
The problem was, the files were frequently renamed–so two files that were identical often had different file names. I needed a tool that would be able to recognize duplicate digital photos even if the names were different.
I found an awesome Windows program called DoubleKiller that can do just that.
DoubleKiller allows you to find duplicate files by many methods. The one that solved my problem (duplicate digital photos with differing filenames) was to search the folders I stored my photos in and flag all files that had the same file size AND the same CRC32 signature.
(A CRC32 signature is like a unique serial number that can be calculated from a file. It’s not really unique, but a CRC32 is a 32-bit value, so the odds of two different files of the same size sharing one by chance are roughly 1 in 4 billion, which makes it a pretty good indicator that the files are identical.)
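To make the size-plus-CRC32 idea concrete, here is a minimal Python sketch of the same kind of check. It is not DoubleKiller's code, just an illustration under my own assumptions: the function names `file_crc32` and `find_duplicates` are made up for this example, and it uses the standard `zlib.crc32` function.

```python
import os
import zlib
from collections import defaultdict


def file_crc32(path, chunk_size=1 << 20):
    """Compute the CRC32 of a file, reading in chunks to limit memory use."""
    crc = 0
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            crc = zlib.crc32(chunk, crc)
    return crc & 0xFFFFFFFF


def find_duplicates(folders):
    """Group files by (size, CRC32), ignoring file names entirely."""
    # First pass: bucket every file by its size (cheap, no reading needed).
    by_size = defaultdict(list)
    for folder in folders:
        for root, _dirs, names in os.walk(folder):
            for name in sorted(names):
                path = os.path.join(root, name)
                if os.path.isfile(path):
                    by_size[os.path.getsize(path)].append(path)

    # Second pass: only checksum files that share a size with another file.
    groups = defaultdict(list)
    for size, paths in by_size.items():
        if len(paths) < 2:
            continue  # a unique size cannot have a duplicate
        for path in paths:
            groups[(size, file_crc32(path))].append(path)

    # Keep only the groups that contain more than one file.
    return [paths for paths in groups.values() if len(paths) > 1]
```

Checking the size first means most files never need to be read at all, which matters when the library has 150,000 photos in it. A matching size and CRC32 is still only a strong hint, so a cautious script would compare the bytes of each pair before deleting anything.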
Once you set up the search parameters, tell DoubleKiller what folders to search in and start to run the program, it lists all the duplicate files it finds.
You can then select and delete the duplicates you want to remove.
One suggestion: DoubleKiller has a one-click feature that allows you to select all the duplicates but the first one or all the duplicates but the last one. The result set is ordered in the same order as the folders are entered in the Folders selection section of the Options tab. So when you add folders in the Options tab, enter them in order of priority, starting with the folder whose copies you most want to keep; “all but the first” will then leave that copy unselected. (Wow. That does not even sound clear to me–but I hope it makes sense once you play with DoubleKiller.)
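If the folder-priority idea is easier to follow as code, here is a rough Python sketch of “select all the duplicates but the first one.” This is my own illustration, not how DoubleKiller works internally; `pick_deletions` and the folder paths in the usage note are hypothetical.

```python
import os


def pick_deletions(duplicate_group, folders_by_priority):
    """Return the paths to delete, keeping the copy in the highest-priority folder."""

    def priority(path):
        absolute = os.path.abspath(path)
        for rank, folder in enumerate(folders_by_priority):
            base = os.path.abspath(folder)
            # The file belongs to this folder if it sits somewhere beneath it.
            if absolute.startswith(base.rstrip(os.sep) + os.sep):
                return rank
        # Files outside every listed folder sort last.
        return len(folders_by_priority)

    # Order the copies by folder priority (ties broken by path), keep the first.
    ordered = sorted(duplicate_group, key=lambda p: (priority(p), p))
    return ordered[1:]
```

For example, with the priority list `["/photos/albums", "/photos/backup", "/photos/unsorted"]`, a group of three identical files spread across those folders keeps the copy under `/photos/albums` and returns the other two for deletion.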
It took a long time to process 150,000 files (several hours) but I did not have to waste days–or even weeks–doing it by hand, and I can be sure I did not accidentally delete every copy of a photo and destroy valuable work.
And now I have an extra 150GB of disk space. Awesome!
Hurray for DoubleKiller! I wish I had found this gem a long time ago!
Now to tidy up my MP3 collection…
Oh–one more thing, there is also a Pro version of DoubleKiller that runs US$ 19.95. It’s faster and offers more automation options. I think I might have to pick up a copy!