Following the publication of a brief article on search results design by Adaptive Path, I decided that revising my database search script was a valuable goal. Specifically, meeting the checklist in that article was probably not a bad idea!
It’s not that the previous version was terrible, but I knew perfectly well that it could be much better.
The updates to the script are pretty straightforward:
Additions:
- Added: Made row highlighting available in both tabular and list-based search results.
- Added: Search terms are now highlighted in search results.
- Added: The default sort is now to order results by query relevance.
- Added: Paginated navigation of search results is now available.
- Added: Translation base file [English], so translating the script is easier.
- Added: Basic Spellchecking [English]
- Added: Default stylesheet
Changes:
- Changed: Text excerpts are now truncated at word boundaries, rather than in the middle of words.
- Changed: Separated results template information into external include files for easier upgrading or modification.
- Changed: Included the search form as part of the script so that search terms could be automatically returned to the search input.
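To make two of the items above concrete, here is a minimal sketch of word-boundary truncation and search term highlighting. It is written in Python purely for illustration and is not the script's actual code; the function names and the choice of a `<mark>` element are my assumptions.

```python
import html
import re


def truncate_at_word(text: str, limit: int, ellipsis: str = "...") -> str:
    """Cut text to at most `limit` characters without splitting a word."""
    if len(text) <= limit:
        return text
    cut = text[:limit]
    # If the character right after the cut is not whitespace, the cut landed
    # mid-word, so back up to the last space inside the cut.
    if not text[limit].isspace():
        head, sep, _ = cut.rpartition(" ")
        if sep:  # no space at all means one very long word: keep the hard cut
            cut = head
    return cut.rstrip() + ellipsis


def highlight_terms(text: str, terms: list[str]) -> str:
    """Return HTML-escaped text with each search term wrapped in <mark>."""
    words = [re.escape(term) for term in terms if term]
    if not words:
        return html.escape(text)
    pattern = re.compile("(" + "|".join(words) + ")", re.IGNORECASE)
    # With one capturing group, re.split alternates plain text and matches,
    # so the odd-numbered pieces are the matched search terms.
    pieces = pattern.split(text)
    return "".join(
        f"<mark>{html.escape(piece)}</mark>" if index % 2 else html.escape(piece)
        for index, piece in enumerate(pieces)
    )
```

Matching against the raw text and escaping each piece afterwards keeps a search term from accidentally matching inside an HTML entity such as `&amp;`.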
The spell checking is the most exciting addition in my view. It’s hardly complete, but it’s based on a list of 4,068 common misspellings available from Wikipedia. This addition has significantly bulked up the total download size, since I’m including the spell-checking database as part of the download, but I think it adds a lot of value to the script.
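For the curious, a list-based spell check of this kind boils down to a lookup table. The sketch below is in Python and is only an illustration of the idea, not the script's code; the three sample entries and the `suggest_query` name are assumptions standing in for the full list.

```python
from typing import Optional

# Hypothetical sample entries; the real list holds 4,068 common misspellings.
MISSPELLINGS = {
    "recieve": "receive",
    "seperate": "separate",
    "accomodate": "accommodate",
}


def suggest_query(query: str, table: dict[str, str] = MISSPELLINGS) -> Optional[str]:
    """Return a corrected query for a "Did you mean...?" prompt, or None."""
    words = query.split()
    # Look each word up in lowercase; unknown words pass through unchanged.
    corrected = [table.get(word.lower(), word) for word in words]
    if corrected == words:
        return None  # nothing in the query is a known misspelling
    return " ".join(corrected)
```

A lookup like this only catches misspellings that are on the list, and it has no opinion about a correctly spelled wrong word: `suggest_query("It plane lee marks four my revue")` returns `None`.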
I’ve also added a translation base file to the package, to make it a bit easier for users of the script to port it to their own languages. Unfortunately, I haven’t yet had time to seriously work on the internationalization of the search script itself, so (to be entirely frank) this is an area for which the script isn’t really well suited at this time.
Internationalization is next on the list, however. It’s a high priority at this point, since internationalization ranks as one of the most reported problems with the script.
With spell-checking in mind, I think it’s appropriate to provide a healthy reminder of the limitations of spellcheck:
Candidate for a Pullet Surprise
by Mark Eckman and Jerrold H. Zar
I have a spelling checker,
It came with my PC.
It plane lee marks four my revue
Miss steaks aye can knot sea.
Eye ran this poem threw it,
Your sure reel glad two no.
Its vary polished in it’s weigh.
My checker tolled me sew.
A checker is a bless sing,
It freeze yew lodes of thyme.
It helps me right awl stiles two reed,
And aides me when eye rime.
Each frays come posed up on my screen
Eye trussed too bee a joule.
The checker pours o’er every word
To cheque sum spelling rule.
Bee fore a veiling checker’s
Hour spelling mite decline,
And if we’re lacks oar have a laps,
We wood bee maid too wine.
Butt now bee cause my spelling
Is checked with such grate flare,
Their are know fault’s with in my cite,
Of nun eye am a wear.
Now spelling does knot phase me,
It does knot bring a tier.
My pay purrs awl due glad den
With wrapped word’s fare as hear.
To rite with care is quite a feet
Of witch won should bee proud,
And wee mussed dew the best wee can,
Sew flaw’s are knot aloud.
Sow ewe can sea why aye dew prays
Such soft wear four pea seas,
And why eye brake in two averse
Buy righting want too pleas.
There are numerous articles pointing out the business advantages of accessibility. Many of these reflect the similarity between accessibility and SEO. However, despite the close technical relationship between the needs of disabled users and the technical requirements of search engine optimization, the fact remains that the two goals are not the same, are not equivalent, and do not reflect the same ultimate goals.
At their hearts, web accessibility and SEO are focused on optimizing different aspects of your web site: accessibility cares almost exclusively about the disabled user and their experience whereas SEO is focused firmly on your bottom line and your experience, as site owner, in the online aspects of running your business.
Read more: Web Accessibility is not SEO
I actually wrote and published this almost two weeks ago. Unfortunately, I accidentally wrote it as a page instead of a post, and didn’t notice. At any rate, I’m publishing it now, although it’s a bit after the fact…
So, I know that I’ve been more than a little bit quiet lately. I’ve got some things in progress, but it’s been hard to focus in the heat! Nonetheless, I’ve just published an article at Practical eCommerce magazine entitled “Customer Service for the Hearing Impaired,” addressing some issues the deaf and hard of hearing communities encounter when dealing with online merchants.
Comments are accepted at Practical eCommerce, and I’ll be checking in there occasionally, so feel free to make your comments there! (Comments are also moderated at the magazine, and I have nothing to do with that…so you can also feel free to comment here. As you wish!)
The whole world of spam is an accessibility nightmare. The concept behind web accessibility is to ensure that users can access the complete functionality of your web site — but how do you cope with the fact that spambots will happily take advantage of any hole you leave?
Comment forms, contact pages, email addresses and enrollment forms. All methods of giving critical access to previously unidentified users — and all in positions where you just need to find that crucial differentiation between real people and robots.
When you’re talking about functionality which is locked behind a log-in form, there’s not really a huge amount of trouble in defining the security/accessibility conundrum. Require a good, secure password and you’re pretty safe. People with disabilities, for the most part, can use a password field just as effectively as anybody else. Once you’re behind that iron curtain, you can usually stop worrying about the distinction: everybody who has access to your private functionality is a known user. They’ve identified themselves, provided credentials which grant them a certain degree of access, and you can stop worrying about them.
But your front door can be a big problem.
You need to create a doorway which will allow visitors you don’t already know to reach you. They need to be able to contact you in order to initiate business, or enroll in your program, or at least create an account with your site. It’s therefore absolutely critical that you create a form which can be accessed by anybody.
But you still only want people using your form. Robot visitors rarely pay the enrollment fee, so they’re not exactly welcome visitors in every area of your site. You certainly don’t want to be thanking them for contacting you with an offer to enlarge your anatomy!
Spam protection and accessibility have inherent conflicts of interest: the former goal attempts to prevent a form from being used, the latter promotes it. The two goals aren’t actually antithetical to each other, but getting them to work collaboratively does require a detailed understanding of what the issues are.
Stopping the Robots
One of the most common solutions to the spam problem is to present a problem which a computer can’t solve. The most obvious solutions (pictures of animals, pictures of people, etc.) are inherently flawed because they require specific pieces of information in order to solve. They’ll require correct spelling in the correct language with knowledge of the subject depicted. Although most visitors may be able to identify an elephant, some visitors will inevitably (and correctly) identify it as an elefant.
Presumed knowledge is a barrier to both humans and computers.
This is what has led to the numerous garishly blurred and colored text images you’ve undoubtedly had to interpret. Computers can use character recognition to examine images and identify the text, so the presentation is warped to decrease the likelihood of recognition. Of course, this also decreases the likelihood that humans will be able to read the image. Humans with disabilities? No chance. Either you include an alt attribute, making the solution trivial for a computer, or you leave it out — making the solution impossible for somebody with a visual disability.
Thus was born the audio CAPTCHA. However, audio CAPTCHA requires specific technology — an audio format must be chosen, and an audio player provided. Additionally, computers are capable of recognizing audio excerpts in much the same way they can recognize images. As a result, the audio output is distorted. I’ve listened to audio CAPTCHAs, and all I can say is that I hope others have better luck than I do. I’ve never passed one.
And, of course, neither of these methods will provide access for anybody who is both hearing and visually impaired.
There are numerous other examples of attempts at accessible CAPTCHAs. Most of them depend on the fact that while robots may be text-aware, they are not necessarily capable of following instructions provided in text. Simple question & answer bot-blocking techniques like:
- Write “human” in the field below.
- What is 3 + 4?
- Is fire hot or cold?
These simple questions can slow spam, and can be considered generic spam prevention methods. They will stop almost all spam which is not specifically targeted at the form. However, if any programmer decides to write a bot to attack your site, solving them is a trivial problem. Simply put, these kinds of questions provide security through obscurity.
A second class of bot-blocking techniques is found in more complex question & answer sets:
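As a rough illustration of how little is involved, here is a minimal Python sketch of the server side of a simple question check. The function names are hypothetical, and I'm assuming the expected answer is kept somewhere the visitor can't see it (a server-side session, for instance) between showing the form and receiving it.

```python
import random


def make_sum_question(rng: random.Random) -> tuple[str, str]:
    """Build a question like "What is 3 + 4?" and its expected answer."""
    a, b = rng.randint(1, 9), rng.randint(1, 9)
    return f"What is {a} + {b}?", str(a + b)


def is_correct(expected: str, submitted: str) -> bool:
    """Compare answers, forgiving stray whitespace and capitalisation."""
    return submitted.strip().lower() == expected.strip().lower()
```

The check is only a few lines, and a bot written for this particular form needs about as few to read the two numbers and add them, which is why I file this under obscurity rather than security.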
- Write “red” in the 2nd text field on the left.
- Enter your name in the 3rd row, 2nd column.
These programmatically variable questions may also slow a bot, but can also be incredibly challenging, if not impossible, for a human visitor who is not using a visual browser with an output equivalent to the instructions.
Tricking the Robots
Now, robots aren’t terribly intelligent. Usually, their decision making skills are fairly limited. As such, it’s not terribly difficult to simply deceive them. These methods may have some effectiveness at slowing down bots:
- Required selections on option menus. Not that a specific option is required — just anything available in the menu.
- Honeypots: fields which should not be filled in, but probably will be by your average bot in its quest to cover all its options.
- Limited length fields: if you set this client-side, using the HTML maxlength attribute, a bot can easily limit its own input. However, if you set it server-side (at a safe margin for real users) you can stop a few bots which get over-eager.
Mike Cherim has valuable tips on these techniques in his article Protecting Forms from Spam ‘Bots, so I’m not going to elaborate on these points excessively. Again, however, these are all valuable methods within the “security through obscurity” school of protection — no serious protection against a motivated spammer.
Mike’s secure and accessible contact form makes use of a wide variety of techniques and provides thorough accessibility, so if you’re looking for a simple contact form which will block generic spam, it’s a great option.
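The three tricks listed above can all be checked on the server in a few lines. This is a minimal Python sketch and not Mike's implementation; the field names, the list of topics, and the 5,000 character limit are all assumptions made up for the example.

```python
MAX_MESSAGE_LENGTH = 5000  # assumed limit, set well above what real users need
ALLOWED_TOPICS = {"sales", "support", "other"}  # hypothetical menu options
HONEYPOT_FIELD = "website"  # hypothetical field that humans are told to skip


def looks_like_bot(form: dict[str, str]) -> bool:
    """Return True when a submission trips any of the three simple traps."""
    # Honeypot: a real visitor leaves this field empty.
    if form.get(HONEYPOT_FIELD, "").strip():
        return True
    # Required selection: any option from the menu is fine, but one is needed.
    if form.get("topic", "") not in ALLOWED_TOPICS:
        return True
    # Server-side length limit: over-eager bots tend to overshoot it.
    if len(form.get("message", "")) > MAX_MESSAGE_LENGTH:
        return True
    return False
```

If you use a honeypot, give it a visible or screen-reader-available label telling people to leave it blank, so that the trap only catches robots.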
Behavior Detection
This is a complicated area, which I’m not going to delve into in any significant detail, primarily because I’m not really qualified. However, it’s an important category of spam control, so it’s worth an overview.
The principle of behavior detection is based on one core observation: bots don’t behave like people. People are, for the most part, a complex blend of random behavior and systematic exploration. Bots are generally much more absolute. When you observe a web site “user” visit every single navigable page of your site at 30-second intervals, that user is clearly not human.
Although the actual interpretation is significantly more complicated, the challenge is simple: look for patterns. If a user’s time on a site matches a mathematical pattern, that’s a signal. The Bad Behavior package works (at least partially) on this general logic: search for indications about the user or user-agent and identify signals which suggest non-human activity.
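To give a flavor of what "look for patterns" can mean, here is a toy Python sketch that flags a visitor whose page requests arrive at suspiciously even intervals. This is not how Bad Behavior works internally; the function name and both thresholds are assumptions picked for the example.

```python
from statistics import mean, pstdev


def intervals_look_mechanical(
    timestamps: list[float],
    min_requests: int = 5,
    max_variation: float = 0.05,
) -> bool:
    """Flag request times (in seconds) that are spaced almost perfectly evenly."""
    if len(timestamps) < min_requests:
        return False  # too little data to call it a pattern
    ordered = sorted(timestamps)
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    average = mean(gaps)
    if average == 0:
        return True  # every request in the same instant is not a person
    # Coefficient of variation: spread of the gaps relative to their average.
    return pstdev(gaps) / average <= max_variation
```

A single signal like this should only ever be one input among several, since a patient human and a sloppy bot can both end up on the wrong side of any threshold.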
Requiring Specific Capabilities
Some spam solutions make the choice that they will require specific capabilities from the visitor in order to allow them to make contact. The WordPress comment spam plugin WP-Spamfree takes this strategy. The first layer of protection for this plugin is to require that any visitor trying to submit a comment has both JavaScript and cookies enabled.
Immediately, this strategy eliminates the vast majority of bots — and a small minority of humans.
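One way a capability requirement like this could be enforced is sketched below in Python. I want to be clear that this is my own guess at the general shape and not a description of how WP-Spamfree does it: the idea is that the page's JavaScript copies a server-issued token into both a cookie and a hidden form field, and the server refuses the submission unless both come back. The names `js_token`, `SECRET_KEY`, and `session_id` are all hypothetical.

```python
import hashlib
import hmac

SECRET_KEY = b"change-me"  # hypothetical server-side secret


def make_token(session_id: str) -> str:
    """Token the page's JavaScript is expected to copy into a cookie and a field."""
    return hmac.new(SECRET_KEY, session_id.encode("utf-8"), hashlib.sha256).hexdigest()


def passes_capability_check(
    session_id: str, cookies: dict[str, str], form: dict[str, str]
) -> bool:
    """Accept only if the token came back through both the cookie and the form."""
    expected = make_token(session_id).encode("utf-8")
    cookie_value = cookies.get("js_token", "").encode("utf-8")
    field_value = form.get("js_token", "").encode("utf-8")
    # compare_digest avoids leaking how much of the token matched.
    return hmac.compare_digest(cookie_value, expected) and hmac.compare_digest(
        field_value, expected
    )
```

A visitor with JavaScript or cookies switched off fails this check exactly as a bot would, which is the small minority of humans mentioned above.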
Conclusion
I’m not aware that there’s any solution which has 100% success at differentiating humans from bots. Any barrier put in place to spam will also create a barrier for somebody. However, this is a decision that must be made for any site: when you’re receiving thousands of spam messages a day through an insecure contact form, is it better to stop the occasional human or massively reduce your daily spam-killing time commitment?
Ultimately, there isn’t a real answer. Spam is too great of an issue to simply ignore. However, any time you create a CAPTCHA — of any sort — just remember this: provide an alternative. If you provide a phone number to those who have failed your little test, they may be able to reach you. If somebody needs to reach you, make it possible: even if they’ll have to write you a letter in order to post a comment on your blog.
Giving a talk is an interesting experience. In this case, with a time limit of 15 minutes, the biggest challenge was figuring out what I had time to cover. With a subject like web accessibility, I firmly believe that every aspect is critical — anything I leave out is something that somebody needs to know.
But it’s 15 minutes. You can’t really be effective if you try and cover the entire scope of a subject in 15 minutes.
The first challenge is figuring out the audience. In this case, I was speaking to a group of internet marketing professionals and site owners. For the most part, no programmers, no interface developers — not even people who necessarily have any direct access to the code of their sites. What can you teach them which they’ll be able to apply and understand immediately?
I’ve already given the speech, so I’m not trying to solicit suggestions for this particular event. However, I’m curious to know what you think the key issues are.
For your reference, I covered three general areas:
- Navigation which can be used by non-visual, non-mouse using groups.
- Content which can be read sensibly by text-aware devices.
- On-page navigation which can make the page easier to navigate.
I completely ignored HTML validation, web standards, accessibility guidelines, and anything about following technical specifications. For this audience, this didn’t strike me as an actionable conversation. Instead, I focused on practical investigations of site problems: whether the site can be used without a mouse; whether the site makes its content available to screen readers (or search engines); and whether standard methods have been employed which will enable disabled users to quickly and easily get around the page.
So I’m curious: what would you have talked about?
I’m looking for people to provide alternate language translations for my Color Contrast Tester. I’ve already got people offering to provide Italian and German language files, but once you’ve gone that far…why not keep going?
If anybody reading this can provide additional translations, let me know in the comments — I’ll respond privately to make arrangements. It’s an easy job; the language file is independent of the rest of the script, so there aren’t any serious challenges in sorting what needs to be done.
Thanks in advance!
Perfecting a web site is a long and involved process. There’s no getting around the fact that if you want every aspect of your site to be right (accessibility, search optimization, and just all-around pizzazz), you’ve probably got some significant work to do. However, that’s not to say that there aren’t things you can check quickly and efficiently to make sure you’re not making some of the more egregious errors!
Here are 8 speedy checkups (in no particular order) which you can easily perform on your site to inspect it for problems. No methods suggested require special knowledge of HTML or web programming. Excluding acquiring and installing software, these tasks shouldn’t take more than a few minutes for most sites.
That doesn’t include fixing any problems found, of course…
Read more: Web site Tune-up: 8 Quick Checkups