Still stuck with the "naughty" parser?



Vanguard
29th Jun 2002, 03:19
Great, I see we are still stuck with the inane parser that will blot out what could potentially be naughty words. I can't write, "... as it encompasses all of ..." without using the trick of inserting start/end HTML tags within the "naughty" portion of a word to prevent the parser from noticing the "asses" portion of the word "encompasses" and replacing it with "encomp*****".

So to talk like adults we have to act like self-censoring children even when the censoring doesn't apply? Obviously this is an inept parser since it doesn't know how to recognize whitespace, punctuation, or other normal grammatical parsing rules. You do NOT match on substrings, especially when they do not start on a word boundary (so their starting index or offset is 1).
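
To make the distinction concrete, here is a minimal Python sketch of the two matching strategies: one that masks a banned string anywhere it occurs, and one that masks it only when it stands alone between word boundaries. The banned list, the function names, and the asterisk masking are assumptions made for this illustration; this is not the forum software's actual code.

```python
import re

# Hypothetical banned list, used only for this illustration.
BANNED = ["asses"]


def _mask(match):
    """Replace the matched text with the same number of asterisks."""
    return "*" * len(match.group())


def censor_substring(text, banned=BANNED):
    """Mask banned strings wherever they occur, even inside longer words."""
    for word in banned:
        text = re.sub(re.escape(word), _mask, text, flags=re.IGNORECASE)
    return text


def censor_whole_word(text, banned=BANNED):
    """Mask banned strings only when they appear as whole words."""
    for word in banned:
        # \b anchors the match at a word boundary on both sides.
        pattern = r"\b" + re.escape(word) + r"\b"
        text = re.sub(pattern, _mask, text, flags=re.IGNORECASE)
    return text
```

With these definitions, censor_substring("as it encompasses all of") mangles the innocent word into "encomp*****", while censor_whole_word leaves it alone and still masks the banned word when it appears by itself.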

Peter_Smith
29th Jun 2002, 04:34
I agree completely. I wonder if the system is capable of the flexibility we desire.

It might be much better if, like a spelling dictionary, it could be based on a list of banned words, perhaps with wildcards in some cases. There are problems with the rule you propose, however. What about bad*****?:)
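
A list-based filter with optional wildcards could look roughly like the Python sketch below. The patterns, the function name, and the definition of a "word" as a run of letters and apostrophes are all assumptions for the sake of the example, not a description of how this forum works.

```python
import re
from fnmatch import fnmatchcase

# Hypothetical banned patterns; "*" matches any run of characters.
# "darn" matches only that exact word, "heck*" matches any word starting with it.
BANNED_PATTERNS = ["darn", "heck*"]


def censor_with_wildcards(text, patterns=BANNED_PATTERNS):
    """Mask each whole word that matches one of the banned patterns."""

    def mask(match):
        word = match.group()
        # Compare in lowercase so the list does not need every capitalization.
        if any(fnmatchcase(word.lower(), p) for p in patterns):
            return "*" * len(word)
        return word

    # Treat runs of letters and apostrophes as words; everything else is a delimiter.
    return re.sub(r"[A-Za-z']+", mask, text)
```

Each entry decides for itself how greedy it is: an exact entry leaves longer words alone, while a wildcard entry also catches compounds, at the cost of possibly catching legitimate words that fit the same pattern.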

Any comments, Grey Mouser?

Vanguard
29th Jun 2002, 05:17
There are so many ways to get around the naughty parser that it is stupid that it is still being used. Some folks use "!" for "i", "@" for "a", "$" for "s", or insert the asterisk(s) or period(s) themselves to minimize how much the naughty parser will delete, as in "encompa*ses" or "encompa.ses". I usually just insert HTML tag pairs, as in "encompas<i></i>ses", so what you see is "encompasses". But if HTML were disabled then I'd have to resort to the other tricks to stop having my text screwed up by an ineffective and inaccurate censor function.

The moderators don't even have to read every post to check for profanity, anyway. Now there is a link at the bottom of the post that lets you identify and report an improper post. So users can easily tattle on other users right away.

Peter_Smith
29th Jun 2002, 06:04
It is true that there are ways around the censor with special characters. Those ways are not permitted by the rules, strictly speaking, although there has been a certain tolerance of the milder cases.

The real problem is the trouble you have to go through to spell out perfectly legitimate words, as you have shown above. My favorite is cockpit (aka roosterpit).:)

Edit: gee. Cockpit was allowed here, but not in the old forum. Two goofy parsers, but different.