9.2. mollom.checkContent

Note

The Mollom XML-RPC API interface has been deprecated, and is included here for archival purposes.

To develop clients and services that connect to Mollom, use the Mollom REST API.

mollom.checkContent
Required Name Type Description
required public_key string Site public key
required time string Site server time in this format: yyyy-MM-dd'T'HH:mm:ss.SSSZ
required hash string HMAC-SHA1 digest
required nonce string One time nonce
optional session_id string Current session ID
optional post_title string Title of submitted post
optional post_body string Body of submitted post
optional author_name string Submitting user's name or nick
optional author_url string Submitting user's URL
optional author_mail string Submitting user's email address
optional author_openid string Submitting user's openID
optional author_ip string Submitting user's current IP
optional author_id string Submitting user's unique ID (on the site)
optional checks string A comma-separated list of checks. Available checks include 'spam', 'quality', 'profanity', 'sentiment' and 'language'.
optional strictness string Adjusts the content classifier results, i.e., the probability of a spam result. Possible values: 'strict', 'normal', 'relaxed'. Defaults to 'normal'.
optional classifier string Use a custom classifier chain. The value is a comma-separated list of classifiers.
returns spam integer Returns 1 if ham, 2 if spam, 3 if unsure
returns quality double An assessment of the content's quality, between 0 and 1; 0 being very low, 1 being high quality
returns profanity double An assessment of the content's profanity level, between 0 and 1; 0 being non-profane, 1 being very profane
returns sentiment double An assessment of the content's sentiment, between 0 and 1; 0 being a very negative sentiment, 1 being a very positive sentiment
returns language list of structs A list of structs containing pairs of language and confidence values.
returns session_id string Session ID

The mollom.checkContent call is probably the most frequently used Mollom call. It can be used to check whether a comment is spam, to detect the language of the comment, and to get an assessment of its quality. Several checks can be run in a single call by providing multiple values in the checks parameter. If no value is set for the checks parameter, only the spam check is executed.
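As a rough illustration of combining several checks in one call, the following Python sketch assembles the request struct from the parameters in the table above. The helper names build_request and check_content are hypothetical, the authentication values (public_key, time, hash, nonce) are assumed to have been computed elsewhere, and passing all parameters as a single struct is an assumption about the call shape.

```python
# Check names listed in the parameter table above.
KNOWN_CHECKS = ("spam", "quality", "profanity", "sentiment", "language")


def build_request(auth, fields, checks=None):
    """Merge authentication and content fields into one request struct.

    `auth` holds public_key, time, hash and nonce (computed elsewhere).
    `fields` holds optional parameters such as post_title or post_body.
    """
    params = dict(auth)
    # Optional fields may be left out altogether, so drop empty ones.
    params.update({name: value for name, value in fields.items() if value})
    if checks:
        unknown = [c for c in checks if c not in KNOWN_CHECKS]
        if unknown:
            raise ValueError(f"unknown checks: {unknown}")
        # The checks parameter is a comma-separated list.
        params["checks"] = ",".join(checks)
    # Without a "checks" entry, only the spam check is executed.
    return params


def check_content(proxy, auth, fields, checks=None):
    """Call mollom.checkContent on an XML-RPC proxy object (hypothetical helper)."""
    return proxy.mollom.checkContent(build_request(auth, fields, checks))
```

Here proxy could be an xmlrpc.client.ServerProxy created with the address of a Mollom server; that address is not given in this section and would have to be supplied by the client.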

Spam check

With the spam-check, the mollom.checkContent call will return 'ham', 'spam' or 'unsure' (encoded as 1, 2 and 3, respectively) together with a session ID. If Mollom returns 'ham' or 'spam', the content can be safely accepted or rejected, as the case may be. But if Mollom returns 'unsure', an additional check is needed to decide if the content can be accepted or not. Mollom provides CAPTCHA challenges for this check, but other mechanisms could be used. Mollom is designed so that only a small fraction of human-submitted content will be flagged as unsure.
Note that if Mollom returns 'spam', no CAPTCHA should be shown to the user. Mollom will only return 'spam' if it is 100% sure that the content is spam. It is essential that these attempts are blocked without presenting any CAPTCHA. This allows Mollom to block both spambots trying to hack the CAPTCHAs and human users sending spam.
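The decision logic described above can be sketched in Python as follows. The function name decide and the action labels 'accept', 'reject' and 'challenge' are illustrative choices, not part of the Mollom API; only the numeric codes come from the parameter table.

```python
# Result codes of the spam check, as listed in the parameter table.
HAM, SPAM, UNSURE = 1, 2, 3


def decide(result):
    """Map a checkContent result struct to (action, session_id)."""
    verdict = result["spam"]
    session_id = result.get("session_id")
    if verdict == HAM:
        return "accept", session_id
    if verdict == SPAM:
        # Block outright; no CAPTCHA must be shown for spam.
        return "reject", session_id
    if verdict == UNSURE:
        # An additional check, such as a CAPTCHA, is needed.
        return "challenge", session_id
    raise ValueError(f"unexpected spam value: {verdict!r}")
```

Returning the session ID alongside the action keeps it available for whatever follow-up step the client performs, such as presenting a challenge.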

The behaviour of the spam-check can be influenced by supplying a value for the reputation or the classifier parameters. The classifier parameter, however, does not have any effect at the moment.
The reputation parameter accepts the following values:

  • captcha-blocking-normal
  • captcha-blocking-relax
  • captcha-blocking-strict
  • nocaptcha-blocking-normal
  • nocaptcha-blocking-relax
  • nocaptcha-blocking-strict
  • captcha-nonblocking-normal
  • captcha-nonblocking-relax
  • captcha-nonblocking-strict
  • captcha-blocking-repeated

Quality and profanity checks

The quality and profanity scores returned by mollom.checkContent are real values between 0 and 1. For quality, 0 denotes very low quality and 1 very high quality; for profanity, 0 denotes non-profane and 1 highly profane. Mollom only returns a score; clients must define for themselves the quality or profanity cutoff between accepting and rejecting content. The scores can also be used to present the content sorted in a way that makes moderation easier.
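Since the cutoffs are left to the client, one possible approach is sketched below in Python. The function names and the default thresholds (0.4 for quality, 0.6 for profanity) are arbitrary assumptions chosen for illustration, not values recommended by Mollom.

```python
def classify_scores(result, min_quality=0.4, max_profanity=0.6):
    """Return 'accept' or 'reject' based on quality and profanity scores.

    Both cutoffs are site-specific choices. A score that is missing from
    the result (because the check was not requested) is ignored.
    """
    quality = result.get("quality")
    profanity = result.get("profanity")
    if quality is not None and quality < min_quality:
        return "reject"
    if profanity is not None and profanity > max_profanity:
        return "reject"
    return "accept"


def moderation_order(results):
    """Sort results so that the lowest-quality content is reviewed first."""
    return sorted(results, key=lambda r: r.get("quality", 1.0))
```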

Language check

The language-check replaces the older mollom.detectLanguage call, which is now deprecated. Given a very limited amount of text (minimum of 15 characters), Mollom can detect its probable language (out of approximately 75 languages) with a high degree of accuracy. This feature can be used to prevent the use of foreign languages on your site, or to automatically segment the content of users based on their posting language.

Each value in the returned result is a struct (see example) that contains two named values: language and confidence. "language" is a string holding a two-character ISO-639-1 code or, if no ISO-639-1 code is available, a three-letter ISO-639-3 code, while "confidence" is a double representing Mollom's confidence in the accuracy of its assessment. Multiple pairs of language and confidence elements may be returned; if so, the elements are arranged in descending order of confidence.

If the language cannot be determined, the code "zxx" is returned as the value of the language element; it is defined as "no linguistic content, not applicable". If the text is determined to be too random to be a known language, the code "und" is returned as the value of the language element; it is defined as "undetermined".

Results of the language check resemble the following snippet.

<value>
 <array>
  <data>
   <value>
    <struct>
     <member><name>language</name><value><string>nl</string></value></member>
     <member><name>confidence</name><value><double>0.558</double></value></member>
    </struct>
   </value>
  </data>
 </array>
</value>
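To show how a client might consume such a result, the following Python sketch decodes the sample with the standard xmlrpc.client module and picks the most likely language. Wrapping the sample in a methodResponse envelope is done here only so that xmlrpc.client.loads can decode it; the function best_language and its min_confidence parameter are hypothetical.

```python
import xmlrpc.client

# The sample value above, wrapped in a response envelope for decoding.
SAMPLE = """<methodResponse><params><param>
<value>
 <array>
  <data>
   <value>
    <struct>
     <member><name>language</name><value><string>nl</string></value></member>
     <member><name>confidence</name><value><double>0.558</double></value></member>
    </struct>
   </value>
  </data>
 </array>
</value>
</param></params></methodResponse>"""

# Special codes meaning that no usable language was detected.
NO_LANGUAGE_CODES = {"zxx", "und"}


def best_language(languages, min_confidence=0.0):
    """Return (code, confidence) for the top entry, or None if unusable."""
    if not languages:
        return None
    # Pick the highest confidence explicitly rather than relying on order.
    top = max(languages, key=lambda entry: entry["confidence"])
    if top["language"] in NO_LANGUAGE_CODES or top["confidence"] < min_confidence:
        return None
    return top["language"], top["confidence"]


# loads() returns (params, method_name); the single param is the array.
(languages,), _method = xmlrpc.client.loads(SAMPLE)
print(best_language(languages))  # ('nl', 0.558)
```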

Remarks

  1. Apart from the authentication fields, which are compulsory, all other fields are optional. This means that they can either be left out altogether or be empty strings. However, the more information Mollom receives, the more accurate its classification will be.
  2. If multiple OpenIDs are given for a user, they can all be passed into the OpenID field by separating them with whitespace (spaces, tabs or new lines).
  3. If a site has content types that do not map well onto the specified fields (for example, a 'survey' content type), content type fields or data can be concatenated and passed into the post body field.
  4. A unique user ID (user name or numeric ID) can be passed to Mollom. If no user ID is known (for an anonymous user, for example), no value should be passed.
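The remarks above can be sketched as a small Python helper that prepares the optional fields before a call. The function prepare_fields and its argument names are hypothetical; only the resulting keys (post_title, post_body, author_openid, author_id) come from the parameter table.

```python
def prepare_fields(title="", body="", openids=(), user_id=None, extra=None):
    """Build the optional content fields for mollom.checkContent.

    `extra` maps names of site-specific fields (for example, answers in a
    'survey' content type) to values that have no matching parameter.
    """
    fields = {}
    if title:
        fields["post_title"] = title
    # Remark 3: concatenate unmapped content into the post body.
    parts = [body] + [str(value) for value in (extra or {}).values()]
    post_body = "\n".join(part for part in parts if part)
    if post_body:
        fields["post_body"] = post_body
    # Remark 2: multiple OpenIDs are separated by whitespace.
    if openids:
        fields["author_openid"] = " ".join(openids)
    # Remark 4: pass no value at all for anonymous users.
    if user_id is not None:
        fields["author_id"] = str(user_id)
    return fields
```

Empty values are simply omitted here, which remark 1 permits; sending empty strings would be equally valid.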