Friday, November 30, 2007

Interesting new blogspam method ....

When you've been writing a blog for a while, you'll discover what appear to be automated blogs that shamelessly steal your posts and repost them verbatim as new posts. I can only imagine that they do this to draw search traffic (and therefore advertising revenue) from Google, although why someone would take *my* posts for this purpose escapes me.

A new form of blog plagiarism combines the best of spam techniques with post copying, so as to evade the obvious duplicate detection methods. Consider the following examples:

The original post:

There's been a lot of grief over the reduction in price for iPhones only a few months after they were released. Wired magazine interviewed people who don't regret paying the higher price for the virtue of being an early adopter/arbiter-of-cool, and this comment caught my eye

The modified post (and no, I'm not going to link to it):

There's been a number of visibility at the decline master p scholomance for online in very couple weeks and we were released. Wired magazine interviewed people who don't regret paying a low frequency of the world as both an early riser and perhaps comment and the eye

The original post:

Suppose you're given a metric space (X, d) and a parameter k, and your goal is to find k "clusters" such that the sum of cluster costs is minimized. Here, the cost of a cluster is the sum (over all points in the cluster) of their distance to the cluster "center" (a designated point).

The modified post:

Lil romeo you're given a multidimensional space (X, d) and a parameter k, and a buff being use find k "clusters" such as lump- end of the effect is minimized. Here, lil romeo use 70million a target wikipedia, the sum of these items out the amount of the need codify the cluster "center" (a designated point).

And so on. What's more mysterious is that this isn't straight plagiarism: the post credits my post as the "source", although given the garbage that's generated, I'm not sure I want that "source" credit :).

Very mystifying...
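
To make the "evade duplicate detection" point concrete, here is a rough sketch of one common kind of duplicate check: comparing sets of overlapping word triples ("shingles") with Jaccard similarity. This is only an assumption about what an "obvious" detector might look like; I have no idea what any search engine actually runs. The function names `shingles` and `jaccard` and the choice of k = 3 are mine, picked for illustration.

```python
def shingles(text, k=3):
    """Return the set of overlapping k-word sequences in the text."""
    words = text.lower().split()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: |a & b| / |a | b|."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# First sentence of the first example above, original and mangled.
original = ("There's been a lot of grief over the reduction in price for "
            "iPhones only a few months after they were released.")
modified = ("There's been a number of visibility at the decline master p "
            "scholomance for online in very couple weeks and we were released.")

# A verbatim copy scores exactly 1.0.
print(jaccard(shingles(original), shingles(original)))

# The mangled copy shares only one word triple with the original,
# so its score lands close to 0.
print(round(jaccard(shingles(original), shingles(modified)), 3))
```

A verbatim repost scores 1.0 and is trivially flagged, while swapping a word every few positions breaks almost every triple, which would be enough to slip under any reasonable similarity threshold for a check of this kind.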


  1. How on earth do you find out when your post has been plagiarized?

  2. I've been seeing this too. Mostly because this comes up when I run a search for my blog name...irritating, but I don't see much to be done about it. It's not like they're easily mistaken for coherent thought.

  3. Maybe I'm being naive here, but why would somebody do that? I fail to see any commercial value in it.

  4. I have various automated blog search feeds that tell me when I am referenced: one from Technorati, one from Google Blog Search, etc.

  5. Yes! I steal stuff just like you note, but I have no reason to grab crap from your site.

  6. You're right, Technorati is a good way to know how you're referenced, but I didn't know that Google Blog Search did that. Regarding the motivation: the more you publish (your own text or someone else's) -> the more you get indexed on search engines -> the more visitors you have -> the more money you make, through AdSense for example.

  7. I think they are trying to make "Lil romeo" into an acceptable English phrase by embedding it in well-structured sentences. Simply having a webpage with many unrelated keywords does not work.

