Why spin an article?
You publish articles to get traffic. Most of this traffic comes from search engines, like Google, indexing your article and showing it to surfers searching for specific keywords and phrases. In the not-too-distant past, search engines began to 'de-dupe' their indexes - they concentrate on one example of any particular article, instead of serving a whole page of links to the same article on different sites (unless they have nothing else to show you, of course). This makes for a better user experience, which is actually the 'product' a search engine sells.
So nowadays, people spin an article in order to convince search engines to index each version individually. With a well-spun article, you can 'dominate' a page of SERPS (fill up a page of search engine results, or at least get several entries on the same page), which means more traffic for you. And more traffic generally means more income for you. Submitting the exact same article to lots of directories is completely pointless nowadays, because only one of them will feature prominently in the search engine listings (usually the ezinearticles one, or whichever directory is currently favored by a particular search engine) for any particular keyword search - unless the search engine doesn't have anything better to show you, of course!
There's nothing unethical about this - it's called getting maximum bang per buck from your hard work. A related question would be 'does spinning work?' That question was argued and decided a while ago, and the answer should be self-evident. After all, what would YOU prefer? One identical article published on 500 different sites? Or 500 unique articles published on 500 different sites? If you believe the former, feel free to waste your time spamming your article in identical form all over the web (or 'syndicating' it, as the die-hard naysayers call it). You'll find out quickly that you might just as well have submitted it once to ezinearticles or goarticles. At contentBoss, we created a website in the top 10,000 US sites (source: Alexa) in less than a year. Our experience with traditional article marketing prior to this means that we KNOW it wouldn't have been possible without unique versions of the low number of articles we used to achieve it. In fact, we estimated at the time that we'd need to spend about $3,000 per article getting them hand-rewritten in order to achieve the same effect, based on previous sites we'd managed to leverage into top slots.
Is there any 'discounting of backlinks' for duplicate content?
The value of a backlink is determined primarily by the 'weight' of the page it sits on. This is 'Page Rank', as it is still popularly known. If a page is full of duplicate content, it's unlikely to have a high page rank (usually the entire site it's on is unlikely to have much link juice!), and that of course means that any links from that site or page are effectively devalued by the current algo anyway.
While Google has tried to play down the concept of 'Page Rank' recently, it makes no difference: whatever you call it, you need to 'rank' pages in some way to be able to order them in your SERPS, and it's this ranking that determines the value of a backlink. What there IS evidence for is that pages rammed full of duplicate content without any authority status (i.e. where the page isn't considered by Google to be the 'definitive' version of the content) tend to get dumped down to results page 50 without a second glance. There's no mystery about that - the content already exists on other, more important pages, so it adds no value from Google's point of view. This can happen to entire sites, too - a collection of duplicate content, even if it's on a specific niche, is effectively valueless to Google. Sites full of duplicate content have poor PR because they don't tend to win many natural backlinks, and worse than that, the new enhancements to Google's site analysis process automatically downgrade them. You can try this for yourself - go create a duplicate content page and get it indexed, which isn't hard nowadays. Then see how long you can keep it above page 50. This automated devaluation doesn't affect the 'guts' of Google's algorithm - after all, it's a zero-sum game (or a 'one-sum' game, at any rate!).
This is actually the source of the 'duplicate content penalty' issue that people debate so furiously on forums. While it may not actually be a 'penalty' as such (i.e. it is not imposed in order to 'punish'), the end result is the same, and anyone who doesn't think they've been 'penalised' by being dumped into the nether regions by Google isn't actually going to do very well in this business. There's no difference between a duplicate niche site and an actual page of Google's SERPS in terms of what it provides a surfer - access to content on a specific topic. Functionally, they are equivalent, and it wouldn't make any sense for Google to value someone else's collection of links to a set of data above their own set, simply because the competing set was storing a copy of the text locally. Bottom line - if your article is ranked highly on Google for a search, it's passing good link juice. If it's down in the dumps, it ain't.
Does contentboss address sentence spinning?
Yes - if you are generating jet spinner syntax with the 'Auto Jet Spin' page, you can paste in your original text complete with jetspinner syntax indicating the variations in the sentences or paragraphs. ContentBoss will then do the hard work of generating the spin syntax within those brackets, leaving you with a fully-fledged multi-level example of jetspinner syntax. There's no practical limit to how many levels you can add by hand before you click to generate the base level of synonyms, it's down to how much time you want to spend on it, and how good you are at counting brackets. Adding brackets by hand requires care - malformed jetspinner syntax tends to produce less than optimum results.
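Since malformed brackets are the main hazard when hand-editing spin syntax, a tiny validator can save you from counting them yourself. This is a minimal sketch (the function name is our own, and it only checks bracket balance, not whether each group actually contains sensible alternatives):

```python
def check_spin_syntax(text):
    """Check that jetspinner-style {option|option} brackets are balanced.

    Returns (ok, message). Nested groups are allowed; '|' characters
    outside brackets are ignored.
    """
    depth = 0
    for i, ch in enumerate(text):
        if ch == '{':
            depth += 1
        elif ch == '}':
            depth -= 1
            if depth < 0:
                return False, f"unmatched '}}' at position {i}"
    if depth != 0:
        return False, f"{depth} unclosed '{{' bracket(s)"
    return True, "ok"
```

Running it over a draft before generating versions catches the stray bracket that would otherwise leak spin syntax into your published article.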
What percentage uniqueness do you really need?
There's no easy answer to this one, because not only do the various search engines have different cut-off points for duplication detection at different times (and no one except their employees knows exactly what those cut-offs are!), but there are also different ways of calculating duplicate percentages. Ultimately, all that matters is how the search engines see your content, NOT how tools like Copyscape and Dupecop see it (these are basically plagiarism-detection tools, and do nothing more than make it easier to find examples of where a particular piece of content is being used or abused).
Right now, from a search engine perspective, anything over 30% unique is probably good to go. This is simply because of the way the search engines detect duplication. All of them primarily rely on a process called 'shingling', which means looking for short combinations of words. 'The cat sat' is a three-word shingle. Because you can compare two articles on different subjects (e.g. comparing an article on acne with a totally different article on credit cards) and STILL get a high shingling match (double figures is commonplace - that's just how English works!), it would be dangerous for the engines to require much more than 30% or so, as long as the main technique they use is shingling. Recently, some engines also instituted 'shingle ordering'. This means that not only is the number of matching shingles important, but also the order they occur in. The latest jetspinner enhancement (sentence shuffling) was designed specifically to get round this. To make matters even more interesting, most engines are now implementing a 'pre-test' which looks at a page in situ, without reference to other pages. Computationally, this is far cheaper than trying to match a page against any one of billions of other pages, which is why the engines are so keen on it. Basically, they have realised that they only need to look for malformed grammatical syntax in order to make an initial assessment of a page. Linguistically suspect sentences raise a flag. Raise enough flags relative to the length of the article, and they can discount the whole page immediately without further tests, because it's either VERY badly written, or it's been spun with a cheap spinner.
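The shingling idea above is simple enough to sketch in a few lines. This is an illustrative toy, not the engines' actual code (they use far more efficient hashed comparisons at scale), but it shows how a percentage overlap between two texts falls out of n-word shingles:

```python
def shingles(text, n=3):
    """Return the set of n-word shingles in a text (case-insensitive)."""
    words = text.lower().split()
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}

def shingle_overlap(a, b, n=3):
    """Percentage of a's shingles that also appear in b."""
    sa, sb = shingles(a, n), shingles(b, n)
    if not sa:
        return 0.0
    return 100.0 * len(sa & sb) / len(sa)
```

Changing a single word near the end of a sentence only breaks the handful of shingles that span it, which is why even modest rewording drops the match score quickly.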
The 'collaborative' spinners that rely on user input to generate their substitutions are sadly most prone to failing this test, because most of their users either don't have English as a first language, don't care about grammar, or don't have the necessary mental abilities to check it properly. 'The cat sat on the mat.' is a linguistically consistent sentence. 'The cats sit on the mat.' is also consistent. 'The cat sit on the mat.' isn't. It's an easy catch for a search engine, and saves them a lot of work tracking down 'badass blackhat wannabes' who think they can game a billion-dollar software corporation with a 70-buck piece of amateur software. It's also why 30% unique and linguistically correct is better than 90% unique gibberish. The first one can be indexed and has the chance to earn you some traffic; the second will get you banned.
I was told you need to spin HTML code, add formatting etc to get past Google?
This is nonsense. Whoever told you that is just trying to sell you something. If it was that easy to 'fool' a search engine, everyone would be doing it. Oh, hang on - everyone DID do it, back in about 2001, when it actually DID work. Unfortunately, it's about as outdated a concept as 'doorway' pages nowadays. It's VERY easy to remove HTML from a page, and stripping the HTML first makes it many orders of magnitude cheaper to test the text than would be the case if the HTML had to be included too. People who claim you need to stuff your article with HTML presumably also think you can 'hide' keywords on a page by changing the font color to 'fool' the engines. Nonsense.
Can Google detect multiple spun versions of an article if they have the same word & sentence count?
In theory, yes, as long as the article is over a certain length. As the word count increases, the likelihood of both sentence and word count being identical decreases rapidly. It's not actually used by the engines, of course, because there are simpler ways to test for spam articles, and besides, it's too easy to add a word here, delete a sentence there and so change the metrics.
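To see how crude such a fingerprint would be, here is a toy sketch (our own illustration; the sentence split is deliberately naive, treating any run of '.', '!' or '?' as a boundary):

```python
import re

def fingerprint(article):
    """Crude fingerprint of an article: (word count, sentence count).

    Sentences are split on runs of '.', '!' or '?'; abbreviations and
    decimals will confuse it, which is part of the point - the metric
    is trivially easy to perturb.
    """
    words = article.split()
    sentences = [s for s in re.split(r"[.!?]+", article) if s.strip()]
    return len(words), len(sentences)
```

Adding one word anywhere, or merging two sentences, changes the fingerprint, which is exactly why it makes a poor duplicate-detection signal compared to shingling.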
Do Spinners leave 'footprints'?
It depends on the spinner. Unmetered spinners (i.e. unrestricted-use spinners, typically desktop apps) are easy to detect, because being unmetered, it's possible to suck out the database behind them simply by running numerous articles through them and storing the jetspinner syntax returned. We do this ourselves, and have apps that can tell you with a high degree of probability not only whether an article has been spun, but which spinner probably did it. Spinners that cap usage don't leave footprints like this, because any suspiciously high level of throughput raises a flag (not to mention the cost). If WE can do it, you can bet your bottom dollar that Google can too.
How good is Google when it comes to grammar?
Very. They can afford the second best programmers and linguists on the planet. They additionally have the advantage of having an entire copy of the Internet in their database, and thousands of man-years of analysis routines to examine it, which means that they now have a VERY good idea what the probability of any particular word combo (or 'shingle') is. Unusual combos are therefore easy to detect statistically, and are used ruthlessly to eliminate 'spam' material.
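The statistical idea is straightforward: with a large enough corpus, you can estimate how often any shingle occurs, and flag the ones that essentially never appear in natural English. A minimal sketch, where the frequency table, function name and threshold are all hypothetical stand-ins for what a search engine would derive from its own index:

```python
def suspicious_shingles(text, shingle_freq, n=3, threshold=1e-9):
    """Flag n-word shingles whose estimated corpus frequency falls
    below a threshold.

    shingle_freq is a hypothetical lookup table mapping shingles to
    relative frequencies; shingles never seen in the corpus count as
    frequency 0 and are flagged.
    """
    words = text.lower().split()
    grams = [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]
    return [g for g in grams if shingle_freq.get(g, 0.0) < threshold]
```

A badly spun sentence like 'the cat sit on the mat' produces several shingles with near-zero corpus frequency, so it lights up immediately against such a table.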
I was told that most article directories really don't care about what is submitted and therefore % uniqueness overrides whether the text is legible or not.
Unedited directories (i.e. most of them) will allow almost anything to be published. That's why it's pretty much a waste of time publishing to them. The edited directories (the ones that can actually get you traffic!), as exemplified by ezinearticles.com, are tougher to get into, so you can see that legibility IS important if you want to generate traffic, and therefore earn money, of course. Interestingly, scripts to quantify legibility are becoming cheaper and more readily available, so at some point soon, even the bottom-end directories will probably implement automated checking features, which will prevent illegible spam from being published easily. We're thinking of releasing one ourselves.
Why don't you use a user-generated thesaurus?
ContentBoss uses a different method to generate new versions of sentences. This is deliberate. After all, if I said to you, "hey, how about letting a bunch of newbies and amateurs who don't actually speak very good English, and have no interest in anything apart from a quick buck, rewrite your articles for you? It'll only cost $80"… what would you say? Technically, the reasons why user-generated 'statistical' spinning systems are no good for automatic spinning are pretty obvious. Language isn't a statistical construct, it's binary. In other words, a sentence is either 'right' or 'wrong', it's not '75% right'. The sentence 'The cat sat on the mat.' is right. The sentence 'The cat sit on the mat.' is wrong. Attempting to choose synonyms on the basis of statistics (which is what all collaborative spinners do) is doomed to failure for that simple reason. Too many wrong sentences, of course, trigger the search engine alarms, and bang! Your article is toast.
The people who create these spinners are usually just grunt programmers, so they can't be blamed for missing the bigger picture. What they can be blamed for is the shameless exploitation of the poor suckers who pay for their systems. Most of the collaborative spinners rely on the fact that their users don't generally have English as a first language, let alone any written linguistic competency, so the users fail to spot the errors. Also, they tend to present their text as a wild mass of curly brackets and pipes, making it virtually impossible for even a native English speaker to determine whether or not a good job has been done without spinning it several hundred times and laboriously checking it.
Why isn't contentboss just a yearly subscription under $100 like other spinning programs?
You get what you pay for. You pay peanuts, you get content that looks like it was written by the proverbial bunch of monkeys hammering on typewriters. It's expensive to create, maintain and enhance an automated spinner that produces legible text - we should know, we have the only one on the planet right now. It replicates a very time intensive process (manually rewriting an article), and time, as you should be aware by now, is money. Cheap spinners have obviously accepted that their output is not really of any value. If they believed otherwise, they'd charge for it. Obviously.
I want a quick and dirty spinning solution
ContentBoss is a one click spinner, and you don't need to spend 20 minutes per article checking and editing the booboos that cheap spinners give you. How much quicker can it be than 1 click? As for 'dirty'… just use any other spinner. The output is uniformly bad.
My nice cheap spinner shows 90% unique. That's got to be better than 30% unique, right?
Wrong. 30% unique and legible is better than 100% unique and illegible. In fact, 10% unique and legible is better than 100% unique and illegible. If you have to ask why, you need to read up on how search engines work, and what they are trying to offer their users. Cheap spinners rely on the fact that most people haven't thought this through. They also rely on the fact that when they offer you the squiggly line jetspinner output, it's almost impossible (for someone without excellent English and immense powers of concentration) to tell whether it's valid syntax or not. In fact usually the only way is to start generating versions from it, and reading them to see if it actually IS legible. A time-consuming and laborious process. For a few cents, you can, of course, generate it with ContentBoss and save yourself that time and effort. If your time is only worth a few cents an hour, sure, keep doing it manually. Be our guest. On the other hand, if your time is valuable to you...