Well here is some good news. The web harvesting site Tygpress has been temporarily closed down.
I just went to tygpress.com to see if Carol Anne’s (Therapy Bits) posts were still showing up. In her post she asked, “If you can check on tygpress for therapy bits and let me know, I’d appreciate it!”
So I did. I typed “tygpress.com” in my browser and this is what came up: Wow! Tygpress.com is “temporarily out of service.” And the out-of-service message even included an apology: “We are extremely sorry to the content owners.” Well I’ll be damned.
We did it. We shut down Tygpress!
Well, at least temporarily! How great is that?
Congratulations to all of you who made your voices heard, who posted complaints to DigitalOcean and to WordPress, who completed the forms, and who used the “not authorized” badge. All of your efforts seem to have worked. Woo hoo!
Now we just need to mobilize like this and maybe Trump will be a one-term President!
I was following this thanks to your reblogs. While it is bad if someone takes our full content, I was not too concerned as far as SEO goes. I analyzed the site with different tools and I also checked my posts, and as expected these spammy sites just can’t outrank your own content in Google. His duplicates of my posts didn’t appear in Google at all. The Google algorithm knows where the content was published first. Apart from that, it’s difficult to rank a scraping site because of the lack of original content. So, the whole idea of this admin was doomed to fail from the beginning.
What remained was content infringement, and that’s bad, yes. Now, all of a sudden, it’s a technical issue… if we believe the admin, the site was meant to be a blog search that posts only excerpts. Aha! That would be fair, and I guess nobody would complain since the full content could be found on our sites. I am not against blog search engines that only post excerpts of my posts. The only problem is that this story is hard to believe. It’s more likely that the admin realized that he is an idiot because he noticed people can post content like the “not authorized” image and it would still appear on his site. When I noticed this, the first idea I got was to publish a very evil fake story about the site admin, and it would still appear on his site. You could have come up with anything, even stuff that could get him arrested in India, and his site would still have picked it up. The thought was funny.
I think he just bailed out with a big “that was not intended” logo on his chute. Anyway, it’s good news… if he now grabs all posts but only the excerpts, it’s fine. Still, this site won’t get anywhere… it will probably disappear within the next 1-2 years, once he has learned a little bit more about SEO and realized how and why Google is only punishing his site.
That’s interesting Dennis. When I saw the “apology” I did wonder if it was a genuine error or a face saving thing. I protested mainly on principle but sadly some bloggers were upset enough about it to shut their own blogs down.
I can’t believe that it was a genuine error. What I found out as well was that he tried to trick search engines by dating all reblogs 3 hours back, attempting to tell search engines that the content was first published on his domain (3 hours earlier than on our sites). That shows he really wanted to rip off people. I question that this was a bug too. However, that was a stupid idea too because that’s not how it works. Search engines get pinged once we publish something. He can date back the reblogs to the 16th century and Google will still know that he just scraped the content from other sites.
Related to the back-pedaling, I guess he didn’t expect that much pressure. People looked him up via WHOIS request, they made reports to the providers and I guess some went through and he got warnings from the providers, I don’t know.
What is funny too is that he thinks by only posting excerpts he will be fine. Writers might be fine with it, and as I previously said I thought I’d be too, but I just realized he’ll still be duplicating all of our images in the future… and this will upset photographers. I am pretty sure there will be ongoing pressure even with just excerpts.
It’s sad that some people shut their blogs down. But that’s not a good idea.
I don’t suppose these will be the last people to try it. I guess he wasn’t as clever as he thought he was. Can you explain how they would gain financially from what they did? I don’t totally understand that. I didn’t see any advertising on the site.
I didn’t see any ads either, but I thought that would be due to my uBlock Origin and NoScript browser extensions. I didn’t disable them.
Maybe he would enable ads at a later point, but that would have been a nice attack point for the WordPress community as well, since most ad networks have Terms of Service too, and there would most likely be a paragraph against spam or similar things. So, with a bit of effort, we could get him banned from each of the networks he’s trying to use. The WordPress community is strong, as we’ve often seen.
Another motive could have been to use the site to pass on so-called “link juice” to his other projects. But that’s stupid with a spammy site like that because, as I said, Google’s algorithm won’t give him much love, and thus there won’t be much link juice to pass on to rank up other sites. I didn’t find any outgoing links to his sites either, so I don’t know what the heck he was doing lol. I gathered many details about him, including real-life information, and I know he has at least 6 more websites. But all of them are scraping sites. One of them is scraping content from major news sites instead of the blogosphere. And guess what… it’s copying full text as well and it’s still up and running 😉 So much for the “sorry”.
So, it’s either that he’s building a link network for a later point, or he’s monetizing at a later stage… or he’s building the sites to sell them to clueless people at some point, because there are markets where you can sell websites or domains. Like all those other SEO scammers from India, he’d probably tout that these sites “could generate great passive income”. Maybe he was also hoping to find buyers for “sponsored posts”. He could add this feature to convince people to spend money, and in return, he would feature the sponsored posts at the top of the site. But he’d really need to find clueless people in all those cases because you can analyze the stats and traffic of any website, and most people would do that before buying one. That’s why I thought, no matter what he’s attempting, the whole idea is doomed because of the spammy nature of the site. So far it didn’t look very well thought out. He didn’t even spend much effort to stay anonymous, if you consider that he steals content. He left a long trail of real-life information.
But the biggest irony is that the idea is not bad and could be implemented in a genuine way. WordPress blog search is horrible. If he used not just text excerpts but also image thumbnails instead of images in original size, writers and photographers wouldn’t see him as a content thief, but as a person who really created an alternative WordPress blog search service. I think it would even fall under the umbrella of “fair use”. The way he did it, and the way he will continue to do it, will just create enemies because that is content theft.
Thanks Dennis, you certainly found out a lot about him. Many of us wouldn’t have your level of IT knowledge. Sadly, apart from just not writing anything, there is not really a way to avoid this sort of thing happening, is there?
There are many ways to prevent content scraping or make it more difficult, but we’re limited here on wordpress.com as we don’t have FTP access to the root folder of our sites. With a self-hosted site there would be all kinds of methods. In that case, we could check access logs to find the IP of the scraper and block it. You could also deliver “dummy” content like oversized Lorem Ipsum text, which would basically mess their sites up. You could create infinite loops, redirecting the bot to their own URL, making them scrape their own sites. You could use captchas if one IP performs actions too fast. You could add invisible honeypot data that human visitors can’t see, to identify scraper IPs and block them automatically. There are also services that can protect you against scraping, and even plugins. This is where self-hosting is very powerful.
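To make the access-log and honeypot ideas a bit more concrete, here is a rough Python sketch. It is only an illustration under several assumptions: the log is in Common Log Format, the honeypot URL `/hidden-trap/` is a made-up path linked invisibly from the pages, and the limit of 30 requests per 10 seconds is an arbitrary number, not a recommendation.

```python
import re
from collections import defaultdict, deque
from datetime import datetime, timedelta

# Assumed log format (Common Log Format), e.g.:
# 203.0.113.5 - - [10/Oct/2019:13:55:36 +0000] "GET /my-post HTTP/1.1" 200 512
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[([^\]]+)\] "\S+ (\S+)[^"]*" \d{3} \S+')

HONEYPOT_PATH = "/hidden-trap/"  # hypothetical URL that no human visitor ever sees


def find_suspects(lines, max_requests=30, window=timedelta(seconds=10)):
    """Return {ip: reason} for IPs that hit the honeypot or request too fast."""
    recent = defaultdict(deque)  # ip -> timestamps inside the sliding window
    suspects = {}
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # skip lines that do not look like the assumed format
        ip, stamp, path = match.groups()
        when = datetime.strptime(stamp, "%d/%b/%Y:%H:%M:%S %z")

        # Only a bot following invisible links would request the honeypot.
        if path.startswith(HONEYPOT_PATH):
            suspects[ip] = "honeypot"
            continue

        # Sliding window: forget requests older than `window`.
        hits = recent[ip]
        hits.append(when)
        while when - hits[0] > window:
            hits.popleft()
        if len(hits) > max_requests:
            suspects.setdefault(ip, "rate")
    return suspects
```

You would feed it the lines of the server’s access log and then block the returned IPs in the web server or firewall configuration; that blocking step depends on the hosting setup and is not shown here.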
Ironically, devs at wordpress.com could do the same and protect our sites. But you know, they’re more inclined to make WordPress more accessible for the generation of Instagram kids, which seems to have been their only priority for years. This is what I dislike because there are tons of more important things to solve here.
So, to really answer your question: no, unless we go self-hosting, we can’t do anything against scraper sites other than unleashing our community power, as happened recently. At least technically, we’re limited on wordpress.com, but we can still gather data, file reports, and think up other creative ways to fight back. But then again, the Google algorithm is very sophisticated today, and we don’t have to worry that a scraper site will steal our positions in search. It’s unlikely, especially for established or long-lived sites like ours.
Personally, these scraper sites won’t make me stop blogging.
Good to know, Dennis. I’d miss your posts and your knowledge if you decided to stop.
I’d miss many of the bloggers I am following too if they decided to stop, including you. That’s why I found it sad when I read that some people stopped. I wasn’t affected because I didn’t know the particular bloggers. But I am sure others who knew them were. We create connections here in this community, and even if we are far away, it sometimes feels like we know each other.
It’s probably more likely that I would go self-hosting than that I would stop blogging. But even in that case, I would install the WordPress Jetpack plugin to be able to remain part of this community. It’s possible with the plugin by Automattic. But I don’t want extra work or costs now. Anyway, with each year things get worse here, and they have, for me, irrelevant priorities. So, maybe in a few years I will have had enough of it and host my site on my own.
I’m not going anywhere. WordPress can be annoying but there is a community here that I would miss. If I wanted to change maybe I would try the self hosting idea.