Identi.ca

Conversation

Notices

  1. Evan Prodromou Evan Prodromou

    Somebody went to a huge amount of trouble to set up thousands of accounts. And there's no URLs or spam or whatever. So strange.

    about 5 months ago from web at Belmont Shore, California, United States
    • Stephen Michael Kellat likes this.
    • Stephen Michael Kellat repeated this.
    • Evan Prodromou Evan Prodromou

      It's hard to do, really. You have to solve a captcha and confirm an email address. It's a lot of work for not a lot of gain.

      about 5 months ago
      Jason Riedy likes this.
    • Mike Linksvayer Mike Linksvayer

      bet that amount of work can be had for tiny fraction of $5 and whatever odd objective pursued, worth <$5. just sayin' #fivebucksignup

      about 5 months ago
    • Evan Prodromou Evan Prodromou Mike Linksvayer

      Ah, good point.

      about 5 months ago
    • zoowar zoowar

      They might lie dormant for a couple of months and then activate (based on my Black Friday spam observations).

      about 5 months ago
    • zoowar zoowar Mike Linksvayer

      Sadly $5 USD is a day wage in some countries. I don't like this idea.

      about 5 months ago
    • Evan Prodromou Evan Prodromou

      Well, I've silenced about 5000 already, and there are probably a few more to come. Hope we can handle them by hand.

      about 5 months ago
      Stephen Michael Kellat likes this.
    • foonetic (lnxwalt140) foonetic (lnxwalt140)

      @evan I wasn't counting how many, but I've been silencing them for a few hours now.

      about 5 months ago
      Stephen Michael Kellat likes this.
    • Evan Prodromou Evan Prodromou foonetic (lnxwalt140)

      Thanks a ton. It seems like we're down to just a few.

      about 5 months ago
      Stephen Michael Kellat likes this.
    • Charles Roth Charles Roth

      @evan @evan@identi.ca something similar took down demo.friendika.com as well...

      about 5 months ago
    • Evan Prodromou Evan Prodromou Charles Roth

      Really!? That's bizarre. Same kind of junk posts?

      about 5 months ago
    • Charles Roth Charles Roth

      massive registrations swamped the db and took it down. many of the sn sites also saw massive increases in hits to their…

      about 5 months ago
    • Evan Prodromou Evan Prodromou

      One other thing worth noting is that when someone on identi.ca gets silenced, everyone who registered from the same IP also gets silenced.

      about 5 months ago
    • Evan Prodromou Evan Prodromou

      ...so whoever did this needed tons of IPs to register from.

      about 5 months ago
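The silencing rule described in the two posts above can be sketched roughly as follows. This is a toy model and not StatusNet's actual code: the `Moderation` class, its method names, and the example IP addresses (taken from documentation ranges) are all assumptions made for illustration.

```python
from collections import defaultdict


class Moderation:
    """Toy model of IP-linked silencing (hypothetical, not StatusNet code)."""

    def __init__(self):
        self.ip_of = {}                      # account -> registration IP
        self.accounts_at = defaultdict(set)  # registration IP -> accounts
        self.silenced = set()

    def register(self, account, ip):
        self.ip_of[account] = ip
        self.accounts_at[ip].add(account)

    def silence(self, account):
        """Silence an account and everyone who registered from the same IP."""
        ip = self.ip_of[account]
        affected = set(self.accounts_at[ip])
        self.silenced |= affected
        return affected


mod = Moderation()
mod.register("spam1", "203.0.113.7")
mod.register("spam2", "203.0.113.7")
mod.register("alice", "198.51.100.2")
mod.silence("spam1")  # also silences spam2; alice is untouched
```

Under a rule like this, one moderator action removes every account behind a shared address, which is why registering thousands of accounts that survive would take many distinct IPs.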
    • Evan Prodromou Evan Prodromou Charles Roth

      What about Diaspora?

      about 5 months ago
      Jason Riedy likes this.
    • Charles Roth Charles Roth

      @evan apparently not diaspora.

      about 5 months ago
    • Evan Prodromou Evan Prodromou Charles Roth

      I should probably dig up the registration IPs. I wonder if they come from some particular country?

      about 5 months ago
    • foonetic (lnxwalt) foonetic (lnxwalt)

      That has been a concern, that this could spread to joindiaspora.com and rstat.us.

      about 5 months ago
    • Patrick Niedzielski Patrick Niedzielski Charles Roth

      @parlementum @evan Rather suspicious that Diaspora wasn't targeted either, what with all the public hype it has had.

      about 5 months ago
    • Jeremy Pope Jeremy Pope Jeremy Pope

      If you ever need any more modhelper help, my ol’ identica account is @jpopehasmoved@identi.ca ;)

      about 5 months ago
    • Evan Prodromou Evan Prodromou Patrick Niedzielski

      If it's spammers, might make sense. Diaspora doesn't have the same public interface we do. (Public timeline, tag pages, group pages.)

      about 5 months ago
    • Jeremy Pope Jeremy Pope Jeremy Pope

      You should probably direct that notice elsewhere @jpope :/

      about 5 months ago
    • Samat K Jain Samat K Jain Identi.ca

      *cough* Invite-only *cough* !Identica

      about 5 months ago
      Matías Gabriel Ventura (emegeve) likes this.
    • zoowar zoowar Identi.ca , Samat K Jain

      Spammers are like roaches, once they're in your apartment, they're in.

      about 5 months ago
    • Mike Linksvayer Mike Linksvayer zoowar

      @zoowar gratis with invite. but I know you hate that idea too.

      about 5 months ago
    • zoowar zoowar Mike Linksvayer

      You just created an underground economy selling invites. I don't hate the invite approach, I don't think it solves the problem.

      about 5 months ago
    • zoowar zoowar zoowar

      When g+ was invite only, the only people who couldn't get an invite were people who didn't know anyone with a g+ account.

      about 5 months ago
    • Mike Linksvayer Mike Linksvayer zoowar

      Possibly, but doubtful. AFAIK markets in invites have been at best fleeting as tx cost > value of invite.

      about 5 months ago
    • Samat K Jain Samat K Jain Identi.ca , zoowar

      AFAIK current troubles appear not to be the cockroaches in the apartment, but the swarms that keep coming in because the door is open !Identica

      about 5 months ago
      Marjolein Katsma likes this.
    • zoowar zoowar Identi.ca , Samat K Jain

      Spam is the devil we know, an easy target to point our finger. However, spam *volume* is not the issue http://ur1.ca/6ynli

      about 5 months ago
    • zoowar zoowar

      It's not difficult to rent a botnet.

      about 5 months ago
    • zoowar zoowar Identi.ca , Samat K Jain

      Spammers could set up an invitation network to distribute invitations among themselves.

      about 5 months ago
      Marjolein Katsma likes this.
    • zoowar zoowar Identi.ca , zoowar

      To fight this one would use the same social classifiers that would also identify dents as spam. Spam filtering is more democratic.

      about 5 months ago
    • Samat K Jain Samat K Jain Identi.ca , zoowar

      Been through this already — spammers _will_ do anything. Why is not accepting that and the consequences (a downed site) OK?

      about 5 months ago
      Marjolein Katsma likes this.
    • Samat K Jain Samat K Jain Identi.ca , zoowar

      What's a social classifier?

      about 5 months ago
    • Mike Linksvayer Mike Linksvayer Identi.ca , zoowar

      @zoowar depends on how invites doled out. how many modhelpers are spammers?

      about 5 months ago
    • zoowar zoowar Identi.ca , Samat K Jain

      It's meant to convey a convergence of social graphing and Bayesian classification.

      about 5 months ago
    • Samat K Jain Samat K Jain Identi.ca , zoowar

      Does Bayesian classification work when documents are so small? (i.e. 140 characters)

      about 5 months ago
    • Samat K Jain Samat K Jain Identi.ca , zoowar

      Also, I don't think anyone has time to implement new code. Site is being overwhelmed _now_

      about 5 months ago
    • Samat K Jain Samat K Jain Identi.ca , zoowar

      Also, Bayesian classification would have been worthless for the past downtime. Added accounts apparently had no URLs/spam to classify

      about 5 months ago
    • zoowar zoowar Identi.ca , Samat K Jain

      Yes

      about 5 months ago
    • zoowar zoowar Identi.ca , Samat K Jain

      Agreed, but I'm not arguing to do anything about spam (right now). I'm arguing that it's not the problem.

      about 5 months ago
    • zoowar zoowar Identi.ca , Samat K Jain

      Another spam-herring (misled by spam).

      about 5 months ago
    • Mike Linksvayer Mike Linksvayer Identi.ca , Samat K Jain

      @samatjain maybe no new code, temporary solution: $ rm actions/register.php

      about 5 months ago
    • Samat K Jain Samat K Jain Identi.ca , zoowar

      Don't understand where numbers are from… How do you know whether 5k notices/sec is reasonable? Also, isn't 7k notices/hour = 2 notices/sec?

      about 5 months ago
    • zoowar zoowar Identi.ca , Mike Linksvayer

      I would read a proposal. But besides proof of concept startups and google, what status networks are using invites?

      about 5 months ago
    • Samat K Jain Samat K Jain Identi.ca , zoowar

      Have evidence of that? Math doesn't work—not enough feature coefficients. Experience agrees: short spam e-mail doesn't get marked spam

      about 5 months ago
    • zoowar zoowar Identi.ca , Samat K Jain

      360 seconds in an hour. 7200/360 = 20. Even if you don't believe 5k, I know you're not arguing that 20 tps is reasonable.

      about 5 months ago
    • Samat K Jain Samat K Jain Identi.ca , zoowar

      3600 seconds in an hour… http://identi.ca/url/63094227. Was pointing out numbers may be wrong, 20 tps is very reasonable

      about 5 months ago
    • Samat K Jain Samat K Jain Identi.ca , zoowar

      Not run a production StatusNet instance myself, but my impression is that isn't well-tuned… Don't expect >400–500 req/s for untuned app

      about 5 months ago
    • Samat K Jain Samat K Jain Identi.ca , zoowar

      Also, that's averaging over an hour. What if all those requests came within 5 min, and site unable to recover?

      about 5 months ago
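The rate arithmetic in the exchange above can be checked in a few lines. The 7200-per-hour figure and the 5-minute burst window come from the posts; the function names are made up for this sketch.

```python
SECONDS_PER_HOUR = 60 * 60  # 3600, not 360


def per_second(count_per_hour):
    """Average rate when the count is spread evenly over one hour."""
    return count_per_hour / SECONDS_PER_HOUR


def burst_per_second(count, window_seconds):
    """Rate when the same count arrives inside a shorter window."""
    return count / window_seconds


average = per_second(7200)               # 2.0 per second
burst = burst_per_second(7200, 5 * 60)   # 24.0 per second
```

The hourly average hides the burst case: the same 7200 notices squeezed into five minutes is a load twelve times higher than the average suggests.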
    • zoowar zoowar Identi.ca , Samat K Jain

      You got me. But that only strengthens my argument about spam load not being an issue.

      about 5 months ago
    • Samat K Jain Samat K Jain Identi.ca , Mike Linksvayer

      That's extreme—site already has invite functionality. Literally a checkbox or 1 line config change to enable—AND turn off!

      about 5 months ago
    • Marjolein Katsma Marjolein Katsma Identi.ca , Samat K Jain

      @samatjain it should be possible to use Bayesian classification for patterns of use, not just content!

      about 5 months ago
    • Samat K Jain Samat K Jain Identi.ca , Marjolein Katsma

      How does that work? Not obvious IMHO, and written a Bayesian classifier or two in my time.

      about 5 months ago
    • Marjolein Katsma Marjolein Katsma Identi.ca , Samat K Jain

      @samatjain I'd string stuff together and feed that to the engine - e.g., last N posts (if any), IPs, DNS results, profile data, & repeat

      about 5 months ago
    • Marjolein Katsma Marjolein Katsma Identi.ca , Samat K Jain

      @samatjain it doesn't need to look for patterns in 'content', but patterns in *stuff*, so just make *stuff* into strings it can look at

      about 5 months ago
    • Samat K Jain Samat K Jain Identi.ca , Marjolein Katsma

      Stringing together posts is a nice idea. Not so sure about IP, etc… also, you're not really classifying behavior; still classifying content.

      about 5 months ago
    • Marjolein Katsma Marjolein Katsma Identi.ca , Samat K Jain

      @samatjain just giving examples... same act, varying IPs is not content; also look at things like timing between posts

      about 5 months ago
    • Samat K Jain Samat K Jain Identi.ca , Marjolein Katsma

      How to turn "stuff" into a string? Should point out: billions of $ spent in behavior detection for homeland security. And most doesn't work.

      about 5 months ago
    • Marjolein Katsma Marjolein Katsma Identi.ca , Samat K Jain

      @samatjain another pattern I've seen is so many text-only posts, then one with a link... use a sliding window to see how it develops

      about 5 months ago
    • Marjolein Katsma Marjolein Katsma Identi.ca , Samat K Jain

      @samatjain WE can recognize patterns - I think the trick is to encode post metadata across multiple posts in such a way that...

      about 5 months ago
    • Samat K Jain Samat K Jain Identi.ca , Marjolein Katsma

      Not sure what you mean when you say "pattern". Seems like you really mean heuristics? Spammers _will_ defeat heuristics

      about 5 months ago
    • Marjolein Katsma Marjolein Katsma Identi.ca , Samat K Jain

      @samatjain ... a Bayesian engine can analyze and learn to detect them. there is already plenty of material about patterns WE've seen

      about 5 months ago
    • Marjolein Katsma Marjolein Katsma Identi.ca , Samat K Jain

      @samatjain we've seen, and reported, a lot of patterns already - like 'three links & fill up with hashtags', or '1 dent each hr', etc.

      about 5 months ago
    • Marjolein Katsma Marjolein Katsma Identi.ca , Samat K Jain

      @samatjain but most patterns are seen across multiple dents, so you need a sliding window.

      about 5 months ago
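One way to read the sliding-window idea from the posts above is to turn each post's metadata into a short symbol and then emit overlapping runs of symbols as tokens for a classifier. This is only a sketch of that reading: the `Post` fields, the 60-second "fast" threshold, and the window size of 3 are all assumptions, not anything Identi.ca is known to use.

```python
from collections import namedtuple

# Hypothetical per-post metadata: seconds since some epoch, and a link flag.
Post = namedtuple("Post", "timestamp has_link")


def encode(post, prev):
    """Encode one post as 'kind:gap', e.g. 'T:fast' for a quick text-only post."""
    kind = "L" if post.has_link else "T"
    if prev is None:
        gap = "first"
    else:
        gap = "fast" if post.timestamp - prev.timestamp < 60 else "slow"
    return f"{kind}:{gap}"


def window_tokens(posts, size=3):
    """Slide a window over the encoded posts and join each window into a token."""
    symbols = [encode(p, posts[i - 1] if i else None) for i, p in enumerate(posts)]
    return ["|".join(symbols[i:i + size]) for i in range(len(symbols) - size + 1)]


# Three quick text-only posts followed by one with a link.
posts = [Post(0, False), Post(10, False), Post(20, False), Post(30, True)]
tokens = window_tokens(posts)
```

Each token then describes a stretch of behaviour rather than a single dent, so a word-counting classifier could in principle be trained on these strings the same way it is trained on message text.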
    • Marjolein Katsma Marjolein Katsma Identi.ca , Samat K Jain

      @samatjain you'd need to experiment but I'm sure it's possible to use a Bayesian engine for that - plenty of stuff already to feed it to learn

      about 5 months ago
    • Samat K Jain Samat K Jain Identi.ca , Marjolein Katsma

      That's a heuristic, not a pattern. Heuristic = rule(s) you follow. Problem w/ hard-coded rules is that they are easily defeated…

      about 5 months ago
    • Samat K Jain Samat K Jain Identi.ca , Marjolein Katsma

      …Show us the code, I guess. Is an !Identica corpus available for download somewhere?

      about 5 months ago
    • Marjolein Katsma Marjolein Katsma Identi.ca , Samat K Jain

      @samatjain no, I *don't* mean hard-coded rules. WE see the patterns, a Bayesian engine could learn them when fed 'spam' material

      about 5 months ago
    • Marjolein Katsma Marjolein Katsma Identi.ca , Samat K Jain

      @samatjain this is backend stuff - there's a whole big database there... I don't know if things like posting IP are stored though

      about 5 months ago
    • Marjolein Katsma Marjolein Katsma Identi.ca , Samat K Jain

      @samatjain if not, it should be. for IP addresses, add DNS lookups (zombies, proxies etc) before feeding the engine.

      about 5 months ago
    • Marjolein Katsma Marjolein Katsma Identi.ca , Samat K Jain

      @samatjain I *know* some 'pro' software provides the option to cycle through proxies for instance...

      about 5 months ago
    • Samat K Jain Samat K Jain Identi.ca , Marjolein Katsma

      …Mmm, while it exists, it's not available? Sort of pointless to talk about things _we_ can't actually do (don't expect StatusNet to do it)

      about 5 months ago
    • Marjolein Katsma Marjolein Katsma Identi.ca , Samat K Jain

      @samatjain ...that should be detectable as a pattern if combined with other (meta) data

      about 5 months ago
    • Samat K Jain Samat K Jain Identi.ca , Marjolein Katsma

      Just wondering, do you actually know how Bayesian classification works? "Feeding", "patterns", etc are confusing ways to talk about it

      about 5 months ago
    • Marjolein Katsma Marjolein Katsma Identi.ca , Samat K Jain

      @samatjain @evan mentioned he was working on some sort of Bayesian engine - I just have some ideas for what to feed it:

      about 5 months ago
    • Marjolein Katsma Marjolein Katsma Identi.ca , Samat K Jain

      @samatjain *just* content of single dents is definitely not enough - you need it to look for patterns across dents (like we do)

      about 5 months ago
    • Samat K Jain Samat K Jain Identi.ca , Marjolein Katsma

      Mm, OK. Bayesian statistics are complicated, mapping to variables is difficult. Mentioned: real-life behavior detection systems don't work

      about 5 months ago
    • Marjolein Katsma Marjolein Katsma Identi.ca , Samat K Jain

      @samatjain from what I know, you *start* teaching a Bayesian engine by giving it a bunch of spam and a bunch of ham

      about 5 months ago
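The "give it a bunch of spam and a bunch of ham" step in the post above is, in the usual word-based setup, a matter of counting words per class and comparing smoothed log-probabilities. The following is a minimal naive Bayes sketch with add-one smoothing, not the engine anyone in this thread was building; the class name and the four training messages are invented for illustration.

```python
import math
from collections import Counter


class NaiveBayes:
    """Minimal word-count naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.counts = {"spam": Counter(), "ham": Counter()}  # word counts per class
        self.docs = Counter()                                # messages seen per class

    def train(self, label, text):
        self.counts[label].update(text.lower().split())
        self.docs[label] += 1

    def log_score(self, label, text):
        vocab = set(self.counts["spam"]) | set(self.counts["ham"])
        total = sum(self.counts[label].values())
        # Start from the log prior, then add one smoothed log-likelihood per word.
        score = math.log(self.docs[label] / sum(self.docs.values()))
        for word in text.lower().split():
            score += math.log((self.counts[label][word] + 1) / (total + len(vocab)))
        return score

    def classify(self, text):
        return max(("spam", "ham"), key=lambda label: self.log_score(label, text))


nb = NaiveBayes()
nb.train("spam", "cheap pills buy now")        # made-up training messages
nb.train("spam", "buy cheap watches now")
nb.train("ham", "reading a paper on federation")
nb.train("ham", "lunch then more code review")

label = nb.classify("buy cheap pills")
```

With only a handful of words per message, each message contributes very little evidence, which is the concern raised earlier in the thread about 140-character documents.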
    • Marjolein Katsma Marjolein Katsma Identi.ca , Samat K Jain

      @samatjain define 'don't work' :)

      about 5 months ago
    • Samat K Jain Samat K Jain Identi.ca , Marjolein Katsma

      All in all, rather see @evan fulfill StatusNet's business plan instead of maintain !Identica. Corp sites don't need heavy spam protection

      about 5 months ago
      Charles Roth likes this.
    • Samat K Jain Samat K Jain Identi.ca , Marjolein Katsma

      How many such systems (that cost millions) do you know of that have caught any terrorists? Pretty sure if they did, we'd hear about it…

      about 5 months ago
    • Marjolein Katsma Marjolein Katsma Identi.ca , Samat K Jain

      @samatjain that depends - if purely internal, they don't need it - if for customer-facing things like support, they definitely do!

      about 5 months ago
    • Samat K Jain Samat K Jain Identi.ca , Marjolein Katsma

      For that matter, how many *MUCH* cheaper systems (like surveillance cameras) have caught any terrorists? tl;dr: Not a technology problem

      about 5 months ago
    • Marjolein Katsma Marjolein Katsma Identi.ca , Samat K Jain

      @samatjain no - but here we're talking about patterns WE *can* see - if we can, we can encode their elements to be analyzed

      about 5 months ago
    • Samat K Jain Samat K Jain Identi.ca , Marjolein Katsma

      Mmm I think this discussion is getting to the point: show us the code. You're mixing a lot of different unrelated concepts together

      about 5 months ago
    • Marjolein Katsma Marjolein Katsma Identi.ca , Samat K Jain

      @samatjain what 'unrelated concepts'? spam fighting is all about pattern recognition, and it's never about content alone

      about 5 months ago
    • Marjolein Katsma Marjolein Katsma Identi.ca , Samat K Jain

      @samatjain I'm thinking outside of the box, because applying Bayesian detection to microblogs is (apparently) new... just get started!

      about 5 months ago
    • Samat K Jain Samat K Jain Identi.ca , Marjolein Katsma

      Bayesian classification (e.g. SpamBayes, Bogofilter) is NOT about pattern recognition

      about 5 months ago
    • Charles Roth Charles Roth Marjolein Katsma

      @marjoleink@identi.ca federated statusnet sites seem a better option for a number of reasons. Statusnet is open source …

      about 5 months ago
    • zoowar zoowar Identi.ca , Samat K Jain

      Just a good read, http://ur1.ca/6yt2u

      about 5 months ago
    • zoowar zoowar Identi.ca , Marjolein Katsma

      And since most spam wants you to navigate to a url... Monarch http://ur1.ca/6yt2u

      about 5 months ago
    • Samat K Jain Samat K Jain Identi.ca , zoowar

      Fascinating! Though, is this proprietary? Paper isn't loading for me (i.e. what are the actual heuristics involved)

      about 5 months ago
    • Samat K Jain Samat K Jain Identi.ca , zoowar

      Never mind, got the paper. It's very vague. =/

      about 5 months ago
    • zoowar zoowar Identi.ca , Samat K Jain

      Alternate url http://ur1.ca/6yujo

      about 5 months ago
    • Bob Jonkman Bob Jonkman Identi.ca , Samat K Jain

      @samatjain Yes, Bayesian works on small messages, it just takes more of them to get a representative sample.

      about 5 months ago
    • Mike Linksvayer Mike Linksvayer Federated Social Web , zoowar

      Not a proposal, nor even a direct answer to parent, but pure conjecture http://gondwanaland.com/mlog/2011/12/25/fsw-invite/ !fsw

      about 5 months ago
    • maiki maiki Mike Linksvayer

      I make a lot of sites that use invitations for both rationing and/or exclusivity. It helps my hobby projects, because m…

      about 5 months ago
    • maiki maiki Ostatus.org , Mike Linksvayer

      Lately I've been really getting into the use cases for #Diaspora, versus what I am calling the #OStatusphere. I get the…

      about 5 months ago
    • maiki maiki GNU mediagoblin , Mike Linksvayer

      Also, I wonder if we will need to revisit the commercial and invitation system side of this once !MediaGoblin implement…

      about 5 months ago
    • Marjolein Katsma Marjolein Katsma Identi.ca , zoowar

      @zoowar a lot of spam is actually profile spam which doesn't want *you* to navigate there, but search engines to index it

      about 5 months ago
    • zoowar zoowar Identi.ca , Marjolein Katsma

      Search engines are free to use monarch in deciding what to index or how to weight results.

      about 5 months ago
    • zoowar zoowar Mike Linksvayer

      A decade ago, invitations were used to generate sales leads.

      about 5 months ago
    • zoowar zoowar zoowar

      Then the open source renaissance put an end to that.

      about 5 months ago
    • Samat K Jain Samat K Jain maiki

      Flickr and Smugmug aren't really free (as in beer) services. But Flickr, at least, does have quite a bit of spam for their free accounts

      about 5 months ago
    • maiki maiki Samat K Jain

      I wasn't thinking of free (beer) services, rather what kind of spam media hosting sites get. I imagine it is mostly com…

      about 5 months ago
    • Marjolein Katsma Marjolein Katsma Identi.ca , zoowar

      @zoowar of course ;) but they need a link to find it - which is the main function of profile spam (sometimes a chain of profile links!)

      about 5 months ago
    • lnxwalt-sn lnxwalt-sn Identi.ca , Marjolein Katsma

      A lot of spam is obvious to other users, and leads to still more spam (on vandalized forums and college sites). Their target: search engines

      about 5 months ago
      Marjolein Katsma likes this.
    • lnxwalt-sn lnxwalt-sn Identi.ca , Marjolein Katsma

      Spam links in profiles are one obvious example of this, and the slowness to nofollow all links only hurt #Identica and the SN cloud.

      about 5 months ago
      Marjolein Katsma likes this.
    • Alex Maurin Alex Maurin lnxwalt-sn

      @lnxwalt okay, i'm back. rawr!! let's start disabling those spammer bot accounts!

      about 5 months ago
    • Alex Maurin Alex Maurin Identi.ca , lnxwalt-sn

      @lnxwalt they are link farming. :(

      about 5 months ago
    • Marjolein Katsma Marjolein Katsma Identi.ca , lnxwalt-sn

      @lnxwalt yes, exactly. *sigh*

      about 5 months ago

Identi.ca is a microblogging service brought to you by Status.net. It runs the StatusNet microblogging software, version 1.1.0-alpha1, available under the GNU Affero General Public License.

Creative Commons Attribution 3.0 All Identi.ca content and data are available under the Creative Commons Attribution 3.0 license.

Built in Montreal