Scripting News: Twitter does have track (Scripting News)

  • Karoli · 1 year ago
    That's not what track does. For starters, using track doesn't require me to sit on a page and refresh it over and over again. When track is used in combination with IM, I don't even have to be there at all, because it will automatically store to my gmail account.

    We covered track pretty extensively in the 8/20 Newsgang Live episode, where I ranted for the better part of an hour about why I wanted it and what it wasn't.

    To track the term davewiner, all I have to do is give my xmpp client the command "track davewiner". After that, every post to you, from you, or about you will automatically post to my IM client *at the same time* it is posted. Not later, not with latency (which Twitter search/Summize has plenty of).

    Try it on identi.ca. Add laconica@west.spy.net to your IM client and send the command help. Track your name or some other term. Watch what happens.
  • dave · 1 year ago
    Well I want to figure out whether what you want is possible with what Twitter provides, or if there's something missing. I feel there's been too much ranting and not enough careful whittling down of the problem to the core of what's missing. I had a meeting with Jack Dorsey a few weeks ago and I couldn't tell him what was missing. So...

    In that spirit --

    1. They do provide an RSS feed for search queries.

    2. And there are tools that map RSS on to IM (I use one to get the NewsJunk feed delivered to me in GMail, it works).

    Assuming it works -- what's missing now?
  • Karoli · 1 year ago
    The issue is timing. RSS has latency too. What makes track so dang powerful
    is that ability to discover and engage in real time.
  • danmactough · 1 year ago
    If Twitter were to let me poll the RSS feed for the search queries frequently enough, then you would be correct. But they don't allow that. To get close enough to real-time to enable the kind of discovery and conversation that Steve is looking for (IMO), they would need to allow me (and everyone else) to poll every 5 to 10 seconds. My "track bot" polls every minute, and that's good enough for me, but I've got enough going on with work or the baby (depending on where I am) that I couldn't take advantage of real-time if I had it.

    And truthfully, there is frequently quite a bit of a lag between the time an update is posted and when it shows up on Twitter Search.

    When track and IM were core to Twitter, it was REAL-TIME.

    My only point is that poll != push and the difference is important.
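
To put rough numbers on poll versus push, here is a minimal sketch. It assumes an idealized poller that fires at fixed intervals and ignores network and indexing delay; the post times are made up for illustration.

```python
import math

def poll_delivery_time(event_time_s: float, poll_interval_s: float) -> float:
    """Return when a poller firing at 0, T, 2T, ... first sees the event."""
    return math.ceil(event_time_s / poll_interval_s) * poll_interval_s

def push_delivery_time(event_time_s: float) -> float:
    """Push delivers at the moment of the event (transport delay ignored)."""
    return event_time_s

# Hypothetical post times, in seconds, and a one-minute poll interval.
post_times_s = [3.0, 61.0, 119.5]
poll_interval_s = 60.0

poll_latencies = [poll_delivery_time(t, poll_interval_s) - t for t in post_times_s]
push_latencies = [push_delivery_time(t) - t for t in post_times_s]

print("poll latencies (s):", poll_latencies)
print("push latencies (s):", push_latencies)
```

With polling, the wait depends on where the post lands inside the interval, so it can be anything from zero up to the full interval. With push it is whatever the transport costs.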
  • dave · 1 year ago
    Well, they explicitly said we could poll as frequently as we want at Bearhugcamp.

    And why would one assume that the backend of Twitter won't improve?

    As I said elsewhere in this thread, Dorsey asked me the kind of question I would ask -- what exactly is it that we need from them. I couldn't answer it. That's the purpose of this thread.

    So far, having read all the responses, I still don't get it.

    If you want to know what I think -- I think you could have exactly what you want. But you'd have to start with the assumption that it's possible.
  • danmactough · 1 year ago
    As Steve said above, they said at BHC that you could email Alex and ask to be whitelisted to poll as much as you want. My understanding is that they have not approved just such a request by Dustin Sallings, who wants to provide a Twitterbot that provides "track" via IM.
  • dave · 1 year ago
    Then that sucks. No sarcasm, or bullshit.
  • stevegillmor · 1 year ago
    now we're on the same page
  • dustin · 1 year ago
    This is correct.

    Also note the math when you start doing what they say literally. So they said I could poll them ``every five seconds.'' If I have 4,805 queries (that's how many are on identispy right now), and I want to play them against summize, ``every five seconds'' means one of two things:

    1) It takes nearly seven hours to run the same query twice.
    2) I'm running 80MM search queries against twitter a day.

    Neither one of those is particularly productive.
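
The two readings of "every five seconds" can be checked with a few lines of arithmetic. This is only a sketch of the calculation using the figures quoted in the comment; the variable names are illustrative.

```python
QUERIES = 4805           # tracked queries quoted in the comment
POLL_SPACING_S = 5       # "every five seconds"
SECONDS_PER_DAY = 86400

# Reading 1: one request every five seconds, cycling through all queries.
# Time before the same query comes around again:
cycle_hours = QUERIES * POLL_SPACING_S / 3600

# Reading 2: every query is re-run every five seconds.
requests_per_day = QUERIES * SECONDS_PER_DAY // POLL_SPACING_S

print(f"round-robin cycle: {cycle_hours:.2f} hours")
print(f"all queries every 5 s: {requests_per_day:,} requests/day")
```

The first reading gives a cycle of about 6.7 hours; the second gives roughly 83 million requests a day, in line with the "80MM" figure.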
  • dustin · 1 year ago
    They also said I could have the feed so that I could give people what they want as soon as gnip offered it. Gnip offered it, twitter is preventing me from having it.

    I currently have 26 tracks on twitterspy and 50 tracks on identispy. To do it the way you're describing, I'd need 76 browser tabs open concurrently, and bounce around to them to see if anyone's talking about anything I'm interested in. That's worse than the computer polling, because now *I* have to.

    When I was in school and we had irc, anybody who wanted to could write a bot that tracked a channel in *real time*. All irc clients do this now. I can have the thing running in the background, and if someone says something I'm interested in, my client will come and tell me.

    All µblogging systems are just a generalization of irc, but with one big room. If you want to engage in a conversation in this global chat room, your only option is to be able to see people talk about things as they happen.

    All of the required technology exists. And, as I've said, gnip has told me they have their finger on a button pending twitter's approval. Twitter is still silent on the issue.
  • bentrem · 1 year ago
    Long-polling came up in context of FriendFeed's "Real-time". That seems to me a very elegant technique.
  • dave · 1 year ago
    As soon as they come out with the API for that I'm going to be trying to code against it.
  • stevegillmor · 1 year ago
    looking forward to you doing that
  • bentrem · 1 year ago
    Most/all I find on the topic relates to Comet e.g. "Amazon EC2 virtual servers; a single virtual machine was used as the Cometd server". Out of my depth here so wondering how FF deployed this.

    Long shot: Orbited.org - "Orbited is a comet daemon that works on many platforms for many languages. It supports comet style long-polling as well as Iframe streaming. It also has a clear scaling path." Comment by Michael Carter (August 8, 2007) in Ajaxian.com

    Addendum: My spidering has run out of steam on this: "FriendFeed become the latest site to enable real-time updates using the long-polling variant of Comet. The real-time Web was something of a theme at this year’s FOWA, with talks on message queues, XMPP and scaling Comet at Meebo." - simonwillison.net, 16th October 2008
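
For readers unfamiliar with long-polling, the core idea is that the request is held open until there is something to return or a timeout expires, at which point the client asks again. The sketch below shows that behaviour in a single process with a queue standing in for the server side. It is an illustration of the general technique only, not a description of how FriendFeed or any Comet server is deployed.

```python
import queue
import threading

def long_poll(inbox: queue.Queue, timeout_s: float):
    """Block until a message arrives or the timeout expires.

    Returns the message, or None on timeout (a real client would then
    simply issue the next request).
    """
    try:
        return inbox.get(timeout=timeout_s)
    except queue.Empty:
        return None

inbox = queue.Queue()

# Simulate an update arriving 50 ms after the client starts waiting.
threading.Timer(0.05, inbox.put, args=("new update",)).start()

first = long_poll(inbox, timeout_s=2.0)    # returns as soon as the update lands
second = long_poll(inbox, timeout_s=0.05)  # nothing pending, so it times out

print(first, second)
```

The waiting client costs nothing while the queue is empty, which is the property that makes it feel close to push.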
  • Michael Markman (Mickeleh) · 1 year ago
    That's close. But there are three things missing. In the old days, following tweets with Track in Gtalk, I could do the following.

    1. Track multiple keywords simultaneously--in effect an OR search. I could track, say scobleizer, iphone, biden. I haven't been able to do that with the Twitter search page.

    2. I didn't have to refresh. My IM window would just continuously present tweets that matched any of my tracked words. (behaving something like "Twhirl", but filtered to match only my tracked words.)

    3. replying was instantly available in the same window. Because I was looking at an IM window, there was always an entry field.

    Here's one convenience that track with IM lacked: There was no one-click reply function. If I wanted to respond with an @ reply, I had to type out the twitter user name of the person I wanted to reply to.

    It was a notably different experience from using the Twitter search page. It was much closer to using election.twitter.com. Except that with track, I could use any keywords I wanted to, rather than the handful that are pre-packaged.
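
The multi-keyword behaviour described here amounts to filtering an incoming stream against an OR of tracked words. A minimal sketch follows, assuming the messages arrive from some push source; the sample messages are invented and the function names are hypothetical.

```python
import re
from typing import Callable, Iterable, List

def make_matcher(terms: Iterable[str]) -> Callable[[str], bool]:
    """Build a case-insensitive matcher that is true if ANY term appears."""
    cleaned = [t.strip() for t in terms if t.strip()]
    if not cleaned:
        return lambda text: False
    pattern = re.compile("|".join(re.escape(t) for t in cleaned), re.IGNORECASE)
    return lambda text: pattern.search(text) is not None

def deliver_matching(messages: Iterable[str], matches: Callable[[str], bool]) -> List[str]:
    """Keep only the messages that mention a tracked term."""
    return [m for m in messages if matches(m)]

tracked = make_matcher(["scobleizer", "iphone", "biden"])
incoming = [
    "Just got my iPhone back from repair",
    "Lunch was great",
    "Biden speaks tonight",
]
delivered = deliver_matching(incoming, tracked)
print(delivered)
```

This does plain substring matching, so "iphone" also matches "iPhones". Whether the original track feature matched whole words only is not something the thread states.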
  • dave · 1 year ago
    You could have #1 and #2 right now. #3 would take some work.
  • stevegillmor · 1 year ago
    There may be too much ranting and raving about track, but it's not because I am confused about it, Dave. I've been extremely clear for months about it, and what does not exist is what Twitter provided until it was removed. Many third parties have attempted to provide the service, but have been stymied by Twitter's refusal to allow latency-free service either over XMPP or to another provider who would then pass it along. BearHug Camp successfully proved that Twitter would only commit to private conversations not trackable (pun intended) in the open to arrange for latency-free or usable conversation-speed track. None of the requests for such access have been approved, either by Twitter or via third party services such as Gnip who Twitter said could "soon" provide such service.

    While many may not see the usefulness of track as we who care have defined it, or understand the repeated requests for such resumption of service as you suggest Jack Dorsey doesn't, the facts are that the service is unavailable due to business reasons, not technology ones. Perhaps if you asked Jack Dorsey or whoever is currently empowered to speak authoritatively to discuss this in an open forum where our request can be specifically satisfied or rejected, then we can move on. To say that this has not been carefully whittled down is factually mistaken.
  • dave · 1 year ago
    Okay except I didn't say you were confused just not correct. I still think you can have what you want, but not approaching it the way you want to go. Sometimes technology has twists to it, you could let some others play here, and not necessarily go the linear route, if what you want is the functionality. If your goal is to prove that Twitter disabled XMPP for business reasons not technology reasons, I agree with you, I think it's obvious, and I've said so many times, as recently as yesterday in the chat on your GG ustream feed.
  • dave · 1 year ago
    Proving that btw is about as useful as proving that Palin didn't actually say Thanks But No Thanks for the BTNW. They just keep saying the same thing and their voters don't care, just as users don't care.
  • stevegillmor · 1 year ago
    Not trying to prove that, just running the misdirections to ground so that real answers will either be forthcoming or alternates will appear.
  • stevegillmor · 1 year ago
    If you can model something that provides realtime track as Michael Markman and Karoli reply below, then I'm all ears. My bet is that getting close will only increase the number of cutoffs of access to disable said functionality. I've been gnawing on this long enough to start believing my lying eyes.
  • Jackson Miller · 1 year ago
    For me, Track was always about receiving notifications via SMS. I was able to track my username to get the missing replies. I was able to track my company for mentions. I was able to track my hometown to find new twitterers or visitors passing through.

    I hope I am wrong, but I don't think there is any way to do that with Twitter.

    Adding insult to injury is that every once in a while I get an SMS from twitter with one of those Tip: messages at the bottom encouraging me to send "Track subject" via SMS.

    So, I am with Steve. Track does not exist.
  • naveen · 1 year ago
    yes, this is exactly what i loved about track too.

    without it, it's been impossible to follow conversations (@replies). sure, i can use a client like twitterific while mobile - but doing that requires that i leave the client running or that i relaunch every few minutes (sometimes seconds) to get updates.
  • Richard Fisher · 1 year ago
    Current client implementations of search are just that, "search". They have a search page where you enter a term and they return a results page. There's nothing (that I can see) stopping a client from using the search API to provide track via polling. Looks like the requests support the OR operator so no problem there. Poll the search API for tracked terms along with the regular API for updates. It wouldn't be realtime obviously but you could provide track that updates as often as any client is currently providing regular updates. I'm not convinced that realtime vs 2min polling provides enough advantage to justify the headache of 1 person wanting to track thousands of terms.
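
As a rough sketch of the single-request idea, a client could fold its tracked terms into one OR query before polling. The query parameter name `q` and the helper name are assumptions for illustration, and no endpoint is shown; how many terms one query can hold would need testing.

```python
from urllib.parse import urlencode

def build_track_query(terms):
    """Combine tracked terms into one OR query, dropping blanks and duplicates."""
    unique = sorted({t.strip().lower() for t in terms if t.strip()})
    return " OR ".join(unique)

query = build_track_query(["iphone", "Biden", "iphone", "scobleizer"])

# Assumed parameter name "q"; urlencode turns spaces into "+".
params = urlencode({"q": query})

print(query)
print(params)
```

One request per poll cycle then covers every tracked term for that user, at the cost of the client having to work out which term each result matched.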
  • dustin · 1 year ago
    Yeah, this is what twitterspy (xmpp:twitterspy@jabber.org) has been doing since early July.

    There are several users with many queries (currently 494/1793). Two minute latency is still over a million queries a day, or about 15 queries per second. That's pretty much the rate of tweets through twitter. It only needs to get slightly larger before I'm performing more queries for data than I would be if they just sent me stuff. If twitterspy were as big as identispy, I'd be querying them more than twice for every tweet during peak traffic.

    *OR* they could just send out data to aggregators when it comes in and *not* have all of these services polling, not have users waiting for two minutes (or closer to 15-20 as it shows up for real from twitter *after* they do whatever indexing they do). On a small scale (e.g. if you were to grab twitterspy and run it for yourself), the index polling might be OK if you don't mind the latency. On the large scale, it's just stupidly expensive and slow.
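
The figures in this comment can be reproduced as follows. The sketch reads "494/1793" as 494 users and 1793 queries, and takes the comment's "about 15" as the tweet rate; both are assumptions.

```python
TWITTERSPY_QUERIES = 1793   # assumed to be the query count in "494/1793"
IDENTISPY_QUERIES = 4805    # identispy query count quoted earlier in the thread
POLL_INTERVAL_S = 120       # two-minute latency
SECONDS_PER_DAY = 86400

queries_per_second = TWITTERSPY_QUERIES / POLL_INTERVAL_S
queries_per_day = TWITTERSPY_QUERIES * SECONDS_PER_DAY // POLL_INTERVAL_S

# Assumption from the comment: tweets arrive at roughly 15 per second.
ASSUMED_TWEETS_PER_SECOND = 15

# Number of distinct queries at which polling traffic equals tweet traffic.
break_even_queries = ASSUMED_TWEETS_PER_SECOND * POLL_INTERVAL_S

# Queries issued per tweet if the service grew to identispy's size.
queries_per_tweet_at_identispy_size = (
    IDENTISPY_QUERIES / POLL_INTERVAL_S / ASSUMED_TWEETS_PER_SECOND
)

print(f"{queries_per_second:.1f} queries/s, {queries_per_day:,} queries/day")
print(f"break-even at {break_even_queries} queries")
print(f"{queries_per_tweet_at_identispy_size:.2f} queries per tweet at identispy size")
```

Under those assumptions, the break-even point is about 1,800 distinct queries, which is barely above the current count.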
  • Richard Fisher · 1 year ago
    If it were implemented on the client side, i.e. in Twhirl, it would eliminate the need for polling on such a large scale by a single point. I haven't tested the number of terms you can include in a single query but ideally a client could poll for the dozen or so terms that any given user is tracking with a single request. Currently Twhirl and similar clients poll twitter around every 3 mins for a user's friends timeline, they could poll the search API at the same time for tracked words and add them to the timeline.
  • dustin · 1 year ago
    That's far, far worse.

    Twitterspy aggregates queries such that (for example), the 9 other users who are tracking "xmpp" on twitter along with me share a single search API hit.

    Take this to an extreme. Let's say half a million twhirl users want to know when someone mentions cottage cheese. That's approaching 3,000 queries per second against twitter *just in case* someone says something about cottage cheese (of course, with that much interest, there'll probably be a high probability of it, but stay with me for a moment...) At peak, that's 100 queries for every message that's sent out; on average, it's over 200. (numbers from http://twitter.com/loiclemeur/statuses/911563484 and http://tweetrush.com/ ).

    Now, you should probably note that twhirl integrates with my identispy service for tracking of some other systems over xmpp. In this model, twhirl users just indicate what they want, and the server does the work. If users aren't saying interesting things, the clients are *idle*. It is not until someone actually says "cottage cheese" that action is taken.

    twitterspy is a hybrid model. This is a dumb way to do things, but is overall less expensive for everyone. I reduce the number of queries and alter poll frequencies by interest and user availability and deduplicate them so the half million queries for cottage cheese above turns into a single fairly frequent query. The bandwidth utilization is less for the end users and less for twitter.

    I have the same three minute personal timeline poll in twitterspy, but even that is ridiculous. It's almost 2009. Why are we writing software that polls?

    For my friends who have device updates on, I get the message on my telephone up to three minutes before I get it on my computer. When I first began using twitter, I would get messages from my friends around the same time that they said stuff, and my client wasn't polling.

    The thing that leaves me baffled is that so many people think it's cheaper/easier/faster to service 240,000,000 ``are we there yet'' queries per day (the above example) than it is to just let people know when interesting things happen. It's a different problem, for sure, but the worst implementation I can think of will leave you with fewer queries and a significant reduction in bandwidth costs.
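
The deduplication idea and the cottage cheese arithmetic can be sketched together. This is not twitterspy's actual code; the class and method names are hypothetical, and the three-minute client poll interval is an assumption taken from the figures mentioned in the thread.

```python
from collections import defaultdict

class TrackRegistry:
    """Maps each tracked term to the set of users interested in it."""

    def __init__(self):
        self._subscribers = defaultdict(set)

    def track(self, user, term):
        self._subscribers[term.strip().lower()].add(user)

    def upstream_queries(self):
        """One upstream search per distinct term, however many users track it."""
        return sorted(self._subscribers)

    def recipients(self, term):
        """Everyone who should be told when a result for this term arrives."""
        return sorted(self._subscribers.get(term.strip().lower(), ()))

registry = TrackRegistry()
for n in range(10):
    registry.track(f"user{n}", "xmpp")
registry.track("user0", "cottage cheese")

# Naive model: every client polls for itself every three minutes (assumed).
USERS = 500_000
CLIENT_POLL_INTERVAL_S = 180
naive_queries_per_second = USERS / CLIENT_POLL_INTERVAL_S
naive_queries_per_day = USERS * 86400 // CLIENT_POLL_INTERVAL_S

print(registry.upstream_queries(), len(registry.recipients("xmpp")))
print(f"{naive_queries_per_second:,.0f} queries/s, {naive_queries_per_day:,} queries/day")
```

Ten users tracking the same term cost one upstream query here, while the naive model scales with the number of users rather than the number of distinct terms.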
  • Richard Fisher · 1 year ago
    I'm not arguing that it is cheaper/easier/faster to use polling, it's clearly not, what I'm arguing is that it's possible to achieve track using the currently available APIs. The overhead for twitter doesn't really matter to me. It's possible to do it and if it reached a point where the overhead mattered to twitter then they'd have to do something about it.
  • dustin · 1 year ago
    Well, you're free to run twitterspy on your own if you'd like to tweak update frequencies and stuff. Do note that they're pretty vague about their search API limits (and I've hit them). If they were to enforce their normal API limits of about 70 reqs/hour and you had 26 tracks, as I do, that means that a given term could only be checked about once every half hour without asking for an extension.

    This is kind of a frustrating part of the polling. They can arbitrarily require some stuff to slow down for some users sometimes. A push feed is on or off. For example, some people started out fairly happy with twitterspy, but wanted it to go a bit faster. Twitter wanted it to go slower (because it's expensive for them), so the latency increased.
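
The rate-limit constraint works out as below. The sketch assumes the request budget is spread evenly across all tracked terms, one term per request.

```python
REQUESTS_PER_HOUR = 70   # "normal API limits of about 70 reqs/hour"
TRACKS = 26              # tracked terms quoted in the comment

def minutes_between_checks(tracks: int, requests_per_hour: int) -> float:
    """Minimum gap between two checks of the same term under an hourly budget."""
    return tracks / requests_per_hour * 60

gap_minutes = minutes_between_checks(TRACKS, REQUESTS_PER_HOUR)
print(f"each term checked about every {gap_minutes:.1f} minutes")
```

That comes to a little over 22 minutes per term, in the same range as the half hour mentioned, and any personal timeline polling out of the same budget would push it higher.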
  • Chris Kelley · 1 year ago
    That's sort of what I think track should do and used to do. But the one thing it doesn't do is push the results of the track to me. Before twitter turned it off, I would get an SMS anytime one of my track results came up. I used it quite a bit to find new people to follow in the early days. I do use the search and find it valuable, but it's something I have to find time to do rather than having the service working for me in the background all the time. The search based track fill-in always makes me feel like I am missing something.