
So I had the idea to generate fake Twitter pages from a user’s old tweets. I wrote a bunch of Perl scripts to do just that. It worked out well.

The process essentially consists of four steps:

  1. Acquire a user’s tweets.
  2. Train a fourth-order Markov model with those tweets.
  3. Generate new tweets by having the Markov model spew out new chains.
  4. Generate a Twitter page from those tweets.

Except for the last step, I’ve written scripts to do this. (Writing one for the last step wouldn’t be hard, just boring, and it’s easy enough to do by hand.)
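To make steps 2 and 3 a bit more concrete, here is a minimal sketch of a word-level Markov chain in plain Perl. It is not the code used for the fake pages (that relies on a CPAN module, see below); the `train` and `generate` helpers and the fallback to shorter contexts are assumptions made for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Illustrative sketch only: an order-$order word-level Markov chain.
# train() and generate() are hypothetical helpers, not a library API.

# Build a table mapping each context (up to $order preceding words)
# to the list of words observed right after it.
sub train {
    my ( $order, @texts ) = @_;
    my %model;
    for my $text (@texts) {
        my @words = ( "START", split( " ", $text ), "END" );
        for my $i ( 1 .. $#words ) {
            my $from = $i - $order < 0 ? 0 : $i - $order;
            # Record the successor under every context length 1..$order
            # so generation can fall back to shorter contexts.
            for my $start ( $from .. $i - 1 ) {
                my $context = join " ", @words[ $start .. $i - 1 ];
                push @{ $model{$context} }, $words[$i];
            }
        }
    }
    return \%model;
}

# Walk the chain from START until END (or $max_words as a safety cap).
sub generate {
    my ( $model, $order, $max_words ) = @_;
    my @chain = ("START");
    while ( $chain[-1] ne "END" && @chain <= $max_words ) {
        my $longest = @chain < $order ? scalar @chain : $order;
        my $next;
        # Try the longest available context first, then shorter ones.
        for my $len ( reverse 1 .. $longest ) {
            my $context = join " ", @chain[ -$len .. -1 ];
            my $choices = $model->{$context} or next;
            $next = $choices->[ rand @$choices ];
            last;
        }
        last unless defined $next;
        push @chain, $next;
    }
    # Drop the START marker and, if present, the END marker.
    shift @chain;
    pop @chain if @chain && $chain[-1] eq "END";
    return join " ", @chain;
}

my $model = train( 4, "good morning twitter", "good morning everyone out there" );
print generate( $model, 4, 50 ), "\n";
```

With only two training sentences the output is always one of the inputs; recombination only appears once many tweets share words and phrases.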

Step 1: Obtaining tweets.

Essentially, it’s a loop that gets the tweets via Twitter’s API, page by page. Simple enough. The only real problem is Twitter’s rate limiting, so grabbing more than one user per hour does not work.

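#!/usr/bin/perl
use strict;
use warnings;

my $max_page = 200;
my $start_page = 0;
my $user = "username";

# Fetch the timeline page by page and append each JSON array to tweets.json.
for( my $page = $start_page; $page < $max_page; $page++ ) {
        # Quote the URL so the shell does not treat "?" as a glob character.
        my $cmd = "wget 'http://twitter.com/statuses/user_timeline/$user.json?page=$page' -O - >> tweets.json";
        system($cmd) == 0 or warn "wget failed for page $page\n";
}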
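The loop above always requests every page up to `$max_page`, even if the user has far fewer tweets, which eats into the rate limit. One possible variant, sketched below, stops at the first empty page and pauses between requests. The `fetch_all` helper, the callback interface, and the delay are assumptions for illustration, as is the premise that a page past the end of the timeline comes back as an empty JSON array (`[]`). A stub stands in for the real HTTP request so the sketch runs offline.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: call $fetch->($page) for successive pages and
# collect the raw JSON bodies, stopping early at the first empty page.
# Assumption: a page past the end of the timeline comes back as "[]".
sub fetch_all {
    my ( $fetch, $start_page, $max_page, $delay ) = @_;
    my @bodies;
    for my $page ( $start_page .. $max_page - 1 ) {
        my $body = $fetch->($page);
        last unless defined $body;             # request failed
        last if $body =~ /^\s*\[\s*\]\s*$/;    # empty page: nothing more to get
        push @bodies, $body;
        sleep $delay if $delay;                # be gentle with the rate limit
    }
    return @bodies;
}

# Stub standing in for the real request.
my @fake_pages = ( '[{"text":"one"}]', '[{"text":"two"}]', '[]', '[{"text":"never"}]' );
my @bodies = fetch_all( sub { $fake_pages[ $_[0] ] }, 0, 200, 0 );
print scalar(@bodies), " pages fetched\n";
```

In real use the callback could shell out to `wget -q -O -` as in the script above and return its output.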

Step 2 and 3: Training and generating.

This script takes the JSON produced by the script in step 1 as input and outputs generated, fake tweets, one per line.

Since I was too lazy to implement a Markov chain myself, I used a library off CPAN to do the heavy lifting.

#!/usr/bin/perl
use strict;
use warnings;

use JSON;
use Algorithm::MarkovChain;

# Parse JSON. The input is several JSON arrays (one per page) written
# back to back, so join them into a single array first.
my $tweetsJson = join( "", <> );
$tweetsJson =~ s/\]\[/,/g;
# decode_json expects UTF-8 encoded bytes and returns character strings.
my $tweets = decode_json( $tweetsJson );

# Train, tracking chains of up to four symbols
my $user = Algorithm::MarkovChain->new();
foreach my $tweet (@{$tweets}) {
    my @symbs = ("START", split( " ", $tweet->{text}), "END" );
    $user->seed(
        symbols => \@symbs,
        longest => 4
    );
}

# Generate 20 tweets
binmode STDOUT, ":utf8";
for( my $i = 0; $i < 20; $i++ ) {
    # Ask for longer and longer chains until one ends in "END".
    my @generated = ("START");
    my $l = 1;
    while( $generated[-1] ne "END" ) {
        @generated = $user->spew(
            length => $l,
            complete => \@generated
        );
        $l++;
    }
    # Strip the START and END markers before printing.
    @generated = @generated[1..(@generated-2)];
    print join( " ", @generated ) . "\n";
}
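The `][` substitution in the script above breaks as soon as one of the pages is empty: `[a][]` turns into `[a,]`, which is not valid JSON. One alternative, sketched here under the assumption that the input is nothing but JSON arrays written back to back, is to let an incremental parser split them. `JSON::PP` ships with Perl and provides `incr_parse`; the `parse_pages` helper name is made up for this example.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use JSON::PP;

# Hypothetical helper: take the raw bytes of several JSON arrays written
# back to back and return one flat list of the elements of all arrays.
sub parse_pages {
    my ($bytes) = @_;
    my $json = JSON::PP->new->utf8;
    # In list context incr_parse extracts every complete JSON text
    # it can find in the input.
    my @pages = $json->incr_parse($bytes);
    return map { @$_ } @pages;
}

my @tweets = parse_pages(qq([{"text":"first"}]\n[]\n[{"text":"second"},{"text":"third"}]));
print "$_->{text}\n" for @tweets;
```

The resulting list can be fed to the training loop in place of `@{$tweets}`.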

Step 4: Generating a fake twitter page.

This consists of two parts: making the tweets into Twitter-style HTML, and adding what comes before and after the tweets in a Twitter HTML page. For the former, I wrote a small script, again, which mostly just concatenates a lot of text. I’ve put it here if you want it (save it as “mktwpage.pl”).

The second part I’ve done by hand thus far, assisted by my browser’s “Save page” feature. Too lazy to automate.
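Turning generated tweets into HTML is mostly string concatenation plus escaping. The following is a rough sketch of that idea, not the mktwpage.pl script mentioned above; the markup and the `timeline`/`tweet` class names are invented placeholders and do not mirror Twitter’s actual page structure.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Escape the characters that are special in HTML text and attributes.
sub escape_html {
    my ($text) = @_;
    my %entity = ( '&' => '&amp;', '<' => '&lt;', '>' => '&gt;', '"' => '&quot;' );
    $text =~ s/([&<>"])/$entity{$1}/g;
    return $text;
}

# Hypothetical markup: one list item per tweet. The class names are
# invented for this sketch.
sub tweets_to_html {
    my @tweets = @_;
    my $items = "";
    $items .= '  <li class="tweet">' . escape_html($_) . "</li>\n" for @tweets;
    return qq(<ol class="timeline">\n$items</ol>\n);
}

print tweets_to_html( "I <3 Perl & Markov chains", "good morning everyone" );
```

The output would then be pasted between the header and footer taken from a saved page.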

And, there you have it: Autogenerated fake twitter pages. Halfway convincing, too. Go generate your own!

Something I sort of completely forgot about.

At the GulaschProgrammierNacht 10, the annual hacker event of Entropia e.V./CCC Karlsruhe, I gave a talk about modern OpenGL. While the talk itself wasn’t delivered very well (I rushed through everything aaaaaaaaah oh god), the base material is actually a pretty okay introduction to modern OpenGL, without the fixed-function pipeline, as it should be.

You can download the English version of the slides here and, probably more interestingly, grab a copy of the demo program I made, which is very well commented and does nothing but render a rotating cube, as seen below.

If you’re the type that learns best from examples, have a look.

Also, if you need more writing and explanation, have a look at Joe’s blog, where a longer explanation of basically the same thing is provided. :)

Cube. Lighted.

(Additional links to a video recording of the talk [German] and the German version of the slides after the image)
