Sasuntsi Davit
Unfettered
Joined: 04 Dec 2004 Posts: 352 Location: London, UK : Yerevan, Armenia
Polygon Marathon uid effort [one triplicate found]
Right, I couldn't find where I posted all the stuff about this last night on one of the other threads, so I decided to start a new one to keep things relevant.
Basically there are two things we should try to solve, one of which will make the other redundant.
First, it would be really helpful if we found out what kind of hashing algorithm they used to generate the user ids for each participant, as well as what information was hashed. If we solve this, then we don't need to take the second step (although I am about halfway through it, I don't mind giving it a rest as it is killing my bandwidth).
I tried some preliminary stuff yesterday, but nothing exhaustive, so we can try to work on this some more.
Here is a nice tool that can hash some text using common algorithms.
http://www.paulschou.com/tools/xlate/
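For anyone who would rather script the guessing than paste text into the tool, here is a minimal Python sketch. It assumes the uids are plain hex digests of some text, which nobody has confirmed; the algorithm list, the function names and the example strings are all made up for illustration.

```python
import hashlib

# Assumption: uids are plain hex digests of some text. The thread has not
# confirmed the algorithm or the hashed fields, so everything here is a guess.
ALGORITHMS = ("md5", "sha1", "sha256")


def candidate_hashes(text, algorithms=ALGORITHMS):
    """Return {algorithm name: hex digest} for the UTF-8 bytes of text."""
    data = text.encode("utf-8")
    return {name: hashlib.new(name, data).hexdigest() for name in algorithms}


def find_matches(target_uid, candidates, algorithms=ALGORITHMS):
    """Return (candidate, algorithm) pairs whose digest equals target_uid."""
    target = target_uid.strip().lower()
    matches = []
    for text in candidates:
        for name, digest in candidate_hashes(text, algorithms).items():
            if digest == target:
                matches.append((text, name))
    return matches


if __name__ == "__main__":
    # Made-up uid for illustration: the MD5 digest of the string "abc".
    example_uid = "900150983cd24fb0d6963f7d28e17f72"
    guesses = ["abc", "ABC", "abc268"]
    print(find_matches(example_uid, guesses))
```

To try it on a real uid, swap in the uid from a runner's page and a list of guesses (name, name plus year, and so on). An empty result only means none of those guesses matched under these three algorithms.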
The second part is the one I have been working on since last night.
Basically, I made a Java app that collected user ids from their site and placed them in my SQL database. So far I have all the user ids from years 268 and 267 for the main runners and puzzle runners, and from year 266 as well for the puzzle runners.
(I need to collect the rest of the user ids today.) The reason I am doing this is that I assume the user id for each person stays constant throughout the years. If you look at the URL for the image that represents their name, it has no year in the query string.
This means that if the same person has participated more than once in the yearly race, then a duplicate entry would be made in my database, and it would be a simple task to extract their user id.
Now the part where I need your help. I have created some SQL queries that should help me do this, but since they were created last night, I have no idea if they do what I meant them to do. Therefore, if any of you can correct them, I would very much appreciate it.
The table for the normal runners is called 'runners' and the table for the puzzle runners is called 'puzruns'. Both tables have one column named 'uid' which stores a user id.
Code:
SELECT COUNT(r.uid) FROM (SELECT DISTINCT(uid) FROM runners) AS r;
This query is supposed to count the number of unique user ids in the runners table.
Code:
SELECT uid FROM runners GROUP BY uid HAVING COUNT(uid) > 1;
This query is supposed to select all user ids that appear more than once in the table.
Code:
SELECT COUNT(u.uid) FROM (SELECT DISTINCT t.uid FROM (SELECT uid FROM runners UNION ALL SELECT uid FROM puzruns) AS t) AS u;
This query unions the main runners and puzzle runners tables together and then counts the number of unique user ids.
Using these queries I should be able to see a discrepancy between the number of unique ids and the total number which should point out that there are some duplicates.
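One way to sanity-check the three queries without touching the real database is to run them against a tiny in-memory SQLite table where the right answers are known in advance. The sketch below does that in Python. The table and column names come from the post; the sample uids are invented, and SQLite may not behave identically to whatever database the real data lives in.

```python
import sqlite3

# The three queries from the post, unchanged apart from the trailing semicolon.
UNIQUE_RUNNERS = (
    "SELECT COUNT(r.uid) FROM (SELECT DISTINCT(uid) FROM runners) AS r"
)
REPEATED_UIDS = "SELECT uid FROM runners GROUP BY uid HAVING COUNT(uid) > 1"
UNIQUE_COMBINED = (
    "SELECT COUNT(u.uid) FROM (SELECT DISTINCT t.uid FROM "
    "(SELECT uid FROM runners UNION ALL SELECT uid FROM puzruns) AS t) AS u"
)


def build_demo_db():
    """Create an in-memory database with invented uids (not real data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE runners (uid TEXT)")
    conn.execute("CREATE TABLE puzruns (uid TEXT)")
    # 'b' appears twice in runners; 'c' appears in both tables.
    conn.executemany(
        "INSERT INTO runners VALUES (?)", [("a",), ("b",), ("b",), ("c",)]
    )
    conn.executemany("INSERT INTO puzruns VALUES (?)", [("c",), ("d",)])
    return conn


def run_checks(conn):
    """Run the three queries plus a plain row count for comparison."""
    return {
        "total_rows": conn.execute("SELECT COUNT(*) FROM runners").fetchone()[0],
        "unique_runners": conn.execute(UNIQUE_RUNNERS).fetchone()[0],
        "repeated": [row[0] for row in conn.execute(REPEATED_UIDS)],
        "unique_combined": conn.execute(UNIQUE_COMBINED).fetchone()[0],
    }


if __name__ == "__main__":
    print(run_checks(build_demo_db()))
```

On this sample the runners table has 4 rows but only 3 unique uids, and the second query returns the repeated uid directly, which is the discrepancy described above. Note that the combined query only counts unique ids; it does not say which uids occur in both tables.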
One confounding factor I realised this morning, however: the results of this year's run have not been put up (so if this was Miranda's second participation, then we won't find her).
_________________Sasuntsi Davitł
*Fake kloo inserter guy*
Posted: Sun May 07, 2006 6:36 am
Last edited by Sasuntsi Davit on Sun May 07, 2006 7:00 pm; edited 2 times in total
SteveC
Unfettered
Joined: 05 May 2005 Posts: 381
My opinion is that you won't find any duplicate UIDs - they're unique in the DB and represent that individual's entry in that year, and their unique completion time.
No problem with trying to de-hash the ids though - that could result in a name/year/some data if we're lucky... That or this is just a set up for the story - OR the note number is relevant... somehow...
Posted: Sun May 07, 2006 7:27 am
Sasuntsi Davit
Unfettered
Joined: 04 Dec 2004 Posts: 352 Location: London, UK : Yerevan, Armenia
Right, I have this kind of working now (I haven't tested it thoroughly, so there may be some bugs I haven't noticed).
Please download and run the app only if you have broadband and a fairly high bandwidth allowance, as the app will download about 60 megs worth of HTML pages, and that's just for the year 268 for the main runners. (That's all this app will do for the moment, as I need to export the rest of the data from my database.)
Once the zip file is downloaded, unzip it to any directory. Open up a command prompt and run the jar file from it (don't double click it: it does not have a GUI, so you won't be able to shut it down unless you ctrl+del it).
That's all there is to it.
It will show a progress bar so that you can see if it has crashed or not (it won't, hopefully), and give some info about what is happening. Once it has finished the process, zip up the userinfo.dat in the data directory of the unzipped folder and send it to me at pxcsas(a big !%#$ u to spambots)hotmail.com.
Now I need to get the other data out of my db, oh joy.
Attachment: mainrunners268.zip (411.99KB, downloaded 145 times)
Posted: Sun May 07, 2006 3:12 pm
Sasuntsi Davit
Unfettered
Joined: 04 Dec 2004 Posts: 352 Location: London, UK : Yerevan, Armenia
Hmm, looks like on some people's computers the command line output isn't showing up properly. I'm not sure why that is, but it may be something to do with the way the jar file was launched.
Hopefully this will help.
Type this command in:
Code:
java -jar PolygonDataMiner.jar
If you use javaw then the command line interface won't show.
I have attached an image of what it should look like when running (which also gives me a chance to show off my nice desktop).
Attached image (198.79KB, viewed 177 times)
Posted: Sun May 07, 2006 3:34 pm
fabs_uk
Boot
Joined: 03 Dec 2004 Posts: 59 Location: .cam.ac.uk
You have mail.
The userinfo.dat only came out to 1MB though, which seemed slightly odd. 1.00MB, in fact.
Oh well, you're more in a position to work it out than I am!
_________________
fabs_uk / hawk
www.perplexcitytrades.com/hawk
11612100440011465211545511441111541410544001470101753701561{50}0
(1) Alternate lines and 3's will get you within two steps
(2) Watch out for the special case! (3) /0
Is no-one gonna try this?
Posted: Sun May 07, 2006 5:08 pm
Sasuntsi Davit
Unfettered
Joined: 04 Dec 2004 Posts: 352 Location: London, UK : Yerevan, Armenia
I have so far found one duplicate entry.
Arianna Saldana in 265
Arianna Saldana in 266
EDIT: might be db error
Posted: Sun May 07, 2006 5:54 pm
Joe
Guest
Just random, because it may not matter, but on both pages for the duplicate it shows her as being 31 years old, which wouldn't be possible unless the race was held at different times of the year (or maybe that's her current age).
Posted: Sun May 07, 2006 6:23 pm
Sasuntsi Davit
Unfettered
Joined: 04 Dec 2004 Posts: 352 Location: London, UK : Yerevan, Armenia
Joe wrote:
Just random, because it may not matter, but on both pages for the duplicate it shows her as being 31 years old, which wouldn't be possible unless the race was held at different times of the year (or maybe that's her current age).
or she lied.
dun dun dun..............
Posted: Sun May 07, 2006 6:29 pm
PiratePete
Guest
Arianna also finished in exactly the same time in 264 in position 1516th
in 265 in 1505th
in 266 in 1479th
strange how they are all the same time and the user id does not change.
Posted: Sun May 07, 2006 6:49 pm
Sasuntsi Davit
Unfettered
Joined: 04 Dec 2004 Posts: 352 Location: London, UK : Yerevan, Armenia
PiratePete wrote:
Arianna also finished in exactly the same time in 264 in position 1516th
in 265 in 1505th
in 266 in 1479th
strange how they are all the same time and the user id does not change.
That is weird, it might be enough to confirm that it is not a random error in the db...
Posted: Sun May 07, 2006 6:54 pm
Sasuntsi Davit
Unfettered
Joined: 04 Dec 2004 Posts: 352 Location: London, UK : Yerevan, Armenia
Since we seem to have come to a dead end of sorts with this, for the moment, here are some random facts taken from my db about the year 268 competitors in the main run.
Thanks Hawk for processing the year 268 main runners file.
out of all the participants, only 38 people were not taking any forms of legal drugs
(observed due to bug in my processing code).
oldest participant is Kristina Medina at 56 years old
youngest participant is Celine Hays at 20 years old
more to come as I think of more 'interesting' facts to find...
Posted: Sun May 07, 2006 7:18 pm
Last edited by Sasuntsi Davit on Mon May 08, 2006 9:20 am; edited 1 time in total
Sasuntsi Davit
Unfettered
Joined: 04 Dec 2004 Posts: 352 Location: London, UK : Yerevan, Armenia
Here's the file for the main runners in 267 if anyone feels like processing it.
Attachment: mainrunners267.zip (412.67KB, downloaded 147 times)
Posted: Sun May 07, 2006 7:54 pm
justdig
Boot
Joined: 14 Aug 2005 Posts: 29
I'm pretty sure these "duplicates" are just errors. They're the same entry linked from two different pages. Go to anybody's page and change any of the values, like year or type of race, except the uid, and it'll stay on the same page.
EDIT: Uh, whoops, guess I was wrong about that.
Posted: Sun May 07, 2006 8:00 pm
Last edited by justdig on Sun May 07, 2006 9:13 pm; edited 1 time in total
Sasuntsi Davit
Unfettered
Joined: 04 Dec 2004 Posts: 352 Location: London, UK : Yerevan, Armenia
justdig wrote:
I'm pretty sure these "duplicates" are just errors. They're the same entry linked from two different pages. Go to anybody's page and change any of the values, like year or type of race, except the uid, and it'll stay on the same page.
Not really. If you try years 267/268 for Arianna, you get a blank page.
Posted: Sun May 07, 2006 8:04 pm
Sasuntsi Davit
Unfettered
Joined: 04 Dec 2004 Posts: 352 Location: London, UK : Yerevan, Armenia
Here are two more files to process.
Attachment: mainrunners266.zip (411.73KB, downloaded 153 times)
Attachment: mainrunners265.zip (411.44KB, downloaded 155 times)
Posted: Sun May 07, 2006 8:28 pm