RE: User Hash

From: Ben Grodsky <>
Date: Thu, 16 Aug 2007 18:02:51 -0700

I forgot to mention that I followed up with our engineering team. Here are the answers to the questions I told you I wasn't sure about:

        We are applying killwords to the raw data feeds that you are getting. The MS/Microsoft Office results you were seeing are just an example of killwords that weren't in our system at the time that feed was recorded for you.
        No, we can't provide you the eDonkey server IP in another field.
        Yes, we're still filling in the Kad hash user IDs. It turns out there was a byte order mismatch in our feed, so the 15 overlapping hash user IDs you saw between MD and BayTSP were not real matches, just statistical aberrations (called 'collisions'). Yesterday, 16 Aug 2007, is the first day the byte order for the hash IDs was properly unscrambled in your feed. We are, as I said, working backwards to fill in hash user IDs as far back as we have data. This is a slow process: the August data will be corrected first, then the portion of July data that we already provided, and then we will continue filling in that field as far back as we have it.
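
For illustration, here is a minimal sketch of what a byte order mismatch does to a hash ID. It assumes the ID is 128 bits and that the bytes are reversed within each 32-bit word; the message above does not say which form the mismatch actually took, and the function name and sample value are hypothetical.

```python
import struct


def swap_word_byte_order(hash_hex: str) -> str:
    """Reverse the byte order inside each 32-bit word of a 128-bit hash.

    Assumption: the mismatch is per 32-bit word. This is an illustrative
    guess, not a description of the actual feed.
    """
    raw = bytes.fromhex(hash_hex)
    if len(raw) != 16:
        raise ValueError("expected a 128-bit (16-byte) hash")
    # Read the 16 bytes as four little-endian 32-bit words ...
    words = struct.unpack("<4I", raw)
    # ... and write the same four words back out in big-endian order.
    return struct.pack(">4I", *words).hex()


# Hypothetical sample value, not taken from any real feed.
scrambled = "00112233445566778899aabbccddeeff"
unscrambled = swap_word_byte_order(scrambled)
print(unscrambled)
```

The two strings describe the same 16 bytes in different orders, so a scrambled ID would not be expected to match the same user's ID recorded correctly by another source. Applying the swap twice returns the original value.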


From: Skinner, Andrew (NBC Universal) []
Sent: Thu 16-Aug-07 17:43
To: Ben Grodsky
Subject: User Hash

Hi Ben,


It looks like my email is working again. Yay!



I just double-checked what type of hash Bay is recording. Can you guys start capturing the eMule user hash instead of the KAD hash? Here is what they said:


"It is the eMule user hash. It is not specific to either the traditional or Kad network, rather it is the client software's own.

It is valuable because it doesn't change from session to session in the application. Unless the user removes the configuration files or completely uninstalls and reinstalls the application, the eMule user hash remains constant.



This is what needs to be recorded.
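
As a rough illustration of why a hash that stays constant across sessions is useful, the sketch below groups sightings by user hash so that one client seen from several IP addresses is treated as a single user. The record layout, function name, and sample values are all hypothetical and are not taken from any real feed.

```python
from collections import defaultdict


def group_by_user_hash(records):
    """Map each user hash to the set of IP addresses it was seen from.

    `records` is an iterable of (user_hash_hex, ip_address) pairs; this
    layout is an assumption made for the example.
    """
    by_hash = defaultdict(set)
    for user_hash, ip in records:
        # Normalise case so the same hash always lands in the same bucket.
        by_hash[user_hash.lower()].add(ip)
    return dict(by_hash)


# Hypothetical sightings: the first hash appears from two addresses.
sightings = [
    ("0123456789ABCDEF0123456789ABCDEF", "192.0.2.10"),
    ("0123456789abcdef0123456789abcdef", "198.51.100.7"),
    ("fedcba9876543210fedcba9876543210", "203.0.113.5"),
]
grouped = group_by_user_hash(sightings)
print(len(grouped))
```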



Received on Fri Sep 14 2007 - 10:55:56 BST
