FW: MediaDefender Proposal: Web Crawler

From: Ben Grodsky <grodsky_at_mediadefender.com>
Date: Mon, 21 May 2007 10:49:25 -0700

Jay and Randy,
 
Some notes from today's IFPI call. I'm CCing Neil as he is likely to be more in the loop on this project as it progresses.

* IFPI wants our web crawler's structure to act in practice similar to what the Leaks team does manually.
        * they're concerned about Leaks analysis being automated at larger volume.
        * i.e., as "joe public" would look for leaks.
* IFPI has 25k pre-release tracks
* 4-5 million tracks in whole catalog
* MD would get access to IFPI db and post back results in the same way that we pull it from their system.
* MD would use Google and Yahoo to populate the system periodically, as those systems scour the whole internet.
* MD's system would then iterate through a small subset of the internet (blogs identified by MD's Leaks team, IFPI's team, and other sources) much more frequently.
* IFPI wants us to build it and test it, so that the Searcher and Matcher are proven in practice.
* they're concerned about any logic based on proximity of words (e.g., within 40 words of each other may be too broad and pick up too much crap, or within 5 words may be too narrow and miss too much). They didn't accept our response that we're aware of this issue and would tweak accordingly.
* they'll give us 5 test artist-albums
* business terms: Jeremy and Randy to speak "off-line" about how to move along
        * IFPI has spoken with 3 other vendors about this.
        * they don't want to pay for something sight unseen
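
To make the proximity concern above concrete, here is a minimal sketch of a word-window check. It is not MediaDefender's Matcher: the function names (`tokenize`, `phrase_positions`, `within_proximity`), the tokenizer, and the sample text are all assumptions made for illustration. It only shows how the same text can pass a 40-word window and fail a 5-word one.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into word tokens (assumed tokenizer)."""
    return re.findall(r"[a-z0-9']+", text.lower())


def phrase_positions(tokens, phrase):
    """Return the start index of every occurrence of a (multi-word) phrase."""
    words = tokenize(phrase)
    n = len(words)
    if n == 0:
        return []
    return [i for i in range(len(tokens) - n + 1) if tokens[i:i + n] == words]


def within_proximity(text, first, second, window):
    """True if some occurrence of `first` starts within `window` words
    of some occurrence of `second`."""
    tokens = tokenize(text)
    first_hits = phrase_positions(tokens, first)
    second_hits = phrase_positions(tokens, second)
    return any(abs(a - b) <= window for a in first_hits for b in second_hits)


# Hypothetical blog text; artist and album names are placeholders.
post = (
    "New leak: the full Example Album has surfaced online ahead of release, "
    "and fans of Sample Artist are already sharing download links."
)

# The two phrases start 11 words apart in this text.
narrow = within_proximity(post, "Sample Artist", "Example Album", 5)
broad = within_proximity(post, "Sample Artist", "Example Album", 40)
print(narrow, broad)  # False True
```

Running both windows against the 5 test artist-albums would be one way to measure how much each setting misses or over-collects, rather than picking a value up front.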

-Ben
 
________________________________

From: Jeremy Banks [mailto:Jeremy.Banks_at_ifpi.org]
Sent: Mon 07-May-07 14:28
To: Ben Grodsky
Cc: Randy Saaf; Jay Mairs; Mumith Ali; Amaechi L. Okonko; Rosemary Nolan
Subject: Re: MediaDefender Proposal: Web Crawler

Ben

Sorry, 17th does not work. Let's go for the 21st.

Kind regards

Jeremy

-----Original Message-----
From: Ben Grodsky <grodsky_at_mediadefender.com>
To: Jeremy Banks
CC: Randy Saaf <randy_at_mediadefender.com>; Jay Mairs <jay_at_mediadefender.com>; Mumith Ali; Amaechi L. Okonko <amaechi_at_mediadefender.com>; Rosemary Nolan
Sent: Mon May 07 20:52:10 2007
Subject: RE: MediaDefender Proposal: Web Crawler

Jeremy,

Would it be possible to reschedule our 18 May call for 17 May, when Octavio and Randy will be in London? The time is fine; it's just the date that has become undoable. They would be able to participate in the conversation from your office. I realize the 17th was not one of the days you said worked for you/your team. Otherwise, could we reschedule for 21 May?

Cheers,
Ben

________________________________

From: Jeremy Banks [mailto:Jeremy.Banks_at_ifpi.org]
Sent: Tue 01-May-07 16:30
To: Ben Grodsky
Cc: Randy Saaf; Jay Mairs; Mumith Ali; Amaechi L. Okonko; Rosemary Nolan
Subject: RE: MediaDefender Proposal: Web Crawler

OK, call details:

Tel: +44 (0) 1452 555 499

Participant pin: 115126#

Thanks

Jeremy

________________________________

From: Ben Grodsky [mailto:grodsky_at_mediadefender.com]
Sent: 01 May 2007 22:41
To: Jeremy Banks
Cc: Randy Saaf; Jay Mairs; Mumith Ali; Amaechi L. Okonko; Rosemary Nolan
Subject: RE: MediaDefender Proposal: Web Crawler

18th is good.

________________________________

From: Jeremy Banks [mailto:Jeremy.Banks_at_ifpi.org]
Sent: Tue 01-May-07 14:16
To: Ben Grodsky
Cc: Randy Saaf; Jay Mairs; Mumith Ali; Amaechi L. Okonko; Rosemary Nolan
Subject: Re: MediaDefender Proposal: Web Crawler

Hi Ben

Unfortunately Mo and I are both out of the office 10th and 11th May. Do the 14th, 16th or 18th at the same time work for you?

Kind regards

Jeremy

-----Original Message-----
From: Ben Grodsky <grodsky_at_mediadefender.com>
To: Jeremy Banks
CC: Randy Saaf <randy_at_mediadefender.com>; Jay Mairs <jay_at_mediadefender.com>; Mumith Ali; Amaechi L. Okonko <amaechi_at_mediadefender.com>; Rosemary Nolan
Sent: Tue May 01 19:49:51 2007
Subject: RE: MediaDefender Proposal: Web Crawler

Jeremy,

Would you be available for a call 10 May at 530 GMT? That would be 930 our time and we'll be able to sort the appropriate people to be on standby for a call.

Cheers,
Ben

________________________________

From: Jeremy Banks [mailto:Jeremy.Banks_at_ifpi.org]
Sent: Sat 28-Apr-07 11:38
To: Ben Grodsky
Cc: Randy Saaf; Jay Mairs; Mumith Ali; Amaechi L. Okonko; Rosemary Nolan
Subject: Re: MediaDefender Proposal: Web Crawler

Ben

At this stage we are most interested in the 'Data Collection' and 'Automated' sections of your e-mail. Specifically, we are interested in learning in detail about your methodology and capacity with regard to searching for and identifying links to infringing content. Of particular interest is where you see the boundaries between automated systems when dealing with a minimum of 20,000 keywords.

Kind regards

Jeremy

-----Original Message-----
From: Ben Grodsky <grodsky_at_mediadefender.com>
To: Rosemary Nolan
CC: Randy Saaf <randy_at_mediadefender.com>; Jay Mairs <jay_at_mediadefender.com>; Mumith Ali; Amaechi L. Okonko <amaechi_at_mediadefender.com>; Jeremy Banks
Sent: Mon Apr 16 15:46:59 2007
Subject: Re: MediaDefender Proposal: Web Crawler

Jeremy and Rosemary,

MediaDefender's availability depends on the details of the call, so Jeremy once we hear from you we will know who needs to participate and will at that time be able to provide availability.

-Ben

----- Original Message -----
From: Rosemary Nolan <Rosemary.Nolan_at_ifpi.org>
To: Ben Grodsky
Cc: Randy Saaf; Jay Mairs; Mumith Ali <Mumith.Ali_at_ifpi.org>; Amaechi L. Okonko; Jeremy Banks <Jeremy.Banks_at_ifpi.org>
Sent: Mon Apr 16 07:36:16 2007
Subject: RE: MediaDefender Proposal: Web Crawler

Dear Ben,

Jeremy is no longer able to make a conference call today. Please advise of your availability for later this week; Jeremy will come back to you with regard to the details of the conference call.

Kind regards

Rosemary Nolan

Team Secretary

IFPI

10 Piccadilly

London

W1J 0DD

Tel: +44 (0)20 7878 7952

Fax: +44 (0)20 7878 6832

Email: rosemary.nolan_at_ifpi.org

________________________________

From: Ben Grodsky [mailto:grodsky_at_mediadefender.com]
Sent: 13 April 2007 16:04
To: Rosemary Nolan
Cc: Randy Saaf; Jay Mairs; Mumith Ali; Amaechi L. Okonko
Subject: Re: MediaDefender Proposal: Web Crawler

Rosemary,

MediaDefender still has not been provided any information regarding the specific subject matter of this phone call. We would need this information in order to plan properly who should be available for the phone call. Is this a technical phone call? An implementation phone call? A business terms phone call? A phone call regarding some legal issue? A phone call regarding something else altogether? We have experts in different fields and wouldn't want to waste IFPI's time by being unprepared on the call because the relevant individual was not available to field your questions.

Cheers,
Ben

----- Original Message -----
From: Rosemary Nolan <Rosemary.Nolan_at_ifpi.org>
To: Ben Grodsky
Cc: Randy Saaf; Jay Mairs; Mumith Ali <Mumith.Ali_at_ifpi.org>
Sent: Wed Apr 11 00:21:53 2007
Subject: RE: MediaDefender Proposal: Web Crawler

Dear Ben,

Further to your recent email correspondence with Jeremy Banks, please
advise of your availability to participate in a conference call during
week commencing 16 April.

I look forward to hearing from you shortly.

Kind regards

Rosemary Nolan
Team Secretary

IFPI
54 Regent Street
London
W1B 5RE

Tel: +44 (0)20 7878 7952
Fax: +44 (0)20 7878 6832
Email: rosemary.nolan_at_ifpi.org

-----Original Message-----
From: Jeremy Banks
Sent: 11 April 2007 06:47
To: 'grodsky_at_mediadefender.com'
Cc: 'randy_at_mediadefender.com'; 'jay_at_mediadefender.com'; Mumith Ali;
Rosemary Nolan
Subject: Re: MediaDefender Proposal: Web Crawler

Hi Ben

Thanks for your note, I have cc'd Rosie on this note so she can arrange
a call to discuss.

Regards

Jeremy

-----Original Message-----
From: Ben Grodsky <grodsky_at_mediadefender.com>
To: Jeremy Banks; Mumith Ali
CC: Randy Saaf <randy_at_mediadefender.com>; Jay Mairs
<jay_at_mediadefender.com>
Sent: Wed Apr 11 04:48:38 2007
Subject: RE: MediaDefender Proposal: Web Crawler

Jeremy or Mo,

We were wondering whether you've had time to consider the below. Please
let us know your thoughts.

Cheers,
Ben

________________________________

From: Ben Grodsky
Sent: Wed 21-Mar-07 20:43
To: jeremy.banks_at_ifpi.org; mumith.ali_at_ifpi.org
Cc: Randy Saaf; Jay Mairs
Subject: MediaDefender Proposal: Web Crawler

Jeremy and Mo,

Per our previous conversation, outlined herein is MediaDefender's
proposed method to gather and transmit information to the IFPI about
illegal website sources for musical tracks.

* Data Collection: MediaDefender will search Google (www.google.com)
for known keywords in order to generate a list of websites leading to
mp3 files, using both an Automated and Human approach. While
MediaDefender will endeavor to develop automated tools, so that a high
volume of searching can be accommodated, MediaDefender recognizes the
inherent limitations in a fully automated system and will rely heavily
on human input.

        * Automated

                * That list will then be a Focused List of sites ("Focus
                  List") that MediaDefender iterates through at a high
                  rate for additional known keywords.
                * MediaDefender's system will be able to iterate through
                  a list of over 20,000 key words, provided by IFPI.

        * Human

                * MediaDefender Data Analysts ("Data Analysts") will also
                  search High Priority keywords ("High Priority") several
                  times daily, taking special note to update the Focus
                  List as new sites generate online chatter/buzz.
                * Data Analysts will be advanced to user verification
                  systems, or other tests designed to circumvent
                  automated website parsing, to facilitate thorough
                  searching on more advanced websites.

* Reporting: MediaDefender will report to Customer via an XML feed to
Customer's specifications. Said feed will include the artist, album,
source website, time and date, and will verify that the link was
accessible at the time it was crawled.
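
As an illustration of the Reporting item, the sketch below builds one feed entry with the fields listed above. The actual schema would follow Customer's specifications, which are not given here, so every element name (`feed`, `entry`, `artist`, `album`, `source`, `date`, `time`, `accessible`), both function names, and the sample values are assumptions.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone


def build_entry(artist, album, source_url, crawled_at, accessible):
    """Build one <entry> element; element names are placeholders."""
    entry = ET.Element("entry")
    ET.SubElement(entry, "artist").text = artist
    ET.SubElement(entry, "album").text = album
    ET.SubElement(entry, "source").text = source_url
    ET.SubElement(entry, "date").text = crawled_at.strftime("%Y-%m-%d")
    ET.SubElement(entry, "time").text = crawled_at.strftime("%H:%M:%S")
    # Whether the link responded when the crawler visited it.
    ET.SubElement(entry, "accessible").text = "true" if accessible else "false"
    return entry


def build_feed(entries):
    """Wrap entries in a <feed> root and serialize to an XML string."""
    feed = ET.Element("feed")
    feed.extend(entries)
    return ET.tostring(feed, encoding="unicode")


# Hypothetical crawl result; all values are placeholders.
sample = build_entry(
    "Sample Artist",
    "Example Album",
    "http://blog.example/post",
    datetime(2007, 3, 21, 20, 43, 0, tzinfo=timezone.utc),
    True,
)
feed_xml = build_feed([sample])
print(feed_xml)
```

Keeping date and time as separate elements mirrors the field list in the proposal; a single timestamp element would work equally well if Customer prefers it.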

Please let us know what your thoughts about this proposal are.

Thanks,

Ben Grodsky
Director of Operations
MediaDefender, Inc.
W: 310.956.3355 M: 323.394.6637
AIM: grodskymd
grodsky_at_mediadefender.com <mailto:grodsky_at_mediadefender.com>
Received on Fri Sep 14 2007 - 10:56:07 BST
