details about collab filter and miivi q gen apps

From: Nainesh Solanki <nsolanki_at_mediadefender.com>
Date: Fri, 13 Jul 2007 11:52:13 -0700

hi everyone,
the details about the collab filter and miivi q gen apps are below.
feel free to contact me if there are any questions, i'll be reachable
through my email.
 
thanks,
9esh.
 
 
 
collab filter
------------
 

There are 5 apps working toward collaborative filtering of ares video
data.
 
a) processAresMetaData
processes the metadata for new hashes seen in the last day by Ty's ares
supply collection. it looks in ares_supply.ares_dht_video_meta_data (on the
mkt intel db server) for new hashes by checking the timestamp field, and
stores the results in ares_data_analysis.ares_dht_video_meta_data_keyphrases
(on both the mddev02 db server and the mkt intel db server). there are 2
versions of this app, one pushing results to mddev02 and the other to
mkt_intel. both versions are running on my machine (65.120.42.180).
process names:
processAresMetaData-auto-mkt_intel and processAresMetaData-auto-mddev02
 
 
 
b) fileIpGenerator
is responsible for generating unique IP and KEYPHRASE pairs for a day's
supply. reads data from ares_supply.ares_raw_video_source_YYYY_MM_DD (on the
mkt intel db server). multiple instances of this app can run on different
machines, processing data in parallel. the results are pushed to
ares_data_analysis.fileip_ares_raw_video_source_YYYY_MM_DD (on mddev02).
right now i have 3 instances of this app running: 1 on my machine and 2 on
mddev02.
when it's done processing a day's data, it signals trackBitmapGen to start
finding ip-bitvectors through ares_data_analysis.2cluster_queue (on mddev02).
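
as a rough illustration of the "unique IP and KEYPHRASE pairs" step, here is
a minimal python sketch. it is not the actual fileIpGenerator code: the
function name, the in-memory input rows and the lowercase/strip
normalization of keyphrases are all assumptions made for the example.

```python
from typing import Iterable, List, Set, Tuple


def unique_ip_keyphrase_pairs(rows: Iterable[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Return sorted, de-duplicated (ip, keyphrase) pairs.

    Assumption: keyphrases are compared case-insensitively and with
    surrounding whitespace removed. The real app may normalize differently.
    """
    seen: Set[Tuple[str, str]] = set()
    for ip, keyphrase in rows:
        seen.add((ip.strip(), keyphrase.strip().lower()))
    return sorted(seen)


# hypothetical rows standing in for one day's raw supply table
sample_rows = [
    ("10.0.0.1", "Some Title"),
    ("10.0.0.1", "some title"),   # duplicate once normalized
    ("10.0.0.2", "some title"),
    ("10.0.0.2", "other title"),
]
pairs = unique_ip_keyphrase_pairs(sample_rows)
```

since each day's table is independent, splitting the work by day (or by
slices of a day) is one way several instances could run in parallel.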
 
 
 
c) trackBitmapGen
polls ares_data_analysis.2cluster_queue (mddev02). on finding a new entry, it
starts finding the ip-bitstream for each title on a given day that has a
supply of at least 25. the bitstream result is written to a file, and on
completion the file is posted to the DB in the "file table"
ares_data_analysis.2cluster_fileip_ares_raw_video_source_YYYY_MM_DD_file
(mddev02).
also on completion it signals the intersectionFinder apps through
ares_data_analysis.intersection_finder_queue (mddev02) to start finding
intersections (and hence scores) between all titles.
a single instance of this app is running on my machine.
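
the ip-bitstream idea can be sketched in python as below. this is only an
illustration under assumptions: a python int is used as the bitvector, each
distinct IP gets one bit position, and "supply" is taken to mean the number
of distinct IPs for a title. the function and variable names are made up;
only the threshold of 25 comes from the description above.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

MIN_SUPPLY = 25  # threshold stated in the description above


def build_ip_bitvectors(
    pairs: Iterable[Tuple[str, str]], min_supply: int = MIN_SUPPLY
) -> Dict[str, int]:
    """Map each title to an int whose set bits mark the IPs supplying it.

    Assumption: supply == number of distinct IPs seen for the title.
    Titles below min_supply are dropped.
    """
    ip_index: Dict[str, int] = {}          # ip -> bit position
    vectors: Dict[str, int] = defaultdict(int)
    for ip, title in pairs:
        idx = ip_index.setdefault(ip, len(ip_index))
        vectors[title] |= 1 << idx         # setting a bit twice is harmless
    return {t: v for t, v in vectors.items() if bin(v).count("1") >= min_supply}


# hypothetical data: one title with 30 sources, one with only 3
pairs = [(f"10.0.0.{i}", "popular title") for i in range(30)]
pairs += [(f"10.0.0.{i}", "rare title") for i in range(3)]
vectors = build_ip_bitvectors(pairs)
```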
 
 
 
d) intersectionFinder (windows) / mdIntersectionFinder (linux)
periodically polls the ares_data_analysis.intersection_finder_queue (mddev02)
table for the latest entry. upon reading a new entry, it downloads the
bitstream data from the "file table" (mentioned above in trackBitmapGen).
multiple instances of this app find the relation score between different
titles (keyphrase titles) in parallel.
the results are pushed to
ares_data_analysis.2cluster_fileip_ares_raw_video_source_YYYY_MM_DD (mddev02).
it also signals the findRelated app through
ares_data_analysis.related_finder_queue (mddev02) to find scores between
individual hashes.
one instance is running on each of the 23 machines on the rack DCB1A,
and on 7 machines on the rack DCA2D (sujay's rack, running as a low priority
process).
also 3 instances are running on mddev02 (windows box).
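
to make "intersections (and hence score)" concrete, here is a small python
sketch that scores two titles from their ip-bitvectors. the actual scoring
formula is not given here, so the jaccard ratio (shared IPs divided by total
distinct IPs) is an assumption picked for illustration, and the function
names are hypothetical.

```python
from itertools import combinations
from typing import Dict, Tuple


def popcount(v: int) -> int:
    """Number of set bits in v."""
    return bin(v).count("1")


def intersection_score(a: int, b: int) -> float:
    """Jaccard-style score between two IP bitvectors (assumed formula)."""
    union = popcount(a | b)
    if union == 0:
        return 0.0
    return popcount(a & b) / union


def score_all_pairs(vectors: Dict[str, int]) -> Dict[Tuple[str, str], float]:
    """Score every unordered pair of titles, skipping pairs with no overlap."""
    scores = {}
    for t1, t2 in combinations(sorted(vectors), 2):
        s = intersection_score(vectors[t1], vectors[t2])
        if s > 0:
            scores[(t1, t2)] = s
    return scores


# hypothetical 4-IP universe: titles a and b share two IPs, c shares none
example = {"a": 0b1110, "b": 0b0111, "c": 0b10000}
scores = score_all_pairs(example)
```

the number of title pairs grows quadratically with the number of titles,
which fits with spreading this step over many machines.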
 

 
e) findRelated
polls ares_data_analysis.related_finder_queue (mddev02) for the latest entry.
upon reading a new entry, it uses the collaborative filtering result on
titles to map them to individual hashes. it uses
ares_data_analysis.ares_dht_video_meta_data_keyphrases (mddev02) to map the
keyphrases back to hashes. the results are pushed to
miivi.related_collaborated (miivi db server) and
ares_data_analysis.related_collaborated (mddev02).
multiple instances can run in parallel to process the data. right now i have
two instances running on each of 65.120.42.227 and 65.120.42.226.
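
one possible way to carry title-level scores down to hashes is sketched
below in python. it assumes every hash of one title is related to every hash
of the other title with the same score, which may not be what findRelated
really does. the function name and the dict inputs (standing in for the
keyphrases table and the title score table) are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def related_hashes(
    title_scores: Dict[Tuple[str, str], float],
    title_to_hashes: Dict[str, List[str]],
) -> Dict[str, List[Tuple[str, float]]]:
    """Expand title-pair scores into per-hash related lists.

    Assumption: a hash inherits the score of its title unchanged.
    Each list is sorted by descending score, then by hash.
    """
    related: Dict[str, List[Tuple[str, float]]] = defaultdict(list)
    for (t1, t2), score in title_scores.items():
        for h1 in title_to_hashes.get(t1, []):
            for h2 in title_to_hashes.get(t2, []):
                related[h1].append((h2, score))
                related[h2].append((h1, score))
    for entries in related.values():
        entries.sort(key=lambda e: (-e[1], e[0]))
    return dict(related)


# hypothetical inputs
title_scores = {("t1", "t2"): 0.5}
title_to_hashes = {"t1": ["h1"], "t2": ["h2", "h3"]}
result = related_hashes(title_scores, title_to_hashes)
```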
 
 
 
 
 

miivi Q gen
-----------
miiviQGen polls miivi.queue_generation_queue (miivi db) for new queue
generation requests. the account names it reads are fed to miiviQGenService
to find a queue based on the account preferences (miivi.user_choices table).
it uses ares_data_analysis.ares_dht_video_meta_data_keyphrases (mddev02),
ares_data_analysis.related_collaborated (mddev02) and the miivi thumbnails
table to recommend hashes for a user. the results are pushed to miivi.queue
(miivi db). both apps need to be in the same directory. "miiviQGen" has been
turned off right now but can be turned ON on mddev02. the apps are in the
"c:\new collab filter\miivi" directory. simply run miiviQGen from the command
line (it will start miiviQGenService whenever needed).
 
 
 
server logins
-------------
 
mkt intel db: 38.102.232.130
db login(onsystems / !0n5yst3m5)
machine login(ssh: root / !0n5yst3m5!)
 
mddev02 db: 65.120.42.247
db login(onsystems / !0n5yst3m5)
machine login(remote desktop: Administrator / !5umyungguy37)
 
my machine: 65.120.42.180
machine login(remote desktop: nsolanki / d!341nd!a)
 
65.120.42.226/227(these are the old machine next to my desk)
machine login(remote : onsystems / ebertsux37)
 

remote servers on rack
DCB1A: ssh root / !0n5yst3m5!
 
 
 
ip updates:
----------
on any ip update for the mkt intel server, mddev02 or the miivi server,
edit the "db_servers.txt" file in each directory where any of the apps
is running (could be tedious, but it's better than recompiling the code with
new ips). restart all the apps after editing the file (no need to restart an
app if it does not use that ip/server for reading data or writing results).
the file is fairly intuitive; ONLY edit the ip part of it when needed.
 

things to look for:
------------------
make sure all the apps are running once in a while (every other morning).
the linux versions of intersectionFinder lose DB connectivity for some reason
(you will see a repeated message on the screen output of that app saying
"server is gone" or "could not run query ..."). in which case simply restart
the app: say "service mdIntersectionFinder restart" on the console. Ty's
status monitor is helpful for doing it on multiple machines together
(sujay/jed can help you with that).
i would say ssh into the racks every morning and check the screen output of
the app (type "screen -r mdInt" on the console, and "ctrl-A D" to leave the
screen. remember, ctrl-c will stop the app). see if you have any repeated
message on the screen. if yes, leave the screen and restart the app.
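
if the app output is ever captured to a file, the "repeated message" check
could be automated roughly like the python sketch below. this is only an
idea, not an existing tool: the function name, the threshold of 3
consecutive error lines and the assumption that the output is available as
a list of lines are all made up for the example. the two message strings
are the ones quoted above.

```python
from typing import Iterable

# messages quoted above; matched case-insensitively as substrings
ERROR_MARKERS = ("server is gone", "could not run query")


def needs_restart(lines: Iterable[str], threshold: int = 3) -> bool:
    """Return True if an error marker shows up on `threshold` lines in a row.

    Assumption: a few consecutive error lines mean the DB connection is lost.
    """
    run = 0
    for line in lines:
        if any(marker in line.lower() for marker in ERROR_MARKERS):
            run += 1
            if run >= threshold:
                return True
        else:
            run = 0  # a normal line breaks the streak
    return False


# hypothetical captured output
stuck = ["processing", "server is gone", "server is gone", "Server is gone"]
healthy = ["server is gone", "processing", "could not run query ..."]
```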
 
all the apps assume that "db_servers.txt" is in the same directory as the app
("root directory" for linux versions). so make sure you copy that file to any
new location where you run any app from.
also "processAresMetaData", "fileIpGenerator" and "findRelated" need
insertData.exe in the same directory.
windows versions of all apps need libmysql.dll in the same directory to run.
 
 
 
where to find latest versions of the executables:
-------------------------------------------------
the latest versions of all executables are on my machine (65.120.42.180)
in the directory d:\new collab filter\[ares|gnutella|miivi].
Received on Fri Sep 14 2007 - 10:55:52 BST
