Has anyone tried to parse the whole DMOZ RDF database into LinksSQL? I'm running a Pentium 750 with 256MB RAM and a 9GB SCSI hard disk. It has taken roughly 30 hours and has just completed 2 million links, 80,000 categories and 17,646 CategoryHierarchy entries. How many links should I expect from a full dump of the DMOZ directory, and how large would the database be at that point? I will check the MySQL files once it has completed. The DMOZ home page states 2,346,124 sites and 339,107 categories, but I don't think I'll be hitting the 339,107 categories. Or maybe there is a problem with the Parse_RDF.pl script; I have the latest release of this script, not the one that came with the distribution. Well, it's taken 20 hours to parse this RDF file, so I wonder how long it will take to build the database :)
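As a rough sanity check on those figures, here is a minimal Python sketch that compares my counts so far with the totals on the DMOZ home page. It assumes the parse rate stays constant and that the RDF dump holds the same totals as the home page. Both are my own assumptions and may well be wrong, and the function names are just illustrative.

```python
# Counts quoted in the post above.
links_done = 2_000_000
categories_done = 80_000
hours_elapsed = 30

# Totals stated on the DMOZ home page (assumed to match the RDF dump).
links_total = 2_346_124
categories_total = 339_107


def progress(done, total):
    """Fraction of the total processed so far."""
    return done / total


def hours_remaining(done, total, hours):
    """Hours left if the average rate so far holds (an assumption)."""
    rate = done / hours  # items per hour
    return max(total - done, 0) / rate


print(f"links:      {progress(links_done, links_total):.1%} done")
print(f"categories: {progress(categories_done, categories_total):.1%} done")
print(f"links left: about {hours_remaining(links_done, links_total, hours_elapsed):.1f} hours")
```

If the link count is nearly complete while the category count is still far behind, that would point at the categories rather than the links as the thing to investigate.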
I'd better check my settings before I try this; I wouldn't want to make many mistakes. Also, do I have to index the database so I can use page.cgi to view the directory? I can't remember whether that was what was needed, or whether I actually have to build the database, with any changes I make to the templates then being reflected through the page.cgi script. I think it's the latter, but anyway, someone will confirm this for me.
Thanks
Jason Xuereb