Here's some quick-and-dirty parsing code for parsing the recently 'released' AOL search data. Instructions are specifically for SQL Server 2005 (as CSV import is a nightmare), but it should be ok for SQL Server 2000 and you can modify the Perl for MySQL I'd imagine.
1) Create a database called AOL
2) Create the table:
CREATE TABLE [dbo].[AllData](
[AnonID] [varchar](10)
COLLATE Latin1_General_CI_AS NULL,
[Query] [varchar](150)
COLLATE Latin1_General_CI_AS NULL,
[QueryTime] [varchar](50)
COLLATE Latin1_General_CI_AS NULL,
[ItemRank] [varchar](50)
COLLATE Latin1_General_CI_AS NULL,
[ClickURL] [varchar](250)
COLLATE Latin1_General_CI_AS NULL
) ON [PRIMARY]
3) Create a DSN for your database:
Control Panel -> Administrative Tools -> Data Sources (ODBC)4) Install ActivePerl or your favourite perl interpreter if you haven't already5) Install DBI and DBD::ODBC if you haven't already by running ppm3-bin from your Perl\bin directory (in the case of ActivePerl anyway) and type:install dbiinstall dbd-odbc6) Copy the following Perl into a file and run it. Change the filename (user-....) for each of the ten files and amend as required.use DBI;
use DBD::ODBC;
$dbh = DBI->connect('DBI:ODBC:aol', "", "");
if (!$dbh)
{
print "z $DBI::err\n$DBI::errstr\n$DBI::state";
}
my $sth = $dbh->prepare(<<SQL);
INSERT INTO AllData
(AnonID, Query, QueryTime, ItemRank, ClickURL)
VALUES (?, ?, ?, ?, ?)
SQL
$FILE = "user-ct-test-collection-01.txt";
open FILE;
$sText = <FILE>;
print "Entering loop\n";
while ($sText ne "")
{
my ($v1, $v2, $v3, $v4, $v5) = split /\t/, $sText;
#print "v1:$v1 v2:$v2 v3:$v3 v4:$v4 v5:$v5\n";
$sth->execute($v1, $v2, $v3, $v4, $v5);
$sText = <FILE>;
}
close FILE;
print "End\n";