Discussion:
Cleanse bookmark file
parv
19 years ago
(Originally, I posted the below to o.general; later somebody
suggested posting it here, w/ which I agreed.)

Below is a small Perl program to scrub the bookmark file. Currently,
it lowercases & shortens the bookmark names (titles) to 36 characters,
and removes icon references, creation dates, & descriptions. Adjust as
you like.

- parv

#!/usr/local/bin/perl -i--OLD
# Edit the given files in place; keep a backup of each by adding
# '--OLD' to its name.

use warnings; use strict;

die "No Opera bookmark files given\n" unless scalar @ARGV;

my $max_title_length = 36;

while (<>)
{
  # Remove some items. The patterns are anchored to the start of the
  # line so that a URL containing, say, 'CREATED=' is not dropped.
  next if m/^ \s* CREATED=/x
       or m/^ \s* ICONFILE=/x
       or m/^ \s* DESCRIPTION=/x
       ;

  # Print everything other than bookmark names as is; leave the
  # 'Trash' name alone too.
  if ( $_ !~ m/^ \s* NAME=/x or m/^ \s* NAME= (?i:Trash) /x )
  { print; next; }

  chomp;

  # Try to minimize loss of meaningful text.
  s/(?:problem|bug) \s+ (?:report|ticket)s?/pr/xi;
  s/\. (?:org|com|net|gov|mil) \b//xi;

  # Keep at most $max_title_length characters of the name, lowercased.
  s/(?<=NAME=)(.{1,$max_title_length}).*/lc $1/e;

  s/\s+$//;

  print $_ , "\n";
}
__END__
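
To see what the name rewrite does, the substitutions can be tried on a single line in isolation. This is only a sketch: shorten_name is a made-up helper (it is not part of the program above), and the sample title is invented.

```perl
use warnings; use strict;

my $max_title_length = 36;

# Hypothetical helper: apply the name substitutions to one line.
sub shorten_name
{
  my ( $line ) = @_;

  # Abbreviate, then drop a common top level domain.
  $line =~ s/(?:problem|bug) \s+ (?:report|ticket)s?/pr/xi;
  $line =~ s/\. (?:org|com|net|gov|mil) \b//xi;

  # Keep at most $max_title_length characters after 'NAME=', lowercased.
  $line =~ s/(?<=NAME=)(.{1,$max_title_length}).*/lc $1/e;

  # A cut may leave a space at the end.
  $line =~ s/\s+$//;

  return $line;
}

# Invented sample line, with a leading tab.
print shorten_name
  ( "\tNAME=FreeBSD.org Problem Reports - Query Interface For Open Tickets" )
  , "\n";
```

The cut is made by character count, so it can land in the middle of a word: the sample comes out as 'NAME=freebsd pr - query interface for ope'.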
--
As nice as it is to receive personal mail, too much sweetness causes
tooth decay. Unless you have a burning desire to contact me, do not do
away w/ WhereElse in the address for private communication.
parv
19 years ago
in message <***@localhost.holy.cow>,
wrote parv ...
Post by parv
Below is a small perl program to scrub the bookmark file. Currently,
it lowercase-s & shortens the bookmark names|titles to 36 characters,
removes icon references, creation dates, & description. Adjust as you
like.
The program has been updated to also delete 'ID', which Opera (8.5)
regenerates by itself; the regexen are now kept in a hash, mainly so
that they are compiled before entering the loop.

- parv


#!/usr/local/bin/perl -i--OLD

use warnings; use strict;

die "No Opera bookmark files given\n" unless scalar @ARGV;

my $max_title_length = 36;

my $end_space = qr/\s+$/;

# Field names are anchored to the start of the line so that the same
# text inside a URL or a title is not mistaken for a field.
my %regex =
  ( 'delete' => qr/^ \s* (?:I(?:D|CONFILE)|DESCRIPTION|CREATED) =/x

  , 'trash' => qr/^ \s* NAME= (?i:Trash) /x

  , 'name' => qr/^ \s* NAME=/x
  , 'name_value' => qr/(?<=NAME=)(.{1,$max_title_length}).*/
  , 'bug_report' => qr/(?:problem|bug) \s+ (?:report|ticket)s?/xi
  , 'domain' =>
      qr/\.(?:org|com|net|gov|mil| co\.(?:uk|jp|in) )\b/xi
  ) ;

while (<>)
{
  # Remove some items.
  next if m/$regex{'delete'}/ ;

  # Print everything other than bookmark names as is; leave the
  # 'Trash' name alone too.
  if ( $_ !~ m/$regex{'name'}/ or m/$regex{'trash'}/ )
  { print; next; }

  chomp;

  # Try to minimize loss of meaningful text.
  s/$regex{'bug_report'}/pr/;
  s/$regex{'domain'}//;

  # Keep at most $max_title_length characters of the name, lowercased.
  s/$regex{'name_value'}/lc $1/e;

  s/$end_space//;

  print $_ , "\n";
}

__END__
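
One thing to watch when picking the lines to delete: a pattern that starts with \b finds the key anywhere on the line, while one anchored with ^ \s* only accepts it as the field name. A small comparison on made-up lines (the real file may look different; $by_boundary and $anchored are names chosen just for this sketch):

```perl
use warnings; use strict;

# The same list of keys, located two different ways.
my $by_boundary = qr/\b (?:I(?:D|CONFILE)|DESCRIPTION|CREATED) =/x;
my $anchored    = qr/^ \s* (?:I(?:D|CONFILE)|DESCRIPTION|CREATED) =/x;

# Made-up sample lines; note the 'ID=' inside the URL.
my @lines =
  ( "\tID=42"
  , "\tNAME=Example"
  , "\tURL=http://www.example.com/view?ID=7"
  , "\tCREATED=1130000000"
  ) ;

# Collect the lines each pattern would drop.
my @drop_boundary = grep { m/$by_boundary/ } @lines;
my @drop_anchored = grep { m/$anchored/ } @lines;

printf "boundary drops %d, anchored drops %d\n"
  , scalar @drop_boundary , scalar @drop_anchored;
```

With these four lines the \b version drops three (the URL among them) and the anchored version drops two.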
n***@slaskpost.se
19 years ago
Why delete ID if Opera sets new IDs at next startup?
parv
19 years ago
Post by n***@slaskpost.se
Why delete ID if Opera sets new IDs at next startup?
Mainly to compact the file; that was also the reason to delete the
other lines matched by $regex{'delete'}, as I see it.

I think I will write an Opera (8.5) bookmark parser based on the
Parse::RecDescent module Real Soon, so that others can modify the
file as they see fit.
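
Until then, a rough idea of what such a parser would hand back can be had in plain Perl. Note that this does NOT use Parse::RecDescent; it is a line-by-line sketch under the assumption that a '#URL' or '#FOLDER' line starts a record and is followed by indented KEY=value lines. The sample data and the name parse_records are made up.

```perl
use warnings; use strict;

# Made-up sample in the assumed layout.
my $sample = <<"END";
#FOLDER
\tID=11
\tNAME=Perl

#URL
\tID=12
\tNAME=CPAN
\tURL=http://www.cpan.org/
\tCREATED=1130000000
END

# Hypothetical helper: turn the text into a list of records, each a
# hash with the record type and its fields.
sub parse_records
{
  my ( $text ) = @_;
  my @records;

  for my $line ( split m/\n/ , $text )
  {
    # A '#TYPE' line opens a new record.
    if ( $line =~ m/^ \# (\w+) /x )
    { push @records , { 'type' => $1 , 'fields' => {} }; }

    # A KEY=value line belongs to the record opened last.
    elsif ( @records and $line =~ m/^ \s* ([^=]+?) = (.*) $/x )
    { $records[-1]{'fields'}{ $1 } = $2; }
  }
  return @records;
}

my @records = parse_records( $sample );
print scalar @records , " records; second is named "
  , $records[1]{'fields'}{'NAME'} , "\n";
```

Once the file is in this form, dropping or rewriting a field is a matter of changing a hash entry and writing the records back out.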

- parv