Raphael Matthias Krug wrote:
> Sasha
>
>> P.S. I have a theory that a habit of printing computer documentation
>> is a road block to becoming a "guru". At least, I have not yet
>> encountered a "guru" that printed much, while at the same time it
>> seems like a struggling user prints a lot. You cannot be 100% sure
>> about the cause and effect relationship, though, but trying to go
>> printless might activate something that speeds up skill acquisition.
>
>
> I just printed the soundex-parts. This was ten lines :-). For
> understanding my problem, see the text below.
> Shawn & Sasha
> I am working with medieval sources, so-called taxbooks. They contain
> names, tax amounts and other administrative entries. For my research I
> took nine of these taxbooks. One of my aims is to find out whether many
> taxpayers died, moved or simply stayed, e.g. with diseases. For this
> purpose, I inserted every taxbook into one table.
> To compare the persons in these books, a friend created a PHP script
> which takes the names from one book and compares them with the other
> books, right now using a normal select statement. The result shows a
> name on the left and then, as a table, a row for each taxbook, with a 1
> if the name appears in it.
Ralph:
I believe it is possible to get the results you want using an SQL query, but you
would need to organize your data in a different way. You will need to write a
script, either in PHP or in some other language, that parses your files and
indexes the soundex codes (or some other phonetic encoding) of the names. You
will need a structure that looks something like this:
create table soundex_idx (
    col_soundex char(4) not null,
    doc_id int not null,
    num_instances int not null,
    unique key (col_soundex, doc_id)
);

create table name_soundex (
    col_soundex char(4) not null,
    name varchar(30) not null,
    key (col_soundex)
);
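
As a rough sketch of what such an indexing script could look like, here is one in Python rather than PHP, with an in-memory SQLite database standing in for MySQL so that it runs on its own. The helper names (soundex, create_schema, index_document) and the sample names are mine, purely for illustration. The soundex function implements the standard four-character American Soundex, which is not guaranteed to match the output of MySQL's SOUNDEX() in every case, and it simply drops anything outside A-Z, which you may not want for medieval spellings.

```python
import sqlite3

# Digit classes of standard American Soundex.
_CODES = {ch: digit
          for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                                 ("L", "4"), ("MN", "5"), ("R", "6"))
          for ch in letters}


def soundex(name):
    """Return the 4-character American Soundex code, or "" if no A-Z letters."""
    letters = [c for c in name.upper() if "A" <= c <= "Z"]  # assumption: drop the rest
    if not letters:
        return ""
    code = letters[0]
    prev = _CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue              # H and W do not separate letters with equal codes
        digit = _CODES.get(ch, "")
        if digit and digit != prev:
            code += digit
        prev = digit              # a vowel resets prev, so a repeat is coded again
    return (code + "000")[:4]


def create_schema(conn):
    # SQLite spelling of the two tables; the MySQL "key" clauses become indexes.
    conn.executescript("""
        CREATE TABLE soundex_idx (
            col_soundex   CHAR(4) NOT NULL,
            doc_id        INT NOT NULL,
            num_instances INT NOT NULL,
            UNIQUE (col_soundex, doc_id)
        );
        CREATE TABLE name_soundex (
            col_soundex CHAR(4) NOT NULL,
            name        VARCHAR(30) NOT NULL
        );
        CREATE INDEX name_soundex_code ON name_soundex (col_soundex);
    """)


def index_document(conn, doc_id, names):
    """Record every name of one document (taxbook) in both tables."""
    for name in names:
        code = soundex(name)
        if not code:
            continue
        cur = conn.execute(
            "UPDATE soundex_idx SET num_instances = num_instances + 1 "
            "WHERE col_soundex = ? AND doc_id = ?", (code, doc_id))
        if cur.rowcount == 0:     # first occurrence of this code in this document
            conn.execute("INSERT INTO soundex_idx VALUES (?, ?, 1)", (code, doc_id))
        seen = conn.execute(
            "SELECT 1 FROM name_soundex WHERE col_soundex = ? AND name = ?",
            (code, name)).fetchone()
        if seen is None:          # store each distinct spelling only once
            conn.execute("INSERT INTO name_soundex VALUES (?, ?)", (code, name))
    conn.commit()


conn = sqlite3.connect(":memory:")
create_schema(conn)
index_document(conn, 1, ["Robert", "Smith", "Rupert"])  # made-up sample names
index_document(conn, 2, ["Smyth", "Meyer"])
```

In a real run you would read the names out of your existing taxbook table instead of passing literal lists, and use the taxbook number as doc_id.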
Then if, for example, you want to see all the names in document 1 that also
occur in document 2 (with equivalence defined by soundex), you can do the
following:
select distinct name_soundex.name
from soundex_idx a, soundex_idx b, name_soundex
where a.doc_id = 1
  and b.doc_id = 2
  and a.col_soundex = b.col_soundex
  and name_soundex.col_soundex = a.col_soundex;
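
To show how this query behaves on a tiny data set, here is a small Python sketch, again with SQLite standing in for MySQL. The rows are made up and their soundex codes are filled in by hand; names_in_both and presence_row are hypothetical helpers of mine. presence_row is one possible way to produce the 1/0 per taxbook that your current output shows, using the index instead of comparing raw names.

```python
import sqlite3

# The query from above, with the two document ids as parameters.
QUERY = """
    SELECT DISTINCT name_soundex.name
    FROM soundex_idx a, soundex_idx b, name_soundex
    WHERE a.doc_id = ? AND b.doc_id = ?
      AND a.col_soundex = b.col_soundex
      AND name_soundex.col_soundex = a.col_soundex
"""


def names_in_both(conn, doc_a, doc_b):
    """All recorded spellings whose soundex code occurs in both documents."""
    rows = conn.execute(QUERY, (doc_a, doc_b)).fetchall()
    return sorted(row[0] for row in rows)


def presence_row(conn, code, doc_ids):
    """1 or 0 per document, depending on whether the code occurs in it."""
    present = {row[0] for row in conn.execute(
        "SELECT doc_id FROM soundex_idx WHERE col_soundex = ?", (code,))}
    return [1 if doc_id in present else 0 for doc_id in doc_ids]


db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE soundex_idx (col_soundex CHAR(4) NOT NULL, doc_id INT NOT NULL,
                              num_instances INT NOT NULL,
                              UNIQUE (col_soundex, doc_id));
    CREATE TABLE name_soundex (col_soundex CHAR(4) NOT NULL,
                               name VARCHAR(30) NOT NULL);
""")
# Made-up sample: document 1 has Smith and Robert, document 2 has Smyth and Meyer.
db.executemany("INSERT INTO soundex_idx VALUES (?, ?, ?)",
               [("S530", 1, 1), ("R163", 1, 1), ("S530", 2, 1), ("M600", 2, 1)])
db.executemany("INSERT INTO name_soundex VALUES (?, ?)",
               [("S530", "Smith"), ("S530", "Smyth"),
                ("R163", "Robert"), ("M600", "Meyer")])

print(names_in_both(db, 1, 2))             # ['Smith', 'Smyth']
print(presence_row(db, "R163", [1, 2]))    # [1, 0]
```

Note that name_soundex carries no doc_id, so the query returns every recorded spelling that shares a code found in both documents, not only the spellings used in document 1.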
--
Sasha Pachev