MySQL Lists are EOL. Please join:

List:Internals« Previous MessageNext Message »
From:Erik Wetterberg Date:December 10 2006 8:52pm
Subject:Re: load XML data
View as plain text  
*MySQL XML import test drive*
Here is a quick test drive for the load xml command, just to show you how it
works.

*Creating the test tables*

Start by creating the test table with the following command:

create table test.person(
person_id int not null,
fname varchar(40) null,
lname varchar(40) null,
created timestamp,
primary key(person_id));

Now we are ready for some testing!

*First test: data as attributes*

For this test we use file person.xml, with the following content:

<list>

<person person_id='1' fname="Erik" lname="Wetterberg"/>

<person person_id='2' fname="Sven" lname="Svensson"/>

</list>

To read this into MySQL, we use the following command:

mysql> load xml local infile 'person.xml' into table test.person

-> rows identified by '<person>';

Query OK, 2 rows affected (0.08 sec)

Records: 2 Deleted: 0 Skipped: 0 Warnings: 0

Check the result with the following command:

mysql> select * from test.person;

+-----------+-------+------------+---------------------+

| person_id | fname | lname | created |

+-----------+-------+------------+---------------------+

| 1 | Erik | Wetterberg | 2006-12-05 18:17:22 |

| 2 | Sven | Svensson | 2006-12-05 18:17:22 |

+-----------+-------+------------+---------------------+

2 rows in set (0.00 sec)

The attribute values have been matched against column values, and records
inserted. The clause "l rows identified by '<person>'" is used to identify
the tags that should match a row, one row is created for every <person> tag.
Columns not included in the xml will be filled with their default values.
XML attributes not in the database table will be ignored.

*Second test: data as tags*

But XML files have different formats, some use attributes for values, some
use tags and some use both. Our second test illustrates that. For this we
use the test file person2.xml, whith the following content:

<?xml version="1.0"?>

<list>

<person
person_id='3'><fname>Björn</fname><lname>Bergman</lname></person>

<person
person_id='4'><fname>Sture</fname><lname>Larsson</lname></person>

</list>

To run it we use (almost) the same command:

mysql> load xml local infile 'person2.xml' into table test.person

-> rows identified by '<person>';

Query OK, 2 rows affected (0.00 sec)

Records: 2 Deleted: 0 Skipped: 0 Warnings: 0

To check the result, we use the following command:

mysql> select * from test.person;

+-----------+--------+------------+---------------------+

| person_id | fname | lname | created |

+-----------+--------+------------+---------------------+

| 1 | Erik | Wetterberg | 2006-12-05 18:17:22 |

| 2 | Sven | Svensson | 2006-12-05 18:17:22 |

| 3 | Björn | Bergman | 2006-12-05 18:27:46 |

| 4 | Sture | Larsson | 2006-12-05 18:27:46 |

+-----------+--------+------------+---------------------+

4 rows in set (0.00 sec)

As you can see, both xml attributes and tags are matched, and the new
records are added.

*Third test: export and import*

Third test involves dumping data and re-importing it into our table. To
create the dump xml file use the following command:

mysql -X >persondump.xml

select * from test.person;

ctrl-C

This will create the file persondump.xml, with the following content:

<?xml version="1.0"?>

<resultset statement="select * from test.person">

<row>

<field name="person_id">1</field>

<field name="fname">Erik</field>

<field name="lname">Wetterberg</field>

<field name="created">2006-12-05 18:17:22</field>

</row>

<row>

<field name="person_id">2</field>

<field name="fname">Sven</field>

<field name="lname">Svensson</field>

<field name="created">2006-12-05 18:17:22</field>

</row>

<row>

<field name="person_id">3</field>

<field name="fname">Björn</field>

<field name="lname">Bergman</field>

<field name="created">2006-12-05 18:27:46</field>

</row>

<row>

<field name="person_id">4</field>

<field name="fname">Sture</field>

<field name="lname">Larsson</field>

<field name="created">2006-12-05 18:27:46</field>

</row>

</resultset>

We now empty our test table, and reimport it:

mysql> truncate table test.person;

Query OK, 0 rows affected (0.02 sec)

mysql> load xml local infile 'persondump.xml' into table test.person

-> rows identified by '<row>';

Query OK, 4 rows affected (0.05 sec)

Records: 4 Deleted: 0 Skipped: 0 Warnings: 0

Check the result:

mysql> select * from test.person;

+-----------+--------+------------+---------------------+

| person_id | fname | lname | created |

+-----------+--------+------------+---------------------+

| 1 | Erik | Wetterberg | 2006-12-05 18:17:22 |

| 2 | Sven | Svensson | 2006-12-05 18:17:22 |

| 3 | Björn | Bergman | 2006-12-05 18:27:46 |

| 4 | Sture | Larsson | 2006-12-05 18:27:46 |

+-----------+--------+------------+---------------------+

4 rows in set (0.01 sec)

As you can see, our records have been reinserted.

*Fourth test: more complicated XML file*

For our forth test, we need another database table. We create it with the
following command:

mysql> create table test.address(

-> address_id int not null,

-> person_id int null,

-> street varchar(40) null,

-> zip int null,

-> city varchar(40) null,

-> created timestamp,

-> primary key(address_id));

Query OK, 0 rows affected (0.08 sec)

We have a xml file with both person and address data:

<?xml version="1.0"?>

<list>

<person person_id='5'>

<fname>Svennis</fname><lname>Eriksson</lname>

<address address_id='1' street='Desolation Boulevard' zip='11111'

city='Memphis'/>

<address address_id='2' street='Main Street &lt;USA&gt;' zip='11112'

city='Smalltown'/>

</person>

<person person_id='6'>

<fname>Linda</fname><lname>Larsson</lname>

<address address_id='3' street='River Road' zip='11113'

city='New Jersey'/>

<!-- <address address_id='4' street='Wall Street' zip='11114'

city='New York'/> -->

</person>

</list>

To import the person we use the same old command:

mysql> load xml local infile 'address.xml' into table test.person

-> rows identified by '<person>';

Query OK, 2 rows affected (0.02 sec)

Records: 2 Deleted: 0 Skipped: 0 Warnings: 0

This creates two records, because there are two <person> tags in the file.
Check the result:

mysql> select * from test.person;

+-----------+---------+------------+---------------------+

| person_id | fname | lname | created |

+-----------+---------+------------+---------------------+

| 1 | Erik | Wetterberg | 2006-12-05 18:17:22 |

| 2 | Sven | Svensson | 2006-12-05 18:17:22 |

| 3 | Björn | Bergman | 2006-12-05 18:27:46 |

| 4 | Sture | Larsson | 2006-12-05 18:27:46 |

| 5 | Svennis | Eriksson | 2006-12-05 18:51:29 |

| 6 | Linda | Larsson | 2006-12-05 18:51:29 |

+-----------+---------+------------+---------------------+

6 rows in set (0.01 sec)

Now, we want our address records imported too. For that we run the same file
again, but with another table and tag:

mysql> load xml local infile 'address.xml' into table test.address

-> rows identified by '<address>';

Query OK, 3 rows affected (0.04 sec)

Records: 3 Deleted: 0 Skipped: 0 Warnings: 0

Check the result:

mysql> select * from test.address;

+------------+-----------+----------------------+-------+------------+----------
-----------+

| address_id | person_id | street | zip | city | created |

+------------+-----------+----------------------+-------+------------+----------
-----------+

| 1 | 5 | Desolation Boulevard | 11111 | Memphis | 2006-12-0 5 18:54:41 |

| 2 | 5 | Main Street <USA> | 11112 | Smalltown | 2006-12-0 5 18:54:41 |

| 3 | 6 | River Road | 11113 | New Jersey | 2006-12-0 5 18:54:41 |

+------------+-----------+----------------------+-------+------------+----------
-----------+

3 rows in set (0.01 sec)

Note that person_id is inserted, even though it is not in the <address> tag.
The xml file is a hierarchical structure, and all tags on a higher level
than our address tag are also matched against the fields in the address
table, to preserve connections between objects, in our case the connection
between person and address..

Also note that the address with id 4 is not inserted, since it is commented
out in our xml file. And the xml escape sequences &lt; and &gt; are replaced
by < and >.

Erik Wetterberg

Thread
RE: load XML datamailbox@wetterberg.com10 Dec
  • Re: load XML dataErik Wetterberg10 Dec
  • Re: load XML dataMasood Mortazavi2 Apr