From: Date: December 10 2006 9:52pm Subject: Re: load XML data List-Archive: http://lists.mysql.com/internals/34170 Message-Id: <5c3017630612101252k6e643950k6295aea9bb8ac3f8@mail.gmail.com> MIME-Version: 1.0 Content-Type: multipart/alternative; boundary="----=_Part_10348_18481879.1165783952295" ------=_Part_10348_18481879.1165783952295 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: quoted-printable Content-Disposition: inline *MySQL XML import test drive* Here is a quick test drive for the load xml command, just to show you how i= t works. *Creating the test tables* Start by creating the test table with the following command: create table test.person( person_id int not null, fname varchar(40) null, lname varchar(40) null, created timestamp, primary key(person_id)); Now we are ready for some testing! *First test: data as attributes* For this test we use file person.xml, with the following content: To read this into MySQL, we use the following command: mysql> load xml local infile 'person.xml' into table test.person -> rows identified by ''; Query OK, 2 rows affected (0.08 sec) Records: 2 Deleted: 0 Skipped: 0 Warnings: 0 Check the result with the following command: mysql> select * from test.person; +-----------+-------+------------+---------------------+ | person_id | fname | lname | created | +-----------+-------+------------+---------------------+ | 1 | Erik | Wetterberg | 2006-12-05 18:17:22 | | 2 | Sven | Svensson | 2006-12-05 18:17:22 | +-----------+-------+------------+---------------------+ 2 rows in set (0.00 sec) The attribute values have been matched against column values, and records inserted. The clause "l rows identified by ''" is used to identify the tags that should match a row, one row is created for every tag= . Columns not included in the xml will be filled with their default values. XML attributes not in the database table will be ignored. *Second test: data as tags* But XML files have different formats, some use attributes for values, some use tags and some use both. Our second test illustrates that. For this we use the test file person2.xml, whith the following content: Bj=F6rnBergman StureLarsson To run it we use (almost) the same command: mysql> load xml local infile 'person2.xml' into table test.person -> rows identified by ''; Query OK, 2 rows affected (0.00 sec) Records: 2 Deleted: 0 Skipped: 0 Warnings: 0 To check the result, we use the following command: mysql> select * from test.person; +-----------+--------+------------+---------------------+ | person_id | fname | lname | created | +-----------+--------+------------+---------------------+ | 1 | Erik | Wetterberg | 2006-12-05 18:17:22 | | 2 | Sven | Svensson | 2006-12-05 18:17:22 | | 3 | Bj=F6rn | Bergman | 2006-12-05 18:27:46 | | 4 | Sture | Larsson | 2006-12-05 18:27:46 | +-----------+--------+------------+---------------------+ 4 rows in set (0.00 sec) As you can see, both xml attributes and tags are matched, and the new records are added. *Third test: export and import* Third test involves dumping data and re-importing it into our table. To create the dump xml file use the following command: mysql -X >persondump.xml select * from test.person; ctrl-C This will create the file persondump.xml, with the following content: 1 Erik Wetterberg 2006-12-05 18:17:22 2 Sven Svensson 2006-12-05 18:17:22 3 Bj=F6rn Bergman 2006-12-05 18:27:46 4 Sture Larsson 2006-12-05 18:27:46 We now empty our test table, and reimport it: mysql> truncate table test.person; Query OK, 0 rows affected (0.02 sec) mysql> load xml local infile 'persondump.xml' into table test.person -> rows identified by ''; Query OK, 4 rows affected (0.05 sec) Records: 4 Deleted: 0 Skipped: 0 Warnings: 0 Check the result: mysql> select * from test.person; +-----------+--------+------------+---------------------+ | person_id | fname | lname | created | +-----------+--------+------------+---------------------+ | 1 | Erik | Wetterberg | 2006-12-05 18:17:22 | | 2 | Sven | Svensson | 2006-12-05 18:17:22 | | 3 | Bj=F6rn | Bergman | 2006-12-05 18:27:46 | | 4 | Sture | Larsson | 2006-12-05 18:27:46 | +-----------+--------+------------+---------------------+ 4 rows in set (0.01 sec) As you can see, our records have been reinserted. *Fourth test: more complicated XML file* For our forth test, we need another database table. We create it with the following command: mysql> create table test.address( -> address_id int not null, -> person_id int null, -> street varchar(40) null, -> zip int null, -> city varchar(40) null, -> created timestamp, -> primary key(address_id)); Query OK, 0 rows affected (0.08 sec) We have a xml file with both person and address data: SvennisEriksson
LindaLarsson
To import the person we use the same old command: mysql> load xml local infile 'address.xml' into table test.person -> rows identified by ''; Query OK, 2 rows affected (0.02 sec) Records: 2 Deleted: 0 Skipped: 0 Warnings: 0 This creates two records, because there are two tags in the file. Check the result: mysql> select * from test.person; +-----------+---------+------------+---------------------+ | person_id | fname | lname | created | +-----------+---------+------------+---------------------+ | 1 | Erik | Wetterberg | 2006-12-05 18:17:22 | | 2 | Sven | Svensson | 2006-12-05 18:17:22 | | 3 | Bj=F6rn | Bergman | 2006-12-05 18:27:46 | | 4 | Sture | Larsson | 2006-12-05 18:27:46 | | 5 | Svennis | Eriksson | 2006-12-05 18:51:29 | | 6 | Linda | Larsson | 2006-12-05 18:51:29 | +-----------+---------+------------+---------------------+ 6 rows in set (0.01 sec) Now, we want our address records imported too. For that we run the same fil= e again, but with another table and tag: mysql> load xml local infile 'address.xml' into table test.address -> rows identified by '
'; Query OK, 3 rows affected (0.04 sec) Records: 3 Deleted: 0 Skipped: 0 Warnings: 0 Check the result: mysql> select * from test.address; +------------+-----------+----------------------+-------+------------+-----= ----- -----------+ | address_id | person_id | street | zip | city | created | +------------+-----------+----------------------+-------+------------+-----= ----- -----------+ | 1 | 5 | Desolation Boulevard | 11111 | Memphis | 2006-12-0 5 18:54:41 | | 2 | 5 | Main Street | 11112 | Smalltown | 2006-12-0 5 18:54:41 | | 3 | 6 | River Road | 11113 | New Jersey | 2006-12-0 5 18:54:41 | +------------+-----------+----------------------+-------+------------+-----= ----- -----------+ 3 rows in set (0.01 sec) Note that person_id is inserted, even though it is not in the
tag= . The xml file is a hierarchical structure, and all tags on a higher level than our address tag are also matched against the fields in the address table, to preserve connections between objects, in our case the connection between person and address.. Also note that the address with id 4 is not inserted, since it is commented out in our xml file. And the xml escape sequences < and > are replace= d by < and >. Erik Wetterberg ------=_Part_10348_18481879.1165783952295--