From: Derek Schaefer
Date: July 31 2009 5:43pm
Subject: GSoC Week 7 - Awesome CSV Import Module is Awesome
List-Archive: http://lists.mysql.com/soc/429
Message-Id:
MIME-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit

Hey all, checking in.

Key Accomplishments Last Week
=========================

- Improved upon the existing CSV import module. Users can now create
  tables based on a CSV import file. Before, you had to create the table
  schema yourself beforehand, which required extra time as well as
  greater MySQL knowledge. Now it is handled automatically.

  Presently, the module imports all the CSV content as one table. There
  is no reliable way to detect where one table starts and another ends
  in order to import them separately, but it is something I will be
  pondering in the coming weeks. I toyed with splitting the contents
  into separate tables wherever the column count differed, but two or
  more adjacent tables with the same column count would be mashed
  together. If I do implement a way to detect different tables, it will
  likely be based on such a method. I feel this is something I should
  attempt, as phpMyAdmin allows exporting not only multiple tables to
  one CSV file, but also multiple databases. So there is the potential
  for lots of tables in one CSV file, and a user who wanted to import
  that would have to split the file by hand. Yuck.

- Put the XLS module on the proverbial back-burner and began work on an
  XLSX module in its stead, as per Michal's advice.

Key Tasks that Stalled Last Week
=========================

- The XLS module. As I discussed in my last post, to import XLS files I
  made use of the phpExcelReader library, which I ended up not being
  able to include because its license is incompatible with phpMyAdmin's.
  Rewriting the BIFF library myself would be too big an unscheduled task
  for an import module which is not all that important.
  If I have extra time at the end of GSoC, I will rewrite the library
  for use in PMA.

Key Concerns
==========

- Finishing the XLSX module.

Tasks in the Upcoming Week
=====================

- Determine if there are any additional options or enhancements that the
  CSV module could make use of, and implement them.

- Continue work on the XLSX module. Its format and arrangement are
  completely different from ODS. For one thing, with ODS I needed to
  concern myself with only one of the XML files, namely "content.xml",
  whereas here the necessary data is spread across many different files
  in several directories within the ZIP container.

- Move quickly so that I may be able to write my own PHP XLS reader
  class. While XLS compatibility is not necessary, I feel it would still
  make for a good addition.

-Derek Schaefer
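
The column-count heuristic mentioned under Key Accomplishments can be
illustrated with a minimal sketch. This is not the phpMyAdmin code (which
is PHP); it is a small Python illustration, and the function name
split_by_column_count and the sample data are hypothetical. It starts a
new table whenever the number of columns changes, and the second sample
shows the limitation described above: two adjacent tables with the same
column count are merged into one.

```python
import csv
import io


def split_by_column_count(csv_text):
    """Group consecutive CSV rows that share the same column count.

    Hypothetical helper for illustration only: a new group (table) is
    started whenever a row's column count differs from the previous row's.
    """
    tables = []
    for row in csv.reader(io.StringIO(csv_text)):
        if not row:
            continue  # csv.reader yields [] for blank lines; skip them
        if tables and len(tables[-1][0]) == len(row):
            tables[-1].append(row)  # same width: treat as the same table
        else:
            tables.append([row])  # width changed: start a new table
    return tables


# Two tables with different widths are separated as intended.
sample = "id,name\n1,foo\n2,bar\na,b,c\n1,2,3\n"

# Two tables with the same width cannot be told apart by this heuristic.
adjacent_same_width = "id,name\n1,foo\nid,title\n9,baz\n"

if __name__ == "__main__":
    print(len(split_by_column_count(sample)))
    print(len(split_by_column_count(adjacent_same_width)))
```

Any more reliable approach would need extra information beyond the column
count, for example a separator line between tables in the exported file.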