List: Summer of Code
From: Derek Schaefer
Date: July 31 2009 5:43pm
Subject: GSoC Week 7 - Awesome CSV Import Module is Awesome
Hey all, checking in.

Key Accomplishments Last Week

 - Improved upon the existing CSV import module. Users can now create
tables directly from a CSV import file. Previously, you had to create
the table schema yourself beforehand, which took extra time and
required greater MySQL knowledge. Now it is handled automatically.
Presently, the module imports all the CSV content as one table. There
is not a reliable way to detect where one table starts and another
ends in order to import them separately, but in any case, it is
something I will be pondering in the coming weeks. I toyed with
splitting the contents into separate tables wherever the column count
differed, but two or more adjacent tables with the same column count
would be mashed together. If I do implement a way to detect different
tables, it will likely be based on such a method. I feel this is
something I should attempt to do, as
phpMyAdmin allows exporting not only multiple tables to one CSV file,
but also multiple databases. So there is the potential for lots of
tables in one CSV file, and if a user wanted to import that, he/she
would be required to split the file by hand. Yuck.
 - Put the XLS module on the proverbial back-burner and began work on
an XLSX module in its stead, as per Michal's advice.
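
To make the column-count idea concrete, here is a minimal sketch in
Python (not PMA's PHP, and not the module's actual code). The function
names and the type-guessing rules are hypothetical and chosen only for
illustration; the sketch also assumes the first row of each table holds
the column names.

```python
import csv
import io


def split_on_column_count(rows):
    """Group consecutive rows that share a column count into tables."""
    tables = []
    for row in rows:
        if not row:  # csv.reader yields [] for blank lines; skip them
            continue
        if tables and len(tables[-1][0]) == len(row):
            tables[-1].append(row)
        else:
            tables.append([row])
    return tables


def guess_type(values):
    """Pick a MySQL type that can hold every value (hypothetical rules)."""
    try:
        for v in values:
            int(v)
        return "INT"
    except ValueError:
        pass
    try:
        for v in values:
            float(v)
        return "DOUBLE"
    except ValueError:
        longest = max((len(v) for v in values), default=1)
        return "VARCHAR(%d)" % max(longest, 1)


def create_table_sql(name, table):
    """Build CREATE TABLE, treating the first row as column names."""
    header, data = table[0], table[1:]
    cols = []
    for i, col in enumerate(header):
        col_type = guess_type([r[i] for r in data])
        cols.append("`%s` %s" % (col.replace("`", "``"), col_type))
    return "CREATE TABLE `%s` (%s);" % (name, ", ".join(cols))


# Three logical tables: one with 2 columns, then two with 3 columns each.
sample = (
    "id,name\n1,Ann\n2,Bob\n"
    "code,price,qty\nA1,9.5,3\n"
    "sku,weight,unit\nB2,1.25,kg\n"
)
tables = split_on_column_count(csv.reader(io.StringIO(sample)))
first_sql = create_table_sql("t1", tables[0])
```

With this sample the split finds only two tables, because the second
and third both have three columns and end up mashed together. That is
exactly the weakness described above.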

Key Tasks that Stalled Last Week

 - The XLS module. As I discussed in my last post, to import XLS files
I made use of the phpExcelReader library, which I ended up not being
able to include because its license is incompatible with phpMyAdmin's.
Rewriting the BIFF library myself would be too big an unscheduled task
for an import module which is not all that important. If I have extra
time at the end of GSoC, I will rewrite the library for use in PMA.

Key Concerns

 - Finishing the XLSX module.

Tasks in the Upcoming Week

 - Determine if there are any additional options or enhancements that
the CSV module could make use of, and implement them.
 - Continue work on the XLSX module. Its format and layout are
completely different from ODS. For one thing, with ODS I only needed to
concern myself with one of the XML files, namely "content.xml", whereas
here the necessary data is spread across many different files in
several directories within the ZIP container.
 - Move quickly so that I may be able to write my own PHP XLS reader
class. While XLS compatibility is not necessary, I feel it would still
make for a good addition.
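
As a rough illustration of the "many files in one ZIP" point, the
Python sketch below (again, not PMA's PHP) builds a tiny XLSX-like
archive in memory and reads it back. Cell text lives in
xl/sharedStrings.xml while the grid lives in a worksheet file, so both
have to be combined. The helper names are hypothetical, the worksheet
path is hardcoded as an assumption, and a real workbook carries further
parts (workbook.xml, relationships, content types) that are omitted
here.

```python
import io
import xml.etree.ElementTree as ET
import zipfile

MAIN = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"
NS = {"m": MAIN}

SHARED = (
    '<sst xmlns="%s">'
    "<si><t>name</t></si><si><t>qty</t></si><si><t>Ann</t></si>"
    "</sst>" % MAIN
)
SHEET = (
    '<worksheet xmlns="%s"><sheetData>'
    '<row r="1"><c r="A1" t="s"><v>0</v></c>'
    '<c r="B1" t="s"><v>1</v></c></row>'
    '<row r="2"><c r="A2" t="s"><v>2</v></c>'
    '<c r="B2"><v>42</v></c></row>'
    "</sheetData></worksheet>" % MAIN
)


def build_sample():
    """Write the two XML parts into an in-memory ZIP container."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as z:
        z.writestr("xl/sharedStrings.xml", SHARED)
        z.writestr("xl/worksheets/sheet1.xml", SHEET)
    buf.seek(0)
    return buf


def read_rows(fileobj, sheet_path="xl/worksheets/sheet1.xml"):
    """Return the sheet as a list of rows of strings."""
    with zipfile.ZipFile(fileobj) as z:
        strings_root = ET.fromstring(z.read("xl/sharedStrings.xml"))
        sheet_root = ET.fromstring(z.read(sheet_path))
    # Plain-text entries only; rich-text runs are ignored in this sketch.
    strings = [
        si.findtext("m:t", default="", namespaces=NS)
        for si in strings_root.findall("m:si", NS)
    ]
    rows = []
    for row in sheet_root.findall("m:sheetData/m:row", NS):
        cells = []
        for c in row.findall("m:c", NS):
            value = c.findtext("m:v", default="", namespaces=NS)
            if c.get("t") == "s":  # value is an index into shared strings
                value = strings[int(value)]
            cells.append(value)
        rows.append(cells)
    return rows


rows = read_rows(build_sample())
```

Here rows comes back as [["name", "qty"], ["Ann", "42"]]. For ODS the
equivalent data could be read from the single "content.xml" file.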

-Derek Schaefer