Listing 5.2 department_infile.txt
42 Finance
128 Research and Development
NULL Human Resources
NULL Marketing
This file is in the default LOAD DATA INFILE format, with each row listed on a separate line with tabs between column values. (This is configurable and we will see how in a minute.)
We can load this information into the department table with the following LOAD DATA INFILE statement:
load data local infile 'department_infile.txt'
into table department;
This facility is particularly useful for converting data from another database format, spreadsheet, or CSV (comma-separated values) file.
The LOAD DATA INFILE statement requires the FILE privilege, "Managing User Privileges," for further information, especially if you are having trouble running this command. The privilege to perform this is often restricted for good security reasons—to stop people from loading in /etc/passwd, for example.
The general form of the LOAD DATA INFILE statement is as follows:
LOAD DATA [LOW_PRIORITY | CONCURRENT] [LOCAL] INFILE 'fileName.txt'
[REPLACE | IGNORE]
INTO TABLE tbl_name
[FIELDS
[TERMINATED BY '\t']
[[OPTIONALLY] ENCLOSED BY '']
[ESCAPED BY '\\' ]
]
[LINES TERMINATED BY '\n']
[IGNORE number LINES]
[(col_name,...)]
The optional clauses are as listed here:
The LOW PRIORITY clause works the same way it does in the INSERT statement by waiting for other clients to stop reading from the table. CONCURRENT, on the other hand, allows other clients to read from the table while the bulk insert is going on.
In our example, we specified the optional keyword LOCAL, meaning that the data file is on the client machine. If this is not specified, MySQL will look for the infile on the server.
If you have key clashes while inserting data, REPLACE and IGNORE provide two methods for dealing with this. Specifying REPLACE tells MySQL to replace the old row with the new row, while IGNORE tells MySQL to keep the old row.
The FIELDS and LINES clauses specify how the data in the infile is laid out. The values in the general form are the defaults—each row on a new line, column values separated by tabs. We can also enclose column values in quotes and use the backslash character to escape any special characters (like single quotes) that might confuse MySQL.
The IGNORE number LINES clause tells MySQL to ignore the first number lines in the infile.
The final clause allows you to specify that you only want to read data into some of the table's columns.
A common format to receive data in is CSV or comma-separated values. Many programs can read and write files of this type, but one notable example is Microsoft Excel. Listing 5.3 shows a small CSV file saved from Excel.
Listing 5.3 new_programmers.csv
Name,Job,DepartmentID
Julia Lenin,Programmer,128
Douglas Smith,Programmer,128
Tim O'Leary,Programmer,128
We can load this data into the employee table with the following query:
load data infile 'e:\\new_programmers.csv'
into table employee
fields terminated by ','
lines terminated by '\n'
ignore 2 lines
(name, job, departmentID);
You can see that we have used more options to load this data than we did when the data was in the default format. A few points are worth noting:
Because we have used a Windows/DOS-style path that includes a backslash, we need to escape the backslash. Our path therefore became 'e:\\new_programmers.csv'.
It possibly goes without saying that the fields in a CSV file are terminated by commas, but we need to specify it.
We do not need to specify that lines are terminated by a newline character, but we have chosen to.
This file has a header, so the first two lines do not contain data and should be ignored.
The data in this file does not contain employeeIDs, so to allocate the three columns of data into the four columns in the database, we need to specify what columns (in order) the data will be mapped to. In this case, we have specified (name, job, departmentID).
No comments:
Post a Comment