SQL Server Bulk Insert CSV Reader
By Peter Bromberg
Have you ever received CSV delimited text files that you need to import into SQL Server? You use the wizard, and everything seems to be fine during the import and then it stops dead in its tracks with an error message, usually about data being truncated. The SQL Server Import Data Wizard works fine for most files, but when you get one of these your development efforts come to a screeching halt. Yes, you can try to edit the SSIS package but often that still won't fix the problem.
What you really need is a CSV parser-reader that allows for settings that will overcome
these read errors. I looked around and tried a few; the one I like the best is
called Lumen.Works.Framework.IO. It has a lot of configuration options and by
coupling this with the SqlBulkCopy class in .NET, you can very quickly create
a utility that can solve most data import problems.
This Windows
Forms Utility allows you to select multiple CSV files (assuming the schema is
the same) in one go. It iterates over the files list, loading and parsing one
file at a time. Every 5000 rows, it performs a SqlBulkCopy insert into your table
and updates the UI so you can see the progress.
In order to use
this, you need to:
1) Create the target table first in your SQL Server
database.
2) Set the connection string in the App.Config file.
3)
Supply the target table name in the form, and
4) Select the file(s) you
want to import.
I've supplied a sample SQL Script you can use
that will create a target table "TEST" in a database named "TEST",
as well as a sample 10,000 row CSV file that is TAB delimited and has a header
row with the column names. The data is simply some voter records that are publicly
available in many states.
The utility is pretty fast - about as
fast as the SQL Server Import Wizard.
Here's the class for
the single Form. The code should be self-explanatory and has comments at important
areas:
using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Configuration;
using System.Data;
using System.Data.SqlClient;
using System.Drawing;
using System.IO;
using System.Text;
using System.Windows.Forms;
using LumenWorks.Framework.IO.Csv;
using Microsoft.ApplicationBlocks.Data;
namespace CsvReaderDemo
{
public partial class Form1 : Form
{
public Form1()
{
InitializeComponent();
}
private string tableName = String.Empty;
private void button1_Click(object sender, EventArgs e)
{
tableName
= txtTableName.Text;
if (tableName.Length <1)
{
MessageBox.Show("Must provide Table Name");
return;
}
DialogResult res = openFileDialog1.ShowDialog();
string[] files = openFileDialog1.FileNames;
foreach
(String file in files)
{
label2.Text
= "INSERTING FROM " + file;
StringReader
rdr = new StringReader(file);
DataTable
dt = CreateDataTable(tableName );
string
cnString = ConfigurationManager.ConnectionStrings["test"].ConnectionString;
int
ctr = 0;
// NOTE: there are several constructor options. This one specifies that there is
a header line ( "true"), and the delimiter is a TAB character.
using (CsvReader csv = new CsvReader(new StreamReader(file), true, '\t'))
{
// Note various behavioral properties that can be set:
csv.SupportsMultiline
= false;
csv.DefaultParseErrorAction
= ParseErrorAction.AdvanceToNextLine;
csv.MissingFieldAction
= MissingFieldAction.ReplaceByNull;
string
s;
string[]
stuff = new string[dt.Columns.Count];
while (csv.ReadNextRecord())
{
csv.CopyCurrentRecordTo(stuff);
object[]
parms = new object[stuff.Length];
for (int i = 0; i < parms.Length; i++)
parms[i]
= stuff[i];
DataRow
row = dt.NewRow();
row.ItemArray
= parms;
dt.Rows.Add(row);
ctr++;
if (ctr % 5000 == 0)
{
InsertRows(dt, tableName
);
// inserted 5000 rows, clear the DataTable for the next batch
dt.Rows.Clear();
this.label1.Text = ctr.ToString();
Application.DoEvents();
}
}
label1.Text
= "COMPLETED.";
}
}
}
private DataTable CreateDataTable( string tableName)
{
SqlConnection
cn = new SqlConnection(ConfigurationManager.ConnectionStrings["test"].ConnectionString);
DataTable
dt = new DataTable();
SqlDataAdapter
da = new SqlDataAdapter("SELECT TOP 1 * FROM " + tableName, cn);
da.Fill(dt);
dt.Rows.Clear();
return dt;
}
private void InsertRows( DataTable dt, string tableName)
{
using (SqlConnection connection =
new SqlConnection(ConfigurationManager.ConnectionStrings["test"].ConnectionString))
{
SqlBulkCopy
bulkCopy =
new SqlBulkCopy
(
connection,SqlBulkCopyOptions.Default ,null
);
// set the destination table name
bulkCopy.DestinationTableName
= tableName;
connection.Open();
// write the data in the "dataTable"
try
{
bulkCopy.WriteToServer(dt);
}
catch
(Exception ex)
{
}
}
}
}
}
You
can download the complete Visual Studio 2010 solution here.
Popularity (1709 Views)
 |
| Biography - Peter Bromberg |
Peter Bromberg is a C# MVP, MCP, and .NET expert who has worked in banking, financial and telephony for over 20 years. Pete focuses exclusively on the .NET Platform, and currently develops SOA and other .NET applications for a Fortune 500 clientele. Peter enjoys producing digital photo collage with Maya,playing jazz flute, the beach, and fine wines. You can view Peter's UnBlog and IttyUrl sites.
|  |
|
|
Article Discussion: SQL Server Bulk Insert CSV Reader
Ayaz Dadla replied
to Peter Bromberg at Sunday, October 30, 2011 12:22 PM
Nice example for SQL Server Bulk Insert - But there is a logic flaw here .. I have modded the code to...
private void button1_Click(object sender, EventArgs e)
{
tableName = txtTableName.Text;
if (tableName.Length <1)
{
MessageBox.Show("Must provide Table Name");
return;
}
DialogResult res = openFileDialog1.ShowDialog();
string[] files = openFileDialog1.FileNames;
DataTable dt = CreateDataTable(tableName);
int ctr = 0;
foreach (String file in files)
{
label2.Text = "INSERTING FROM " + file;
StringReader rdr = new StringReader(file);
string cnString = ConfigurationManager.ConnectionStrings["test"].ConnectionString;
ctr = 0;
// NOTE: there are several constructor options. This one specifies that there is a header line, and the delimiter is a TAB character.
using (CsvReader csv = new CsvReader(new StreamReader(file), true, '\t'))
{
csv.SupportsMultiline = false;
csv.DefaultParseErrorAction = ParseErrorAction.AdvanceToNextLine;
csv.MissingFieldAction = MissingFieldAction.ReplaceByNull;
string s;
string[] stuff = new string[dt.Columns.Count];
while (csv.ReadNextRecord())
{
csv.CopyCurrentRecordTo(stuff);
object[] parms = new object[stuff.Length];
for (int i = 0; i < parms.Length; i++)
parms[i] = stuff[i];
DataRow row = dt.NewRow();
row.ItemArray = parms;
dt.Rows.Add(row);
ctr++;
if (ctr % 5000 == 0)
{
InsertRows(dt, tableName );
// inserted 5000 rows, clear the DataTable for the next batch
dt.Rows.Clear();
this.label1.Text = ctr.ToString();
Application.DoEvents();
}
}
label1.Text = "COMPLETED.";
}
}
InsertRows(dt, tableName);
// inserted 5000 rows, clear the DataTable for the next batch
dt.Rows.Clear();
this.label1.Text = ctr.ToString();
Application.DoEvents();
}