The Bulk Copy Program (BCP) lets you import and export data to and from SQL Server via the Command Line Interface (CLI) or a batch file.
However, BCP does not surround text fields with quotation marks when exporting data. By quotation marks I mean double quotation marks (" "). If you export text field data that contains commas to a Comma Separated Value (CSV) file, the additional commas will be misinterpreted as field delimiters by a CSV interpreter. One common solution is to add the surrounding quotation marks within a View's or Query's SQL string. Including quotation marks within a View makes it useless for virtually every purpose other than bulk copy exports. A more elegant fix for the problem is to use the SED.exe application.
The SED.exe program can surround the text fields within our data download with quotation marks. SED.exe is a Windows application ported from Unix. You can download it from SourceForge.
The usage of SED and SED.exe is similar, with some exceptions, one of which I will address later in this post. From here on I will refer to SED.exe as SED.
Our first step is to pick a rarely used ASCII character as the field delimiter for the data export.
A Pipe or Hash character is a good choice because these characters rarely appear within text fields. Our example uses a Pipe character, |, as the field delimiter. We will export our data to a file named pre_process_file.dat within the temp folder on the C drive. Our command will be:
bcp our_database_name..our_view_name OUT c:\temp\pre_process_file.dat /S our_server_name /T /t "|" /c
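If the export is launched from a script rather than typed at the prompt, the same command can be assembled as an argument list. The following is only a sketch in Python: the function name `build_bcp_export_command` is hypothetical, the dash-prefixed switches are the documented bcp forms of the options above, and `-T` (trusted connection) is an assumption about how you authenticate, so substitute whatever your server requires.

```python
def build_bcp_export_command(database, view, out_path, server, delimiter="|"):
    """Return the bcp argument list for a character-mode export.

    Nothing is executed here; the list can be handed to subprocess.run().
    """
    return [
        "bcp",
        f"{database}..{view}",  # database..object form, as in the article
        "out",
        out_path,
        "-S", server,           # server to connect to
        "-T",                   # trusted connection (assumed authentication)
        "-t", delimiter,        # field terminator
        "-c",                   # export as character data
    ]


command = build_bcp_export_command(
    "our_database_name",
    "our_view_name",
    r"c:\temp\pre_process_file.dat",
    "our_server_name",
)
print(command)
```

Passing the list to `subprocess.run(command, check=True)` would run the export. Keeping the delimiter as its own list element avoids having to quote the Pipe character for the shell.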
Let's assume we have downloaded the following internet radio station data:
12343|mtaafm|Africa|East Africas leading online radio station blazing nothing but the finest afrikan vybezz|
4543|Hot Mix Radio|France|HOT MIX Radio - House Dance and DJ's - The Hot Mixes Live from Paris France - 20 DJs Rsidents|
56254|BeirutNights.com|Lebanon|High Energy Italo, Eurodance, Mediterranean and Trance mixed with French, Arabic, Hebrew, Greek, Turkish, Spanish, Armenian|
2522|Hot Sauce Radio|US/Cajun|If You Like Heat With Flavor, Try A Taste Of Some Spicy Jazz, Hot Funk, Saucy R&B, Tasty Fusion and Sizzling Talk Radio.|http://www.unclebrutha.com
The last two records have numerous commas within their description fields. For CSV-interpreting applications like Microsoft Excel this becomes a problem. We will write SED instructions in a file to place quotation marks and commas where our Pipe field delimiters are located. Hence, the commas within the text data itself will be contained within quotation marks and will not be misinterpreted as field delimiters. The syntax for using SED is:
SED -f sed_script_file download_files > output_file
The -f switch means: read the SED instructions from the file following the switch. The remaining arguments list all the files we wish to apply the instructions to. The result of SED's processing is written to the output file. We will write SED's instructions into a file called instructions.sed and write the results to output.csv. Here is our revised command line for SED:
SED -f instructions.sed pre_process_file.dat > output.csv
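To see why the quotation marks matter, here is a small Python sketch that counts how many fields a CSV parser finds in one record with and without them. Python's `csv` module stands in for a CSV interpreter such as Excel, which is an assumption for illustration; the record is the Hot Sauce Radio line from the sample data.

```python
import csv
import io

record = (
    "2522|Hot Sauce Radio|US/Cajun|If You Like Heat With Flavor, "
    "Try A Taste Of Some Spicy Jazz, Hot Funk, Saucy R&B, "
    "Tasty Fusion and Sizzling Talk Radio.|http://www.unclebrutha.com"
)

# The export really contains five Pipe-delimited fields.
fields = record.split("|")


def parse_csv_line(line):
    """Parse one line of CSV text and return its fields."""
    return next(csv.reader(io.StringIO(line)))


# Joining with bare commas lets the commas in the description split the field.
unquoted = ",".join(fields)
# Surrounding every field with quotation marks keeps those commas inside it.
quoted = ",".join(f'"{field}"' for field in fields)

unquoted_count = len(parse_csv_line(unquoted))
quoted_count = len(parse_csv_line(quoted))
print(len(fields), unquoted_count, quoted_count)
```

The unquoted line parses into more fields than were exported, while the quoted line parses back into the original five.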
== The SED Script ==
A SED script can be a single regular expression. For example, if we want to replace the word Africa with Afrika, our SED script would look like this:
s/Africa/Afrika/g
The syntax for this command is:
s = Substitute
/Africa = the searched word
/Afrika = with the replacement word
/g = Globally, wherever the searched word is found throughout each record.
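For readers more familiar with a scripting language, the same substitution can be sketched with Python's `re.sub`. This is only an analogy to the SED command, not part of the SED workflow; the sample line is a shortened version of the first record.

```python
import re

line = "12343|mtaafm|Africa|East Africas leading online radio station"

# count=0 replaces every match on the line, like the trailing g in SED.
all_replaced = re.sub(r"Africa", "Afrika", line, count=0)

# count=1 replaces only the first match, like leaving the g off.
first_only = re.sub(r"Africa", "Afrika", line, count=1)

print(all_replaced)
print(first_only)
```

Note that the global form also changes "Africas" in the description, because the search matches the text wherever it occurs.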
I will instruct SED using a series of three regular expressions to achieve our goal. Processing the last record of our data will demonstrate the effect of each expression:
2522|Hot Sauce Radio|US/Cajun|If You Like Heat With Flavor, Try A Taste Of Some Spicy Jazz, Hot Funk, Saucy R&B, Tasty Fusion and Sizzling Talk Radio.|http://www.unclebrutha.com
Our first regular expression replaces all Pipe characters with commas surrounded by quotation marks. Our first expression looks like this:
s/|/","/g
SED reads our expression as:
s = Substitute
/| = Pipe characters
/"," = with ","
/g = Globally.
The result upon our record is:
2522","Hot Sauce Radio","US/Cajun","If You Like Heat With Flavor, Try A Taste Of Some Spicy Jazz, Hot Funk, Saucy R&B, Tasty Fusion and Sizzling Talk Radio.","http://www.unclebrutha.com
Our next expression will put a quotation mark at the beginning of our record.
s/^/"/
s = Substitute
/^ = the beginning of the record
/" = with a quotation mark.
There is no need for a global replacement as there is only one beginning of our record. This second expression operates upon the result of our first expression:
"2522","Hot Sauce Radio","US/Cajun","If You Like Heat With Flavor, Try A Taste Of Some Spicy Jazz, Hot Funk, Saucy R&B, Tasty Fusion and Sizzling Talk Radio.","http://www.unclebrutha.com
Our record now starts with a quotation mark.
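The effect of the first two expressions can be checked with a short Python sketch. It uses `re.sub` as a stand-in for SED, and the record's description is shortened for readability; both are assumptions made for illustration. One difference to note: in Python's regular expression syntax the Pipe character means alternation, so it has to be escaped.

```python
import re

# Shortened version of the Hot Sauce Radio record.
record = "2522|Hot Sauce Radio|US/Cajun|Spicy Jazz, Hot Funk, Saucy R&B|http://www.unclebrutha.com"

# First expression: every Pipe becomes quote-comma-quote.
step1 = re.sub(r"\|", '","', record)

# Second expression: a single quotation mark at the start of the record.
step2 = re.sub(r"^", '"', step1, count=1)

print(step1)
print(step2)
```

After these two steps every field boundary is quoted and the record opens with a quotation mark; only the closing quotation mark is still missing.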
Our final expression puts a quotation mark at the end of our record. However, there was an oversight in the conversion from Unix SED to Windows SED that our expression must cater for. Under Unix the end-of-line character is ASCII-10. When processing records in Unix, SED removes the ASCII-10 character from the equation. Windows, by contrast, uses two characters to identify the end of a line: ASCII-13 followed by ASCII-10.
Our last expression under Unix would be as follows:
s/$/"/
Unix version:
s = Substitute
/$ = the end of the record
/" = with a quotation mark.
This works because SED removes the ASCII-10 character before processing. The result leaves our end-of-line identifier intact:
","http://www.unclebrutha.com"[ASCII-10]
Similarly, the ported Windows SED recognises only one character as an end-of-line identifier: ASCII-10. Writing our last expression as we would in Unix places the quotation mark immediately before the ASCII-10 character. This splits the Windows end-of-line identifier, which is the concatenation of ASCII-13 and ASCII-10. The result is:
","http://www.unclebrutha.com[ASCII-13]"[ASCII-10]
The result in output.csv is a concatenation of records with unprintable characters sprinkled throughout. To overcome this problem we need to reorder the quotation mark and ASCII-13 so that ASCII-13 appears after the quotation mark. To accomplish this we replace the original ASCII-13 with a quotation mark followed by an ASCII-13 character. To match the ASCII-13 we use \W in our expression, which means a non-alphanumeric character.
Our last expression, under Windows, looks like this:
s/\W$/"\r/
s = Substitute
/\W$ = a non-alphanumeric character at the end of the record (remember SED has already stripped the final non-alphanumeric character, ASCII-10, from the equation)
/"\r = with a quotation mark followed by an ASCII-13 character (\r).
Once SED has processed the last expression it will place an ASCII-10 character back onto the end of the record, resulting in:
","http://www.unclebrutha.com"[ASCII-13][ASCII-10]
Thus our final result is:
"2522","Hot Sauce Radio","US/Cajun","If You Like Heat With Flavor, Try A Taste Of Some Spicy Jazz, Hot Funk, Saucy R&B, Tasty Fusion and Sizzling Talk Radio.","http://www.unclebrutha.com"
Exactly what we required! Of course all of the records in our BCP data download will be processed the same way.
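The line-ending behaviour described above can be imitated in a few lines of Python. This is a sketch, not SED itself: the hypothetical helper `sed_like` strips the trailing ASCII-10, applies one substitution with `re.sub`, and puts the ASCII-10 back, which is the behaviour the article attributes to the Windows port.

```python
import re


def sed_like(line_with_eol, pattern, replacement):
    """Imitate a SED that treats only LF (ASCII-10) as the end of a line."""
    # Strip the LF before the expression runs, as SED does.
    if line_with_eol.endswith("\n"):
        body = line_with_eol[:-1]
    else:
        body = line_with_eol
    body = re.sub(pattern, replacement, body, count=1)
    # Put the LF back after processing.
    return body + "\n"


windows_line = "http://www.unclebrutha.com\r\n"  # ends with ASCII-13, ASCII-10

# Unix-style expression: the quotation mark lands after the ASCII-13.
unix_style = sed_like(windows_line, r"$", '"')

# Windows-style expression: the ASCII-13 is replaced by quote + ASCII-13.
windows_style = sed_like(windows_line, r"\W$", '"\r')

print(repr(unix_style))
print(repr(windows_style))
```

One caveat follows from how `\W$` works: it replaces whatever non-alphanumeric character ends the line, so on a file without ASCII-13 line endings it would either swallow a trailing full stop or match nothing at all. The expression is meant for Windows-style files only.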
Now that you understand how this works, let's recap the essentials.
Our SED file, called instructions.sed, holds three regular expressions:
s/|/","/g
s/^/"/
s/\W$/"\r/
The call to SED to place quotation marks around all of the fields in our BCP data download is:
SED -f instructions.sed pre_process_file.dat > output.csv
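As a cross-check of the recap, the three expressions can be applied to a whole export in Python. The function `quote_fields` below is a hypothetical sketch that mirrors the SED script line by line, assuming Windows line endings (ASCII-13 followed by ASCII-10); it is not a replacement for a full CSV writer.

```python
import re


def quote_fields(text):
    """Apply the three expressions to every LF-terminated line of text."""
    lines = text.split("\n")
    # Whatever follows the final LF; empty for a well-formed export.
    trailing = lines.pop()
    converted = []
    for line in lines:
        line = re.sub(r"\|", '","', line)            # s/|/","/g
        line = re.sub(r"^", '"', line, count=1)      # s/^/"/
        line = re.sub(r"\W$", '"\r', line, count=1)  # s/\W$/"\r/
        converted.append(line + "\n")
    return "".join(converted) + trailing


sample = "1|a, b|x\r\n2|c|y\r\n"
result = quote_fields(sample)
print(repr(result))
```

When reading the .dat file in Python, open it with `newline=""` so the ASCII-13 characters are preserved rather than translated away.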
For more information on SED there are several online tutorials available and books written. It is also worth learning the syntax of the regular expressions used to instruct SED.