source link: https://www.codesd.com/item/how-to-delete-the-carriage-return-line-feed-from-csv-except-at-the-end-of-each-line.html
How to delete the carriage return / line feed from CSV - EXCEPT at the end of each line?
Is it possible, using a batch file or PowerShell, to remove carriage returns/line feeds from a CSV file without removing those at the natural end of each record?
Essentially I have a file like this:
a1, a2, a3, a4,aaa
aaa a5, a6, a7,aaa aa
a8
b1,b2,b3,b4,b5,b6,b7,b8
c1,c2,c3,c4,c5,c6,c7,c8
d1,d2,d3,d4,d5,d6,d7,d8
e1,e2,e3,e4,eee
e5,e6,e7,e8
As an example, columns 5 and 8 "may" contain a carriage return/line feed. I would like to remove these so that the file has one record per line.
Is this possible? I am already formatting the file with a batch file, so I would like to use it for all of the formatting if possible. I am considering moving to PowerShell, so if it is easier there please let me know (absolute PowerShell noob).
EDIT: each record has the same number of columns. In this example, 8.
Tricky, but a nice challenge I had to take on... although you did not show any effort of your own to solve it...
Here is a script that combines lines of CSV data whenever a line holds fewer elements than the predefined number. It does not handle the elements individually; it merely appends lines until the proposed number is reached. The data must not contain any global wildcard characters like * and ?. Quotation marks must not appear either, unless they are doubled like "". Here it is:
@echo off
setlocal EnableExtensions DisableDelayedExpansion

rem // Define constants here:
set "FILE_I=%~1" & rem // (specifies the input CSV file)
set "FILE_O=%~2" & rem // (specifies the output CSV file)
set "SEPARATOR=," & rem // (is the separator used in the CSV data)
set "REPLACE=" & rem // (is the replacement string for each line-break)
set "NUMITEMS=8" & rem // (is the proposed number of elements per line)

rem // Validate given input and output CSV files:
if not exist "%FILE_I%" (< "%FILE_I%" set /P ="" & exit /B 1)
if not defined FILE_O set "FILE_O=con"

rem // Initialise data collector and counter for elements:
set "PREV=" & set /A "COUNT=0"
rem // Iterate through lines of input file:
for /F delims^=^ eol^= %%L in ('
    rem/ /* Read input file, output dummy line and deplete output file: */ ^& ^
    type "%FILE_I%" ^& ^> "%FILE_O%" break ^& echo/^& ^
    for /L %%J in ^(2^,1^,%NUMITEMS%^) do @^< nul set /P ^=","
') do (
    rem // Store currently read line:
    set "LINE=%%L"
    rem // Toggle delayed expansion in order not to lose `!`:
    setlocal EnableDelayedExpansion
    rem // Add number of elements of current line to the counter:
    for %%I in ("!LINE:%SEPARATOR%=","!") do (
        endlocal
        set /A "COUNT+=1"
        setlocal EnableDelayedExpansion
    )
    rem // Check whether counter reached given number of elements per line:
    if !COUNT! LEQ %NUMITEMS% (
        rem /* Either proposed number of elements not reached, hence store data
        rem    and wait for next line to have enough elements;
        rem    or number is reached but still wait for the next line, because it
        rem    could be a single element to be appended to the previous line;
        rem    hence the data output is actually delayed by one loop iteration;
        rem    so to not lose the last line, the said dummy line is needed: */
        set "PREV=!PREV!%REPLACE%!LINE!"
        rem // Transport data collector over `endlocal` barrier:
        for /F delims^=^ eol^= %%K in ("!PREV!") do (
            endlocal
            set "PREV=%%K"
            setlocal EnableDelayedExpansion
        )
        rem /* Decrement counter because a single element is considered
        rem    to be part of the last element of the previous line: */
        endlocal
        set /A "COUNT-=1"
        setlocal EnableDelayedExpansion
    ) else (
        rem /* Proposed number of elements exceeded, hence output currently
        rem    collected data, reset collector and counter for elements: */
        if defined REPLACE set "PREV=!PREV:*%REPLACE%=!"
        >> "%FILE_O%" echo !PREV!
        endlocal
        rem /* Store current line in data collector and subtract
        rem    the number of output elements from counter: */
        set "PREV=%REPLACE%%%L"
        set /A "COUNT-=%NUMITEMS%"
        setlocal EnableDelayedExpansion
    )
    endlocal
)

endlocal
exit /B
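For readers who find the batch syntax hard to follow, the line-joining idea can be restated as a short Python sketch. This is not the script above, only a simplified illustration under the same assumptions (a fixed number of fields per record, no separators inside quoted fields); the names join_broken_lines, num_fields and replacement are made up for this example.

```python
def join_broken_lines(lines, num_fields=8, sep=",", replacement=""):
    """Merge physical lines until each record holds num_fields fields.

    Assumes the separator never occurs inside a field and that every
    record has exactly num_fields fields, as stated in the question.
    """
    records = []
    current = None  # record being assembled
    count = 0       # number of fields collected in `current`
    for line in lines:
        line = line.rstrip("\r\n")
        if not line:
            continue  # empty lines are skipped, like `for /F` does
        n = line.count(sep) + 1
        if current is None:
            current, count = line, n
        elif count + n - 1 <= num_fields:
            # The first field of this line continues the last field of
            # the previous line, so it adds only n - 1 new fields.
            current += replacement + line
            count += n - 1
        else:
            # Appending would exceed num_fields: the record is complete.
            records.append(current)
            current, count = line, n
    if current is not None:
        records.append(current)
    return records


# Sample data from the question
sample = [
    "a1, a2, a3, a4,aaa",
    "aaa a5, a6, a7,aaa aa",
    "a8",
    "b1,b2,b3,b4,b5,b6,b7,b8",
    "c1,c2,c3,c4,c5,c6,c7,c8",
    "d1,d2,d3,d4,d5,d6,d7,d8",
    "e1,e2,e3,e4,eee",
    "e5,e6,e7,e8",
]

for record in join_broken_lines(sample):
    print(record)
```

As in the batch script, a record is only emitted once the next line proves it is complete, because a following single-element line could still belong to its last field. That also means a complete record followed by a line without any separator is always merged with that line.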
Supposing the script is saved as concat-csv-lines.bat, the input CSV file is called broken-lines.csv and the output file is concatenated.csv, run it with the following command line:
concat-csv-lines.bat broken-lines.csv concatenated.csv
With broken-lines.csv containing the sample data from the question, concatenated.csv will hold:
a1, a2, a3, a4,aaaaaa a5, a6, a7,aaa aaa8
b1,b2,b3,b4,b5,b6,b7,b8
c1,c2,c3,c4,c5,c6,c7,c8
d1,d2,d3,d4,d5,d6,d7,d8
e1,e2,e3,e4,eeee5,e6,e7,e8
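A quick way to confirm that a file really has one record per line is to count the fields on every line. The following Python sketch does that for in-memory lines; it is an illustrative check, not part of the answer, and the name check_field_counts is hypothetical. It assumes, like the script, that the separator never appears inside a field.

```python
def check_field_counts(lines, num_fields=8, sep=","):
    """Return (line_number, field_count) for each line whose field
    count differs from num_fields. Line numbers start at 1."""
    bad = []
    for number, line in enumerate(lines, start=1):
        fields = line.rstrip("\r\n").split(sep)
        if len(fields) != num_fields:
            bad.append((number, len(fields)))
    return bad


# The broken sample from the question
broken = [
    "a1, a2, a3, a4,aaa",
    "aaa a5, a6, a7,aaa aa",
    "a8",
    "b1,b2,b3,b4,b5,b6,b7,b8",
    "c1,c2,c3,c4,c5,c6,c7,c8",
    "d1,d2,d3,d4,d5,d6,d7,d8",
    "e1,e2,e3,e4,eee",
    "e5,e6,e7,e8",
]

# The expected content of concatenated.csv
fixed = [
    "a1, a2, a3, a4,aaaaaa a5, a6, a7,aaa aaa8",
    "b1,b2,b3,b4,b5,b6,b7,b8",
    "c1,c2,c3,c4,c5,c6,c7,c8",
    "d1,d2,d3,d4,d5,d6,d7,d8",
    "e1,e2,e3,e4,eeee5,e6,e7,e8",
]

print(check_field_counts(broken))  # [(1, 5), (2, 4), (3, 1), (7, 5), (8, 4)]
print(check_field_counts(fixed))   # []
```

To check a real file, the same function could be given the lines of an opened file, for example the result of open("concatenated.csv") read line by line.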