 3 years ago
source link: https://www.codesd.com/item/how-to-delete-the-carriage-return-line-feed-from-csv-except-at-the-end-of-each-line.html

How to delete the carriage return / line feed from CSV - EXCEPT at the end of each line?

Is it possible, using a batch file or PowerShell, to remove carriage return/line feed characters from a CSV without removing those at the natural end of each record?

Essentially I have a file like this:

a1, a2, a3, a4,aaa
aaa a5, a6, a7,aaa aa
a8
b1,b2,b3,b4,b5,b6,b7,b8
c1,c2,c3,c4,c5,c6,c7,c8
d1,d2,d3,d4,d5,d6,d7,d8
e1,e2,e3,e4,eee
e5,e6,e7,e8

As an example, columns 5 and 8 "may" contain a carriage return/line feed. I would like to remove these so the file is 1 line = 1 record.

Is this possible? I am already formatting the file with a batch file, so I would like to use it for all of the formatting if possible. I am considering moving to PowerShell, so if it is easier there please let me know (absolute PowerShell noob).

NP EDIT - each record has the same number of columns. In this example, 8.


Tricky, but a nice challenge I had to take on... although you did not show any effort of your own to solve it...

Here is a script that combines lines of CSV data whenever a line has fewer elements than the predefined number. It does not handle the elements individually; it merely appends lines until the proposed number is reached. The data must not contain any global wildcard characters like * and ?. Quotation marks must not appear either, unless they are doubled like "". Here it is:

@echo off
setlocal EnableExtensions DisableDelayedExpansion

rem // Define constants here:
set "FILE_I=%~1"  & rem // (specifies the input CSV file)
set "FILE_O=%~2"  & rem // (specifies the output CSV file)
set "SEPARATOR=," & rem // (is the separator used in the CSV data)
set "REPLACE="    & rem // (is the replacement string for each line-break)
set "NUMITEMS=8"  & rem // (is the proposed number of elements per line)

rem // Validate given input and output CSV files:
if not exist "%FILE_I%" (< "%FILE_I%" set /P ="" & exit /B 1)
if not defined FILE_O set "FILE_O=con"

rem // Initialise data collector and counter for elements:
set "PREV=" & set /A "COUNT=0"
rem // Iterate through lines of input file:
for /F delims^=^ eol^= %%L in ('
    rem/ /* Read input file, output dummy line and deplete output file: */ ^& ^
        type "%FILE_I%" ^& ^> "%FILE_O%" break ^& echo/^& ^
        for /L %%J in ^(2^,1^,%NUMITEMS%^) do @^< nul set /P ^=","
') do (
    rem // Store currently read line:
    set "LINE=%%L"
    rem // Toggle delayed expansion in order not to lose `!`:
    setlocal EnableDelayedExpansion
    rem // Add number of elements of current line to the counter:
    for %%I in ("!LINE:%SEPARATOR%=","!") do (
        endlocal
        set /A "COUNT+=1"
        setlocal EnableDelayedExpansion
    )
    rem // Check whether counter reached given number of elements per line:
    if !COUNT! LEQ %NUMITEMS% (
        rem /* Either proposed number of elements not reached, hence store data
        rem    and wait for next line to have enough elements;
        rem    or number is reached but still wait for the next line, because it
        rem    could be a single element to be appended to the previous line;
        rem    hence the data output is actually delayed by one loop iteration;
        rem    so to not lose the last line, the said dummy line is needed: */
        set "PREV=!PREV!%REPLACE%!LINE!"
        rem // Transport data collector over `endlocal` barrier:
        for /F delims^=^ eol^= %%K in ("!PREV!") do (
            endlocal
            set "PREV=%%K"
            setlocal EnableDelayedExpansion
        )
        rem /* Decrement counter because a single element is considered
        rem    to be part of the last element of the previous line: */
        endlocal
        set /A "COUNT-=1"
        setlocal EnableDelayedExpansion
    ) else (
        rem /* Proposed number of elements exceeded, hence output currently
        rem    collected data, reset collector and counter for elements: */
        if defined REPLACE set "PREV=!PREV:*%REPLACE%=!"
        >> "%FILE_O%" echo !PREV!
        endlocal
        rem /* Store current line in data collector and subtract
        rem    the number of output elements from counter: */
        set "PREV=%REPLACE%%%L"
        set /A "COUNT-=%NUMITEMS%"
        setlocal EnableDelayedExpansion
    )
    endlocal
)

endlocal
exit /B
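
For readers who want to see the merging logic without the batch-specific plumbing, here is a minimal Python sketch of the same idea: count the separators on each physical line and keep appending lines while the running record still fits within the proposed number of elements. It is not a translation of the script above. The function name merge_broken_lines and its parameters are illustrative assumptions, and like the script it assumes that the separator never occurs inside a field and that no quoted fields are present.

```python
def merge_broken_lines(lines, num_items=8, separator=",", replace=""):
    """Join physical lines until each record holds num_items fields.

    Illustrative sketch only: assumes the separator never appears inside
    a field and that the data contains no quoted fields.
    """
    records = []
    current = None   # record being collected; None before the first line
    seps = 0         # number of separators collected in `current`
    for line in lines:
        if not line:
            # Skip empty lines (the batch `for /F` loop ignores them too).
            continue
        n = line.count(separator)
        if current is not None and seps + n <= num_items - 1:
            # Still within the budget: treat the line as a continuation.
            # Its first element becomes part of the previous last element.
            current += replace + line
            seps += n
        else:
            # Appending would exceed num_items fields: flush and restart.
            if current is not None:
                records.append(current)
            current, seps = line, n
    if current is not None:
        records.append(current)
    return records


if __name__ == "__main__":
    sample = [
        "a1, a2, a3, a4,aaa",
        "aaa a5, a6, a7,aaa aa",
        "a8",
        "b1,b2,b3,b4,b5,b6,b7,b8",
        "e1,e2,e3,e4,eee",
        "e5,e6,e7,e8",
    ]
    for record in merge_broken_lines(sample):
        print(record)
```

Note that a line without any separator is always attached to the record before it, so the output of a record is effectively delayed until the next line shows that the record is complete. Passing replace=" " would put a space where each removed line break used to be.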

Supposing the script is saved as concat-csv-lines.bat, the input CSV file is called broken-lines.csv and the output file is concatenated.csv, run it with the following command line:

concat-csv-lines.bat broken-lines.csv concatenated.csv

With broken-lines.csv containing the sample data from the question, concatenated.csv will hold:

a1, a2, a3, a4,aaaaaa a5, a6, a7,aaa aaa8
b1,b2,b3,b4,b5,b6,b7,b8
c1,c2,c3,c4,c5,c6,c7,c8
d1,d2,d3,d4,d5,d6,d7,d8
e1,e2,e3,e4,eeee5,e6,e7,e8
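
To confirm that a file really holds one record per line, the number of fields on each line can be compared with the expected count. The following is a small Python sketch of such a check. The helper name find_bad_records is hypothetical, and like the batch script it assumes that the separator never appears inside a field.

```python
def find_bad_records(lines, num_items=8, separator=","):
    """Return (line_number, field_count) for every line whose field
    count differs from num_items. Line numbers start at 1.

    Illustrative sketch only: fields are counted by splitting on the
    separator, so quoted fields containing it are not supported.
    """
    bad = []
    for number, line in enumerate(lines, start=1):
        fields = line.count(separator) + 1
        if fields != num_items:
            bad.append((number, fields))
    return bad


if __name__ == "__main__":
    merged = [
        "a1, a2, a3, a4,aaaaaa a5, a6, a7,aaa aaa8",
        "b1,b2,b3,b4,b5,b6,b7,b8",
        "e1,e2,e3,e4,eeee5,e6,e7,e8",
    ]
    # An empty list means every line has exactly num_items fields.
    print(find_bad_records(merged))
```

To check a real file, pass its lines without the trailing line breaks, for example the result of calling splitlines() on the file contents.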

