Convert HTML to CSV

There are many scripts in Perl, PHP, Python and so on that will do this for you,
but the method you are about to see will make you smile at its simplicity.
Instead of going over the file line by line and searching inside it, I am going to use
a tool that does that for me. That tool is lynx, the console browser.
With -dump it renders the page as plain text and prints it to standard output:

lynx -dump file_name.html

Now, let's say our table looks like this:

1 2 3 4
5 6 7 8
9 10 11 12

To create a CSV file from it, one would do something like this.
First, use the ‘tr’ command to squeeze every run of spaces into a single space:

tr -s " "
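
To see what the squeeze does on its own, here is a small sketch. The sample line is made up; it only assumes that the dump indents rows and pads columns with several spaces.

```shell
# Made-up line, standing in for one row of the dumped table.
sample='   1    2   3  4'

# tr -s ' ' collapses each run of spaces into a single space.
squeezed=$(printf '%s\n' "$sample" | tr -s ' ')

# Brackets make the remaining leading space visible.
printf '[%s]\n' "$squeezed"   # prints: [ 1 2 3 4]
```

Note that one leading space survives, which is why a cleanup step is still needed afterwards.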

Now, let's use sed to do the rest of the work for us.
This sed command removes any leading spaces or tabs from the beginning of each line
(\t as a tab is a GNU sed extension; [[:blank:]] is the portable spelling):

sed 's/^[ \t]*//'

This sed command replaces every remaining space with a comma “,” as the delimiter:

sed  's/ /,/g'
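
The two sed steps can be tried one after the other on a single line. This is only a sketch: the input line is hypothetical, and it uses the POSIX class [[:blank:]] (space or tab) for the leading-whitespace match so that it behaves the same on GNU and BSD sed.

```shell
# Hypothetical line whose runs of spaces were already squeezed by tr.
line=' 5 6 7 8'

# Step 1: drop leading spaces/tabs.
trimmed=$(printf '%s\n' "$line" | sed 's/^[[:blank:]]*//')
printf '%s\n' "$trimmed"     # prints: 5 6 7 8

# Step 2: turn every remaining space into a comma.
csv_line=$(printf '%s\n' "$trimmed" | sed 's/ /,/g')
printf '%s\n' "$csv_line"    # prints: 5,6,7,8
```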

So in the end we have a simple one-line command that creates a CSV file from HTML:

lynx -dump file_name.html | tr -s " " | sed -e 's/^[ \t]*//' -e 's/ /,/g' > file_name.csv
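
If you use this often, the text-to-CSV half can be wrapped in a small function. The sketch below is one possible way to do it: text_to_csv is a name invented here, and the printf input is only a stand-in for what lynx -dump might print for the table above, so you can try it without lynx installed.

```shell
# text_to_csv: hypothetical helper, reads space-aligned text on stdin
# and writes comma-separated lines on stdout.
text_to_csv() {
    tr -s ' ' | sed -e 's/^[[:blank:]]*//' -e 's/ /,/g'
}

# Stand-in for the lynx -dump output (assumed layout: indented, space-padded columns).
csv=$(printf '   1   2   3   4\n   5   6   7   8\n   9  10  11  12\n' | text_to_csv)
printf '%s\n' "$csv"
# prints:
# 1,2,3,4
# 5,6,7,8
# 9,10,11,12
```

With the function defined, the real call would look like: lynx -dump file_name.html | text_to_csv > file_name.csv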

 
* Note: the method shown here works only as long as there are no spaces in the cell data, because every space ends up as a comma.
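
A quick way to see that limitation is to feed the pipeline a row with a space inside a cell. The row below is made up for illustration: it has three cells, "New York", "8" and "NY".

```shell
# Made-up row with three cells; the first cell contains a space.
row='   New York   8   NY'

broken=$(printf '%s\n' "$row" | tr -s ' ' | sed -e 's/^[[:blank:]]*//' -e 's/ /,/g')

# The space inside "New York" becomes a comma too, giving four fields instead of three.
printf '%s\n' "$broken"   # prints: New,York,8,NY
```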