
PDF to Excel conversion using tabula

Posted: Sat Nov 05, 2022 2:06 am
by sadia20002
Hi!
I am using the following code to convert a PDF to Excel with the tabula library, imported as tb:

import tabula as tb

# Read page 20 of the PDF into a list of DataFrames (one per detected table)
df = tb.read_pdf("Result.pdf", pages='20')

# Convert pages 20-30 of the PDF into a CSV file
tb.convert_into("Result.pdf", "File10.csv", output_format="csv", pages='20-30')

# Convert all PDFs in a directory
# tb.convert_into_by_batch("input_directory", output_format='csv', pages='all')

I wanted to attach the source PDF, code.py, and the output Excel file, but while attaching them the forum says "invalid file extension". Kindly advise which file types can be attached.

The PDF has two columns on each page, so the Excel output also has two columns for each page. I want a single column in Excel instead: for each page, the second column should be
pasted under the first column, following the natural sequence, before moving on to the next page. For example:
page 1
1 6
2 7
3 8
4 9
5 10
page 2
11 16
12 17
13 18
14 19
15 20

should be converted in Excel to:
1
2
3
4
5
6
7
8
9
10
11
12
13


Kindly suggest a modification to the Python code so that this is done automatically, without any post-processing in Excel.

Regards
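
One possible way to get the layout described above is to read each table column by column and write the values out one per row. The sketch below is a minimal illustration, not a tested solution for Result.pdf: the helper names is_blank, stack_columns and pdf_to_single_column are made up for this example, and it assumes tabula returns the tables in page order with the two PDF columns as two DataFrame columns. Passing pandas_options={"header": None} is intended to stop the first row of each page from being treated as column names.

```python
import csv
import math


def is_blank(value):
    """True for None, empty/whitespace strings, and NaN (how pandas marks empty cells)."""
    if value is None:
        return True
    if isinstance(value, float) and math.isnan(value):
        return True
    return isinstance(value, str) and not value.strip()


def stack_columns(rows):
    """Read one table column by column: all of column 0, then column 1, and so on."""
    width = max((len(row) for row in rows), default=0)
    return [
        row[col]
        for col in range(width)
        for row in rows
        if col < len(row) and not is_blank(row[col])
    ]


def pdf_to_single_column(pdf_path, csv_path, pages="all"):
    """Extract tables with tabula and write every value into a single CSV column."""
    import tabula  # tabula-py; imported here because it needs Java at run time

    # header=None keeps the first row of each table as data instead of column names.
    tables = tabula.read_pdf(pdf_path, pages=pages, pandas_options={"header": None})

    with open(csv_path, "w", newline="") as handle:
        writer = csv.writer(handle)
        for table in tables:  # assumption: tables come back in page order
            for value in stack_columns(table.values.tolist()):
                writer.writerow([value])
```

With the file names from the post, the call would look like pdf_to_single_column("Result.pdf", "File10.csv", pages="20-30"). The resulting CSV has one value per row and can be opened directly in Excel. If tabula splits a page into more tables than expected, or merges the two columns into one, the extraction options would need adjusting for that PDF.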