Velvet Star Monitor

How can I read a PDF in Python? [duplicate]

Writer Mia Lopez

How can I read a PDF in Python? I know one way is to convert it to text first, but I want to read the content directly from the PDF.

Can anyone explain which Python module is best for PDF extraction?

2 Answers

You can use the PyPDF2 package.

# install PyPDF2 (run this in a shell, not in Python)
pip install PyPDF2

# import the reader class (PdfFileReader and numPages were removed in PyPDF2 3.0)
from PyPDF2 import PdfReader

# open the PDF in binary mode
with open('example.pdf', 'rb') as file:
    # create a PDF reader object
    reader = PdfReader(file)
    # print the number of pages in the PDF
    print(len(reader.pages))

See the PyPDF2 documentation for details.
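Since the question asks about reading the content rather than counting pages, here is a minimal sketch of pulling the text out of every page. It assumes PyPDF2 2.0 or later, where `PdfReader`, `reader.pages` and `page.extract_text()` are available. The helper names `pages_to_text` and `read_pdf_text` are made up for this example, and `example.pdf` is a placeholder path.

```python
def pages_to_text(pages):
    """Join the text of page-like objects that expose extract_text()."""
    # Image-only pages can yield an empty result, so fall back to "".
    return "\n".join((page.extract_text() or "") for page in pages)


def read_pdf_text(path):
    """Return the text of every page in the PDF at `path` (hypothetical helper)."""
    # Imported here so pages_to_text() stays usable without PyPDF2 installed.
    from PyPDF2 import PdfReader  # assumption: PyPDF2 >= 2.0

    with open(path, "rb") as pdf_file:
        reader = PdfReader(pdf_file)
        return pages_to_text(reader.pages)


# Usage (placeholder file name):
#   print(read_pdf_text("example.pdf"))
```

Text extraction only works for PDFs that actually contain text; scanned documents usually need OCR instead.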

You can use the textract module.

To install:

pip install textract

To read a PDF:

import textract
text = textract.process('path/to/pdf/file', method='pdfminer')

For details, see the textract documentation.
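One thing to watch for: as far as I know, `textract.process` returns `bytes` on Python 3, not `str`, so you normally decode the result before working with it. A rough sketch, assuming the output is UTF-8 (which I believe is textract's default). `pdf_to_str` is a made-up helper name, and the path is the same placeholder as above.

```python
def pdf_to_str(path, encoding="utf-8"):
    """Extract text from a PDF with textract and return it as a str."""
    # Imported here so the module loads even where textract is not installed.
    import textract

    # process() hands back raw bytes; method='pdfminer' picks the pdfminer backend.
    raw = textract.process(path, method="pdfminer")
    # Assumption: the bytes are encoded as UTF-8.
    return raw.decode(encoding)


# Usage (placeholder path):
#   print(pdf_to_str("path/to/pdf/file"))
```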
